Merge branch 'dev'

2025-04-22 14:38:10 +00:00 · 2019-11-13 13:25:09 +07:00 · 2019-11-13 13:25:09 +07:00 · 9d24d181d7
commit 9d24d181d7
parent 919a6c0cc4 b011b935b4
381 changed files with 43961 additions and 4660 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -1,2 +1,4 @@
 /build
 /CMakeLists.txt.user
+/.idea
+/src/backend/opencl/cl/cn/cryptonight_gen.cl
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,54 +1,25 @@
-# v3.2.0
- Added per pool option `coin` with single possible value `monero` for pools without algorithm negotiation, for upcoming Monero fork.
- [#1183](https://github.com/xmrig/xmrig/issues/1183) Fixed compatibility with systemd.
+# v5.0.0
+This version is the first stable unified 3 in 1 GPU+CPU release. OpenCL support is built into the miner and does not require additional external dependencies at compile time. NVIDIA CUDA is available as an external [CUDA plugin](https://github.com/xmrig/xmrig-cuda); for convenience, 3 in 1 downloads with a recent CUDA version are also provided.

-# v3.1.3
- [#1180](https://github.com/xmrig/xmrig/issues/1180) Fixed possible duplicated shares after algorithm switching.
- Fixed wrong config file permissions after write (only gcc builds on recent Windows 10 affected).
+This release is based on the 4.x.x series and includes all features from v4.6.2-beta. The changelog below includes only the most important changes; the [full changelog](doc/CHANGELOG_OLD.md) is available separately.

-# v3.1.2
- Many RandomX optimizations and fixes.
-  - [#1132](https://github.com/xmrig/xmrig/issues/1132) Fixed build on CentOS 7.
-  - [#1163](https://github.com/xmrig/xmrig/pull/1163) Optimized soft AES code, up to +30% hashrate on CPU without AES support and other optimizations.
-  - [#1166](https://github.com/xmrig/xmrig/pull/1166) Fixed crash when initialize dataset with big threads count (eg 272).
-  - [#1168](https://github.com/xmrig/xmrig/pull/1168) Optimized loading from scratchpad.
- [#1128](https://github.com/xmrig/xmrig/issues/1128) Fixed CMake 2.8 compatibility.
-
-# v3.1.1
- [#1133](https://github.com/xmrig/xmrig/issues/1133) Fixed syslog regression.
- [#1138](https://github.com/xmrig/xmrig/issues/1138) Fixed multiple network bugs.
- [#1141](https://github.com/xmrig/xmrig/issues/1141) Fixed log in background mode.
- [#1142](https://github.com/xmrig/xmrig/pull/1142) RandomX hashrate improved by 0.5-1.5% depending on variant and CPU.
- [#1146](https://github.com/xmrig/xmrig/pull/1146) Fixed race condition in RandomX thread init.
- [#1148](https://github.com/xmrig/xmrig/pull/1148) Fixed, on Linux linker marking entire executable as having an executable stack.
- Fixed, for Argon2 algorithms command line options like `--threads` was ignored.
- Fixed command line options for single pool, free order allowed again.
-
-# v3.1.0
- [#1107](https://github.com/xmrig/xmrig/issues/1107#issuecomment-522235892) Added Argon2 algorithm family: `argon2/chukwa` and `argon2/wrkz`.
-
-# v3.0.0
- **[#1111](https://github.com/xmrig/xmrig/pull/1111) Added RandomX (`rx/test`) algorithm for testing and benchmarking.**
- **[#1036](https://github.com/xmrig/xmrig/pull/1036) Added RandomWOW (`rx/wow`) algorithm for [Wownero](http://wownero.org/).**
- **[#1050](https://github.com/xmrig/xmrig/pull/1050) Added RandomXL (`rx/loki`) algorithm for [Loki](https://loki.network/).**
- **[#1077](https://github.com/xmrig/xmrig/issues/1077) Added NUMA support via hwloc**.
- **Added flexible [multi algorithm](doc/CPU.md) configuration.**
- **Added unlimited switching between incompatible algorithms, all mining options can be changed in runtime.**
- [#257](https://github.com/xmrig/xmrig-nvidia/pull/257) New logging subsystem, file and syslog now always without colors.
- [#314](https://github.com/xmrig/xmrig-proxy/issues/314) Added donate over proxy feature.
- [#1007](https://github.com/xmrig/xmrig/issues/1007) Old HTTP API backend based on libmicrohttpd, replaced to custom HTTP server (libuv + http_parser).
- [#1010](https://github.com/xmrig/xmrig/pull/1010#issuecomment-482632107) Added daemon support (solo mining).
- [#1066](https://github.com/xmrig/xmrig/issues/1066#issuecomment-518080529) Added error message if pool not ready for RandomX.
- [#1105](https://github.com/xmrig/xmrig/issues/1105) Improved auto configuration for `cn-pico` algorithm.
- Added commands `pause` and `resume` via JSON RPC 2.0 API (`POST /json_rpc`).
- Added command line option `--export-topology` for export hwloc topology to a XML file.
- Breaked backward compatibility with previous configs and command line, `variant` option replaced to `algo`, global option `algo` removed, all CPU related settings moved to `cpu` object.
- Options `av`, `safe` and `max-cpu-usage` removed.
- Algorithm `cn/msr` renamed to `cn/fast`.
- Algorithm `cn/xtl` removed.
- API endpoint `GET /1/threads` replaced to `GET /2/backends`.
- Added global uptime and extended connection information in API.
- API now return current algorithm.
+- [#1272](https://github.com/xmrig/xmrig/pull/1272) Optimized hashrate calculation.
+- [#1263](https://github.com/xmrig/xmrig/pull/1263) Added new option `dataset_host` for NVIDIA GPUs with less than 4 GB memory (RandomX only).
+- [#1068](https://github.com/xmrig/xmrig/pull/1068) Added support for `self-select` stratum protocol extension.
+- [#1227](https://github.com/xmrig/xmrig/pull/1227) Added new algorithm `rx/arq`, RandomX variant for upcoming ArQmA fork.
+- [#808](https://github.com/xmrig/xmrig/issues/808#issuecomment-539297156) Added experimental support for persistent memory for CPU mining threads.
+- [#1221](https://github.com/xmrig/xmrig/issues/1221) Improved RandomX dataset memory usage and initialization speed for NUMA machines.
+- [#1175](https://github.com/xmrig/xmrig/issues/1175) Fixed support for systems where total count of NUMA nodes not equal usable nodes count.
+- Added config option `cpu/max-threads-hint` and command line option `--cpu-max-threads-hint`.
+- [#1185](https://github.com/xmrig/xmrig/pull/1185) Added JIT compiler for RandomX on ARMv8.
+- Improved API endpoint `GET /2/backends` and added support for this endpoint to [workers.xmrig.info](http://workers.xmrig.info).
+- Added command line option `--no-cpu` to disable CPU backend.
+- Added OpenCL specific command line options: `--opencl`, `--opencl-devices`, `--opencl-platform`, `--opencl-loader` and `--opencl-no-cache`.
+- Added CUDA specific command line options: `--cuda`, `--cuda-loader` and `--no-nvml`.
+- Removed command line option `--http-enabled`, HTTP API enabled automatically if any other `--http-*` option provided.
+- [#1172](https://github.com/xmrig/xmrig/issues/1172) **Added OpenCL mining backend.**
+  - [#268](https://github.com/xmrig/xmrig-amd/pull/268) [#270](https://github.com/xmrig/xmrig-amd/pull/270) [#271](https://github.com/xmrig/xmrig-amd/pull/271) [#273](https://github.com/xmrig/xmrig-amd/pull/273) [#274](https://github.com/xmrig/xmrig-amd/pull/274) [#1171](https://github.com/xmrig/xmrig/pull/1171) Added RandomX support for OpenCL, thanks [@SChernykh](https://github.com/SChernykh).
+- Algorithm `cn/wow` removed, as no longer alive. 

 # Previous versions
 [doc/CHANGELOG_OLD.md](doc/CHANGELOG_OLD.md)
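
The `GET /2/backends` endpoint mentioned in the changelog above can be queried from a script. Below is a minimal Python sketch, assuming the HTTP API was enabled with `--http-port` and `--http-access-token`. The port `8080`, the token `SECRET` and the helper names are illustrative only, and passing the token as an `Authorization: Bearer` header is an assumption that should be checked against the API documentation.

```python
import json
import urllib.request


def build_backends_request(host, port, token=None):
    """Build (but do not send) a GET request for the /2/backends endpoint."""
    url = f"http://{host}:{port}/2/backends"
    request = urllib.request.Request(url, method="GET")
    if token:
        # Assumption: the access token is sent as a Bearer token.
        request.add_header("Authorization", f"Bearer {token}")
    return request


def fetch_backends(request, timeout=5):
    """Send the request and decode the JSON body returned by the miner."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical host, port and token; adjust to the running miner.
    req = build_backends_request("127.0.0.1", 8080, token="SECRET")
    print(req.full_url)
```

Building the request separately from sending it keeps the URL and header logic testable without a running miner.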
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -14,6 +14,11 @@ option(WITH_DEBUG_LOG       "Enable debug log output" OFF)
 option(WITH_TLS             "Enable OpenSSL support" ON)
 option(WITH_ASM             "Enable ASM PoW implementations" ON)
 option(WITH_EMBEDDED_CONFIG "Enable internal embedded JSON config" OFF)
+option(WITH_OPENCL          "Enable OpenCL backend" ON)
+option(WITH_CUDA            "Enable CUDA backend" ON)
+option(WITH_NVML            "Enable NVML (NVIDIA Management Library) support (only if CUDA backend enabled)" ON)
+option(WITH_STRICT_CACHE    "Enable strict checks for OpenCL cache" ON)
+option(WITH_INTERLEAVE_DEBUG_LOG "Enable debug log for threads interleave" OFF)

 option(BUILD_STATIC         "Build static binary" OFF)
 option(ARM_TARGET           "Force use specific ARM target 8 or 7" 0)
@@ -52,6 +57,7 @@ set(HEADERS
   )

 set(HEADERS_CRYPTO
+    src/backend/common/interfaces/IMemoryPool.h
    src/crypto/cn/asm/CryptonightR_template.h
    src/crypto/cn/c_blake256.h
    src/crypto/cn/c_groestl.h
@@ -70,6 +76,7 @@ set(HEADERS_CRYPTO
    src/crypto/common/Algorithm.h
    src/crypto/common/Coin.h
    src/crypto/common/keccak.h
+    src/crypto/common/MemoryPool.h
    src/crypto/common/Nonce.h
    src/crypto/common/portable/mm_malloc.h
    src/crypto/common/VirtualMemory.h
@@ -108,10 +115,22 @@ set(SOURCES_CRYPTO
    src/crypto/common/Algorithm.cpp
    src/crypto/common/Coin.cpp
    src/crypto/common/keccak.cpp
+    src/crypto/common/MemoryPool.cpp
    src/crypto/common/Nonce.cpp
    src/crypto/common/VirtualMemory.cpp
   )

+if (WITH_HWLOC)
+    list(APPEND HEADERS_CRYPTO
+        src/crypto/common/NUMAMemoryPool.h
+        )
+
+    list(APPEND SOURCES_CRYPTO
+        src/crypto/common/NUMAMemoryPool.cpp
+        src/crypto/common/VirtualMemory_hwloc.cpp
+        )
+endif()
+
 if (WIN32)
    set(SOURCES_OS
        "${SOURCES_OS}"
@@ -142,7 +161,7 @@ else()
    endif()
 endif()

-if (CMAKE_SYSTEM_NAME MATCHES "Linux")
+if (CMAKE_SYSTEM_NAME MATCHES "Linux" OR CMAKE_SYSTEM_NAME MATCHES "Android")
    EXECUTE_PROCESS(COMMAND uname -o COMMAND tr -d '\n' OUTPUT_VARIABLE OPERATING_SYSTEM)
    if (OPERATING_SYSTEM MATCHES "Android")
        set(EXTRA_LIBS ${EXTRA_LIBS} log)
--- a/README.md
+++ b/README.md
@@ -9,19 +9,15 @@
 [![GitHub stars](https://img.shields.io/github/stars/xmrig/xmrig.svg)](https://github.com/xmrig/xmrig/stargazers)
 [![GitHub forks](https://img.shields.io/github/forks/xmrig/xmrig.svg)](https://github.com/xmrig/xmrig/network)

-XMRig is a high performance RandomX and CryptoNight CPU miner, with official support for Windows.
+XMRig is a high performance, open source, cross platform RandomX, CryptoNight and Argon2 CPU/GPU miner, with official support for Windows.

-* This is the **CPU-mining** version, there is also a [NVIDIA GPU version](https://github.com/xmrig/xmrig-nvidia) and [AMD GPU version]( https://github.com/xmrig/xmrig-amd).
+## Mining backends
+- **CPU** (x64/x86/ARM)
+- **OpenCL** for AMD GPUs.
+- **CUDA** for NVIDIA GPUs via external [CUDA plugin](https://github.com/xmrig/xmrig-cuda).

 <img src="doc/screenshot.png" width="808" >

-#### Table of contents
-* [Download](#download)
-* [Usage](#usage)
-* [Build](https://github.com/xmrig/xmrig/wiki/Build)
-* [Donations](#donations)
-* [Contacts](#contacts)
-
 ## Download
 * Binary releases: https://github.com/xmrig/xmrig/releases
 * Git tree: https://github.com/xmrig/xmrig.git
@@ -30,54 +26,80 @@ XMRig is a high performance RandomX and CryptoNight CPU miner, with official sup
 ## Usage
 The preferred way to configure the miner is the [JSON config file](src/config.json) as it is more flexible and human friendly. The command line interface does not cover all features, such as mining profiles for different algorithms. Important options can be changed during runtime without miner restart by editing the config file or executing API calls.

-### Options
+* **[xmrig.com/wizard](https://xmrig.com/wizard)** helps you create an initial configuration for the miner.
+* **[workers.xmrig.info](http://workers.xmrig.info)** helps manage your miners via HTTP API.
+
+### Command line options
 ```
-  -a, --algo=ALGO               specify the algorithm to use
-                                  cn/r, cn/2, cn/1, cn/0, cn/double, cn/half, cn/fast,
-                                  cn/rwz, cn/zls, cn/xao, cn/rto, cn/gpu,
-                                  cn-lite/1,
-                                  cn-heavy/xhv, cn-heavy/tube, cn-heavy/0,
-                                  cn-pico,
-                                  rx/wow, rx/loki
+Network:
  -o, --url=URL                 URL of mining server
-  -O, --userpass=U:P            username:password pair for mining server
+  -a, --algo=ALGO               mining algorithm https://xmrig.com/docs/algorithms
+      --coin=COIN               specify coin instead of algorithm
  -u, --user=USERNAME           username for mining server
  -p, --pass=PASSWORD           password for mining server
-      --rig-id=ID               rig identifier for pool-side statistics (needs pool support)
-  -t, --threads=N               number of miner threads
-  -v, --av=N                    algorithm variation, 0 auto select
+  -O, --userpass=U:P            username:password pair for mining server
  -k, --keepalive               send keepalived packet for prevent timeout (needs pool support)
      --nicehash                enable nicehash.com support
+      --rig-id=ID               rig identifier for pool-side statistics (needs pool support)
      --tls                     enable SSL/TLS support (needs pool support)
-      --tls-fingerprint=F       pool TLS certificate fingerprint, if set enable strict certificate pinning
+      --tls-fingerprint=HEX     pool TLS certificate fingerprint for strict certificate pinning
      --daemon                  use daemon RPC instead of pool for solo mining
      --daemon-poll-interval=N  daemon poll interval in milliseconds (default: 1000)
  -r, --retries=N               number of times to retry before switch to backup server (default: 5)
  -R, --retry-pause=N           time to pause between retries (default: 5)
+      --user-agent              set custom user-agent string for pool
+      --donate-level=N          donate level, default 5% (5 minutes in 100 minutes)
+      --donate-over-proxy=N     control donate over xmrig-proxy feature
+
+CPU backend:
+      --no-cpu                  disable CPU mining backend
+  -t, --threads=N               number of CPU threads
+  -v, --av=N                    algorithm variation, 0 auto select
      --cpu-affinity            set process affinity to CPU core(s), mask 0x3 for cores 0 and 1
      --cpu-priority            set process priority (0 idle, 2 normal to 5 highest)
+      --cpu-max-threads-hint=N  maximum CPU threads count (in percentage) hint for autoconfig
+      --cpu-memory-pool=N       number of 2 MB pages for persistent memory pool, -1 (auto), 0 (disable)
      --no-huge-pages           disable huge pages support
-      --no-color                disable colored output
-      --donate-level=N          donate level, default 5% (5 minutes in 100 minutes)
-      --user-agent              set custom user-agent string for pool
-  -B, --background              run the miner in the background
-  -c, --config=FILE             load a JSON-format configuration file
-  -l, --log-file=FILE           log all output to a file
-      --asm=ASM                 ASM optimizations, possible values: auto, none, intel, ryzen, bulldozer.
-      --print-time=N            print hashrate report every N seconds
+      --asm=ASM                 ASM optimizations, possible values: auto, none, intel, ryzen, bulldozer
+      --randomx-init=N          threads count to initialize RandomX dataset
+      --randomx-no-numa         disable NUMA support for RandomX
+
+API:
      --api-worker-id=ID        custom worker-id for API
      --api-id=ID               custom instance ID for API
-      --http-enabled            enable HTTP API
      --http-host=HOST          bind host for HTTP API (default: 127.0.0.1)
      --http-port=N             bind port for HTTP API
      --http-access-token=T     access token for HTTP API
      --http-no-restricted      enable full remote access to HTTP API (only if access token set)
-      --randomx-init=N          threads count to initialize RandomX dataset
-      --randomx-no-numa         disable NUMA support for RandomX
-      --export-topology         export hwloc topology to a XML file and exit
-      --dry-run                 test configuration and exit
-  -h, --help                    display this help and exit
+
+OpenCL backend:
+      --opencl                  enable OpenCL mining backend
+      --opencl-devices=N        comma separated list of OpenCL devices to use
+      --opencl-platform=N       OpenCL platform index or name
+      --opencl-loader=PATH      path to OpenCL-ICD-Loader (OpenCL.dll or libOpenCL.so)
+      --opencl-no-cache         disable OpenCL cache
+      --print-platforms         print available OpenCL platforms and exit
+
+CUDA backend:
+      --cuda                    enable CUDA mining backend
+      --cuda-loader=PATH        path to CUDA plugin (xmrig-cuda.dll or libxmrig-cuda.so)
+      --cuda-devices=N          comma separated list of CUDA devices to use
+      --no-nvml                 disable NVML (NVIDIA Management Library) support
+
+Logging:
+  -S, --syslog                  use system log for output messages
+  -l, --log-file=FILE           log all output to a file
+      --print-time=N            print hashrate report every N seconds
+      --health-print-time=N     print health report every N seconds
+      --no-color                disable colored output
+
+Misc:
+  -c, --config=FILE             load a JSON-format configuration file
+  -B, --background              run the miner in the background
  -V, --version                 output version information and exit
+  -h, --help                    display this help and exit
+      --dry-run                 test configuration and exit
+      --export-topology         export hwloc topology to a XML file and exit
 ```
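
The `--cpu-affinity` mask in the options above reads as a bitmask in which bit N selects core N, which is why `0x3` covers cores 0 and 1. A small Python sketch for building such a mask follows; the helper name `affinity_mask` is hypothetical and not part of the miner.

```python
def affinity_mask(cores):
    """Return a bitmask with one bit set per zero-based core index."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core index must be non-negative")
        mask |= 1 << core  # bit N corresponds to core N
    return mask


if __name__ == "__main__":
    # Cores 0 and 1, the example from the help text.
    print(hex(affinity_mask([0, 1])))
```

The resulting value can be passed in hexadecimal form to `--cpu-affinity`.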

 ## Donations
--- a/cmake/flags.cmake
+++ b/cmake/flags.cmake
@@ -54,6 +54,8 @@ if (CMAKE_CXX_COMPILER_ID MATCHES GNU)

    #set(CMAKE_C_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} -gdwarf-2")

+    add_definitions(/DHAVE_BUILTIN_CLEAR_CACHE)
+
 elseif (CMAKE_CXX_COMPILER_ID MATCHES MSVC)

    set(CMAKE_C_FLAGS_RELEASE "${CMAKE_C_FLAGS_RELEASE} /Ox /Ot /Oi /MT /GL")
--- a/cmake/randomx.cmake
+++ b/cmake/randomx.cmake
@@ -4,9 +4,12 @@ if (WITH_RANDOMX)
    list(APPEND HEADERS_CRYPTO
        src/crypto/rx/Rx.h
        src/crypto/rx/RxAlgo.h
+        src/crypto/rx/RxBasicStorage.h
        src/crypto/rx/RxCache.h
        src/crypto/rx/RxConfig.h
        src/crypto/rx/RxDataset.h
+        src/crypto/rx/RxQueue.h
+        src/crypto/rx/RxSeed.h
        src/crypto/rx/RxVm.h
    )

@@ -32,9 +35,11 @@ if (WITH_RANDOMX)
        src/crypto/randomx/vm_interpreted.cpp
        src/crypto/rx/Rx.cpp
        src/crypto/rx/RxAlgo.cpp
+        src/crypto/rx/RxBasicStorage.cpp
        src/crypto/rx/RxCache.cpp
        src/crypto/rx/RxConfig.cpp
        src/crypto/rx/RxDataset.cpp
+        src/crypto/rx/RxQueue.cpp
        src/crypto/rx/RxVm.cpp
    )

@@ -51,6 +56,32 @@ if (WITH_RANDOMX)
            )
        # cheat because cmake and ccache hate each other
        set_property(SOURCE src/crypto/randomx/jit_compiler_x86_static.S PROPERTY LANGUAGE C)
+    elseif (XMRIG_ARM AND CMAKE_SIZEOF_VOID_P EQUAL 8)
+        list(APPEND SOURCES_CRYPTO
+             src/crypto/randomx/jit_compiler_a64_static.S
+             src/crypto/randomx/jit_compiler_a64.cpp
+            )
+        # cheat because cmake and ccache hate each other
+        set_property(SOURCE src/crypto/randomx/jit_compiler_a64_static.S PROPERTY LANGUAGE C)
+    endif()
+
+    if (CMAKE_CXX_COMPILER_ID MATCHES Clang)
+        set_source_files_properties(src/crypto/randomx/jit_compiler_x86.cpp PROPERTIES COMPILE_FLAGS -Wno-unused-const-variable)
+    endif()
+
+    if (WITH_HWLOC)
+        list(APPEND HEADERS_CRYPTO
+             src/crypto/rx/RxNUMAStorage.h
+            )
+
+        list(APPEND SOURCES_CRYPTO
+             src/crypto/rx/RxConfig_hwloc.cpp
+             src/crypto/rx/RxNUMAStorage.cpp
+            )
+    else()
+        list(APPEND SOURCES_CRYPTO
+             src/crypto/rx/RxConfig_basic.cpp
+            )
    endif()
 else()
    remove_definitions(/DXMRIG_ALGO_RANDOMX)
--- a/doc/ALGORITHMS.md
+++ b/doc/ALGORITHMS.md
@@ -3,7 +3,7 @@
 Algorithm can be defined in 3 ways:

 1. By pool, using algorithm negotiation, in this case no need specify algorithm on miner side.
-2. Per pool `coin` option, currently only usable value for this option is `monero`.
+2. Per pool `coin` option, currently the only usable values for this option are `monero` and `arqma`.
 3. Per pool `algo` option.

 Option `coin` useful for pools without algorithm negotiation support or daemon to allow automatically switch algorithm in next hard fork.
@@ -12,11 +12,12 @@ Option `coin` useful for pools without algorithm negotiation support or daemon t

 | Name | Memory | Version | Notes |
 |------|--------|---------|-------|
+| `rx/arq` | 256 KB | 4.3.0+ | RandomARQ (RandomX variant for ArQmA). |
 | `rx/0` | 2 MB | 3.2.0+ | RandomX (Monero). |
 | `argon2/chukwa` | 512 KB | 3.1.0+ | Argon2id (Chukwa). |
 | `argon2/wrkz` | 256 KB | 3.1.0+ | Argon2id (WRKZ) |
-| `rx/wow` | 1 MB | 3.0.0+ | RandomWOW. |
-| `rx/loki` | 2 MB | 3.0.0+ | RandomXL. |
+| `rx/wow` | 1 MB | 3.0.0+ | RandomWOW (RandomX variant for Wownero). |
+| `rx/loki` | 2 MB | 3.0.0+ | RandomXL (RandomX variant for Loki). |
 | `cn/fast` | 2 MB | 3.0.0+ | CryptoNight variant 1 with half iterations. |
 | `cn/rwz` | 2 MB | 2.14.0+ | CryptoNight variant 2 with 3/4 iterations and reversed shuffle operation. |
 | `cn/zls` | 2 MB | 2.14.0+ | CryptoNight variant 2 with 3/4 iterations. |
--- a/doc/CHANGELOG_OLD.md
+++ b/doc/CHANGELOG_OLD.md
@@ -1,3 +1,116 @@
+# v4.6.2-beta
+- [#1274](https://github.com/xmrig/xmrig/issues/1274) Added `--cuda-devices` command line option.
+- [#1277](https://github.com/xmrig/xmrig/pull/1277) Fixed function names for clang on Apple.
+
+# v4.6.1-beta
+- [#1272](https://github.com/xmrig/xmrig/pull/1272) Optimized hashrate calculation.
+- [#1273](https://github.com/xmrig/xmrig/issues/1273) Fixed crash when use `GET /2/backends` API endpoint with disabled CUDA.
+
+# v4.6.0-beta
+- [#1263](https://github.com/xmrig/xmrig/pull/1263) Added new option `dataset_host` for NVIDIA GPUs with less than 4 GB memory (RandomX only).
+
+# v4.5.0-beta
+- Added NVIDIA CUDA support via external [CUDA plugin](https://github.com/xmrig/xmrig-cuda). XMRig is now a unified 3 in 1 miner.
+
+# v4.4.0-beta
+- [#1068](https://github.com/xmrig/xmrig/pull/1068) Added support for `self-select` stratum protocol extension.
+- [#1240](https://github.com/xmrig/xmrig/pull/1240) Sync with the latest RandomX code.
+- [#1241](https://github.com/xmrig/xmrig/issues/1241) Fixed regression with colors on old Windows systems.
+- [#1243](https://github.com/xmrig/xmrig/pull/1243) Fixed incorrect OpenCL memory size detection in some cases.
+- [#1247](https://github.com/xmrig/xmrig/pull/1247) Fixed ARM64 RandomX code alignment.
+- [#1248](https://github.com/xmrig/xmrig/pull/1248) Fixed RandomX code cache cleanup on iOS/Darwin.
+
+# v4.3.1-beta
+- Fixed regression in v4.3.0, miner didn't create `cn` mining profile with default config example.
+
+# v4.3.0-beta
+- [#1227](https://github.com/xmrig/xmrig/pull/1227) Added new algorithm `rx/arq`, RandomX variant for upcoming ArQmA fork.
+- [#808](https://github.com/xmrig/xmrig/issues/808#issuecomment-539297156) Added experimental support for persistent memory for CPU mining threads.
+- [#1221](https://github.com/xmrig/xmrig/issues/1221) Improved RandomX dataset memory usage and initialization speed for NUMA machines.
+
+# v4.2.1-beta
+- [#1150](https://github.com/xmrig/xmrig/issues/1150) Fixed build on FreeBSD.
+- [#1175](https://github.com/xmrig/xmrig/issues/1175) Fixed support for systems where total count of NUMA nodes not equal usable nodes count.
+- [#1199](https://github.com/xmrig/xmrig/issues/1199) Fixed excessive memory allocation for OpenCL threads with low intensity.
+- [#1212](https://github.com/xmrig/xmrig/issues/1212) Fixed low RandomX performance after fast algorithm switching.
+
+# v4.2.0-beta
+- [#1202](https://github.com/xmrig/xmrig/issues/1202) Fixed algorithm verification in donate strategy.
+- Added per pool option `coin` with single possible value `monero` for pools without algorithm negotiation, for upcoming Monero fork.
+- Added config option `cpu/max-threads-hint` and command line option `--cpu-max-threads-hint`.
+
+# v4.1.0-beta
+- **OpenCL backend disabled by default.**
+- [#1183](https://github.com/xmrig/xmrig/issues/1183) Fixed compatibility with systemd.
+- [#1185](https://github.com/xmrig/xmrig/pull/1185) Added JIT compiler for RandomX on ARMv8.
+- Improved API endpoint `GET /2/backends` and added support for this endpoint to [workers.xmrig.info](http://workers.xmrig.info).
+- Added command line option `--no-cpu` to disable CPU backend.
+- Added OpenCL specific command line options: `--opencl`, `--opencl-devices`, `--opencl-platform`, `--opencl-loader` and `--opencl-no-cache`.
+- Removed command line option `--http-enabled`, HTTP API enabled automatically if any other `--http-*` option provided.
+
+# v4.0.1-beta
+- [#1177](https://github.com/xmrig/xmrig/issues/1177) Fixed compatibility with old AMD drivers.
+- [#1180](https://github.com/xmrig/xmrig/issues/1180) Fixed possible duplicated shares after algorithm switching.
+- Added support for case if not all backend threads successfully started.
+- Fixed wrong config file permissions after write (only gcc builds on recent Windows 10 affected).
+
+# v4.0.0-beta
+- [#1172](https://github.com/xmrig/xmrig/issues/1172) **Added OpenCL mining backend.**
+  - [#268](https://github.com/xmrig/xmrig-amd/pull/268) [#270](https://github.com/xmrig/xmrig-amd/pull/270) [#271](https://github.com/xmrig/xmrig-amd/pull/271) [#273](https://github.com/xmrig/xmrig-amd/pull/273) [#274](https://github.com/xmrig/xmrig-amd/pull/274) [#1171](https://github.com/xmrig/xmrig/pull/1171) Added RandomX support for OpenCL, thanks [@SChernykh](https://github.com/SChernykh).
+- Algorithm `cn/wow` removed, as no longer alive. 
+
+# v3.2.0
+- Added per pool option `coin` with single possible value `monero` for pools without algorithm negotiation, for upcoming Monero fork.
+- [#1183](https://github.com/xmrig/xmrig/issues/1183) Fixed compatibility with systemd.
+
+# v3.1.3
+- [#1180](https://github.com/xmrig/xmrig/issues/1180) Fixed possible duplicated shares after algorithm switching.
+- Fixed wrong config file permissions after write (only gcc builds on recent Windows 10 affected).
+
+# v3.1.2
+- Many RandomX optimizations and fixes.
+  - [#1132](https://github.com/xmrig/xmrig/issues/1132) Fixed build on CentOS 7.
+  - [#1163](https://github.com/xmrig/xmrig/pull/1163) Optimized soft AES code, up to +30% hashrate on CPU without AES support and other optimizations.
+  - [#1166](https://github.com/xmrig/xmrig/pull/1166) Fixed crash when initialize dataset with big threads count (eg 272).
+  - [#1168](https://github.com/xmrig/xmrig/pull/1168) Optimized loading from scratchpad.
+- [#1128](https://github.com/xmrig/xmrig/issues/1128) Fixed CMake 2.8 compatibility.
+
+# v3.1.1
+- [#1133](https://github.com/xmrig/xmrig/issues/1133) Fixed syslog regression.
+- [#1138](https://github.com/xmrig/xmrig/issues/1138) Fixed multiple network bugs.
+- [#1141](https://github.com/xmrig/xmrig/issues/1141) Fixed log in background mode.
+- [#1142](https://github.com/xmrig/xmrig/pull/1142) RandomX hashrate improved by 0.5-1.5% depending on variant and CPU.
+- [#1146](https://github.com/xmrig/xmrig/pull/1146) Fixed race condition in RandomX thread init.
+- [#1148](https://github.com/xmrig/xmrig/pull/1148) Fixed, on Linux linker marking entire executable as having an executable stack.
+- Fixed, for Argon2 algorithms command line options like `--threads` was ignored.
+- Fixed command line options for single pool, free order allowed again.
+
+# v3.1.0
+- [#1107](https://github.com/xmrig/xmrig/issues/1107#issuecomment-522235892) Added Argon2 algorithm family: `argon2/chukwa` and `argon2/wrkz`.
+
+# v3.0.0
+- **[#1111](https://github.com/xmrig/xmrig/pull/1111) Added RandomX (`rx/test`) algorithm for testing and benchmarking.**
+- **[#1036](https://github.com/xmrig/xmrig/pull/1036) Added RandomWOW (`rx/wow`) algorithm for [Wownero](http://wownero.org/).**
+- **[#1050](https://github.com/xmrig/xmrig/pull/1050) Added RandomXL (`rx/loki`) algorithm for [Loki](https://loki.network/).**
+- **[#1077](https://github.com/xmrig/xmrig/issues/1077) Added NUMA support via hwloc**.
+- **Added flexible [multi algorithm](doc/CPU.md) configuration.**
+- **Added unlimited switching between incompatible algorithms, all mining options can be changed in runtime.**
+- [#257](https://github.com/xmrig/xmrig-nvidia/pull/257) New logging subsystem, file and syslog now always without colors.
+- [#314](https://github.com/xmrig/xmrig-proxy/issues/314) Added donate over proxy feature.
+- [#1007](https://github.com/xmrig/xmrig/issues/1007) Old HTTP API backend based on libmicrohttpd, replaced to custom HTTP server (libuv + http_parser).
+- [#1010](https://github.com/xmrig/xmrig/pull/1010#issuecomment-482632107) Added daemon support (solo mining).
+- [#1066](https://github.com/xmrig/xmrig/issues/1066#issuecomment-518080529) Added error message if pool not ready for RandomX.
+- [#1105](https://github.com/xmrig/xmrig/issues/1105) Improved auto configuration for `cn-pico` algorithm.
+- Added commands `pause` and `resume` via JSON RPC 2.0 API (`POST /json_rpc`).
+- Added command line option `--export-topology` for export hwloc topology to a XML file.
+- Breaked backward compatibility with previous configs and command line, `variant` option replaced to `algo`, global option `algo` removed, all CPU related settings moved to `cpu` object.
+- Options `av`, `safe` and `max-cpu-usage` removed.
+- Algorithm `cn/msr` renamed to `cn/fast`.
+- Algorithm `cn/xtl` removed.
+- API endpoint `GET /1/threads` replaced to `GET /2/backends`.
+- Added global uptime and extended connection information in API.
+- API now return current algorithm.
+
 # v2.99.6-beta
 - Added commands `pause` and `resume` via JSON RPC 2.0 API (`POST /json_rpc`).
 - Fixed autoconfig regression (since 2.99.5), mostly `rx/wow` was affected by this bug.
--- a/doc/CPU.md
+++ b/doc/CPU.md
@@ -94,3 +94,9 @@ Enable/configure or disable ASM optimizations. Possible values: `true`, `false`,

 #### `argon2-impl` (since v3.1.0)
 Allows overriding the automatically detected Argon2 implementation. This option was added mostly for debugging purposes; the default value `null` means autodetect. Other possible values: `"x86_64"`, `"SSE2"`, `"SSSE3"`, `"XOP"`, `"AVX2"`, `"AVX-512F"`. Manual selection has no safeguards: if your CPU does not support the required instructions, the miner will crash.
+
+#### `max-threads-hint` (since v4.2.0)
+Hint for autoconfig: the maximum CPU thread count, as a percentage. See [CPU_MAX_USAGE.md](CPU_MAX_USAGE.md).
+
+#### `memory-pool` (since v4.3.0)
+Use a continuous, persistent memory block for mining threads, useful for preserving the huge pages allocation while switching algorithms. The default value is `false` (feature disabled); other possible values are `true` or a specific count of 2 MB huge pages.
--- a/doc/CPU_MAX_USAGE.md
+++ b/doc/CPU_MAX_USAGE.md
@@ -0,0 +1,26 @@
+# Maximum CPU usage
+
+Please read this document carefully: the `max-threads-hint` option (previously known as `max-cpu-usage`) is the most confusing option in the miner, surrounded by many myths and legends.
+This option is just a hint for automatic configuration and can't precisely define CPU usage.
+
+### Option definition
+#### Config file:
+```json
+{
+    ...
+    "cpu": {
+        "max-threads-hint": 100,
+        ...
+    },
+    ...
+}
+```
+
+#### Command line
+`--cpu-max-threads-hint 100`
+
+### Known issues and usage
+
+* This option has no effect if the miner has already generated a CPU configuration; to prevent config generation use `"autosave":false,`.
+* Only the thread count can be changed. For a 1 core CPU this option has no effect, for a 2 core CPU only 2 values are possible, 50% and 100%, and for 4 cores: 25%, 50%, 75%, 100%, etc.
+* Your CPU may be limited by other factors, e.g. cache.
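
The granularity described in the list above can be illustrated with a short Python sketch. It assumes the hint is simply scaled by the core count and rounded, with at least one thread always kept. The miner's real autoconfig may weigh other factors such as cache, so the functions `threads_for_hint` and `effective_percentages` are hypothetical illustrations, not the miner's code.

```python
def threads_for_hint(cores, hint_percent):
    """Map a max-threads-hint percentage to a whole number of threads."""
    if not 1 <= hint_percent <= 100:
        raise ValueError("hint must be between 1 and 100")
    # Assumption: round to the nearest whole thread, never below one.
    threads = (cores * hint_percent + 50) // 100
    return max(1, min(cores, threads))


def effective_percentages(cores):
    """List the distinct CPU shares that are actually reachable."""
    shares = {threads_for_hint(cores, hint) * 100 // cores
              for hint in range(1, 101)}
    return sorted(shares)


if __name__ == "__main__":
    for cores in (1, 2, 4):
        print(cores, effective_percentages(cores))
```

Under these assumptions a 4 core CPU can only land on 25%, 50%, 75% or 100%, whatever hint value is given.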
--- a/doc/PERSISTENT_OPTIONS.md
+++ b/doc/PERSISTENT_OPTIONS.md
@@ -0,0 +1,9 @@
+# Persistent options
+
+The options in the list below can't be changed at runtime by watching the config file or via the API.
+
+* `background`
+* `donate-level`
+* `cpu/argon2-impl`
+* `opencl/loader`
+* `opencl/platform`
--- a/doc/build/CMAKE_OPTIONS.md
+++ b/doc/build/CMAKE_OPTIONS.md
@@ -21,6 +21,8 @@ This feature add external dependency to libhwloc (1.10.0+) (except MSVC builds).
 * **`-DWITH_TLS=OFF`** disable SSL/TLS support (secure connections to pool). This feature adds an external dependency on OpenSSL.
 * **`-DWITH_ASM=OFF`** disable assembly optimizations for modern CryptoNight algorithms.
 * **`-DWITH_EMBEDDED_CONFIG=ON`** Enable [embedded](https://github.com/xmrig/xmrig/issues/957) config support.
+* **`-DWITH_OPENCL=OFF`** Disable OpenCL backend.
+* **`-DWITH_CUDA=OFF`** Disable CUDA backend.

 ## Debug options

--- a/doc/topology/AMD_Opteron_6344_x2_N4_win7_2_0_4_bug.xml
+++ b/doc/topology/AMD_Opteron_6344_x2_N4_win7_2_0_4_bug.xml
@ -0,0 +1,262 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE topology SYSTEM "hwloc2.dtd">
+<topology version="2.0">
+  <object type="Machine" os_index="0" cpuset="0x00ffffff" complete_cpuset="0x00ffffff" allowed_cpuset="0x00ffffff" nodeset="0x0000000f" complete_nodeset="0x0000000f" allowed_nodeset="0x0000000f" gp_index="1">
+    <info name="Backend" value="Windows"/>
+    <info name="hwlocVersion" value="2.0.4"/>
+    <object type="Package" cpuset="0x00000fff" complete_cpuset="0x00000fff" nodeset="0x00000003" complete_nodeset="0x00000003" gp_index="36">
+      <info name="CPUVendor" value="AuthenticAMD"/>
+      <info name="CPUFamilyNumber" value="21"/>
+      <info name="CPUModelNumber" value="2"/>
+      <info name="CPUModel" value="AMD Opteron(tm) Processor 6344                 "/>
+      <info name="CPUStepping" value="0"/>
+      <object type="L3Cache" cpuset="0x0000003f" complete_cpuset="0x0000003f" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="20" cache_size="12582912" depth="3" cache_linesize="64" cache_associativity="1" cache_type="0">
+        <info name="Inclusive" value="0"/>
+        <object type="NUMANode" os_index="0" cpuset="0x0000003f" complete_cpuset="0x0000003f" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="80" local_memory="7009357824">
+          <page_type size="4096" count="0"/>
+        </object>
+        <object type="L2Cache" cpuset="0x00000001" complete_cpuset="0x00000001" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="4" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000001" complete_cpuset="0x00000001" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="3" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00000001" complete_cpuset="0x00000001" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="2">
+              <object type="PU" os_index="0" cpuset="0x00000001" complete_cpuset="0x00000001" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="85"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00000002" complete_cpuset="0x00000002" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="7" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000002" complete_cpuset="0x00000002" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="6" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00000002" complete_cpuset="0x00000002" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="5">
+              <object type="PU" os_index="1" cpuset="0x00000002" complete_cpuset="0x00000002" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="86"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00000004" complete_cpuset="0x00000004" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="10" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000004" complete_cpuset="0x00000004" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="9" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00000004" complete_cpuset="0x00000004" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="8">
+              <object type="PU" os_index="2" cpuset="0x00000004" complete_cpuset="0x00000004" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="87"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00000008" complete_cpuset="0x00000008" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="13" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000008" complete_cpuset="0x00000008" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="12" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00000008" complete_cpuset="0x00000008" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="11">
+              <object type="PU" os_index="3" cpuset="0x00000008" complete_cpuset="0x00000008" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="88"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00000010" complete_cpuset="0x00000010" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="16" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000010" complete_cpuset="0x00000010" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="15" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00000010" complete_cpuset="0x00000010" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="14">
+              <object type="PU" os_index="4" cpuset="0x00000010" complete_cpuset="0x00000010" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="89"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00000020" complete_cpuset="0x00000020" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="19" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000020" complete_cpuset="0x00000020" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="18" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00000020" complete_cpuset="0x00000020" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="17">
+              <object type="PU" os_index="5" cpuset="0x00000020" complete_cpuset="0x00000020" nodeset="0x00000001" complete_nodeset="0x00000001" gp_index="90"/>
+            </object>
+          </object>
+        </object>
+      </object>
+      <object type="L3Cache" cpuset="0x00000fc0" complete_cpuset="0x00000fc0" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="40" cache_size="12582912" depth="3" cache_linesize="64" cache_associativity="1" cache_type="0">
+        <info name="Inclusive" value="0"/>
+        <object type="NUMANode" os_index="1" cpuset="0x00000fc0" complete_cpuset="0x00000fc0" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="81" local_memory="8018194432">
+          <page_type size="4096" count="0"/>
+        </object>
+        <object type="L2Cache" cpuset="0x00000040" complete_cpuset="0x00000040" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="23" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000040" complete_cpuset="0x00000040" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="22" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00000040" complete_cpuset="0x00000040" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="21">
+              <object type="PU" os_index="6" cpuset="0x00000040" complete_cpuset="0x00000040" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="91"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00000080" complete_cpuset="0x00000080" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="26" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000080" complete_cpuset="0x00000080" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="25" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00000080" complete_cpuset="0x00000080" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="24">
+              <object type="PU" os_index="7" cpuset="0x00000080" complete_cpuset="0x00000080" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="92"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00000100" complete_cpuset="0x00000100" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="29" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000100" complete_cpuset="0x00000100" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="28" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00000100" complete_cpuset="0x00000100" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="27">
+              <object type="PU" os_index="8" cpuset="0x00000100" complete_cpuset="0x00000100" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="93"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00000200" complete_cpuset="0x00000200" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="32" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000200" complete_cpuset="0x00000200" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="31" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00000200" complete_cpuset="0x00000200" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="30">
+              <object type="PU" os_index="9" cpuset="0x00000200" complete_cpuset="0x00000200" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="94"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00000400" complete_cpuset="0x00000400" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="35" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000400" complete_cpuset="0x00000400" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="34" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00000400" complete_cpuset="0x00000400" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="33">
+              <object type="PU" os_index="10" cpuset="0x00000400" complete_cpuset="0x00000400" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="95"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00000800" complete_cpuset="0x00000800" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="39" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000800" complete_cpuset="0x00000800" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="38" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00000800" complete_cpuset="0x00000800" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="37">
+              <object type="PU" os_index="11" cpuset="0x00000800" complete_cpuset="0x00000800" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="96"/>
+            </object>
+          </object>
+        </object>
+      </object>
+    </object>
+    <object type="Package" cpuset="0x00fff000" complete_cpuset="0x00fff000" nodeset="0x0000000c" complete_nodeset="0x0000000c" gp_index="75">
+      <info name="CPUVendor" value="AuthenticAMD"/>
+      <info name="CPUFamilyNumber" value="21"/>
+      <info name="CPUModelNumber" value="2"/>
+      <info name="CPUModel" value="AMD Opteron(tm) Processor 6344                 "/>
+      <info name="CPUStepping" value="0"/>
+      <object type="L3Cache" cpuset="0x0003f000" complete_cpuset="0x0003f000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="59" cache_size="12582912" depth="3" cache_linesize="64" cache_associativity="1" cache_type="0">
+        <info name="Inclusive" value="0"/>
+        <object type="NUMANode" os_index="2" cpuset="0x0003f000" complete_cpuset="0x0003f000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="82" local_memory="8035020800">
+          <page_type size="4096" count="0"/>
+        </object>
+        <object type="L2Cache" cpuset="0x00001000" complete_cpuset="0x00001000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="43" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00001000" complete_cpuset="0x00001000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="42" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00001000" complete_cpuset="0x00001000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="41">
+              <object type="PU" os_index="12" cpuset="0x00001000" complete_cpuset="0x00001000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="97"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00002000" complete_cpuset="0x00002000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="46" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00002000" complete_cpuset="0x00002000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="45" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00002000" complete_cpuset="0x00002000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="44">
+              <object type="PU" os_index="13" cpuset="0x00002000" complete_cpuset="0x00002000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="98"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00004000" complete_cpuset="0x00004000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="49" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00004000" complete_cpuset="0x00004000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="48" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00004000" complete_cpuset="0x00004000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="47">
+              <object type="PU" os_index="14" cpuset="0x00004000" complete_cpuset="0x00004000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="99"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00008000" complete_cpuset="0x00008000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="52" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00008000" complete_cpuset="0x00008000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="51" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00008000" complete_cpuset="0x00008000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="50">
+              <object type="PU" os_index="15" cpuset="0x00008000" complete_cpuset="0x00008000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="100"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00010000" complete_cpuset="0x00010000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="55" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00010000" complete_cpuset="0x00010000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="54" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00010000" complete_cpuset="0x00010000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="53">
+              <object type="PU" os_index="16" cpuset="0x00010000" complete_cpuset="0x00010000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="101"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00020000" complete_cpuset="0x00020000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="58" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00020000" complete_cpuset="0x00020000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="57" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00020000" complete_cpuset="0x00020000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="56">
+              <object type="PU" os_index="17" cpuset="0x00020000" complete_cpuset="0x00020000" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="102"/>
+            </object>
+          </object>
+        </object>
+      </object>
+      <object type="L3Cache" cpuset="0x00fc0000" complete_cpuset="0x00fc0000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="79" cache_size="12582912" depth="3" cache_linesize="64" cache_associativity="1" cache_type="0">
+        <info name="Inclusive" value="0"/>
+        <object type="NUMANode" os_index="3" cpuset="0x00fc0000" complete_cpuset="0x00fc0000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="83" local_memory="8097337344">
+          <page_type size="4096" count="0"/>
+        </object>
+        <object type="L2Cache" cpuset="0x00040000" complete_cpuset="0x00040000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="62" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00040000" complete_cpuset="0x00040000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="61" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00040000" complete_cpuset="0x00040000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="60">
+              <object type="PU" os_index="18" cpuset="0x00040000" complete_cpuset="0x00040000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="103"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00080000" complete_cpuset="0x00080000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="65" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00080000" complete_cpuset="0x00080000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="64" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00080000" complete_cpuset="0x00080000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="63">
+              <object type="PU" os_index="19" cpuset="0x00080000" complete_cpuset="0x00080000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="104"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00100000" complete_cpuset="0x00100000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="68" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00100000" complete_cpuset="0x00100000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="67" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00100000" complete_cpuset="0x00100000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="66">
+              <object type="PU" os_index="20" cpuset="0x00100000" complete_cpuset="0x00100000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="105"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00200000" complete_cpuset="0x00200000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="71" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00200000" complete_cpuset="0x00200000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="70" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00200000" complete_cpuset="0x00200000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="69">
+              <object type="PU" os_index="21" cpuset="0x00200000" complete_cpuset="0x00200000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="106"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00400000" complete_cpuset="0x00400000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="74" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00400000" complete_cpuset="0x00400000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="73" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00400000" complete_cpuset="0x00400000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="72">
+              <object type="PU" os_index="22" cpuset="0x00400000" complete_cpuset="0x00400000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="107"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00800000" complete_cpuset="0x00800000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="78" cache_size="2097152" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00800000" complete_cpuset="0x00800000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="77" cache_size="16384" depth="1" cache_linesize="64" cache_associativity="4" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" cpuset="0x00800000" complete_cpuset="0x00800000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="76">
+              <object type="PU" os_index="23" cpuset="0x00800000" complete_cpuset="0x00800000" nodeset="0x00000008" complete_nodeset="0x00000008" gp_index="108"/>
+            </object>
+          </object>
+        </object>
+      </object>
+    </object>
+  </object>
+</topology>
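
Reviewer note (not part of the commit): the `cpuset` attributes in the topology file above are hexadecimal bitmasks in which bit *n* stands for the PU with `os_index` *n*. A minimal Python sketch for decoding them follows. The function name `cpuset_to_indices` is hypothetical, and the sketch assumes a single hexadecimal word, as in every mask in this file; wider masks are out of scope.

```python
def cpuset_to_indices(mask: str) -> list[int]:
    """Return the bit positions set in a single-word hex cpuset such as "0x00000fc0".

    Hypothetical helper for reading the XML above; it is not part of hwloc or the miner.
    """
    value = int(mask, 16)  # base 16 accepts the "0x" prefix
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


if __name__ == "__main__":
    # Second L3Cache of the first package: PUs 6 through 11
    print(cpuset_to_indices("0x00000fc0"))
    # Machine object: 24 PUs in total
    print(len(cpuset_to_indices("0x00ffffff")))
```

The decoded sets agree with the `os_index` values of the PU objects nested under each cache object.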
--- a/doc/topology/AMD_Opteron_6348_x4_N8_linux_2_0_4.xml
+++ b/doc/topology/AMD_Opteron_6348_x4_N8_linux_2_0_4.xml
@ -0,0 +1,399 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE topology SYSTEM "hwloc2.dtd">
+<topology version="2.0">
+  <object type="Machine" os_index="0" cpuset="0xffffffff" complete_cpuset="0xffffffff" allowed_cpuset="0xffffffff" nodeset="0x00000066" complete_nodeset="0x000000ff" allowed_nodeset="0x00000066" gp_index="1">
+    <info name="DMIProductName" value="H8QG6"/>
+    <info name="DMIProductVersion" value="1234567890"/>
+    <info name="DMIProductSerial" value="1234567890"/>
+    <info name="DMIProductUUID" value="0"/>
+    <info name="DMIBoardVendor" value="Supermicro"/>
+    <info name="DMIBoardName" value="H8QG6"/>
+    <info name="DMIBoardVersion" value="1234567890"/>
+    <info name="DMIBoardSerial" value="0"/>
+    <info name="DMIBoardAssetTag" value="1234567890"/>
+    <info name="DMIChassisVendor" value="Supermicro"/>
+    <info name="DMIChassisType" value="3"/>
+    <info name="DMIChassisVersion" value="1234567890"/>
+    <info name="DMIChassisSerial" value="1234567890."/>
+    <info name="DMIChassisAssetTag" value="1234567890"/>
+    <info name="DMIBIOSVendor" value="American Megatrends Inc."/>
+    <info name="DMIBIOSVersion" value="080016 "/>
+    <info name="DMIBIOSDate" value="10/11/2010"/>
+    <info name="DMISysVendor" value="Supermicro"/>
+    <info name="Backend" value="Linux"/>
+    <info name="LinuxCgroup" value="/"/>
+    <info name="OSName" value="Linux"/>
+    <info name="OSRelease" value="4.15.0-20-generic"/>
+    <info name="OSVersion" value="#21-Ubuntu SMP Tue Apr 24 06:16:15 UTC 2018"/>
+    <info name="HostName" value="host"/>
+    <info name="Architecture" value="x86_64"/>
+    <info name="hwlocVersion" value="2.0.4"/>
+    <info name="ProcessName" value="xmrig"/>
+    <object type="Package" os_index="0" cpuset="0x000000ff" complete_cpuset="0x000000ff" nodeset="0x00000002" complete_nodeset="0x00000003" gp_index="2">
+      <info name="CPUVendor" value="AuthenticAMD"/>
+      <info name="CPUFamilyNumber" value="16"/>
+      <info name="CPUModelNumber" value="9"/>
+      <info name="CPUModel" value="AMD Opteron(tm) Processor 6128"/>
+      <info name="CPUStepping" value="1"/>
+      <object type="L3Cache" cpuset="0x0000000f" complete_cpuset="0x0000000f" nodeset="0x0" complete_nodeset="0x00000001" gp_index="7" cache_size="5240832" depth="3" cache_linesize="64" cache_associativity="48" cache_type="0">
+        <info name="Inclusive" value="0"/>
+        <object type="L2Cache" cpuset="0x00000001" complete_cpuset="0x00000001" nodeset="0x0" complete_nodeset="0x00000001" gp_index="6" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000001" complete_cpuset="0x00000001" nodeset="0x0" complete_nodeset="0x00000001" gp_index="5" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="0" cpuset="0x00000001" complete_cpuset="0x00000001" nodeset="0x0" complete_nodeset="0x00000001" gp_index="3">
+              <object type="PU" os_index="0" cpuset="0x00000001" complete_cpuset="0x00000001" nodeset="0x0" complete_nodeset="0x00000001" gp_index="4"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00000002" complete_cpuset="0x00000002" nodeset="0x0" complete_nodeset="0x00000001" gp_index="11" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000002" complete_cpuset="0x00000002" nodeset="0x0" complete_nodeset="0x00000001" gp_index="10" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="1" cpuset="0x00000002" complete_cpuset="0x00000002" nodeset="0x0" complete_nodeset="0x00000001" gp_index="8">
+              <object type="PU" os_index="1" cpuset="0x00000002" complete_cpuset="0x00000002" nodeset="0x0" complete_nodeset="0x00000001" gp_index="9"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00000004" complete_cpuset="0x00000004" nodeset="0x0" complete_nodeset="0x00000001" gp_index="15" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000004" complete_cpuset="0x00000004" nodeset="0x0" complete_nodeset="0x00000001" gp_index="14" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="2" cpuset="0x00000004" complete_cpuset="0x00000004" nodeset="0x0" complete_nodeset="0x00000001" gp_index="12">
+              <object type="PU" os_index="2" cpuset="0x00000004" complete_cpuset="0x00000004" nodeset="0x0" complete_nodeset="0x00000001" gp_index="13"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00000008" complete_cpuset="0x00000008" nodeset="0x0" complete_nodeset="0x00000001" gp_index="19" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000008" complete_cpuset="0x00000008" nodeset="0x0" complete_nodeset="0x00000001" gp_index="18" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="3" cpuset="0x00000008" complete_cpuset="0x00000008" nodeset="0x0" complete_nodeset="0x00000001" gp_index="16">
+              <object type="PU" os_index="3" cpuset="0x00000008" complete_cpuset="0x00000008" nodeset="0x0" complete_nodeset="0x00000001" gp_index="17"/>
+            </object>
+          </object>
+        </object>
+      </object>
+      <object type="L3Cache" cpuset="0x000000f0" complete_cpuset="0x000000f0" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="24" cache_size="5240832" depth="3" cache_linesize="64" cache_associativity="48" cache_type="0">
+        <info name="Inclusive" value="0"/>
+        <object type="NUMANode" os_index="1" cpuset="0x000000f0" complete_cpuset="0x000000f0" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="143" local_memory="4156817408">
+          <page_type size="4096" count="854592"/>
+          <page_type size="2097152" count="313"/>
+          <page_type size="1073741824" count="0"/>
+        </object>
+        <object type="L2Cache" cpuset="0x00000010" complete_cpuset="0x00000010" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="23" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000010" complete_cpuset="0x00000010" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="22" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="0" cpuset="0x00000010" complete_cpuset="0x00000010" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="20">
+              <object type="PU" os_index="4" cpuset="0x00000010" complete_cpuset="0x00000010" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="21"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00000020" complete_cpuset="0x00000020" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="28" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000020" complete_cpuset="0x00000020" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="27" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="1" cpuset="0x00000020" complete_cpuset="0x00000020" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="25">
+              <object type="PU" os_index="5" cpuset="0x00000020" complete_cpuset="0x00000020" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="26"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00000040" complete_cpuset="0x00000040" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="32" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000040" complete_cpuset="0x00000040" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="31" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="2" cpuset="0x00000040" complete_cpuset="0x00000040" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="29">
+              <object type="PU" os_index="6" cpuset="0x00000040" complete_cpuset="0x00000040" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="30"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00000080" complete_cpuset="0x00000080" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="36" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000080" complete_cpuset="0x00000080" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="35" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="3" cpuset="0x00000080" complete_cpuset="0x00000080" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="33">
+              <object type="PU" os_index="7" cpuset="0x00000080" complete_cpuset="0x00000080" nodeset="0x00000002" complete_nodeset="0x00000002" gp_index="34"/>
+            </object>
+          </object>
+        </object>
+      </object>
+    </object>
+    <object type="Package" os_index="1" cpuset="0x0000ff00" complete_cpuset="0x0000ff00" nodeset="0x00000004" complete_nodeset="0x0000000c" gp_index="37">
+      <info name="CPUVendor" value="AuthenticAMD"/>
+      <info name="CPUFamilyNumber" value="16"/>
+      <info name="CPUModelNumber" value="9"/>
+      <info name="CPUModel" value="AMD Opteron(tm) Processor 6128"/>
+      <info name="CPUStepping" value="1"/>
+      <object type="L3Cache" cpuset="0x00000f00" complete_cpuset="0x00000f00" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="42" cache_size="5240832" depth="3" cache_linesize="64" cache_associativity="48" cache_type="0">
+        <info name="Inclusive" value="0"/>
+        <object type="NUMANode" os_index="2" cpuset="0x00000f00" complete_cpuset="0x00000f00" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="144" local_memory="4204060672">
+          <page_type size="4096" count="866126"/>
+          <page_type size="2097152" count="313"/>
+          <page_type size="1073741824" count="0"/>
+        </object>
+        <object type="L2Cache" cpuset="0x00000100" complete_cpuset="0x00000100" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="41" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000100" complete_cpuset="0x00000100" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="40" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="0" cpuset="0x00000100" complete_cpuset="0x00000100" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="38">
+              <object type="PU" os_index="8" cpuset="0x00000100" complete_cpuset="0x00000100" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="39"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00000200" complete_cpuset="0x00000200" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="46" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000200" complete_cpuset="0x00000200" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="45" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="1" cpuset="0x00000200" complete_cpuset="0x00000200" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="43">
+              <object type="PU" os_index="9" cpuset="0x00000200" complete_cpuset="0x00000200" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="44"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00000400" complete_cpuset="0x00000400" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="50" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000400" complete_cpuset="0x00000400" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="49" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="2" cpuset="0x00000400" complete_cpuset="0x00000400" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="47">
+              <object type="PU" os_index="10" cpuset="0x00000400" complete_cpuset="0x00000400" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="48"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00000800" complete_cpuset="0x00000800" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="54" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00000800" complete_cpuset="0x00000800" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="53" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="3" cpuset="0x00000800" complete_cpuset="0x00000800" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="51">
+              <object type="PU" os_index="11" cpuset="0x00000800" complete_cpuset="0x00000800" nodeset="0x00000004" complete_nodeset="0x00000004" gp_index="52"/>
+            </object>
+          </object>
+        </object>
+      </object>
+      <object type="L3Cache" cpuset="0x0000f000" complete_cpuset="0x0000f000" nodeset="0x0" complete_nodeset="0x00000008" gp_index="59" cache_size="5240832" depth="3" cache_linesize="64" cache_associativity="48" cache_type="0">
+        <info name="Inclusive" value="0"/>
+        <object type="L2Cache" cpuset="0x00001000" complete_cpuset="0x00001000" nodeset="0x0" complete_nodeset="0x00000008" gp_index="58" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00001000" complete_cpuset="0x00001000" nodeset="0x0" complete_nodeset="0x00000008" gp_index="57" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="0" cpuset="0x00001000" complete_cpuset="0x00001000" nodeset="0x0" complete_nodeset="0x00000008" gp_index="55">
+              <object type="PU" os_index="12" cpuset="0x00001000" complete_cpuset="0x00001000" nodeset="0x0" complete_nodeset="0x00000008" gp_index="56"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00002000" complete_cpuset="0x00002000" nodeset="0x0" complete_nodeset="0x00000008" gp_index="63" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00002000" complete_cpuset="0x00002000" nodeset="0x0" complete_nodeset="0x00000008" gp_index="62" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="1" cpuset="0x00002000" complete_cpuset="0x00002000" nodeset="0x0" complete_nodeset="0x00000008" gp_index="60">
+              <object type="PU" os_index="13" cpuset="0x00002000" complete_cpuset="0x00002000" nodeset="0x0" complete_nodeset="0x00000008" gp_index="61"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00004000" complete_cpuset="0x00004000" nodeset="0x0" complete_nodeset="0x00000008" gp_index="67" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00004000" complete_cpuset="0x00004000" nodeset="0x0" complete_nodeset="0x00000008" gp_index="66" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="2" cpuset="0x00004000" complete_cpuset="0x00004000" nodeset="0x0" complete_nodeset="0x00000008" gp_index="64">
+              <object type="PU" os_index="14" cpuset="0x00004000" complete_cpuset="0x00004000" nodeset="0x0" complete_nodeset="0x00000008" gp_index="65"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00008000" complete_cpuset="0x00008000" nodeset="0x0" complete_nodeset="0x00000008" gp_index="71" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00008000" complete_cpuset="0x00008000" nodeset="0x0" complete_nodeset="0x00000008" gp_index="70" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="3" cpuset="0x00008000" complete_cpuset="0x00008000" nodeset="0x0" complete_nodeset="0x00000008" gp_index="68">
+              <object type="PU" os_index="15" cpuset="0x00008000" complete_cpuset="0x00008000" nodeset="0x0" complete_nodeset="0x00000008" gp_index="69"/>
+            </object>
+          </object>
+        </object>
+      </object>
+    </object>
+    <object type="Package" os_index="2" cpuset="0x00ff0000" complete_cpuset="0x00ff0000" nodeset="0x00000020" complete_nodeset="0x00000030" gp_index="72">
+      <info name="CPUVendor" value="AuthenticAMD"/>
+      <info name="CPUFamilyNumber" value="16"/>
+      <info name="CPUModelNumber" value="9"/>
+      <info name="CPUModel" value="AMD Opteron(tm) Processor 6128"/>
+      <info name="CPUStepping" value="1"/>
+      <object type="L3Cache" cpuset="0x000f0000" complete_cpuset="0x000f0000" nodeset="0x0" complete_nodeset="0x00000010" gp_index="77" cache_size="5240832" depth="3" cache_linesize="64" cache_associativity="48" cache_type="0">
+        <info name="Inclusive" value="0"/>
+        <object type="L2Cache" cpuset="0x00010000" complete_cpuset="0x00010000" nodeset="0x0" complete_nodeset="0x00000010" gp_index="76" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00010000" complete_cpuset="0x00010000" nodeset="0x0" complete_nodeset="0x00000010" gp_index="75" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="0" cpuset="0x00010000" complete_cpuset="0x00010000" nodeset="0x0" complete_nodeset="0x00000010" gp_index="73">
+              <object type="PU" os_index="16" cpuset="0x00010000" complete_cpuset="0x00010000" nodeset="0x0" complete_nodeset="0x00000010" gp_index="74"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00020000" complete_cpuset="0x00020000" nodeset="0x0" complete_nodeset="0x00000010" gp_index="81" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00020000" complete_cpuset="0x00020000" nodeset="0x0" complete_nodeset="0x00000010" gp_index="80" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="1" cpuset="0x00020000" complete_cpuset="0x00020000" nodeset="0x0" complete_nodeset="0x00000010" gp_index="78">
+              <object type="PU" os_index="17" cpuset="0x00020000" complete_cpuset="0x00020000" nodeset="0x0" complete_nodeset="0x00000010" gp_index="79"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00040000" complete_cpuset="0x00040000" nodeset="0x0" complete_nodeset="0x00000010" gp_index="85" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00040000" complete_cpuset="0x00040000" nodeset="0x0" complete_nodeset="0x00000010" gp_index="84" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="2" cpuset="0x00040000" complete_cpuset="0x00040000" nodeset="0x0" complete_nodeset="0x00000010" gp_index="82">
+              <object type="PU" os_index="18" cpuset="0x00040000" complete_cpuset="0x00040000" nodeset="0x0" complete_nodeset="0x00000010" gp_index="83"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00080000" complete_cpuset="0x00080000" nodeset="0x0" complete_nodeset="0x00000010" gp_index="89" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00080000" complete_cpuset="0x00080000" nodeset="0x0" complete_nodeset="0x00000010" gp_index="88" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="3" cpuset="0x00080000" complete_cpuset="0x00080000" nodeset="0x0" complete_nodeset="0x00000010" gp_index="86">
+              <object type="PU" os_index="19" cpuset="0x00080000" complete_cpuset="0x00080000" nodeset="0x0" complete_nodeset="0x00000010" gp_index="87"/>
+            </object>
+          </object>
+        </object>
+      </object>
+      <object type="L3Cache" cpuset="0x00f00000" complete_cpuset="0x00f00000" nodeset="0x00000020" complete_nodeset="0x00000020" gp_index="94" cache_size="5240832" depth="3" cache_linesize="64" cache_associativity="48" cache_type="0">
+        <info name="Inclusive" value="0"/>
+        <object type="NUMANode" os_index="5" cpuset="0x00f00000" complete_cpuset="0x00f00000" nodeset="0x00000020" complete_nodeset="0x00000020" gp_index="147" local_memory="4226170880">
+          <page_type size="4096" count="872036"/>
+          <page_type size="2097152" count="312"/>
+          <page_type size="1073741824" count="0"/>
+        </object>
+        <object type="L2Cache" cpuset="0x00100000" complete_cpuset="0x00100000" nodeset="0x00000020" complete_nodeset="0x00000020" gp_index="93" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00100000" complete_cpuset="0x00100000" nodeset="0x00000020" complete_nodeset="0x00000020" gp_index="92" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="0" cpuset="0x00100000" complete_cpuset="0x00100000" nodeset="0x00000020" complete_nodeset="0x00000020" gp_index="90">
+              <object type="PU" os_index="20" cpuset="0x00100000" complete_cpuset="0x00100000" nodeset="0x00000020" complete_nodeset="0x00000020" gp_index="91"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00200000" complete_cpuset="0x00200000" nodeset="0x00000020" complete_nodeset="0x00000020" gp_index="98" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00200000" complete_cpuset="0x00200000" nodeset="0x00000020" complete_nodeset="0x00000020" gp_index="97" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="1" cpuset="0x00200000" complete_cpuset="0x00200000" nodeset="0x00000020" complete_nodeset="0x00000020" gp_index="95">
+              <object type="PU" os_index="21" cpuset="0x00200000" complete_cpuset="0x00200000" nodeset="0x00000020" complete_nodeset="0x00000020" gp_index="96"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00400000" complete_cpuset="0x00400000" nodeset="0x00000020" complete_nodeset="0x00000020" gp_index="102" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00400000" complete_cpuset="0x00400000" nodeset="0x00000020" complete_nodeset="0x00000020" gp_index="101" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="2" cpuset="0x00400000" complete_cpuset="0x00400000" nodeset="0x00000020" complete_nodeset="0x00000020" gp_index="99">
+              <object type="PU" os_index="22" cpuset="0x00400000" complete_cpuset="0x00400000" nodeset="0x00000020" complete_nodeset="0x00000020" gp_index="100"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x00800000" complete_cpuset="0x00800000" nodeset="0x00000020" complete_nodeset="0x00000020" gp_index="106" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x00800000" complete_cpuset="0x00800000" nodeset="0x00000020" complete_nodeset="0x00000020" gp_index="105" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="3" cpuset="0x00800000" complete_cpuset="0x00800000" nodeset="0x00000020" complete_nodeset="0x00000020" gp_index="103">
+              <object type="PU" os_index="23" cpuset="0x00800000" complete_cpuset="0x00800000" nodeset="0x00000020" complete_nodeset="0x00000020" gp_index="104"/>
+            </object>
+          </object>
+        </object>
+      </object>
+    </object>
+    <object type="Package" os_index="3" cpuset="0xff000000" complete_cpuset="0xff000000" nodeset="0x00000040" complete_nodeset="0x000000c0" gp_index="107">
+      <info name="CPUVendor" value="AuthenticAMD"/>
+      <info name="CPUFamilyNumber" value="16"/>
+      <info name="CPUModelNumber" value="9"/>
+      <info name="CPUModel" value="AMD Opteron(tm) Processor 6128"/>
+      <info name="CPUStepping" value="1"/>
+      <object type="L3Cache" cpuset="0x0f000000" complete_cpuset="0x0f000000" nodeset="0x00000040" complete_nodeset="0x00000040" gp_index="112" cache_size="5240832" depth="3" cache_linesize="64" cache_associativity="48" cache_type="0">
+        <info name="Inclusive" value="0"/>
+        <object type="NUMANode" os_index="6" cpuset="0x0f000000" complete_cpuset="0x0f000000" nodeset="0x00000040" complete_nodeset="0x00000040" gp_index="148" local_memory="4221870080">
+          <page_type size="4096" count="870986"/>
+          <page_type size="2097152" count="312"/>
+          <page_type size="1073741824" count="0"/>
+        </object>
+        <object type="L2Cache" cpuset="0x01000000" complete_cpuset="0x01000000" nodeset="0x00000040" complete_nodeset="0x00000040" gp_index="111" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x01000000" complete_cpuset="0x01000000" nodeset="0x00000040" complete_nodeset="0x00000040" gp_index="110" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="0" cpuset="0x01000000" complete_cpuset="0x01000000" nodeset="0x00000040" complete_nodeset="0x00000040" gp_index="108">
+              <object type="PU" os_index="24" cpuset="0x01000000" complete_cpuset="0x01000000" nodeset="0x00000040" complete_nodeset="0x00000040" gp_index="109"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x02000000" complete_cpuset="0x02000000" nodeset="0x00000040" complete_nodeset="0x00000040" gp_index="116" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x02000000" complete_cpuset="0x02000000" nodeset="0x00000040" complete_nodeset="0x00000040" gp_index="115" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="1" cpuset="0x02000000" complete_cpuset="0x02000000" nodeset="0x00000040" complete_nodeset="0x00000040" gp_index="113">
+              <object type="PU" os_index="25" cpuset="0x02000000" complete_cpuset="0x02000000" nodeset="0x00000040" complete_nodeset="0x00000040" gp_index="114"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x04000000" complete_cpuset="0x04000000" nodeset="0x00000040" complete_nodeset="0x00000040" gp_index="120" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x04000000" complete_cpuset="0x04000000" nodeset="0x00000040" complete_nodeset="0x00000040" gp_index="119" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="2" cpuset="0x04000000" complete_cpuset="0x04000000" nodeset="0x00000040" complete_nodeset="0x00000040" gp_index="117">
+              <object type="PU" os_index="26" cpuset="0x04000000" complete_cpuset="0x04000000" nodeset="0x00000040" complete_nodeset="0x00000040" gp_index="118"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x08000000" complete_cpuset="0x08000000" nodeset="0x00000040" complete_nodeset="0x00000040" gp_index="124" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x08000000" complete_cpuset="0x08000000" nodeset="0x00000040" complete_nodeset="0x00000040" gp_index="123" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="3" cpuset="0x08000000" complete_cpuset="0x08000000" nodeset="0x00000040" complete_nodeset="0x00000040" gp_index="121">
+              <object type="PU" os_index="27" cpuset="0x08000000" complete_cpuset="0x08000000" nodeset="0x00000040" complete_nodeset="0x00000040" gp_index="122"/>
+            </object>
+          </object>
+        </object>
+      </object>
+      <object type="L3Cache" cpuset="0xf0000000" complete_cpuset="0xf0000000" nodeset="0x0" complete_nodeset="0x00000080" gp_index="129" cache_size="5240832" depth="3" cache_linesize="64" cache_associativity="48" cache_type="0">
+        <info name="Inclusive" value="0"/>
+        <object type="L2Cache" cpuset="0x10000000" complete_cpuset="0x10000000" nodeset="0x0" complete_nodeset="0x00000080" gp_index="128" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x10000000" complete_cpuset="0x10000000" nodeset="0x0" complete_nodeset="0x00000080" gp_index="127" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="0" cpuset="0x10000000" complete_cpuset="0x10000000" nodeset="0x0" complete_nodeset="0x00000080" gp_index="125">
+              <object type="PU" os_index="28" cpuset="0x10000000" complete_cpuset="0x10000000" nodeset="0x0" complete_nodeset="0x00000080" gp_index="126"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x20000000" complete_cpuset="0x20000000" nodeset="0x0" complete_nodeset="0x00000080" gp_index="133" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x20000000" complete_cpuset="0x20000000" nodeset="0x0" complete_nodeset="0x00000080" gp_index="132" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="1" cpuset="0x20000000" complete_cpuset="0x20000000" nodeset="0x0" complete_nodeset="0x00000080" gp_index="130">
+              <object type="PU" os_index="29" cpuset="0x20000000" complete_cpuset="0x20000000" nodeset="0x0" complete_nodeset="0x00000080" gp_index="131"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x40000000" complete_cpuset="0x40000000" nodeset="0x0" complete_nodeset="0x00000080" gp_index="137" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x40000000" complete_cpuset="0x40000000" nodeset="0x0" complete_nodeset="0x00000080" gp_index="136" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="2" cpuset="0x40000000" complete_cpuset="0x40000000" nodeset="0x0" complete_nodeset="0x00000080" gp_index="134">
+              <object type="PU" os_index="30" cpuset="0x40000000" complete_cpuset="0x40000000" nodeset="0x0" complete_nodeset="0x00000080" gp_index="135"/>
+            </object>
+          </object>
+        </object>
+        <object type="L2Cache" cpuset="0x80000000" complete_cpuset="0x80000000" nodeset="0x0" complete_nodeset="0x00000080" gp_index="141" cache_size="524288" depth="2" cache_linesize="64" cache_associativity="16" cache_type="0">
+          <info name="Inclusive" value="0"/>
+          <object type="L1Cache" cpuset="0x80000000" complete_cpuset="0x80000000" nodeset="0x0" complete_nodeset="0x00000080" gp_index="140" cache_size="65536" depth="1" cache_linesize="64" cache_associativity="2" cache_type="1">
+            <info name="Inclusive" value="0"/>
+            <object type="Core" os_index="3" cpuset="0x80000000" complete_cpuset="0x80000000" nodeset="0x0" complete_nodeset="0x00000080" gp_index="138">
+              <object type="PU" os_index="31" cpuset="0x80000000" complete_cpuset="0x80000000" nodeset="0x0" complete_nodeset="0x00000080" gp_index="139"/>
+            </object>
+          </object>
+        </object>
+      </object>
+    </object>
+  </object>
+  <distances2 type="NUMANode" nbobjs="4" kind="5" indexing="os">
+    <indexes length="8">1 2 5 6 </indexes>
+    <u64values length="30">10 22 16 22 22 10 22 16 16 22 </u64values>
+    <u64values length="18">10 22 22 16 22 10 </u64values>
+  </distances2>
+</topology>
+
--- a/package.json
+++ b/package.json
@ -0,0 +1,23 @@
+{
+  "name": "xmrig",
+  "version": "3.0.0",
+  "description": "RandomX, CryptoNight and Argon2 miner",
+  "main": "index.js",
+  "directories": {
+    "doc": "doc"
+  },
+  "scripts": {
+    "build": "node scripts/generate_cl.js"
+  },
+  "repository": {
+    "type": "git",
+    "url": "git+https://github.com/xmrig/xmrig.git"
+  },
+  "keywords": [],
+  "author": "",
+  "license": "GPLv3",
+  "bugs": {
+    "url": "https://github.com/xmrig/xmrig/issues"
+  },
+  "homepage": "https://github.com/xmrig/xmrig#readme"
+}
--- a/scripts/generate_cl.js
+++ b/scripts/generate_cl.js
@ -0,0 +1,87 @@
+#!/usr/bin/env node
+
+'use strict';
+
+const fs = require('fs');
+const path = require('path');
+const { text2h, text2h_bundle, addIncludes } = require('./js/opencl');
+const { opencl_minify } = require('./js/opencl_minify');
+const cwd = process.cwd();
+
+
+function cn()
+{
+    const cn = opencl_minify(addIncludes('cryptonight.cl', [
+        'algorithm.cl',
+        'wolf-aes.cl',
+        'wolf-skein.cl',
+        'jh.cl',
+        'blake256.cl',
+        'groestl256.cl',
+        'fast_int_math_v2.cl',
+        'fast_div_heavy.cl',
+        'keccak.cl'
+    ]));
+
+    // fs.writeFileSync('cryptonight_gen.cl', cn);
+    fs.writeFileSync('cryptonight_cl.h', text2h(cn, 'xmrig', 'cryptonight_cl'));
+}
+
+
+function cn_r()
+{
+    const items = {};
+
+    items.cryptonight_r_defines_cl = opencl_minify(addIncludes('cryptonight_r_defines.cl', [ 'wolf-aes.cl' ]));
+    items.cryptonight_r_cl         = opencl_minify(fs.readFileSync('cryptonight_r.cl', 'utf8'));
+
+    // for (let key in items) {
+    //      fs.writeFileSync(key + '_gen.cl', items[key]);
+    // }
+
+    fs.writeFileSync('cryptonight_r_cl.h', text2h_bundle('xmrig', items));
+}
+
+
+function cn_gpu()
+{
+    const cn_gpu = opencl_minify(addIncludes('cryptonight_gpu.cl', [ 'wolf-aes.cl', 'keccak.cl' ]));
+
+    // fs.writeFileSync('cryptonight_gpu_gen.cl', cn_gpu);
+    fs.writeFileSync('cryptonight_gpu_cl.h', text2h(cn_gpu, 'xmrig', 'cryptonight_gpu_cl'));
+}
+
+
+function rx()
+{
+    let rx = addIncludes('randomx.cl', [
+        '../cn/algorithm.cl',
+        'randomx_constants_monero.h',
+        'randomx_constants_wow.h',
+        'randomx_constants_loki.h',
+        'randomx_constants_arqma.h',
+        'aes.cl',
+        'blake2b.cl',
+        'randomx_vm.cl',
+        'randomx_jit.cl'
+    ]);
+
+    rx = rx.replace(/(\t| )*#include "fillAes1Rx4.cl"/g, () => fs.readFileSync('fillAes1Rx4.cl', 'utf8'));
+    rx = rx.replace(/(\t| )*#include "blake2b_double_block.cl"/g, () => fs.readFileSync('blake2b_double_block.cl', 'utf8'));
+    rx = opencl_minify(rx);
+
+    // fs.writeFileSync('randomx_gen.cl', rx);
+    fs.writeFileSync('randomx_cl.h', text2h(rx, 'xmrig', 'randomx_cl'));
+}
+
+
+process.chdir(path.resolve('src/backend/opencl/cl/cn'));
+
+cn();
+cn_r();
+cn_gpu();
+
+process.chdir(cwd);
+process.chdir(path.resolve('src/backend/opencl/cl/rx'));
+
+rx();
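Review note on the `replace` calls used for include inlining: when the replacement argument of `String.prototype.replace` is a string, sequences such as `$&` and `$$` are expanded instead of being copied, whereas the return value of a function replacer is inserted verbatim. The sketch below illustrates the difference. The helper `inlineInclude`, the file name `a.cl`, and the `TAG` macro are made up for the example and are not part of this commit.

```javascript
'use strict';

// Hypothetical helper: put `content` in place of `#include "name"`.
function inlineInclude(input, name, content) {
    // A function replacer returns its result verbatim, so `$&`, `$'` or `$$`
    // inside `content` are not treated as replacement patterns.
    return input.replace(`#include "${name}"`, () => content);
}

const source  = '#include "a.cl"\n__kernel void k() {}\n';
const content = '#define TAG "$&"\n';   // contains a `$&` replacement pattern

// String replacement: `$&` expands to the matched `#include "a.cl"` text.
const naive = source.replace('#include "a.cl"', content);

// Function replacement: `content` is copied unchanged.
const safe = inlineInclude(source, 'a.cl', content);

console.log(JSON.stringify(naive));
console.log(JSON.stringify(safe));
```

OpenCL sources rarely contain `$`, so this only matters if an included file ever carries one, for example inside a string literal or a comment that survives until inlining.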
--- a/scripts/js/opencl.js
+++ b/scripts/js/opencl.js
@ -0,0 +1,91 @@
+'use strict';
+
+const fs = require('fs');
+
+
+function bin2h(buf, namespace, name)
+{
+    const size = buf.byteLength;
+    let out    = `#pragma once\n\nnamespace ${namespace} {\n\nstatic unsigned char ${name}[${size}] = {\n    `;
+
+    let b = 32;
+    for (let i = 0; i < size; i++) {
+        out += `0x${buf.readUInt8(i).toString(16).padStart(2, '0')}${size - i > 1 ? ',' : ''}`;
+
+        if (--b === 0) {
+            b = 32;
+            out += '\n    ';
+        }
+    }
+
+    out += `\n};\n\n} // namespace ${namespace}\n`;
+
+    return out;
+}
+
+
+function text2h_internal(text, name)
+{
+    const buf  = Buffer.from(text);
+    const size = buf.byteLength;
+    let out    = `\nstatic char ${name}[${size + 1}] = {\n    `;
+
+    let b = 32;
+    for (let i = 0; i < size; i++) {
+        out += `0x${buf.readUInt8(i).toString(16).padStart(2, '0')},`;
+
+        if (--b === 0) {
+            b = 32;
+            out += '\n    ';
+        }
+    }
+
+    out += '0x00';
+
+    out += '\n};\n';
+
+    return out;
+}
+
+
+function text2h(text, namespace, name)
+{
+    return `#pragma once\n\nnamespace ${namespace} {\n` + text2h_internal(text, name) + `\n} // namespace ${namespace}\n`;
+}
+
+
+function text2h_bundle(namespace, items)
+{
+    let out = `#pragma once\n\nnamespace ${namespace} {\n`;
+
+    for (let key in items) {
+        out += text2h_internal(items[key], key);
+    }
+
+    return out + `\n} // namespace ${namespace}\n`;
+}
+
+
+function addInclude(input, name)
+{
+    return input.replace(`#include "${name}"`, () => fs.readFileSync(name, 'utf8'));
+}
+
+
+function addIncludes(inputFileName, names)
+{
+    let data = fs.readFileSync(inputFileName, 'utf8');
+
+    for (let name of names) {
+        data = addInclude(data, name);
+    }
+
+    return data;
+}
+
+
+module.exports.bin2h         = bin2h;
+module.exports.text2h        = text2h;
+module.exports.text2h_bundle = text2h_bundle;
+module.exports.addInclude    = addInclude;
+module.exports.addIncludes   = addIncludes;
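Review note: the generated headers store each kernel as a `char` array of hex bytes followed by a `0x00` terminator, which is why the declared size is `size + 1`. A minimal sketch of that byte layout follows. The helper `hexTokens` and the array name `demo` are hypothetical and only mirror the loop in `text2h_internal`; line wrapping every 32 bytes is left out.

```javascript
'use strict';

// Hypothetical helper, not part of the commit: returns the hex tokens a
// text2h-style generator would emit for `text`, including the trailing
// NUL terminator.
function hexTokens(text) {
    const buf    = Buffer.from(text);   // UTF-8 bytes of the source text
    const tokens = [];

    for (const byte of buf) {
        tokens.push('0x' + byte.toString(16).padStart(2, '0'));
    }

    tokens.push('0x00');                // terminator, hence size + 1 in the declaration

    return tokens;
}

const tokens = hexTokens('ab\n');

console.log(`static char demo[${tokens.length}] = { ${tokens.join(',')} };`);
```

Because the array is NUL-terminated, the C++ side can presumably hand it to any API that expects a C string without tracking the length separately.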
--- a/scripts/js/opencl_minify.js
+++ b/scripts/js/opencl_minify.js
@ -0,0 +1,50 @@
+'use strict';
+
+function opencl_minify(input)
+{
+    let out = input.replace(/\/\*[\s\S]*?\*\/|\/\/.*$/gm, '');  // comments
+    out = out.replace(/^#\s+/gm, '#');        // macros with spaces
+    out = out.replace(/\n{2,}/g, '\n');       // empty lines
+    out = out.replace(/^\s+/gm, '');          // leading whitespace
+    out = out.replace(/ {2,}/g, ' ');         // extra whitespace
+
+    let array = out.split('\n').map(line => {
+        if (line[0] === '#') {
+            return line;
+        }
+
+        line = line.replace(/, /g, ',');
+        line = line.replace(/ \? /g, '?');
+        line = line.replace(/ : /g, ':');
+        line = line.replace(/ = /g, '=');
+        line = line.replace(/ != /g, '!=');
+        line = line.replace(/ >= /g, '>=');
+        line = line.replace(/ <= /g, '<=');
+        line = line.replace(/ == /g, '==');
+        line = line.replace(/ \+= /g, '+=');
+        line = line.replace(/ -= /g, '-=');
+        line = line.replace(/ \|= /g, '|=');
+        line = line.replace(/ \| /g, '|');
+        line = line.replace(/ \|\| /g, '||');
+        line = line.replace(/ & /g, '&');
+        line = line.replace(/ && /g, '&&');
+        line = line.replace(/ > /g, '>');
+        line = line.replace(/ < /g, '<');
+        line = line.replace(/ \+ /g, '+');
+        line = line.replace(/ - /g, '-');
+        line = line.replace(/ \* /g, '*');
+        line = line.replace(/ \^ /g, '^');
+        line = line.replace(/ & /g, '&');
+        line = line.replace(/ \/ /g, '/');
+        line = line.replace(/ << /g, '<<');
+        line = line.replace(/ >> /g, '>>');
+        line = line.replace(/if \(/g, 'if(');
+
+        return line;
+    });
+
+    return array.join('\n');
+}
+
+
+module.exports.opencl_minify = opencl_minify;
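Review note: the order of the passes in `opencl_minify` matters, since comments are stripped first, then preprocessor lines are normalised, and only non-preprocessor lines get operator spacing removed. The sketch below reproduces the five whitespace and comment passes plus two of the operator rules on a made-up input. `minifySketch` and the sample text are assumptions for illustration, not the full function from this commit.

```javascript
'use strict';

// Reduced, hypothetical version of the minifier: same first five passes,
// but only the ` = ` and ` + ` operator rules.
function minifySketch(input) {
    let out = input.replace(/\/\*[\s\S]*?\*\/|\/\/.*$/gm, '');  // comments
    out = out.replace(/^#\s+/gm, '#');                           // "#  define" -> "#define"
    out = out.replace(/\n{2,}/g, '\n');                          // empty lines
    out = out.replace(/^\s+/gm, '');                             // leading whitespace
    out = out.replace(/ {2,}/g, ' ');                            // runs of spaces

    return out.split('\n').map(line => {
        if (line[0] === '#') {
            return line;                                         // preprocessor lines stay as-is
        }

        return line.replace(/ = /g, '=').replace(/ \+ /g, '+');
    }).join('\n');
}

const sample = [
    '/* header */',
    '#  define N 4',
    '',
    '    uint x = a + b;   // sum',
    ''
].join('\n');

const minified = minifySketch(sample);

console.log(JSON.stringify(minified));
```

Note that whitespace left behind by a stripped end-of-line comment is collapsed to a single space but not removed, because no pass trims trailing whitespace.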
--- a/src/3rdparty/CL/LICENSE
+++ b/src/3rdparty/CL/LICENSE
@ -0,0 +1,25 @@
+Copyright (c) 2008-2015 The Khronos Group Inc.
+
+Permission is hereby granted, free of charge, to any person obtaining a
+copy of this software and/or associated documentation files (the
+"Materials"), to deal in the Materials without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Materials, and to
+permit persons to whom the Materials are furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be included
+in all copies or substantial portions of the Materials.
+
+MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
+KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
+SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
+   https://www.khronos.org/registry/
+
+THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
--- a/src/3rdparty/CL/README.md
+++ b/src/3rdparty/CL/README.md
@ -0,0 +1,50 @@
+# OpenCL<sup>TM</sup> API Headers
+
+This repository contains C language headers for the OpenCL API.
+
+The authoritative public repository for these headers is located at:
+
+https://github.com/KhronosGroup/OpenCL-Headers
+
+Issues, proposed fixes for issues, and other suggested changes should be
+created using GitHub.
+
+## Branch Structure
+
+The OpenCL API headers in this repository are Unified headers and are designed
+to work with all released OpenCL versions.  This differs from previous OpenCL
+API headers, where version-specific API headers either existed in separate
+branches, or in separate folders in a branch.
+
+## Compiling for a Specific OpenCL Version
+
+By default, the OpenCL API headers in this repository are for the latest
+OpenCL version (currently OpenCL 2.2).  To use these API headers to target
+a different OpenCL version, an application may `#define` the preprocessor
+value `CL_TARGET_OPENCL_VERSION` before including the OpenCL API headers.
+The `CL_TARGET_OPENCL_VERSION` is a three digit decimal value representing
+the OpenCL API version.
+
+For example, to enforce usage of no more than the OpenCL 1.2 APIs, you may
+include the OpenCL API headers as follows:
+
+```
+#define CL_TARGET_OPENCL_VERSION 120
+#include <CL/opencl.h>
+```
+
+## Directory Structure
+
+```
+README.md               This file
+LICENSE                 Source license for the OpenCL API headers
+CL/                     Unified OpenCL API headers tree
+```
+
+## License
+
+See [LICENSE](LICENSE).
+
+---
+
+OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.
--- a/src/3rdparty/CL/cl.h
+++ b/src/3rdparty/CL/cl.h
--- a/src/3rdparty/CL/cl_d3d10.h
+++ b/src/3rdparty/CL/cl_d3d10.h
@ -0,0 +1,131 @@
+/**********************************************************************************
+ * Copyright (c) 2008-2015 The Khronos Group Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and/or associated documentation files (the
+ * "Materials"), to deal in the Materials without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sublicense, and/or sell copies of the Materials, and to
+ * permit persons to whom the Materials are furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Materials.
+ *
+ * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
+ * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
+ * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
+ *    https://www.khronos.org/registry/
+ *
+ * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+ * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
+ **********************************************************************************/
+
+/* $Revision: 11708 $ on $Date: 2010-06-13 23:36:24 -0700 (Sun, 13 Jun 2010) $ */
+
+#ifndef __OPENCL_CL_D3D10_H
+#define __OPENCL_CL_D3D10_H
+
+#include <d3d10.h>
+#include <CL/cl.h>
+#include <CL/cl_platform.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/******************************************************************************
+ * cl_khr_d3d10_sharing                                                       */
+#define cl_khr_d3d10_sharing 1
+
+typedef cl_uint cl_d3d10_device_source_khr;
+typedef cl_uint cl_d3d10_device_set_khr;
+
+/******************************************************************************/
+
+/* Error Codes */
+#define CL_INVALID_D3D10_DEVICE_KHR                  -1002
+#define CL_INVALID_D3D10_RESOURCE_KHR                -1003
+#define CL_D3D10_RESOURCE_ALREADY_ACQUIRED_KHR       -1004
+#define CL_D3D10_RESOURCE_NOT_ACQUIRED_KHR           -1005
+
+/* cl_d3d10_device_source_khr */
+#define CL_D3D10_DEVICE_KHR                          0x4010
+#define CL_D3D10_DXGI_ADAPTER_KHR                    0x4011
+
+/* cl_d3d10_device_set_khr */
+#define CL_PREFERRED_DEVICES_FOR_D3D10_KHR           0x4012
+#define CL_ALL_DEVICES_FOR_D3D10_KHR                 0x4013
+
+/* cl_context_info */
+#define CL_CONTEXT_D3D10_DEVICE_KHR                  0x4014
+#define CL_CONTEXT_D3D10_PREFER_SHARED_RESOURCES_KHR 0x402C
+
+/* cl_mem_info */
+#define CL_MEM_D3D10_RESOURCE_KHR                    0x4015
+
+/* cl_image_info */
+#define CL_IMAGE_D3D10_SUBRESOURCE_KHR               0x4016
+
+/* cl_command_type */
+#define CL_COMMAND_ACQUIRE_D3D10_OBJECTS_KHR         0x4017
+#define CL_COMMAND_RELEASE_D3D10_OBJECTS_KHR         0x4018
+
+/******************************************************************************/
+
+typedef CL_API_ENTRY cl_int (CL_API_CALL *clGetDeviceIDsFromD3D10KHR_fn)(
+    cl_platform_id             platform,
+    cl_d3d10_device_source_khr d3d_device_source,
+    void *                     d3d_object,
+    cl_d3d10_device_set_khr    d3d_device_set,
+    cl_uint                    num_entries,
+    cl_device_id *             devices,
+    cl_uint *                  num_devices) CL_API_SUFFIX__VERSION_1_0;
+
+typedef CL_API_ENTRY cl_mem (CL_API_CALL *clCreateFromD3D10BufferKHR_fn)(
+    cl_context     context,
+    cl_mem_flags   flags,
+    ID3D10Buffer * resource,
+    cl_int *       errcode_ret) CL_API_SUFFIX__VERSION_1_0;
+
+typedef CL_API_ENTRY cl_mem (CL_API_CALL *clCreateFromD3D10Texture2DKHR_fn)(
+    cl_context        context,
+    cl_mem_flags      flags,
+    ID3D10Texture2D * resource,
+    UINT              subresource,
+    cl_int *          errcode_ret) CL_API_SUFFIX__VERSION_1_0;
+
+typedef CL_API_ENTRY cl_mem (CL_API_CALL *clCreateFromD3D10Texture3DKHR_fn)(
+    cl_context        context,
+    cl_mem_flags      flags,
+    ID3D10Texture3D * resource,
+    UINT              subresource,
+    cl_int *          errcode_ret) CL_API_SUFFIX__VERSION_1_0;
+
+typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueAcquireD3D10ObjectsKHR_fn)(
+    cl_command_queue command_queue,
+    cl_uint          num_objects,
+    const cl_mem *   mem_objects,
+    cl_uint          num_events_in_wait_list,
+    const cl_event * event_wait_list,
+    cl_event *       event) CL_API_SUFFIX__VERSION_1_0;
+
+typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueReleaseD3D10ObjectsKHR_fn)(
+    cl_command_queue command_queue,
+    cl_uint          num_objects,
+    const cl_mem *   mem_objects,
+    cl_uint          num_events_in_wait_list,
+    const cl_event * event_wait_list,
+    cl_event *       event) CL_API_SUFFIX__VERSION_1_0;
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif  /* __OPENCL_CL_D3D10_H */
+
--- a/src/3rdparty/CL/cl_d3d11.h
+++ b/src/3rdparty/CL/cl_d3d11.h
@ -0,0 +1,131 @@
+/**********************************************************************************
+ * Copyright (c) 2008-2015 The Khronos Group Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and/or associated documentation files (the
+ * "Materials"), to deal in the Materials without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sublicense, and/or sell copies of the Materials, and to
+ * permit persons to whom the Materials are furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Materials.
+ *
+ * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
+ * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
+ * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
+ *    https://www.khronos.org/registry/
+ *
+ * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+ * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
+ **********************************************************************************/
+
+/* $Revision: 11708 $ on $Date: 2010-06-13 23:36:24 -0700 (Sun, 13 Jun 2010) $ */
+
+#ifndef __OPENCL_CL_D3D11_H
+#define __OPENCL_CL_D3D11_H
+
+#include <d3d11.h>
+#include <CL/cl.h>
+#include <CL/cl_platform.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/******************************************************************************
+ * cl_khr_d3d11_sharing                                                       */
+#define cl_khr_d3d11_sharing 1
+
+typedef cl_uint cl_d3d11_device_source_khr;
+typedef cl_uint cl_d3d11_device_set_khr;
+
+/******************************************************************************/
+
+/* Error Codes */
+#define CL_INVALID_D3D11_DEVICE_KHR                  -1006
+#define CL_INVALID_D3D11_RESOURCE_KHR                -1007
+#define CL_D3D11_RESOURCE_ALREADY_ACQUIRED_KHR       -1008
+#define CL_D3D11_RESOURCE_NOT_ACQUIRED_KHR           -1009
+
+/* cl_d3d11_device_source */
+#define CL_D3D11_DEVICE_KHR                          0x4019
+#define CL_D3D11_DXGI_ADAPTER_KHR                    0x401A
+
+/* cl_d3d11_device_set */
+#define CL_PREFERRED_DEVICES_FOR_D3D11_KHR           0x401B
+#define CL_ALL_DEVICES_FOR_D3D11_KHR                 0x401C
+
+/* cl_context_info */
+#define CL_CONTEXT_D3D11_DEVICE_KHR                  0x401D
+#define CL_CONTEXT_D3D11_PREFER_SHARED_RESOURCES_KHR 0x402D
+
+/* cl_mem_info */
+#define CL_MEM_D3D11_RESOURCE_KHR                    0x401E
+
+/* cl_image_info */
+#define CL_IMAGE_D3D11_SUBRESOURCE_KHR               0x401F
+
+/* cl_command_type */
+#define CL_COMMAND_ACQUIRE_D3D11_OBJECTS_KHR         0x4020
+#define CL_COMMAND_RELEASE_D3D11_OBJECTS_KHR         0x4021
+
+/******************************************************************************/
+
+typedef CL_API_ENTRY cl_int (CL_API_CALL *clGetDeviceIDsFromD3D11KHR_fn)(
+    cl_platform_id             platform,
+    cl_d3d11_device_source_khr d3d_device_source,
+    void *                     d3d_object,
+    cl_d3d11_device_set_khr    d3d_device_set,
+    cl_uint                    num_entries,
+    cl_device_id *             devices,
+    cl_uint *                  num_devices) CL_API_SUFFIX__VERSION_1_2;
+
+typedef CL_API_ENTRY cl_mem (CL_API_CALL *clCreateFromD3D11BufferKHR_fn)(
+    cl_context     context,
+    cl_mem_flags   flags,
+    ID3D11Buffer * resource,
+    cl_int *       errcode_ret) CL_API_SUFFIX__VERSION_1_2;
+
+typedef CL_API_ENTRY cl_mem (CL_API_CALL *clCreateFromD3D11Texture2DKHR_fn)(
+    cl_context        context,
+    cl_mem_flags      flags,
+    ID3D11Texture2D * resource,
+    UINT              subresource,
+    cl_int *          errcode_ret) CL_API_SUFFIX__VERSION_1_2;
+
+typedef CL_API_ENTRY cl_mem (CL_API_CALL *clCreateFromD3D11Texture3DKHR_fn)(
+    cl_context        context,
+    cl_mem_flags      flags,
+    ID3D11Texture3D * resource,
+    UINT              subresource,
+    cl_int *          errcode_ret) CL_API_SUFFIX__VERSION_1_2;
+
+typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueAcquireD3D11ObjectsKHR_fn)(
+    cl_command_queue command_queue,
+    cl_uint          num_objects,
+    const cl_mem *   mem_objects,
+    cl_uint          num_events_in_wait_list,
+    const cl_event * event_wait_list,
+    cl_event *       event) CL_API_SUFFIX__VERSION_1_2;
+
+typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueReleaseD3D11ObjectsKHR_fn)(
+    cl_command_queue command_queue,
+    cl_uint          num_objects,
+    const cl_mem *   mem_objects,
+    cl_uint          num_events_in_wait_list,
+    const cl_event * event_wait_list,
+    cl_event *       event) CL_API_SUFFIX__VERSION_1_2;
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif  /* __OPENCL_CL_D3D11_H */
+
--- a/src/3rdparty/CL/cl_dx9_media_sharing.h
+++ b/src/3rdparty/CL/cl_dx9_media_sharing.h
@ -0,0 +1,132 @@
+/**********************************************************************************
+ * Copyright (c) 2008-2015 The Khronos Group Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and/or associated documentation files (the
+ * "Materials"), to deal in the Materials without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sublicense, and/or sell copies of the Materials, and to
+ * permit persons to whom the Materials are furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Materials.
+ *
+ * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
+ * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
+ * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
+ *    https://www.khronos.org/registry/
+ *
+ * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+ * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
+ **********************************************************************************/
+
+/* $Revision: 11708 $ on $Date: 2010-06-13 23:36:24 -0700 (Sun, 13 Jun 2010) $ */
+
+#ifndef __OPENCL_CL_DX9_MEDIA_SHARING_H
+#define __OPENCL_CL_DX9_MEDIA_SHARING_H
+
+#include <CL/cl.h>
+#include <CL/cl_platform.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/******************************************************************************/
+/* cl_khr_dx9_media_sharing                                                   */
+#define cl_khr_dx9_media_sharing 1
+
+typedef cl_uint             cl_dx9_media_adapter_type_khr;
+typedef cl_uint             cl_dx9_media_adapter_set_khr;
+
+#if defined(_WIN32)
+#include <d3d9.h>
+typedef struct _cl_dx9_surface_info_khr
+{
+    IDirect3DSurface9 *resource;
+    HANDLE shared_handle;
+} cl_dx9_surface_info_khr;
+#endif
+
+
+/******************************************************************************/
+
+/* Error Codes */
+#define CL_INVALID_DX9_MEDIA_ADAPTER_KHR                -1010
+#define CL_INVALID_DX9_MEDIA_SURFACE_KHR                -1011
+#define CL_DX9_MEDIA_SURFACE_ALREADY_ACQUIRED_KHR       -1012
+#define CL_DX9_MEDIA_SURFACE_NOT_ACQUIRED_KHR           -1013
+
+/* cl_media_adapter_type_khr */
+#define CL_ADAPTER_D3D9_KHR                              0x2020
+#define CL_ADAPTER_D3D9EX_KHR                            0x2021
+#define CL_ADAPTER_DXVA_KHR                              0x2022
+
+/* cl_media_adapter_set_khr */
+#define CL_PREFERRED_DEVICES_FOR_DX9_MEDIA_ADAPTER_KHR   0x2023
+#define CL_ALL_DEVICES_FOR_DX9_MEDIA_ADAPTER_KHR         0x2024
+
+/* cl_context_info */
+#define CL_CONTEXT_ADAPTER_D3D9_KHR                      0x2025
+#define CL_CONTEXT_ADAPTER_D3D9EX_KHR                    0x2026
+#define CL_CONTEXT_ADAPTER_DXVA_KHR                      0x2027
+
+/* cl_mem_info */
+#define CL_MEM_DX9_MEDIA_ADAPTER_TYPE_KHR                0x2028
+#define CL_MEM_DX9_MEDIA_SURFACE_INFO_KHR                0x2029
+
+/* cl_image_info */
+#define CL_IMAGE_DX9_MEDIA_PLANE_KHR                     0x202A
+
+/* cl_command_type */
+#define CL_COMMAND_ACQUIRE_DX9_MEDIA_SURFACES_KHR        0x202B
+#define CL_COMMAND_RELEASE_DX9_MEDIA_SURFACES_KHR        0x202C
+
+/******************************************************************************/
+
+typedef CL_API_ENTRY cl_int (CL_API_CALL *clGetDeviceIDsFromDX9MediaAdapterKHR_fn)(
+    cl_platform_id                   platform,
+    cl_uint                          num_media_adapters,
+    cl_dx9_media_adapter_type_khr *  media_adapter_type,
+    void *                           media_adapters,
+    cl_dx9_media_adapter_set_khr     media_adapter_set,
+    cl_uint                          num_entries,
+    cl_device_id *                   devices,
+    cl_uint *                        num_devices) CL_API_SUFFIX__VERSION_1_2;
+
+typedef CL_API_ENTRY cl_mem (CL_API_CALL *clCreateFromDX9MediaSurfaceKHR_fn)(
+    cl_context                    context,
+    cl_mem_flags                  flags,
+    cl_dx9_media_adapter_type_khr adapter_type,
+    void *                        surface_info,
+    cl_uint                       plane,
+    cl_int *                      errcode_ret) CL_API_SUFFIX__VERSION_1_2;
+
+typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueAcquireDX9MediaSurfacesKHR_fn)(
+    cl_command_queue command_queue,
+    cl_uint          num_objects,
+    const cl_mem *   mem_objects,
+    cl_uint          num_events_in_wait_list,
+    const cl_event * event_wait_list,
+    cl_event *       event) CL_API_SUFFIX__VERSION_1_2;
+
+typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueReleaseDX9MediaSurfacesKHR_fn)(
+    cl_command_queue command_queue,
+    cl_uint          num_objects,
+    const cl_mem *   mem_objects,
+    cl_uint          num_events_in_wait_list,
+    const cl_event * event_wait_list,
+    cl_event *       event) CL_API_SUFFIX__VERSION_1_2;
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif  /* __OPENCL_CL_DX9_MEDIA_SHARING_H */
+
--- a/src/3rdparty/CL/cl_dx9_media_sharing_intel.h
+++ b/src/3rdparty/CL/cl_dx9_media_sharing_intel.h
@ -0,0 +1,182 @@
+/**********************************************************************************
+ * Copyright (c) 2008-2019 The Khronos Group Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and/or associated documentation files (the
+ * "Materials"), to deal in the Materials without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sublicense, and/or sell copies of the Materials, and to
+ * permit persons to whom the Materials are furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Materials.
+ *
+ * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
+ * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
+ * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
+ *    https://www.khronos.org/registry/
+ *
+ * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+ * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
+ **********************************************************************************/
+/*****************************************************************************\
+
+Copyright (c) 2013-2019 Intel Corporation All Rights Reserved.
+
+THESE MATERIALS ARE PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INTEL OR ITS
+CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY OR TORT (INCLUDING
+NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THESE
+MATERIALS, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+File Name: cl_dx9_media_sharing_intel.h
+
+Abstract:
+
+Notes:
+
+\*****************************************************************************/
+
+#ifndef __OPENCL_CL_DX9_MEDIA_SHARING_INTEL_H
+#define __OPENCL_CL_DX9_MEDIA_SHARING_INTEL_H
+
+#include <CL/cl.h>
+#include <CL/cl_platform.h>
+#include <d3d9.h>
+#include <dxvahd.h>
+#include <wtypes.h>
+#include <d3d9types.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/***************************************
+* cl_intel_dx9_media_sharing extension *
+****************************************/
+
+#define cl_intel_dx9_media_sharing 1
+
+typedef cl_uint cl_dx9_device_source_intel;
+typedef cl_uint cl_dx9_device_set_intel;
+
+/* error codes */
+#define CL_INVALID_DX9_DEVICE_INTEL                   -1010
+#define CL_INVALID_DX9_RESOURCE_INTEL                 -1011
+#define CL_DX9_RESOURCE_ALREADY_ACQUIRED_INTEL        -1012
+#define CL_DX9_RESOURCE_NOT_ACQUIRED_INTEL            -1013
+
+/* cl_dx9_device_source_intel */
+#define CL_D3D9_DEVICE_INTEL                          0x4022
+#define CL_D3D9EX_DEVICE_INTEL                        0x4070
+#define CL_DXVA_DEVICE_INTEL                          0x4071
+
+/* cl_dx9_device_set_intel */
+#define CL_PREFERRED_DEVICES_FOR_DX9_INTEL            0x4024
+#define CL_ALL_DEVICES_FOR_DX9_INTEL                  0x4025
+
+/* cl_context_info */
+#define CL_CONTEXT_D3D9_DEVICE_INTEL                  0x4026
+#define CL_CONTEXT_D3D9EX_DEVICE_INTEL                0x4072
+#define CL_CONTEXT_DXVA_DEVICE_INTEL                  0x4073
+
+/* cl_mem_info */
+#define CL_MEM_DX9_RESOURCE_INTEL                     0x4027
+#define CL_MEM_DX9_SHARED_HANDLE_INTEL                0x4074
+
+/* cl_image_info */
+#define CL_IMAGE_DX9_PLANE_INTEL                      0x4075
+
+/* cl_command_type */
+#define CL_COMMAND_ACQUIRE_DX9_OBJECTS_INTEL          0x402A
+#define CL_COMMAND_RELEASE_DX9_OBJECTS_INTEL          0x402B
+/******************************************************************************/
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clGetDeviceIDsFromDX9INTEL(
+    cl_platform_id              platform,
+    cl_dx9_device_source_intel  dx9_device_source,
+    void*                       dx9_object,
+    cl_dx9_device_set_intel     dx9_device_set,
+    cl_uint                     num_entries,
+    cl_device_id*               devices,
+    cl_uint*                    num_devices) CL_EXT_SUFFIX__VERSION_1_1;
+
+typedef CL_API_ENTRY cl_int (CL_API_CALL* clGetDeviceIDsFromDX9INTEL_fn)(
+    cl_platform_id              platform,
+    cl_dx9_device_source_intel  dx9_device_source,
+    void*                       dx9_object,
+    cl_dx9_device_set_intel     dx9_device_set,
+    cl_uint                     num_entries,
+    cl_device_id*               devices,
+    cl_uint*                    num_devices) CL_EXT_SUFFIX__VERSION_1_1;
+
+extern CL_API_ENTRY cl_mem CL_API_CALL
+clCreateFromDX9MediaSurfaceINTEL(
+    cl_context                  context,
+    cl_mem_flags                flags,
+    IDirect3DSurface9*          resource,
+    HANDLE                      sharedHandle,
+    UINT                        plane,
+    cl_int*                     errcode_ret) CL_EXT_SUFFIX__VERSION_1_1;
+
+typedef CL_API_ENTRY cl_mem (CL_API_CALL *clCreateFromDX9MediaSurfaceINTEL_fn)(
+    cl_context                  context,
+    cl_mem_flags                flags,
+    IDirect3DSurface9*          resource,
+    HANDLE                      sharedHandle,
+    UINT                        plane,
+    cl_int*                     errcode_ret) CL_EXT_SUFFIX__VERSION_1_1;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clEnqueueAcquireDX9ObjectsINTEL(
+    cl_command_queue            command_queue,
+    cl_uint                     num_objects,
+    const cl_mem*               mem_objects,
+    cl_uint                     num_events_in_wait_list,
+    const cl_event*             event_wait_list,
+    cl_event*                   event) CL_EXT_SUFFIX__VERSION_1_1;
+
+typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueAcquireDX9ObjectsINTEL_fn)(
+    cl_command_queue            command_queue,
+    cl_uint                     num_objects,
+    const cl_mem*               mem_objects,
+    cl_uint                     num_events_in_wait_list,
+    const cl_event*             event_wait_list,
+    cl_event*                   event) CL_EXT_SUFFIX__VERSION_1_1;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clEnqueueReleaseDX9ObjectsINTEL(
+    cl_command_queue            command_queue,
+    cl_uint                     num_objects,
+    cl_mem*                     mem_objects,
+    cl_uint                     num_events_in_wait_list,
+    const cl_event*             event_wait_list,
+    cl_event*                   event) CL_EXT_SUFFIX__VERSION_1_1;
+
+typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueReleaseDX9ObjectsINTEL_fn)(
+    cl_command_queue            command_queue,
+    cl_uint                     num_objects,
+    cl_mem*                     mem_objects,
+    cl_uint                     num_events_in_wait_list,
+    const cl_event*             event_wait_list,
+    cl_event*                   event) CL_EXT_SUFFIX__VERSION_1_1;
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif  /* __OPENCL_CL_DX9_MEDIA_SHARING_INTEL_H */
+
--- a/src/3rdparty/CL/cl_egl.h
+++ b/src/3rdparty/CL/cl_egl.h
@ -0,0 +1,132 @@
+/*******************************************************************************
+ * Copyright (c) 2008-2019 The Khronos Group Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and/or associated documentation files (the
+ * "Materials"), to deal in the Materials without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sublicense, and/or sell copies of the Materials, and to
+ * permit persons to whom the Materials are furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Materials.
+ *
+ * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
+ * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
+ * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
+ *    https://www.khronos.org/registry/
+ *
+ * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+ * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
+ ******************************************************************************/
+
+#ifndef __OPENCL_CL_EGL_H
+#define __OPENCL_CL_EGL_H
+
+#include <CL/cl.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+
+/* Command type for events created with clEnqueueAcquireEGLObjectsKHR */
+#define CL_COMMAND_EGL_FENCE_SYNC_OBJECT_KHR  0x202F
+#define CL_COMMAND_ACQUIRE_EGL_OBJECTS_KHR    0x202D
+#define CL_COMMAND_RELEASE_EGL_OBJECTS_KHR    0x202E
+
+/* Error type for clCreateFromEGLImageKHR */
+#define CL_INVALID_EGL_OBJECT_KHR             -1093
+#define CL_EGL_RESOURCE_NOT_ACQUIRED_KHR      -1092
+
+/* CLeglImageKHR is an opaque handle to an EGLImage */
+typedef void* CLeglImageKHR;
+
+/* CLeglDisplayKHR is an opaque handle to an EGLDisplay */
+typedef void* CLeglDisplayKHR;
+
+/* CLeglSyncKHR is an opaque handle to an EGLSync object */
+typedef void* CLeglSyncKHR;
+
+/* properties passed to clCreateFromEGLImageKHR */
+typedef intptr_t cl_egl_image_properties_khr;
+
+
+#define cl_khr_egl_image 1
+
+extern CL_API_ENTRY cl_mem CL_API_CALL
+clCreateFromEGLImageKHR(cl_context                  context,
+                        CLeglDisplayKHR             egldisplay,
+                        CLeglImageKHR               eglimage,
+                        cl_mem_flags                flags,
+                        const cl_egl_image_properties_khr * properties,
+                        cl_int *                    errcode_ret) CL_API_SUFFIX__VERSION_1_0;
+
+typedef CL_API_ENTRY cl_mem (CL_API_CALL *clCreateFromEGLImageKHR_fn)(
+    cl_context                  context,
+    CLeglDisplayKHR             egldisplay,
+    CLeglImageKHR               eglimage,
+    cl_mem_flags                flags,
+    const cl_egl_image_properties_khr * properties,
+    cl_int *                    errcode_ret);
+
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clEnqueueAcquireEGLObjectsKHR(cl_command_queue command_queue,
+                              cl_uint          num_objects,
+                              const cl_mem *   mem_objects,
+                              cl_uint          num_events_in_wait_list,
+                              const cl_event * event_wait_list,
+                              cl_event *       event) CL_API_SUFFIX__VERSION_1_0;
+
+typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueAcquireEGLObjectsKHR_fn)(
+    cl_command_queue command_queue,
+    cl_uint          num_objects,
+    const cl_mem *   mem_objects,
+    cl_uint          num_events_in_wait_list,
+    const cl_event * event_wait_list,
+    cl_event *       event);
+
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clEnqueueReleaseEGLObjectsKHR(cl_command_queue command_queue,
+                              cl_uint          num_objects,
+                              const cl_mem *   mem_objects,
+                              cl_uint          num_events_in_wait_list,
+                              const cl_event * event_wait_list,
+                              cl_event *       event) CL_API_SUFFIX__VERSION_1_0;
+
+typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueReleaseEGLObjectsKHR_fn)(
+    cl_command_queue command_queue,
+    cl_uint          num_objects,
+    const cl_mem *   mem_objects,
+    cl_uint          num_events_in_wait_list,
+    const cl_event * event_wait_list,
+    cl_event *       event);
+
+
+#define cl_khr_egl_event 1
+
+extern CL_API_ENTRY cl_event CL_API_CALL
+clCreateEventFromEGLSyncKHR(cl_context      context,
+                            CLeglSyncKHR    sync,
+                            CLeglDisplayKHR display,
+                            cl_int *        errcode_ret) CL_API_SUFFIX__VERSION_1_0;
+
+typedef CL_API_ENTRY cl_event (CL_API_CALL *clCreateEventFromEGLSyncKHR_fn)(
+    cl_context      context,
+    CLeglSyncKHR    sync,
+    CLeglDisplayKHR display,
+    cl_int *        errcode_ret);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* __OPENCL_CL_EGL_H */
--- a/src/3rdparty/CL/cl_ext.h
+++ b/src/3rdparty/CL/cl_ext.h
@ -0,0 +1,762 @@
+/*******************************************************************************
+ * Copyright (c) 2008-2019 The Khronos Group Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and/or associated documentation files (the
+ * "Materials"), to deal in the Materials without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sublicense, and/or sell copies of the Materials, and to
+ * permit persons to whom the Materials are furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Materials.
+ *
+ * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
+ * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
+ * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
+ *    https://www.khronos.org/registry/
+ *
+ * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+ * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
+ ******************************************************************************/
+
+/* cl_ext.h contains OpenCL extensions which don't have external */
+/* (OpenGL, D3D) dependencies.                                   */
+
+#ifndef __CL_EXT_H
+#define __CL_EXT_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <CL/cl.h>
+
+/* cl_khr_fp64 extension - no extension #define since it has no functions  */
+/* CL_DEVICE_DOUBLE_FP_CONFIG is defined in cl.h for OpenCL >= 120 */
+
+#if CL_TARGET_OPENCL_VERSION <= 110
+#define CL_DEVICE_DOUBLE_FP_CONFIG                       0x1032
+#endif
+
+/* cl_khr_fp16 extension - no extension #define since it has no functions  */
+#define CL_DEVICE_HALF_FP_CONFIG                    0x1033
+
+/* Memory object destruction
+ *
+ * Apple extension used to manage externally allocated buffers used with cl_mem objects created with CL_MEM_USE_HOST_PTR
+ *
+ * Registers a user callback function that will be called when the memory object is deleted and its resources
+ * freed. Each call to clSetMemObjectDestructorAPPLE registers the specified user callback function on a callback
+ * stack associated with memobj. The registered user callback functions are called in the reverse order in
+ * which they were registered. The user callback functions are called and then the memory object is deleted
+ * and its resources freed. This provides a mechanism for the application (and libraries) using memobj to be
+ * notified when the memory referenced by host_ptr, specified when the memory object is created and used as
+ * the storage bits for the memory object, can be reused or freed.
+ *
+ * The application may not call CL APIs with the cl_mem object passed to pfn_notify.
+ *
+ * Please check for the "cl_APPLE_SetMemObjectDestructor" extension using clGetDeviceInfo(CL_DEVICE_EXTENSIONS)
+ * before using.
+ */
+#define cl_APPLE_SetMemObjectDestructor 1
+cl_int  CL_API_ENTRY clSetMemObjectDestructorAPPLE(  cl_mem memobj,
+                                        void (* pfn_notify)(cl_mem memobj, void * user_data),
+                                        void * user_data)             CL_EXT_SUFFIX__VERSION_1_0;
+
+
+/* Context Logging Functions
+ *
+ * The next three convenience functions are intended to be used as the pfn_notify parameter to clCreateContext().
+ * Please check for the "cl_APPLE_ContextLoggingFunctions" extension using clGetDeviceInfo(CL_DEVICE_EXTENSIONS)
+ * before using.
+ *
+ * clLogMessagesToSystemLogAPPLE forwards all log messages to the Apple System Logger
+ */
+#define cl_APPLE_ContextLoggingFunctions 1
+extern void CL_API_ENTRY clLogMessagesToSystemLogAPPLE(  const char * errstr,
+                                            const void * private_info,
+                                            size_t       cb,
+                                            void *       user_data)  CL_EXT_SUFFIX__VERSION_1_0;
+
+/* clLogMessagesToStdoutAPPLE sends all log messages to the file descriptor stdout */
+extern void CL_API_ENTRY clLogMessagesToStdoutAPPLE(   const char * errstr,
+                                          const void * private_info,
+                                          size_t       cb,
+                                          void *       user_data)    CL_EXT_SUFFIX__VERSION_1_0;
+
+/* clLogMessagesToStderrAPPLE sends all log messages to the file descriptor stderr */
+extern void CL_API_ENTRY clLogMessagesToStderrAPPLE(   const char * errstr,
+                                          const void * private_info,
+                                          size_t       cb,
+                                          void *       user_data)    CL_EXT_SUFFIX__VERSION_1_0;
+
+
+/************************
+* cl_khr_icd extension *
+************************/
+#define cl_khr_icd 1
+
+/* cl_platform_info                                                        */
+#define CL_PLATFORM_ICD_SUFFIX_KHR                  0x0920
+
+/* Additional Error Codes                                                  */
+#define CL_PLATFORM_NOT_FOUND_KHR                   -1001
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clIcdGetPlatformIDsKHR(cl_uint          num_entries,
+                       cl_platform_id * platforms,
+                       cl_uint *        num_platforms);
+
+typedef CL_API_ENTRY cl_int
+(CL_API_CALL *clIcdGetPlatformIDsKHR_fn)(cl_uint          num_entries,
+                                         cl_platform_id * platforms,
+                                         cl_uint *        num_platforms);
+
+
+/*******************************
+ * cl_khr_il_program extension *
+ *******************************/
+#define cl_khr_il_program 1
+
+/* New property to clGetDeviceInfo for retrieving supported intermediate
+ * languages
+ */
+#define CL_DEVICE_IL_VERSION_KHR                    0x105B
+
+/* New property to clGetProgramInfo for retrieving the IL of a
+ * program
+ */
+#define CL_PROGRAM_IL_KHR                           0x1169
+
+extern CL_API_ENTRY cl_program CL_API_CALL
+clCreateProgramWithILKHR(cl_context   context,
+                         const void * il,
+                         size_t       length,
+                         cl_int *     errcode_ret);
+
+typedef CL_API_ENTRY cl_program
+(CL_API_CALL *clCreateProgramWithILKHR_fn)(cl_context   context,
+                                           const void * il,
+                                           size_t       length,
+                                           cl_int *     errcode_ret) CL_EXT_SUFFIX__VERSION_1_2;
+
+/* Extension: cl_khr_image2d_from_buffer
+ *
+ * This extension allows a 2D image to be created from a cl_mem buffer without
+ * a copy. The type associated with a 2D image created from a buffer in an
+ * OpenCL program is image2d_t. Both the sampler and sampler-less read_image
+ * built-in functions are supported for 2D images and 2D images created from
+ * a buffer.  Similarly, the write_image built-ins are also supported for 2D
+ * images created from a buffer.
+ *
+ * When the 2D image from buffer is created, the client must specify the
+ * width, height, image format (i.e. channel order and channel data type)
+ * and optionally the row pitch.
+ *
+ * The pitch specified must be a multiple of
+ * CL_DEVICE_IMAGE_PITCH_ALIGNMENT_KHR pixels.
+ * The base address of the buffer must be aligned to
+ * CL_DEVICE_IMAGE_BASE_ADDRESS_ALIGNMENT_KHR pixels.
+ */
+
+#define CL_DEVICE_IMAGE_PITCH_ALIGNMENT_KHR              0x104A
+#define CL_DEVICE_IMAGE_BASE_ADDRESS_ALIGNMENT_KHR       0x104B
+
+
+/**************************************
+ * cl_khr_initialize_memory extension *
+ **************************************/
+
+#define CL_CONTEXT_MEMORY_INITIALIZE_KHR            0x2030
+
+
+/**************************************
+ * cl_khr_terminate_context extension *
+ **************************************/
+
+#define CL_DEVICE_TERMINATE_CAPABILITY_KHR          0x2031
+#define CL_CONTEXT_TERMINATE_KHR                    0x2032
+
+#define cl_khr_terminate_context 1
+extern CL_API_ENTRY cl_int CL_API_CALL
+clTerminateContextKHR(cl_context context) CL_EXT_SUFFIX__VERSION_1_2;
+
+typedef CL_API_ENTRY cl_int
+(CL_API_CALL *clTerminateContextKHR_fn)(cl_context context) CL_EXT_SUFFIX__VERSION_1_2;
+
+
+/*
+ * Extension: cl_khr_spir
+ *
+ * This extension adds support to create an OpenCL program object from a
+ * Standard Portable Intermediate Representation (SPIR) instance
+ */
+
+#define CL_DEVICE_SPIR_VERSIONS                     0x40E0
+#define CL_PROGRAM_BINARY_TYPE_INTERMEDIATE         0x40E1
+
+
+/*****************************************
+ * cl_khr_create_command_queue extension *
+ *****************************************/
+#define cl_khr_create_command_queue 1
+
+typedef cl_bitfield cl_queue_properties_khr;
+
+extern CL_API_ENTRY cl_command_queue CL_API_CALL
+clCreateCommandQueueWithPropertiesKHR(cl_context context,
+                                      cl_device_id device,
+                                      const cl_queue_properties_khr* properties,
+                                      cl_int* errcode_ret) CL_EXT_SUFFIX__VERSION_1_2;
+
+typedef CL_API_ENTRY cl_command_queue
+(CL_API_CALL *clCreateCommandQueueWithPropertiesKHR_fn)(cl_context context,
+                                                        cl_device_id device,
+                                                        const cl_queue_properties_khr* properties,
+                                                        cl_int* errcode_ret) CL_EXT_SUFFIX__VERSION_1_2;
+
+
+/******************************************
+* cl_nv_device_attribute_query extension *
+******************************************/
+
+/* cl_nv_device_attribute_query extension - no extension #define since it has no functions */
+#define CL_DEVICE_COMPUTE_CAPABILITY_MAJOR_NV       0x4000
+#define CL_DEVICE_COMPUTE_CAPABILITY_MINOR_NV       0x4001
+#define CL_DEVICE_REGISTERS_PER_BLOCK_NV            0x4002
+#define CL_DEVICE_WARP_SIZE_NV                      0x4003
+#define CL_DEVICE_GPU_OVERLAP_NV                    0x4004
+#define CL_DEVICE_KERNEL_EXEC_TIMEOUT_NV            0x4005
+#define CL_DEVICE_INTEGRATED_MEMORY_NV              0x4006
+
+
+/*********************************
+* cl_amd_device_attribute_query *
+*********************************/
+
+#define CL_DEVICE_PROFILING_TIMER_OFFSET_AMD        0x4036
+
+
+/*********************************
+* cl_arm_printf extension
+*********************************/
+
+#define CL_PRINTF_CALLBACK_ARM                      0x40B0
+#define CL_PRINTF_BUFFERSIZE_ARM                    0x40B1
+
+
+/***********************************
+* cl_ext_device_fission extension
+***********************************/
+#define cl_ext_device_fission   1
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clReleaseDeviceEXT(cl_device_id device) CL_EXT_SUFFIX__VERSION_1_1;
+
+typedef CL_API_ENTRY cl_int
+(CL_API_CALL *clReleaseDeviceEXT_fn)(cl_device_id device) CL_EXT_SUFFIX__VERSION_1_1;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clRetainDeviceEXT(cl_device_id device) CL_EXT_SUFFIX__VERSION_1_1;
+
+typedef CL_API_ENTRY cl_int
+(CL_API_CALL *clRetainDeviceEXT_fn)(cl_device_id device) CL_EXT_SUFFIX__VERSION_1_1;
+
+typedef cl_ulong  cl_device_partition_property_ext;
+extern CL_API_ENTRY cl_int CL_API_CALL
+clCreateSubDevicesEXT(cl_device_id   in_device,
+                      const cl_device_partition_property_ext * properties,
+                      cl_uint        num_entries,
+                      cl_device_id * out_devices,
+                      cl_uint *      num_devices) CL_EXT_SUFFIX__VERSION_1_1;
+
+typedef CL_API_ENTRY cl_int
+(CL_API_CALL * clCreateSubDevicesEXT_fn)(cl_device_id   in_device,
+                                         const cl_device_partition_property_ext * properties,
+                                         cl_uint        num_entries,
+                                         cl_device_id * out_devices,
+                                         cl_uint *      num_devices) CL_EXT_SUFFIX__VERSION_1_1;
+
+/* cl_device_partition_property_ext */
+#define CL_DEVICE_PARTITION_EQUALLY_EXT             0x4050
+#define CL_DEVICE_PARTITION_BY_COUNTS_EXT           0x4051
+#define CL_DEVICE_PARTITION_BY_NAMES_EXT            0x4052
+#define CL_DEVICE_PARTITION_BY_AFFINITY_DOMAIN_EXT  0x4053
+
+/* clGetDeviceInfo selectors */
+#define CL_DEVICE_PARENT_DEVICE_EXT                 0x4054
+#define CL_DEVICE_PARTITION_TYPES_EXT               0x4055
+#define CL_DEVICE_AFFINITY_DOMAINS_EXT              0x4056
+#define CL_DEVICE_REFERENCE_COUNT_EXT               0x4057
+#define CL_DEVICE_PARTITION_STYLE_EXT               0x4058
+
+/* error codes */
+#define CL_DEVICE_PARTITION_FAILED_EXT              -1057
+#define CL_INVALID_PARTITION_COUNT_EXT              -1058
+#define CL_INVALID_PARTITION_NAME_EXT               -1059
+
+/* CL_AFFINITY_DOMAINs */
+#define CL_AFFINITY_DOMAIN_L1_CACHE_EXT             0x1
+#define CL_AFFINITY_DOMAIN_L2_CACHE_EXT             0x2
+#define CL_AFFINITY_DOMAIN_L3_CACHE_EXT             0x3
+#define CL_AFFINITY_DOMAIN_L4_CACHE_EXT             0x4
+#define CL_AFFINITY_DOMAIN_NUMA_EXT                 0x10
+#define CL_AFFINITY_DOMAIN_NEXT_FISSIONABLE_EXT     0x100
+
+/* cl_device_partition_property_ext list terminators */
+#define CL_PROPERTIES_LIST_END_EXT                  ((cl_device_partition_property_ext) 0)
+#define CL_PARTITION_BY_COUNTS_LIST_END_EXT         ((cl_device_partition_property_ext) 0)
+#define CL_PARTITION_BY_NAMES_LIST_END_EXT          ((cl_device_partition_property_ext) 0 - 1)
+
+
+/***********************************
+ * cl_ext_migrate_memobject extension definitions
+ ***********************************/
+#define cl_ext_migrate_memobject 1
+
+typedef cl_bitfield cl_mem_migration_flags_ext;
+
+#define CL_MIGRATE_MEM_OBJECT_HOST_EXT              0x1
+
+#define CL_COMMAND_MIGRATE_MEM_OBJECT_EXT           0x4040
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clEnqueueMigrateMemObjectEXT(cl_command_queue command_queue,
+                             cl_uint          num_mem_objects,
+                             const cl_mem *   mem_objects,
+                             cl_mem_migration_flags_ext flags,
+                             cl_uint          num_events_in_wait_list,
+                             const cl_event * event_wait_list,
+                             cl_event *       event);
+
+typedef CL_API_ENTRY cl_int
+(CL_API_CALL *clEnqueueMigrateMemObjectEXT_fn)(cl_command_queue command_queue,
+                                               cl_uint          num_mem_objects,
+                                               const cl_mem *   mem_objects,
+                                               cl_mem_migration_flags_ext flags,
+                                               cl_uint          num_events_in_wait_list,
+                                               const cl_event * event_wait_list,
+                                               cl_event *       event);
+
+
+/*********************************
+* cl_qcom_ext_host_ptr extension
+*********************************/
+#define cl_qcom_ext_host_ptr 1
+
+#define CL_MEM_EXT_HOST_PTR_QCOM                  (1 << 29)
+
+#define CL_DEVICE_EXT_MEM_PADDING_IN_BYTES_QCOM   0x40A0
+#define CL_DEVICE_PAGE_SIZE_QCOM                  0x40A1
+#define CL_IMAGE_ROW_ALIGNMENT_QCOM               0x40A2
+#define CL_IMAGE_SLICE_ALIGNMENT_QCOM             0x40A3
+#define CL_MEM_HOST_UNCACHED_QCOM                 0x40A4
+#define CL_MEM_HOST_WRITEBACK_QCOM                0x40A5
+#define CL_MEM_HOST_WRITETHROUGH_QCOM             0x40A6
+#define CL_MEM_HOST_WRITE_COMBINING_QCOM          0x40A7
+
+typedef cl_uint                                   cl_image_pitch_info_qcom;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clGetDeviceImageInfoQCOM(cl_device_id             device,
+                         size_t                   image_width,
+                         size_t                   image_height,
+                         const cl_image_format   *image_format,
+                         cl_image_pitch_info_qcom param_name,
+                         size_t                   param_value_size,
+                         void                    *param_value,
+                         size_t                  *param_value_size_ret);
+
+typedef struct _cl_mem_ext_host_ptr
+{
+    /* Type of external memory allocation. */
+    /* Legal values will be defined in layered extensions. */
+    cl_uint  allocation_type;
+
+    /* Host cache policy for this external memory allocation. */
+    cl_uint  host_cache_policy;
+
+} cl_mem_ext_host_ptr;
+
+
+/*******************************************
+* cl_qcom_ext_host_ptr_iocoherent extension
+********************************************/
+
+/* Cache policy specifying io-coherence */
+#define CL_MEM_HOST_IOCOHERENT_QCOM               0x40A9
+
+
+/*********************************
+* cl_qcom_ion_host_ptr extension
+*********************************/
+
+#define CL_MEM_ION_HOST_PTR_QCOM                  0x40A8
+
+typedef struct _cl_mem_ion_host_ptr
+{
+    /* Type of external memory allocation. */
+    /* Must be CL_MEM_ION_HOST_PTR_QCOM for ION allocations. */
+    cl_mem_ext_host_ptr  ext_host_ptr;
+
+    /* ION file descriptor */
+    int                  ion_filedesc;
+
+    /* Host pointer to the ION allocated memory */
+    void*                ion_hostptr;
+
+} cl_mem_ion_host_ptr;
+
+
+/*********************************
+* cl_qcom_android_native_buffer_host_ptr extension
+*********************************/
+
+#define CL_MEM_ANDROID_NATIVE_BUFFER_HOST_PTR_QCOM                  0x40C6
+
+typedef struct _cl_mem_android_native_buffer_host_ptr
+{
+    /* Type of external memory allocation. */
+    /* Must be CL_MEM_ANDROID_NATIVE_BUFFER_HOST_PTR_QCOM for Android native buffers. */
+    cl_mem_ext_host_ptr  ext_host_ptr;
+
+    /* Virtual pointer to the android native buffer */
+    void*                anb_ptr;
+
+} cl_mem_android_native_buffer_host_ptr;
+
+
+/******************************************
+ * cl_img_yuv_image extension *
+ ******************************************/
+
+/* Image formats used in clCreateImage */
+#define CL_NV21_IMG                                 0x40D0
+#define CL_YV12_IMG                                 0x40D1
+
+
+/******************************************
+ * cl_img_cached_allocations extension *
+ ******************************************/
+
+/* Flag values used by clCreateBuffer */
+#define CL_MEM_USE_UNCACHED_CPU_MEMORY_IMG          (1 << 26)
+#define CL_MEM_USE_CACHED_CPU_MEMORY_IMG            (1 << 27)
+
+
+/******************************************
+ * cl_img_use_gralloc_ptr extension *
+ ******************************************/
+#define cl_img_use_gralloc_ptr 1
+
+/* Flag values used by clCreateBuffer */
+#define CL_MEM_USE_GRALLOC_PTR_IMG                  (1 << 28)
+
+/* To be used by clGetEventInfo: */
+#define CL_COMMAND_ACQUIRE_GRALLOC_OBJECTS_IMG      0x40D2
+#define CL_COMMAND_RELEASE_GRALLOC_OBJECTS_IMG      0x40D3
+
+/* Error code from clEnqueueReleaseGrallocObjectsIMG */
+#define CL_GRALLOC_RESOURCE_NOT_ACQUIRED_IMG        0x40D4
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clEnqueueAcquireGrallocObjectsIMG(cl_command_queue      command_queue,
+                                  cl_uint               num_objects,
+                                  const cl_mem *        mem_objects,
+                                  cl_uint               num_events_in_wait_list,
+                                  const cl_event *      event_wait_list,
+                                  cl_event *            event) CL_EXT_SUFFIX__VERSION_1_2;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clEnqueueReleaseGrallocObjectsIMG(cl_command_queue      command_queue,
+                                  cl_uint               num_objects,
+                                  const cl_mem *        mem_objects,
+                                  cl_uint               num_events_in_wait_list,
+                                  const cl_event *      event_wait_list,
+                                  cl_event *            event) CL_EXT_SUFFIX__VERSION_1_2;
+
+
+/*********************************
+* cl_khr_subgroups extension
+*********************************/
+#define cl_khr_subgroups 1
+
+#if !defined(CL_VERSION_2_1)
+/* For OpenCL 2.1 and newer, cl_kernel_sub_group_info is declared in cl.h.
+   In hindsight, there should have been a khr suffix on this type for
+   the extension, but keeping it un-suffixed to maintain backwards
+   compatibility. */
+typedef cl_uint             cl_kernel_sub_group_info;
+#endif
+
+/* cl_kernel_sub_group_info */
+#define CL_KERNEL_MAX_SUB_GROUP_SIZE_FOR_NDRANGE_KHR    0x2033
+#define CL_KERNEL_SUB_GROUP_COUNT_FOR_NDRANGE_KHR       0x2034
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clGetKernelSubGroupInfoKHR(cl_kernel    in_kernel,
+                           cl_device_id in_device,
+                           cl_kernel_sub_group_info param_name,
+                           size_t       input_value_size,
+                           const void * input_value,
+                           size_t       param_value_size,
+                           void *       param_value,
+                           size_t *     param_value_size_ret) CL_EXT_SUFFIX__VERSION_2_0_DEPRECATED;
+
+typedef CL_API_ENTRY cl_int
+(CL_API_CALL * clGetKernelSubGroupInfoKHR_fn)(cl_kernel    in_kernel,
+                                              cl_device_id in_device,
+                                              cl_kernel_sub_group_info param_name,
+                                              size_t       input_value_size,
+                                              const void * input_value,
+                                              size_t       param_value_size,
+                                              void *       param_value,
+                                              size_t *     param_value_size_ret) CL_EXT_SUFFIX__VERSION_2_0_DEPRECATED;
+
+
+/*********************************
+* cl_khr_mipmap_image extension
+*********************************/
+
+/* cl_sampler_properties */
+#define CL_SAMPLER_MIP_FILTER_MODE_KHR              0x1155
+#define CL_SAMPLER_LOD_MIN_KHR                      0x1156
+#define CL_SAMPLER_LOD_MAX_KHR                      0x1157
+
+
+/*********************************
+* cl_khr_priority_hints extension
+*********************************/
+/* This extension define is for backwards compatibility.
+   It shouldn't be required since this extension has no new functions. */
+#define cl_khr_priority_hints 1
+
+typedef cl_uint  cl_queue_priority_khr;
+
+/* cl_command_queue_properties */
+#define CL_QUEUE_PRIORITY_KHR 0x1096
+
+/* cl_queue_priority_khr */
+#define CL_QUEUE_PRIORITY_HIGH_KHR (1<<0)
+#define CL_QUEUE_PRIORITY_MED_KHR (1<<1)
+#define CL_QUEUE_PRIORITY_LOW_KHR (1<<2)
+
+
+/*********************************
+* cl_khr_throttle_hints extension
+*********************************/
+/* This extension define is for backwards compatibility.
+   It shouldn't be required since this extension has no new functions. */
+#define cl_khr_throttle_hints 1
+
+typedef cl_uint  cl_queue_throttle_khr;
+
+/* cl_command_queue_properties */
+#define CL_QUEUE_THROTTLE_KHR 0x1097
+
+/* cl_queue_throttle_khr */
+#define CL_QUEUE_THROTTLE_HIGH_KHR (1<<0)
+#define CL_QUEUE_THROTTLE_MED_KHR (1<<1)
+#define CL_QUEUE_THROTTLE_LOW_KHR (1<<2)
+
+
+/*********************************
+* cl_khr_subgroup_named_barrier
+*********************************/
+/* This extension define is for backwards compatibility.
+   It shouldn't be required since this extension has no new functions. */
+#define cl_khr_subgroup_named_barrier 1
+
+/* cl_device_info */
+#define CL_DEVICE_MAX_NAMED_BARRIER_COUNT_KHR       0x2035
+
+
+/**********************************
+ * cl_arm_import_memory extension *
+ **********************************/
+#define cl_arm_import_memory 1
+
+typedef intptr_t cl_import_properties_arm;
+
+/* Default and valid properties name for cl_arm_import_memory */
+#define CL_IMPORT_TYPE_ARM                        0x40B2
+
+/* Host process memory type default value for CL_IMPORT_TYPE_ARM property */
+#define CL_IMPORT_TYPE_HOST_ARM                   0x40B3
+
+/* DMA BUF memory type value for CL_IMPORT_TYPE_ARM property */
+#define CL_IMPORT_TYPE_DMA_BUF_ARM                0x40B4
+
+/* Protected DMA BUF memory type value for CL_IMPORT_TYPE_ARM property */
+#define CL_IMPORT_TYPE_PROTECTED_ARM              0x40B5
+
+/* This extension adds a new function that allows for direct memory import into
+ * OpenCL via the clImportMemoryARM function.
+ *
+ * Memory imported through this interface will be mapped into the device's page
+ * tables directly, providing zero copy access. It will never fall back to copy
+ * operations and aliased buffers.
+ *
+ * Types of memory supported for import are specified as additional extension
+ * strings.
+ *
+ * This extension produces cl_mem allocations which are compatible with all other
+ * users of cl_mem in the standard API.
+ *
+ * This extension maps pages with the same properties as the normal buffer creation
+ * function clCreateBuffer.
+ */
+extern CL_API_ENTRY cl_mem CL_API_CALL
+clImportMemoryARM( cl_context context,
+                   cl_mem_flags flags,
+                   const cl_import_properties_arm *properties,
+                   void *memory,
+                   size_t size,
+                   cl_int *errcode_ret) CL_EXT_SUFFIX__VERSION_1_0;
+
+
+/******************************************
+ * cl_arm_shared_virtual_memory extension *
+ ******************************************/
+#define cl_arm_shared_virtual_memory 1
+
+/* Used by clGetDeviceInfo */
+#define CL_DEVICE_SVM_CAPABILITIES_ARM                  0x40B6
+
+/* Used by clGetMemObjectInfo */
+#define CL_MEM_USES_SVM_POINTER_ARM                     0x40B7
+
+/* Used by clSetKernelExecInfoARM: */
+#define CL_KERNEL_EXEC_INFO_SVM_PTRS_ARM                0x40B8
+#define CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM_ARM   0x40B9
+
+/* To be used by clGetEventInfo: */
+#define CL_COMMAND_SVM_FREE_ARM                         0x40BA
+#define CL_COMMAND_SVM_MEMCPY_ARM                       0x40BB
+#define CL_COMMAND_SVM_MEMFILL_ARM                      0x40BC
+#define CL_COMMAND_SVM_MAP_ARM                          0x40BD
+#define CL_COMMAND_SVM_UNMAP_ARM                        0x40BE
+
+/* Flag values returned by clGetDeviceInfo with CL_DEVICE_SVM_CAPABILITIES_ARM as the param_name. */
+#define CL_DEVICE_SVM_COARSE_GRAIN_BUFFER_ARM           (1 << 0)
+#define CL_DEVICE_SVM_FINE_GRAIN_BUFFER_ARM             (1 << 1)
+#define CL_DEVICE_SVM_FINE_GRAIN_SYSTEM_ARM             (1 << 2)
+#define CL_DEVICE_SVM_ATOMICS_ARM                       (1 << 3)
+
+/* Flag values used by clSVMAllocARM: */
+#define CL_MEM_SVM_FINE_GRAIN_BUFFER_ARM                (1 << 10)
+#define CL_MEM_SVM_ATOMICS_ARM                          (1 << 11)
+
+typedef cl_bitfield cl_svm_mem_flags_arm;
+typedef cl_uint     cl_kernel_exec_info_arm;
+typedef cl_bitfield cl_device_svm_capabilities_arm;
+
+extern CL_API_ENTRY void * CL_API_CALL
+clSVMAllocARM(cl_context       context,
+              cl_svm_mem_flags_arm flags,
+              size_t           size,
+              cl_uint          alignment) CL_EXT_SUFFIX__VERSION_1_2;
+
+extern CL_API_ENTRY void CL_API_CALL
+clSVMFreeARM(cl_context        context,
+             void *            svm_pointer) CL_EXT_SUFFIX__VERSION_1_2;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clEnqueueSVMFreeARM(cl_command_queue  command_queue,
+                    cl_uint           num_svm_pointers,
+                    void *            svm_pointers[],
+                    void (CL_CALLBACK * pfn_free_func)(cl_command_queue queue,
+                                                       cl_uint          num_svm_pointers,
+                                                       void *           svm_pointers[],
+                                                       void *           user_data),
+                    void *            user_data,
+                    cl_uint           num_events_in_wait_list,
+                    const cl_event *  event_wait_list,
+                    cl_event *        event) CL_EXT_SUFFIX__VERSION_1_2;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clEnqueueSVMMemcpyARM(cl_command_queue  command_queue,
+                      cl_bool           blocking_copy,
+                      void *            dst_ptr,
+                      const void *      src_ptr,
+                      size_t            size,
+                      cl_uint           num_events_in_wait_list,
+                      const cl_event *  event_wait_list,
+                      cl_event *        event) CL_EXT_SUFFIX__VERSION_1_2;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clEnqueueSVMMemFillARM(cl_command_queue  command_queue,
+                       void *            svm_ptr,
+                       const void *      pattern,
+                       size_t            pattern_size,
+                       size_t            size,
+                       cl_uint           num_events_in_wait_list,
+                       const cl_event *  event_wait_list,
+                       cl_event *        event) CL_EXT_SUFFIX__VERSION_1_2;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clEnqueueSVMMapARM(cl_command_queue  command_queue,
+                   cl_bool           blocking_map,
+                   cl_map_flags      flags,
+                   void *            svm_ptr,
+                   size_t            size,
+                   cl_uint           num_events_in_wait_list,
+                   const cl_event *  event_wait_list,
+                   cl_event *        event) CL_EXT_SUFFIX__VERSION_1_2;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clEnqueueSVMUnmapARM(cl_command_queue  command_queue,
+                     void *            svm_ptr,
+                     cl_uint           num_events_in_wait_list,
+                     const cl_event *  event_wait_list,
+                     cl_event *        event) CL_EXT_SUFFIX__VERSION_1_2;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clSetKernelArgSVMPointerARM(cl_kernel    kernel,
+                            cl_uint      arg_index,
+                            const void * arg_value) CL_EXT_SUFFIX__VERSION_1_2;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clSetKernelExecInfoARM(cl_kernel            kernel,
+                       cl_kernel_exec_info_arm  param_name,
+                       size_t               param_value_size,
+                       const void *         param_value) CL_EXT_SUFFIX__VERSION_1_2;
+
+/********************************
+ * cl_arm_get_core_id extension *
+ ********************************/
+
+#ifdef CL_VERSION_1_2
+
+#define cl_arm_get_core_id 1
+
+/* Device info property for bitfield of cores present */
+#define CL_DEVICE_COMPUTE_UNITS_BITFIELD_ARM      0x40BF
+
+#endif  /* CL_VERSION_1_2 */
+
+/*********************************
+* cl_arm_job_slot_selection
+*********************************/
+
+#define cl_arm_job_slot_selection 1
+
+/* cl_device_info */
+#define CL_DEVICE_JOB_SLOTS_ARM                   0x41E0
+
+/* cl_command_queue_properties */
+#define CL_QUEUE_JOB_SLOT_ARM                     0x41E1
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif /* __CL_EXT_H */
--- a/src/3rdparty/CL/cl_ext_intel.h
+++ b/src/3rdparty/CL/cl_ext_intel.h
@ -0,0 +1,423 @@
+/*******************************************************************************
+ * Copyright (c) 2008-2019 The Khronos Group Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and/or associated documentation files (the
+ * "Materials"), to deal in the Materials without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sublicense, and/or sell copies of the Materials, and to
+ * permit persons to whom the Materials are furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Materials.
+ *
+ * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
+ * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
+ * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
+ *    https://www.khronos.org/registry/
+ *
+ * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+ * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
+ ******************************************************************************/
+/*****************************************************************************\
+
+Copyright (c) 2013-2019 Intel Corporation All Rights Reserved.
+
+THESE MATERIALS ARE PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INTEL OR ITS
+CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY OR TORT (INCLUDING
+NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THESE
+MATERIALS, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+File Name: cl_ext_intel.h
+
+Abstract:
+
+Notes:
+
+\*****************************************************************************/
+
+#ifndef __CL_EXT_INTEL_H
+#define __CL_EXT_INTEL_H
+
+#include <CL/cl.h>
+#include <CL/cl_platform.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/***************************************
+* cl_intel_thread_local_exec extension *
+****************************************/
+
+#define cl_intel_thread_local_exec 1
+
+#define CL_QUEUE_THREAD_LOCAL_EXEC_ENABLE_INTEL      (((cl_bitfield)1) << 31)
+
+/***********************************************
+* cl_intel_device_partition_by_names extension *
+************************************************/
+
+#define cl_intel_device_partition_by_names 1
+
+#define CL_DEVICE_PARTITION_BY_NAMES_INTEL          0x4052
+#define CL_PARTITION_BY_NAMES_LIST_END_INTEL        -1
+
+/************************************************
+* cl_intel_accelerator extension                *
+* cl_intel_motion_estimation extension          *
+* cl_intel_advanced_motion_estimation extension *
+*************************************************/
+
+#define cl_intel_accelerator 1
+#define cl_intel_motion_estimation 1
+#define cl_intel_advanced_motion_estimation 1
+
+typedef struct _cl_accelerator_intel* cl_accelerator_intel;
+typedef cl_uint cl_accelerator_type_intel;
+typedef cl_uint cl_accelerator_info_intel;
+
+typedef struct _cl_motion_estimation_desc_intel {
+    cl_uint mb_block_type;
+    cl_uint subpixel_mode;
+    cl_uint sad_adjust_mode;
+    cl_uint search_path_type;
+} cl_motion_estimation_desc_intel;
+
+/* error codes */
+#define CL_INVALID_ACCELERATOR_INTEL                              -1094
+#define CL_INVALID_ACCELERATOR_TYPE_INTEL                         -1095
+#define CL_INVALID_ACCELERATOR_DESCRIPTOR_INTEL                   -1096
+#define CL_ACCELERATOR_TYPE_NOT_SUPPORTED_INTEL                   -1097
+
+/* cl_accelerator_type_intel */
+#define CL_ACCELERATOR_TYPE_MOTION_ESTIMATION_INTEL               0x0
+
+/* cl_accelerator_info_intel */
+#define CL_ACCELERATOR_DESCRIPTOR_INTEL                           0x4090
+#define CL_ACCELERATOR_REFERENCE_COUNT_INTEL                      0x4091
+#define CL_ACCELERATOR_CONTEXT_INTEL                              0x4092
+#define CL_ACCELERATOR_TYPE_INTEL                                 0x4093
+
+/* cl_motion_estimation_desc_intel flags */
+#define CL_ME_MB_TYPE_16x16_INTEL                                 0x0
+#define CL_ME_MB_TYPE_8x8_INTEL                                   0x1
+#define CL_ME_MB_TYPE_4x4_INTEL                                   0x2
+
+#define CL_ME_SUBPIXEL_MODE_INTEGER_INTEL                         0x0
+#define CL_ME_SUBPIXEL_MODE_HPEL_INTEL                            0x1
+#define CL_ME_SUBPIXEL_MODE_QPEL_INTEL                            0x2
+
+#define CL_ME_SAD_ADJUST_MODE_NONE_INTEL                          0x0
+#define CL_ME_SAD_ADJUST_MODE_HAAR_INTEL                          0x1
+
+#define CL_ME_SEARCH_PATH_RADIUS_2_2_INTEL                        0x0
+#define CL_ME_SEARCH_PATH_RADIUS_4_4_INTEL                        0x1
+#define CL_ME_SEARCH_PATH_RADIUS_16_12_INTEL                      0x5
+
+#define CL_ME_SKIP_BLOCK_TYPE_16x16_INTEL                         0x0
+#define CL_ME_CHROMA_INTRA_PREDICT_ENABLED_INTEL                  0x1
+#define CL_ME_LUMA_INTRA_PREDICT_ENABLED_INTEL                    0x2
+#define CL_ME_SKIP_BLOCK_TYPE_8x8_INTEL                           0x4
+
+#define CL_ME_FORWARD_INPUT_MODE_INTEL                            0x1
+#define CL_ME_BACKWARD_INPUT_MODE_INTEL                           0x2
+#define CL_ME_BIDIRECTION_INPUT_MODE_INTEL                        0x3
+
+#define CL_ME_BIDIR_WEIGHT_QUARTER_INTEL                          16
+#define CL_ME_BIDIR_WEIGHT_THIRD_INTEL                            21
+#define CL_ME_BIDIR_WEIGHT_HALF_INTEL                             32
+#define CL_ME_BIDIR_WEIGHT_TWO_THIRD_INTEL                        43
+#define CL_ME_BIDIR_WEIGHT_THREE_QUARTER_INTEL                    48
+
+#define CL_ME_COST_PENALTY_NONE_INTEL                             0x0
+#define CL_ME_COST_PENALTY_LOW_INTEL                              0x1
+#define CL_ME_COST_PENALTY_NORMAL_INTEL                           0x2
+#define CL_ME_COST_PENALTY_HIGH_INTEL                             0x3
+
+#define CL_ME_COST_PRECISION_QPEL_INTEL                           0x0
+#define CL_ME_COST_PRECISION_HPEL_INTEL                           0x1
+#define CL_ME_COST_PRECISION_PEL_INTEL                            0x2
+#define CL_ME_COST_PRECISION_DPEL_INTEL                           0x3
+
+#define CL_ME_LUMA_PREDICTOR_MODE_VERTICAL_INTEL                  0x0
+#define CL_ME_LUMA_PREDICTOR_MODE_HORIZONTAL_INTEL                0x1
+#define CL_ME_LUMA_PREDICTOR_MODE_DC_INTEL                        0x2
+#define CL_ME_LUMA_PREDICTOR_MODE_DIAGONAL_DOWN_LEFT_INTEL        0x3
+
+#define CL_ME_LUMA_PREDICTOR_MODE_DIAGONAL_DOWN_RIGHT_INTEL       0x4
+#define CL_ME_LUMA_PREDICTOR_MODE_PLANE_INTEL                     0x4
+#define CL_ME_LUMA_PREDICTOR_MODE_VERTICAL_RIGHT_INTEL            0x5
+#define CL_ME_LUMA_PREDICTOR_MODE_HORIZONTAL_DOWN_INTEL           0x6
+#define CL_ME_LUMA_PREDICTOR_MODE_VERTICAL_LEFT_INTEL             0x7
+#define CL_ME_LUMA_PREDICTOR_MODE_HORIZONTAL_UP_INTEL             0x8
+
+#define CL_ME_CHROMA_PREDICTOR_MODE_DC_INTEL                      0x0
+#define CL_ME_CHROMA_PREDICTOR_MODE_HORIZONTAL_INTEL              0x1
+#define CL_ME_CHROMA_PREDICTOR_MODE_VERTICAL_INTEL                0x2
+#define CL_ME_CHROMA_PREDICTOR_MODE_PLANE_INTEL                   0x3
+
+/* cl_device_info */
+#define CL_DEVICE_ME_VERSION_INTEL                                0x407E
+
+#define CL_ME_VERSION_LEGACY_INTEL                                0x0
+#define CL_ME_VERSION_ADVANCED_VER_1_INTEL                        0x1
+#define CL_ME_VERSION_ADVANCED_VER_2_INTEL                        0x2
+
+extern CL_API_ENTRY cl_accelerator_intel CL_API_CALL
+clCreateAcceleratorINTEL(
+    cl_context                   context,
+    cl_accelerator_type_intel    accelerator_type,
+    size_t                       descriptor_size,
+    const void*                  descriptor,
+    cl_int*                      errcode_ret) CL_EXT_SUFFIX__VERSION_1_2;
+
+typedef CL_API_ENTRY cl_accelerator_intel (CL_API_CALL *clCreateAcceleratorINTEL_fn)(
+    cl_context                   context,
+    cl_accelerator_type_intel    accelerator_type,
+    size_t                       descriptor_size,
+    const void*                  descriptor,
+    cl_int*                      errcode_ret) CL_EXT_SUFFIX__VERSION_1_2;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clGetAcceleratorInfoINTEL(
+    cl_accelerator_intel         accelerator,
+    cl_accelerator_info_intel    param_name,
+    size_t                       param_value_size,
+    void*                        param_value,
+    size_t*                      param_value_size_ret) CL_EXT_SUFFIX__VERSION_1_2;
+
+typedef CL_API_ENTRY cl_int (CL_API_CALL *clGetAcceleratorInfoINTEL_fn)(
+    cl_accelerator_intel         accelerator,
+    cl_accelerator_info_intel    param_name,
+    size_t                       param_value_size,
+    void*                        param_value,
+    size_t*                      param_value_size_ret) CL_EXT_SUFFIX__VERSION_1_2;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clRetainAcceleratorINTEL(
+    cl_accelerator_intel         accelerator) CL_EXT_SUFFIX__VERSION_1_2;
+
+typedef CL_API_ENTRY cl_int (CL_API_CALL *clRetainAcceleratorINTEL_fn)(
+    cl_accelerator_intel         accelerator) CL_EXT_SUFFIX__VERSION_1_2;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clReleaseAcceleratorINTEL(
+    cl_accelerator_intel         accelerator) CL_EXT_SUFFIX__VERSION_1_2;
+
+typedef CL_API_ENTRY cl_int (CL_API_CALL *clReleaseAcceleratorINTEL_fn)(
+    cl_accelerator_intel         accelerator) CL_EXT_SUFFIX__VERSION_1_2;
+
+/******************************************
+* cl_intel_simultaneous_sharing extension *
+*******************************************/
+
+#define cl_intel_simultaneous_sharing 1
+
+#define CL_DEVICE_SIMULTANEOUS_INTEROPS_INTEL            0x4104
+#define CL_DEVICE_NUM_SIMULTANEOUS_INTEROPS_INTEL        0x4105
+
+/***********************************
+* cl_intel_egl_image_yuv extension *
+************************************/
+
+#define cl_intel_egl_image_yuv 1
+
+#define CL_EGL_YUV_PLANE_INTEL                           0x4107
+
+/********************************
+* cl_intel_packed_yuv extension *
+*********************************/
+
+#define cl_intel_packed_yuv 1
+
+#define CL_YUYV_INTEL                                    0x4076
+#define CL_UYVY_INTEL                                    0x4077
+#define CL_YVYU_INTEL                                    0x4078
+#define CL_VYUY_INTEL                                    0x4079
+
+/********************************************
+* cl_intel_required_subgroup_size extension *
+*********************************************/
+
+#define cl_intel_required_subgroup_size 1
+
+#define CL_DEVICE_SUB_GROUP_SIZES_INTEL                  0x4108
+#define CL_KERNEL_SPILL_MEM_SIZE_INTEL                   0x4109
+#define CL_KERNEL_COMPILE_SUB_GROUP_SIZE_INTEL           0x410A
+
+/****************************************
+* cl_intel_driver_diagnostics extension *
+*****************************************/
+
+#define cl_intel_driver_diagnostics 1
+
+typedef cl_uint cl_diagnostics_verbose_level;
+
+#define CL_CONTEXT_SHOW_DIAGNOSTICS_INTEL                0x4106
+
+#define CL_CONTEXT_DIAGNOSTICS_LEVEL_ALL_INTEL           ( 0xff )
+#define CL_CONTEXT_DIAGNOSTICS_LEVEL_GOOD_INTEL          ( 1 )
+#define CL_CONTEXT_DIAGNOSTICS_LEVEL_BAD_INTEL           ( 1 << 1 )
+#define CL_CONTEXT_DIAGNOSTICS_LEVEL_NEUTRAL_INTEL       ( 1 << 2 )
+
+/********************************
+* cl_intel_planar_yuv extension *
+*********************************/
+
+#define CL_NV12_INTEL                                       0x410E
+
+#define CL_MEM_NO_ACCESS_INTEL                              ( 1 << 24 )
+#define CL_MEM_ACCESS_FLAGS_UNRESTRICTED_INTEL              ( 1 << 25 )
+
+#define CL_DEVICE_PLANAR_YUV_MAX_WIDTH_INTEL                0x417E
+#define CL_DEVICE_PLANAR_YUV_MAX_HEIGHT_INTEL               0x417F
+
+/*******************************************************
+* cl_intel_device_side_avc_motion_estimation extension *
+********************************************************/
+
+#define CL_DEVICE_AVC_ME_VERSION_INTEL                      0x410B
+#define CL_DEVICE_AVC_ME_SUPPORTS_TEXTURE_SAMPLER_USE_INTEL 0x410C
+#define CL_DEVICE_AVC_ME_SUPPORTS_PREEMPTION_INTEL          0x410D
+
+#define CL_AVC_ME_VERSION_0_INTEL                           0x0  /* No support. */
+#define CL_AVC_ME_VERSION_1_INTEL                           0x1  /* First supported version. */
+
+#define CL_AVC_ME_MAJOR_16x16_INTEL                         0x0
+#define CL_AVC_ME_MAJOR_16x8_INTEL                          0x1
+#define CL_AVC_ME_MAJOR_8x16_INTEL                          0x2
+#define CL_AVC_ME_MAJOR_8x8_INTEL                           0x3
+
+#define CL_AVC_ME_MINOR_8x8_INTEL                           0x0
+#define CL_AVC_ME_MINOR_8x4_INTEL                           0x1
+#define CL_AVC_ME_MINOR_4x8_INTEL                           0x2
+#define CL_AVC_ME_MINOR_4x4_INTEL                           0x3
+
+#define CL_AVC_ME_MAJOR_FORWARD_INTEL                       0x0
+#define CL_AVC_ME_MAJOR_BACKWARD_INTEL                      0x1
+#define CL_AVC_ME_MAJOR_BIDIRECTIONAL_INTEL                 0x2
+
+#define CL_AVC_ME_PARTITION_MASK_ALL_INTEL                  0x0
+#define CL_AVC_ME_PARTITION_MASK_16x16_INTEL                0x7E
+#define CL_AVC_ME_PARTITION_MASK_16x8_INTEL                 0x7D
+#define CL_AVC_ME_PARTITION_MASK_8x16_INTEL                 0x7B
+#define CL_AVC_ME_PARTITION_MASK_8x8_INTEL                  0x77
+#define CL_AVC_ME_PARTITION_MASK_8x4_INTEL                  0x6F
+#define CL_AVC_ME_PARTITION_MASK_4x8_INTEL                  0x5F
+#define CL_AVC_ME_PARTITION_MASK_4x4_INTEL                  0x3F
+
+#define CL_AVC_ME_SEARCH_WINDOW_EXHAUSTIVE_INTEL            0x0
+#define CL_AVC_ME_SEARCH_WINDOW_SMALL_INTEL                 0x1
+#define CL_AVC_ME_SEARCH_WINDOW_TINY_INTEL                  0x2
+#define CL_AVC_ME_SEARCH_WINDOW_EXTRA_TINY_INTEL            0x3
+#define CL_AVC_ME_SEARCH_WINDOW_DIAMOND_INTEL               0x4
+#define CL_AVC_ME_SEARCH_WINDOW_LARGE_DIAMOND_INTEL         0x5
+#define CL_AVC_ME_SEARCH_WINDOW_RESERVED0_INTEL             0x6
+#define CL_AVC_ME_SEARCH_WINDOW_RESERVED1_INTEL             0x7
+#define CL_AVC_ME_SEARCH_WINDOW_CUSTOM_INTEL                0x8
+#define CL_AVC_ME_SEARCH_WINDOW_16x12_RADIUS_INTEL          0x9
+#define CL_AVC_ME_SEARCH_WINDOW_4x4_RADIUS_INTEL            0x2
+#define CL_AVC_ME_SEARCH_WINDOW_2x2_RADIUS_INTEL            0xa
+
+#define CL_AVC_ME_SAD_ADJUST_MODE_NONE_INTEL                0x0
+#define CL_AVC_ME_SAD_ADJUST_MODE_HAAR_INTEL                0x2
+
+#define CL_AVC_ME_SUBPIXEL_MODE_INTEGER_INTEL               0x0
+#define CL_AVC_ME_SUBPIXEL_MODE_HPEL_INTEL                  0x1
+#define CL_AVC_ME_SUBPIXEL_MODE_QPEL_INTEL                  0x3
+
+#define CL_AVC_ME_COST_PRECISION_QPEL_INTEL                 0x0
+#define CL_AVC_ME_COST_PRECISION_HPEL_INTEL                 0x1
+#define CL_AVC_ME_COST_PRECISION_PEL_INTEL                  0x2
+#define CL_AVC_ME_COST_PRECISION_DPEL_INTEL                 0x3
+
+#define CL_AVC_ME_BIDIR_WEIGHT_QUARTER_INTEL                0x10
+#define CL_AVC_ME_BIDIR_WEIGHT_THIRD_INTEL                  0x15
+#define CL_AVC_ME_BIDIR_WEIGHT_HALF_INTEL                   0x20
+#define CL_AVC_ME_BIDIR_WEIGHT_TWO_THIRD_INTEL              0x2B
+#define CL_AVC_ME_BIDIR_WEIGHT_THREE_QUARTER_INTEL          0x30
+
+#define CL_AVC_ME_BORDER_REACHED_LEFT_INTEL                 0x0
+#define CL_AVC_ME_BORDER_REACHED_RIGHT_INTEL                0x2
+#define CL_AVC_ME_BORDER_REACHED_TOP_INTEL                  0x4
+#define CL_AVC_ME_BORDER_REACHED_BOTTOM_INTEL               0x8
+
+#define CL_AVC_ME_SKIP_BLOCK_PARTITION_16x16_INTEL          0x0
+#define CL_AVC_ME_SKIP_BLOCK_PARTITION_8x8_INTEL            0x4000
+
+#define CL_AVC_ME_SKIP_BLOCK_16x16_FORWARD_ENABLE_INTEL     ( 0x1 << 24 )
+#define CL_AVC_ME_SKIP_BLOCK_16x16_BACKWARD_ENABLE_INTEL    ( 0x2 << 24 )
+#define CL_AVC_ME_SKIP_BLOCK_16x16_DUAL_ENABLE_INTEL        ( 0x3 << 24 )
+#define CL_AVC_ME_SKIP_BLOCK_8x8_FORWARD_ENABLE_INTEL       ( 0x55 << 24 )
+#define CL_AVC_ME_SKIP_BLOCK_8x8_BACKWARD_ENABLE_INTEL      ( 0xAA << 24 )
+#define CL_AVC_ME_SKIP_BLOCK_8x8_DUAL_ENABLE_INTEL          ( 0xFF << 24 )
+#define CL_AVC_ME_SKIP_BLOCK_8x8_0_FORWARD_ENABLE_INTEL     ( 0x1 << 24 )
+#define CL_AVC_ME_SKIP_BLOCK_8x8_0_BACKWARD_ENABLE_INTEL    ( 0x2 << 24 )
+#define CL_AVC_ME_SKIP_BLOCK_8x8_1_FORWARD_ENABLE_INTEL     ( 0x1 << 26 )
+#define CL_AVC_ME_SKIP_BLOCK_8x8_1_BACKWARD_ENABLE_INTEL    ( 0x2 << 26 )
+#define CL_AVC_ME_SKIP_BLOCK_8x8_2_FORWARD_ENABLE_INTEL     ( 0x1 << 28 )
+#define CL_AVC_ME_SKIP_BLOCK_8x8_2_BACKWARD_ENABLE_INTEL    ( 0x2 << 28 )
+#define CL_AVC_ME_SKIP_BLOCK_8x8_3_FORWARD_ENABLE_INTEL     ( 0x1 << 30 )
+#define CL_AVC_ME_SKIP_BLOCK_8x8_3_BACKWARD_ENABLE_INTEL    ( 0x2 << 30 )
+
+#define CL_AVC_ME_BLOCK_BASED_SKIP_4x4_INTEL                0x00
+#define CL_AVC_ME_BLOCK_BASED_SKIP_8x8_INTEL                0x80
+
+#define CL_AVC_ME_INTRA_16x16_INTEL                         0x0
+#define CL_AVC_ME_INTRA_8x8_INTEL                           0x1
+#define CL_AVC_ME_INTRA_4x4_INTEL                           0x2
+
+#define CL_AVC_ME_INTRA_LUMA_PARTITION_MASK_16x16_INTEL     0x6
+#define CL_AVC_ME_INTRA_LUMA_PARTITION_MASK_8x8_INTEL       0x5
+#define CL_AVC_ME_INTRA_LUMA_PARTITION_MASK_4x4_INTEL       0x3
+
+#define CL_AVC_ME_INTRA_NEIGHBOR_LEFT_MASK_ENABLE_INTEL         0x60
+#define CL_AVC_ME_INTRA_NEIGHBOR_UPPER_MASK_ENABLE_INTEL        0x10
+#define CL_AVC_ME_INTRA_NEIGHBOR_UPPER_RIGHT_MASK_ENABLE_INTEL  0x8
+#define CL_AVC_ME_INTRA_NEIGHBOR_UPPER_LEFT_MASK_ENABLE_INTEL   0x4
+
+#define CL_AVC_ME_LUMA_PREDICTOR_MODE_VERTICAL_INTEL            0x0
+#define CL_AVC_ME_LUMA_PREDICTOR_MODE_HORIZONTAL_INTEL          0x1
+#define CL_AVC_ME_LUMA_PREDICTOR_MODE_DC_INTEL                  0x2
+#define CL_AVC_ME_LUMA_PREDICTOR_MODE_DIAGONAL_DOWN_LEFT_INTEL  0x3
+#define CL_AVC_ME_LUMA_PREDICTOR_MODE_DIAGONAL_DOWN_RIGHT_INTEL 0x4
+#define CL_AVC_ME_LUMA_PREDICTOR_MODE_PLANE_INTEL               0x4
+#define CL_AVC_ME_LUMA_PREDICTOR_MODE_VERTICAL_RIGHT_INTEL      0x5
+#define CL_AVC_ME_LUMA_PREDICTOR_MODE_HORIZONTAL_DOWN_INTEL     0x6
+#define CL_AVC_ME_LUMA_PREDICTOR_MODE_VERTICAL_LEFT_INTEL       0x7
+#define CL_AVC_ME_LUMA_PREDICTOR_MODE_HORIZONTAL_UP_INTEL       0x8
+#define CL_AVC_ME_CHROMA_PREDICTOR_MODE_DC_INTEL                0x0
+#define CL_AVC_ME_CHROMA_PREDICTOR_MODE_HORIZONTAL_INTEL        0x1
+#define CL_AVC_ME_CHROMA_PREDICTOR_MODE_VERTICAL_INTEL          0x2
+#define CL_AVC_ME_CHROMA_PREDICTOR_MODE_PLANE_INTEL             0x3
+
+#define CL_AVC_ME_FRAME_FORWARD_INTEL                       0x1
+#define CL_AVC_ME_FRAME_BACKWARD_INTEL                      0x2
+#define CL_AVC_ME_FRAME_DUAL_INTEL                          0x3
+
+#define CL_AVC_ME_SLICE_TYPE_PRED_INTEL                     0x0
+#define CL_AVC_ME_SLICE_TYPE_BPRED_INTEL                    0x1
+#define CL_AVC_ME_SLICE_TYPE_INTRA_INTEL                    0x2
+
+#define CL_AVC_ME_INTERLACED_SCAN_TOP_FIELD_INTEL           0x0
+#define CL_AVC_ME_INTERLACED_SCAN_BOTTOM_FIELD_INTEL        0x1
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* __CL_EXT_INTEL_H */
--- a/src/3rdparty/CL/cl_gl.h
+++ b/src/3rdparty/CL/cl_gl.h
@ -0,0 +1,171 @@
+/**********************************************************************************
+ * Copyright (c) 2008-2019 The Khronos Group Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and/or associated documentation files (the
+ * "Materials"), to deal in the Materials without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sublicense, and/or sell copies of the Materials, and to
+ * permit persons to whom the Materials are furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Materials.
+ *
+ * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
+ * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
+ * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
+ *    https://www.khronos.org/registry/
+ *
+ * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+ * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
+ **********************************************************************************/
+
+#ifndef __OPENCL_CL_GL_H
+#define __OPENCL_CL_GL_H
+
+#include <CL/cl.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+typedef cl_uint     cl_gl_object_type;
+typedef cl_uint     cl_gl_texture_info;
+typedef cl_uint     cl_gl_platform_info;
+typedef struct __GLsync *cl_GLsync;
+
+/* cl_gl_object_type = 0x2000 - 0x2012 enum values are currently taken           */
+#define CL_GL_OBJECT_BUFFER                     0x2000
+#define CL_GL_OBJECT_TEXTURE2D                  0x2001
+#define CL_GL_OBJECT_TEXTURE3D                  0x2002
+#define CL_GL_OBJECT_RENDERBUFFER               0x2003
+#ifdef CL_VERSION_1_2
+#define CL_GL_OBJECT_TEXTURE2D_ARRAY            0x200E
+#define CL_GL_OBJECT_TEXTURE1D                  0x200F
+#define CL_GL_OBJECT_TEXTURE1D_ARRAY            0x2010
+#define CL_GL_OBJECT_TEXTURE_BUFFER             0x2011
+#endif
+
+/* cl_gl_texture_info           */
+#define CL_GL_TEXTURE_TARGET                    0x2004
+#define CL_GL_MIPMAP_LEVEL                      0x2005
+#ifdef CL_VERSION_1_2
+#define CL_GL_NUM_SAMPLES                       0x2012
+#endif
+
+
+extern CL_API_ENTRY cl_mem CL_API_CALL
+clCreateFromGLBuffer(cl_context     context,
+                     cl_mem_flags   flags,
+                     cl_GLuint      bufobj,
+                     cl_int *       errcode_ret) CL_API_SUFFIX__VERSION_1_0;
+
+#ifdef CL_VERSION_1_2
+
+extern CL_API_ENTRY cl_mem CL_API_CALL
+clCreateFromGLTexture(cl_context      context,
+                      cl_mem_flags    flags,
+                      cl_GLenum       target,
+                      cl_GLint        miplevel,
+                      cl_GLuint       texture,
+                      cl_int *        errcode_ret) CL_API_SUFFIX__VERSION_1_2;
+
+#endif
+
+extern CL_API_ENTRY cl_mem CL_API_CALL
+clCreateFromGLRenderbuffer(cl_context   context,
+                           cl_mem_flags flags,
+                           cl_GLuint    renderbuffer,
+                           cl_int *     errcode_ret) CL_API_SUFFIX__VERSION_1_0;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clGetGLObjectInfo(cl_mem                memobj,
+                  cl_gl_object_type *   gl_object_type,
+                  cl_GLuint *           gl_object_name) CL_API_SUFFIX__VERSION_1_0;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clGetGLTextureInfo(cl_mem               memobj,
+                   cl_gl_texture_info   param_name,
+                   size_t               param_value_size,
+                   void *               param_value,
+                   size_t *             param_value_size_ret) CL_API_SUFFIX__VERSION_1_0;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clEnqueueAcquireGLObjects(cl_command_queue      command_queue,
+                          cl_uint               num_objects,
+                          const cl_mem *        mem_objects,
+                          cl_uint               num_events_in_wait_list,
+                          const cl_event *      event_wait_list,
+                          cl_event *            event) CL_API_SUFFIX__VERSION_1_0;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clEnqueueReleaseGLObjects(cl_command_queue      command_queue,
+                          cl_uint               num_objects,
+                          const cl_mem *        mem_objects,
+                          cl_uint               num_events_in_wait_list,
+                          const cl_event *      event_wait_list,
+                          cl_event *            event) CL_API_SUFFIX__VERSION_1_0;
+
+
+/* Deprecated OpenCL 1.1 APIs */
+extern CL_API_ENTRY CL_EXT_PREFIX__VERSION_1_1_DEPRECATED cl_mem CL_API_CALL
+clCreateFromGLTexture2D(cl_context      context,
+                        cl_mem_flags    flags,
+                        cl_GLenum       target,
+                        cl_GLint        miplevel,
+                        cl_GLuint       texture,
+                        cl_int *        errcode_ret) CL_EXT_SUFFIX__VERSION_1_1_DEPRECATED;
+
+extern CL_API_ENTRY CL_EXT_PREFIX__VERSION_1_1_DEPRECATED cl_mem CL_API_CALL
+clCreateFromGLTexture3D(cl_context      context,
+                        cl_mem_flags    flags,
+                        cl_GLenum       target,
+                        cl_GLint        miplevel,
+                        cl_GLuint       texture,
+                        cl_int *        errcode_ret) CL_EXT_SUFFIX__VERSION_1_1_DEPRECATED;
+
+/* cl_khr_gl_sharing extension  */
+
+#define cl_khr_gl_sharing 1
+
+typedef cl_uint     cl_gl_context_info;
+
+/* Additional Error Codes  */
+#define CL_INVALID_GL_SHAREGROUP_REFERENCE_KHR  -1000
+
+/* cl_gl_context_info  */
+#define CL_CURRENT_DEVICE_FOR_GL_CONTEXT_KHR    0x2006
+#define CL_DEVICES_FOR_GL_CONTEXT_KHR           0x2007
+
+/* Additional cl_context_properties  */
+#define CL_GL_CONTEXT_KHR                       0x2008
+#define CL_EGL_DISPLAY_KHR                      0x2009
+#define CL_GLX_DISPLAY_KHR                      0x200A
+#define CL_WGL_HDC_KHR                          0x200B
+#define CL_CGL_SHAREGROUP_KHR                   0x200C
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clGetGLContextInfoKHR(const cl_context_properties * properties,
+                      cl_gl_context_info            param_name,
+                      size_t                        param_value_size,
+                      void *                        param_value,
+                      size_t *                      param_value_size_ret) CL_API_SUFFIX__VERSION_1_0;
+
+typedef CL_API_ENTRY cl_int (CL_API_CALL *clGetGLContextInfoKHR_fn)(
+    const cl_context_properties * properties,
+    cl_gl_context_info            param_name,
+    size_t                        param_value_size,
+    void *                        param_value,
+    size_t *                      param_value_size_ret);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif  /* __OPENCL_CL_GL_H */
--- a/src/3rdparty/CL/cl_gl_ext.h
+++ b/src/3rdparty/CL/cl_gl_ext.h
@ -0,0 +1,52 @@
+/**********************************************************************************
+ * Copyright (c) 2008-2019 The Khronos Group Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and/or associated documentation files (the
+ * "Materials"), to deal in the Materials without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sublicense, and/or sell copies of the Materials, and to
+ * permit persons to whom the Materials are furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Materials.
+ *
+ * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
+ * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
+ * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
+ *    https://www.khronos.org/registry/
+ *
+ * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+ * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
+ **********************************************************************************/
+
+#ifndef __OPENCL_CL_GL_EXT_H
+#define __OPENCL_CL_GL_EXT_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <CL/cl_gl.h>
+
+/* 
+ *  cl_khr_gl_event extension
+ */
+#define CL_COMMAND_GL_FENCE_SYNC_OBJECT_KHR     0x200D
+
+extern CL_API_ENTRY cl_event CL_API_CALL
+clCreateEventFromGLsyncKHR(cl_context context,
+                           cl_GLsync  cl_GLsync,
+                           cl_int *   errcode_ret) CL_EXT_SUFFIX__VERSION_1_1;
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif	/* __OPENCL_CL_GL_EXT_H  */
--- a/src/3rdparty/CL/cl_platform.h
+++ b/src/3rdparty/CL/cl_platform.h
--- a/src/3rdparty/CL/cl_va_api_media_sharing_intel.h
+++ b/src/3rdparty/CL/cl_va_api_media_sharing_intel.h
@ -0,0 +1,172 @@
+/**********************************************************************************
+ * Copyright (c) 2008-2019 The Khronos Group Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and/or associated documentation files (the
+ * "Materials"), to deal in the Materials without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sublicense, and/or sell copies of the Materials, and to
+ * permit persons to whom the Materials are furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Materials.
+ *
+ * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
+ * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
+ * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
+ *    https://www.khronos.org/registry/
+ *
+ * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+ * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
+ **********************************************************************************/
+/*****************************************************************************\
+
+Copyright (c) 2013-2019 Intel Corporation All Rights Reserved.
+
+THESE MATERIALS ARE PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INTEL OR ITS
+CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY OR TORT (INCLUDING
+NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THESE
+MATERIALS, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+File Name: cl_va_api_media_sharing_intel.h
+
+Abstract:
+
+Notes:
+
+\*****************************************************************************/
+
+
+#ifndef __OPENCL_CL_VA_API_MEDIA_SHARING_INTEL_H
+#define __OPENCL_CL_VA_API_MEDIA_SHARING_INTEL_H
+
+#include <CL/cl.h>
+#include <CL/cl_platform.h>
+#include <va/va.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/******************************************
+* cl_intel_va_api_media_sharing extension *
+*******************************************/
+
+#define cl_intel_va_api_media_sharing 1
+
+/* error codes */
+#define CL_INVALID_VA_API_MEDIA_ADAPTER_INTEL               -1098
+#define CL_INVALID_VA_API_MEDIA_SURFACE_INTEL               -1099
+#define CL_VA_API_MEDIA_SURFACE_ALREADY_ACQUIRED_INTEL      -1100
+#define CL_VA_API_MEDIA_SURFACE_NOT_ACQUIRED_INTEL          -1101
+
+/* cl_va_api_device_source_intel */
+#define CL_VA_API_DISPLAY_INTEL                             0x4094
+
+/* cl_va_api_device_set_intel */
+#define CL_PREFERRED_DEVICES_FOR_VA_API_INTEL               0x4095
+#define CL_ALL_DEVICES_FOR_VA_API_INTEL                     0x4096
+
+/* cl_context_info */
+#define CL_CONTEXT_VA_API_DISPLAY_INTEL                     0x4097
+
+/* cl_mem_info */
+#define CL_MEM_VA_API_MEDIA_SURFACE_INTEL                   0x4098
+
+/* cl_image_info */
+#define CL_IMAGE_VA_API_PLANE_INTEL                         0x4099
+
+/* cl_command_type */
+#define CL_COMMAND_ACQUIRE_VA_API_MEDIA_SURFACES_INTEL      0x409A
+#define CL_COMMAND_RELEASE_VA_API_MEDIA_SURFACES_INTEL      0x409B
+
+typedef cl_uint cl_va_api_device_source_intel;
+typedef cl_uint cl_va_api_device_set_intel;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clGetDeviceIDsFromVA_APIMediaAdapterINTEL(
+    cl_platform_id                platform,
+    cl_va_api_device_source_intel media_adapter_type,
+    void*                         media_adapter,
+    cl_va_api_device_set_intel    media_adapter_set,
+    cl_uint                       num_entries,
+    cl_device_id*                 devices,
+    cl_uint*                      num_devices) CL_EXT_SUFFIX__VERSION_1_2;
+
+typedef CL_API_ENTRY cl_int (CL_API_CALL * clGetDeviceIDsFromVA_APIMediaAdapterINTEL_fn)(
+    cl_platform_id                platform,
+    cl_va_api_device_source_intel media_adapter_type,
+    void*                         media_adapter,
+    cl_va_api_device_set_intel    media_adapter_set,
+    cl_uint                       num_entries,
+    cl_device_id*                 devices,
+    cl_uint*                      num_devices) CL_EXT_SUFFIX__VERSION_1_2;
+
+extern CL_API_ENTRY cl_mem CL_API_CALL
+clCreateFromVA_APIMediaSurfaceINTEL(
+    cl_context                    context,
+    cl_mem_flags                  flags,
+    VASurfaceID*                  surface,
+    cl_uint                       plane,
+    cl_int*                       errcode_ret) CL_EXT_SUFFIX__VERSION_1_2;
+
+typedef CL_API_ENTRY cl_mem (CL_API_CALL * clCreateFromVA_APIMediaSurfaceINTEL_fn)(
+    cl_context                    context,
+    cl_mem_flags                  flags,
+    VASurfaceID*                  surface,
+    cl_uint                       plane,
+    cl_int*                       errcode_ret) CL_EXT_SUFFIX__VERSION_1_2;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clEnqueueAcquireVA_APIMediaSurfacesINTEL(
+    cl_command_queue              command_queue,
+    cl_uint                       num_objects,
+    const cl_mem*                 mem_objects,
+    cl_uint                       num_events_in_wait_list,
+    const cl_event*               event_wait_list,
+    cl_event*                     event) CL_EXT_SUFFIX__VERSION_1_2;
+
+typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueAcquireVA_APIMediaSurfacesINTEL_fn)(
+    cl_command_queue              command_queue,
+    cl_uint                       num_objects,
+    const cl_mem*                 mem_objects,
+    cl_uint                       num_events_in_wait_list,
+    const cl_event*               event_wait_list,
+    cl_event*                     event) CL_EXT_SUFFIX__VERSION_1_2;
+
+extern CL_API_ENTRY cl_int CL_API_CALL
+clEnqueueReleaseVA_APIMediaSurfacesINTEL(
+    cl_command_queue              command_queue,
+    cl_uint                       num_objects,
+    const cl_mem*                 mem_objects,
+    cl_uint                       num_events_in_wait_list,
+    const cl_event*               event_wait_list,
+    cl_event*                     event) CL_EXT_SUFFIX__VERSION_1_2;
+
+typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueReleaseVA_APIMediaSurfacesINTEL_fn)(
+    cl_command_queue              command_queue,
+    cl_uint                       num_objects,
+    const cl_mem*                 mem_objects,
+    cl_uint                       num_events_in_wait_list,
+    const cl_event*               event_wait_list,
+    cl_event*                     event) CL_EXT_SUFFIX__VERSION_1_2;
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif  /* __OPENCL_CL_VA_API_MEDIA_SHARING_INTEL_H */
+
--- a/src/3rdparty/CL/cl_version.h
+++ b/src/3rdparty/CL/cl_version.h
@ -0,0 +1,86 @@
+/*******************************************************************************
+ * Copyright (c) 2018 The Khronos Group Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and/or associated documentation files (the
+ * "Materials"), to deal in the Materials without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sublicense, and/or sell copies of the Materials, and to
+ * permit persons to whom the Materials are furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Materials.
+ *
+ * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
+ * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
+ * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
+ *    https://www.khronos.org/registry/
+ *
+ * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+ * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
+ ******************************************************************************/
+
+#ifndef __CL_VERSION_H
+#define __CL_VERSION_H
+
+/* Detect which version to target */
+#if !defined(CL_TARGET_OPENCL_VERSION)
+#pragma message("cl_version.h: CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 220 (OpenCL 2.2)")
+#define CL_TARGET_OPENCL_VERSION 220
+#endif
+#if CL_TARGET_OPENCL_VERSION != 100 && \
+    CL_TARGET_OPENCL_VERSION != 110 && \
+    CL_TARGET_OPENCL_VERSION != 120 && \
+    CL_TARGET_OPENCL_VERSION != 200 && \
+    CL_TARGET_OPENCL_VERSION != 210 && \
+    CL_TARGET_OPENCL_VERSION != 220
+#pragma message("cl_version.h: CL_TARGET_OPENCL_VERSION is not a valid value (100, 110, 120, 200, 210, 220). Defaulting to 220 (OpenCL 2.2)")
+#undef CL_TARGET_OPENCL_VERSION
+#define CL_TARGET_OPENCL_VERSION 220
+#endif
+
+
+/* OpenCL Version */
+#if CL_TARGET_OPENCL_VERSION >= 220 && !defined(CL_VERSION_2_2)
+#define CL_VERSION_2_2  1
+#endif
+#if CL_TARGET_OPENCL_VERSION >= 210 && !defined(CL_VERSION_2_1)
+#define CL_VERSION_2_1  1
+#endif
+#if CL_TARGET_OPENCL_VERSION >= 200 && !defined(CL_VERSION_2_0)
+#define CL_VERSION_2_0  1
+#endif
+#if CL_TARGET_OPENCL_VERSION >= 120 && !defined(CL_VERSION_1_2)
+#define CL_VERSION_1_2  1
+#endif
+#if CL_TARGET_OPENCL_VERSION >= 110 && !defined(CL_VERSION_1_1)
+#define CL_VERSION_1_1  1
+#endif
+#if CL_TARGET_OPENCL_VERSION >= 100 && !defined(CL_VERSION_1_0)
+#define CL_VERSION_1_0  1
+#endif
+
+/* Allow deprecated APIs for older OpenCL versions. */
+#if CL_TARGET_OPENCL_VERSION <= 210 && !defined(CL_USE_DEPRECATED_OPENCL_2_1_APIS)
+#define CL_USE_DEPRECATED_OPENCL_2_1_APIS
+#endif
+#if CL_TARGET_OPENCL_VERSION <= 200 && !defined(CL_USE_DEPRECATED_OPENCL_2_0_APIS)
+#define CL_USE_DEPRECATED_OPENCL_2_0_APIS
+#endif
+#if CL_TARGET_OPENCL_VERSION <= 120 && !defined(CL_USE_DEPRECATED_OPENCL_1_2_APIS)
+#define CL_USE_DEPRECATED_OPENCL_1_2_APIS
+#endif
+#if CL_TARGET_OPENCL_VERSION <= 110 && !defined(CL_USE_DEPRECATED_OPENCL_1_1_APIS)
+#define CL_USE_DEPRECATED_OPENCL_1_1_APIS
+#endif
+#if CL_TARGET_OPENCL_VERSION <= 100 && !defined(CL_USE_DEPRECATED_OPENCL_1_0_APIS)
+#define CL_USE_DEPRECATED_OPENCL_1_0_APIS
+#endif
+
+#endif  /* __CL_VERSION_H */
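The version selection in cl_version.h above can be tried in isolation. The following is a minimal sketch, not part of the commit: it copies two of the header's `#if` checks instead of including the header, so that it builds without an OpenCL SDK, and `target_has_1_2` / `target_has_2_0` are hypothetical helper names introduced only for illustration. In real code the `#define` would simply precede `#include <CL/cl.h>`.

```c
/* Pin the API level before any OpenCL header is seen.
 * 120 selects OpenCL 1.2; the header accepts 100, 110, 120, 200, 210, 220. */
#define CL_TARGET_OPENCL_VERSION 120

/* Copied from cl_version.h so this sketch compiles on its own;
 * real code would include <CL/cl.h> here instead. */
#if CL_TARGET_OPENCL_VERSION >= 200 && !defined(CL_VERSION_2_0)
#define CL_VERSION_2_0  1
#endif
#if CL_TARGET_OPENCL_VERSION >= 120 && !defined(CL_VERSION_1_2)
#define CL_VERSION_1_2  1
#endif

/* Hypothetical helper: 1 if the 1.2 feature macro ended up defined. */
static int target_has_1_2(void)
{
#ifdef CL_VERSION_1_2
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: 1 if the 2.0 feature macro ended up defined. */
static int target_has_2_0(void)
{
#ifdef CL_VERSION_2_0
    return 1;
#else
    return 0;
#endif
}
```

With a target of 120, declarations guarded by `#ifdef CL_VERSION_1_2` (such as `clCreateFromGLTexture` in cl_gl.h) are visible, while anything guarded by `CL_VERSION_2_0` is not. Leaving the macro undefined makes the header default to 220 and emit a `#pragma message`.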
--- a/src/3rdparty/CL/opencl.h
+++ b/src/3rdparty/CL/opencl.h
@ -0,0 +1,47 @@
+/*******************************************************************************
+ * Copyright (c) 2008-2015 The Khronos Group Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and/or associated documentation files (the
+ * "Materials"), to deal in the Materials without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sublicense, and/or sell copies of the Materials, and to
+ * permit persons to whom the Materials are furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Materials.
+ *
+ * MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS
+ * KHRONOS STANDARDS. THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS
+ * SPECIFICATIONS AND HEADER INFORMATION ARE LOCATED AT
+ *    https://www.khronos.org/registry/
+ *
+ * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+ * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.
+ ******************************************************************************/
+
+/* $Revision: 11708 $ on $Date: 2010-06-13 23:36:24 -0700 (Sun, 13 Jun 2010) $ */
+
+#ifndef __OPENCL_H
+#define __OPENCL_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <CL/cl.h>
+#include <CL/cl_gl.h>
+#include <CL/cl_gl_ext.h>
+#include <CL/cl_ext.h>
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif  /* __OPENCL_H   */
--- a/src/3rdparty/argon2/CMakeLists.txt
+++ b/src/3rdparty/argon2/CMakeLists.txt
@ -22,7 +22,7 @@ set(ARGON2_X86_64_SOURCES arch/x86_64/lib/argon2-arch.c arch/x86_64/lib/cpu-flag
 if (CMAKE_C_COMPILER_ID MATCHES MSVC)
    function(add_feature_impl FEATURE MSVC_FLAG DEF)
        add_library(argon2-${FEATURE} STATIC arch/x86_64/lib/argon2-${FEATURE}.c)
-        target_include_directories(argon2-${FEATURE} PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/include)
+        target_include_directories(argon2-${FEATURE} PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/../../)
        target_include_directories(argon2-${FEATURE} PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/lib)
        set_target_properties(argon2-${FEATURE} PROPERTIES POSITION_INDEPENDENT_CODE True)

@ -38,7 +38,7 @@ if (CMAKE_C_COMPILER_ID MATCHES MSVC)
 elseif (NOT XMRIG_ARM AND CMAKE_SIZEOF_VOID_P EQUAL 8)
    function(add_feature_impl FEATURE GCC_FLAG DEF)
        add_library(argon2-${FEATURE} STATIC arch/x86_64/lib/argon2-${FEATURE}.c)
-        target_include_directories(argon2-${FEATURE} PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/include)
+        target_include_directories(argon2-${FEATURE} PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/../../)
        target_include_directories(argon2-${FEATURE} PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/lib)
        set_target_properties(argon2-${FEATURE} PROPERTIES POSITION_INDEPENDENT_CODE True)

@ -84,5 +84,5 @@ endif()
 add_library(argon2 STATIC ${ARGON2_SOURCES})
 target_link_libraries(argon2 ${ARGON2_LIBS})

-target_include_directories(argon2 PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/include)
+target_include_directories(argon2 PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/../../)
 target_include_directories(argon2 PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/lib)
--- a/src/3rdparty/argon2/lib/argon2.c
+++ b/src/3rdparty/argon2/lib/argon2.c
@ -15,7 +15,7 @@
 #include <stdlib.h>
 #include <stdio.h>

-#include "argon2.h"
+#include "3rdparty/argon2.h"
 #include "encoding.h"
 #include "core.h"

--- a/src/3rdparty/argon2/lib/core.h
+++ b/src/3rdparty/argon2/lib/core.h
@ -14,7 +14,7 @@
 #ifndef ARGON2_CORE_H
 #define ARGON2_CORE_H

-#include "argon2.h"
+#include "3rdparty/argon2.h"

 #if defined(_MSC_VER)
 #define ALIGN(n) __declspec(align(16))
--- a/src/3rdparty/argon2/lib/encoding.h
+++ b/src/3rdparty/argon2/lib/encoding.h
@ -1,6 +1,6 @@
 #ifndef ENCODING_H
 #define ENCODING_H
-#include "argon2.h"
+#include "3rdparty/argon2.h"

 #define ARGON2_MAX_DECODED_LANES UINT32_C(255)
 #define ARGON2_MIN_DECODED_SALT_LEN UINT32_C(8)
--- a/src/3rdparty/argon2/lib/impl-select.c
+++ b/src/3rdparty/argon2/lib/impl-select.c
@ -3,7 +3,7 @@

 #include "impl-select.h"

-#include "argon2.h"
+#include "3rdparty/argon2.h"

 #define BENCH_SAMPLES 1024
 #define BENCH_MEM_BLOCKS 512
--- a/src/3rdparty/base32/base32.h
+++ b/src/3rdparty/base32/base32.h
@ -0,0 +1,68 @@
+// Base32 implementation
+//
+// Copyright 2010 Google Inc.
+// Author: Markus Gutschke
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//      http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+//
+// Encode and decode from base32 encoding using the following alphabet:
+//   ABCDEFGHIJKLMNOPQRSTUVWXYZ234567
+// This alphabet is documented in RFC 4648/3548
+//
+// We allow white-space and hyphens, but all other characters are considered
+// invalid.
+//
+// Returns the number of output bytes or -1 on error. If the output buffer is
+// too small, the result is silently truncated and left without a trailing NUL.
+
+#ifndef XMRIG_BASE32_H
+#define XMRIG_BASE32_H
+
+
+#include <stdint.h>
+
+
+static inline int base32_encode(const uint8_t *data, int length, uint8_t *result, int bufSize) {
+  if (length < 0 || length > (1 << 28)) {
+    return -1;
+  }
+  int count = 0;
+  if (length > 0) {
+    uint32_t buffer = data[0];
+    int next = 1;
+    int bitsLeft = 8;
+    while (count < bufSize && (bitsLeft > 0 || next < length)) {
+      if (bitsLeft < 5) {
+        if (next < length) {
+          buffer <<= 8;
+          buffer |= data[next++] & 0xFF;
+          bitsLeft += 8;
+        } else {
+          int pad = 5 - bitsLeft;
+          buffer <<= pad;
+          bitsLeft += pad;
+        }
+      }
+      int index = 0x1F & (buffer >> (bitsLeft - 5));
+      bitsLeft -= 5;
+      result[count++] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"[index];
+    }
+  }
+  if (count < bufSize) {
+    result[count] = '\000';
+  }
+  return count;
+}
+
+
+#endif /* XMRIG_BASE32_H */
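A short usage sketch for the encoder above, under stated assumptions: the function body is restated as `base32_encode_sketch` so the example is self-contained, it uses an unsigned accumulator so the left shifts stay well defined, and `encode_cstr` is a hypothetical wrapper that is not part of the commit. The wrapper rejects undersized buffers up front, because the encoder itself truncates silently.

```c
#include <stdint.h>
#include <string.h>

/* Same algorithm as base32_encode() above, restated so the sketch stands alone.
 * The accumulator is unsigned; only its low bitsLeft bits are ever read. */
static int base32_encode_sketch(const uint8_t *data, int length, uint8_t *result, int bufSize)
{
    if (length < 0 || length > (1 << 28)) {
        return -1;
    }
    int count = 0;
    if (length > 0) {
        uint32_t buffer = data[0];
        int next = 1;
        int bitsLeft = 8;
        while (count < bufSize && (bitsLeft > 0 || next < length)) {
            if (bitsLeft < 5) {
                if (next < length) {
                    /* Pull in the next input byte. */
                    buffer = (buffer << 8) | data[next++];
                    bitsLeft += 8;
                } else {
                    /* Input exhausted: pad the last group with zero bits. */
                    int pad = 5 - bitsLeft;
                    buffer <<= pad;
                    bitsLeft += pad;
                }
            }
            int index = (int) (0x1F & (buffer >> (bitsLeft - 5)));
            bitsLeft -= 5;
            result[count++] = (uint8_t) "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"[index];
        }
    }
    if (count < bufSize) {
        result[count] = '\0';
    }
    return count;
}

/* Hypothetical wrapper: encode a C string into `out` and always NUL-terminate.
 * Returns the encoded length, or -1 if `out` cannot hold the full result. */
static int encode_cstr(const char *in, char *out, int outSize)
{
    const int len      = (int) strlen(in);
    const int expected = (len * 8 + 4) / 5;   /* ceil(bits / 5), no '=' padding */

    if (outSize < expected + 1) {
        return -1;
    }
    const int n = base32_encode_sketch((const uint8_t *) in, len, (uint8_t *) out, outSize - 1);
    if (n < 0) {
        return -1;
    }
    out[n] = '\0';
    return n;
}
```

Note that this encoder emits no `=` padding, so its output for "foobar" is the RFC 4648 test vector with the trailing padding characters removed.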
--- a/src/3rdparty/cl.h
+++ b/src/3rdparty/cl.h
@ -0,0 +1,36 @@
+/* XMRig
+ * Copyright 2010      Jeff Garzik <jgarzik@pobox.com>
+ * Copyright 2012-2014 pooler      <pooler@litecoinpool.org>
+ * Copyright 2014      Lucas Jones <https://github.com/lucasjones>
+ * Copyright 2014-2016 Wolf9466    <https://github.com/OhGodAPet>
+ * Copyright 2016      Jay D Dee   <jayddee246@gmail.com>
+ * Copyright 2017-2018 XMR-Stak    <https://github.com/fireice-uk>, <https://github.com/psychocrypt>
+ * Copyright 2018-2019 SChernykh   <https://github.com/SChernykh>
+ * Copyright 2016-2019 XMRig       <https://github.com/xmrig>, <support@xmrig.com>
+ *
+ *   This program is free software: you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation, either version 3 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef XMRIG_CL_H
+#define XMRIG_CL_H
+
+
+#if defined(__APPLE__)
+#   include <OpenCL/cl.h>
+#else
+#   include "3rdparty/CL/cl.h"
+#endif
+
+
+#endif /* XMRIG_CL_H */
--- a/src/App.cpp
+++ b/src/App.cpp
@ -24,7 +24,7 @@
 */


-#include <stdlib.h>
+#include <cstdlib>
 #include <uv.h>


@ -36,24 +36,14 @@
 #include "core/config/Config.h"
 #include "core/Controller.h"
 #include "core/Miner.h"
-#include "crypto/common/VirtualMemory.h"
 #include "net/Network.h"
 #include "Summary.h"
 #include "version.h"


-xmrig::App::App(Process *process) :
-    m_console(nullptr),
-    m_signals(nullptr)
+xmrig::App::App(Process *process)
 {
    m_controller = new Controller(process);
-    if (m_controller->init() != 0) {
-        return;
-    }
-
-    if (!m_controller->config()->isBackground()) {
-        m_console = new Console(this);
-    }
 }


@ -68,14 +58,26 @@ xmrig::App::~App()
 int xmrig::App::exec()
 {
    if (!m_controller->isReady()) {
+        LOG_EMERG("no valid configuration found.");
+
        return 2;
    }

    m_signals = new Signals(this);

-    background();
+    int rc = 0;
+    if (background(rc)) {
+        return rc;
+    }

-    VirtualMemory::init(m_controller->config()->cpu().isHugePages());
+    rc = m_controller->init();
+    if (rc != 0) {
+        return rc;
+    }
+
+    if (!m_controller->isBackground()) {
+        m_console = new Console(this);
+    }

    Summary::print(m_controller);

@ -87,38 +89,21 @@ int xmrig::App::exec()

    m_controller->start();

-    const int r = uv_run(uv_default_loop(), UV_RUN_DEFAULT);
+    rc = uv_run(uv_default_loop(), UV_RUN_DEFAULT);
    uv_loop_close(uv_default_loop());

-    return r;
+    return rc;
 }


 void xmrig::App::onConsoleCommand(char command)
 {
-    switch (command) {
-    case 'h':
-    case 'H':
-        m_controller->miner()->printHashrate(true);
-        break;
-
-    case 'p':
-    case 'P':
-        m_controller->miner()->setEnabled(false);
-        break;
-
-    case 'r':
-    case 'R':
-        m_controller->miner()->setEnabled(true);
-        break;
-
-    case 3:
+    if (command == 3) {
        LOG_WARN("Ctrl+C received, exiting");
        close();
-        break;
-
-    default:
-        break;
+    }
+    else {
+        m_controller->miner()->execCommand(command);
    }
 }

--- a/src/App.h
+++ b/src/App.h
@ -29,6 +29,7 @@

 #include "base/kernel/interfaces/IConsoleListener.h"
 #include "base/kernel/interfaces/ISignalListener.h"
+#include "base/tools/Object.h"


 namespace xmrig {
@ -44,6 +45,8 @@ class Signals;
 class App : public IConsoleListener, public ISignalListener
 {
 public:
+    XMRIG_DISABLE_COPY_MOVE_DEFAULT(App)
+
    App(Process *process);
    ~App() override;

@ -54,12 +57,12 @@ protected:
    void onSignal(int signum) override;

 private:
-    void background();
+    bool background(int &rc);
    void close();

-    Console *m_console;
-    Controller *m_controller;
-    Signals *m_signals;
+    Console *m_console          = nullptr;
+    Controller *m_controller    = nullptr;
+    Signals *m_signals          = nullptr;
 };


--- a/src/App_unix.cpp
+++ b/src/App_unix.cpp
@ -23,33 +23,36 @@
 */


-#include <stdlib.h>
-#include <signal.h>
-#include <errno.h>
+#include <cstdlib>
+#include <csignal>
+#include <cerrno>
 #include <unistd.h>


 #include "App.h"
 #include "base/io/log/Log.h"
-#include "core/config/Config.h"
 #include "core/Controller.h"


-void xmrig::App::background()
+bool xmrig::App::background(int &rc)
 {
    signal(SIGPIPE, SIG_IGN);

-    if (!m_controller->config()->isBackground()) {
-        return;
+    if (!m_controller->isBackground()) {
+        return false;
    }

    int i = fork();
    if (i < 0) {
-        exit(1);
+        rc = 1;
+
+        return true;
    }

    if (i > 0) {
-        exit(0);
+        rc = 0;
+
+        return true;
    }

    i = setsid();
@ -62,4 +65,6 @@ void xmrig::App::background()
    if (i < 0) {
        LOG_ERR("chdir() failed (errno = %d)", errno);
    }
+
+    return false;
 }
--- a/src/App_win.cpp
+++ b/src/App_win.cpp
@ -29,13 +29,12 @@

 #include "App.h"
 #include "core/Controller.h"
-#include "core/config/Config.h"


-void xmrig::App::background()
+bool xmrig::App::background(int &)
 {
-    if (!m_controller->config()->isBackground()) {
-        return;
+    if (!m_controller->isBackground()) {
+        return false;
    }

    HWND hcon = GetConsoleWindow();
@ -46,4 +45,6 @@ void xmrig::App::background()
        CloseHandle(h);
        FreeConsole();
    }
+
+    return false;
 }
--- a/src/Summary.cpp
+++ b/src/Summary.cpp
@ -23,8 +23,8 @@
 */


-#include <inttypes.h>
-#include <stdio.h>
+#include <cinttypes>
+#include <cstdio>
 #include <uv.h>


@ -59,10 +59,10 @@ inline static const char *asmName(Assembly::Id assembly)
 #endif


-static void print_memory(Config *) {
+static void print_memory(Config *config) {
 #   ifdef _WIN32
    Log::print(GREEN_BOLD(" * ") WHITE_BOLD("%-13s") "%s",
-               "HUGE PAGES", VirtualMemory::isHugepagesAvailable() ? GREEN_BOLD("permission granted") : RED_BOLD("unavailable"));
+               "HUGE PAGES", config->cpu().isHugePages() ? (VirtualMemory::isHugepagesAvailable() ? GREEN_BOLD("permission granted") : RED_BOLD("unavailable")) : RED_BOLD("disabled"));
 #   endif
 }

@ -126,9 +126,9 @@ static void print_threads(Config *config)
 static void print_commands(Config *)
 {
    if (Log::colors) {
-        Log::print(GREEN_BOLD(" * ") WHITE_BOLD("COMMANDS     ") MAGENTA_BOLD("h") WHITE_BOLD("ashrate, ")
-                                                                     MAGENTA_BOLD("p") WHITE_BOLD("ause, ")
-                                                                     MAGENTA_BOLD("r") WHITE_BOLD("esume"));
+        Log::print(GREEN_BOLD(" * ") WHITE_BOLD("COMMANDS     ") MAGENTA_BG(WHITE_BOLD_S "h") WHITE_BOLD("ashrate, ")
+                                                                     MAGENTA_BG(WHITE_BOLD_S "p") WHITE_BOLD("ause, ")
+                                                                     MAGENTA_BG(WHITE_BOLD_S "r") WHITE_BOLD("esume"));
    }
    else {
        Log::print(" * COMMANDS     'h' hashrate, 'p' pause, 'r' resume");
--- a/src/backend/backend.cmake
+++ b/src/backend/backend.cmake
@ -1,13 +1,19 @@
 include (src/backend/cpu/cpu.cmake)
+include (src/backend/opencl/opencl.cmake)
+include (src/backend/cuda/cuda.cmake)
 include (src/backend/common/common.cmake)


 set(HEADERS_BACKEND
    "${HEADERS_BACKEND_COMMON}"
    "${HEADERS_BACKEND_CPU}"
+    "${HEADERS_BACKEND_OPENCL}"
+    "${HEADERS_BACKEND_CUDA}"
   )

 set(SOURCES_BACKEND
    "${SOURCES_BACKEND_COMMON}"
    "${SOURCES_BACKEND_CPU}"
+    "${SOURCES_BACKEND_OPENCL}"
+    "${SOURCES_BACKEND_CUDA}"
   )
--- a/src/backend/common/Hashrate.cpp
+++ b/src/backend/common/Hashrate.cpp
@ -23,10 +23,10 @@
 */


-#include <assert.h>
+#include <cassert>
 #include <cmath>
 #include <memory.h>
-#include <stdio.h>
+#include <cstdio>


 #include "backend/common/Hashrate.h"
@ -47,7 +47,6 @@ inline static const char *format(double h, char *buf, size_t size)


 xmrig::Hashrate::Hashrate(size_t threads) :
-    m_highest(0.0),
    m_threads(threads)
 {
    m_counts     = new uint64_t*[threads];
@ -100,30 +99,30 @@ double xmrig::Hashrate::calc(size_t threadId, size_t ms) const

    uint64_t earliestHashCount = 0;
    uint64_t earliestStamp     = 0;
-    uint64_t lastestStamp      = 0;
-    uint64_t lastestHashCnt    = 0;
    bool haveFullSet           = false;

-    for (size_t i = 1; i < kBucketSize; i++) {
-        const size_t idx = (m_top[threadId] - i) & kBucketMask;
+    const uint64_t timeStampLimit = xmrig::Chrono::highResolutionMSecs() - ms;
+    uint64_t* timestamps = m_timestamps[threadId];
+    uint64_t* counts = m_counts[threadId];

-        if (m_timestamps[threadId][idx] == 0) {
+    const size_t idx_start = (m_top[threadId] - 1) & kBucketMask;
+    size_t idx = idx_start;
+
+    uint64_t lastestStamp = timestamps[idx];
+    uint64_t lastestHashCnt = counts[idx];
+
+    do {
+        if (timestamps[idx] < timeStampLimit) {
+            haveFullSet = (timestamps[idx] != 0);
+            if (idx != idx_start) {
+                idx = (idx + 1) & kBucketMask;
+                earliestStamp = timestamps[idx];
+                earliestHashCount = counts[idx];
+            }
            break;
        }
-
-        if (lastestStamp == 0) {
-            lastestStamp = m_timestamps[threadId][idx];
-            lastestHashCnt = m_counts[threadId][idx];
-        }
-
-        if (xmrig::Chrono::highResolutionMSecs() - m_timestamps[threadId][idx] > ms) {
-            haveFullSet = true;
-            break;
-        }
-
-        earliestStamp = m_timestamps[threadId][idx];
-        earliestHashCount = m_counts[threadId][idx];
-    }
+        idx = (idx - 1) & kBucketMask;
+    } while (idx != idx_start);

    if (!haveFullSet || earliestStamp == 0 || lastestStamp == 0) {
        return nan("");
@ -133,8 +132,8 @@ double xmrig::Hashrate::calc(size_t threadId, size_t ms) const
        return nan("");
    }

-    const double hashes = static_cast<double>(lastestHashCnt - earliestHashCount);
-    const double time   = static_cast<double>(lastestStamp - earliestStamp) / 1000.0;
+    const auto hashes = static_cast<double>(lastestHashCnt - earliestHashCount);
+    const auto time   = static_cast<double>(lastestStamp - earliestStamp) / 1000.0;

    return hashes / time;
 }
@ -150,15 +149,6 @@ void xmrig::Hashrate::add(size_t threadId, uint64_t count, uint64_t timestamp)
 }


-void xmrig::Hashrate::updateHighest()
-{
-   double highest = calc(ShortInterval);
-   if (std::isnormal(highest) && highest > m_highest) {
-       m_highest = highest;
-   }
-}
-
-
 const char *xmrig::Hashrate::format(double h, char *buf, size_t size)
 {
    return ::format(h, buf, size);
@ -175,3 +165,33 @@ rapidjson::Value xmrig::Hashrate::normalize(double d)

    return Value(floor(d * 100.0) / 100.0);
 }
+
+
+#ifdef XMRIG_FEATURE_API
+rapidjson::Value xmrig::Hashrate::toJSON(rapidjson::Document &doc) const
+{
+    using namespace rapidjson;
+    auto &allocator = doc.GetAllocator();
+
+    Value out(kArrayType);
+    out.PushBack(normalize(calc(ShortInterval)),  allocator);
+    out.PushBack(normalize(calc(MediumInterval)), allocator);
+    out.PushBack(normalize(calc(LargeInterval)),  allocator);
+
+    return out;
+}
+
+
+rapidjson::Value xmrig::Hashrate::toJSON(size_t threadId, rapidjson::Document &doc) const
+{
+    using namespace rapidjson;
+    auto &allocator = doc.GetAllocator();
+
+    Value out(kArrayType);
+    out.PushBack(normalize(calc(threadId, ShortInterval)),  allocator);
+    out.PushBack(normalize(calc(threadId, MediumInterval)), allocator);
+    out.PushBack(normalize(calc(threadId, LargeInterval)),  allocator);
+
+    return out;
+}
+#endif
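Review note on the `Hashrate::calc` rewrite above: the new loop reads the clock once, walks the ring buffer backwards from the newest sample, and stops at the first sample older than the window. The block below is a minimal single-thread sketch of that idea, not the project's code. `RateWindow`, its 16-slot size, and the explicit `nowMs` parameter are assumptions made for illustration, and it assumes `nowMs >= ms`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified stand-in for one thread's sample ring buffer.
struct RateWindow {
    static constexpr size_t kSize = 16;          // must be a power of two
    static constexpr size_t kMask = kSize - 1;

    uint64_t stamps[kSize] = {};                 // 0 means "slot never written"
    uint64_t counts[kSize] = {};
    size_t top = 0;                              // next slot to write

    void add(uint64_t count, uint64_t stampMs) {
        stamps[top] = stampMs;
        counts[top] = count;
        top = (top + 1) & kMask;
    }

    // Hashes per second over the last `ms` milliseconds, or NaN when the
    // buffer does not yet cover the whole window.
    double calc(uint64_t nowMs, uint64_t ms) const {
        const uint64_t limit = nowMs - ms;       // oldest timestamp still inside the window
        const size_t start   = (top - 1) & kMask;
        size_t idx           = start;

        const uint64_t latestStamp = stamps[idx];
        const uint64_t latestCount = counts[idx];
        uint64_t earliestStamp = 0;
        uint64_t earliestCount = 0;
        bool haveFullSet       = false;

        do {
            if (stamps[idx] < limit) {
                // First sample outside the window; the one after it is the earliest inside.
                haveFullSet = (stamps[idx] != 0);
                if (idx != start) {
                    idx = (idx + 1) & kMask;
                    earliestStamp = stamps[idx];
                    earliestCount = counts[idx];
                }
                break;
            }
            idx = (idx - 1) & kMask;
        } while (idx != start);

        if (!haveFullSet || earliestStamp == 0 || latestStamp == 0) {
            return std::nan("");
        }
        if (latestStamp == earliestStamp) {
            return std::nan("");
        }

        const auto hashes = static_cast<double>(latestCount - earliestCount);
        const auto time   = static_cast<double>(latestStamp - earliestStamp) / 1000.0;
        return hashes / time;
    }
};
```

Passing the current time in as a parameter is only there to make the sketch deterministic; the patched code reads `Chrono::highResolutionMSecs()` once before the loop instead of once per iteration.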
--- a/src/backend/common/Hashrate.h
+++ b/src/backend/common/Hashrate.h
@ -26,10 +26,11 @@
 #define XMRIG_HASHRATE_H


-#include <stddef.h>
-#include <stdint.h>
+#include <cstddef>
+#include <cstdint>


+#include "base/tools/Object.h"
 #include "rapidjson/fwd.h"


@ -39,6 +40,8 @@ namespace xmrig {
 class Hashrate
 {
 public:
+    XMRIG_DISABLE_COPY_MOVE_DEFAULT(Hashrate)
+
    enum Intervals {
        ShortInterval  = 10000,
        MediumInterval = 60000,
@ -50,19 +53,21 @@ public:
    double calc(size_t ms) const;
    double calc(size_t threadId, size_t ms) const;
    void add(size_t threadId, uint64_t count, uint64_t timestamp);
-    void updateHighest();

-    inline double highest() const { return m_highest; }
    inline size_t threads() const { return m_threads; }

    static const char *format(double h, char *buf, size_t size);
    static rapidjson::Value normalize(double d);

+#   ifdef XMRIG_FEATURE_API
+    rapidjson::Value toJSON(rapidjson::Document &doc) const;
+    rapidjson::Value toJSON(size_t threadId, rapidjson::Document &doc) const;
+#   endif
+
 private:
    constexpr static size_t kBucketSize = 2 << 11;
    constexpr static size_t kBucketMask = kBucketSize - 1;

-    double m_highest;
    size_t m_threads;
    uint32_t* m_top;
    uint64_t** m_counts;
--- a/src/backend/common/Tags.h
+++ b/src/backend/common/Tags.h
@ -0,0 +1,60 @@
+/* XMRig
+ * Copyright 2010      Jeff Garzik <jgarzik@pobox.com>
+ * Copyright 2012-2014 pooler      <pooler@litecoinpool.org>
+ * Copyright 2014      Lucas Jones <https://github.com/lucasjones>
+ * Copyright 2014-2016 Wolf9466    <https://github.com/OhGodAPet>
+ * Copyright 2016      Jay D Dee   <jayddee246@gmail.com>
+ * Copyright 2017-2018 XMR-Stak    <https://github.com/fireice-uk>, <https://github.com/psychocrypt>
+ * Copyright 2018      Lee Clagett <https://github.com/vtnerd>
+ * Copyright 2018-2019 SChernykh   <https://github.com/SChernykh>
+ * Copyright 2016-2019 XMRig       <https://github.com/xmrig>, <support@xmrig.com>
+ *
+ *   This program is free software: you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation, either version 3 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef XMRIG_TAGS_H
+#define XMRIG_TAGS_H
+
+
+#include <cstdint>
+
+
+namespace xmrig {
+
+
+const char *backend_tag(uint32_t backend);
+const char *cpu_tag();
+const char *net_tag();
+
+
+#ifdef XMRIG_FEATURE_OPENCL
+const char *ocl_tag();
+#endif
+
+
+#ifdef XMRIG_FEATURE_CUDA
+const char *cuda_tag();
+#endif
+
+
+
+#ifdef XMRIG_ALGO_RANDOMX
+const char *rx_tag();
+#endif
+
+
+} // namespace xmrig
+
+
+#endif /* XMRIG_TAGS_H */
--- a/src/backend/common/Thread.h
+++ b/src/backend/common/Thread.h
@ -26,10 +26,11 @@
 #define XMRIG_THREAD_H


-#include <thread>
-
-
 #include "backend/common/interfaces/IWorker.h"
+#include "base/tools/Object.h"
+
+
+#include <thread>


 namespace xmrig {
@ -42,18 +43,20 @@ template<class T>
 class Thread
 {
 public:
-    inline Thread(IBackend *backend, size_t index, const T &config) : m_index(index), m_config(config), m_backend(backend) {}
+    XMRIG_DISABLE_COPY_MOVE_DEFAULT(Thread)
+
+    inline Thread(IBackend *backend, size_t id, const T &config) : m_id(id), m_config(config), m_backend(backend) {}
    inline ~Thread() { m_thread.join(); delete m_worker; }

    inline const T &config() const                  { return m_config; }
    inline IBackend *backend() const                { return m_backend; }
    inline IWorker *worker() const                  { return m_worker; }
-    inline size_t index() const                     { return m_index; }
+    inline size_t id() const                        { return m_id; }
    inline void setWorker(IWorker *worker)          { m_worker = worker; }
    inline void start(void (*callback) (void *))    { m_thread = std::thread(callback, this); }

 private:
-    const size_t m_index    = 0;
+    const size_t m_id    = 0;
    const T m_config;
    IBackend *m_backend;
    IWorker *m_worker       = nullptr;
--- a/src/backend/common/Threads.cpp
+++ b/src/backend/common/Threads.cpp
@ -25,13 +25,25 @@

 #include "backend/common/Threads.h"
 #include "backend/cpu/CpuThreads.h"
+#include "crypto/cn/CnAlgo.h"
 #include "rapidjson/document.h"


+#ifdef XMRIG_FEATURE_OPENCL
+#   include "backend/opencl/OclThreads.h"
+#endif
+
+
+#ifdef XMRIG_FEATURE_CUDA
+#   include "backend/cuda/CudaThreads.h"
+#endif
+
+
 namespace xmrig {


 static const char *kAsterisk = "*";
+static const char *kCn2      = "cn/2";


 } // namespace xmrig
@ -113,6 +125,10 @@ xmrig::String xmrig::Threads<T>::profileName(const Algorithm &algorithm, bool st
        return String();
    }

+    if (algorithm.family() == Algorithm::CN && CnAlgo<>::base(algorithm) == Algorithm::CN_2 && has(kCn2)) {
+        return kCn2;
+    }
+
    if (name.contains("/")) {
        const String base = name.split('/').at(0);
        if (has(base)) {
@ -152,4 +168,12 @@ namespace xmrig {

 template class Threads<CpuThreads>;

+#ifdef XMRIG_FEATURE_OPENCL
+template class Threads<OclThreads>;
+#endif
+
+#ifdef XMRIG_FEATURE_CUDA
+template class Threads<CudaThreads>;
+#endif
+
 } // namespace xmrig
--- a/src/backend/common/Threads.h
+++ b/src/backend/common/Threads.h
@ -44,10 +44,26 @@ class Threads
 public:
    inline bool has(const char *profile) const                                         { return m_profiles.count(profile) > 0; }
    inline bool isDisabled(const Algorithm &algo) const                                { return m_disabled.count(algo) > 0; }
+    inline bool isEmpty() const                                                        { return m_profiles.empty(); }
    inline bool isExist(const Algorithm &algo) const                                   { return isDisabled(algo) || m_aliases.count(algo) > 0 || has(algo.shortName()); }
    inline const T &get(const Algorithm &algo, bool strict = false) const              { return get(profileName(algo, strict)); }
    inline void disable(const Algorithm &algo)                                         { m_disabled.insert(algo); }
-    inline void move(const char *profile, T &&threads)                                 { m_profiles.insert({ profile, threads }); }
+    inline void setAlias(const Algorithm &algo, const char *profile)                   { m_aliases[algo] = profile; }
+
+    inline size_t move(const char *profile, T &&threads)
+    {
+        if (has(profile)) {
+            return 0;
+        }
+
+        const size_t count = threads.count();
+
+        if (!threads.isEmpty()) {
+            m_profiles.insert({ profile, std::move(threads) });
+        }
+
+        return count;
+    }

    const T &get(const String &profileName) const;
    size_t read(const rapidjson::Value &value);
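Review note on the new `Threads<T>::move` above: it no longer overwrites an existing profile, it skips empty thread lists, and it reports how many threads were offered. A minimal sketch of that contract follows; `ThreadList` and `Profiles` are hypothetical stand-ins for the real `T` and `Threads<T>`, and `std::map` is an assumption about the container.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical thread list exposing the two members the patch relies on.
struct ThreadList {
    std::vector<int> items;

    size_t count() const { return items.size(); }
    bool isEmpty() const { return items.empty(); }
};

class Profiles {
public:
    bool has(const std::string &name) const { return m_profiles.count(name) > 0; }

    // Returns the number of threads in `threads`, or 0 if the profile already exists.
    size_t move(const std::string &name, ThreadList &&threads) {
        if (has(name)) {
            return 0;                            // existing profiles are never replaced
        }

        const size_t count = threads.count();    // read before the list is moved from

        if (!threads.isEmpty()) {
            m_profiles.emplace(name, std::move(threads));
        }

        return count;
    }

private:
    std::map<std::string, ThreadList> m_profiles;
};
```

Reading `count()` before `std::move` matters, because a moved-from list would typically report zero.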
--- a/src/backend/common/Worker.cpp
+++ b/src/backend/common/Worker.cpp
@ -34,8 +34,7 @@ xmrig::Worker::Worker(size_t id, int64_t affinity, int priority) :
    m_affinity(affinity),
    m_id(id),
    m_hashCount(0),
-    m_timestamp(0),
-    m_count(0)
+    m_timestamp(0)
 {
    m_node = VirtualMemory::bindToNUMANode(affinity);

--- a/src/backend/common/Worker.h
+++ b/src/backend/common/Worker.h
@ -28,7 +28,7 @@


 #include <atomic>
-#include <stdint.h>
+#include <cstdint>


 #include "backend/common/interfaces/IWorker.h"
@ -54,8 +54,8 @@ protected:
    const size_t m_id;
    std::atomic<uint64_t> m_hashCount;
    std::atomic<uint64_t> m_timestamp;
-    uint32_t m_node = 0;
-    uint64_t m_count;
+    uint32_t m_node     = 0;
+    uint64_t m_count    = 0;
 };


--- a/src/backend/common/WorkerJob.h
+++ b/src/backend/common/WorkerJob.h
@ -26,7 +26,7 @@
 #define XMRIG_WORKERJOB_H


-#include <string.h>
+#include <cstring>


 #include "base/net/stratum/Job.h"
@ -47,9 +47,9 @@ public:
    inline uint8_t index() const            { return m_index; }


-    inline void add(const Job &job, uint64_t sequence, uint32_t reserveCount)
+    inline void add(const Job &job, uint32_t reserveCount, Nonce::Backend backend)
    {
-        m_sequence = sequence;
+        m_sequence = Nonce::sequence(backend);

        if (currentJob() == job) {
            return;
@ -60,35 +60,37 @@ public:
            return;
        }

-        save(job, reserveCount);
+        save(job, reserveCount, backend);
    }


-    inline void nextRound(uint32_t reserveCount)
+    inline void nextRound(uint32_t rounds, uint32_t roundSize)
    {
        m_rounds[index()]++;

-        if ((m_rounds[index()] % reserveCount) == 0) {
+        if ((m_rounds[index()] % rounds) == 0) {
            for (size_t i = 0; i < N; ++i) {
-                *nonce(i) = Nonce::next(index(), *nonce(i), reserveCount, currentJob().isNicehash());
+                *nonce(i) = Nonce::next(index(), *nonce(i), rounds * roundSize, currentJob().isNicehash());
            }
        }
        else {
            for (size_t i = 0; i < N; ++i) {
-                *nonce(i) += 1;
+                *nonce(i) += roundSize;
            }
        }
    }


 private:
-    inline void save(const Job &job, uint32_t reserveCount)
+    inline void save(const Job &job, uint32_t reserveCount, Nonce::Backend backend)
    {
        m_index           = job.index();
        const size_t size = job.size();
        m_jobs[index()]   = job;
        m_rounds[index()] = 0;

+        m_jobs[index()].setBackend(backend);
+
        for (size_t i = 0; i < N; ++i) {
            memcpy(m_blobs[index()] + (i * size), job.blob(), size);
            *nonce(i) = Nonce::next(index(), *nonce(i), reserveCount, job.isNicehash());
@ -96,7 +98,7 @@ private:
    }


-    alignas(16) uint8_t m_blobs[2][Job::kMaxBlobSize * N];
+    alignas(16) uint8_t m_blobs[2][Job::kMaxBlobSize * N]{};
    Job m_jobs[2];
    uint32_t m_rounds[2] = { 0, 0 };
    uint64_t m_sequence  = 0;
@ -112,26 +114,28 @@ inline uint32_t *xmrig::WorkerJob<1>::nonce(size_t)


 template<>
-inline void xmrig::WorkerJob<1>::nextRound(uint32_t reserveCount)
+inline void xmrig::WorkerJob<1>::nextRound(uint32_t rounds, uint32_t roundSize)
 {
    m_rounds[index()]++;

-    if ((m_rounds[index()] % reserveCount) == 0) {
-        *nonce() = Nonce::next(index(), *nonce(), reserveCount, currentJob().isNicehash());
+    if ((m_rounds[index()] % rounds) == 0) {
+        *nonce() = Nonce::next(index(), *nonce(), rounds * roundSize, currentJob().isNicehash());
    }
    else {
-        *nonce() += 1;
+        *nonce() += roundSize;
    }
 }


 template<>
-inline void xmrig::WorkerJob<1>::save(const Job &job, uint32_t reserveCount)
+inline void xmrig::WorkerJob<1>::save(const Job &job, uint32_t reserveCount, Nonce::Backend backend)
 {
    m_index           = job.index();
    m_jobs[index()]   = job;
    m_rounds[index()] = 0;

+    m_jobs[index()].setBackend(backend);
+
    memcpy(blob(), job.blob(), job.size());
    *nonce() = Nonce::next(index(), *nonce(), reserveCount, currentJob().isNicehash());
 }
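Review note on the `nextRound(rounds, roundSize)` change above: a worker now advances its nonce by `roundSize` per round and asks the shared allocator for a fresh block of `rounds * roundSize` nonces every `rounds` rounds. The sketch below illustrates only that arithmetic. `g_nextNonce`, `reserve`, and `NonceCursor` are hypothetical; the real `Nonce::next` also handles NiceHash and per-job indexing, which this sketch leaves out.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical shared allocator: hands out consecutive nonce ranges.
static std::atomic<uint32_t> g_nextNonce{0};

inline uint32_t reserve(uint32_t count)
{
    return g_nextNonce.fetch_add(count);         // returns the first nonce of the reserved range
}

// Hypothetical per-worker cursor over reserved ranges.
struct NonceCursor {
    uint32_t nonce = 0;
    uint32_t round = 0;

    void start(uint32_t reserveCount)
    {
        round = 0;
        nonce = reserve(reserveCount);
    }

    void nextRound(uint32_t rounds, uint32_t roundSize)
    {
        ++round;

        if ((round % rounds) == 0) {
            nonce = reserve(rounds * roundSize); // current range exhausted, take a new one
        }
        else {
            nonce += roundSize;                  // stay inside the current range
        }
    }
};
```

With `roundSize` equal to 1 this reduces to the previous behaviour of incrementing the nonce by one per round.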
--- a/src/backend/common/Workers.cpp
+++ b/src/backend/common/Workers.cpp
@ -29,6 +29,17 @@
 #include "backend/common/Workers.h"
 #include "backend/cpu/CpuWorker.h"
 #include "base/io/log/Log.h"
+#include "base/tools/Object.h"
+
+
+#ifdef XMRIG_FEATURE_OPENCL
+#   include "backend/opencl/OclWorker.h"
+#endif
+
+
+#ifdef XMRIG_FEATURE_CUDA
+#   include "backend/cuda/CudaWorker.h"
+#endif


 namespace xmrig {
@ -37,9 +48,10 @@ namespace xmrig {
 class WorkersPrivate
 {
 public:
-    inline WorkersPrivate()
-    {
-    }
+    XMRIG_DISABLE_COPY_MOVE(WorkersPrivate)
+
+
+    WorkersPrivate() = default;


    inline ~WorkersPrivate()
@ -93,6 +105,7 @@ void xmrig::Workers<T>::start(const std::vector<T> &data)
    }

    d_ptr->hashrate = new Hashrate(m_workers.size());
+    Nonce::touch(T::backend());

    for (Thread<T> *worker : m_workers) {
        worker->start(Workers<T>::onReady);
@ -126,18 +139,16 @@ void xmrig::Workers<T>::tick(uint64_t)

    for (Thread<T> *handle : m_workers) {
        if (!handle->worker()) {
-            return;
+            continue;
        }

-        d_ptr->hashrate->add(handle->index(), handle->worker()->hashCount(), handle->worker()->timestamp());
+        d_ptr->hashrate->add(handle->id(), handle->worker()->hashCount(), handle->worker()->timestamp());
    }
-
-    d_ptr->hashrate->updateHighest();
 }


 template<class T>
-xmrig::IWorker *xmrig::Workers<T>::create(Thread<CpuLaunchData> *)
+xmrig::IWorker *xmrig::Workers<T>::create(Thread<T> *)
 {
    return nullptr;
 }
@ -146,17 +157,24 @@ xmrig::IWorker *xmrig::Workers<T>::create(Thread<CpuLaunchData> *)
 template<class T>
 void xmrig::Workers<T>::onReady(void *arg)
 {
-    Thread<T> *handle = static_cast<Thread<T>* >(arg);
+    auto handle = static_cast<Thread<T>* >(arg);

    IWorker *worker = create(handle);
+    assert(worker != nullptr);
+
    if (!worker || !worker->selfTest()) {
-        LOG_ERR("thread %zu error: \"hash self-test failed\".", worker->id());
+        LOG_ERR("%s " RED("thread ") RED_BOLD("#%zu") RED(" self-test failed"), T::tag(), handle->id());
+
+        handle->backend()->start(worker, false);
+        delete worker;

        return;
    }

+    assert(handle->backend() != nullptr);
+
    handle->setWorker(worker);
-    handle->backend()->start(worker);
+    handle->backend()->start(worker, true);
 }


@ -168,19 +186,19 @@ xmrig::IWorker *xmrig::Workers<CpuLaunchData>::create(Thread<CpuLaunchData> *han
 {
    switch (handle->config().intensity) {
    case 1:
-        return new CpuWorker<1>(handle->index(), handle->config());
+        return new CpuWorker<1>(handle->id(), handle->config());

    case 2:
-        return new CpuWorker<2>(handle->index(), handle->config());
+        return new CpuWorker<2>(handle->id(), handle->config());

    case 3:
-        return new CpuWorker<3>(handle->index(), handle->config());
+        return new CpuWorker<3>(handle->id(), handle->config());

    case 4:
-        return new CpuWorker<4>(handle->index(), handle->config());
+        return new CpuWorker<4>(handle->id(), handle->config());

    case 5:
-        return new CpuWorker<5>(handle->index(), handle->config());
+        return new CpuWorker<5>(handle->id(), handle->config());
    }

    return nullptr;
@ -190,4 +208,28 @@ xmrig::IWorker *xmrig::Workers<CpuLaunchData>::create(Thread<CpuLaunchData> *han
 template class Workers<CpuLaunchData>;


+#ifdef XMRIG_FEATURE_OPENCL
+template<>
+xmrig::IWorker *xmrig::Workers<OclLaunchData>::create(Thread<OclLaunchData> *handle)
+{
+    return new OclWorker(handle->id(), handle->config());
+}
+
+
+template class Workers<OclLaunchData>;
+#endif
+
+
+#ifdef XMRIG_FEATURE_CUDA
+template<>
+xmrig::IWorker *xmrig::Workers<CudaLaunchData>::create(Thread<CudaLaunchData> *handle)
+{
+    return new CudaWorker(handle->id(), handle->config());
+}
+
+
+template class Workers<CudaLaunchData>;
+#endif
+
+
 } // namespace xmrig
--- a/src/backend/common/Workers.h
+++ b/src/backend/common/Workers.h
@ -29,6 +29,17 @@

 #include "backend/common/Thread.h"
 #include "backend/cpu/CpuLaunchData.h"
+#include "base/tools/Object.h"
+
+
+#ifdef XMRIG_FEATURE_OPENCL
+#   include "backend/opencl/OclLaunchData.h"
+#endif
+
+
+#ifdef XMRIG_FEATURE_CUDA
+#   include "backend/cuda/CudaLaunchData.h"
+#endif


 namespace xmrig {
@ -42,6 +53,8 @@ template<class T>
 class Workers
 {
 public:
+    XMRIG_DISABLE_COPY_MOVE(Workers)
+
    Workers();
    ~Workers();

@ -52,7 +65,7 @@ public:
    void tick(uint64_t ticks);

 private:
-    static IWorker *create(Thread<CpuLaunchData> *handle);
+    static IWorker *create(Thread<T> *handle);
    static void onReady(void *arg);

    std::vector<Thread<T> *> m_workers;
@ -62,11 +75,23 @@ private:

 template<>
 IWorker *Workers<CpuLaunchData>::create(Thread<CpuLaunchData> *handle);
-
-
 extern template class Workers<CpuLaunchData>;


+#ifdef XMRIG_FEATURE_OPENCL
+template<>
+IWorker *Workers<OclLaunchData>::create(Thread<OclLaunchData> *handle);
+extern template class Workers<OclLaunchData>;
+#endif
+
+
+#ifdef XMRIG_FEATURE_CUDA
+template<>
+IWorker *Workers<CudaLaunchData>::create(Thread<CudaLaunchData> *handle);
+extern template class Workers<CudaLaunchData>;
+#endif
+
+
 } // namespace xmrig


--- a/src/backend/common/common.cmake
+++ b/src/backend/common/common.cmake
@ -1,13 +1,17 @@
 set(HEADERS_BACKEND_COMMON
+    src/backend/common/Hashrate.h
+    src/backend/common/Tags.h
    src/backend/common/interfaces/IBackend.h
+    src/backend/common/interfaces/IRxListener.h
+    src/backend/common/interfaces/IRxStorage.h
    src/backend/common/interfaces/IThread.h
    src/backend/common/interfaces/IWorker.h
-    src/backend/common/Hashrate.h
+    src/backend/common/misc/PciTopology.h
    src/backend/common/Thread.h
    src/backend/common/Threads.h
    src/backend/common/Worker.h
-    src/backend/common/Workers.h
    src/backend/common/WorkerJob.h
+    src/backend/common/Workers.h
   )

 set(SOURCES_BACKEND_COMMON
--- a/src/backend/common/interfaces/IBackend.h
+++ b/src/backend/common/interfaces/IBackend.h
@ -26,7 +26,7 @@
 #define XMRIG_IBACKEND_H


-#include <stdint.h>
+#include <cstdint>


 #include "rapidjson/fwd.h"
@ -37,6 +37,7 @@ namespace xmrig {

 class Algorithm;
 class Hashrate;
+class IApiRequest;
 class IWorker;
 class Job;
 class String;
@ -52,15 +53,17 @@ public:
    virtual const Hashrate *hashrate() const                            = 0;
    virtual const String &profileName() const                           = 0;
    virtual const String &type() const                                  = 0;
+    virtual void execCommand(char command)                              = 0;
    virtual void prepare(const Job &nextJob)                            = 0;
    virtual void printHashrate(bool details)                            = 0;
    virtual void setJob(const Job &job)                                 = 0;
-    virtual void start(IWorker *worker)                                 = 0;
+    virtual void start(IWorker *worker, bool ready)                     = 0;
    virtual void stop()                                                 = 0;
    virtual void tick(uint64_t ticks)                                   = 0;

 #   ifdef XMRIG_FEATURE_API
    virtual rapidjson::Value toJSON(rapidjson::Document &doc) const     = 0;
+    virtual void handleRequest(IApiRequest &request)                    = 0;
 #   endif
 };

--- a/src/backend/common/interfaces/IMemoryPool.h
+++ b/src/backend/common/interfaces/IMemoryPool.h
@ -0,0 +1,53 @@
+/* XMRig
+ * Copyright 2010      Jeff Garzik <jgarzik@pobox.com>
+ * Copyright 2012-2014 pooler      <pooler@litecoinpool.org>
+ * Copyright 2014      Lucas Jones <https://github.com/lucasjones>
+ * Copyright 2014-2016 Wolf9466    <https://github.com/OhGodAPet>
+ * Copyright 2016      Jay D Dee   <jayddee246@gmail.com>
+ * Copyright 2017-2018 XMR-Stak    <https://github.com/fireice-uk>, <https://github.com/psychocrypt>
+ * Copyright 2018      Lee Clagett <https://github.com/vtnerd>
+ * Copyright 2018-2019 SChernykh   <https://github.com/SChernykh>
+ * Copyright 2018-2019 tevador     <tevador@gmail.com>
+ * Copyright 2016-2019 XMRig       <https://github.com/xmrig>, <support@xmrig.com>
+ *
+ *   This program is free software: you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation, either version 3 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef XMRIG_IMEMORYPOOL_H
+#define XMRIG_IMEMORYPOOL_H
+
+
+#include <cstddef>
+#include <cstdint>
+
+
+namespace xmrig {
+
+
+class IMemoryPool
+{
+public:
+    virtual ~IMemoryPool() = default;
+
+    virtual bool isHugePages(uint32_t node) const       = 0;
+    virtual uint8_t *get(size_t size, uint32_t node)    = 0;
+    virtual void release(uint32_t node)                 = 0;
+};
+
+
+} /* namespace xmrig */
+
+
+
+#endif /* XMRIG_IMEMORYPOOL_H */
--- a/src/backend/common/interfaces/IRxListener.h
+++ b/src/backend/common/interfaces/IRxListener.h
@ -0,0 +1,44 @@
+/* XMRig
+ * Copyright 2010      Jeff Garzik <jgarzik@pobox.com>
+ * Copyright 2012-2014 pooler      <pooler@litecoinpool.org>
+ * Copyright 2014      Lucas Jones <https://github.com/lucasjones>
+ * Copyright 2014-2016 Wolf9466    <https://github.com/OhGodAPet>
+ * Copyright 2016      Jay D Dee   <jayddee246@gmail.com>
+ * Copyright 2016-2018 XMRig       <support@xmrig.com>
+ *
+ *   This program is free software: you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation, either version 3 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef XMRIG_IRXLISTENER_H
+#define XMRIG_IRXLISTENER_H
+
+
+namespace xmrig {
+
+
+class IRxListener
+{
+public:
+    virtual ~IRxListener() = default;
+
+#   ifdef XMRIG_ALGO_RANDOMX
+    virtual void onDatasetReady() = 0;
+#   endif
+};
+
+
+} /* namespace xmrig */
+
+
+#endif // XMRIG_IRXLISTENER_H
--- a/src/backend/common/interfaces/IRxStorage.h
+++ b/src/backend/common/interfaces/IRxStorage.h
@ -0,0 +1,53 @@
+/* XMRig
+ * Copyright 2010      Jeff Garzik <jgarzik@pobox.com>
+ * Copyright 2012-2014 pooler      <pooler@litecoinpool.org>
+ * Copyright 2014      Lucas Jones <https://github.com/lucasjones>
+ * Copyright 2014-2016 Wolf9466    <https://github.com/OhGodAPet>
+ * Copyright 2016      Jay D Dee   <jayddee246@gmail.com>
+ * Copyright 2016-2018 XMRig       <support@xmrig.com>
+ *
+ *   This program is free software: you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation, either version 3 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef XMRIG_IRXSTORAGE_H
+#define XMRIG_IRXSTORAGE_H
+
+
+#include <cstdint>
+#include <utility>
+
+
+namespace xmrig {
+
+
+class Job;
+class RxDataset;
+class RxSeed;
+
+
+class IRxStorage
+{
+public:
+    virtual ~IRxStorage() = default;
+
+    virtual RxDataset *dataset(const Job &job, uint32_t nodeId) const       = 0;
+    virtual std::pair<uint32_t, uint32_t> hugePages() const                 = 0;
+    virtual void init(const RxSeed &seed, uint32_t threads, bool hugePages) = 0;
+};
+
+
+} /* namespace xmrig */
+
+
+#endif // XMRIG_IRXSTORAGE_H
--- a/src/backend/common/interfaces/IWorker.h
+++ b/src/backend/common/interfaces/IWorker.h
@ -26,8 +26,8 @@
 #define XMRIG_IWORKER_H


-#include <stdint.h>
-#include <stddef.h>
+#include <cstdint>
+#include <cstddef>


 namespace xmrig {
@ -44,6 +44,7 @@ public:
    virtual bool selfTest()                         = 0;
    virtual const VirtualMemory *memory() const     = 0;
    virtual size_t id() const                       = 0;
+    virtual size_t intensity() const                = 0;
    virtual uint64_t hashCount() const              = 0;
    virtual uint64_t timestamp() const              = 0;
    virtual void start()                            = 0;
--- a/src/backend/common/misc/PciTopology.h
+++ b/src/backend/common/misc/PciTopology.h
@ -0,0 +1,73 @@
+/* XMRig
+ * Copyright 2010      Jeff Garzik <jgarzik@pobox.com>
+ * Copyright 2012-2014 pooler      <pooler@litecoinpool.org>
+ * Copyright 2014      Lucas Jones <https://github.com/lucasjones>
+ * Copyright 2014-2016 Wolf9466    <https://github.com/OhGodAPet>
+ * Copyright 2016      Jay D Dee   <jayddee246@gmail.com>
+ * Copyright 2017-2018 XMR-Stak    <https://github.com/fireice-uk>, <https://github.com/psychocrypt>
+ * Copyright 2018      Lee Clagett <https://github.com/vtnerd>
+ * Copyright 2018-2019 SChernykh   <https://github.com/SChernykh>
+ * Copyright 2016-2019 XMRig       <https://github.com/xmrig>, <support@xmrig.com>
+ *
+ *   This program is free software: you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation, either version 3 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef XMRIG_PCITOPOLOGY_H
+#define XMRIG_PCITOPOLOGY_H
+
+
+#include <cstdio>
+
+
+#include "base/tools/String.h"
+
+
+namespace xmrig {
+
+
+class PciTopology
+{
+public:
+    PciTopology() = default;
+    PciTopology(uint32_t bus, uint32_t device, uint32_t function) : m_valid(true), m_bus(bus), m_device(device), m_function(function) {}
+
+    inline bool isValid() const        { return m_valid; }
+    inline uint8_t bus() const         { return m_bus; }
+    inline uint8_t device() const      { return m_device; }
+    inline uint8_t function() const    { return m_function; }
+
+    String toString() const
+    {
+        if (!isValid()) {
+            return "n/a";
+        }
+
+        char *buf = new char[8]();
+        snprintf(buf, 8, "%02hhx:%02hhx.%01hhx", bus(), device(), function());
+
+        return buf;
+    }
+
+private:
+    bool m_valid         = false;
+    uint8_t m_bus        = 0;
+    uint8_t m_device     = 0;
+    uint8_t m_function   = 0;
+};
+
+
+} // namespace xmrig
+
+
+#endif /* XMRIG_PCITOPOLOGY_H */
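Review note on `PciTopology::toString` above: the format string yields the usual `bus:device.function` form in lowercase hex, which needs 7 characters plus the terminator as long as the function number stays in the PCI range 0 to 7. A minimal sketch of the same formatting, assuming `std::string` in place of the project's `String` and a hypothetical free function `pciToString`:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a PCI address as "bb:dd.f" in lowercase hex.
inline std::string pciToString(uint8_t bus, uint8_t device, uint8_t function)
{
    char buf[8] = {};                            // 2 + 1 + 2 + 1 + 1 characters plus the terminator
    std::snprintf(buf, sizeof(buf), "%02hhx:%02hhx.%01hhx", bus, device, function);

    return buf;
}
```

A stack buffer is enough here because `std::string` copies the characters; the header in the diff allocates with `new char[8]()` and hands the pointer to `String` instead.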
--- a/src/backend/cpu/Cpu.cpp
+++ b/src/backend/cpu/Cpu.cpp
@ -23,7 +23,7 @@
 */


-#include <assert.h>
+#include <cassert>


 #include "backend/cpu/Cpu.h"
@ -44,7 +44,15 @@ static xmrig::ICpuInfo *cpuInfo = nullptr;

 xmrig::ICpuInfo *xmrig::Cpu::info()
 {
-    assert(cpuInfo != nullptr);
+    if (cpuInfo == nullptr) {
+#       if defined(XMRIG_FEATURE_HWLOC)
+        cpuInfo = new HwlocCpuInfo();
+#       elif defined(XMRIG_FEATURE_LIBCPUID)
+        cpuInfo = new AdvancedCpuInfo();
+#       else
+        cpuInfo = new BasicCpuInfo();
+#       endif
+    }

    return cpuInfo;
 }
@ -62,7 +70,7 @@ rapidjson::Value xmrig::Cpu::toJSON(rapidjson::Document &doc)
    cpu.AddMember("brand",      StringRef(i->brand()), allocator);
    cpu.AddMember("aes",        i->hasAES(), allocator);
    cpu.AddMember("avx2",       i->hasAVX2(), allocator);
-    cpu.AddMember("x64",        i->isX64(), allocator);
+    cpu.AddMember("x64",        ICpuInfo::isX64(), allocator);
    cpu.AddMember("l2",         static_cast<uint64_t>(i->L2()), allocator);
    cpu.AddMember("l3",         static_cast<uint64_t>(i->L3()), allocator);
    cpu.AddMember("cores",      static_cast<uint64_t>(i->cores()), allocator);
@ -81,20 +89,6 @@ rapidjson::Value xmrig::Cpu::toJSON(rapidjson::Document &doc)
 }


-void xmrig::Cpu::init()
-{
-    assert(cpuInfo == nullptr);
-
-#   if defined(XMRIG_FEATURE_HWLOC)
-    cpuInfo = new HwlocCpuInfo();
-#   elif defined(XMRIG_FEATURE_LIBCPUID)
-    cpuInfo = new AdvancedCpuInfo();
-#   else
-    cpuInfo = new BasicCpuInfo();
-#   endif
-}
-
-
 void xmrig::Cpu::release()
 {
    assert(cpuInfo != nullptr);
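
The `Cpu.cpp` change above drops the explicit `Cpu::init()` step and creates the info object on the first call to `Cpu::info()`. A minimal sketch of that create-on-first-use pattern follows; `Info`, `BasicInfo`, `g_info`, `info()` and `release()` are hypothetical stand-ins, not the xmrig types. Like the patched code it relies on a plain pointer check, so it assumes the first call is not raced by several threads.

```cpp
#include <cassert>

// Hypothetical stand-ins for ICpuInfo and one of its implementations.
struct Info
{
    virtual ~Info() = default;
    virtual int threads() const = 0;
};

struct BasicInfo : Info
{
    int threads() const override { return 1; }
};

static Info *g_info = nullptr;

// Creates the object on first use instead of requiring a separate init() call.
inline Info *info()
{
    if (g_info == nullptr) {
        g_info = new BasicInfo();
    }

    return g_info;
}

// Destroys the object; a later info() call would create a fresh one.
inline void release()
{
    delete g_info;
    g_info = nullptr;
}
```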
--- a/src/backend/cpu/Cpu.h
+++ b/src/backend/cpu/Cpu.h
@ -37,7 +37,6 @@ class Cpu
 public:
    static ICpuInfo *info();
    static rapidjson::Value toJSON(rapidjson::Document &doc);
-    static void init();
    static void release();

    inline static Assembly::Id assembly(Assembly::Id hint) { return hint == Assembly::AUTO ? Cpu::info()->assembly() : hint; }
--- a/src/backend/cpu/CpuBackend.cpp
+++ b/src/backend/cpu/CpuBackend.cpp
@ -28,6 +28,7 @@

 #include "backend/common/Hashrate.h"
 #include "backend/common/interfaces/IWorker.h"
+#include "backend/common/Tags.h"
 #include "backend/common/Workers.h"
 #include "backend/cpu/Cpu.h"
 #include "backend/cpu/CpuBackend.h"
@ -43,6 +44,11 @@
 #include "rapidjson/document.h"


+#ifdef XMRIG_FEATURE_API
+#   include "base/api/interfaces/IApiRequest.h"
+#endif
+
+
 #ifdef XMRIG_ALGO_ARGON2
 #   include "crypto/argon2/Impl.h"
 #endif
@ -54,31 +60,78 @@ namespace xmrig {
 extern template class Threads<CpuThreads>;


-static const char *tag      = CYAN_BG_BOLD(" cpu ");
+static const char *tag      = CYAN_BG_BOLD(WHITE_BOLD_S " cpu ");
 static const String kType   = "cpu";
+static std::mutex mutex;


-struct LaunchStatus
+struct CpuLaunchStatus
 {
 public:
-    inline void reset()
+    inline size_t hugePages() const     { return m_hugePages; }
+    inline size_t memory() const        { return m_ways * m_memory; }
+    inline size_t pages() const         { return m_pages; }
+    inline size_t threads() const       { return m_threads; }
+    inline size_t ways() const          { return m_ways; }
+
+    inline void start(const std::vector<CpuLaunchData> &threads, size_t memory)
    {
-        hugePages = 0;
-        memory    = 0;
-        pages     = 0;
-        started   = 0;
-        threads   = 0;
-        ways      = 0;
-        ts        = Chrono::steadyMSecs();
+        m_hugePages = 0;
+        m_memory    = memory;
+        m_pages     = 0;
+        m_started   = 0;
+        m_errors    = 0;
+        m_threads   = threads.size();
+        m_ways      = 0;
+        m_ts        = Chrono::steadyMSecs();
    }

-    size_t hugePages    = 0;
-    size_t memory       = 0;
-    size_t pages        = 0;
-    size_t started      = 0;
-    size_t threads      = 0;
-    size_t ways         = 0;
-    uint64_t ts         = 0;
+    inline bool started(IWorker *worker, bool ready)
+    {
+        if (ready) {
+            auto hugePages = worker->memory()->hugePages();
+
+            m_started++;
+            m_hugePages += hugePages.first;
+            m_pages     += hugePages.second;
+            m_ways      += worker->intensity();
+        }
+        else {
+            m_errors++;
+        }
+
+        return (m_started + m_errors) == m_threads; // true once every thread has reported success or failure
+    }
+
+    inline void print() const
+    {
+        if (m_started == 0) {
+            LOG_ERR("%s " RED_BOLD("disabled") YELLOW(" (failed to start threads)"), tag);
+
+            return;
+        }
+
+        LOG_INFO("%s" GREEN_BOLD(" READY") " threads %s%zu/%zu (%zu)" CLEAR " huge pages %s%1.0f%% %zu/%zu" CLEAR " memory " CYAN_BOLD("%zu KB") BLACK_BOLD(" (%" PRIu64 " ms)"),
+                 tag,
+                 m_errors == 0 ? CYAN_BOLD_S : YELLOW_BOLD_S,
+                 m_started, m_threads, m_ways,
+                 (m_hugePages == m_pages ? GREEN_BOLD_S : (m_hugePages == 0 ? RED_BOLD_S : YELLOW_BOLD_S)),
+                 m_hugePages == 0 ? 0.0 : static_cast<double>(m_hugePages) / m_pages * 100.0,
+                 m_hugePages, m_pages,
+                 memory() / 1024,
+                 Chrono::steadyMSecs() - m_ts
+                 );
+    }
+
+private:
+    size_t m_errors       = 0;
+    size_t m_hugePages    = 0;
+    size_t m_memory       = 0;
+    size_t m_pages        = 0;
+    size_t m_started      = 0;
+    size_t m_threads      = 0;
+    size_t m_ways         = 0;
+    uint64_t m_ts         = 0;
 };


@ -93,23 +146,15 @@ public:

    inline void start()
    {
-        LOG_INFO("%s use profile " BLUE_BG(WHITE_BOLD_S " %s ") WHITE_BOLD_S " (" CYAN_BOLD("%zu") WHITE_BOLD(" threads)") " scratchpad " CYAN_BOLD("%zu KB"),
+        LOG_INFO("%s use profile " BLUE_BG(WHITE_BOLD_S " %s ") WHITE_BOLD_S " (" CYAN_BOLD("%zu") WHITE_BOLD(" thread%s)") " scratchpad " CYAN_BOLD("%zu KB"),
                 tag,
                 profileName.data(),
                 threads.size(),
+                 threads.size() != 1 ? "s" : "",
                 algo.l3() / 1024
                 );

-        workers.stop();
-
-        status.reset();
-        status.memory   = algo.l3();
-        status.threads  = threads.size();
-
-        for (const CpuLaunchData &data : threads) {
-            status.ways += static_cast<size_t>(data.intensity);
-        }
-
+        status.start(threads, algo.l3());
        workers.start(threads);
    }

@ -118,14 +163,45 @@ public:
    {
        std::lock_guard<std::mutex> lock(mutex);

-        return status.ways;
+        return status.ways();
+    }
+
+
+    rapidjson::Value hugePages(int version, rapidjson::Document &doc)
+    {
+        std::pair<unsigned, unsigned> pages(0, 0);
+
+#       ifdef XMRIG_ALGO_RANDOMX
+        if (algo.family() == Algorithm::RANDOM_X) {
+            pages = Rx::hugePages();
+        }
+#       endif
+
+        {
+            std::lock_guard<std::mutex> lock(mutex);
+
+            pages.first  += status.hugePages();
+            pages.second += status.pages();
+        }
+
+        rapidjson::Value hugepages;
+
+        if (version > 1) {
+            hugepages.SetArray();
+            hugepages.PushBack(pages.first, doc.GetAllocator());
+            hugepages.PushBack(pages.second, doc.GetAllocator());
+        }
+        else {
+            hugepages = pages.first == pages.second; // version 1 and below: a single bool, true when the huge page count equals the total
+        }
+
+        return hugepages;
    }


    Algorithm algo;
    Controller *controller;
-    LaunchStatus status;
-    std::mutex mutex;
+    CpuLaunchStatus status;
    std::vector<CpuLaunchData> threads;
    String profileName;
    Workers<CpuLaunchData> workers;
@ -135,6 +211,30 @@ public:
 } // namespace xmrig


+const char *xmrig::backend_tag(uint32_t backend)
+{
+#   ifdef XMRIG_FEATURE_OPENCL
+    if (backend == Nonce::OPENCL) {
+        return ocl_tag();
+    }
+#   endif
+
+#   ifdef XMRIG_FEATURE_CUDA
+    if (backend == Nonce::CUDA) {
+        return cuda_tag();
+    }
+#   endif
+
+    return tag;
+}
+
+
+const char *xmrig::cpu_tag()
+{
+    return tag;
+}
+
+
 xmrig::CpuBackend::CpuBackend(Controller *controller) :
    d_ptr(new CpuBackendPrivate(controller))
 {
@ -148,25 +248,6 @@ xmrig::CpuBackend::~CpuBackend()
 }


-std::pair<unsigned, unsigned> xmrig::CpuBackend::hugePages() const
-{
-    std::pair<unsigned, unsigned> pages(0, 0);
-
-#   ifdef XMRIG_ALGO_RANDOMX
-    if (d_ptr->algo.family() == Algorithm::RANDOM_X) {
-        pages = Rx::hugePages();
-    }
-#   endif
-
-    std::lock_guard<std::mutex> lock(d_ptr->mutex);
-
-    pages.first  += d_ptr->status.hugePages;
-    pages.second += d_ptr->status.pages;
-
-    return pages;
-}
-
-
 bool xmrig::CpuBackend::isEnabled() const
 {
    return d_ptr->controller->config()->cpu().isEnabled();
@ -219,11 +300,11 @@ void xmrig::CpuBackend::printHashrate(bool details)

    char num[8 * 3] = { 0 };

-    Log::print(WHITE_BOLD_S "|    CPU THREAD | AFFINITY | 10s H/s | 60s H/s | 15m H/s |");
+    Log::print(WHITE_BOLD_S "|    CPU # | AFFINITY | 10s H/s | 60s H/s | 15m H/s |");

    size_t i = 0;
    for (const CpuLaunchData &data : d_ptr->threads) {
-         Log::print("| %13zu | %8" PRId64 " | %7s | %7s | %7s |",
+         Log::print("| %8zu | %8" PRId64 " | %7s | %7s | %7s |",
                    i,
                    data.affinity,
                    Hashrate::format(hashrate()->calc(i, Hashrate::ShortInterval),  num,         sizeof num / 3),
@ -233,6 +314,14 @@ void xmrig::CpuBackend::printHashrate(bool details)

         i++;
    }
+
+#   ifdef XMRIG_FEATURE_OPENCL
+    Log::print(WHITE_BOLD_S "|        - |        - | %7s | %7s | %7s |",
+               Hashrate::format(hashrate()->calc(Hashrate::ShortInterval),  num,         sizeof num / 3),
+               Hashrate::format(hashrate()->calc(Hashrate::MediumInterval), num + 8,     sizeof num / 3),
+               Hashrate::format(hashrate()->calc(Hashrate::LargeInterval),  num + 8 * 2, sizeof num / 3)
+               );
+#   endif
 }


@ -245,7 +334,7 @@ void xmrig::CpuBackend::setJob(const Job &job)
    const CpuConfig &cpu = d_ptr->controller->config()->cpu();

    std::vector<CpuLaunchData> threads = cpu.get(d_ptr->controller->miner(), job.algorithm());
-    if (d_ptr->threads.size() == threads.size() && std::equal(d_ptr->threads.begin(), d_ptr->threads.end(), threads.begin())) {
+    if (!d_ptr->threads.empty() && d_ptr->threads.size() == threads.size() && std::equal(d_ptr->threads.begin(), d_ptr->threads.end(), threads.begin())) {
        return;
    }

@ -253,49 +342,40 @@ void xmrig::CpuBackend::setJob(const Job &job)
    d_ptr->profileName  = cpu.threads().profileName(job.algorithm());

    if (d_ptr->profileName.isNull() || threads.empty()) {
-        d_ptr->workers.stop();
+        LOG_WARN("%s " RED_BOLD("disabled") YELLOW(" (no suitable configuration found)"), tag);

-        LOG_WARN(YELLOW_BOLD_S "CPU disabled, no suitable configuration for algo %s", job.algorithm().shortName());
-
-        return;
+        return stop();
    }

+    stop();
+
    d_ptr->threads = std::move(threads);
    d_ptr->start();
 }


-void xmrig::CpuBackend::start(IWorker *worker)
+void xmrig::CpuBackend::start(IWorker *worker, bool ready)
 {
-    d_ptr->mutex.lock();
+    mutex.lock();

-    const auto pages = worker->memory()->hugePages();
-
-    d_ptr->status.started++;
-    d_ptr->status.hugePages += pages.first;
-    d_ptr->status.pages     += pages.second;
-
-    if (d_ptr->status.started == d_ptr->status.threads) {
-        const double percent = d_ptr->status.hugePages == 0 ? 0.0 : static_cast<double>(d_ptr->status.hugePages) / d_ptr->status.pages * 100.0;
-        const size_t memory  = d_ptr->status.ways * d_ptr->status.memory / 1024;
-
-        LOG_INFO("%s" GREEN_BOLD(" READY") " threads " CYAN_BOLD("%zu(%zu)") " huge pages %s%zu/%zu %1.0f%%\x1B[0m memory " CYAN_BOLD("%zu KB") BLACK_BOLD(" (%" PRIu64 " ms)"),
-                 tag,
-                 d_ptr->status.threads, d_ptr->status.ways,
-                 (d_ptr->status.hugePages == d_ptr->status.pages ? GREEN_BOLD_S : (d_ptr->status.hugePages == 0 ? RED_BOLD_S : YELLOW_BOLD_S)),
-                 d_ptr->status.hugePages, d_ptr->status.pages, percent, memory,
-                 Chrono::steadyMSecs() - d_ptr->status.ts
-                 );
+    if (d_ptr->status.started(worker, ready)) {
+        d_ptr->status.print();
    }

-    d_ptr->mutex.unlock();
+    mutex.unlock();

-    worker->start();
+    if (ready) {
+        worker->start();
+    }
 }


 void xmrig::CpuBackend::stop()
 {
+    if (d_ptr->threads.empty()) {
+        return;
+    }
+
    const uint64_t ts = Chrono::steadyMSecs();

    d_ptr->workers.stop();
@ -337,21 +417,16 @@ rapidjson::Value xmrig::CpuBackend::toJSON(rapidjson::Document &doc) const
    out.AddMember("argon2-impl", argon2::Impl::name().toJSON(), allocator);
 #   endif

-    const auto pages = hugePages();
-
-    rapidjson::Value hugepages(rapidjson::kArrayType);
-    hugepages.PushBack(pages.first, allocator);
-    hugepages.PushBack(pages.second, allocator);
-
-    out.AddMember("hugepages", hugepages, allocator);
+    out.AddMember("hugepages", d_ptr->hugePages(2, doc), allocator);
    out.AddMember("memory",    static_cast<uint64_t>(d_ptr->algo.isValid() ? (d_ptr->ways() * d_ptr->algo.l3()) : 0), allocator);

    if (d_ptr->threads.empty() || !hashrate()) {
        return out;
    }

+    out.AddMember("hashrate", hashrate()->toJSON(doc), allocator);
+
    Value threads(kArrayType);
-    const Hashrate *hr = hashrate();

    size_t i = 0;
    for (const CpuLaunchData &data : d_ptr->threads) {
@ -359,15 +434,9 @@ rapidjson::Value xmrig::CpuBackend::toJSON(rapidjson::Document &doc) const
        thread.AddMember("intensity",   data.intensity, allocator);
        thread.AddMember("affinity",    data.affinity, allocator);
        thread.AddMember("av",          data.av(), allocator);
-
-        Value hashrate(kArrayType);
-        hashrate.PushBack(Hashrate::normalize(hr->calc(i, Hashrate::ShortInterval)),  allocator);
-        hashrate.PushBack(Hashrate::normalize(hr->calc(i, Hashrate::MediumInterval)), allocator);
-        hashrate.PushBack(Hashrate::normalize(hr->calc(i, Hashrate::LargeInterval)),  allocator);
+        thread.AddMember("hashrate",    hashrate()->toJSON(i, doc), allocator);

        i++;
-
-        thread.AddMember("hashrate", hashrate, allocator);
        threads.PushBack(thread, allocator);
    }

@ -375,4 +444,12 @@ rapidjson::Value xmrig::CpuBackend::toJSON(rapidjson::Document &doc) const

    return out;
 }
+
+
+void xmrig::CpuBackend::handleRequest(IApiRequest &request)
+{
+    if (request.type() == IApiRequest::REQ_SUMMARY) {
+        request.reply().AddMember("hugepages", d_ptr->hugePages(request.version(), request.doc()), request.doc().GetAllocator());
+    }
+}
 #endif
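
`CpuLaunchStatus::started()` in the diff above counts both successful and failed worker launches, and reports completion once their sum reaches the expected thread count. The sketch below isolates that bookkeeping. `LaunchCounter` is a hypothetical, simplified stand-in that tracks only the counters, leaving out huge pages, ways and timing.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-in for CpuLaunchStatus (counters only).
class LaunchCounter
{
public:
    // Reset the counters and remember how many workers are expected to report.
    void start(std::size_t threads)
    {
        m_threads = threads;
        m_started = 0;
        m_errors  = 0;
    }

    // Record one worker report; returns true when all expected workers have reported.
    bool report(bool ready)
    {
        if (ready) {
            ++m_started;
        }
        else {
            ++m_errors;
        }

        return (m_started + m_errors) == m_threads;
    }

    std::size_t started() const { return m_started; }
    std::size_t errors() const  { return m_errors; }

private:
    std::size_t m_errors  = 0;
    std::size_t m_started = 0;
    std::size_t m_threads = 0;
};
```

Counting failures as reports means the summary line can still be printed when some threads fail to start, which the earlier `started == threads` check could not guarantee.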
--- a/src/backend/cpu/CpuBackend.h
+++ b/src/backend/cpu/CpuBackend.h
@ -26,10 +26,11 @@
 #define XMRIG_CPUBACKEND_H


-#include <utility>
-
-
 #include "backend/common/interfaces/IBackend.h"
+#include "base/tools/Object.h"
+
+
+#include <utility>


 namespace xmrig {
@ -43,12 +44,14 @@ class Miner;
 class CpuBackend : public IBackend
 {
 public:
+    XMRIG_DISABLE_COPY_MOVE_DEFAULT(CpuBackend)
+
    CpuBackend(Controller *controller);
    ~CpuBackend() override;

-    std::pair<unsigned, unsigned> hugePages() const;
-
 protected:
+    inline void execCommand(char) override {}
+
    bool isEnabled() const override;
    bool isEnabled(const Algorithm &algorithm) const override;
    const Hashrate *hashrate() const override;
@ -57,12 +60,13 @@ protected:
    void prepare(const Job &nextJob) override;
    void printHashrate(bool details) override;
    void setJob(const Job &job) override;
-    void start(IWorker *worker) override;
+    void start(IWorker *worker, bool ready) override;
    void stop() override;
    void tick(uint64_t ticks) override;

 #   ifdef XMRIG_FEATURE_API
    rapidjson::Value toJSON(rapidjson::Document &doc) const override;
+    void handleRequest(IApiRequest &request) override;
 #   endif

 private:
--- a/src/backend/cpu/CpuConfig.cpp
+++ b/src/backend/cpu/CpuConfig.cpp
@ -23,47 +23,27 @@
 */


-#include "backend/cpu/Cpu.h"
 #include "backend/cpu/CpuConfig.h"
+#include "backend/cpu/CpuConfig_gen.h"
+#include "backend/cpu/Cpu.h"
 #include "base/io/json/Json.h"
 #include "rapidjson/document.h"


 namespace xmrig {

-static const char *kCn                  = "cn";
 static const char *kEnabled             = "enabled";
 static const char *kHugePages           = "huge-pages";
 static const char *kHwAes               = "hw-aes";
+static const char *kMaxThreadsHint      = "max-threads-hint";
+static const char *kMemoryPool          = "memory-pool";
 static const char *kPriority            = "priority";

 #ifdef XMRIG_FEATURE_ASM
 static const char *kAsm = "asm";
 #endif

-#ifdef XMRIG_ALGO_CN_GPU
-static const char *kCnGPU = "cn/gpu";
-#endif
-
-#ifdef XMRIG_ALGO_CN_LITE
-static const char *kCnLite = "cn-lite";
-#endif
-
-#ifdef XMRIG_ALGO_CN_HEAVY
-static const char *kCnHeavy = "cn-heavy";
-#endif
-
-#ifdef XMRIG_ALGO_CN_PICO
-static const char *kCnPico = "cn-pico";
-#endif
-
-#ifdef XMRIG_ALGO_RANDOMX
-static const char *kRx    = "rx";
-static const char *kRxWOW = "rx/wow";
-#endif
-
 #ifdef XMRIG_ALGO_ARGON2
-static const char *kArgon2     = "argon2";
 static const char *kArgon2Impl = "argon2-impl";
 #endif

@ -72,11 +52,6 @@ extern template class Threads<CpuThreads>;
 }


-xmrig::CpuConfig::CpuConfig()
-{
-}
-
-
 bool xmrig::CpuConfig::isHwAES() const
 {
    return (m_aes == AES_AUTO ? (Cpu::info()->hasAES() ? AES_HW : AES_SOFT) : m_aes) == AES_HW;
@ -94,6 +69,11 @@ rapidjson::Value xmrig::CpuConfig::toJSON(rapidjson::Document &doc) const
    obj.AddMember(StringRef(kHugePages),    m_hugePages, allocator);
    obj.AddMember(StringRef(kHwAes),        m_aes == AES_AUTO ? Value(kNullType) : Value(m_aes == AES_HW), allocator);
    obj.AddMember(StringRef(kPriority),     priority() != -1 ? Value(priority()) : Value(kNullType), allocator);
+    obj.AddMember(StringRef(kMemoryPool),   m_memoryPool < 1 ? Value(m_memoryPool < 0) : Value(m_memoryPool), allocator);
+
+    if (m_threads.isEmpty()) {
+        obj.AddMember(StringRef(kMaxThreadsHint), m_limit, allocator);
+    }

 #   ifdef XMRIG_FEATURE_ASM
    obj.AddMember(StringRef(kAsm), m_assembly.toJSON(), allocator);
@ -109,6 +89,12 @@ rapidjson::Value xmrig::CpuConfig::toJSON(rapidjson::Document &doc) const
 }


+size_t xmrig::CpuConfig::memPoolSize() const
+{
+    return m_memoryPool < 0 ? Cpu::info()->threads() : m_memoryPool;
+}
+
+
 std::vector<xmrig::CpuLaunchData> xmrig::CpuConfig::get(const Miner *miner, const Algorithm &algorithm) const
 {
    std::vector<CpuLaunchData> out;
@ -121,21 +107,23 @@ std::vector<xmrig::CpuLaunchData> xmrig::CpuConfig::get(const Miner *miner, cons
    out.reserve(threads.count());

    for (const CpuThread &thread : threads.data()) {
-        out.push_back(CpuLaunchData(miner, algorithm, *this, thread));
+        out.emplace_back(miner, algorithm, *this, thread);
    }

    return out;
 }


-void xmrig::CpuConfig::read(const rapidjson::Value &value, uint32_t version)
+void xmrig::CpuConfig::read(const rapidjson::Value &value)
 {
    if (value.IsObject()) {
-        m_enabled       = Json::getBool(value, kEnabled, m_enabled);
-        m_hugePages     = Json::getBool(value, kHugePages, m_hugePages);
+        m_enabled   = Json::getBool(value, kEnabled, m_enabled);
+        m_hugePages = Json::getBool(value, kHugePages, m_hugePages);
+        m_limit     = Json::getUint(value, kMaxThreadsHint, m_limit);

        setAesMode(Json::getValue(value, kHwAes));
        setPriority(Json::getInt(value,  kPriority, -1));
+        setMemoryPool(Json::getValue(value, kMemoryPool));

 #       ifdef XMRIG_FEATURE_ASM
        m_assembly = Json::getValue(value, kAsm);
@ -145,16 +133,14 @@ void xmrig::CpuConfig::read(const rapidjson::Value &value, uint32_t version)
        m_argon2Impl = Json::getString(value, kArgon2Impl);
 #       endif

-        if (!m_threads.read(value)) {
-            generate();
-        }
+        m_threads.read(value);

-        if (version == 0) {
-            generateArgon2();
-        }
+        generate();
    }
-    else if (value.IsBool() && value.IsFalse()) {
-        m_enabled = false;
+    else if (value.IsBool()) {
+        m_enabled = value.GetBool();
+
+        generate();
    }
    else {
        generate();
@ -164,52 +150,40 @@ void xmrig::CpuConfig::read(const rapidjson::Value &value, uint32_t version)

 void xmrig::CpuConfig::generate()
 {
-    m_shouldSave  = true;
-    ICpuInfo *cpu = Cpu::info();
+    if (!isEnabled() || m_threads.has("*")) {
+        return;
+    }

-    m_threads.disable(Algorithm::CN_0);
-    m_threads.move(kCn, cpu->threads(Algorithm::CN_0));
+    size_t count = 0;

-#   ifdef XMRIG_ALGO_CN_GPU
-    m_threads.move(kCnGPU, cpu->threads(Algorithm::CN_GPU));
-#   endif
+    count += xmrig::generate<Algorithm::CN>(m_threads, m_limit);
+    count += xmrig::generate<Algorithm::CN_LITE>(m_threads, m_limit);
+    count += xmrig::generate<Algorithm::CN_HEAVY>(m_threads, m_limit);
+    count += xmrig::generate<Algorithm::CN_PICO>(m_threads, m_limit);
+    count += xmrig::generate<Algorithm::RANDOM_X>(m_threads, m_limit);
+    count += xmrig::generate<Algorithm::ARGON2>(m_threads, m_limit);

-#   ifdef XMRIG_ALGO_CN_LITE
-    m_threads.disable(Algorithm::CN_LITE_0);
-    m_threads.move(kCnLite, cpu->threads(Algorithm::CN_LITE_1));
-#   endif
-
-#   ifdef XMRIG_ALGO_CN_HEAVY
-    m_threads.move(kCnHeavy, cpu->threads(Algorithm::CN_HEAVY_0));
-#   endif
-
-#   ifdef XMRIG_ALGO_CN_PICO
-    m_threads.move(kCnPico, cpu->threads(Algorithm::CN_PICO_0));
-#   endif
-
-#   ifdef XMRIG_ALGO_RANDOMX
-    m_threads.move(kRx, cpu->threads(Algorithm::RX_0));
-    m_threads.move(kRxWOW, cpu->threads(Algorithm::RX_WOW));
-#   endif
-
-    generateArgon2();
+    m_shouldSave = count > 0;
 }


-void xmrig::CpuConfig::generateArgon2()
+void xmrig::CpuConfig::setAesMode(const rapidjson::Value &value)
 {
-#   ifdef XMRIG_ALGO_ARGON2
-    m_threads.move(kArgon2, Cpu::info()->threads(Algorithm::AR2_CHUKWA));
-#   endif
-}
-
-
-void xmrig::CpuConfig::setAesMode(const rapidjson::Value &aesMode)
-{
-    if (aesMode.IsBool()) {
-        m_aes = aesMode.GetBool() ? AES_HW : AES_SOFT;
+    if (value.IsBool()) {
+        m_aes = value.GetBool() ? AES_HW : AES_SOFT;
    }
    else {
        m_aes = AES_AUTO;
    }
 }
+
+
+void xmrig::CpuConfig::setMemoryPool(const rapidjson::Value &value)
+{
+    if (value.IsBool()) {
+        m_memoryPool = value.GetBool() ? -1 : 0; // true: size taken from the CPU thread count in memPoolSize(); false: 0
+    }
+    else if (value.IsInt()) {
+        m_memoryPool = value.GetInt();
+    }
+}
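
The `memory-pool` option handled above accepts either a boolean or an integer: `true` is stored as `-1`, `false` as `0`, and `memPoolSize()` turns a negative value into the CPU thread count. A small sketch of that mapping is shown below, using hypothetical free functions with the thread count passed in explicitly rather than read from `Cpu::info()`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper mirroring the boolean branch of setMemoryPool().
inline int memoryPoolFromBool(bool enabled)
{
    return enabled ? -1 : 0;
}

// Hypothetical helper mirroring memPoolSize(); cpuThreads stands in for Cpu::info()->threads().
inline std::size_t memPoolSize(int memoryPool, std::size_t cpuThreads)
{
    return memoryPool < 0 ? cpuThreads : static_cast<std::size_t>(memoryPool);
}
```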
--- a/src/backend/cpu/CpuConfig.h
+++ b/src/backend/cpu/CpuConfig.h
@ -44,12 +44,13 @@ public:
        AES_SOFT
    };

-    CpuConfig();
+    CpuConfig() = default;

    bool isHwAES() const;
    rapidjson::Value toJSON(rapidjson::Document &doc) const;
+    size_t memPoolSize() const;
    std::vector<CpuLaunchData> get(const Miner *miner, const Algorithm &algorithm) const;
-    void read(const rapidjson::Value &value, uint32_t version);
+    void read(const rapidjson::Value &value);

    inline bool isEnabled() const                       { return m_enabled; }
    inline bool isHugePages() const                     { return m_hugePages; }
@ -61,8 +62,8 @@ public:

 private:
    void generate();
-    void generateArgon2();
-    void setAesMode(const rapidjson::Value &aesMode);
+    void setAesMode(const rapidjson::Value &value);
+    void setMemoryPool(const rapidjson::Value &value);

    inline void setPriority(int priority)   { m_priority = (priority >= -1 && priority <= 5) ? priority : -1; }

@ -71,9 +72,11 @@ private:
    bool m_enabled       = true;
    bool m_hugePages     = true;
    bool m_shouldSave    = false;
+    int m_memoryPool     = 0;
    int m_priority       = -1;
    String m_argon2Impl;
    Threads<CpuThreads> m_threads;
+    uint32_t m_limit     = 100;
 };


--- a/src/backend/cpu/CpuConfig_gen.h
+++ b/src/backend/cpu/CpuConfig_gen.h
@ -0,0 +1,149 @@
+/* XMRig
+ * Copyright 2010      Jeff Garzik <jgarzik@pobox.com>
+ * Copyright 2012-2014 pooler      <pooler@litecoinpool.org>
+ * Copyright 2014      Lucas Jones <https://github.com/lucasjones>
+ * Copyright 2014-2016 Wolf9466    <https://github.com/OhGodAPet>
+ * Copyright 2016      Jay D Dee   <jayddee246@gmail.com>
+ * Copyright 2017-2018 XMR-Stak    <https://github.com/fireice-uk>, <https://github.com/psychocrypt>
+ * Copyright 2018-2019 SChernykh   <https://github.com/SChernykh>
+ * Copyright 2016-2019 XMRig       <https://github.com/xmrig>, <support@xmrig.com>
+ *
+ *   This program is free software: you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation, either version 3 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef XMRIG_CPUCONFIG_GEN_H
+#define XMRIG_CPUCONFIG_GEN_H
+
+
+#include "backend/common/Threads.h"
+#include "backend/cpu/Cpu.h"
+#include "backend/cpu/CpuThreads.h"
+
+
+namespace xmrig {
+
+
+static inline size_t generate(const char *key, Threads<CpuThreads> &threads, const Algorithm &algorithm, uint32_t limit)
+{
+    if (threads.isExist(algorithm) || threads.has(key)) {
+        return 0;
+    }
+
+    return threads.move(key, Cpu::info()->threads(algorithm, limit));
+}
+
+
+template<Algorithm::Family FAMILY>
+static inline size_t generate(Threads<CpuThreads> &, uint32_t) { return 0; }
+
+
+template<>
+size_t inline generate<Algorithm::CN>(Threads<CpuThreads> &threads, uint32_t limit)
+{
+    size_t count = 0;
+
+    count += generate("cn", threads, Algorithm::CN_1, limit);
+
+    if (!threads.isExist(Algorithm::CN_0)) {
+        threads.disable(Algorithm::CN_0);
+        ++count;
+    }
+
+#   ifdef XMRIG_ALGO_CN_GPU
+    count += generate("cn/gpu", threads, Algorithm::CN_GPU, limit);
+#   endif
+
+    return count;
+}
+
+
+#ifdef XMRIG_ALGO_CN_LITE
+template<>
+size_t inline generate<Algorithm::CN_LITE>(Threads<CpuThreads> &threads, uint32_t limit)
+{
+    size_t count = 0;
+
+    count += generate("cn-lite", threads, Algorithm::CN_LITE_1, limit);
+
+    if (!threads.isExist(Algorithm::CN_LITE_0)) {
+        threads.disable(Algorithm::CN_LITE_0);
+        ++count;
+    }
+
+    return count;
+}
+#endif
+
+
+#ifdef XMRIG_ALGO_CN_HEAVY
+template<>
+size_t inline generate<Algorithm::CN_HEAVY>(Threads<CpuThreads> &threads, uint32_t limit)
+{
+    return generate("cn-heavy", threads, Algorithm::CN_HEAVY_0, limit);
+}
+#endif
+
+
+#ifdef XMRIG_ALGO_CN_PICO
+template<>
+size_t inline generate<Algorithm::CN_PICO>(Threads<CpuThreads> &threads, uint32_t limit)
+{
+    return generate("cn-pico", threads, Algorithm::CN_PICO_0, limit);
+}
+#endif
+
+
+#ifdef XMRIG_ALGO_RANDOMX
+template<>
+size_t inline generate<Algorithm::RANDOM_X>(Threads<CpuThreads> &threads, uint32_t limit)
+{
+    size_t count = 0;
+
+    auto wow = Cpu::info()->threads(Algorithm::RX_WOW, limit);
+
+    if (!threads.isExist(Algorithm::RX_ARQ)) {
+        auto arq = Cpu::info()->threads(Algorithm::RX_ARQ, limit);
+        if (arq == wow) {
+            threads.setAlias(Algorithm::RX_ARQ, "rx/wow");
+            ++count;
+        }
+        else {
+            count += threads.move("rx/arq", std::move(arq));
+        }
+    }
+
+    if (!threads.isExist(Algorithm::RX_WOW)) {
+        count += threads.move("rx/wow", std::move(wow));
+    }
+
+    count += generate("rx", threads, Algorithm::RX_0, limit);
+
+    return count;
+}
+#endif
+
+
+#ifdef XMRIG_ALGO_ARGON2
+template<>
+size_t inline generate<Algorithm::ARGON2>(Threads<CpuThreads> &threads, uint32_t limit)
+{
+    return generate("argon2", threads, Algorithm::AR2_CHUKWA, limit);
+}
+#endif
+
+
+} /* namespace xmrig */
+
+
+#endif /* XMRIG_CPUCONFIG_GEN_H */
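
`CpuConfig_gen.h` dispatches profile generation through a function template specialized per algorithm family, with a primary template that returns 0 for families that are compiled out. The sketch below shows the same dispatch in isolation. `Family`, `Profiles` and the keys are hypothetical stand-ins, with a `std::set` of names playing the role of `Threads<CpuThreads>`.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

enum class Family { CN, RANDOM_X, ARGON2 };  // hypothetical stand-in for Algorithm::Family

using Profiles = std::set<std::string>;      // hypothetical stand-in for Threads<CpuThreads>

// Adds a profile only if the key is not present yet; returns the number added.
inline std::size_t generate(const char *key, Profiles &profiles)
{
    return profiles.insert(key).second ? 1 : 0;
}

// Primary template: a family without a specialization generates nothing.
template<Family F>
inline std::size_t generate(Profiles &) { return 0; }

template<>
inline std::size_t generate<Family::CN>(Profiles &profiles)
{
    return generate("cn", profiles);
}

template<>
inline std::size_t generate<Family::RANDOM_X>(Profiles &profiles)
{
    return generate("rx", profiles) + generate("rx/wow", profiles);
}
```

Because every call returns how many profiles it added, the caller can sum the results and treat a nonzero total as "configuration changed", which is how `CpuConfig::generate()` sets `m_shouldSave`.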
--- a/src/backend/cpu/CpuLaunchData.cpp
+++ b/src/backend/cpu/CpuLaunchData.cpp
@ -24,13 +24,15 @@
 */


-#include <algorithm>
-
-
 #include "backend/cpu/CpuLaunchData.h"
+
+#include "backend/common/Tags.h"
 #include "backend/cpu/CpuConfig.h"


+#include <algorithm>
+
+
 xmrig::CpuLaunchData::CpuLaunchData(const Miner *miner, const Algorithm &algorithm, const CpuConfig &config, const CpuThread &thread) :
    algorithm(algorithm),
    assembly(config.assembly()),
@ -65,3 +67,9 @@ xmrig::CnHash::AlgoVariant xmrig::CpuLaunchData::av() const

    return static_cast<CnHash::AlgoVariant>(!hwAES ? (intensity + 5) : (intensity + 2));
 }
+
+
+const char *xmrig::CpuLaunchData::tag()
+{
+    return cpu_tag();
+}
--- a/src/backend/cpu/CpuLaunchData.h
+++ b/src/backend/cpu/CpuLaunchData.h
@ -54,6 +54,8 @@ public:
    inline bool operator!=(const CpuLaunchData &other) const    { return !isEqual(other); }
    inline bool operator==(const CpuLaunchData &other) const    { return isEqual(other); }

+    static const char *tag();
+
    const Algorithm algorithm;
    const Assembly assembly;
    const bool hugePages;
--- a/src/backend/cpu/CpuThread.h
+++ b/src/backend/cpu/CpuThread.h
@ -35,7 +35,7 @@ namespace xmrig {
 class CpuThread
 {
 public:
-    inline constexpr CpuThread() {}
+    inline constexpr CpuThread() = default;
    inline constexpr CpuThread(int64_t affinity, uint32_t intensity) : m_affinity(affinity), m_intensity(intensity) {}

    CpuThread(const rapidjson::Value &value);
--- a/src/backend/cpu/CpuThreads.cpp
+++ b/src/backend/cpu/CpuThreads.cpp
@ -120,6 +120,16 @@ xmrig::CpuThreads::CpuThreads(size_t count, uint32_t intensity)
 }


+bool xmrig::CpuThreads::isEqual(const CpuThreads &other) const
+{
+    if (isEmpty() && other.isEmpty()) {
+        return true;
+    }
+
+    return count() == other.count() && std::equal(m_data.begin(), m_data.end(), other.m_data.begin());
+}
+
+
 rapidjson::Value xmrig::CpuThreads::toJSON(rapidjson::Document &doc) const
 {
    using namespace rapidjson;
--- a/src/backend/cpu/CpuThreads.h
+++ b/src/backend/cpu/CpuThreads.h
@ -38,7 +38,7 @@ namespace xmrig {
 class CpuThreads
 {
 public:
-    inline CpuThreads() {}
+    inline CpuThreads() = default;
    inline CpuThreads(size_t count) : m_data(count) {}

    CpuThreads(const rapidjson::Value &value);
@ -51,6 +51,10 @@ public:
    inline void add(int64_t affinity, uint32_t intensity)   { add(CpuThread(affinity, intensity)); }
    inline void reserve(size_t capacity)                    { m_data.reserve(capacity); }

+    inline bool operator!=(const CpuThreads &other) const   { return !isEqual(other); }
+    inline bool operator==(const CpuThreads &other) const   { return isEqual(other); }
+
+    bool isEqual(const CpuThreads &other) const;
    rapidjson::Value toJSON(rapidjson::Document &doc) const;

 private:
--- a/src/backend/cpu/CpuWorker.cpp
+++ b/src/backend/cpu/CpuWorker.cpp
@ -24,7 +24,7 @@
 */


-#include <assert.h>
+#include <cassert>
 #include <thread>


@ -46,15 +46,15 @@

 namespace xmrig {

-static constexpr uint32_t kReserveCount = 4096;
+static constexpr uint32_t kReserveCount = 32768;

 } // namespace xmrig



 template<size_t N>
-xmrig::CpuWorker<N>::CpuWorker(size_t index, const CpuLaunchData &data) :
-    Worker(index, data.affinity, data.priority),
+xmrig::CpuWorker<N>::CpuWorker(size_t id, const CpuLaunchData &data) :
+    Worker(id, data.affinity, data.priority),
    m_algorithm(data.algorithm),
    m_assembly(data.assembly),
    m_hwAES(data.hwAES),
@ -62,19 +62,19 @@ xmrig::CpuWorker<N>::CpuWorker(size_t index, const CpuLaunchData &data) :
    m_miner(data.miner),
    m_ctx()
 {
-    m_memory = new VirtualMemory(m_algorithm.l3() * N, data.hugePages);
+    m_memory = new VirtualMemory(m_algorithm.l3() * N, data.hugePages, true, m_node);
 }


 template<size_t N>
 xmrig::CpuWorker<N>::~CpuWorker()
 {
-    CnCtx::release(m_ctx, N);
-    delete m_memory;
-
 #   ifdef XMRIG_ALGO_RANDOMX
    delete m_vm;
 #   endif
+
+    CnCtx::release(m_ctx, N);
+    delete m_memory;
 }


@ -120,7 +120,6 @@ bool xmrig::CpuWorker<N>::selfTest()
                        verify(Algorithm::CN_XAO,    test_output_xao)  &&
                        verify(Algorithm::CN_RTO,    test_output_rto)  &&
                        verify(Algorithm::CN_HALF,   test_output_half) &&
-                        verify2(Algorithm::CN_WOW,   test_output_wow)  &&
                        verify2(Algorithm::CN_R,     test_output_r)    &&
                        verify(Algorithm::CN_RWZ,    test_output_rwz)  &&
                        verify(Algorithm::CN_ZLS,    test_output_zls)  &&
@ -209,11 +208,11 @@ void xmrig::CpuWorker<N>::start()

            for (size_t i = 0; i < N; ++i) {
                if (*reinterpret_cast<uint64_t*>(m_hash + (i * 32) + 24) < job.target()) {
-                    JobResults::submit(JobResult(job, *m_job.nonce(i), m_hash + (i * 32)));
+                    JobResults::submit(job, *m_job.nonce(i), m_hash + (i * 32));
                }
            }

-            m_job.nextRound(kReserveCount);
+            m_job.nextRound(kReserveCount, 1);
            m_count += N;

            std::this_thread::yield();
@ -300,7 +299,11 @@ void xmrig::CpuWorker<N>::allocateCnCtx()
 template<size_t N>
 void xmrig::CpuWorker<N>::consumeJob()
 {
-    m_job.add(m_miner->job(), Nonce::sequence(Nonce::CPU), kReserveCount);
+    if (Nonce::sequence(Nonce::CPU) == 0) {
+        return;
+    }
+
+    m_job.add(m_miner->job(), kReserveCount, Nonce::CPU);

 #   ifdef XMRIG_ALGO_RANDOMX
    if (m_job.currentJob().algorithm().family() == Algorithm::RANDOM_X) {
--- a/src/backend/cpu/CpuWorker.h
+++ b/src/backend/cpu/CpuWorker.h
@ -30,7 +30,7 @@
 #include "backend/common/Worker.h"
 #include "backend/common/WorkerJob.h"
 #include "backend/cpu/CpuLaunchData.h"
-#include "base/net/stratum/Job.h"
+#include "base/tools/Object.h"
 #include "net/JobResult.h"


@ -44,7 +44,9 @@ template<size_t N>
 class CpuWorker : public Worker
 {
 public:
-    CpuWorker(size_t index, const CpuLaunchData &data);
+    XMRIG_DISABLE_COPY_MOVE_DEFAULT(CpuWorker)
+
+    CpuWorker(size_t id, const CpuLaunchData &data);
    ~CpuWorker() override;

 protected:
@ -52,6 +54,7 @@ protected:
    void start() override;

    inline const VirtualMemory *memory() const override { return m_memory; }
+    inline size_t intensity() const override            { return N; }

 private:
    inline cn_hash_fun fn(const Algorithm &algorithm) const { return CnHash::fn(algorithm, m_av, m_assembly); }
@ -71,7 +74,7 @@ private:
    const CnHash::AlgoVariant m_av;
    const Miner *m_miner;
    cryptonight_ctx *m_ctx[N];
-    uint8_t m_hash[N * 32];
+    uint8_t m_hash[N * 32]{ 0 };
    VirtualMemory *m_memory = nullptr;
    WorkerJob<N> m_job;

--- a/src/backend/cpu/cpu.cmake
+++ b/src/backend/cpu/cpu.cmake
@ -2,6 +2,7 @@ set(HEADERS_BACKEND_CPU
    src/backend/cpu/Cpu.h
    src/backend/cpu/CpuBackend.h
    src/backend/cpu/CpuConfig.h
+    src/backend/cpu/CpuConfig_gen.h
    src/backend/cpu/CpuLaunchData.cpp
    src/backend/cpu/CpuThread.h
    src/backend/cpu/CpuThreads.h
--- a/src/backend/cpu/interfaces/ICpuInfo.h
+++ b/src/backend/cpu/interfaces/ICpuInfo.h
@ -45,18 +45,18 @@ public:
    inline constexpr static bool isX64() { return false; }
 #   endif

-    virtual Assembly::Id assembly() const                                     = 0;
-    virtual bool hasAES() const                                               = 0;
-    virtual bool hasAVX2() const                                              = 0;
-    virtual const char *backend() const                                       = 0;
-    virtual const char *brand() const                                         = 0;
-    virtual CpuThreads threads(const Algorithm &algorithm) const              = 0;
-    virtual size_t cores() const                                              = 0;
-    virtual size_t L2() const                                                 = 0;
-    virtual size_t L3() const                                                 = 0;
-    virtual size_t nodes() const                                              = 0;
-    virtual size_t packages() const                                           = 0;
-    virtual size_t threads() const                                            = 0;
+    virtual Assembly::Id assembly() const                                           = 0;
+    virtual bool hasAES() const                                                     = 0;
+    virtual bool hasAVX2() const                                                    = 0;
+    virtual const char *backend() const                                             = 0;
+    virtual const char *brand() const                                               = 0;
+    virtual CpuThreads threads(const Algorithm &algorithm, uint32_t limit) const    = 0;
+    virtual size_t cores() const                                                    = 0;
+    virtual size_t L2() const                                                       = 0;
+    virtual size_t L3() const                                                       = 0;
+    virtual size_t nodes() const                                                    = 0;
+    virtual size_t packages() const                                                 = 0;
+    virtual size_t threads() const                                                  = 0;
 };


--- a/src/backend/cpu/platform/AdvancedCpuInfo.cpp
+++ b/src/backend/cpu/platform/AdvancedCpuInfo.cpp
@ -23,10 +23,10 @@
 */

 #include <algorithm>
-#include <assert.h>
-#include <math.h>
-#include <stdio.h>
-#include <string.h>
+#include <cassert>
+#include <cmath>
+#include <cstdio>
+#include <cstring>


 #include "3rdparty/libcpuid/libcpuid.h"
@ -109,7 +109,7 @@ xmrig::AdvancedCpuInfo::AdvancedCpuInfo() :
 }


-xmrig::CpuThreads xmrig::AdvancedCpuInfo::threads(const Algorithm &algorithm) const
+xmrig::CpuThreads xmrig::AdvancedCpuInfo::threads(const Algorithm &algorithm, uint32_t limit) const
 {
    if (threads() == 1) {
        return 1;
@ -153,5 +153,12 @@ xmrig::CpuThreads xmrig::AdvancedCpuInfo::threads(const Algorithm &algorithm) co
    }
 #   endif

-    return CpuThreads(std::max<size_t>(std::min<size_t>(count, threads()), 1), intensity);
+    if (limit > 0 && limit < 100) {
+        count = std::min(count, static_cast<size_t>(round(threads() * (limit / 100.0))));
+    }
+    else {
+        count = std::min(count, threads());
+    }
+
+    return CpuThreads(std::max<size_t>(count, 1), intensity);
 }
--- a/src/backend/cpu/platform/AdvancedCpuInfo.h
+++ b/src/backend/cpu/platform/AdvancedCpuInfo.h
@ -38,7 +38,7 @@ public:
    AdvancedCpuInfo();

 protected:
-    CpuThreads threads(const Algorithm &algorithm) const override;
+    CpuThreads threads(const Algorithm &algorithm, uint32_t limit) const override;

    inline Assembly::Id assembly() const override   { return m_assembly; }
    inline bool hasAES() const override             { return m_aes; }
--- a/src/backend/cpu/platform/BasicCpuInfo.cpp
+++ b/src/backend/cpu/platform/BasicCpuInfo.cpp
@ -179,7 +179,7 @@ const char *xmrig::BasicCpuInfo::backend() const
 }


-xmrig::CpuThreads xmrig::BasicCpuInfo::threads(const Algorithm &algorithm) const
+xmrig::CpuThreads xmrig::BasicCpuInfo::threads(const Algorithm &algorithm, uint32_t limit) const
 {
    const size_t count = std::thread::hardware_concurrency();

--- a/src/backend/cpu/platform/BasicCpuInfo.h
+++ b/src/backend/cpu/platform/BasicCpuInfo.h
@ -39,7 +39,7 @@ public:

 protected:
    const char *backend() const override;
-    CpuThreads threads(const Algorithm &algorithm) const override;
+    CpuThreads threads(const Algorithm &algorithm, uint32_t limit) const override;

    inline Assembly::Id assembly() const override   { return m_assembly; }
    inline bool hasAES() const override             { return m_aes; }
--- a/src/backend/cpu/platform/BasicCpuInfo_arm.cpp
+++ b/src/backend/cpu/platform/BasicCpuInfo_arm.cpp
@ -63,7 +63,7 @@ const char *xmrig::BasicCpuInfo::backend() const
 }


-xmrig::CpuThreads xmrig::BasicCpuInfo::threads(const Algorithm &) const
+xmrig::CpuThreads xmrig::BasicCpuInfo::threads(const Algorithm &, uint32_t) const
 {
    return CpuThreads(threads());
 }
--- a/src/backend/cpu/platform/HwlocCpuInfo.cpp
+++ b/src/backend/cpu/platform/HwlocCpuInfo.cpp
@ -29,6 +29,7 @@


 #include <algorithm>
+#include <cmath>
 #include <hwloc.h>


@ -45,7 +46,6 @@
 namespace xmrig {


-std::vector<uint32_t> HwlocCpuInfo::m_nodeIndexes;
 uint32_t HwlocCpuInfo::m_features = 0;


@ -127,9 +127,7 @@ static inline bool isCacheExclusive(hwloc_obj_t obj)
 } // namespace xmrig


-xmrig::HwlocCpuInfo::HwlocCpuInfo() : BasicCpuInfo(),
-    m_backend(),
-    m_cache()
+xmrig::HwlocCpuInfo::HwlocCpuInfo()
 {
    m_threads = 0;

@ -149,7 +147,7 @@ xmrig::HwlocCpuInfo::HwlocCpuInfo() : BasicCpuInfo(),
 #   endif

    const std::vector<hwloc_obj_t> packages = findByType(hwloc_get_root_obj(m_topology), HWLOC_OBJ_PACKAGE);
-    if (packages.size()) {
+    if (!packages.empty()) {
        const char *value = hwloc_obj_get_info_by_name(packages[0], "CPUModel");
        if (value) {
            strncpy(m_brand, value, 64);
@ -178,7 +176,7 @@ xmrig::HwlocCpuInfo::HwlocCpuInfo() : BasicCpuInfo(),

    m_threads   = countByType(m_topology, HWLOC_OBJ_PU);
    m_cores     = countByType(m_topology, HWLOC_OBJ_CORE);
-    m_nodes     = std::max<size_t>(countByType(m_topology, HWLOC_OBJ_NUMANODE), 1);
+    m_nodes     = std::max(hwloc_bitmap_weight(hwloc_topology_get_complete_nodeset(m_topology)), 1);
    m_packages  = countByType(m_topology, HWLOC_OBJ_PACKAGE);

    if (m_nodes > 1) {
@ -186,11 +184,11 @@ xmrig::HwlocCpuInfo::HwlocCpuInfo() : BasicCpuInfo(),
            m_features |= SET_THISTHREAD_MEMBIND;
        }

-        m_nodeIndexes.reserve(m_nodes);
+        m_nodeset.reserve(m_nodes);
        hwloc_obj_t node = nullptr;

        while ((node = hwloc_get_next_obj_by_type(m_topology, HWLOC_OBJ_NUMANODE, node)) != nullptr) {
-            m_nodeIndexes.emplace_back(node->os_index);
+            m_nodeset.emplace_back(node->os_index);
        }
    }
 }
@ -202,10 +200,24 @@ xmrig::HwlocCpuInfo::~HwlocCpuInfo()
 }


-xmrig::CpuThreads xmrig::HwlocCpuInfo::threads(const Algorithm &algorithm) const
+bool xmrig::HwlocCpuInfo::membind(hwloc_const_bitmap_t nodeset)
+{
+    if (!hwloc_topology_get_support(m_topology)->membind->set_thisthread_membind) {
+        return false;
+    }
+
+#   if HWLOC_API_VERSION >= 0x20000
+    return hwloc_set_membind(m_topology, nodeset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD | HWLOC_MEMBIND_BYNODESET) >= 0;
+#   else
+    return hwloc_set_membind_nodeset(m_topology, nodeset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD) >= 0;
+#   endif
+}
+
+
+xmrig::CpuThreads xmrig::HwlocCpuInfo::threads(const Algorithm &algorithm, uint32_t limit) const
 {
    if (L2() == 0 && L3() == 0) {
-        return BasicCpuInfo::threads(algorithm);
+        return BasicCpuInfo::threads(algorithm, limit);
    }

    const unsigned depth = L3() > 0 ? 3 : 2;
@ -218,21 +230,37 @@ xmrig::CpuThreads xmrig::HwlocCpuInfo::threads(const Algorithm &algorithm) const

    findCache(hwloc_get_root_obj(m_topology), depth, depth, [&caches](hwloc_obj_t found) { caches.emplace_back(found); });

-    for (hwloc_obj_t cache : caches) {
-        processTopLevelCache(cache, algorithm, threads);
+    if (limit > 0 && limit < 100 && !caches.empty()) {
+        const double maxTotalThreads = round(m_threads * (limit / 100.0));
+        const auto maxPerCache       = std::max(static_cast<int>(round(maxTotalThreads / caches.size())), 1);
+        int remaining                = std::max(static_cast<int>(maxTotalThreads), 1);
+
+        for (hwloc_obj_t cache : caches) {
+            processTopLevelCache(cache, algorithm, threads, std::min(maxPerCache, remaining));
+
+            remaining -= maxPerCache;
+            if (remaining <= 0) {
+                break;
+            }
+        }
+    }
+    else {
+        for (hwloc_obj_t cache : caches) {
+            processTopLevelCache(cache, algorithm, threads, 0);
+        }
    }

    if (threads.isEmpty()) {
        LOG_WARN("hwloc auto configuration for algorithm \"%s\" failed.", algorithm.shortName());

-        return BasicCpuInfo::threads(algorithm);
+        return BasicCpuInfo::threads(algorithm, limit);
    }

    return threads;
 }


-void xmrig::HwlocCpuInfo::processTopLevelCache(hwloc_obj_t cache, const Algorithm &algorithm, CpuThreads &threads) const
+void xmrig::HwlocCpuInfo::processTopLevelCache(hwloc_obj_t cache, const Algorithm &algorithm, CpuThreads &threads, size_t limit) const
 {
    constexpr size_t oneMiB = 1024u * 1024u;

@ -296,6 +324,10 @@ void xmrig::HwlocCpuInfo::processTopLevelCache(hwloc_obj_t cache, const Algorith
    }
 #   endif

+    if (limit > 0) {
+        cacheHashes = std::min(cacheHashes, limit);
+    }
+
    if (cacheHashes >= PUs) {
        for (hwloc_obj_t core : cores) {
            const std::vector<hwloc_obj_t> units = findByType(core, HWLOC_OBJ_PU);
--- a/src/backend/cpu/platform/HwlocCpuInfo.h
+++ b/src/backend/cpu/platform/HwlocCpuInfo.h
@ -27,10 +27,12 @@


 #include "backend/cpu/platform/BasicCpuInfo.h"
+#include "base/tools/Object.h"


-typedef struct hwloc_obj *hwloc_obj_t;
-typedef struct hwloc_topology *hwloc_topology_t;
+using hwloc_const_bitmap_t  = const struct hwloc_bitmap_s *;
+using hwloc_obj_t           = struct hwloc_obj *;
+using hwloc_topology_t      = struct hwloc_topology *;


 namespace xmrig {
@ -39,6 +41,9 @@ namespace xmrig {
 class HwlocCpuInfo : public BasicCpuInfo
 {
 public:
+    XMRIG_DISABLE_COPY_MOVE(HwlocCpuInfo)
+
+
    enum Feature : uint32_t {
        SET_THISTHREAD_MEMBIND = 1
    };
@ -48,10 +53,14 @@ public:
    ~HwlocCpuInfo() override;

    static inline bool has(Feature feature)                     { return m_features & feature; }
-    static inline const std::vector<uint32_t> &nodeIndexes()    { return m_nodeIndexes; }
+
+    inline const std::vector<uint32_t> &nodeset() const         { return m_nodeset; }
+    inline hwloc_topology_t topology() const                    { return m_topology; }
+
+    bool membind(hwloc_const_bitmap_t nodeset);

 protected:
-    CpuThreads threads(const Algorithm &algorithm) const override;
+    CpuThreads threads(const Algorithm &algorithm, uint32_t limit) const override;

    inline const char *backend() const override     { return m_backend; }
    inline size_t cores() const override            { return m_cores; }
@ -61,17 +70,18 @@ protected:
    inline size_t packages() const override         { return m_packages; }

 private:
-    void processTopLevelCache(hwloc_obj_t obj, const Algorithm &algorithm, CpuThreads &threads) const;
+    void processTopLevelCache(hwloc_obj_t obj, const Algorithm &algorithm, CpuThreads &threads, size_t limit) const;
+

-    static std::vector<uint32_t> m_nodeIndexes;
    static uint32_t m_features;

-    char m_backend[20];
-    hwloc_topology_t m_topology;
-    size_t m_cache[5];
-    size_t m_cores      = 0;
-    size_t m_nodes      = 0;
-    size_t m_packages   = 0;
+    char m_backend[20]          = { 0 };
+    hwloc_topology_t m_topology = nullptr;
+    size_t m_cache[5]           = { 0 };
+    size_t m_cores              = 0;
+    size_t m_nodes              = 0;
+    size_t m_packages           = 0;
+    std::vector<uint32_t> m_nodeset;
 };


--- a/src/backend/cuda/CudaBackend.cpp
+++ b/src/backend/cuda/CudaBackend.cpp
@ -0,0 +1,524 @@
+/* XMRig
+ * Copyright 2010      Jeff Garzik <jgarzik@pobox.com>
+ * Copyright 2012-2014 pooler      <pooler@litecoinpool.org>
+ * Copyright 2014      Lucas Jones <https://github.com/lucasjones>
+ * Copyright 2014-2016 Wolf9466    <https://github.com/OhGodAPet>
+ * Copyright 2016      Jay D Dee   <jayddee246@gmail.com>
+ * Copyright 2017-2018 XMR-Stak    <https://github.com/fireice-uk>, <https://github.com/psychocrypt>
+ * Copyright 2018-2019 SChernykh   <https://github.com/SChernykh>
+ * Copyright 2016-2019 XMRig       <https://github.com/xmrig>, <support@xmrig.com>
+ *
+ *   This program is free software: you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation, either version 3 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+
+#include <mutex>
+#include <string>
+
+
+#include "backend/cuda/CudaBackend.h"
+#include "backend/common/Hashrate.h"
+#include "backend/common/interfaces/IWorker.h"
+#include "backend/common/Tags.h"
+#include "backend/common/Workers.h"
+#include "backend/cuda/CudaConfig.h"
+#include "backend/cuda/CudaThreads.h"
+#include "backend/cuda/CudaWorker.h"
+#include "backend/cuda/wrappers/CudaDevice.h"
+#include "backend/cuda/wrappers/CudaLib.h"
+#include "base/io/log/Log.h"
+#include "base/net/stratum/Job.h"
+#include "base/tools/Chrono.h"
+#include "base/tools/String.h"
+#include "core/config/Config.h"
+#include "core/Controller.h"
+#include "rapidjson/document.h"
+
+
+#ifdef XMRIG_FEATURE_API
+#   include "base/api/interfaces/IApiRequest.h"
+#endif
+
+
+#ifdef XMRIG_FEATURE_NVML
+#   include "backend/cuda/wrappers/NvmlLib.h"
+
+namespace xmrig { static const char *kNvmlLabel = "NVML"; }
+#endif
+
+
+namespace xmrig {
+
+
+extern template class Threads<CudaThreads>;
+
+
+constexpr const size_t oneMiB   = 1024u * 1024u;
+static const char *kLabel       = "CUDA";
+static const char *tag          = GREEN_BG_BOLD(WHITE_BOLD_S " nv  ");
+static const String kType       = "cuda";
+static std::mutex mutex;
+
+
+
+static void printDisabled(const char *label, const char *reason)
+{
+    Log::print(GREEN_BOLD(" * ") WHITE_BOLD("%-13s") RED_BOLD("disabled") "%s", label, reason);
+}
+
+
+struct CudaLaunchStatus
+{
+public:
+    inline size_t threads() const { return m_threads; }
+
+    inline bool started(bool ready)
+    {
+        ready ? m_started++ : m_errors++;
+
+        return (m_started + m_errors) == m_threads;
+    }
+
+    inline void start(size_t threads)
+    {
+        m_started         = 0;
+        m_errors          = 0;
+        m_threads         = threads;
+        m_ts              = Chrono::steadyMSecs();
+        CudaWorker::ready = false;
+    }
+
+    inline void print() const
+    {
+        if (m_started == 0) {
+            LOG_ERR("%s " RED_BOLD("disabled") YELLOW(" (failed to start threads)"), tag);
+
+            return;
+        }
+
+        LOG_INFO("%s" GREEN_BOLD(" READY") " threads " "%s%zu/%zu" BLACK_BOLD(" (%" PRIu64 " ms)"),
+                 tag,
+                 m_errors == 0 ? CYAN_BOLD_S : YELLOW_BOLD_S,
+                 m_started,
+                 m_threads,
+                 Chrono::steadyMSecs() - m_ts
+                 );
+    }
+
+private:
+    size_t m_errors     = 0;
+    size_t m_started    = 0;
+    size_t m_threads    = 0;
+    uint64_t m_ts       = 0;
+};
+
+
+class CudaBackendPrivate
+{
+public:
+    inline CudaBackendPrivate(Controller *controller) :
+        controller(controller)
+    {
+        init(controller->config()->cuda());
+    }
+
+
+    void init(const CudaConfig &cuda)
+    {
+        if (!cuda.isEnabled()) {
+            return printDisabled(kLabel, "");
+        }
+
+        if (!CudaLib::init(cuda.loader())) {
+            return printDisabled(kLabel, RED_S " (failed to load CUDA plugin)");
+        }
+
+        runtimeVersion = CudaLib::runtimeVersion();
+        driverVersion  = CudaLib::driverVersion();
+
+        if (!runtimeVersion || !driverVersion || !CudaLib::deviceCount()) {
+            return printDisabled(kLabel, RED_S " (no devices)");
+        }
+
+        if (!devices.empty()) {
+            return;
+        }
+
+        devices = CudaLib::devices(cuda.bfactor(), cuda.bsleep(), cuda.devicesHint());
+        if (devices.empty()) {
+            return printDisabled(kLabel, RED_S " (no devices)");
+        }
+
+        Log::print(GREEN_BOLD(" * ") WHITE_BOLD("%-13s") WHITE_BOLD("%s") "/" WHITE_BOLD("%s") BLACK_BOLD("/%s"), kLabel,
+                   CudaLib::version(runtimeVersion).c_str(), CudaLib::version(driverVersion).c_str(), CudaLib::pluginVersion());
+
+#       ifdef XMRIG_FEATURE_NVML
+        if (cuda.isNvmlEnabled()) {
+            if (NvmlLib::init(cuda.nvmlLoader())) {
+                NvmlLib::assign(devices);
+
+                Log::print(GREEN_BOLD(" * ") WHITE_BOLD("%-13s") WHITE_BOLD("%s") "/" GREEN_BOLD("%s") " press " MAGENTA_BG(WHITE_BOLD_S "e") " for health report",
+                           kNvmlLabel,
+                           NvmlLib::version(),
+                           NvmlLib::driverVersion()
+                           );
+            }
+            else {
+                printDisabled(kNvmlLabel, RED_S " (failed to load NVML)");
+            }
+        }
+        else {
+            printDisabled(kNvmlLabel, "");
+        }
+#       endif
+
+        for (const CudaDevice &device : devices) {
+            Log::print(GREEN_BOLD(" * ") WHITE_BOLD("%-13s") CYAN_BOLD("#%zu") YELLOW(" %s") GREEN_BOLD(" %s ") WHITE_BOLD("%u/%u MHz") " smx:" WHITE_BOLD("%u") " arch:" WHITE_BOLD("%u%u") " mem:" CYAN("%zu/%zu") " MB",
+                       "CUDA GPU",
+                       device.index(),
+                       device.topology().toString().data(),
+                       device.name().data(),
+                       device.clock(),
+                       device.memoryClock(),
+                       device.smx(),
+                       device.computeCapability(true),
+                       device.computeCapability(false),
+                       device.freeMemSize() / oneMiB,
+                       device.globalMemSize() / oneMiB);
+        }
+    }
+
+
+    inline void start(const Job &)
+    {
+        LOG_INFO("%s use profile " BLUE_BG(WHITE_BOLD_S " %s ") WHITE_BOLD_S " (" CYAN_BOLD("%zu") WHITE_BOLD(" thread%s)") " scratchpad " CYAN_BOLD("%zu KB"),
+                 tag,
+                 profileName.data(),
+                 threads.size(),
+                 threads.size() > 1 ? "s" : "",
+                 algo.l3() / 1024
+                 );
+
+        Log::print(WHITE_BOLD("|  # | GPU |  BUS ID |    I |   T |   B | BF |  BS |  MEM | NAME"));
+
+        size_t i = 0;
+        for (const auto &data : threads) {
+            Log::print("|" CYAN_BOLD("%3zu") " |" CYAN_BOLD("%4u") " |" YELLOW(" %7s") " |" CYAN_BOLD("%5d") " |" CYAN_BOLD("%4d") " |"
+                       CYAN_BOLD("%4d") " |" CYAN_BOLD("%3d") " |" CYAN_BOLD("%4d") " |" CYAN("%5zu") " | " GREEN("%s"),
+                       i,
+                       data.thread.index(),
+                       data.device.topology().toString().data(),
+                       data.thread.threads() * data.thread.blocks(),
+                       data.thread.threads(),
+                       data.thread.blocks(),
+                       data.thread.bfactor(),
+                       data.thread.bsleep(),
+                       (data.thread.threads() * data.thread.blocks()) * algo.l3() / oneMiB,
+                       data.device.name().data()
+                       );
+
+            i++;
+        }
+
+        status.start(threads.size());
+        workers.start(threads);
+    }
+
+
+#   ifdef XMRIG_FEATURE_NVML
+    void printHealth()
+    {
+        for (const auto &device : devices) {
+            const auto health = NvmlLib::health(device.nvmlDevice());
+
+            std::string clocks;
+            if (health.clock && health.memClock) {
+                clocks += " " + std::to_string(health.clock) + "/" + std::to_string(health.memClock) + " MHz";
+            }
+
+            std::string fans;
+            if (!health.fanSpeed.empty()) {
+                for (uint32_t i = 0; i < health.fanSpeed.size(); ++i) {
+                    fans += " fan" + std::to_string(i) + ":" CYAN_BOLD_S + std::to_string(health.fanSpeed[i]) + "%" CLEAR;
+                }
+            }
+
+            LOG_INFO(CYAN_BOLD("#%u") YELLOW(" %s") MAGENTA_BOLD("%4uW") CSI "1;%um %2uC" CLEAR WHITE_BOLD("%s") "%s",
+                     device.index(),
+                     device.topology().toString().data(),
+                     health.power,
+                     health.temperature < 60 ? 32 : (health.temperature > 85 ? 31 : 33),
+                     health.temperature,
+                     clocks.c_str(),
+                     fans.c_str()
+                     );
+        }
+    }
+#   endif
+
+
+    Algorithm algo;
+    Controller *controller;
+    CudaLaunchStatus status;
+    std::vector<CudaDevice> devices;
+    std::vector<CudaLaunchData> threads;
+    String profileName;
+    uint32_t driverVersion      = 0;
+    uint32_t runtimeVersion     = 0;
+    Workers<CudaLaunchData> workers;
+};
+
+
+} // namespace xmrig
+
+
+const char *xmrig::cuda_tag()
+{
+    return tag;
+}
+
+
+xmrig::CudaBackend::CudaBackend(Controller *controller) :
+    d_ptr(new CudaBackendPrivate(controller))
+{
+    d_ptr->workers.setBackend(this);
+}
+
+
+xmrig::CudaBackend::~CudaBackend()
+{
+    delete d_ptr;
+
+    CudaLib::close();
+
+#   ifdef XMRIG_FEATURE_NVML
+    NvmlLib::close();
+#   endif
+}
+
+
+bool xmrig::CudaBackend::isEnabled() const
+{
+    return d_ptr->controller->config()->cuda().isEnabled() && CudaLib::isInitialized() && !d_ptr->devices.empty();
+}
+
+
+bool xmrig::CudaBackend::isEnabled(const Algorithm &algorithm) const
+{
+    return !d_ptr->controller->config()->cuda().threads().get(algorithm).isEmpty();
+}
+
+
+const xmrig::Hashrate *xmrig::CudaBackend::hashrate() const
+{
+    return d_ptr->workers.hashrate();
+}
+
+
+const xmrig::String &xmrig::CudaBackend::profileName() const
+{
+    return d_ptr->profileName;
+}
+
+
+const xmrig::String &xmrig::CudaBackend::type() const
+{
+    return kType;
+}
+
+
+void xmrig::CudaBackend::execCommand(char command)
+{
+#   ifdef XMRIG_FEATURE_NVML
+    if (command == 'e' || command == 'E') {
+        d_ptr->printHealth();
+    }
+#   endif
+}
+
+
+void xmrig::CudaBackend::prepare(const Job &)
+{
+}
+
+
+void xmrig::CudaBackend::printHashrate(bool details)
+{
+    if (!details || !hashrate()) {
+        return;
+    }
+
+    char num[8 * 3] = { 0 };
+
+    Log::print(WHITE_BOLD_S "|   CUDA # | AFFINITY | 10s H/s | 60s H/s | 15m H/s |");
+
+    size_t i = 0;
+    for (const auto &data : d_ptr->threads) {
+        Log::print("| %8zu | %8" PRId64 " | %7s | %7s | %7s |" CYAN_BOLD(" #%u") YELLOW(" %s") GREEN(" %s"),
+                   i,
+                   data.thread.affinity(),
+                   Hashrate::format(hashrate()->calc(i, Hashrate::ShortInterval),  num,         sizeof num / 3),
+                   Hashrate::format(hashrate()->calc(i, Hashrate::MediumInterval), num + 8,     sizeof num / 3),
+                   Hashrate::format(hashrate()->calc(i, Hashrate::LargeInterval),  num + 8 * 2, sizeof num / 3),
+                   data.device.index(),
+                   data.device.topology().toString().data(),
+                   data.device.name().data()
+                   );
+
+        i++;
+    }
+
+    Log::print(WHITE_BOLD_S "|        - |        - | %7s | %7s | %7s |",
+               Hashrate::format(hashrate()->calc(Hashrate::ShortInterval),  num,         sizeof num / 3),
+               Hashrate::format(hashrate()->calc(Hashrate::MediumInterval), num + 8,     sizeof num / 3),
+               Hashrate::format(hashrate()->calc(Hashrate::LargeInterval),  num + 8 * 2, sizeof num / 3)
+               );
+}
+
+
+void xmrig::CudaBackend::setJob(const Job &job)
+{
+    const auto &cuda = d_ptr->controller->config()->cuda();
+    if (cuda.isEnabled()) {
+        d_ptr->init(cuda);
+    }
+
+    if (!isEnabled()) {
+        return stop();
+    }
+
+    auto threads = cuda.get(d_ptr->controller->miner(), job.algorithm(), d_ptr->devices);
+    if (!d_ptr->threads.empty() && d_ptr->threads.size() == threads.size() && std::equal(d_ptr->threads.begin(), d_ptr->threads.end(), threads.begin())) {
+        return;
+    }
+
+    d_ptr->algo         = job.algorithm();
+    d_ptr->profileName  = cuda.threads().profileName(job.algorithm());
+
+    if (d_ptr->profileName.isNull() || threads.empty()) {
+        LOG_WARN("%s " RED_BOLD("disabled") YELLOW(" (no suitable configuration found)"), tag);
+
+        return stop();
+    }
+
+    stop();
+
+    d_ptr->threads = std::move(threads);
+    d_ptr->start(job);
+}
+
+
+void xmrig::CudaBackend::start(IWorker *worker, bool ready)
+{
+    {
+        std::lock_guard<std::mutex> lock(mutex);
+
+        if (d_ptr->status.started(ready)) {
+            d_ptr->status.print();
+
+            CudaWorker::ready = true;
+        }
+    }
+
+    if (ready) {
+        worker->start();
+    }
+}
+
+
+void xmrig::CudaBackend::stop()
+{
+    if (d_ptr->threads.empty()) {
+        return;
+    }
+
+    const uint64_t ts = Chrono::steadyMSecs();
+
+    d_ptr->workers.stop();
+    d_ptr->threads.clear();
+
+    LOG_INFO("%s" YELLOW(" stopped") BLACK_BOLD(" (%" PRIu64 " ms)"), tag, Chrono::steadyMSecs() - ts);
+}
+
+
+void xmrig::CudaBackend::tick(uint64_t ticks)
+{
+    d_ptr->workers.tick(ticks);
+
+#   ifdef XMRIG_FEATURE_NVML
+    auto seconds = d_ptr->controller->config()->healthPrintTime();
+    if (seconds && ticks && (ticks % (seconds * 2)) == 0) {
+        d_ptr->printHealth();
+    }
+#   endif
+}
+
+
+#ifdef XMRIG_FEATURE_API
+rapidjson::Value xmrig::CudaBackend::toJSON(rapidjson::Document &doc) const
+{
+    using namespace rapidjson;
+    auto &allocator = doc.GetAllocator();
+
+    Value out(kObjectType);
+    out.AddMember("type",       type().toJSON(), allocator);
+    out.AddMember("enabled",    isEnabled(), allocator);
+    out.AddMember("algo",       d_ptr->algo.toJSON(), allocator);
+    out.AddMember("profile",    profileName().toJSON(), allocator);
+
+    if (CudaLib::isReady()) {
+        Value versions(kObjectType);
+        versions.AddMember("cuda-runtime",   Value(CudaLib::version(d_ptr->runtimeVersion).c_str(), allocator), allocator);
+        versions.AddMember("cuda-driver",    Value(CudaLib::version(d_ptr->driverVersion).c_str(), allocator), allocator);
+        versions.AddMember("plugin",         String(CudaLib::pluginVersion()).toJSON(doc), allocator);
+
+#       ifdef XMRIG_FEATURE_NVML
+        if (NvmlLib::isReady()) {
+            versions.AddMember("nvml",       StringRef(NvmlLib::version()), allocator);
+            versions.AddMember("driver",     StringRef(NvmlLib::driverVersion()), allocator);
+        }
+#       endif
+
+        out.AddMember("versions", versions, allocator);
+    }
+
+    if (d_ptr->threads.empty() || !hashrate()) {
+        return out;
+    }
+
+    out.AddMember("hashrate", hashrate()->toJSON(doc), allocator);
+
+    Value threads(kArrayType);
+
+    size_t i = 0;
+    for (const auto &data : d_ptr->threads) {
+        Value thread = data.thread.toJSON(doc);
+        thread.AddMember("hashrate", hashrate()->toJSON(i, doc), allocator);
+
+        data.device.toJSON(thread, doc);
+
+        i++;
+        threads.PushBack(thread, allocator);
+    }
+
+    out.AddMember("threads", threads, allocator);
+
+    return out;
+}
+
+
+void xmrig::CudaBackend::handleRequest(IApiRequest &)
+{
+}
+#endif
--- a/src/backend/cuda/CudaBackend.h
+++ b/src/backend/cuda/CudaBackend.h
@ -0,0 +1,80 @@
+/* XMRig
+ * Copyright 2010      Jeff Garzik <jgarzik@pobox.com>
+ * Copyright 2012-2014 pooler      <pooler@litecoinpool.org>
+ * Copyright 2014      Lucas Jones <https://github.com/lucasjones>
+ * Copyright 2014-2016 Wolf9466    <https://github.com/OhGodAPet>
+ * Copyright 2016      Jay D Dee   <jayddee246@gmail.com>
+ * Copyright 2017-2018 XMR-Stak    <https://github.com/fireice-uk>, <https://github.com/psychocrypt>
+ * Copyright 2018-2019 SChernykh   <https://github.com/SChernykh>
+ * Copyright 2016-2019 XMRig       <https://github.com/xmrig>, <support@xmrig.com>
+ *
+ *   This program is free software: you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation, either version 3 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef XMRIG_CUDABACKEND_H
+#define XMRIG_CUDABACKEND_H
+
+
+#include <utility>
+
+
+#include "backend/common/interfaces/IBackend.h"
+#include "base/tools/Object.h"
+
+
+namespace xmrig {
+
+
+class Controller;
+class CudaBackendPrivate;
+class Miner;
+
+
+class CudaBackend : public IBackend
+{
+public:
+    XMRIG_DISABLE_COPY_MOVE_DEFAULT(CudaBackend)
+
+    CudaBackend(Controller *controller);
+
+    ~CudaBackend() override;
+
+protected:
+    bool isEnabled() const override;
+    bool isEnabled(const Algorithm &algorithm) const override;
+    const Hashrate *hashrate() const override;
+    const String &profileName() const override;
+    const String &type() const override;
+    void execCommand(char command) override;
+    void prepare(const Job &nextJob) override;
+    void printHashrate(bool details) override;
+    void setJob(const Job &job) override;
+    void start(IWorker *worker, bool ready) override;
+    void stop() override;
+    void tick(uint64_t ticks) override;
+
+#   ifdef XMRIG_FEATURE_API
+    rapidjson::Value toJSON(rapidjson::Document &doc) const override;
+    void handleRequest(IApiRequest &request) override;
+#   endif
+
+private:
+    CudaBackendPrivate *d_ptr;
+};
+
+
+} /* namespace xmrig */
+
+
+#endif /* XMRIG_CUDABACKEND_H */
--- a/src/backend/cuda/CudaConfig.cpp
+++ b/src/backend/cuda/CudaConfig.cpp
@ -0,0 +1,197 @@
+/* XMRig
+ * Copyright 2010      Jeff Garzik <jgarzik@pobox.com>
+ * Copyright 2012-2014 pooler      <pooler@litecoinpool.org>
+ * Copyright 2014      Lucas Jones <https://github.com/lucasjones>
+ * Copyright 2014-2016 Wolf9466    <https://github.com/OhGodAPet>
+ * Copyright 2016      Jay D Dee   <jayddee246@gmail.com>
+ * Copyright 2017-2018 XMR-Stak    <https://github.com/fireice-uk>, <https://github.com/psychocrypt>
+ * Copyright 2018-2019 SChernykh   <https://github.com/SChernykh>
+ * Copyright 2016-2019 XMRig       <https://github.com/xmrig>, <support@xmrig.com>
+ *
+ *   This program is free software: you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation, either version 3 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+
+#include "backend/cuda/CudaConfig.h"
+#include "backend/common/Tags.h"
+#include "backend/cuda/CudaConfig_gen.h"
+#include "backend/cuda/wrappers/CudaLib.h"
+#include "base/io/json/Json.h"
+#include "base/io/log/Log.h"
+#include "rapidjson/document.h"
+
+
+namespace xmrig {
+
+
+static bool generated           = false;
+static const char *kDevicesHint = "devices-hint";
+static const char *kEnabled     = "enabled";
+static const char *kLoader      = "loader";
+
+#ifdef XMRIG_FEATURE_NVML
+static const char *kNvml        = "nvml";
+#endif
+
+
+extern template class Threads<CudaThreads>;
+
+
+}
+
+
+rapidjson::Value xmrig::CudaConfig::toJSON(rapidjson::Document &doc) const
+{
+    using namespace rapidjson;
+    auto &allocator = doc.GetAllocator();
+
+    Value obj(kObjectType);
+
+    obj.AddMember(StringRef(kEnabled),  m_enabled, allocator);
+    obj.AddMember(StringRef(kLoader),   m_loader.toJSON(), allocator);
+
+#   ifdef XMRIG_FEATURE_NVML
+    if (m_nvmlLoader.isNull()) {
+        obj.AddMember(StringRef(kNvml), m_nvml, allocator);
+    }
+    else {
+        obj.AddMember(StringRef(kNvml), m_nvmlLoader.toJSON(), allocator);
+    }
+#   endif
+
+    m_threads.toJSON(obj, doc);
+
+    return obj;
+}
+
+
+std::vector<xmrig::CudaLaunchData> xmrig::CudaConfig::get(const Miner *miner, const Algorithm &algorithm, const std::vector<CudaDevice> &devices) const
+{
+    auto deviceIndex = [&devices](uint32_t index) -> int {
+        for (uint32_t i = 0; i < devices.size(); ++i) {
+            if (devices[i].index() == index) {
+                return static_cast<int>(i);
+            }
+        }
+
+        return -1;
+    };
+
+    std::vector<CudaLaunchData> out;
+    const auto &threads = m_threads.get(algorithm);
+
+    if (threads.isEmpty()) {
+        return out;
+    }
+
+    out.reserve(threads.count());
+
+    for (const auto &thread : threads.data()) {
+        const int index = deviceIndex(thread.index());
+        if (index == -1) {
+            LOG_INFO("%s" YELLOW(" skip non-existing device with index ") YELLOW_BOLD("%u"), cuda_tag(), thread.index());
+            continue;
+        }
+
+        out.emplace_back(miner, algorithm, thread, devices[static_cast<size_t>(index)]);
+    }
+
+    return out;
+}
+
+
+void xmrig::CudaConfig::read(const rapidjson::Value &value)
+{
+    if (value.IsObject()) {
+        m_enabled   = Json::getBool(value, kEnabled, m_enabled);
+        m_loader    = Json::getString(value, kLoader);
+
+        setDevicesHint(Json::getString(value, kDevicesHint));
+
+#       ifdef XMRIG_FEATURE_NVML
+        auto &nvml = Json::getValue(value, kNvml);
+        if (nvml.IsString()) {
+            m_nvmlLoader = nvml.GetString();
+        }
+        else if (nvml.IsBool()) {
+            m_nvml = nvml.GetBool();
+        }
+#       endif
+
+        m_threads.read(value);
+
+        generate();
+    }
+    else if (value.IsBool()) {
+        m_enabled = value.GetBool();
+
+        generate();
+    }
+    else {
+        m_shouldSave = true;
+
+        generate();
+    }
+}
+
+
+void xmrig::CudaConfig::generate()
+{
+    if (generated) {
+        return;
+    }
+
+    if (!isEnabled() || m_threads.has("*")) {
+        return;
+    }
+
+    if (!CudaLib::init(loader())) {
+        return;
+    }
+
+    if (!CudaLib::runtimeVersion() || !CudaLib::driverVersion() || !CudaLib::deviceCount()) {
+        return;
+    }
+
+    const auto devices = CudaLib::devices(bfactor(), bsleep(), m_devicesHint);
+    if (devices.empty()) {
+        return;
+    }
+
+    size_t count = 0;
+
+    count += xmrig::generate<Algorithm::CN>(m_threads, devices);
+    count += xmrig::generate<Algorithm::CN_LITE>(m_threads, devices);
+    count += xmrig::generate<Algorithm::CN_HEAVY>(m_threads, devices);
+    count += xmrig::generate<Algorithm::CN_PICO>(m_threads, devices);
+    count += xmrig::generate<Algorithm::RANDOM_X>(m_threads, devices);
+
+    generated    = true;
+    m_shouldSave = count > 0;
+}
+
+
+void xmrig::CudaConfig::setDevicesHint(const char *devicesHint)
+{
+    if (devicesHint == nullptr) {
+        return;
+    }
+
+    const auto indexes = String(devicesHint).split(',');
+    m_devicesHint.reserve(indexes.size());
+
+    for (const auto &index : indexes) {
+        m_devicesHint.push_back(static_cast<uint32_t>(strtoul(index, nullptr, 10)));
+    }
+}
--- a/src/backend/cuda/CudaConfig.h
+++ b/src/backend/cuda/CudaConfig.h
@ -0,0 +1,87 @@
+/* XMRig
+ * Copyright 2010      Jeff Garzik <jgarzik@pobox.com>
+ * Copyright 2012-2014 pooler      <pooler@litecoinpool.org>
+ * Copyright 2014      Lucas Jones <https://github.com/lucasjones>
+ * Copyright 2014-2016 Wolf9466    <https://github.com/OhGodAPet>
+ * Copyright 2016      Jay D Dee   <jayddee246@gmail.com>
+ * Copyright 2017-2018 XMR-Stak    <https://github.com/fireice-uk>, <https://github.com/psychocrypt>
+ * Copyright 2018-2019 SChernykh   <https://github.com/SChernykh>
+ * Copyright 2016-2019 XMRig       <https://github.com/xmrig>, <support@xmrig.com>
+ *
+ *   This program is free software: you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation, either version 3 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef XMRIG_CUDACONFIG_H
+#define XMRIG_CUDACONFIG_H
+
+
+#include "backend/cuda/CudaLaunchData.h"
+#include "backend/common/Threads.h"
+#include "backend/cuda/CudaThreads.h"
+
+
+namespace xmrig {
+
+
+class CudaConfig
+{
+public:
+    CudaConfig() = default;
+
+    rapidjson::Value toJSON(rapidjson::Document &doc) const;
+    std::vector<CudaLaunchData> get(const Miner *miner, const Algorithm &algorithm, const std::vector<CudaDevice> &devices) const;
+    void read(const rapidjson::Value &value);
+
+    inline bool isEnabled() const                               { return m_enabled; }
+    inline bool isShouldSave() const                            { return m_shouldSave; }
+    inline const std::vector<uint32_t> &devicesHint() const     { return m_devicesHint; }
+    inline const String &loader() const                         { return m_loader; }
+    inline const Threads<CudaThreads> &threads() const          { return m_threads; }
+    inline int32_t bfactor() const                              { return m_bfactor; }
+    inline int32_t bsleep() const                               { return m_bsleep; }
+
+#   ifdef XMRIG_FEATURE_NVML
+    inline bool isNvmlEnabled() const                           { return m_nvml; }
+    inline const String &nvmlLoader() const                     { return m_nvmlLoader; }
+#   endif
+
+private:
+    void generate();
+    void setDevicesHint(const char *devicesHint);
+
+    bool m_enabled          = false;
+    bool m_shouldSave       = false;
+    std::vector<uint32_t> m_devicesHint;
+    String m_loader;
+    Threads<CudaThreads> m_threads;
+
+#   ifdef _WIN32
+    int32_t m_bfactor      = 6;
+    int32_t m_bsleep       = 25;
+#   else
+    int32_t m_bfactor      = 0;
+    int32_t m_bsleep       = 0;
+#   endif
+
+#   ifdef XMRIG_FEATURE_NVML
+    bool m_nvml            = true;
+    String m_nvmlLoader;
+#   endif
+};
+
+
+} /* namespace xmrig */
+
+
+#endif /* XMRIG_CUDACONFIG_H */
--- a/src/backend/cuda/CudaConfig_gen.h
+++ b/src/backend/cuda/CudaConfig_gen.h
@ -0,0 +1,137 @@
+/* XMRig
+ * Copyright 2010      Jeff Garzik <jgarzik@pobox.com>
+ * Copyright 2012-2014 pooler      <pooler@litecoinpool.org>
+ * Copyright 2014      Lucas Jones <https://github.com/lucasjones>
+ * Copyright 2014-2016 Wolf9466    <https://github.com/OhGodAPet>
+ * Copyright 2016      Jay D Dee   <jayddee246@gmail.com>
+ * Copyright 2017-2018 XMR-Stak    <https://github.com/fireice-uk>, <https://github.com/psychocrypt>
+ * Copyright 2018-2019 SChernykh   <https://github.com/SChernykh>
+ * Copyright 2016-2019 XMRig       <https://github.com/xmrig>, <support@xmrig.com>
+ *
+ *   This program is free software: you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation, either version 3 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef XMRIG_CUDACONFIG_GEN_H
+#define XMRIG_CUDACONFIG_GEN_H
+
+
+#include "backend/common/Threads.h"
+#include "backend/cuda/CudaThreads.h"
+#include "backend/cuda/wrappers/CudaDevice.h"
+
+
+#include <algorithm>
+
+
+namespace xmrig {
+
+
+static inline size_t generate(const char *key, Threads<CudaThreads> &threads, const Algorithm &algorithm, const std::vector<CudaDevice> &devices)
+{
+    if (threads.isExist(algorithm) || threads.has(key)) {
+        return 0;
+    }
+
+    return threads.move(key, CudaThreads(devices, algorithm));
+}
+
+
+template<Algorithm::Family FAMILY>
+static inline size_t generate(Threads<CudaThreads> &, const std::vector<CudaDevice> &) { return 0; }
+
+
+template<>
+size_t inline generate<Algorithm::CN>(Threads<CudaThreads> &threads, const std::vector<CudaDevice> &devices)
+{
+    size_t count = 0;
+
+    count += generate("cn", threads, Algorithm::CN_1, devices);
+    count += generate("cn/2", threads, Algorithm::CN_2, devices);
+
+    if (!threads.isExist(Algorithm::CN_0)) {
+        threads.disable(Algorithm::CN_0);
+        ++count;
+    }
+
+#   ifdef XMRIG_ALGO_CN_GPU
+    count += generate("cn/gpu", threads, Algorithm::CN_GPU, devices);
+#   endif
+
+    return count;
+}
+
+
+#ifdef XMRIG_ALGO_CN_LITE
+template<>
+size_t inline generate<Algorithm::CN_LITE>(Threads<CudaThreads> &threads, const std::vector<CudaDevice> &devices)
+{
+    size_t count = generate("cn-lite", threads, Algorithm::CN_LITE_1, devices);
+
+    if (!threads.isExist(Algorithm::CN_LITE_0)) {
+        threads.disable(Algorithm::CN_LITE_0);
+        ++count;
+    }
+
+    return count;
+}
+#endif
+
+
+#ifdef XMRIG_ALGO_CN_HEAVY
+template<>
+size_t inline generate<Algorithm::CN_HEAVY>(Threads<CudaThreads> &threads, const std::vector<CudaDevice> &devices)
+{
+    return generate("cn-heavy", threads, Algorithm::CN_HEAVY_0, devices);
+}
+#endif
+
+
+#ifdef XMRIG_ALGO_CN_PICO
+template<>
+size_t inline generate<Algorithm::CN_PICO>(Threads<CudaThreads> &threads, const std::vector<CudaDevice> &devices)
+{
+    return generate("cn-pico", threads, Algorithm::CN_PICO_0, devices);
+}
+#endif
+
+
+#ifdef XMRIG_ALGO_RANDOMX
+template<>
+size_t inline generate<Algorithm::RANDOM_X>(Threads<CudaThreads> &threads, const std::vector<CudaDevice> &devices)
+{
+    size_t count = 0;
+
+    auto rx  = CudaThreads(devices, Algorithm::RX_0);
+    auto wow = CudaThreads(devices, Algorithm::RX_WOW);
+    auto arq = CudaThreads(devices, Algorithm::RX_ARQ);
+
+    if (!threads.isExist(Algorithm::RX_WOW) && wow != rx) {
+        count += threads.move("rx/wow", std::move(wow));
+    }
+
+    if (!threads.isExist(Algorithm::RX_ARQ) && arq != rx) {
+        count += threads.move("rx/arq", std::move(arq));
+    }
+
+    count += threads.move("rx", std::move(rx));
+
+    return count;
+}
+#endif
+
+
+} /* namespace xmrig */
+
+
+#endif /* XMRIG_CUDACONFIG_GEN_H */
--- a/src/backend/cuda/CudaLaunchData.cpp
+++ b/src/backend/cuda/CudaLaunchData.cpp
@ -0,0 +1,51 @@
+/* XMRig
+ * Copyright 2010      Jeff Garzik <jgarzik@pobox.com>
+ * Copyright 2012-2014 pooler      <pooler@litecoinpool.org>
+ * Copyright 2014      Lucas Jones <https://github.com/lucasjones>
+ * Copyright 2014-2016 Wolf9466    <https://github.com/OhGodAPet>
+ * Copyright 2016      Jay D Dee   <jayddee246@gmail.com>
+ * Copyright 2017-2018 XMR-Stak    <https://github.com/fireice-uk>, <https://github.com/psychocrypt>
+ * Copyright 2018      Lee Clagett <https://github.com/vtnerd>
+ * Copyright 2018-2019 SChernykh   <https://github.com/SChernykh>
+ * Copyright 2016-2019 XMRig       <https://github.com/xmrig>, <support@xmrig.com>
+ *
+ *   This program is free software: you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation, either version 3 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+
+#include "backend/cuda/CudaLaunchData.h"
+#include "backend/common/Tags.h"
+
+
+xmrig::CudaLaunchData::CudaLaunchData(const Miner *miner, const Algorithm &algorithm, const CudaThread &thread, const CudaDevice &device) :
+    algorithm(algorithm),
+    miner(miner),
+    device(device),
+    thread(thread)
+{
+}
+
+
+bool xmrig::CudaLaunchData::isEqual(const CudaLaunchData &other) const
+{
+    return (other.algorithm.family() == algorithm.family() &&
+            other.algorithm.l3()     == algorithm.l3() &&
+            other.thread             == thread);
+}
+
+
+const char *xmrig::CudaLaunchData::tag()
+{
+    return cuda_tag();
+}
--- a/src/backend/cuda/CudaLaunchData.h
+++ b/src/backend/cuda/CudaLaunchData.h
@ -0,0 +1,66 @@
+/* XMRig
+ * Copyright 2010      Jeff Garzik <jgarzik@pobox.com>
+ * Copyright 2012-2014 pooler      <pooler@litecoinpool.org>
+ * Copyright 2014      Lucas Jones <https://github.com/lucasjones>
+ * Copyright 2014-2016 Wolf9466    <https://github.com/OhGodAPet>
+ * Copyright 2016      Jay D Dee   <jayddee246@gmail.com>
+ * Copyright 2017-2018 XMR-Stak    <https://github.com/fireice-uk>, <https://github.com/psychocrypt>
+ * Copyright 2018      Lee Clagett <https://github.com/vtnerd>
+ * Copyright 2018-2019 SChernykh   <https://github.com/SChernykh>
+ * Copyright 2016-2019 XMRig       <https://github.com/xmrig>, <support@xmrig.com>
+ *
+ *   This program is free software: you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation, either version 3 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef XMRIG_CUDALAUNCHDATA_H
+#define XMRIG_CUDALAUNCHDATA_H
+
+
+#include "backend/cuda/CudaThread.h"
+#include "crypto/common/Algorithm.h"
+#include "crypto/common/Nonce.h"
+
+
+namespace xmrig {
+
+
+class CudaDevice;
+class Miner;
+
+
+class CudaLaunchData
+{
+public:
+    CudaLaunchData(const Miner *miner, const Algorithm &algorithm, const CudaThread &thread, const CudaDevice &device);
+
+    bool isEqual(const CudaLaunchData &other) const;
+
+    inline constexpr static Nonce::Backend backend() { return Nonce::CUDA; }
+
+    inline bool operator!=(const CudaLaunchData &other) const    { return !isEqual(other); }
+    inline bool operator==(const CudaLaunchData &other) const    { return isEqual(other); }
+
+    static const char *tag();
+
+    const Algorithm algorithm;
+    const Miner *miner;
+    const CudaDevice &device;
+    const CudaThread thread;
+};
+
+
+} // namespace xmrig
+
+
+#endif /* XMRIG_CUDALAUNCHDATA_H */
--- a/src/backend/cuda/CudaThread.cpp
+++ b/src/backend/cuda/CudaThread.cpp
@ -0,0 +1,113 @@
+/* XMRig
+ * Copyright 2010      Jeff Garzik <jgarzik@pobox.com>
+ * Copyright 2012-2014 pooler      <pooler@litecoinpool.org>
+ * Copyright 2014      Lucas Jones <https://github.com/lucasjones>
+ * Copyright 2014-2016 Wolf9466    <https://github.com/OhGodAPet>
+ * Copyright 2016      Jay D Dee   <jayddee246@gmail.com>
+ * Copyright 2017-2018 XMR-Stak    <https://github.com/fireice-uk>, <https://github.com/psychocrypt>
+ * Copyright 2018-2019 SChernykh   <https://github.com/SChernykh>
+ * Copyright 2016-2019 XMRig       <https://github.com/xmrig>, <support@xmrig.com>
+ *
+ *   This program is free software: you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation, either version 3 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+
+#include "backend/cuda/CudaThread.h"
+#include "backend/cuda/wrappers/CudaLib.h"
+#include "base/io/json/Json.h"
+#include "rapidjson/document.h"
+
+
+#include <algorithm>
+
+
+namespace xmrig {
+
+static const char *kAffinity    = "affinity";
+static const char *kBFactor     = "bfactor";
+static const char *kBlocks      = "blocks";
+static const char *kBSleep      = "bsleep";
+static const char *kIndex       = "index";
+static const char *kThreads     = "threads";
+static const char *kDatasetHost = "dataset_host";
+
+} // namespace xmrig
+
+
+xmrig::CudaThread::CudaThread(const rapidjson::Value &value)
+{
+    if (!value.IsObject()) {
+        return;
+    }
+
+    m_index     = Json::getUint(value, kIndex);
+    m_threads   = Json::getInt(value, kThreads);
+    m_blocks    = Json::getInt(value, kBlocks);
+    m_bfactor   = std::min(Json::getUint(value, kBFactor, m_bfactor), 12u);
+    m_bsleep    = Json::getUint(value, kBSleep, m_bsleep);
+    m_affinity  = Json::getUint64(value, kAffinity, m_affinity);
+
+    if (Json::getValue(value, kDatasetHost).IsInt()) {
+        m_datasetHost = Json::getInt(value, kDatasetHost, m_datasetHost) != 0;
+    }
+    else {
+        m_datasetHost = Json::getBool(value, kDatasetHost);
+    }
+}
+
+
+xmrig::CudaThread::CudaThread(uint32_t index, nvid_ctx *ctx) :
+    m_blocks(CudaLib::deviceInt(ctx, CudaLib::DeviceBlocks)),
+    m_datasetHost(CudaLib::deviceInt(ctx, CudaLib::DeviceDatasetHost)),
+    m_threads(CudaLib::deviceInt(ctx, CudaLib::DeviceThreads)),
+    m_index(index),
+    m_bfactor(CudaLib::deviceUint(ctx, CudaLib::DeviceBFactor)),
+    m_bsleep(CudaLib::deviceUint(ctx, CudaLib::DeviceBSleep))
+{
+
+}
+
+
+bool xmrig::CudaThread::isEqual(const CudaThread &other) const
+{
+    return m_blocks      == other.m_blocks &&
+           m_threads     == other.m_threads &&
+           m_affinity    == other.m_affinity &&
+           m_index       == other.m_index &&
+           m_bfactor     == other.m_bfactor &&
+           m_bsleep      == other.m_bsleep &&
+           m_datasetHost == other.m_datasetHost;
+}
+
+
+rapidjson::Value xmrig::CudaThread::toJSON(rapidjson::Document &doc) const
+{
+    using namespace rapidjson;
+    auto &allocator = doc.GetAllocator();
+
+    Value out(kObjectType);
+
+    out.AddMember(StringRef(kIndex),        index(), allocator);
+    out.AddMember(StringRef(kThreads),      threads(), allocator);
+    out.AddMember(StringRef(kBlocks),       blocks(), allocator);
+    out.AddMember(StringRef(kBFactor),      bfactor(), allocator);
+    out.AddMember(StringRef(kBSleep),       bsleep(), allocator);
+    out.AddMember(StringRef(kAffinity),     affinity(), allocator);
+
+    if (m_datasetHost >= 0) {
+        out.AddMember(StringRef(kDatasetHost), m_datasetHost > 0, allocator);
+    }
+
+    return out;
+}
--- a/src/backend/cuda/CudaThread.h
+++ b/src/backend/cuda/CudaThread.h
@ -0,0 +1,81 @@
+/* XMRig
+ * Copyright 2010      Jeff Garzik <jgarzik@pobox.com>
+ * Copyright 2012-2014 pooler      <pooler@litecoinpool.org>
+ * Copyright 2014      Lucas Jones <https://github.com/lucasjones>
+ * Copyright 2014-2016 Wolf9466    <https://github.com/OhGodAPet>
+ * Copyright 2016      Jay D Dee   <jayddee246@gmail.com>
+ * Copyright 2017-2018 XMR-Stak    <https://github.com/fireice-uk>, <https://github.com/psychocrypt>
+ * Copyright 2018-2019 SChernykh   <https://github.com/SChernykh>
+ * Copyright 2016-2019 XMRig       <https://github.com/xmrig>, <support@xmrig.com>
+ *
+ *   This program is free software: you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation, either version 3 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef XMRIG_CUDATHREAD_H
+#define XMRIG_CUDATHREAD_H
+
+
+using nvid_ctx = struct nvid_ctx;
+
+
+#include "crypto/common/Algorithm.h"
+#include "rapidjson/fwd.h"
+
+
+namespace xmrig {
+
+
+class CudaThread
+{
+public:
+    CudaThread() = delete;
+    CudaThread(const rapidjson::Value &value);
+    CudaThread(uint32_t index, nvid_ctx *ctx);
+
+    inline bool isValid() const                              { return m_blocks > 0 && m_threads > 0; }
+    inline int32_t bfactor() const                           { return static_cast<int32_t>(m_bfactor); }
+    inline int32_t blocks() const                            { return m_blocks; }
+    inline int32_t bsleep() const                            { return static_cast<int32_t>(m_bsleep); }
+    inline int32_t datasetHost() const                       { return m_datasetHost; }
+    inline int32_t threads() const                           { return m_threads; }
+    inline int64_t affinity() const                          { return m_affinity; }
+    inline uint32_t index() const                            { return m_index; }
+
+    inline bool operator!=(const CudaThread &other) const    { return !isEqual(other); }
+    inline bool operator==(const CudaThread &other) const    { return isEqual(other); }
+
+    bool isEqual(const CudaThread &other) const;
+    rapidjson::Value toJSON(rapidjson::Document &doc) const;
+
+private:
+    int32_t m_blocks        = 0;
+    int32_t m_datasetHost   = -1;
+    int32_t m_threads       = 0;
+    int64_t m_affinity      = -1;
+    uint32_t m_index        = 0;
+
+#   ifdef _WIN32
+    uint32_t m_bfactor      = 6;
+    uint32_t m_bsleep       = 25;
+#   else
+    uint32_t m_bfactor      = 0;
+    uint32_t m_bsleep       = 0;
+#   endif
+};
+
+
+} /* namespace xmrig */
+
+
+#endif /* XMRIG_CUDATHREAD_H */
--- a/src/backend/cuda/CudaThreads.cpp
+++ b/src/backend/cuda/CudaThreads.cpp
@ -0,0 +1,79 @@
+/* XMRig
+ * Copyright 2010      Jeff Garzik <jgarzik@pobox.com>
+ * Copyright 2012-2014 pooler      <pooler@litecoinpool.org>
+ * Copyright 2014      Lucas Jones <https://github.com/lucasjones>
+ * Copyright 2014-2016 Wolf9466    <https://github.com/OhGodAPet>
+ * Copyright 2016      Jay D Dee   <jayddee246@gmail.com>
+ * Copyright 2017-2018 XMR-Stak    <https://github.com/fireice-uk>, <https://github.com/psychocrypt>
+ * Copyright 2018-2019 SChernykh   <https://github.com/SChernykh>
+ * Copyright 2016-2019 XMRig       <https://github.com/xmrig>, <support@xmrig.com>
+ *
+ *   This program is free software: you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation, either version 3 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+
+#include "backend/cuda/CudaThreads.h"
+#include "base/io/json/Json.h"
+#include "rapidjson/document.h"
+
+
+#include <algorithm>
+
+
+xmrig::CudaThreads::CudaThreads(const rapidjson::Value &value)
+{
+    if (value.IsArray()) {
+        for (auto &v : value.GetArray()) {
+            CudaThread thread(v);
+            if (thread.isValid()) {
+                add(std::move(thread));
+            }
+        }
+    }
+}
+
+
+xmrig::CudaThreads::CudaThreads(const std::vector<CudaDevice> &devices, const Algorithm &algorithm)
+{
+    for (const auto &device : devices) {
+        device.generate(algorithm, *this);
+    }
+}
+
+
+bool xmrig::CudaThreads::isEqual(const CudaThreads &other) const
+{
+    if (isEmpty() && other.isEmpty()) {
+        return true;
+    }
+
+    return count() == other.count() && std::equal(m_data.begin(), m_data.end(), other.m_data.begin());
+}
+
+
+rapidjson::Value xmrig::CudaThreads::toJSON(rapidjson::Document &doc) const
+{
+    using namespace rapidjson;
+    auto &allocator = doc.GetAllocator();
+
+    Value out(kArrayType);
+
+    out.SetArray();
+
+    for (const CudaThread &thread : m_data) {
+        out.PushBack(thread.toJSON(doc), allocator);
+    }
+
+    return out;
+}
--- a/src/backend/cuda/CudaThreads.h
+++ b/src/backend/cuda/CudaThreads.h
@ -0,0 +1,66 @@
+/* XMRig
+ * Copyright 2010      Jeff Garzik <jgarzik@pobox.com>
+ * Copyright 2012-2014 pooler      <pooler@litecoinpool.org>
+ * Copyright 2014      Lucas Jones <https://github.com/lucasjones>
+ * Copyright 2014-2016 Wolf9466    <https://github.com/OhGodAPet>
+ * Copyright 2016      Jay D Dee   <jayddee246@gmail.com>
+ * Copyright 2017-2018 XMR-Stak    <https://github.com/fireice-uk>, <https://github.com/psychocrypt>
+ * Copyright 2018-2019 SChernykh   <https://github.com/SChernykh>
+ * Copyright 2016-2019 XMRig       <https://github.com/xmrig>, <support@xmrig.com>
+ *
+ *   This program is free software: you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation, either version 3 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef XMRIG_CUDATHREADS_H
+#define XMRIG_CUDATHREADS_H
+
+
+#include <vector>
+
+
+#include "backend/cuda/CudaThread.h"
+#include "backend/cuda/wrappers/CudaDevice.h"
+
+
+namespace xmrig {
+
+
+class CudaThreads
+{
+public:
+    CudaThreads() = default;
+    CudaThreads(const rapidjson::Value &value);
+    CudaThreads(const std::vector<CudaDevice> &devices, const Algorithm &algorithm);
+
+    inline bool isEmpty() const                              { return m_data.empty(); }
+    inline const std::vector<CudaThread> &data() const       { return m_data; }
+    inline size_t count() const                              { return m_data.size(); }
+    inline void add(CudaThread &&thread)                     { m_data.push_back(std::move(thread)); }
+    inline void reserve(size_t capacity)                     { m_data.reserve(capacity); }
+
+    inline bool operator!=(const CudaThreads &other) const   { return !isEqual(other); }
+    inline bool operator==(const CudaThreads &other) const   { return isEqual(other); }
+
+    bool isEqual(const CudaThreads &other) const;
+    rapidjson::Value toJSON(rapidjson::Document &doc) const;
+
+private:
+    std::vector<CudaThread> m_data;
+};
+
+
+} /* namespace xmrig */
+
+
+#endif /* XMRIG_CUDATHREADS_H */
--- a/src/backend/cuda/CudaWorker.cpp
+++ b/src/backend/cuda/CudaWorker.cpp
@ -0,0 +1,171 @@
+/* XMRig
+ * Copyright 2010      Jeff Garzik <jgarzik@pobox.com>
+ * Copyright 2012-2014 pooler      <pooler@litecoinpool.org>
+ * Copyright 2014      Lucas Jones <https://github.com/lucasjones>
+ * Copyright 2014-2016 Wolf9466    <https://github.com/OhGodAPet>
+ * Copyright 2016      Jay D Dee   <jayddee246@gmail.com>
+ * Copyright 2017-2018 XMR-Stak    <https://github.com/fireice-uk>, <https://github.com/psychocrypt>
+ * Copyright 2018      Lee Clagett <https://github.com/vtnerd>
+ * Copyright 2018-2019 SChernykh   <https://github.com/SChernykh>
+ * Copyright 2016-2019 XMRig       <https://github.com/xmrig>, <support@xmrig.com>
+ *
+ *   This program is free software: you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation, either version 3 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ *   GNU General Public License for more details.
+ *
+ *   You should have received a copy of the GNU General Public License
+ *   along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+
+#include "backend/cuda/CudaWorker.h"
+#include "backend/common/Tags.h"
+#include "backend/cuda/runners/CudaCnRunner.h"
+#include "base/io/log/Log.h"
+#include "base/tools/Chrono.h"
+#include "core/Miner.h"
+#include "crypto/common/Nonce.h"
+#include "net/JobResults.h"
+
+
+#ifdef XMRIG_ALGO_RANDOMX
+#   include "backend/cuda/runners/CudaRxRunner.h"
+#endif
+
+
+#include <cassert>
+#include <thread>
+
+
+namespace xmrig {
+
+
+static constexpr uint32_t kReserveCount = 32768;
+std::atomic<bool> CudaWorker::ready;
+
+
+static inline bool isReady()                         { return !Nonce::isPaused() && CudaWorker::ready; }
+static inline uint32_t roundSize(uint32_t intensity) { return kReserveCount / intensity + 1; }
+
+
+} // namespace xmrig
+
+
+
+xmrig::CudaWorker::CudaWorker(size_t id, const CudaLaunchData &data) :
+    Worker(id, data.thread.affinity(), -1),
+    m_algorithm(data.algorithm),
+    m_miner(data.miner)
+{
+    switch (m_algorithm.family()) {
+    case Algorithm::RANDOM_X:
+#       ifdef XMRIG_ALGO_RANDOMX
+        m_runner = new CudaRxRunner(id, data);
+#       endif
+        break;
+
+    case Algorithm::ARGON2:
+        break;
+
+    default:
+        m_runner = new CudaCnRunner(id, data);
+        break;
+    }
+
+    if (m_runner && !m_runner->init()) {
+        delete m_runner; m_runner = nullptr; // failed init: let selfTest() report it
+    }
+}
+
+
+xmrig::CudaWorker::~CudaWorker()
+{
+    delete m_runner;
+}
+
+
+bool xmrig::CudaWorker::selfTest()
+{
+    return m_runner != nullptr;
+}
+
+
+size_t xmrig::CudaWorker::intensity() const
+{
+    return m_runner ? m_runner->intensity() : 0;
+}
+
+
+void xmrig::CudaWorker::start()
+{
+    while (Nonce::sequence(Nonce::CUDA) > 0) {
+        if (!isReady()) {
+            do {
+                std::this_thread::sleep_for(std::chrono::milliseconds(200));
+            }
+            while (!isReady() && Nonce::sequence(Nonce::CUDA) > 0);
+
+            if (Nonce::sequence(Nonce::CUDA) == 0) {
+                break;
+            }
+
+            if (!consumeJob()) {
+                return;
+            }
+        }
+
+        while (!Nonce::isOutdated(Nonce::CUDA, m_job.sequence())) {
+            uint32_t foundNonce[10] = { 0 };
+            uint32_t foundCount     = 0;
+
+            if (!m_runner->run(*m_job.nonce(), &foundCount, foundNonce)) {
+                return;
+            }
+
+            if (foundCount) {
+                JobResults::submit(m_job.currentJob(), foundNonce, foundCount);
+            }
+
+            const size_t batch_size = intensity();
+            m_job.nextRound(roundSize(batch_size), batch_size);
+
+            storeStats();
+            std::this_thread::yield();
+        }
+
+        if (!consumeJob()) {
+            return;
+        }
+    }
+}
+
+
+bool xmrig::CudaWorker::consumeJob()
+{
+    if (Nonce::sequence(Nonce::CUDA) == 0) {
+        return false;
+    }
+
+    const size_t batch_size = intensity();
+    m_job.add(m_miner->job(), roundSize(batch_size) * batch_size, Nonce::CUDA);
+
+    return m_runner->set(m_job.currentJob(), m_job.blob());
+}
+
+
+void xmrig::CudaWorker::storeStats()
+{
+    if (!isReady()) {
+        return;
+    }
+
+    m_count += intensity();
+
+    Worker::storeStats();
+}