Commit graph

559 commits

Author SHA1 Message Date
SChernykh
9768bf65d1 RandomX improved performance of GCC compiled binaries
JIT compilator was slower compared to MSVC compiled binary. Up to +0.1% speedup on rx/wow in Linux.
2020-09-22 13:48:11 +02:00
SChernykh
891a46382e RandomX: AES improvements
- A bit faster hardware AES code when compiled with MSVC
- More reliable software AES benchmark
2020-09-21 17:51:08 +02:00
SChernykh
c7476e076b RandomX refactoring, moved more stuff to compile time
Small x86 JIT compiler speedup.
2020-09-18 20:51:25 +02:00
SChernykh
8d1168385a RandomX: returned old soft AES impl and auto-select between the two 2020-09-15 20:48:27 +02:00
cohcho
30be1cd102 reserve at most 1 bit for wrapping detection 2020-09-13 18:42:16 +00:00
SChernykh
a05393727c RandomX: added performance profiler (for developers)
Also optimized Blake2b SSE4.1 code size to avoid code cache pollution.
2020-09-12 23:07:52 +02:00
xmrig
adf833b60a
Merge pull request #1827 from cohcho/nonce_iteration_without_tests
nonce iteration optimization
2020-09-10 19:33:23 +07:00
SChernykh
4a9db89527 RandomX: added SSE4.1-optimized Blake2b
+0.15% on `rx/0`
+0.3% on `rx/wow`
2020-09-10 14:28:40 +02:00
cohcho
060c1af4c4 fix nonce mask 2020-09-09 19:39:52 +00:00
cohcho
b826985d05 nonce iteration optimization
efficient and correct nonce iteration without duplicates
2020-09-09 10:03:37 +00:00
SChernykh
a84b45b1bb RandomX: added parameter for scratchpad prefetch mode
`scratchpad_prefetch_mode` can have 4 values:
0: off
1: use `prefetcht0` instruction (default, same as previous XMRig versions)
2: use `prefetchnta` instruction (faster on Coffee Lake and a few other CPUs)
3: use `mov` instruction
2020-09-04 16:16:07 +02:00
XMRig
72c8404d18
Fix compile warnings. 2020-08-24 10:04:46 +07:00
XMRig
879e160ba3
Fix compile warning. 2020-08-23 14:22:08 +07:00
XMRig
3e4bf8cd6c
Fix compile warning 2020-08-17 06:08:14 +07:00
XMRig
00b4ae9c36
Fixed compile warning and updated build.uv.sh. 2020-08-16 16:03:27 +07:00
SChernykh
5926dee354 RandomX JIT: optimized address mask calculation 2020-08-12 16:45:16 +02:00
XMRig
ae3ff0f570
Fixed RandomX cache initialization if 1GB pages fails to allocate on a first NUMA node. 2020-08-01 12:30:02 +07:00
SChernykh
abb78302b8 Try to allocate scratchpad from dataset's 1 GB huge pages, if normal huge pages are not available 2020-07-31 13:37:22 +02:00
SChernykh
838cc08680 Force 2 MB pages size in allocateLargePagesMemory() on Linux 2020-07-31 09:55:49 +02:00
XMRig
1acd88ed39
Cleanup 2020-07-22 21:27:40 +07:00
SChernykh
5bc89fdc8b Fixed RandomX initialization for VS debug builds 2020-07-21 10:10:07 +02:00
XMRig
70c7f33a20
Added command line options --cache-qos (--randomx-cache-qos) and --argon2-impl (--cpu-argon2-impl). 2020-07-20 09:17:59 +07:00
XMRig
e0eed7d5d6
Fixed build without MSR support. 2020-07-16 05:15:35 +07:00
SChernykh
1bf159d1e8 Removed cache QoS warning at exit on unsupported CPUs 2020-07-13 20:43:49 +02:00
SChernykh
72c385c870 Cache QoS: fix for seting MSR 2020-07-13 20:30:44 +02:00
SChernykh
c83429c55c RandomX: added cache QoS support
False by default. If set to true, all non-mining CPU cores will not have access to L3 cache.
2020-07-13 17:23:18 +02:00
Jim Huang
b665d2d865 Adopt new SSE2NEON and reduce ARM-specific changes
This patch updated SSE2NEON [1], which contains more functions
provided by Intel intrinsics, only implemented with NEON-based
counterparts to produce the exact semantics of the intrinsics.
Consequently, ARM-specific changes against CryptoNight_arm can
be reduced as well.

[1] https://github.com/DLTcollab/sse2neon/
2020-07-11 01:55:11 +08:00
SChernykh
3d740e81a2 RandomX: tweaked Ryzen code
Very small speedup
2020-07-05 16:06:59 +02:00
SChernykh
59313d9cc3 Print error message when MSR mod fails
Make sure user knows that hashrate is worse than it could be.
2020-06-26 19:54:06 +02:00
SChernykh
5724d8beb6 KawPow: optimized CPU share verification
- 2 times faster CPU share verification (11 -> 5 ms)
- 1.5 times faster light cache initialization
2020-06-26 12:31:26 +02:00
SChernykh
dc0aee1432 KawPow: fixed crash on old CPUs
- Use `popcnt` instruction only when it's supported
2020-06-10 21:49:43 +02:00
XMRig
dbc8e20e53
Merge branch 'dev' into evo 2020-06-07 21:25:31 +07:00
SChernykh
75c57f7563 Fixed GCC 10.1 issues
- Fixed uninitialized `state->x` warning
- Fixed broken code with `-O3` or `-Ofast`
2020-06-07 16:23:17 +02:00
XMRig
5e1199ea48
Merge branch 'dev' into evo 2020-06-07 20:15:12 +07:00
Matt Smith
a28bddcbdf Stop linker from making stack executable
Add .note.GNU-stack section to end of AstroBWT ASM.

Signed-off-by: Matt Smith <matt@offtopica.uk>
2020-06-07 13:57:37 +01:00
SChernykh
7f00cb59d2 Conceal (CCX) support 2020-06-07 01:01:45 +02:00
XMRig
f18bfeb77d
Merge branch 'evo' of https://github.com/SChernykh/xmrig into pr1713 2020-06-05 19:17:01 +07:00
SChernykh
0dbf41f761 Reduced memory for KawPow 2020-06-05 14:01:49 +02:00
SChernykh
2e3d087750 Merge remote-tracking branch 'upstream/evo' into evo 2020-05-28 22:06:10 +02:00
SChernykh
6676126376 Fixed hashrate and diff display for KawPow 2020-05-28 22:03:28 +02:00
XMRig
eb1ed497e7
Log cleanup. 2020-05-29 02:11:29 +07:00
XMRig
7a3233ab4b
Use long tags. 2020-05-28 20:32:41 +07:00
SChernykh
22b937cc1c KawPow WIP 2020-05-27 16:19:57 +02:00
XMRig
0a7324f500
Merge branch 'dev' 2020-05-23 11:08:53 +07:00
Bohan Yu
a797d808b5 Change cases of Windows include file and link library
When cross-compiling on case sensitive systems, such as Linux, there will be an Error.
2020-05-13 21:00:52 +08:00
XMRig
2e34bf7a1b
Removed unnecessary check. 2020-05-09 01:36:57 +07:00
XMRig
7f31f45b6d
Fix build. 2020-05-09 01:26:05 +07:00
XMRig
3cbf0dc0ee
Removed code duplicate. 2020-05-09 01:13:46 +07:00
XMRig
c828e6b793
Code cleanup. 2020-05-05 01:55:00 +07:00
XMRig
b34e3e1a7b
Remove unused code. 2020-05-04 02:07:38 +07:00
SChernykh
80d944bf82 Optimized RandomX dataset initialization
- Use single Argon2 implemenation
- Auto-select the fastest Argon2 implementation for RandomX
2020-05-03 20:44:59 +02:00
XMRig
c18478a6b4
Small cleanups. 2020-05-03 13:38:34 +07:00
XMRig
8aeba61706
Add 3rdparty prefix to all rapidjson includes. 2020-04-29 14:55:04 +07:00
XMRig
0cc90b152d
Move CnAlgo 2020-04-23 12:34:26 +07:00
SChernykh
bfd017d064 Refactored CFROUND 2020-04-21 15:44:04 +02:00
SChernykh
680e4dd865 Fix code style 2020-04-09 14:31:42 +02:00
SChernykh
abb3340cc7 RandomX JIT refactoring
- Smaller memory footprint
- A bit faster overall
2020-04-09 14:24:54 +02:00
SChernykh
92810ad761 Fixed VM destruction 2020-04-08 08:31:53 +02:00
SChernykh
39bd3ca1da Fix off-by-one error 2020-04-07 18:53:08 +02:00
SChernykh
4d0edde66d Fixed pool lock 2020-04-07 18:48:02 +02:00
SChernykh
69cbfd682a Use node number instead of affinity 2020-04-07 18:46:22 +02:00
SChernykh
6ae37a9519 Pooled allocation of RandomX VMs
+0.5% speedup on Zen2 when the whole L3 cache is used (16 threads on 3700X/3800X, 32 threads on 3950X).
2020-04-07 18:31:35 +02:00
SChernykh
539943c655 Fix MacOS compilation 2020-03-11 16:35:52 +01:00
SChernykh
e22f798085 AVX2 optimized code for AstroBWT
Added "astrobwt-avx2" parameter in config.json, it's turned off ("false") by default.

4-5% speedup on CPUs with proper AVX2 support (AMD Ryzen starting with Zen2, Intel Core starting with Haswell).

There will be no speedup on the following CPUs:

- Intel Pentium/Celeron don't support AVX2
- AMD Zen/Zen+ have only half-speed AVX

GCC compiled version is faster without AVX2, MSVC compiled version is faster with AVX2
2020-03-10 22:35:14 +01:00
SChernykh
9405d8ed92 Activate MSR mod only for RandomX algorithms 2020-03-09 19:10:26 +01:00
XMRig
16a83a9f61
Move files. 2020-03-09 01:22:34 +07:00
SChernykh
b7840d9ab6 Fixed invalid AstroBWT hashes after algo switching 2020-03-07 16:41:33 +01:00
XMRig
13ac54ada9
v5.9.0-dev 2020-03-07 21:27:55 +07:00
XMRig
1f36ea2a8e
Added "coin": "keva" and post PR cleanup. 2020-03-07 20:38:44 +07:00
XMRig
ab90af37b3
Merge branch 'master' of https://github.com/kevacoin-project/xmrig into feature-rx-keva 2020-03-07 17:13:08 +07:00
SChernykh
05dc9821c5 Fixed compilation withut randomx/argon2 2020-03-06 07:22:57 +01:00
SChernykh
eeadea53e2 AstroBWT 20-50% speedup
Skips hashes with large stage 2 size. Added configurable `astrobwt-max-size` parameter, default value is 550, min 400, max 1200, optimal value ranges from 500 to 600 depending on CPU.

- Intel CPUs get 20-25% speedup
- 1st- and 2nd-gen Ryzens get 30% speedup
- 3rd-gen Ryzens get up to 50% speedup
2020-03-05 12:20:21 +01:00
kevacoin
0528ccd01e Added Keva. 2020-03-04 16:23:33 -08:00
XMRig
8dc87576c5
Sync changes with proxy. 2020-03-01 14:04:58 +07:00
XMRig
616c52f266
#1572 Fix compile warning. 2020-03-01 11:59:53 +07:00
XMRig
cdd9ea2496
Make "astrobwt" as primary user visible algorithm name. 2020-03-01 10:21:29 +07:00
SChernykh
14ef99ca67 AstroBWT algorithm (DERO) support
To test:

- Download https://github.com/deroproject/derosuite/releases/tag/AstroBWT
- Run daemon with `--testnet` in command line

In config.json:
- "coin":"dero"
- "url":"127.0.0.1:30306"
- "daemon:"true"
2020-02-29 22:41:24 +01:00
SChernykh
131085be80 Optimized CFROUND
Shorter version using BMI2 instructionns
2020-02-21 19:00:58 +01:00
SChernykh
e1b8f52e59 Fixed 32-bit compilation 2020-02-21 16:08:23 +01:00
SChernykh
0caeb41bff Tuned JIT compiler
0.3-0.4% speedup depending on CPU.
2020-02-20 20:59:22 +01:00
XMRig
c307433900
Fixed nicehash nonce overflow for CPU backend. 2020-02-06 17:19:08 +07:00
xmrig
9c8da1d4d3
Merge pull request #1529 from SChernykh/dev
Crash fix for Bullodzer CPUs
2020-02-02 23:19:49 +07:00
SChernykh
ffc9f67751 Crash fix for Bullodzer CPUs 2020-02-02 17:16:59 +01:00
XMRig
030d6e5962
Update year. 2020-02-01 20:24:00 +07:00
SChernykh
4571899664 Removed MSR mod for Bulldozer
It turned out to be useless: https://www.reddit.com/r/MoneroMining/comments/et7s7w/psa_amd_opteronfxa6a8a10_owners_needed_to_test/
2020-01-27 09:39:39 +01:00
SChernykh
cd763be05b Fix compile error 2020-01-24 14:09:07 +01:00
SChernykh
42a7194e93 Fix crash on Linux 2020-01-24 13:34:12 +01:00
SChernykh
9f1753cc4f Optimized CFROUND 2020-01-22 20:11:00 +01:00
SChernykh
d342968211 Added support for BMI2 instructions 2020-01-21 19:44:56 +01:00
SChernykh
f80177cbd3 Optimizations for AMD Bulldozer
- Added support for XOP instructions
- Enabled Ryzen code for Bulldozer because it's faster there too
2020-01-15 13:04:26 +01:00
SChernykh
665e43fecc MSR preset for Bulldozer CPUs
Also fixed verbose output for MSR presets with masks.
2020-01-14 19:27:34 +01:00
SChernykh
73722ce186 JIT compiler: removed unnecessary memcpy from generateProgram() 2020-01-13 18:00:41 +01:00
SChernykh
869209389e Update MSR preset for Intel
As per https://github.com/xmrig/xmrig/issues/1433#issuecomment-572126184
2020-01-09 08:10:36 +01:00
SChernykh
eb20dfbc94 JIT compiler tweaks 2020-01-06 13:57:48 +01:00
XMRig
88ff807700
Fix compile warnings. 2020-01-03 19:11:48 +07:00
SChernykh
c9f90e6770 Refactor Ryzen fix to fix compilation issues 2019-12-31 11:55:07 +02:00
XMRig
a5b0bc04cc
Add "cn/ultra" alias for tlo-pool.raasu.org pool. 2019-12-29 15:36:05 +07:00
XMRig
402c44b547
Added "cn-pico/tlo". 2019-12-29 00:29:19 +07:00
XMRig
ac4086b273
Fix build. 2019-12-28 02:00:08 +07:00
XMRig
f00769f758
Code style cleanup. 2019-12-28 01:45:54 +07:00
SChernykh
3a2941b719 Fix for 1st-gen Ryzen crashes 2019-12-27 12:40:38 +02:00
XMRig
dbb721cb5e
Removed "rx/v" algorithm. 2019-12-26 22:34:19 +07:00
XMRig
22eca8e0d5
Fixed memory allocation checks. 2019-12-25 04:39:21 +07:00
XMRig
449617d717
Allow use old CUDA plugin. 2019-12-20 21:10:13 +07:00
XMRig
049caabdae
Add missing algorithm name alias. 2019-12-20 04:08:47 +07:00
XMRig
2911bb3a81
Fix OpenCL. 2019-12-20 04:05:09 +07:00
Tony Butler
45412a2ace Add MoneroV (rx/v) algorithm [based on MoneroOcean/master] 2019-12-18 16:17:22 -07:00
XMRig
f4cedd7b63
Fixed MsrItem serialization. 2019-12-19 03:49:32 +07:00
XMRig
3e3d34b3ce
Allow number value for "wrmsr" option only for Intel. 2019-12-19 03:28:05 +07:00
XMRig
12fb27e2cf
Use MsrItem::kNoMask. 2019-12-19 03:20:48 +07:00
SChernykh
c01c035269 Fixed crash with GCC compiler 2019-12-18 17:32:57 +01:00
SChernykh
f85aba5d21 Fixed AVX detection 2019-12-18 12:20:21 +01:00
SChernykh
f8bf8fddd9 Update jit_compiler_x86_static.S 2019-12-18 09:13:21 +01:00
SChernykh
7459677fd5 Add vzeroupper for processors with AVX
To avoid false dependencies on upper 128 bits of YMM registers.
2019-12-18 09:12:25 +01:00
SChernykh
59e8fdb9ed Added bit masks for MSR registers 2019-12-17 23:55:22 +01:00
XMRig
5142a406b0
Less error prone log interface. 2019-12-18 02:20:31 +07:00
XMRig
f8865b1498
Added "verbose" option. 2019-12-17 21:46:11 +07:00
XMRig
969821296f
Merge branch 'feature-custom-msr' into dev 2019-12-17 16:53:28 +07:00
XMRig
a877b1d269
Added save/restore MSR registers on Linux. 2019-12-17 16:17:11 +07:00
XMRig
9cea70b77c
Rename Rx_windows.cpp to Rx_win.cpp. 2019-12-17 15:16:37 +07:00
XMRig
d2d501c821
Added RandomX option "rdmsr" and save/restore MSR registers on Windows. 2019-12-17 14:45:01 +07:00
XMRig
8bef964f68
Added support for write custom MSR. 2019-12-17 02:27:07 +07:00
SChernykh
4da37baf8c RandomSFX (Safex Cash variant) support 2019-12-16 19:36:29 +01:00
XMRig
1d4c8dda96
#1423 Implemented driver reuse. 2019-12-16 03:41:58 +07:00
XMRig
b633b593ad
Strict wrmsr error handling. 2019-12-16 02:45:07 +07:00
XMRig
8dbb83f99b
Revert changes. 2019-12-16 02:17:57 +07:00
SChernykh
2e001677df Use unique service name for WinRing0 driver
To avoid error 1072
2019-12-15 19:28:14 +01:00
XMRig
6adba6dad4
Removed unnecessary check. 2019-12-15 12:02:45 +07:00
XMRig
fb5b873524
Added missing tag. 2019-12-15 01:52:20 +07:00
XMRig
5d0fd2dc8e
Unified Linux/Windows MSR log messages. 2019-12-15 01:32:41 +07:00
SChernykh
222fcfae87 Fixed thread count for MSR mod 2019-12-14 16:30:46 +01:00
SChernykh
2e6523aa10 MSR mod for Windows 2019-12-14 16:04:37 +01:00
XMRig
7ff465053b
Added additional MSR registers for Ryzen CPUs. 2019-12-12 14:21:15 +07:00
XMRig
1c58e28124
Don't build Rx_linux.cpp on ARM. 2019-12-11 21:20:37 +07:00
XMRig
96ee721d21
Fixed MSR. 2019-12-11 20:09:25 +07:00
XMRig
de7ed2b968
Added support for AMD specific MSR registers. 2019-12-11 19:37:13 +07:00
XMRig
4fb3086c1c
Fixed --randomx-wrmsr option without parameters. 2019-12-11 19:16:01 +07:00
XMRig
96cfdda9a1
Added RandomX option "wrmsr" with command line equivalent --randomx-wrmsr=N. 2019-12-10 23:57:29 +07:00
SChernykh
ef522f6404 Update jit_compiler_x86_static.S 2019-12-09 20:30:37 +01:00
SChernykh
763691fa4b More optimizations for Ryzen 2019-12-09 20:29:05 +01:00
SChernykh
9bc13813ba Fixed assembly selection for RandomX when it's on Auto 2019-12-09 18:59:49 +01:00
XMRig
3edaebb4cf
Move "1gb-pages" option to "randomx" object. 2019-12-09 21:42:40 +07:00
XMRig
d32df84ca5
Memory allocation refactoring. 2019-12-08 23:17:39 +07:00
SChernykh
028b335bac Fix GCC compilation 2019-12-08 16:51:37 +01:00
SChernykh
ffec421408 Fixed indentation 2019-12-08 16:20:46 +01:00
SChernykh
d0df824599 Optimized dataset read for Ryzen CPUs
Removed register dependency in dataset read, +0.8% speedup on average.
2019-12-08 16:14:02 +01:00
XMRig
86e25a13e3
New summary information about 1GB pages. 2019-12-08 14:21:28 +07:00
XMRig
e9e747f0d1
#1385 "max-threads-hint" option now also limit RandomX dataset initialization threads. 2019-12-07 22:18:06 +07:00
XMRig
3a75f39935
#1386 Added priority for RandomX dataset initialization threads. 2019-12-06 22:17:04 +07:00
SChernykh
e3422979d1 Fixed compilation on systems without 1GB pages support 2019-12-06 13:55:33 +01:00