Commit graph

668 commits

Author SHA1 Message Date
SChernykh
ff82ca57f2 RandomX: rewrote dataset read code
Unified code for AMD and Intel
1% faster on Intel
0.15% faster on AMD Ryzen
2021-05-20 12:45:42 +02:00
SChernykh
d443dd86f1 RandomX: added BMI2 version for scratchpad prefetch
Saves 1 instruction and 1 byte in the main loop.
2021-05-19 17:52:16 +02:00
SChernykh
9b1f020a8b Enabled IMUL_RCP optimization for light mode mining
Better fix for #2377
2021-05-17 11:26:40 +02:00
SChernykh
29cb416107 Fixed broken light mode mining on x86 2021-05-15 21:41:39 +02:00
SChernykh
dbda2e9ccd Update sse2neon.h 2021-05-03 18:08:59 +02:00
SChernykh
b7adb34c37 Fixed Zen3 asm for cn/upx2
- Invalid rounding mode was used which caused rejected shares sometimes
- Also optimized CN implode/explode functions a bit.
2021-04-21 13:22:25 +02:00
SChernykh
3477f9fbc1 RandomX: optimized IMUL_RCP instruction
+0.4% on AMD Zen2
+0.3% on AMD Zen3
+0.1% on Intel SandyBridge
+0.3% on rx/wow on Intel SandyBridge
2021-04-19 17:43:58 +02:00
SChernykh
69186f2470 Optimized cn/upx for Zen3
0.9% faster
2021-04-19 12:29:44 +02:00
SChernykh
730d4a6cee Fix dvision by zero check in percent() 2021-04-19 12:05:07 +02:00
SChernykh
54bc91d5e3 Fixed rounding mode after running cn/upx 2021-04-19 12:02:57 +02:00
SChernykh
16fe462cad Optimized cn/upx2 for Ryzen CPUs 2021-04-17 18:18:26 +02:00
SChernykh
ed456b02cf Update CnHash.cpp 2021-04-17 15:06:31 +02:00
SChernykh
da7f5826cb Added support for Uplexa (cn/upx2 algorithm) 2021-04-17 14:53:42 +02:00
SChernykh
1741354498 Fixed cn-heavy for GCC-8 2021-04-04 10:18:27 +02:00
SChernykh
59c85eaf6a Fixed compilation for ARM 2021-04-03 17:50:52 +02:00
xmrig
864233c110
Merge pull request #2228 from esrrhs/dev
remove useless v4_random_math_init if algo is not cn/r
2021-04-02 15:49:53 +07:00
SChernykh
ec608bbd05 Don't use RandomX JIT if WITH_ASM=OFF
Because RandomX JIT use asm code
2021-04-02 10:05:46 +02:00
esrrhs
ec2793bcc9 remove useless v4_random_math_init if algo is not cn/r 2021-04-02 14:59:09 +08:00
SChernykh
dc70893e6b Optimize cn-heavy in GCC builds
+0.7% in GCC builds, but GCC is still slower than MSVC on cn-heavy.
2021-03-28 16:12:09 +02:00
SChernykh
bcfd9edaa5 Optimized cn-heavy
- Remove unnecessary type conversion when doing `idx0 = d ^ q;`
- Saves 1 CPU cycle in the main loop
- 0.2% speedup on Ryzen 5 5600X, results on other CPUs may vary
2021-03-27 22:21:01 +01:00
SChernykh
2876f17f65 Fix vld1q_u8_x4 compilation error with GCC 10.2 2021-03-12 16:26:02 +01:00
TheGreatMcPain
ba3299b61b
Update sse2neon.h to the latest master. Fixes build on armv7.
A few days after this header was introduced. Upstream updated it with
armv7 versions of `_mm_aesenc_si128` which allows xmrig to build
on armv7.
2021-03-02 01:33:25 -06:00
SChernykh
e8a99809b6 Fixed crash when GPU mining cn-heavy on Zen3 system 2021-02-18 14:49:37 +01:00
SChernykh
dc1443f3b8 Cryptonight: add prefetching to interleaved mode 2021-02-07 23:29:54 +01:00
SChernykh
8af8df25aa Optimized cn-heavy for Zen3
- Uses scratchpad interleaving to access only the closest L3 slice from each CPU core.
- Also activates MSR mod for cn-heavy because CPU prefetchers get confused with interleaving
- 7-8% speedup on Zen3
2021-02-07 22:05:11 +01:00
SChernykh
21abbe4e84 Fix compile error in Termux 2021-02-03 19:05:05 +01:00
XMRig
2c8d8ee2ab
Fixed macOS build and compile warning. 2021-02-02 13:53:45 +07:00
SChernykh
346892e170 Update jit_compiler_a64.cpp 2021-02-01 22:52:02 +01:00
SChernykh
db03573804 ARM JIT: added missing cache flush 2021-02-01 22:42:35 +01:00
SChernykh
e74573f81f Fixed code allocation for ARM 2021-02-01 22:36:11 +01:00
xmrig
0e70974d7d
Merge pull request #2076 from xmrig/feature-flexible-hugepages
Added support for flexible huge page sizes on Linux.
2021-02-02 04:07:41 +07:00
SChernykh
4108428872 Fixed crashes on ARM 2021-02-01 17:07:45 +01:00
XMRig
09624c4f9b
Added support for flexible huge page sizes on Linux. 2021-01-31 23:38:57 +07:00
xmrig
5999dccd57
Merge pull request #2058 from SChernykh/dev
RandomX JIT x86: remove unnecessary instructions
2021-01-24 13:59:56 +07:00
SChernykh
78922a0772 RandomX JIT x86: remove unnecessary instructions
Adopted from https://github.com/tevador/RandomX/pull/201
2021-01-23 22:28:50 +01:00
XMRig
672f6df6c1
Fixed Cache QoS restore on exit where it not supported. 2021-01-24 02:23:27 +07:00
XMRig
9dae559b73
Added RxMsr class. 2021-01-23 23:23:39 +07:00
XMRig
b9d813c403
Move Ryzen related fixes to RxFix class. 2021-01-23 00:27:56 +07:00
XMRig
c48e2e6af8
Added new class Msr. 2021-01-22 23:50:25 +07:00
XMRig
3730bcd434
Merge branch 'master' into feature-msr2 2021-01-22 16:55:57 +07:00
XMRig
ea367da064
#2043 Fix compile warning. 2021-01-17 17:48:35 +07:00
Richard Mitsuk Lavitt
590252bd5e fixed grammar in a couple of awkward error messages 2021-01-15 14:33:38 -06:00
SChernykh
f62f4e6108 RandomX x86 JIT: remove redundant CFROUND 2021-01-07 16:20:00 +01:00
SChernykh
ac46d6f8de Fix GCC warning 2020-12-19 19:50:52 +01:00
SChernykh
5efd00abec Another dataset AVX2 init speedup (+3.8% faster on Zen3) 2020-12-19 19:46:31 +01:00
SChernykh
633aaccd9c Added config option for AVX2 dataset init
-1 = Auto detect
0 = Always disabled
1 = Enabled if AVX2 is supported
2020-12-19 16:18:49 +01:00
SChernykh
410313d933 Auto-detect the fastest code for dataset init 2020-12-19 13:59:28 +01:00
SChernykh
515a85e66c Dataset initialization with AVX2 (WIP) 2020-12-18 14:53:54 +01:00
XMRig
6b21a51a2f
Huge pages not supported by macOS ARM. 2020-12-16 01:59:20 +07:00
XMRig
6b331b6945
Reduce JIT memory for ARM. 2020-12-15 02:52:38 +07:00
SChernykh
414588d701 Fix alignment for Linux 2020-12-14 18:32:25 +01:00
SChernykh
f89f6a8abf Fix: secure JIT and huge pages are incompatible on Windows 2020-12-14 18:22:58 +01:00
XMRig
87fafcf91b
Fixed JIT on macOS. 2020-12-12 22:40:48 +07:00
XMRig
2966b80ba1
Fixed macOS build. 2020-12-12 22:15:15 +07:00
XMRig
179f09081f
Alternative secure JIT for macOS. 2020-12-12 21:32:36 +07:00
XMRig
775867fc3e
Fixed secure JIT on Linux and code cleanup. 2020-12-12 19:18:47 +07:00
XMRig
497863441a
Remove duplicated code. 2020-12-12 12:39:11 +07:00
XMRig
ec62ded279
Added generic secure JIT support for RandomX. 2020-12-11 23:17:54 +07:00
SChernykh
0da3390d09 More static analysis fixes 2020-12-08 16:05:58 +01:00
SChernykh
cafd868773 Fixed errors found by static analysis 2020-12-08 12:16:59 +01:00
XMRig
c8ee6f7db8
Move Profiler and more cleanup. 2020-12-04 09:23:40 +07:00
XMRig
63bd45c397
Added Cvt class. 2020-12-02 16:31:45 +07:00
SChernykh
d557fe7f39 Fix RandomX init when switching to other algo and back 2020-11-29 22:02:48 +01:00
SChernykh
f16d1837f8 Optimized JIT compiler
More branch-free code
2020-11-29 14:05:50 +01:00
SChernykh
c10ec90b60 Make single thread bench cheat-resistant
Each hash is dependent on the previous hash to make multi-threaded cheating impossible.
2020-11-15 20:38:27 +01:00
SChernykh
9a1e867da2 Fixed MSR mod names in JSON API 2020-11-14 19:55:43 +01:00
XMRig
3b6cfd9c4f
#1937 Print path to existing WinRing0 service without verbose option. 2020-11-12 23:32:49 +07:00
cohcho
eb36d2beef MemoryPool: fix alignment modification 2020-11-10 16:49:10 +00:00
cohcho
a64ff6b7c7 CompiledVm: define default constructor 2020-11-09 16:29:42 +00:00
SChernykh
c8c0abdb00 Separate MSR mod for Zen/Zen2 and Zen3
Another +0.5% speedup for Zen2
2020-11-08 19:40:44 +01:00
xmrig
5df1686810
Merge pull request #1932 from SChernykh/dev
New MSR mod for Ryzen
2020-11-07 13:09:21 +07:00
SChernykh
1e3e8ff8ee Update RxConfig.cpp 2020-11-06 22:59:18 +01:00
SChernykh
d4750239ea New MSR mod for Ryzen
+3.5% on Zen2, +1-2% on Zen3
2020-11-06 22:56:09 +01:00
XMRig
51690ebad6
#1918 Fixed check for 1GB huge pages on ARM Linux. 2020-11-02 21:26:35 +07:00
SChernykh
f1a24b7ddd Fix compilation on ARMv8 with GCC 9.3.0 2020-11-02 13:50:10 +01:00
XMRig
905713f1ca
Merge branch 'feature-bench-submit' into dev 2020-10-30 23:25:09 +07:00
SChernykh
6b7b3511ce Also fix RelWithDebIfno build in Visual Studio 2020-10-27 14:25:43 +01:00
SChernykh
50bdaba526 Fixed Debug build in Visual Studio 2020-10-27 14:08:36 +01:00
XMRig
4914fefb1f
Added "msr" field for CPU backend. 2020-10-25 16:36:37 +07:00
cohcho
99b58580e9 MSR: supress kernel module warning 2020-10-23 13:09:13 +00:00
XMRig
87b4d97798
New Async wrapper. 2020-10-21 08:09:44 +07:00
XMRig
36b1523194
Code cleanup. 2020-10-16 19:35:36 +07:00
xmrig
9fcc542676
Merge pull request #1889 from cohcho/fix_uv_issue
uv: fix performance issue
2020-10-13 22:35:29 +07:00
SChernykh
4f7186cb0e Added argon2/chukwav2 algorithm
New Turtlecoin algorithm. Source: https://github.com/turtlecoin/turtlecoin/blob/development/src/crypto/hash.h#L57
2020-10-12 08:26:57 +02:00
cohcho
65fa1d9bf3 uv: fix performance issue
unix implementation of uv_async_t has been wasting cpu cycles for nothing since 1.29.0 release
implement efficient callback scheduling for linux
2020-10-12 04:09:09 +00:00
SChernykh
4bac3e7695 Fix 32-bit compilation 2020-10-07 18:19:35 +02:00
xmrig
59bd6d4187
Merge pull request #1878 from SChernykh/dev
Fixed ARM compilation
2020-10-07 23:11:39 +07:00
SChernykh
166c011d37 Fixed ARM compilation 2020-10-07 18:09:42 +02:00
xmrig
1f55c6eb02
Merge pull request #1877 from SChernykh/dev
Fix FreeBSD compilation
2020-10-07 23:03:07 +07:00
SChernykh
c2bdae70fe Fix FreeBSD compilation 2020-10-07 18:00:36 +02:00
xmrig
1289942567
Merge pull request #1876 from SChernykh/dev
RandomX: added `huge-pages-jit` config parameter
2020-10-07 22:48:57 +07:00
SChernykh
44dcded866 RandomX: added huge-pages-jit config parameter
Set to false by default, gives 0.2% boost on Ryzen 7 3700X with 16 threads, but hashrate might be unstable on Ryzen between launches. Use with caution.
2020-10-07 17:42:55 +02:00
cohcho
a705ab775b RandomX: align args
tempHash/output must be 16-byte aligned for randomx_calculate_hash{,_first,_next}
2020-10-07 14:47:18 +00:00
cohcho
c710ee5fb5 RxVM: fix compilation error 2020-10-07 09:27:25 +00:00
SChernykh
a8466a139c RandomX: allocate 2 MB pages for generated code, if possible
+0.2% boost on Ryzen 7 3700X
2020-10-07 10:35:10 +02:00
xmrig
ba47219185
Merge pull request #1870 from cohcho/fix_miner_state_machine
Miner: fix state machine
2020-10-07 12:25:17 +07:00
cohcho
fa5b872782 RxVm: fix randomx_create_vm call
randomx_create_vm requires either cache or dataset, but not both
2020-10-06 19:45:43 +00:00
cohcho
7bdeba4d08 Nonce: refactor static init 2020-10-06 13:34:19 +00:00
xmrig
116fb3d3f9
Merge pull request #1864 from cohcho/soft_aes_optimization2
soft_aes: fix previous optimization
2020-10-05 12:20:41 +07:00
cohcho
5f0f2506e8 soft_aes: fix previous optimization
Previously removed unrolled variant is faster on some CPUs
Some CPUs are faster with added unrolled variant
The best variant depends on number of threads on some CPUs
2020-10-04 14:47:58 +00:00
SChernykh
ebf259fa7c RandomX: removed rx/loki
Loki forks to PoS on October 9th.
2020-10-02 17:02:52 +02:00
XMRig
d45bb24a32
Renamed WITH_SSE to WITH_SSE4_1 and make it work on all platforms. 2020-10-01 11:00:08 +07:00
SChernykh
7b4f768114 RandomX: optimized soft AES code
Unrolled loop was 5-10% slower depending on CPU.
2020-09-29 21:22:11 +02:00
xmrig
dfab81e9fa
Merge pull request #1858 from SChernykh/dev
RandomX: removed duplicate constants in Blake2b
2020-09-27 16:51:03 +07:00
SChernykh
3025c265e8 RandomX: removed duplicate constatns in Blake2b 2020-09-27 11:50:08 +02:00
xmrig
ee603ab9e2
Merge pull request #1857 from SChernykh/dev
RandomX: isolate SSE4.1 code to fix crashes on old CPUs
2020-09-27 16:47:56 +07:00
SChernykh
84f8a0dc54 RandomX: isolate SSE4.1 code to fix crashes on old CPUs 2020-09-27 11:46:32 +02:00
cohcho
9be3b69109 soft_aes: fix previous optimization
the best order of hash/fill/prefetch depends on hw/soft AES
only hw AES is faster after previous optimization
2020-09-25 15:26:19 +00:00
SChernykh
1e26e58660 Fix for ARM compilation 2020-09-23 11:44:08 +02:00
SChernykh
9768bf65d1 RandomX improved performance of GCC compiled binaries
JIT compilator was slower compared to MSVC compiled binary. Up to +0.1% speedup on rx/wow in Linux.
2020-09-22 13:48:11 +02:00
SChernykh
891a46382e RandomX: AES improvements
- A bit faster hardware AES code when compiled with MSVC
- More reliable software AES benchmark
2020-09-21 17:51:08 +02:00
SChernykh
c7476e076b RandomX refactoring, moved more stuff to compile time
Small x86 JIT compiler speedup.
2020-09-18 20:51:25 +02:00
SChernykh
8d1168385a RandomX: returned old soft AES impl and auto-select between the two 2020-09-15 20:48:27 +02:00
cohcho
30be1cd102 reserve at most 1 bit for wrapping detection 2020-09-13 18:42:16 +00:00
SChernykh
a05393727c RandomX: added performance profiler (for developers)
Also optimized Blake2b SSE4.1 code size to avoid code cache pollution.
2020-09-12 23:07:52 +02:00
xmrig
adf833b60a
Merge pull request #1827 from cohcho/nonce_iteration_without_tests
nonce iteration optimization
2020-09-10 19:33:23 +07:00
SChernykh
4a9db89527 RandomX: added SSE4.1-optimized Blake2b
+0.15% on `rx/0`
+0.3% on `rx/wow`
2020-09-10 14:28:40 +02:00
cohcho
060c1af4c4 fix nonce mask 2020-09-09 19:39:52 +00:00
cohcho
b826985d05 nonce iteration optimization
efficient and correct nonce iteration without duplicates
2020-09-09 10:03:37 +00:00
SChernykh
a84b45b1bb RandomX: added parameter for scratchpad prefetch mode
`scratchpad_prefetch_mode` can have 4 values:
0: off
1: use `prefetcht0` instruction (default, same as previous XMRig versions)
2: use `prefetchnta` instruction (faster on Coffee Lake and a few other CPUs)
3: use `mov` instruction
2020-09-04 16:16:07 +02:00
XMRig
72c8404d18
Fix compile warnings. 2020-08-24 10:04:46 +07:00
XMRig
879e160ba3
Fix compile warning. 2020-08-23 14:22:08 +07:00
XMRig
3e4bf8cd6c
Fix compile warning 2020-08-17 06:08:14 +07:00
XMRig
00b4ae9c36
Fixed compile warning and updated build.uv.sh. 2020-08-16 16:03:27 +07:00
SChernykh
5926dee354 RandomX JIT: optimized address mask calculation 2020-08-12 16:45:16 +02:00
XMRig
ae3ff0f570
Fixed RandomX cache initialization if 1GB pages fails to allocate on a first NUMA node. 2020-08-01 12:30:02 +07:00
SChernykh
abb78302b8 Try to allocate scratchpad from dataset's 1 GB huge pages, if normal huge pages are not available 2020-07-31 13:37:22 +02:00
SChernykh
838cc08680 Force 2 MB pages size in allocateLargePagesMemory() on Linux 2020-07-31 09:55:49 +02:00
XMRig
1acd88ed39
Cleanup 2020-07-22 21:27:40 +07:00
SChernykh
5bc89fdc8b Fixed RandomX initialization for VS debug builds 2020-07-21 10:10:07 +02:00
XMRig
70c7f33a20
Added command line options --cache-qos (--randomx-cache-qos) and --argon2-impl (--cpu-argon2-impl). 2020-07-20 09:17:59 +07:00
XMRig
e0eed7d5d6
Fixed build without MSR support. 2020-07-16 05:15:35 +07:00
SChernykh
1bf159d1e8 Removed cache QoS warning at exit on unsupported CPUs 2020-07-13 20:43:49 +02:00
SChernykh
72c385c870 Cache QoS: fix for seting MSR 2020-07-13 20:30:44 +02:00
SChernykh
c83429c55c RandomX: added cache QoS support
False by default. If set to true, all non-mining CPU cores will not have access to L3 cache.
2020-07-13 17:23:18 +02:00
Jim Huang
b665d2d865 Adopt new SSE2NEON and reduce ARM-specific changes
This patch updated SSE2NEON [1], which contains more functions
provided by Intel intrinsics, only implemented with NEON-based
counterparts to produce the exact semantics of the intrinsics.
Consequently, ARM-specific changes against CryptoNight_arm can
be reduced as well.

[1] https://github.com/DLTcollab/sse2neon/
2020-07-11 01:55:11 +08:00
SChernykh
3d740e81a2 RandomX: tweaked Ryzen code
Very small speedup
2020-07-05 16:06:59 +02:00
SChernykh
59313d9cc3 Print error message when MSR mod fails
Make sure user knows that hashrate is worse than it could be.
2020-06-26 19:54:06 +02:00
SChernykh
5724d8beb6 KawPow: optimized CPU share verification
- 2 times faster CPU share verification (11 -> 5 ms)
- 1.5 times faster light cache initialization
2020-06-26 12:31:26 +02:00
SChernykh
dc0aee1432 KawPow: fixed crash on old CPUs
- Use `popcnt` instruction only when it's supported
2020-06-10 21:49:43 +02:00
XMRig
dbc8e20e53
Merge branch 'dev' into evo 2020-06-07 21:25:31 +07:00
SChernykh
75c57f7563 Fixed GCC 10.1 issues
- Fixed uninitialized `state->x` warning
- Fixed broken code with `-O3` or `-Ofast`
2020-06-07 16:23:17 +02:00
XMRig
5e1199ea48
Merge branch 'dev' into evo 2020-06-07 20:15:12 +07:00
Matt Smith
a28bddcbdf Stop linker from making stack executable
Add .note.GNU-stack section to end of AstroBWT ASM.

Signed-off-by: Matt Smith <matt@offtopica.uk>
2020-06-07 13:57:37 +01:00
SChernykh
7f00cb59d2 Conceal (CCX) support 2020-06-07 01:01:45 +02:00
XMRig
f18bfeb77d
Merge branch 'evo' of https://github.com/SChernykh/xmrig into pr1713 2020-06-05 19:17:01 +07:00
SChernykh
0dbf41f761 Reduced memory for KawPow 2020-06-05 14:01:49 +02:00
SChernykh
2e3d087750 Merge remote-tracking branch 'upstream/evo' into evo 2020-05-28 22:06:10 +02:00
SChernykh
6676126376 Fixed hashrate and diff display for KawPow 2020-05-28 22:03:28 +02:00
XMRig
eb1ed497e7
Log cleanup. 2020-05-29 02:11:29 +07:00