Commit graph

513 commits

Author SHA1 Message Date
xmrig
ee603ab9e2
Merge pull request #1857 from SChernykh/dev
RandomX: isolate SSE4.1 code to fix crashes on old CPUs
2020-09-27 16:47:56 +07:00
SChernykh
84f8a0dc54 RandomX: isolate SSE4.1 code to fix crashes on old CPUs 2020-09-27 11:46:32 +02:00
cohcho
9be3b69109 soft_aes: fix previous optimization
the best order of hash/fill/prefetch depends on hw/soft AES
only hw AES is faster after previous optimization
2020-09-25 15:26:19 +00:00
SChernykh
1e26e58660 Fix for ARM compilation 2020-09-23 11:44:08 +02:00
SChernykh
9768bf65d1 RandomX improved performance of GCC compiled binaries
JIT compilator was slower compared to MSVC compiled binary. Up to +0.1% speedup on rx/wow in Linux.
2020-09-22 13:48:11 +02:00
SChernykh
891a46382e RandomX: AES improvements
- A bit faster hardware AES code when compiled with MSVC
- More reliable software AES benchmark
2020-09-21 17:51:08 +02:00
SChernykh
c7476e076b RandomX refactoring, moved more stuff to compile time
Small x86 JIT compiler speedup.
2020-09-18 20:51:25 +02:00
SChernykh
8d1168385a RandomX: returned old soft AES impl and auto-select between the two 2020-09-15 20:48:27 +02:00
cohcho
30be1cd102 reserve at most 1 bit for wrapping detection 2020-09-13 18:42:16 +00:00
SChernykh
a05393727c RandomX: added performance profiler (for developers)
Also optimized Blake2b SSE4.1 code size to avoid code cache pollution.
2020-09-12 23:07:52 +02:00
xmrig
adf833b60a
Merge pull request #1827 from cohcho/nonce_iteration_without_tests
nonce iteration optimization
2020-09-10 19:33:23 +07:00
SChernykh
4a9db89527 RandomX: added SSE4.1-optimized Blake2b
+0.15% on `rx/0`
+0.3% on `rx/wow`
2020-09-10 14:28:40 +02:00
cohcho
060c1af4c4 fix nonce mask 2020-09-09 19:39:52 +00:00
cohcho
b826985d05 nonce iteration optimization
efficient and correct nonce iteration without duplicates
2020-09-09 10:03:37 +00:00
SChernykh
a84b45b1bb RandomX: added parameter for scratchpad prefetch mode
`scratchpad_prefetch_mode` can have 4 values:
0: off
1: use `prefetcht0` instruction (default, same as previous XMRig versions)
2: use `prefetchnta` instruction (faster on Coffee Lake and a few other CPUs)
3: use `mov` instruction
2020-09-04 16:16:07 +02:00
XMRig
72c8404d18
Fix compile warnings. 2020-08-24 10:04:46 +07:00
XMRig
879e160ba3
Fix compile warning. 2020-08-23 14:22:08 +07:00
XMRig
3e4bf8cd6c
Fix compile warning 2020-08-17 06:08:14 +07:00
XMRig
00b4ae9c36
Fixed compile warning and updated build.uv.sh. 2020-08-16 16:03:27 +07:00
SChernykh
5926dee354 RandomX JIT: optimized address mask calculation 2020-08-12 16:45:16 +02:00
XMRig
ae3ff0f570
Fixed RandomX cache initialization if 1GB pages fails to allocate on a first NUMA node. 2020-08-01 12:30:02 +07:00
SChernykh
abb78302b8 Try to allocate scratchpad from dataset's 1 GB huge pages, if normal huge pages are not available 2020-07-31 13:37:22 +02:00
SChernykh
838cc08680 Force 2 MB pages size in allocateLargePagesMemory() on Linux 2020-07-31 09:55:49 +02:00
XMRig
1acd88ed39
Cleanup 2020-07-22 21:27:40 +07:00
SChernykh
5bc89fdc8b Fixed RandomX initialization for VS debug builds 2020-07-21 10:10:07 +02:00
XMRig
70c7f33a20
Added command line options --cache-qos (--randomx-cache-qos) and --argon2-impl (--cpu-argon2-impl). 2020-07-20 09:17:59 +07:00
XMRig
e0eed7d5d6
Fixed build without MSR support. 2020-07-16 05:15:35 +07:00
SChernykh
1bf159d1e8 Removed cache QoS warning at exit on unsupported CPUs 2020-07-13 20:43:49 +02:00
SChernykh
72c385c870 Cache QoS: fix for seting MSR 2020-07-13 20:30:44 +02:00
SChernykh
c83429c55c RandomX: added cache QoS support
False by default. If set to true, all non-mining CPU cores will not have access to L3 cache.
2020-07-13 17:23:18 +02:00
Jim Huang
b665d2d865 Adopt new SSE2NEON and reduce ARM-specific changes
This patch updated SSE2NEON [1], which contains more functions
provided by Intel intrinsics, only implemented with NEON-based
counterparts to produce the exact semantics of the intrinsics.
Consequently, ARM-specific changes against CryptoNight_arm can
be reduced as well.

[1] https://github.com/DLTcollab/sse2neon/
2020-07-11 01:55:11 +08:00
SChernykh
3d740e81a2 RandomX: tweaked Ryzen code
Very small speedup
2020-07-05 16:06:59 +02:00
SChernykh
59313d9cc3 Print error message when MSR mod fails
Make sure user knows that hashrate is worse than it could be.
2020-06-26 19:54:06 +02:00
SChernykh
5724d8beb6 KawPow: optimized CPU share verification
- 2 times faster CPU share verification (11 -> 5 ms)
- 1.5 times faster light cache initialization
2020-06-26 12:31:26 +02:00
SChernykh
dc0aee1432 KawPow: fixed crash on old CPUs
- Use `popcnt` instruction only when it's supported
2020-06-10 21:49:43 +02:00
XMRig
dbc8e20e53
Merge branch 'dev' into evo 2020-06-07 21:25:31 +07:00
SChernykh
75c57f7563 Fixed GCC 10.1 issues
- Fixed uninitialized `state->x` warning
- Fixed broken code with `-O3` or `-Ofast`
2020-06-07 16:23:17 +02:00
XMRig
5e1199ea48
Merge branch 'dev' into evo 2020-06-07 20:15:12 +07:00
Matt Smith
a28bddcbdf Stop linker from making stack executable
Add .note.GNU-stack section to end of AstroBWT ASM.

Signed-off-by: Matt Smith <matt@offtopica.uk>
2020-06-07 13:57:37 +01:00
SChernykh
7f00cb59d2 Conceal (CCX) support 2020-06-07 01:01:45 +02:00
XMRig
f18bfeb77d
Merge branch 'evo' of https://github.com/SChernykh/xmrig into pr1713 2020-06-05 19:17:01 +07:00
SChernykh
0dbf41f761 Reduced memory for KawPow 2020-06-05 14:01:49 +02:00
SChernykh
2e3d087750 Merge remote-tracking branch 'upstream/evo' into evo 2020-05-28 22:06:10 +02:00
SChernykh
6676126376 Fixed hashrate and diff display for KawPow 2020-05-28 22:03:28 +02:00
XMRig
eb1ed497e7
Log cleanup. 2020-05-29 02:11:29 +07:00
XMRig
7a3233ab4b
Use long tags. 2020-05-28 20:32:41 +07:00
SChernykh
22b937cc1c KawPow WIP 2020-05-27 16:19:57 +02:00
XMRig
0a7324f500
Merge branch 'dev' 2020-05-23 11:08:53 +07:00
Bohan Yu
a797d808b5 Change cases of Windows include file and link library
When cross-compiling on case sensitive systems, such as Linux, there will be an Error.
2020-05-13 21:00:52 +08:00
XMRig
2e34bf7a1b
Removed unnecessary check. 2020-05-09 01:36:57 +07:00
XMRig
7f31f45b6d
Fix build. 2020-05-09 01:26:05 +07:00
XMRig
3cbf0dc0ee
Removed code duplicate. 2020-05-09 01:13:46 +07:00
XMRig
c828e6b793
Code cleanup. 2020-05-05 01:55:00 +07:00
XMRig
b34e3e1a7b
Remove unused code. 2020-05-04 02:07:38 +07:00
SChernykh
80d944bf82 Optimized RandomX dataset initialization
- Use single Argon2 implemenation
- Auto-select the fastest Argon2 implementation for RandomX
2020-05-03 20:44:59 +02:00
XMRig
c18478a6b4
Small cleanups. 2020-05-03 13:38:34 +07:00
XMRig
8aeba61706
Add 3rdparty prefix to all rapidjson includes. 2020-04-29 14:55:04 +07:00
XMRig
0cc90b152d
Move CnAlgo 2020-04-23 12:34:26 +07:00
SChernykh
bfd017d064 Refactored CFROUND 2020-04-21 15:44:04 +02:00
SChernykh
680e4dd865 Fix code style 2020-04-09 14:31:42 +02:00
SChernykh
abb3340cc7 RandomX JIT refactoring
- Smaller memory footprint
- A bit faster overall
2020-04-09 14:24:54 +02:00
SChernykh
92810ad761 Fixed VM destruction 2020-04-08 08:31:53 +02:00
SChernykh
39bd3ca1da Fix off-by-one error 2020-04-07 18:53:08 +02:00
SChernykh
4d0edde66d Fixed pool lock 2020-04-07 18:48:02 +02:00
SChernykh
69cbfd682a Use node number instead of affinity 2020-04-07 18:46:22 +02:00
SChernykh
6ae37a9519 Pooled allocation of RandomX VMs
+0.5% speedup on Zen2 when the whole L3 cache is used (16 threads on 3700X/3800X, 32 threads on 3950X).
2020-04-07 18:31:35 +02:00
SChernykh
539943c655 Fix MacOS compilation 2020-03-11 16:35:52 +01:00
SChernykh
e22f798085 AVX2 optimized code for AstroBWT
Added "astrobwt-avx2" parameter in config.json, it's turned off ("false") by default.

4-5% speedup on CPUs with proper AVX2 support (AMD Ryzen starting with Zen2, Intel Core starting with Haswell).

There will be no speedup on the following CPUs:

- Intel Pentium/Celeron don't support AVX2
- AMD Zen/Zen+ have only half-speed AVX

GCC compiled version is faster without AVX2, MSVC compiled version is faster with AVX2
2020-03-10 22:35:14 +01:00
SChernykh
9405d8ed92 Activate MSR mod only for RandomX algorithms 2020-03-09 19:10:26 +01:00
XMRig
16a83a9f61
Move files. 2020-03-09 01:22:34 +07:00
SChernykh
b7840d9ab6 Fixed invalid AstroBWT hashes after algo switching 2020-03-07 16:41:33 +01:00
XMRig
13ac54ada9
v5.9.0-dev 2020-03-07 21:27:55 +07:00
XMRig
1f36ea2a8e
Added "coin": "keva" and post PR cleanup. 2020-03-07 20:38:44 +07:00
XMRig
ab90af37b3
Merge branch 'master' of https://github.com/kevacoin-project/xmrig into feature-rx-keva 2020-03-07 17:13:08 +07:00
SChernykh
05dc9821c5 Fixed compilation withut randomx/argon2 2020-03-06 07:22:57 +01:00
SChernykh
eeadea53e2 AstroBWT 20-50% speedup
Skips hashes with large stage 2 size. Added configurable `astrobwt-max-size` parameter, default value is 550, min 400, max 1200, optimal value ranges from 500 to 600 depending on CPU.

- Intel CPUs get 20-25% speedup
- 1st- and 2nd-gen Ryzens get 30% speedup
- 3rd-gen Ryzens get up to 50% speedup
2020-03-05 12:20:21 +01:00
kevacoin
0528ccd01e Added Keva. 2020-03-04 16:23:33 -08:00
XMRig
8dc87576c5
Sync changes with proxy. 2020-03-01 14:04:58 +07:00
XMRig
616c52f266
#1572 Fix compile warning. 2020-03-01 11:59:53 +07:00
XMRig
cdd9ea2496
Make "astrobwt" as primary user visible algorithm name. 2020-03-01 10:21:29 +07:00
SChernykh
14ef99ca67 AstroBWT algorithm (DERO) support
To test:

- Download https://github.com/deroproject/derosuite/releases/tag/AstroBWT
- Run daemon with `--testnet` in command line

In config.json:
- "coin":"dero"
- "url":"127.0.0.1:30306"
- "daemon:"true"
2020-02-29 22:41:24 +01:00
SChernykh
131085be80 Optimized CFROUND
Shorter version using BMI2 instructionns
2020-02-21 19:00:58 +01:00
SChernykh
e1b8f52e59 Fixed 32-bit compilation 2020-02-21 16:08:23 +01:00
SChernykh
0caeb41bff Tuned JIT compiler
0.3-0.4% speedup depending on CPU.
2020-02-20 20:59:22 +01:00
XMRig
c307433900
Fixed nicehash nonce overflow for CPU backend. 2020-02-06 17:19:08 +07:00
xmrig
9c8da1d4d3
Merge pull request #1529 from SChernykh/dev
Crash fix for Bullodzer CPUs
2020-02-02 23:19:49 +07:00
SChernykh
ffc9f67751 Crash fix for Bullodzer CPUs 2020-02-02 17:16:59 +01:00
XMRig
030d6e5962
Update year. 2020-02-01 20:24:00 +07:00
SChernykh
4571899664 Removed MSR mod for Bulldozer
It turned out to be useless: https://www.reddit.com/r/MoneroMining/comments/et7s7w/psa_amd_opteronfxa6a8a10_owners_needed_to_test/
2020-01-27 09:39:39 +01:00
SChernykh
cd763be05b Fix compile error 2020-01-24 14:09:07 +01:00
SChernykh
42a7194e93 Fix crash on Linux 2020-01-24 13:34:12 +01:00
SChernykh
9f1753cc4f Optimized CFROUND 2020-01-22 20:11:00 +01:00
SChernykh
d342968211 Added support for BMI2 instructions 2020-01-21 19:44:56 +01:00
SChernykh
f80177cbd3 Optimizations for AMD Bulldozer
- Added support for XOP instructions
- Enabled Ryzen code for Bulldozer because it's faster there too
2020-01-15 13:04:26 +01:00
SChernykh
665e43fecc MSR preset for Bulldozer CPUs
Also fixed verbose output for MSR presets with masks.
2020-01-14 19:27:34 +01:00
SChernykh
73722ce186 JIT compiler: removed unnecessary memcpy from generateProgram() 2020-01-13 18:00:41 +01:00
SChernykh
869209389e Update MSR preset for Intel
As per https://github.com/xmrig/xmrig/issues/1433#issuecomment-572126184
2020-01-09 08:10:36 +01:00
SChernykh
eb20dfbc94 JIT compiler tweaks 2020-01-06 13:57:48 +01:00
XMRig
88ff807700
Fix compile warnings. 2020-01-03 19:11:48 +07:00
SChernykh
c9f90e6770 Refactor Ryzen fix to fix compilation issues 2019-12-31 11:55:07 +02:00