cohcho
a705ab775b
RandomX: align args
...
tempHash/output must be 16-byte aligned for randomx_calculate_hash{,_first,_next}
2020-10-07 14:47:18 +00:00
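A minimal C++ sketch of the alignment requirement described in a705ab775b. `HashBuffers`, `isAligned16` and `checkHashArgs` are hypothetical names used only for illustration, and the buffer sizes (eight 64-bit words for the temporary hash, 32 bytes for the output) are assumptions rather than values taken from the commit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if p lies on a 16-byte boundary.
inline bool isAligned16(const void* p) {
    return (reinterpret_cast<std::uintptr_t>(p) & 15u) == 0;
}

// Hypothetical buffer holder; alignas(16) forces the alignment that the
// commit message says the hashing functions expect. Sizes are assumptions.
struct HashBuffers {
    alignas(16) std::uint64_t tempHash[8];
    alignas(16) std::uint8_t output[32];
};

// Hypothetical debug check to run before handing the buffers to the hasher.
inline void checkHashArgs(const void* tempHash, const void* output) {
    assert(isAligned16(tempHash));
    assert(isAligned16(output));
}
```

Declaring the alignment on the members, rather than on each local variable, keeps every instance of the struct correctly aligned wherever it is created.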
cohcho
c710ee5fb5
RxVM: fix compilation error
2020-10-07 09:27:25 +00:00
SChernykh
a8466a139c
RandomX: allocate 2 MB pages for generated code, if possible
...
+0.2% boost on Ryzen 7 3700X
2020-10-07 10:35:10 +02:00
xmrig
ba47219185
Merge pull request #1870 from cohcho/fix_miner_state_machine
...
Miner: fix state machine
2020-10-07 12:25:17 +07:00
cohcho
fa5b872782
RxVm: fix randomx_create_vm call
...
randomx_create_vm requires either cache or dataset, but not both
2020-10-06 19:45:43 +00:00
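The "either cache or dataset, but not both" rule from fa5b872782 could be enforced with a small helper like the one below. This is a sketch, not the code from the commit: `Cache`, `Dataset` and `selectVmMemory` are hypothetical stand-ins, and preferring the dataset when both are available is an assumption.

```cpp
#include <cassert>
#include <utility>

struct Cache {};    // hypothetical stand-in for the RandomX cache type
struct Dataset {};  // hypothetical stand-in for the RandomX dataset type

// Returns the (cache, dataset) pair to pass when creating a VM.
// Exactly one element of the result is non-null.
inline std::pair<Cache*, Dataset*> selectVmMemory(Cache* cache, Dataset* dataset) {
    if (dataset != nullptr) {
        return {nullptr, dataset};  // full-memory mode: drop the cache pointer
    }
    assert(cache != nullptr);       // light mode needs a cache
    return {cache, nullptr};
}
```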
cohcho
7bdeba4d08
Nonce: refactor static init
2020-10-06 13:34:19 +00:00
xmrig
116fb3d3f9
Merge pull request #1864 from cohcho/soft_aes_optimization2
...
soft_aes: fix previous optimization
2020-10-05 12:20:41 +07:00
cohcho
5f0f2506e8
soft_aes: fix previous optimization
...
Previously removed unrolled variant is faster on some CPUs
Some CPUs are faster with added unrolled variant
The best variant depends on number of threads on some CPUs
2020-10-04 14:47:58 +00:00
SChernykh
ebf259fa7c
RandomX: removed rx/loki
...
Loki forks to PoS on October 9th.
2020-10-02 17:02:52 +02:00
XMRig
d45bb24a32
Renamed WITH_SSE to WITH_SSE4_1 and made it work on all platforms.
2020-10-01 11:00:08 +07:00
SChernykh
7b4f768114
RandomX: optimized soft AES code
...
Unrolled loop was 5-10% slower depending on CPU.
2020-09-29 21:22:11 +02:00
xmrig
dfab81e9fa
Merge pull request #1858 from SChernykh/dev
...
RandomX: removed duplicate constants in Blake2b
2020-09-27 16:51:03 +07:00
SChernykh
3025c265e8
RandomX: removed duplicate constants in Blake2b
2020-09-27 11:50:08 +02:00
xmrig
ee603ab9e2
Merge pull request #1857 from SChernykh/dev
...
RandomX: isolate SSE4.1 code to fix crashes on old CPUs
2020-09-27 16:47:56 +07:00
SChernykh
84f8a0dc54
RandomX: isolate SSE4.1 code to fix crashes on old CPUs
2020-09-27 11:46:32 +02:00
cohcho
9be3b69109
soft_aes: fix previous optimization
...
the best order of hash/fill/prefetch depends on hw/soft AES
only hw AES is faster after previous optimization
2020-09-25 15:26:19 +00:00
SChernykh
1e26e58660
Fix for ARM compilation
2020-09-23 11:44:08 +02:00
SChernykh
9768bf65d1
RandomX: improved performance of GCC-compiled binaries
...
The JIT compiler was slower than in MSVC-compiled binaries. Up to +0.1% speedup on rx/wow in Linux.
2020-09-22 13:48:11 +02:00
SChernykh
891a46382e
RandomX: AES improvements
...
- A bit faster hardware AES code when compiled with MSVC
- More reliable software AES benchmark
2020-09-21 17:51:08 +02:00
SChernykh
c7476e076b
RandomX refactoring, moved more stuff to compile time
...
Small x86 JIT compiler speedup.
2020-09-18 20:51:25 +02:00
SChernykh
8d1168385a
RandomX: restored the old soft AES implementation and auto-select between the two
2020-09-15 20:48:27 +02:00
cohcho
30be1cd102
reserve at most 1 bit for wrapping detection
2020-09-13 18:42:16 +00:00
SChernykh
a05393727c
RandomX: added performance profiler (for developers)
...
Also optimized Blake2b SSE4.1 code size to avoid code cache pollution.
2020-09-12 23:07:52 +02:00
xmrig
adf833b60a
Merge pull request #1827 from cohcho/nonce_iteration_without_tests
...
nonce iteration optimization
2020-09-10 19:33:23 +07:00
SChernykh
4a9db89527
RandomX: added SSE4.1-optimized Blake2b
...
+0.15% on `rx/0`
+0.3% on `rx/wow`
2020-09-10 14:28:40 +02:00
cohcho
060c1af4c4
fix nonce mask
2020-09-09 19:39:52 +00:00
cohcho
b826985d05
nonce iteration optimization
...
efficient and correct nonce iteration without duplicates
2020-09-09 10:03:37 +00:00
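One common way to iterate nonces across threads without duplicates is to hand out disjoint batches from a shared atomic counter. The sketch below illustrates that general idea only; it is not the implementation from b826985d05, and `NonceAllocator` with its `limit` parameter is a hypothetical name.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical allocator that hands out disjoint, consecutive nonce batches.
class NonceAllocator {
public:
    // limit: one past the last usable nonce (e.g. 2^32 for a 32-bit nonce).
    explicit NonceAllocator(std::uint64_t limit) : m_limit(limit) {}

    // Reserves `count` consecutive nonces starting at `first`.
    // Returns false once the nonce space is exhausted, so callers never wrap
    // around and rescan nonces that were already tried.
    bool reserve(std::uint32_t count, std::uint64_t& first) {
        first = m_next.fetch_add(count, std::memory_order_relaxed);
        return first + count <= m_limit;
    }

private:
    std::atomic<std::uint64_t> m_next{0};
    const std::uint64_t m_limit;
};
```

Because the counter is wider than the nonce itself, exhaustion shows up as a value past `limit` instead of a silent wrap to zero.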
SChernykh
a84b45b1bb
RandomX: added parameter for scratchpad prefetch mode
...
`scratchpad_prefetch_mode` can have 4 values:
0: off
1: use `prefetcht0` instruction (default, same as previous XMRig versions)
2: use `prefetchnta` instruction (faster on Coffee Lake and a few other CPUs)
3: use `mov` instruction
2020-09-04 16:16:07 +02:00
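The four `scratchpad_prefetch_mode` values listed in a84b45b1bb map naturally onto an enum. The following is an illustrative sketch: the type and function names are hypothetical, and falling back to the default for out-of-range input is an assumption, not documented behavior.

```cpp
#include <cstdint>
#include <string>

// Hypothetical enum mirroring the four documented values.
enum class ScratchpadPrefetchMode : std::uint32_t {
    Off = 0,  // no prefetch
    T0  = 1,  // prefetcht0 (default)
    NTA = 2,  // prefetchnta
    Mov = 3   // plain mov
};

// Converts a raw config value; out-of-range input falls back to the default
// (assumed behavior).
inline ScratchpadPrefetchMode parsePrefetchMode(std::int64_t value) {
    if (value < 0 || value > 3) {
        return ScratchpadPrefetchMode::T0;
    }
    return static_cast<ScratchpadPrefetchMode>(value);
}

// Name of the instruction used for each mode; empty string when off.
inline std::string prefetchInstruction(ScratchpadPrefetchMode mode) {
    switch (mode) {
        case ScratchpadPrefetchMode::T0:  return "prefetcht0";
        case ScratchpadPrefetchMode::NTA: return "prefetchnta";
        case ScratchpadPrefetchMode::Mov: return "mov";
        case ScratchpadPrefetchMode::Off: break;
    }
    return "";
}
```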
XMRig
72c8404d18
Fix compile warnings.
2020-08-24 10:04:46 +07:00
XMRig
879e160ba3
Fix compile warning.
2020-08-23 14:22:08 +07:00
XMRig
3e4bf8cd6c
Fix compile warning
2020-08-17 06:08:14 +07:00
XMRig
00b4ae9c36
Fixed compile warning and updated build.uv.sh.
2020-08-16 16:03:27 +07:00
SChernykh
5926dee354
RandomX JIT: optimized address mask calculation
2020-08-12 16:45:16 +02:00
XMRig
ae3ff0f570
Fixed RandomX cache initialization if 1 GB pages fail to allocate on the first NUMA node.
2020-08-01 12:30:02 +07:00
SChernykh
abb78302b8
Try to allocate scratchpad from dataset's 1 GB huge pages, if normal huge pages are not available
2020-07-31 13:37:22 +02:00
SChernykh
838cc08680
Force 2 MB page size in allocateLargePagesMemory() on Linux
2020-07-31 09:55:49 +02:00
XMRig
1acd88ed39
Cleanup
2020-07-22 21:27:40 +07:00
SChernykh
5bc89fdc8b
Fixed RandomX initialization for VS debug builds
2020-07-21 10:10:07 +02:00
XMRig
70c7f33a20
Added command line options --cache-qos (--randomx-cache-qos) and --argon2-impl (--cpu-argon2-impl).
2020-07-20 09:17:59 +07:00
XMRig
e0eed7d5d6
Fixed build without MSR support.
2020-07-16 05:15:35 +07:00
SChernykh
1bf159d1e8
Removed cache QoS warning at exit on unsupported CPUs
2020-07-13 20:43:49 +02:00
SChernykh
72c385c870
Cache QoS: fix for setting MSR
2020-07-13 20:30:44 +02:00
SChernykh
c83429c55c
RandomX: added cache QoS support
...
False by default. If set to true, all non-mining CPU cores will not have access to L3 cache.
2020-07-13 17:23:18 +02:00
Jim Huang
b665d2d865
Adopt new SSE2NEON and reduce ARM-specific changes
...
This patch updates SSE2NEON [1], which now covers more of the
Intel intrinsics, implementing each one with NEON-based
counterparts that reproduce the exact semantics of the intrinsic.
Consequently, the ARM-specific changes in CryptoNight_arm can
be reduced as well.
[1] https://github.com/DLTcollab/sse2neon/
2020-07-11 01:55:11 +08:00
SChernykh
3d740e81a2
RandomX: tweaked Ryzen code
...
Very small speedup
2020-07-05 16:06:59 +02:00
SChernykh
59313d9cc3
Print error message when MSR mod fails
...
Make sure user knows that hashrate is worse than it could be.
2020-06-26 19:54:06 +02:00
SChernykh
5724d8beb6
KawPow: optimized CPU share verification
...
- 2 times faster CPU share verification (11 -> 5 ms)
- 1.5 times faster light cache initialization
2020-06-26 12:31:26 +02:00
SChernykh
dc0aee1432
KawPow: fixed crash on old CPUs
...
- Use `popcnt` instruction only when it's supported
2020-06-10 21:49:43 +02:00
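Using `popcnt` only when the CPU supports it typically means a runtime check plus a software fallback. The sketch below shows that pattern under stated assumptions: it relies on the GCC/Clang builtins `__builtin_cpu_supports` and `__builtin_popcountll` on x86 and always uses the fallback elsewhere. The function names are hypothetical and this is not the code from dc0aee1432.

```cpp
#include <cstdint>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define SKETCH_X86_GNU 1
#else
#define SKETCH_X86_GNU 0
#endif

// Portable fallback: clears the lowest set bit until none remain.
inline int popcountSoft(std::uint64_t x) {
    int n = 0;
    while (x != 0) {
        x &= x - 1;
        ++n;
    }
    return n;
}

#if SKETCH_X86_GNU
// Compiled with popcnt enabled for this one function only, so the rest of
// the binary stays safe to run on CPUs that lack the instruction.
__attribute__((target("popcnt")))
static int popcountHw(std::uint64_t x) {
    return __builtin_popcountll(x);
}
#endif

// Dispatches at runtime; never executes popcnt on an unsupported CPU.
inline int popcount64(std::uint64_t x) {
#if SKETCH_X86_GNU
    if (__builtin_cpu_supports("popcnt")) {
        return popcountHw(x);
    }
#endif
    return popcountSoft(x);
}
```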
XMRig
dbc8e20e53
Merge branch 'dev' into evo
2020-06-07 21:25:31 +07:00
SChernykh
75c57f7563
Fixed GCC 10.1 issues
...
- Fixed uninitialized `state->x` warning
- Fixed broken code with `-O3` or `-Ofast`
2020-06-07 16:23:17 +02:00
XMRig
5e1199ea48
Merge branch 'dev' into evo
2020-06-07 20:15:12 +07:00
Matt Smith
a28bddcbdf
Stop linker from making stack executable
...
Add .note.GNU-stack section to end of AstroBWT ASM.
Signed-off-by: Matt Smith <matt@offtopica.uk>
2020-06-07 13:57:37 +01:00
SChernykh
7f00cb59d2
Conceal (CCX) support
2020-06-07 01:01:45 +02:00
XMRig
f18bfeb77d
Merge branch 'evo' of https://github.com/SChernykh/xmrig into pr1713
2020-06-05 19:17:01 +07:00
SChernykh
0dbf41f761
Reduced memory for KawPow
2020-06-05 14:01:49 +02:00
SChernykh
2e3d087750
Merge remote-tracking branch 'upstream/evo' into evo
2020-05-28 22:06:10 +02:00
SChernykh
6676126376
Fixed hashrate and diff display for KawPow
2020-05-28 22:03:28 +02:00
XMRig
eb1ed497e7
Log cleanup.
2020-05-29 02:11:29 +07:00
XMRig
7a3233ab4b
Use long tags.
2020-05-28 20:32:41 +07:00
SChernykh
22b937cc1c
KawPow WIP
2020-05-27 16:19:57 +02:00
XMRig
0a7324f500
Merge branch 'dev'
2020-05-23 11:08:53 +07:00
Bohan Yu
a797d808b5
Change cases of Windows include file and link library
...
When cross-compiling on case-sensitive systems such as Linux, the mismatched case causes a build error.
2020-05-13 21:00:52 +08:00
XMRig
2e34bf7a1b
Removed unnecessary check.
2020-05-09 01:36:57 +07:00
XMRig
7f31f45b6d
Fix build.
2020-05-09 01:26:05 +07:00
XMRig
3cbf0dc0ee
Removed duplicate code.
2020-05-09 01:13:46 +07:00
XMRig
c828e6b793
Code cleanup.
2020-05-05 01:55:00 +07:00
XMRig
b34e3e1a7b
Remove unused code.
2020-05-04 02:07:38 +07:00
SChernykh
80d944bf82
Optimized RandomX dataset initialization
...
- Use a single Argon2 implementation
- Auto-select the fastest Argon2 implementation for RandomX
2020-05-03 20:44:59 +02:00
XMRig
c18478a6b4
Small cleanups.
2020-05-03 13:38:34 +07:00
XMRig
8aeba61706
Add 3rdparty prefix to all rapidjson includes.
2020-04-29 14:55:04 +07:00
XMRig
0cc90b152d
Move CnAlgo
2020-04-23 12:34:26 +07:00
SChernykh
bfd017d064
Refactored CFROUND
2020-04-21 15:44:04 +02:00
SChernykh
680e4dd865
Fix code style
2020-04-09 14:31:42 +02:00
SChernykh
abb3340cc7
RandomX JIT refactoring
...
- Smaller memory footprint
- A bit faster overall
2020-04-09 14:24:54 +02:00
SChernykh
92810ad761
Fixed VM destruction
2020-04-08 08:31:53 +02:00
SChernykh
39bd3ca1da
Fix off-by-one error
2020-04-07 18:53:08 +02:00
SChernykh
4d0edde66d
Fixed pool lock
2020-04-07 18:48:02 +02:00
SChernykh
69cbfd682a
Use node number instead of affinity
2020-04-07 18:46:22 +02:00
SChernykh
6ae37a9519
Pooled allocation of RandomX VMs
...
+0.5% speedup on Zen2 when the whole L3 cache is used (16 threads on 3700X/3800X, 32 threads on 3950X).
2020-04-07 18:31:35 +02:00
SChernykh
539943c655
Fix macOS compilation
2020-03-11 16:35:52 +01:00
SChernykh
e22f798085
AVX2 optimized code for AstroBWT
...
Added "astrobwt-avx2" parameter in config.json, it's turned off ("false") by default.
4-5% speedup on CPUs with proper AVX2 support (AMD Ryzen starting with Zen2, Intel Core starting with Haswell).
There will be no speedup on the following CPUs:
- Intel Pentium/Celeron don't support AVX2
- AMD Zen/Zen+ have only half-speed AVX
GCC compiled version is faster without AVX2, MSVC compiled version is faster with AVX2
2020-03-10 22:35:14 +01:00
SChernykh
9405d8ed92
Activate MSR mod only for RandomX algorithms
2020-03-09 19:10:26 +01:00
XMRig
16a83a9f61
Move files.
2020-03-09 01:22:34 +07:00
SChernykh
b7840d9ab6
Fixed invalid AstroBWT hashes after algo switching
2020-03-07 16:41:33 +01:00
XMRig
13ac54ada9
v5.9.0-dev
2020-03-07 21:27:55 +07:00
XMRig
1f36ea2a8e
Added "coin": "keva" and post PR cleanup.
2020-03-07 20:38:44 +07:00
XMRig
ab90af37b3
Merge branch 'master' of https://github.com/kevacoin-project/xmrig into feature-rx-keva
2020-03-07 17:13:08 +07:00
SChernykh
05dc9821c5
Fixed compilation without randomx/argon2
2020-03-06 07:22:57 +01:00
SChernykh
eeadea53e2
AstroBWT 20-50% speedup
...
Skips hashes with large stage 2 size. Added configurable `astrobwt-max-size` parameter, default value is 550, min 400, max 1200, optimal value ranges from 500 to 600 depending on CPU.
- Intel CPUs get 20-25% speedup
- 1st- and 2nd-gen Ryzens get 30% speedup
- 3rd-gen Ryzens get up to 50% speedup
2020-03-05 12:20:21 +01:00
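The limits quoted in eeadea53e2 for `astrobwt-max-size` (default 550, min 400, max 1200) can be expressed as a small validation helper. This is a sketch under assumptions: whether XMRig clamps or rejects out-of-range values is not stated in the commit, the names are hypothetical, and the skip check assumes the stage 2 size is already in the same units as the parameter.

```cpp
#include <algorithm>

// Limits taken from the commit message.
constexpr int kAstroBwtMaxSizeDefault = 550;
constexpr int kAstroBwtMaxSizeMin     = 400;
constexpr int kAstroBwtMaxSizeMax     = 1200;

// Hypothetical helper: forces a requested value into the documented range
// (clamping is an assumption).
inline int clampAstroBwtMaxSize(int requested) {
    return std::min(std::max(requested, kAstroBwtMaxSizeMin), kAstroBwtMaxSizeMax);
}

// Hypothetical predicate: a hash is skipped when its stage 2 size exceeds
// the configured maximum.
inline bool shouldSkipHash(int stage2Size, int maxSize) {
    return stage2Size > maxSize;
}
```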
kevacoin
0528ccd01e
Added Keva.
2020-03-04 16:23:33 -08:00
XMRig
8dc87576c5
Sync changes with proxy.
2020-03-01 14:04:58 +07:00
XMRig
616c52f266
#1572 Fix compile warning.
2020-03-01 11:59:53 +07:00
XMRig
cdd9ea2496
Make "astrobwt" the primary user-visible algorithm name.
2020-03-01 10:21:29 +07:00
SChernykh
14ef99ca67
AstroBWT algorithm (DERO) support
...
To test:
- Download https://github.com/deroproject/derosuite/releases/tag/AstroBWT
- Run daemon with `--testnet` in command line
In config.json:
- "coin":"dero"
- "url":"127.0.0.1:30306"
- "daemon":true
2020-02-29 22:41:24 +01:00
SChernykh
131085be80
Optimized CFROUND
...
Shorter version using BMI2 instructions
2020-02-21 19:00:58 +01:00
SChernykh
e1b8f52e59
Fixed 32-bit compilation
2020-02-21 16:08:23 +01:00
SChernykh
0caeb41bff
Tuned JIT compiler
...
0.3-0.4% speedup depending on CPU.
2020-02-20 20:59:22 +01:00
XMRig
c307433900
Fixed nicehash nonce overflow for CPU backend.
2020-02-06 17:19:08 +07:00
xmrig
9c8da1d4d3
Merge pull request #1529 from SChernykh/dev
...
Crash fix for Bulldozer CPUs
2020-02-02 23:19:49 +07:00
SChernykh
ffc9f67751
Crash fix for Bulldozer CPUs
2020-02-02 17:16:59 +01:00