Commit graph

270 commits

Author SHA1 Message Date
Luke Parker
e4e4245ee3
One Round DKG (#589)
* Upstream GBP, divisor, circuit abstraction, and EC gadgets from FCMP++

* Initial eVRF implementation

Not quite done yet. It needs to communicate the resulting points and proofs to
extract them from the Pedersen Commitments in order to return those, and then
be tested.

* Add the openings of the PCs to the eVRF as necessary

* Add implementation of secq256k1

* Make DKG Encryption a bit more flexible

No longer requires the use of an EncryptionKeyMessage, and allows pre-defined
keys for encryption.

* Make NUM_BITS an argument for the field macro

* Have the eVRF take a Zeroizing private key

* Initial eVRF-based DKG

* Add embedwards25519 curve

* Inline the eVRF into the DKG library

Due to how we're handling share encryption, we'd either need two circuits or to
dedicate this circuit to the DKG. The latter makes sense at this time.

* Add documentation to the eVRF-based DKG

* Add paragraph claiming robustness

* Update to the new eVRF proof

* Finish routing the eVRF functionality

Still needs errors and serialization, along with a few other TODOs.

* Add initial eVRF DKG test

* Improve eVRF DKG

Updates how we calculcate verification shares, improves performance when
extracting multiple sets of keys, and adds more to the test for it.

* Start using a proper error for the eVRF DKG

* Resolve various TODOs

Supports recovering multiple key shares from the eVRF DKG.

Inlines two loops to save 2**16 iterations.

Adds support for creating a constant time representation of scalars < NUM_BITS.

* Ban zero ECDH keys, document non-zero requirements

* Implement eVRF traits, all the way up to the DKG, for secp256k1/ed25519

* Add Ristretto eVRF trait impls

* Support participating multiple times in the eVRF DKG

* Only participate once per key, not once per key share

* Rewrite processor key-gen around the eVRF DKG

Still a WIP.

* Finish routing the new key gen in the processor

Doesn't touch the tests, coordinator, nor Substrate yet.
`cargo +nightly fmt && cargo +nightly-2024-07-01 clippy --all-features -p serai-processor`
does pass.

* Deduplicate and better document in processor key_gen

* Update serai-processor tests to the new key gen

* Correct amount of yx coefficients, get processor key gen test to pass

* Add embedded elliptic curve keys to Substrate

* Update processor key gen tests to the eVRF DKG

* Have set_keys take signature_participants, not removed_participants

Now no one is removed from the DKG. Only `t` people publish the key however.

Uses a BitVec for an efficient encoding of the participants.

* Update the coordinator binary for the new DKG

This does not yet update any tests.

* Add sensible Debug to key_gen::[Processor, Coordinator]Message

* Have the DKG explicitly declare how to interpolate its shares

Removes the hack for MuSig where we multiply keys by the inverse of their
lagrange interpolation factor.

* Replace Interpolation::None with Interpolation::Constant

Allows the MuSig DKG to keep the secret share as the original private key,
enabling deriving FROST nonces consistently regardless of the MuSig context.

* Get coordinator tests to pass

* Update spec to the new DKG

* Get clippy to pass across the repo

* cargo machete

* Add an extra sleep to ensure expected ordering of `Participation`s

* Update orchestration

* Remove bad panic in coordinator

It expected ConfirmationShare to be n-of-n, not t-of-n.

* Improve documentation on  functions

* Update TX size limit

We now no longer have to support the ridiculous case of having 49 DKG
participations within a 101-of-150 DKG. It does remain quite high due to
needing to _sign_ so many times. It'd may be optimal for parties with multiple
key shares to independently send their preprocesses/shares (despite the
overhead that'll cause with signatures and the transaction structure).

* Correct error in the Processor spec document

* Update a few comments in the validator-sets pallet

* Send/Recv Participation one at a time

Sending all, then attempting to receive all in an expected order, wasn't working
even with notable delays between sending messages. This points to the mempool
not working as expected...

* Correct ThresholdKeys serialization in modular-frost test

* Updating existing TX size limit test for the new DKG parameters

* Increase time allowed for the DKG on the GH CI

* Correct construction of signature_participants in serai-client tests

Fault identified by akil.

* Further contextualize DkgConfirmer by ValidatorSet

Caught by a safety check we wouldn't reuse preprocesses across messages. That
raises the question of we were prior reusing preprocesses (reusing keys)?
Except that'd have caused a variety of signing failures (suggesting we had some
staggered timing avoiding it in practice but yes, this was possible in theory).

* Add necessary calls to set_embedded_elliptic_curve_key in coordinator set rotation tests

* Correct shimmed setting of a secq256k1 key

* cargo fmt

* Don't use `[0; 32]` for the embedded keys in the coordinator rotation test

The key_gen function expects the random values already decided.

* Big-endian secq256k1 scalars

Also restores the prior, safer, Encryption::register function.
2024-09-19 21:43:26 -04:00
Luke Parker
2aac6f6998
Improve usage of constants in coordinator p2p 2024-07-17 06:54:54 -04:00
Luke Parker
e772b8a5f7
#560 take two, now that #560 has been reverted (#561)
Some checks failed
coins/ Tests / test-coins (push) Waiting to run
Coordinator Tests / build (push) Waiting to run
Full Stack Tests / build (push) Waiting to run
Lint / clippy (macos-13) (push) Waiting to run
Lint / clippy (macos-14) (push) Waiting to run
Lint / clippy (ubuntu-latest) (push) Waiting to run
Lint / clippy (windows-latest) (push) Waiting to run
Lint / deny (push) Waiting to run
Lint / fmt (push) Waiting to run
Lint / machete (push) Waiting to run
no-std build / build (push) Waiting to run
Processor Tests / build (push) Waiting to run
Reproducible Runtime / build (push) Waiting to run
Tests / test-infra (push) Waiting to run
Tests / test-substrate (push) Waiting to run
Tests / test-serai-client (push) Waiting to run
Message Queue Tests / build (push) Has been cancelled
common/ Tests / test-common (push) Has been cancelled
crypto/ Tests / test-crypto (push) Has been cancelled
* Clear upons upon round, not block

* Cache the proposal for a round

* Rebase onto develop, which reverted this PR, and re-apply this PR

* Set participation upon participation instead of constantly recalculating

* Cache message instances

* Add missing txn commit

Identified by @akildemir.

* Correct clippy lint identified upon rebase

* Fix tendermint chain sync (#581)

* fix p2p Reqres protocol

* stabilize tributary chain sync

* fix pr comments

---------

Co-authored-by: akildemir <34187742+akildemir@users.noreply.github.com>
2024-07-16 19:42:15 -04:00
Luke Parker
41ce5b1738 Use the serai_abi::Call in the actual Transaction type
We prior required they had the same encoding, yet this ensures they do by
making them one and the same. This does require an large, ugly, From/TryInto
block which is deemed preferable for moving this more and more into syntax
(from semantics).

Further improvements (notably re: Extra) is possible, and this already lets us
strip some members from the Call enum.
2024-06-03 23:38:22 -04:00
Luke Parker
2a05cf3225
June 2024 nightly update
Replaces #571.
2024-06-01 21:46:49 -04:00
Luke Parker
b39c751403
Reduce target peers a bit 2024-04-23 12:59:45 -04:00
Luke Parker
cc7202e0bf
Correct recv to try_recv when exhausting channel 2024-04-23 12:40:21 -04:00
Luke Parker
19e68f7f75
Correct selection of to-try peers to prevent infinite loops when to-try < target 2024-04-23 12:04:30 -04:00
Luke Parker
d94c9a4a5e
Use a constant for the target amount of peer 2024-04-23 11:59:51 -04:00
Luke Parker
43dc036660
Use a HashSet for which networks to try peer finding for
Prevents a flood of retries from individually failed attempts within a batch of
peer connection attempts.
2024-04-23 10:55:56 -04:00
Luke Parker
95591218bb
Remove cbor 2024-04-23 07:01:07 -04:00
Luke Parker
7dd587a864
Inline broadcast_raw now that it doesn't have multiple callers 2024-04-23 06:44:21 -04:00
Luke Parker
023275bcb6
Properly diversify ReqResMessageKind/GossipMessageKind 2024-04-23 06:37:41 -04:00
Luke Parker
8cef9eff6f
Move keep alive, heartbeat, block to request/response 2024-04-23 05:44:58 -04:00
Luke Parker
5a3ea80943
Add missing continue to prevent dialing a node we're connected to 2024-04-21 08:36:52 -04:00
Luke Parker
fddbebc7c0
Replace expect with debug log 2024-04-21 08:02:34 -04:00
Luke Parker
e01848aa9e
Correct boolean NOT on is_fresh_dial 2024-04-21 07:30:31 -04:00
Luke Parker
320b5627b5
Retry if initial dials fail, not just upon disconnect 2024-04-21 07:26:16 -04:00
Luke Parker
be7780e69d
Restart coordinator peer finding upon disconnections 2024-04-21 07:02:49 -04:00
Luke Parker
593aefd229
Extend time in sync test 2024-04-18 02:51:38 -04:00
Luke Parker
fea16df567
Only reply to heartbeats after a certain distance 2024-04-18 01:39:34 -04:00
Luke Parker
4960c3222e
Ensure we don't reply to stale heartbeats 2024-04-18 01:24:38 -04:00
Luke Parker
6b4df4f2c0
Only have some nodes respond to latent heartbeats
Also only respond if they're more than 2 blocks behind to minimize redundant
sending of blocks.
2024-04-17 21:54:10 -04:00
Luke Parker
bc44fbdbac
Add TODO to coordinator P2P 2024-03-23 23:32:21 -04:00
Luke Parker
4cacce5e55
Perform key share amortization on-chain to avoid discrepancies 2024-03-23 23:32:14 -04:00
Luke Parker
b7d49af1d5
Track total peer count in the coordinator 2024-03-23 18:02:48 -04:00
Luke Parker
4914420a37
Don't add as an explicit peer if already connected 2024-03-22 23:51:51 -04:00
Luke Parker
f11a08c436
Peer finding which won't get stuck on one specific network 2024-03-22 23:47:43 -04:00
Luke Parker
35b58a45bd
Split peer finding into a dedicated task 2024-03-22 23:40:15 -04:00
Luke Parker
0889627e60
Typo fix for prior commit 2024-03-11 02:20:51 -04:00
Luke Parker
ace41c79fd
Tidy the BlockHasEvents cache 2024-03-11 01:44:00 -04:00
Luke Parker
f7d16b3fc5
Fix 0 - 1 which caused a panic 2024-03-09 05:37:41 -05:00
Luke Parker
6374d9987e
Correct how we save the block to scan from 2024-03-09 03:48:44 -05:00
Luke Parker
c93f6bf901
Replace yield_now with sleep 100 to prevent hammering a task, despite still being over-eager 2024-03-09 03:34:31 -05:00
Luke Parker
61a81e53e1
Further optimize cosign DB 2024-03-09 03:31:06 -05:00
Luke Parker
89b237af7e
Correct the return value of block_has_events 2024-03-09 02:44:04 -05:00
Luke Parker
2347bf5fd3
Bound cosign work and ensure it progress forward even when cosigns don't occur
Should resolve the DB load observed on testnet.
2024-03-09 02:20:23 -05:00
Luke Parker
f0694172ef
Fix potential generation of invalid SignData in shim 2024-02-09 02:52:08 -05:00
akildemir
347d4cf413
Fix tendermint distinct precommit bug (#517)
* fix tendermint distinct precommit bug

* remove conflicting precommit error
2024-02-08 13:47:37 -05:00
Luke Parker
4913873b10
Slash reports (#523)
* report_slashes plumbing in Substrate

Notably delays the SetRetired event until it provides a slash report or the set
after it becomes the set to report its slashes.

* Add dedicated AcceptedHandover event

* Add SlashReport TX to Tributary

* Create SlashReport TXs

* Handle SlashReport TXs

* Add logic to generate a SlashReport to the coordinator

* Route SlashReportSigner into the processor

* Finish routing the SlashReport signing/TX publication

* Add serai feature to processor's serai-client
2024-01-29 03:48:53 -05:00
Luke Parker
f3429ec1ef
Inside publish (for a Serai transaction from the coordinator), use RetiredDb over latest session
Not only is this more performant, the definition of retired won't be if a newer
session is active. It will be if the session has posted a slash report or the
stake for that session has unlocked.

Initial commit towards implementing SlashReports.
2024-01-05 23:40:15 -05:00
Luke Parker
7eb388e546
PR to track down CI failures (#501)
* Use an extended timeout for DKGs specifically

* Add a log statement when message-queue connection fails

* Add a 60 second keep-alive to connections

* Use zalloc for processor/message-queue/coordinator

An additional layer which protects us against edge cases with Zeroizing
(objects which don't support it or don't miss it).

* Add further logs to message-queue

* Further increase re-attempt timeouts in CI

* Remove misplaced continue inmessage-queue client

Fixes observed CI failures.

* Revert "Further increase re-attempt timeouts in CI"

This reverts commit 3723530cf6.
2024-01-04 01:08:13 -05:00
Luke Parker
02776c54a8
Increase reattempt delays in the GH CI, which is extremely latent 2023-12-30 22:11:04 -05:00
Luke Parker
ec8dfd4639
Correct SignData serialization test from creating 256 signers of data
This overflows the u8 allowed and caused a CI failure. The actual
code/assumption is fine.
2023-12-30 19:08:29 -05:00
Luke Parker
b493e3e31f
Validator DHT (#494)
* Route validators for any active set through sc-authority-discovery

Additionally adds an RPC route to retrieve their P2P addresses.

* Have the coordinator get peers from substrate

* Have the RPC return one address, not up to 3

Prevents the coordinator from believing it has 3 peers when it has one.

* Add missing feature to serai-client

* Correct network argument in serai-client for p2p_validators call

* Add a test in serai-client to check DHT population with a much quicker failure than the coordinator tests

* Update to latest Substrate

Removes distinguishing BABE/AuthorityDiscovery keys which causes
sc_authority_discovery to populate as desired.

* Update to a properly tagged substrate commit

* Add all dialed to peers to GossipSub

* cargo fmt

* Reduce common code in serai-coordinator-tests with amore involved new_test

* Use a recursive async function to spawn `n` DockerTests with the necessary networking configuration

* Merge UNIQUE_ID and ONE_AT_A_TIME

* Tidy up the new recursive code in tests/coordinator

* Use a Mutex in CONTEXT to let it be set multiple times

* Make complimentary edits to full-stack tests

* Augment coordinator P2p connection logs

* Drop lock acquisitions before recursing

* Better scope lock acquisitions in full-stack, preventing a deadlock

* Ensure OUTER_OPS is reset across the test boundary

* Add cargo deny allowance for dockertest fork
2023-12-22 21:09:18 -05:00
Luke Parker
00774c29d7
Replace remaining direct uses of futures with futures_util
Slight downscope which helps combat the antipattern which is the futures glob
crate. While futures_util is still a large crate, it has better defaults and
is smaller by virtue of not pulling the executor.
2023-12-18 19:45:08 -05:00
Luke Parker
a4c82632fb
Use pub(crate) for create_db items, not pub 2023-12-18 17:15:02 -05:00
Luke Parker
c8747e23c5 Remove offline participants from future DKG protocols so long as the threshold is met
Makes RemoveParticipantDueToDkg a voted-on event instead of a Provided.
This removes the requirement for offline parties to be able to fully validate
blame, yet unfortunately lets an dishonest supermajority have an honest node
label any arbitrary node as dishonest.

Corrects a variety of `.i(...)` calls which panicked when they shouldn't have.

Cleans up a couple no-longer-used storage values.
2023-12-18 17:14:51 -05:00
Luke Parker
c2fffb9887
Correct a couple years of accumulated typos 2023-12-17 02:06:51 -05:00
Luke Parker
065d314e2a
Further expand clippy workspace lints
Achieves a notable amount of reduced async and clones.
2023-12-17 00:04:49 -05:00