serai

mirror of https://github.com/serai-dex/serai.git synced 2024-12-23 12:09:37 +00:00

Author	SHA1	Message	Date
Luke Parker	a62d2d05ad	Correct log which didn't work as intended	2024-04-20 19:55:17 -04:00
Luke Parker	967cc16748	Correct log targets in tendermint-machine	2024-04-20 19:55:06 -04:00
Luke Parker	ab4b8cc2d5	Better logs in tendermint-machine	2024-04-20 18:13:57 -04:00
Luke Parker	387ccbad3a	Extend time in sync test	2024-04-18 16:39:16 -04:00
Luke Parker	26cdfdd824	fmt	2024-04-18 16:39:03 -04:00
Luke Parker	68e77384ac	Don't broadcast added blocks Online validators should inherently have them. Offline validators will receive from the sync protocol. This does somewhat eliminate the class of nodes who would follow the blockchain (without validating it), yet that's fine for the performance benefit.	2024-04-18 16:38:52 -04:00
Luke Parker	68da88c1f3	Only reply to heartbeats after a certain distance	2024-04-18 16:38:43 -04:00
Luke Parker	2b481ab71e	Ensure we don't reply to stale heartbeats	2024-04-18 16:38:21 -04:00
Luke Parker	05e6d81948	Only have some nodes respond to latent heartbeats Also only respond if they're more than 2 blocks behind to minimize redundant sending of blocks.	2024-04-18 16:38:16 -04:00
Luke Parker	10124ac4a8	Add Testnet 2 Config Starts Tuesday, April 16th, with confirmed keys/boot nodes.	2024-04-11 15:49:32 -04:00
Luke Parker	bc44fbdbac	Add TODO to coordinator P2P	2024-03-23 23:32:21 -04:00
Luke Parker	4cacce5e55	Perform key share amortization on-chain to avoid discrepancies	2024-03-23 23:32:14 -04:00
Luke Parker	b7d49af1d5	Track total peer count in the coordinator	2024-03-23 18:02:48 -04:00
Luke Parker	4914420a37	Don't add as an explicit peer if already connected	2024-03-22 23:51:51 -04:00
Luke Parker	f11a08c436	Peer finding which won't get stuck on one specific network	2024-03-22 23:47:43 -04:00
Luke Parker	35b58a45bd	Split peer finding into a dedicated task	2024-03-22 23:40:15 -04:00
Luke Parker	af9b1ad5f9	Initial pruning of backlogged consensus messages	2024-03-22 23:18:53 -04:00
Luke Parker	2f07d04d88	Extend timeout for rebroadcast of consensus messages in coordinator	2024-03-22 16:06:31 -04:00
Luke Parker	0889627e60	Typo fix for prior commit	2024-03-11 02:20:51 -04:00
Luke Parker	ace41c79fd	Tidy the BlockHasEvents cache	2024-03-11 01:44:00 -04:00
Luke Parker	f7d16b3fc5	Fix 0 - 1 which caused a panic	2024-03-09 05:37:41 -05:00
Luke Parker	6374d9987e	Correct how we save the block to scan from	2024-03-09 03:48:44 -05:00
Luke Parker	c93f6bf901	Replace yield_now with sleep 100 to prevent hammering a task, despite still being over-eager	2024-03-09 03:34:31 -05:00
Luke Parker	61a81e53e1	Further optimize cosign DB	2024-03-09 03:31:06 -05:00
Luke Parker	89b237af7e	Correct the return value of block_has_events	2024-03-09 02:44:04 -05:00
Luke Parker	2347bf5fd3	Bound cosign work and ensure it progress forward even when cosigns don't occur Should resolve the DB load observed on testnet.	2024-03-09 02:20:23 -05:00
Luke Parker	454bebaa77	Have the TendermintMachine domain-separate by genesis Enbables support for multiple machines over the same DB.	2024-03-08 01:22:02 -05:00
Luke Parker	e266bc2e32	Stop validators from equivocating on reboot Part of https://github.com/serai-dex/serai/issues/345. The lack of full DB persistence does mean enough nodes rebooting at the same time may cause a halt. This will prevent slashes.	2024-03-07 22:56:35 -05:00
Luke Parker	f0694172ef	Fix potential generation of invalid SignData in shim	2024-02-09 02:52:08 -05:00
akildemir	347d4cf413	Fix tendermint distinct precommit bug (#517 ) * fix tendermint distinct precommit bug * remove conflicting precommit error	2024-02-08 13:47:37 -05:00
akildemir	ad0ecc5185	complete various todos in tributary (#520 ) * complete various todos * fix pr comments * Document bounds on unique hashes in TransactionKind --------- Co-authored-by: Luke Parker <lukeparker5132@gmail.com>	2024-02-05 03:50:55 -05:00
Luke Parker	4913873b10	Slash reports (#523 ) * report_slashes plumbing in Substrate Notably delays the SetRetired event until it provides a slash report or the set after it becomes the set to report its slashes. * Add dedicated AcceptedHandover event * Add SlashReport TX to Tributary * Create SlashReport TXs * Handle SlashReport TXs * Add logic to generate a SlashReport to the coordinator * Route SlashReportSigner into the processor * Finish routing the SlashReport signing/TX publication * Add serai feature to processor's serai-client	2024-01-29 03:48:53 -05:00
Luke Parker	f3429ec1ef	Inside publish (for a Serai transaction from the coordinator), use RetiredDb over latest session Not only is this more performant, the definition of retired won't be if a newer session is active. It will be if the session has posted a slash report or the stake for that session has unlocked. Initial commit towards implementing SlashReports.	2024-01-05 23:40:15 -05:00
Luke Parker	7eb388e546	PR to track down CI failures (#501 ) * Use an extended timeout for DKGs specifically * Add a log statement when message-queue connection fails * Add a 60 second keep-alive to connections * Use zalloc for processor/message-queue/coordinator An additional layer which protects us against edge cases with Zeroizing (objects which don't support it or don't miss it). * Add further logs to message-queue * Further increase re-attempt timeouts in CI * Remove misplaced continue inmessage-queue client Fixes observed CI failures. * Revert "Further increase re-attempt timeouts in CI" This reverts commit `3723530cf6`.	2024-01-04 01:08:13 -05:00
Luke Parker	02776c54a8	Increase reattempt delays in the GH CI, which is extremely latent	2023-12-30 22:11:04 -05:00
Luke Parker	ec8dfd4639	Correct SignData serialization test from creating 256 signers of data This overflows the u8 allowed and caused a CI failure. The actual code/assumption is fine.	2023-12-30 19:08:29 -05:00
Luke Parker	b493e3e31f	Validator DHT (#494 ) * Route validators for any active set through sc-authority-discovery Additionally adds an RPC route to retrieve their P2P addresses. * Have the coordinator get peers from substrate * Have the RPC return one address, not up to 3 Prevents the coordinator from believing it has 3 peers when it has one. * Add missing feature to serai-client * Correct network argument in serai-client for p2p_validators call * Add a test in serai-client to check DHT population with a much quicker failure than the coordinator tests * Update to latest Substrate Removes distinguishing BABE/AuthorityDiscovery keys which causes sc_authority_discovery to populate as desired. * Update to a properly tagged substrate commit * Add all dialed to peers to GossipSub * cargo fmt * Reduce common code in serai-coordinator-tests with amore involved new_test * Use a recursive async function to spawn `n` DockerTests with the necessary networking configuration * Merge UNIQUE_ID and ONE_AT_A_TIME * Tidy up the new recursive code in tests/coordinator * Use a Mutex in CONTEXT to let it be set multiple times * Make complimentary edits to full-stack tests * Augment coordinator P2p connection logs * Drop lock acquisitions before recursing * Better scope lock acquisitions in full-stack, preventing a deadlock * Ensure OUTER_OPS is reset across the test boundary * Add cargo deny allowance for dockertest fork	2023-12-22 21:09:18 -05:00
Luke Parker	00774c29d7	Replace remaining direct uses of futures with futures_util Slight downscope which helps combat the antipattern which is the futures glob crate. While futures_util is still a large crate, it has better defaults and is smaller by virtue of not pulling the executor.	2023-12-18 19:45:08 -05:00
Luke Parker	a4c82632fb	Use pub(crate) for create_db items, not pub	2023-12-18 17:15:02 -05:00
Luke Parker	c8747e23c5	Remove offline participants from future DKG protocols so long as the threshold is met Makes RemoveParticipantDueToDkg a voted-on event instead of a Provided. This removes the requirement for offline parties to be able to fully validate blame, yet unfortunately lets an dishonest supermajority have an honest node label any arbitrary node as dishonest. Corrects a variety of `.i(...)` calls which panicked when they shouldn't have. Cleans up a couple no-longer-used storage values.	2023-12-18 17:14:51 -05:00
Luke Parker	c2fffb9887	Correct a couple years of accumulated typos	2023-12-17 02:06:51 -05:00
Luke Parker	065d314e2a	Further expand clippy workspace lints Achieves a notable amount of reduced async and clones.	2023-12-17 00:04:49 -05:00
Luke Parker	ea3af28139	Add workspace lints	2023-12-17 00:04:47 -05:00
Luke Parker	2532423d42	Remove the RemoveParticipant protocol for having new DKGs specify the participants which were removed Obvious code cleanup is obvious.	2023-12-14 23:51:57 -05:00
Luke Parker	b60e3c2524	Replace PSTTrait and PstTxType with PublishSeraiTransaction	2023-12-14 16:06:08 -05:00
Luke Parker	77edd00725	Handle the combination of DKG removals with re-attempts With a DKG removal comes a reduction in the amount of participants which was ignored by re-attempts. Now, we determine n/i based on the parties removed, and deterministically obtain the context of who was removd.	2023-12-13 14:03:07 -05:00
Luke Parker	6a172825aa	Reattempts (#483 ) * Schedule re-attempts and add a (not filled out) match statement to actually execute them A comment explains the methodology. To copy it here: """ This is because we always re-attempt any protocol which had participation. That doesn't mean we should re-attempt this protocol. The alternatives were: 1) Note on-chain we completed a protocol, halting re-attempts upon 34%. 2) Vote on-chain to re-attempt a protocol. This schema doesn't have any additional messages upon the success case (whereas alternative #1 does) and doesn't have overhead (as alternative #2 does, sending votes and then preprocesses. This only sends preprocesses). """ Any signing protocol which reaches sufficient participation will be re-attempted until it no longer does. * Have the Substrate scanner track DKG removals/completions for the Tributary code * Don't keep trying to publish a participant removal if we've already set keys * Pad out the re-attempt match a bit more * Have CosignEvaluator reload from the DB * Correctly schedule cosign re-attempts * Actuall spawn new DKG removal attempts * Use u32 for Batch ID in SubstrateSignableId, finish Batch re-attempt routing The batch ID was an opaque [u8; 5] which also included the network, yet that's redundant and unhelpful. * Clarify a pair of TODOs in the coordinator * Remove old TODO * Final comment cleanup * Correct usage of TARGET_BLOCK_TIME in reattempt scheduler It's in ms and I assumed it was in s. * Have coordinator tests drop BatchReattempts which aren't relevant yet may exist * Bug fix and pointless oddity removal We scheduled a re-attempt upon receiving 2/3rds of preprocesses and upon receiving 2/3rds of shares, so any signing protocol could cause two re-attempts (not one more). The coordinator tests randomly generated the Batch ID since it was prior an opaque byte array. While that didn't break the test, it was pointless and did make the already-succeeded check before re-attempting impossible to hit. * Add log statements, correct dead-lock in coordinator tests * Increase pessimistic timeout on recv_message to compensate for tighter best-case timeouts * Further bump timeout by a minute AFAICT, GH failed by just a few seconds. This also is worst-case in a single instance, making it fine to be decently long. * Further further bump timeout due to lack of distinct error	2023-12-12 12:28:53 -05:00
Luke Parker	11fdb6da1d	Coordinator Cleanup (#481 ) * Move logic for evaluating if a cosign should occur to its own file Cleans it up and makes it more robust. * Have expected_next_batch return an error instead of retrying While convenient to offer an error-free implementation, it potentially caused very long lived lock acquisitions in handle_processor_message. * Unify and clean DkgConfirmer and DkgRemoval Does so via adding a new file for the common code, SigningProtocol. Modifies from_cache to return the preprocess with the machine, as there's no reason not to. Also removes an unused Result around the type. Clarifies the security around deterministic nonces, removing them for saved-to-disk cached preprocesses. The cached preprocesses are encrypted as the DB is not a proper secret store. Moves arguments always present in the protocol from function arguments into the struct itself. Removes the horribly ugly code in DkgRemoval, fixing multiple issues present with it which would cause it to fail on use. * Set SeraiBlockNumber in cosign.rs as it's used by the cosigning protocol * Remove unnecessary Clone from lambdas in coordinator * Remove the EventDb from Tributary scanner We used per-Transaction DB TXNs so on error, we don't have to rescan the entire block yet only the rest of it. We prevented scanning multiple transactions by tracking which we already had. This is over-engineered and not worth it. * Implement borsh for HasEvents, removing the manual encoding * Merge DkgConfirmer and DkgRemoval into signing_protocol.rs Fixes a bug in DkgConfirmer which would cause it to improperly handle indexes if any validator had multiple key shares. * Strictly type DataSpecification's Label * Correct threshold_i_map_to_keys_and_musig_i_map It didn't include the participant's own index and accordingly was offset. * Create TributaryBlockHandler This struct contains all variables prior passed to handle_block and stops them from being passed around again and again. This also ensures fatal_slash is only called while handling a block, as needed as it expects to operate under perfect consensus. * Inline accumulate, store confirmation nonces with shares Inlining accumulate makes sense due to the amount of data accumulate needed to be passed. Storing confirmation nonces with shares ensures that both are available or neither. Prior, one could be yet the other may not have been (requiring an assert in runtime to ensure we didn't bungle it somehow). * Create helper functions for handling DkgRemoval/SubstrateSign/Sign Tributary TXs * Move Label into SignData All of our transactions which use SignData end up with the same common usage pattern for Label, justifying this. Removes 3 transactions, explicitly de-duplicating their handlers. * Remove CurrentlyCompletingKeyPair for the non-contextual DkgKeyPair * Remove the manual read/write for TributarySpec for borsh This struct doesn't have any optimizations booned by the manual impl. Using borsh reduces our scope. * Use temporary variables to further minimize LoC in tributary handler * Remove usage of tuples for non-trivial Tributary transactions * Remove serde from dkg serde could be used to deserialize intenrally inconsistent objects which could lead to panics or faults. The BorshDeserialize derives have been replaced with a manual implementation which won't produce inconsistent objects. * Abstract Future generics using new trait definitions in coordinator * Move published_signed_transaction to tributary/mod.rs to reduce the size of main.rs * Split coordinator/src/tributary/mod.rs into spec.rs and transaction.rs	2023-12-10 20:21:44 -05:00
Luke Parker	6caf45ea1d	Downscope usage of futures	2023-12-10 19:32:52 -05:00
Luke Parker	7122e0faf4	Cache the block's events within TemporalSerai Event retrieval was prior: - Retrieve all events in the block, which may be hundreds of KB - Filter to just a few Since it's frequent to want multiple sets of events, each filtered in their own way, this caused the retrieval to happen multiple times. Now, it only will happen once. Also has the scoped clients take a reference, not an owned TemporalSerai.	2023-12-08 10:46:10 -05:00

1 2 3 4 5 ...

306 commits