* Rewrite tendermint's message handling loop to much more clearly match the paper
No longer checks relevant branches upon messages, yet all branches upon any
state change. This is slower, yet easier to review and likely without one or
two rare edge cases.
When reviewing, please see page 5 of https://arxiv.org/pdf/1807.04938.pdf.
Lines from the specified algorithm can be found in the code by searching for
"// L".
* Sane rebroadcasting of consensus messages
Instead of broadcasting the last n messages on the Tributary side of things, we
now have the machine rebroadcast the message tape for the current block.
* Only rebroadcast messages which didn't error in some way
* Only rebroadcast our own messages for tendermint
Instead of saving, for every sent message, if it was sent or not, we track the
latest block/round participated in. These two keys are comprehensive to all
prior block/rounds. We then use three keys for the latest round's
proposal/prevote/precommit, enabling tracking current state as necessary to
prevent equivocations with just 5 keys.
The storage of the latest three messages also enables proper rebroadcasting of
the current round (not implemented in this commit).
Online validators should inherently have them. Offline validators will receive
from the sync protocol.
This does somewhat eliminate the class of nodes who would follow the blockchain
(without validating it), yet that's fine for the performance benefit.
Part of https://github.com/serai-dex/serai/issues/345.
The lack of full DB persistence does mean enough nodes rebooting at the same
time may cause a halt. This will prevent slashes.
* complete various todos
* fix pr comments
* Document bounds on unique hashes in TransactionKind
---------
Co-authored-by: Luke Parker <lukeparker5132@gmail.com>
* report_slashes plumbing in Substrate
Notably delays the SetRetired event until it provides a slash report or the set
after it becomes the set to report its slashes.
* Add dedicated AcceptedHandover event
* Add SlashReport TX to Tributary
* Create SlashReport TXs
* Handle SlashReport TXs
* Add logic to generate a SlashReport to the coordinator
* Route SlashReportSigner into the processor
* Finish routing the SlashReport signing/TX publication
* Add serai feature to processor's serai-client
Not only is this more performant, the definition of retired won't be if a newer
session is active. It will be if the session has posted a slash report or the
stake for that session has unlocked.
Initial commit towards implementing SlashReports.
* Use an extended timeout for DKGs specifically
* Add a log statement when message-queue connection fails
* Add a 60 second keep-alive to connections
* Use zalloc for processor/message-queue/coordinator
An additional layer which protects us against edge cases with Zeroizing
(objects which don't support it or don't miss it).
* Add further logs to message-queue
* Further increase re-attempt timeouts in CI
* Remove misplaced continue inmessage-queue client
Fixes observed CI failures.
* Revert "Further increase re-attempt timeouts in CI"
This reverts commit 3723530cf6.
* Route validators for any active set through sc-authority-discovery
Additionally adds an RPC route to retrieve their P2P addresses.
* Have the coordinator get peers from substrate
* Have the RPC return one address, not up to 3
Prevents the coordinator from believing it has 3 peers when it has one.
* Add missing feature to serai-client
* Correct network argument in serai-client for p2p_validators call
* Add a test in serai-client to check DHT population with a much quicker failure than the coordinator tests
* Update to latest Substrate
Removes distinguishing BABE/AuthorityDiscovery keys which causes
sc_authority_discovery to populate as desired.
* Update to a properly tagged substrate commit
* Add all dialed to peers to GossipSub
* cargo fmt
* Reduce common code in serai-coordinator-tests with amore involved new_test
* Use a recursive async function to spawn `n` DockerTests with the necessary networking configuration
* Merge UNIQUE_ID and ONE_AT_A_TIME
* Tidy up the new recursive code in tests/coordinator
* Use a Mutex in CONTEXT to let it be set multiple times
* Make complimentary edits to full-stack tests
* Augment coordinator P2p connection logs
* Drop lock acquisitions before recursing
* Better scope lock acquisitions in full-stack, preventing a deadlock
* Ensure OUTER_OPS is reset across the test boundary
* Add cargo deny allowance for dockertest fork
Slight downscope which helps combat the antipattern which is the futures glob
crate. While futures_util is still a large crate, it has better defaults and
is smaller by virtue of not pulling the executor.
Makes RemoveParticipantDueToDkg a voted-on event instead of a Provided.
This removes the requirement for offline parties to be able to fully validate
blame, yet unfortunately lets an dishonest supermajority have an honest node
label any arbitrary node as dishonest.
Corrects a variety of `.i(...)` calls which panicked when they shouldn't have.
Cleans up a couple no-longer-used storage values.