Commit graph

296 commits

Author SHA1 Message Date
Luke Parker
4913873b10
Slash reports (#523)
* report_slashes plumbing in Substrate

Notably delays the SetRetired event until it provides a slash report or the set
after it becomes the set to report its slashes.

* Add dedicated AcceptedHandover event

* Add SlashReport TX to Tributary

* Create SlashReport TXs

* Handle SlashReport TXs

* Add logic to generate a SlashReport to the coordinator

* Route SlashReportSigner into the processor

* Finish routing the SlashReport signing/TX publication

* Add serai feature to processor's serai-client
2024-01-29 03:48:53 -05:00
Luke Parker
3aa8007700
Add missing unwap to processor's test fn 2024-01-06 01:01:19 -05:00
Luke Parker
1ba2d8d832
Make monero-serai Block::number not panic on invalid blocks 2024-01-06 00:03:14 -05:00
Luke Parker
7eb388e546
PR to track down CI failures (#501)
* Use an extended timeout for DKGs specifically

* Add a log statement when message-queue connection fails

* Add a 60 second keep-alive to connections

* Use zalloc for processor/message-queue/coordinator

An additional layer which protects us against edge cases with Zeroizing
(objects which don't support it or don't miss it).

* Add further logs to message-queue

* Further increase re-attempt timeouts in CI

* Remove misplaced continue inmessage-queue client

Fixes observed CI failures.

* Revert "Further increase re-attempt timeouts in CI"

This reverts commit 3723530cf6.
2024-01-04 01:08:13 -05:00
Luke Parker
c2fffb9887
Correct a couple years of accumulated typos 2023-12-17 02:06:51 -05:00
Luke Parker
065d314e2a
Further expand clippy workspace lints
Achieves a notable amount of reduced async and clones.
2023-12-17 00:04:49 -05:00
Luke Parker
ea3af28139
Add workspace lints 2023-12-17 00:04:47 -05:00
Luke Parker
77edd00725
Handle the combination of DKG removals with re-attempts
With a DKG removal comes a reduction in the amount of participants which was
ignored by re-attempts.

Now, we determine n/i based on the parties removed, and deterministically
obtain the context of who was removd.
2023-12-13 14:03:07 -05:00
Luke Parker
6a172825aa
Reattempts (#483)
* Schedule re-attempts and add a (not filled out) match statement to actually execute them

A comment explains the methodology. To copy it here:

"""
This is because we *always* re-attempt any protocol which had participation. That doesn't
mean we *should* re-attempt this protocol.

The alternatives were:
1) Note on-chain we completed a protocol, halting re-attempts upon 34%.
2) Vote on-chain to re-attempt a protocol.

This schema doesn't have any additional messages upon the success case (whereas
alternative #1 does) and doesn't have overhead (as alternative #2 does, sending votes and
then preprocesses. This only sends preprocesses).
"""

Any signing protocol which reaches sufficient participation will be
re-attempted until it no longer does.

* Have the Substrate scanner track DKG removals/completions for the Tributary code

* Don't keep trying to publish a participant removal if we've already set keys

* Pad out the re-attempt match a bit more

* Have CosignEvaluator reload from the DB

* Correctly schedule cosign re-attempts

* Actuall spawn new DKG removal attempts

* Use u32 for Batch ID in SubstrateSignableId, finish Batch re-attempt routing

The batch ID was an opaque [u8; 5] which also included the network, yet that's
redundant and unhelpful.

* Clarify a pair of TODOs in the coordinator

* Remove old TODO

* Final comment cleanup

* Correct usage of TARGET_BLOCK_TIME in reattempt scheduler

It's in ms and I assumed it was in s.

* Have coordinator tests drop BatchReattempts which aren't relevant yet may exist

* Bug fix and pointless oddity removal

We scheduled a re-attempt upon receiving 2/3rds of preprocesses and upon
receiving 2/3rds of shares, so any signing protocol could cause two re-attempts
(not one more).

The coordinator tests randomly generated the Batch ID since it was prior an
opaque byte array. While that didn't break the test, it was pointless and did
make the already-succeeded check before re-attempting impossible to hit.

* Add log statements, correct dead-lock in coordinator tests

* Increase pessimistic timeout on recv_message to compensate for tighter best-case timeouts

* Further bump timeout by a minute

AFAICT, GH failed by just a few seconds.

This also is worst-case in a single instance, making it fine to be decently long.

* Further further bump timeout due to lack of distinct error
2023-12-12 12:28:53 -05:00
Luke Parker
11fdb6da1d
Coordinator Cleanup (#481)
* Move logic for evaluating if a cosign should occur to its own file

Cleans it up and makes it more robust.

* Have expected_next_batch return an error instead of retrying

While convenient to offer an error-free implementation, it potentially caused
very long lived lock acquisitions in handle_processor_message.

* Unify and clean DkgConfirmer and DkgRemoval

Does so via adding a new file for the common code, SigningProtocol.

Modifies from_cache to return the preprocess with the machine, as there's no
reason not to. Also removes an unused Result around the type.

Clarifies the security around deterministic nonces, removing them for
saved-to-disk cached preprocesses. The cached preprocesses are encrypted as the
DB is not a proper secret store.

Moves arguments always present in the protocol from function arguments into the
struct itself.

Removes the horribly ugly code in DkgRemoval, fixing multiple issues present
with it which would cause it to fail on use.

* Set SeraiBlockNumber in cosign.rs as it's used by the cosigning protocol

* Remove unnecessary Clone from lambdas in coordinator

* Remove the EventDb from Tributary scanner

We used per-Transaction DB TXNs so on error, we don't have to rescan the entire
block yet only the rest of it. We prevented scanning multiple transactions by
tracking which we already had.

This is over-engineered and not worth it.

* Implement borsh for HasEvents, removing the manual encoding

* Merge DkgConfirmer and DkgRemoval into signing_protocol.rs

Fixes a bug in DkgConfirmer which would cause it to improperly handle indexes
if any validator had multiple key shares.

* Strictly type DataSpecification's Label

* Correct threshold_i_map_to_keys_and_musig_i_map

It didn't include the participant's own index and accordingly was offset.

* Create TributaryBlockHandler

This struct contains all variables prior passed to handle_block and stops them
from being passed around again and again.

This also ensures fatal_slash is only called while handling a block, as needed
as it expects to operate under perfect consensus.

* Inline accumulate, store confirmation nonces with shares

Inlining accumulate makes sense due to the amount of data accumulate needed to
be passed.

Storing confirmation nonces with shares ensures that both are available or
neither. Prior, one could be yet the other may not have been (requiring an
assert in runtime to ensure we didn't bungle it somehow).

* Create helper functions for handling DkgRemoval/SubstrateSign/Sign Tributary TXs

* Move Label into SignData

All of our transactions which use SignData end up with the same common usage
pattern for Label, justifying this.

Removes 3 transactions, explicitly de-duplicating their handlers.

* Remove CurrentlyCompletingKeyPair for the non-contextual DkgKeyPair

* Remove the manual read/write for TributarySpec for borsh

This struct doesn't have any optimizations booned by the manual impl. Using
borsh reduces our scope.

* Use temporary variables to further minimize LoC in tributary handler

* Remove usage of tuples for non-trivial Tributary transactions

* Remove serde from dkg

serde could be used to deserialize intenrally inconsistent objects which could
lead to panics or faults.

The BorshDeserialize derives have been replaced with a manual implementation
which won't produce inconsistent objects.

* Abstract Future generics using new trait definitions in coordinator

* Move published_signed_transaction to tributary/mod.rs to reduce the size of main.rs

* Split coordinator/src/tributary/mod.rs into spec.rs and transaction.rs
2023-12-10 20:21:44 -05:00
Justin Berman
397fca748f
monero-serai: make it clear that not providing a change address is fingerprintable (#472)
* Make it clear not providing a change address is fingerprintable

When no change address is provided, all change is shunted to the
fee. This PR makes it clear to the caller that it is fingerprintable
when the caller does this.

* Review comments
2023-12-08 07:42:02 -05:00
Luke Parker
3a6c7ad796 Use TX IDs for Bitcoin Eventualities
They're a bit more binding, smaller, provided by the Rust bitcoin library,
sane, and we don't have to worry about malleability since all of our inputs are
SegWit.
2023-12-06 04:37:11 -05:00
Luke Parker
99c6375605
fmt 2023-12-03 00:06:13 -05:00
Luke Parker
6e8a5f9cb1
cargo update, remove unneeded dependencies from the processor 2023-12-03 00:05:03 -05:00
Luke Parker
b823413c9b
Use parity-db in current Dockerfiles (#455)
* Use redb and in Dockerfiles

The motivation for redb was to remove the multiple rocksdb compile times from
CI.

* Correct feature flagging of coordinator and message-queue in Dockerfiles

* Correct message-queue DB type alias

* Use consistent table typing in redb

* Correct rebase artifacts

* Correct removal of binaries feature from message-queue

* Correct processor feature flagging

* Replace redb with parity-db

It still has much better compile times yet doesn't block when creating multiple
transactions. It also is actively maintained and doesn't grow our tree. The MPT
aspects are irrelevant.

* Correct stray Redb

* clippy warning

* Correct txn get
2023-11-30 04:22:37 -05:00
Luke Parker
f0ff3a18d2
Use debug builds in our Dockerfiles to reduce CI times (#462)
* Use debug builds in our Dockerfiles to reduce CI times

Also enables only spawning the mdns service when debug in the coordinator.

* Correct underflow in processor

Prior undetected due to relase builds not having bounds checks enabled.

* Restore Serai release due to CI/RPC failures caused by compiling it in debug mode

This is *probably* worth an issue filed upstream, if it can be tracked down.

* Correct failing debug asserts in Monero

These debug asserts assumed there was a change address to take the remainder.
If there's no change address, the remainder is shunted to the fee, causing the
fee to be distinct from the estimate.

We presumably need to modify monero-serai such that change: None isn't valid,
and users must use Change::Fingerprintable(None).
2023-11-29 00:24:37 -05:00
Luke Parker
571195bfda
Resolve #360 (#456)
* Remove NetworkId from processor-messages

Because intent binds to the sender/receiver, it's not needed for intent.

The processor knows what the network is.

The coordinator knows which to use because it's sending this message to the
processor for that network.

Also removes the unused zeroize.

* ProcessorMessage::Completed use Session instead of key

* Move SubstrateSignId to Session

* Finish replacing key with session
2023-11-26 12:14:23 -05:00
Luke Parker
de14687a0d
Fix the processor's Monero time monotonicity
Monero doesn't assert the time increases with each block, solely that it
doesn't decrease. Now, the block number is added to the time to ensure it
increases.
2023-11-25 04:07:31 -05:00
Luke Parker
d60e007126
Add a binaries feature to the processor to reduce dependencies when used as a lib
processor isn't intended to be used as a library, yet serai-processor-tests
does pull it in as a lib. This caused serai-processor-tests to need to compile
rocksdb, which added multiple minutes to the compilation time.
2023-11-25 04:04:52 -05:00
Luke Parker
b296be8515
Replace bincode with borsh (#452)
* Add SignalsConfig to chain_spec

* Correct multiexp feature flagging for rand_core std

* Remove bincode for borsh

Replaces a non-canonical encoding with a canonical encoding which additionally
should be faster.

Also fixes an issue where we used bincode in transcripts where it cannot be
trusted.

This ended up fixing a myriad of other bugs observed, unfortunately.
Accordingly, it either has to be merged or the bug fixes from it must be ported
to a new PR.

* Make serde optional, minimize usage

* Make borsh an optional dependency of substrate/ crates

* Remove unused dependencies

* Use [u8; 64] where possible in the processor messages

* Correct borsh feature flagging
2023-11-25 04:01:11 -05:00
econsta
9ab2a2cfe0
processor/db.rs macro implimentation (#437)
* processor/db.rs macro implimentation

* ran clippy and fmt

* incorporated recommendations

* used empty uple instead of [u8; 0]

* ran fmt
2023-11-22 03:58:26 -05:00
Luke Parker
4b85d4b03b
Rename SubstrateSigner to BatchSigner 2023-11-20 22:21:52 -05:00
econsta
05d8c32be8
Multisig db (#444)
* implement db macro for processor/multisigs/db.rs

* ran fmt

* cargo +nightly fmt

---------

Co-authored-by: Luke Parker <lukeparker5132@gmail.com>
2023-11-20 21:40:54 -05:00
econsta
8634d90b6b
implement db macro for processor/signer.rs (#438)
* implement db macro for processor/signer.rs

* Use ()

* CompletedDb -> CompletionsDb

* () -> &()

---------

Co-authored-by: Luke Parker <lukeparker5132@gmail.com>
2023-11-20 03:03:45 -05:00
econsta
aa9daea6b0
implement db macro for processor/substrate_signer (#439)
* implement db macro for processor/substrate_signer

* Use ()

* Correct AttemptDb usage of ()

* () -> &()

---------

Co-authored-by: Luke Parker <lukeparker5132@gmail.com>
2023-11-20 03:03:35 -05:00
Luke Parker
797604ad73
Replace usage of io::Error::new(io::ErrorKind::Other, with io::Error::other
Newly possible with Rust 1.74.
2023-11-19 18:31:37 -05:00
Luke Parker
6e4ecbc90c
Support trailing commas in create_db 2023-11-18 20:54:37 -05:00
Luke Parker
369af0fab5
\#339 addendum 2023-11-15 20:23:19 -05:00
Luke Parker
96f1d26f7a
Add a cosigning protocol to ensure finalizations are unique (#433)
* Add a function to deterministically decide which Serai blocks should be co-signed

Has a 5 minute latency between co-signs, also used as the maximal latency
before a co-sign is started.

* Get all active tributaries we're in at a specific block

* Add and route CosignSubstrateBlock, a new provided TX

* Split queued cosigns per network

* Rename BatchSignId to SubstrateSignId

* Add SubstrateSignableId, a meta-type for either Batch or Block, and modularize around it

* Handle the CosignSubstrateBlock provided TX

* Revert substrate_signer.rs to develop (and patch to still work)

Due to SubstrateSigner moving when the prior multisig closes, yet cosigning
occurring with the most recent key, a single SubstrateSigner can be reused.
We could manage multiple SubstrateSigners, yet considering the much lower
specifications for cosigning, I'd rather treat it distinctly.

* Route cosigning through the processor

* Add note to rename SubstrateSigner post-PR

I don't want to do so now in order to preserve the diff's clarity.

* Implement cosign evaluation into the coordinator

* Get tests to compile

* Bug fixes, mark blocks without cosigners available as cosigned

* Correct the ID Batch preprocesses are saved under, add log statements

* Create a dedicated function to handle cosigns

* Correct the flow around Batch verification/queueing

Verifying `Batch`s could stall when a `Batch` was signed before its
predecessors/before the block it's contained in was cosigned (the latter being
inevitable as we can't sign a block containing a signed batch before signing
the batch).

Now, Batch verification happens on a distinct async task in order to not block
the handling of processor messages. This task is the sole caller of verify in
order to ensure last_verified_batch isn't unexpectedly mutated.

When the processor message handler needs to access it, or needs to queue a
Batch, it associates the DB TXN with a lock preventing the other task from
doing so.

This lock, as currently implemented, is a poor and inefficient design. It
should be modified to the pattern used for cosign management. Additionally, a
new primitive of a DB-backed channel may be immensely valuable.

Fixes a standing potential deadlock and a deadlock introduced with the
cosigning protocol.

* Working full-stack tests

After the last commit, this only required extending a timeout.

* Replace "co-sign" with "cosign" to make finding text easier

* Update the coordinator tests to support cosigning

* Inline prior_batch calculation to prevent panic on rotation

Noticed when doing a final review of the branch.
2023-11-15 16:57:21 -05:00
Luke Parker
54f1929078
Route blame between Processor and Coordinator (#427)
* Have processor report errors during the DKG to the coordinator

* Add RemoveParticipant, InvalidDkgShare to coordinator

* Route DKG blame around coordinator

* Allow public construction of AdditionalBlameMachine

Necessary for upcoming work on handling DKG blame in the processor and
coordinator.

Additionally fixes a publicly reachable panic when commitments parsed with one
ThresholdParams are used in a machine using another set of ThresholdParams.

Renames InvalidProofOfKnowledge to InvalidCommitments.

* Remove unused error from dleq

* Implement support for VerifyBlame in the processor

* Have coordinator send the processor share message relevant to Blame

* Remove desync between processors reporting InvalidShare and ones reporting GeneratedKeyPair

* Route blame on sign between processor and coordinator

Doesn't yet act on it in coordinator.

* Move txn usage as needed for stable Rust to build

* Correct InvalidDkgShare serialization
2023-11-12 07:24:41 -05:00
Luke Parker
ed2445390f
Replace post-detection of if a Plan is forwarded by noting if it's from the scanner 2023-11-09 14:54:38 -05:00
Luke Parker
52a0c56016
Rename Network::address to Network::external_address
Improves clarity since we now have 4 addresses.
2023-11-09 14:31:46 -05:00
Luke Parker
42e8f2c8d8
Add OutputType::Forwarded to ensure a user's transfer in isn't misclassified
If a user transferred in without an InInstruction, and the amount exactly
matched a forwarded output, the user's output would fulfill the
forwarding. Then the forwarded output would come along, have no InInstruction,
and be refunded (to the prior multisig) when the user should've been refunded.

Adding this new address type resolves such concerns.
2023-11-09 14:24:13 -05:00
Luke Parker
ec51fa233a
Document an accepted false positive 2023-11-09 12:41:15 -05:00
Luke Parker
ce4091695f
Document a choice of variable name 2023-11-09 12:38:06 -05:00
Luke Parker
24919cfc54
Resolve race condition regarding when forwarded output is set
The higher-level scanner code in multisigs/mod.rs now creates a series of plans
with limited context. These include forwarding and refunding plans, moving all
handling of forwarding flags on the scanner's clock and therefore safe.

Also simplifies the refunding a decent bit.
2023-11-09 12:37:07 -05:00
Luke Parker
bf41009c5a
Document critical race condition due to two distinct clocks operating over the same data 2023-11-09 08:41:22 -05:00
Luke Parker
e8e9e212df
Move additional functions which retry until success into Network trait 2023-11-09 07:16:15 -05:00
Luke Parker
19187d2c30
Implement calculation of monotonic network times for Bitcoin and Monero 2023-11-09 07:02:52 -05:00
Luke Parker
43ae6794db
Remove invalid TODOs from processor signers 2023-11-09 03:53:30 -05:00
Luke Parker
978134a9d1
Remove events from SubstrateSigner
Same vibes as prior commit.
2023-11-09 01:56:09 -05:00
Luke Parker
2eb155753a
Remove the Signer events pseudo-channel for a returned message
Also replaces SignerEvent with usage of ProcessorMessage directly.
2023-11-09 01:26:30 -05:00
Luke Parker
7d72e224f0
Remove Output::amount and move Payment from Amount to Balance
This code is still largely designed around the idea a payment for a network is
fungible with any other, which isn't true. This starts moving past that.

Asserts are added to ensure the integrity of coin to the scheduler (which is
now per key per coin, not per key alone) and in Bitcoin/Monero prepare_send.
2023-11-08 23:33:25 -05:00
Luke Parker
ffedba7a05
Update processor tests to refund logic 2023-11-08 21:59:11 -05:00
Luke Parker
06e627a562
Support refunds as possible for invalidly received outputs on Serai 2023-11-08 11:26:28 -05:00
Luke Parker
a688350f44
Have processor's Network::new sleep until booted, not panic 2023-11-08 03:21:28 -05:00
Luke Parker
56fd11ab8d
Use a single long-lived RPC connection when authenticated
The prior system spawned a new connection per request to enable parallelism,
yet kept hitting hyper::IncompleteMessages I couldn't track down. This
attempts to resolve those by a long-lived socket.

Halves the amount of requests per-authenticated RPC call, and accordingly is
likely still better overall.

I don't believe this is resolved yet but this is still worth pushing.
2023-11-07 17:42:19 -05:00
Luke Parker
c03fb6c71b
Add dedicated BatchSignId 2023-11-06 20:06:36 -05:00
Luke Parker
cddb44ae3f
Bitcoin tweaks + cargo update
Removes bitcoin-serai's usage of sha2 for bitcoin-hashes. While sha2 is still
in play due to modular-frost (more specifically, due to ciphersuite), this
offers a bit more performance (assuming equivalency between sha2 and
bitcoin-hashes' impl) due to removing a static for a const.

Makes secp256k1 a dev dependency for bitcoin-serai. While secp256k1 is still
pulled in via bitcoin, it's hopefully slightly better to compile now and makes
usage of secp256k1 an implementation detail of bitcoin (letting it change it
freely).

Also offers slightly more efficient signing as we don't decode to a signature
just to re-encode for the transaction.

Removes a 20s sleep for a check every second, up to 20 times, for reduced test
times in the processor.
2023-11-06 07:38:36 -05:00
hinto.janai
bd3272a9f2 replace lazy_static! with once_cell::sync::Lazy 2023-11-06 05:31:46 -05:00