serai/processor/src/tests/batch_signer.rs


use std::collections::HashMap;
use rand_core::{RngCore, OsRng};
use ciphersuite::group::GroupEncoding;
use frost::{
  curve::Ristretto,
  Participant,
  dkg::tests::{key_gen, clone_without},
};
use sp_application_crypto::{RuntimePublic, sr25519::Public};
use serai_db::{DbTxn, Db, MemDb};
use scale::Encode;
#[rustfmt::skip]
use serai_client::{primitives::*, in_instructions::primitives::*, validator_sets::primitives::Session};
use messages::{
  substrate,
  coordinator::{self, SubstrateSignableId, SubstrateSignId, CoordinatorMessage},
  ProcessorMessage,
};
use crate::batch_signer::BatchSigner;
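// Exercises the full batch-signing flow: every participant emits a preprocess, a
// random t-of-n signing set exchanges preprocesses and shares, and the resulting
// SignedBatch must verify under the group key.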
#[tokio::test]
async fn test_batch_signer() {
  let keys = key_gen::<_, Ristretto>(&mut OsRng);

  let participant_one = Participant::new(1).unwrap();

  let id: u32 = 5;
  let block = BlockHash([0xaa; 32]);
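  // Construct a Batch with a pair of arbitrary instructions to sign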
  let batch = Batch {
    network: NetworkId::Monero,
    id,
    block,
    instructions: vec![
      InInstructionWithBalance {
        instruction: InInstruction::Transfer(SeraiAddress([0xbb; 32])),
        balance: Balance { coin: Coin::Bitcoin, amount: Amount(1000) },
      },
      InInstructionWithBalance {
        instruction: InInstruction::Dex(DexCall::SwapAndAddLiquidity(SeraiAddress([0xbb; 32]))),
        balance: Balance { coin: Coin::Monero, amount: Amount(9999999999999999) },
      },
    ],
  };
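  // The ID this signing attempt should run under: the SCALE encoding of (network, id)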
  let actual_id = SubstrateSignId {
    session: Session(0),
    id: SubstrateSignableId::Batch((batch.network, batch.id).encode().try_into().unwrap()),
    attempt: 0,
  };
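  // Randomly pick a threshold-sized set of distinct participants to sign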
  let mut signing_set = vec![];
  while signing_set.len() < usize::from(keys.values().next().unwrap().params().t()) {
    let candidate = Participant::new(
      u16::try_from((OsRng.next_u64() % u64::try_from(keys.len()).unwrap()) + 1).unwrap(),
    )
    .unwrap();
    if signing_set.contains(&candidate) {
      continue;
    }
    signing_set.push(candidate);
  }
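  // Have every participant spawn a BatchSigner and sign the batch, collecting the
  // preprocesses of the chosen signing set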
  let mut signers = HashMap::new();
  let mut dbs = HashMap::new();
  let mut preprocesses = HashMap::new();
  for i in 1 ..= keys.len() {
    let i = Participant::new(u16::try_from(i).unwrap()).unwrap();
    let keys = keys.get(&i).unwrap().clone();

    let mut signer = BatchSigner::<MemDb>::new(NetworkId::Monero, Session(0), vec![keys]);
    let mut db = MemDb::new();

    let mut txn = db.txn();
    match signer.sign(&mut txn, batch.clone()).await.unwrap() {
      // All participants should emit a preprocess
      coordinator::ProcessorMessage::BatchPreprocess {
        id,
        block: batch_block,
        preprocesses: mut these_preprocesses,
      } => {
        assert_eq!(id, actual_id);
        assert_eq!(batch_block, block);
        assert_eq!(these_preprocesses.len(), 1);
        if signing_set.contains(&i) {
          preprocesses.insert(i, these_preprocesses.swap_remove(0));
        }
      }
      _ => panic!("didn't get preprocess back"),
    }
    txn.commit();

    signers.insert(i, signer);
    dbs.insert(i, db);
  }
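  // Hand each chosen signer everyone else's preprocesses, expecting a share back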
  let mut shares = HashMap::new();
  for i in &signing_set {
    let mut txn = dbs.get_mut(i).unwrap().txn();
    match signers
      .get_mut(i)
      .unwrap()
      .handle(
        &mut txn,
        CoordinatorMessage::SubstratePreprocesses {
          id: actual_id.clone(),
          preprocesses: clone_without(&preprocesses, i),
        },
      )
      .await
      .unwrap()
    {
      ProcessorMessage::Coordinator(coordinator::ProcessorMessage::SubstrateShare {
        id,
        shares: mut these_shares,
      }) => {
        assert_eq!(id, actual_id);
        assert_eq!(these_shares.len(), 1);
        shares.insert(*i, these_shares.swap_remove(0));
      }
      _ => panic!("didn't get share back"),
    }
    txn.commit();
  }
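  // Hand the shares around; each signer should now produce the same SignedBatch,
  // with a signature valid under the group key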
  for i in &signing_set {
    let mut txn = dbs.get_mut(i).unwrap().txn();
    match signers
      .get_mut(i)
      .unwrap()
      .handle(
        &mut txn,
        CoordinatorMessage::SubstrateShares {
          id: actual_id.clone(),
          shares: clone_without(&shares, i),
        },
      )
      .await
      .unwrap()
    {
      ProcessorMessage::Substrate(substrate::ProcessorMessage::SignedBatch {
        batch: signed_batch,
      }) => {
        assert_eq!(signed_batch.batch, batch);
        assert!(Public::from_raw(keys[&participant_one].group_key().to_bytes())
          .verify(&batch_message(&batch), &signed_batch.signature));
      }
      _ => panic!("didn't get signed batch back"),
    }
    txn.commit();
  }
}