Commit graph

3 commits

Author SHA1 Message Date
Luke Parker
aea6ac104f
Remove Tendermint for GRANDPA
Updates to polkadot-v0.9.40, with a variety of dependency updates accordingly.
Substrate thankfully now uses k256 0.13, paving the way for #256. We couldn't
upgrade to polkadot-v0.9.40 without this due to polkadot-v0.9.40 having
fundamental changes to syncing. While we could've updated tendermint, it's not
worth the continued development effort given its inability to work with
multiple validator sets.

Purges sc-tendermint. Keeps tendermint-machine for #163.

Closes #137, #148, #157, #171. #96 and #99 should be re-scoped/clarified. #134
and #159 also should be clarified. #169 is also no longer a priority since
we're only considering temporal deployments of tendermint. #170 also isn't
since we're looking at effectively sharded validator sets, so there should
be no singular large set needing high performance.
2023-03-26 16:49:18 -04:00
Luke Parker
6085a8bb9d
Update runtime commentary in Tendermint
2022-12-05 09:04:50 -05:00
Luke Parker
8f4d6f79f3
Initial Tendermint implementation (#145)
* Machine without timeouts

* Time code

* Move substrate/consensus/tendermint to substrate/tendermint

* Delete the old paper doc

* Refactor out external parts to generics

Also creates a dedicated file for the message log.

* Refactor <V, B> to type V, type B

* Successfully compiling

* Calculate timeouts

* Fix test

* Finish timeouts

* Misc cleanup

* Define a signature scheme trait

* Implement serialization via parity's scale codec

Ideally, this would be generic. Unfortunately, serde, the generic API,
doesn't natively support borsh or SCALE, and while there is a serde
SCALE crate, it's old. While it may be complete, it's not worth working
with.

While we could still grab bincode, and a variety of other formats, it 
wasn't worth it to go custom and for Serai, we'll be using SCALE almost 
everywhere anyways.

* Implement usage of the signature scheme

* Make the infinite test non-infinite

* Provide a dedicated signature in Precommit of just the block hash

Greatly simplifies verifying when syncing.

* Dedicated Commit object

Restores sig aggregation API.

* Tidy README

* Document tendermint

* Sign the ID directly instead of its SCALE encoding

For a hash, which is fixed-size, these should be the same yet this helps 
move past the dependency on SCALE. It also, for any type where the two 
values are different, smooths integration.

* Litany of bug fixes

Also attempts to make the code more readable while updating/correcting 
documentation.

* Remove async recursion

Greatly increases safety as well by ensuring only one message is 
processed at once.

* Correct timing issues

1) Commit didn't include the round, leaving the clock in question.

2) Machines started with a local time, instead of a proper start time.

3) Machines immediately started the next block instead of waiting for 
the block time.

* Replace MultiSignature with sr25519::Signature

* Minor SignatureScheme API changes

* Map TM SignatureScheme to Substrate's sr25519

* Initial work on an import queue

* Properly use check_block

* Rename import to import_queue

* Implement tendermint_machine::Block for Substrate Blocks

Unfortunately, this immediately makes Tendermint machine incapable of
deployment as a crate since it uses a git reference. In the future, a
Cargo.toml patch section for serai/substrate should be investigated. 
This is being done regardless as it's the quickest way forward and this 
is for Serai.

* Dummy Weights

* Move documentation to the top of the file

* Move logic into TendermintImport itself

Multiple traits exist to verify/handle blocks. I'm unsure exactly when 
each will be called in the pipeline, so the easiest solution is to have 
every step run every check.

That would be extremely computationally expensive if we ran EVERY check, 
yet we rely on Substrate for execution (and according checks), which are 
limited to just the actual import function.

Since we're calling this code from many places, it makes sense for it to 
be consolidated under TendermintImport.

* BlockImport, JustificationImport, Verifier, and import_queue function

* Update consensus/lib.rs from PoW to Tendermint

Not possible to be used as the previous consensus could. It will not
produce blocks nor does it currently even instantiate a machine. This is
just the next step.

* Update Cargo.tomls for substrate packages

* Tendermint SelectChain

This is incompatible with Substrate's expectations, yet should be valid 
for ours

* Move the node over to the new SelectChain

* Minor tweaks

* Update SelectChain documentation

* Remove substrate/node lib.rs

This shouldn't be used as a library AFAIK. While runtime should be, and 
arguably should even be published, I have yet to see node in the same 
way. Helps tighten API boundaries.

* Remove unused macro_use

* Replace panicking todos with stubs and // TODO

Enables progress.

* Reduce chain_spec and use more accurate naming

* Implement block proposal logic

* Modularize to get_proposal

* Trigger block importing

Doesn't wait for the response yet, which it needs to.

* Get the result of block importing

* Split import_queue into a series of files

* Provide a way to create the machine

The BasicQueue returned obscures the TendermintImport struct. 
Accordingly, a Future scoped with access is returned upwards, which when 
awaited will create the machine. This makes creating the machine 
optional while maintaining scope boundaries.

Is sufficient to create a 1-node net which produces and finalizes 
blocks.

* Don't import justifications multiple times

Also don't broadcast blocks which were solely proposed.

* Correct justification import pipeline

Removes JustificationImport as it should never be used.

* Announce blocks

By claiming File, they're not sent over the P2P network before they
have a justification, as desired. Unfortunately, they never were. This 
works around that.

* Add an assert to verify proposed children aren't best

* Consolidate C and I generics into a TendermintClient trait alias

* Expand sanity checks

Substrate doesn't expect nor officially support children with less work 
than their parents. It's a trick used here. Accordingly, ensure the 
trick's validity.

* When resetting, use the end time of the round which was committed to

The machine reset to the end time of the current round. For a delayed 
network connection, a machine may move ahead in rounds and only later 
realize a prior round succeeded. Despite acknowledging that round's 
success, it would maintain its delay when moving to the next block, 
bricking it.

Done by tracking the end time for each round as they occur.

* Move Commit from including the round to including the round's end_time

The round was usable to build the current clock in an accumulated 
fashion, relative to the previous round. The end time is the absolute 
metric of it, which can be used to calculate the round number (with all 
previous end times).

Substrate now builds off the best block, not genesis, using the end time 
included in the justification to start its machine in a synchronized 
state.

Knowing the end time of a round, or the round in which a block was
committed to, is necessary for nodes to sync up with Tendermint. 
Encoding it in the commit ensures it's long lasting and makes it readily 
available, without the load of an entire transaction.

* Add a TODO on Tendermint

* Misc bug fixes

* More misc bug fixes

* Clean up lock acquisition

* Merge weights and signing scheme into validators, documenting needed changes

* Add pallet sessions to runtime, create pallet-tendermint

* Update node to use pallet sessions

* Update support URL

* Partial work on correcting pallet calls

* Redo Tendermint folder structure

* TendermintApi, compilation fixes

* Fix the stub round robin

At some point, the modulus was removed causing it to exceed the 
validators list and stop proposing.
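A minimal sketch of round-robin proposer selection with the modulus in place. The function name, the `block_number + round` rotation, and the slice-based validator list are assumptions for illustration, not the actual tendermint-machine API:

```rust
/// Hypothetical helper: picks the proposer for a block and round by rotating
/// through the validator list. The modulus keeps the index inside the list,
/// which is the property the fix above restores.
fn proposer<V>(validators: &[V], block_number: u64, round: u32) -> &V {
    assert!(!validators.is_empty(), "validator set must not be empty");
    let n = validators.len() as u64;
    // Reduce each term first so the addition cannot overflow.
    let index = ((block_number % n) + (u64::from(round) % n)) % n;
    &validators[index as usize]
}
```

Without the final `% n`, the index grows with the block number and eventually points past the end of the list, at which point no validator is selected to propose.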

* Use the validators list from the session pallet

* Basic Gossip Validator

* Correct Substrate Tendermint start block

The Tendermint machine uses the passed in number as the block's being 
worked on number. Substrate passed in the already finalized block's 
number.

Also updates misc comments.

* Clean generics in Tendermint with a monolith with associated types

* Remove the Future triggering the machine for an async fn

Enables passing data in, such as the network.

* Move TendermintMachine from start_num, time to last_num, time

Provides an explicitly clear API that's easier to program around.

Also adds additional time code to handle an edge case.

* Connect the Tendermint machine to a GossipEngine

* Connect broadcast

* Remove machine from TendermintImport

It's not used there at all.

* Merge Verifier into block_import.rs

These two files were largely the same, just hooking into sync structs 
with almost identical imports. As this project shapes up, removing dead 
weight is appreciated.

* Create a dedicated file for being a Tendermint authority

* Delete commented-out code related to PoW

* Move serai_runtime specific code from tendermint/client to node

Renames serai-consensus to sc_tendermint

* Consolidate file structure in sc_tendermint

* Replace best_* with finalized_*

We test their equivalency, yet it's still better to use finalized_* in
general.

* Consolidate references to sr25519 in sc_tendermint

* Add documentation to public structs/functions in sc_tendermint

* Add another missing comment

* Make sign asynchronous

Some relation to https://github.com/serai-dex/serai/issues/95.

* Move sc_tendermint to async sign

* Implement proper checking of inherents

* Take in a Keystore and validator ID

* Remove unnecessary PhantomDatas

* Update node to latest sc_tendermint

* Configure node for a multi-node testnet

* Fix handling of the GossipEngine

* Use a rounded genesis to obtain sufficient synchrony within the Docker env

* Correct Serai d-f names in Docker

* Remove an attempt at caching I don't believe would ever hit

* Add an already in chain check to block import

While the inner should do this for us, we call verify_order on our end 
*before* inner to ensure sequential import. Accordingly, we need to 
provide our own check.

Removes errors of "non-sequential import" when trying to re-import an 
existing block.
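A rough sketch of the ordering described above, assuming blocks are identified by number and that "sequential" means exactly one past the last finalized block. The enum and function are hypothetical and are not Substrate or sc_tendermint types:

```rust
/// Hypothetical outcome of the check run before handing a block to the inner importer.
#[derive(Debug, PartialEq, Eq)]
enum PreImport {
    /// The block is already known, so re-importing it is not an error.
    AlreadyInChain,
    /// The block directly follows the last finalized block.
    Sequential,
    /// The block skips ahead or falls behind and must be rejected.
    NonSequential,
}

/// Runs the already-in-chain check *before* the ordering check, so that a
/// re-imported block never surfaces as a "non-sequential import" error.
fn pre_import_check(finalized: u64, number: u64, already_known: bool) -> PreImport {
    if already_known {
        return PreImport::AlreadyInChain;
    }
    if finalized.checked_add(1) == Some(number) {
        PreImport::Sequential
    } else {
        PreImport::NonSequential
    }
}
```

If the two checks were swapped, an existing block with `number <= finalized` would fail the ordering check first and be reported as non-sequential.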

* Update the consensus documentation

It was incredibly out of date.

* Add a _ to the validator arg in slash

* Make the dev profile a local testnet profile

Restores a dev profile which only has one validator, locally running.

* Reduce Arcs in TendermintMachine, split Signer from SignatureScheme

* Update sc_tendermint per previous commit

* Restore cache

* Remove error case which shouldn't be an error

* Stop returning errors on already existing blocks entirely

* Correct Dave, Eve, and Ferdie to not run as validators

* Rename dev to devnet

--dev still works thanks to the |. Achieves a personal preference of
mine with some historical meaning.

* Add message expiry to the Tendermint gossip

* Localize the LibP2P protocol to the blockchain

Follows convention by doing so. Theoretically enables running multiple 
blockchains over a single LibP2P connection.

* Add a version to sp-runtime in tendermint-machine

* Add missing trait

* Bump Substrate dependency

Fixes #147.

* Implement Schnorr half-aggregation from https://eprint.iacr.org/2021/350.pdf

Relevant to https://github.com/serai-dex/serai/issues/99.

* cargo update (tendermint)

* Move from polling loops to a pure IO model for sc_tendermint's gossip

* Correct protocol name handling

* Use futures mpsc instead of tokio

* Timeout futures

* Move from a yielding loop to select in tendermint-machine

* Update Substrate to the new TendermintHandle

* Use futures pin instead of tokio

* Only recheck blocks with non-fatal inherent transaction errors

* Update to the latest substrate

* Separate the block processing time from the latency

* Add notes to the runtime

* Don't spam slash

Also adds a slash condition of failing to propose.

* Support running TendermintMachine when not a validator

This supports validators who leave the current set, without crashing 
their nodes, along with nodes trying to become validators (who will now 
seamlessly transition in).

* Properly define and pass around the block size

* Correct the Duration timing

The proposer will build it, send it, then process it (on the first 
round). Accordingly, it's / 3, not / 2, as / 2 only accounted for the 
latter events.

* Correct time-adjustment code on round skip

* Have the machine respond to advances made by an external sync loop

* Clean up time code in tendermint-machine

* BlockData and RoundData structs

* Rename Round to RoundNumber

* Move BlockData to a new file

* Move Round to an Option due to the pseudo-uninitialized state we create

Before the addition of RoundData, we always created the round, and on 
.round(0), simply created it again. With RoundData, and the changes to 
the time code, we used round 0, time 0, the latter being incorrect yet 
not an issue due to lack of misuse.

Now, if we do misuse it, it'll panic.

* Clear the Queue instead of draining and filtering

There shouldn't ever be a message which passes the filter under the 
current design.

* BlockData::new

* Move more code into block.rs

Introduces type-aliases to obtain Data/Message/SignedMessage solely from 
a Network object.

Fixes a bug regarding stepping when you're not an active validator.

* Have verify_precommit_signature return if it verified the signature

Also fixes a bug where invalid precommit signatures were left standing 
and therefore contributing to commits.

* Remove the precommit signature hash

It cached signatures per-block. Precommit signatures are bound to each 
round. This would lead to forming invalid commits when a commit should 
be formed. Under debug, the machine would catch that and panic. On 
release, it'd have everyone who wasn't a validator fail to continue 
syncing.

* Slight doc changes

Also flattens the message handling function by replacing an if 
containing all following code in the function with an early return for 
the else case.

* Always produce notifications for finalized blocks via origin overrides

* Correct weird formatting

* Update to the latest tendermint-machine

* Manually step the Tendermint machine when we synced a block over the network

* Ignore finality notifications for old blocks

* Remove a TODO resolved in 8c51bc011d

* Add a TODO comment to slash

Enables searching for the case-sensitive phrase and finding it.

* cargo fmt

* Use a tmp DB for Serai in Docker

* Remove panic on slash

As we move towards protonet, this can happen (if a node goes offline), 
yet it happening brings down the entire net right now.

* Add log::error on slash

* created shared volume between containers

* Complete the sh scripts

* Pass in the genesis time to Substrate

* Correct block announcements

They were announced, yet not marked best.

* Correct populate_end_time

It was used as inclusive yet didn't work inclusively.

* Correct gossip channel jumping when a block is synced via Substrate

* Use a looser check in import_future

This triggered so it needs to be accordingly relaxed.

* Correct race conditions between add_block and step

Also corrects a <= to <.

* Update cargo deny

* rename genesis-service to genesis

* Update Cargo.lock

* Correct runtime Cargo.toml whitespace

* Correct typo

* Document recheck

* Misc lints

* Fix prev commit

* Resolve low-hanging review comments

* Mark genesis/entry-dev.sh as executable

* Prevent a commit from including the same signature multiple times

Yanks tendermint-machine 0.1.0 accordingly.
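To illustrate why duplicates matter, here is a minimal sketch of tallying the weight behind a commit while rejecting repeated signers. The `u16` validator IDs, the `weight_of` callback, and the function name are assumptions for illustration, not the tendermint-machine API:

```rust
use std::collections::HashSet;

/// Hypothetical tally: sums the weight of the validators in a commit,
/// returning `None` if any validator appears more than once.
fn commit_weight(signers: &[u16], weight_of: impl Fn(u16) -> u64) -> Option<u64> {
    let mut seen = HashSet::new();
    let mut total = 0u64;
    for &signer in signers {
        // `insert` returns false if the value was already present.
        if !seen.insert(signer) {
            return None;
        }
        total += weight_of(signer);
    }
    Some(total)
}
```

Without the uniqueness check, a single validator's signature repeated enough times could appear to reach the commit threshold on its own.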

* Update to latest nightly clippy

* Improve documentation

* Use clearer variable names

* Add log statements

* Pair more log statements

* Clean TendermintAuthority::authority as possible

Merges it into new. It has way too many arguments, yet there's no clear path to
consolidation there, unfortunately.

Additionally provides better scoping within itself.

* Fix #158

Doesn't use lock_import_and_run for reasons commented (lack of async).

* Rename guard to lock

* Have the devnet use the current time as the genesis

Possible since it's only a single node, not requiring synchronization.

* Fix gossiping

I really don't know what side effect this avoids and I can't say I care at this
point.

* Misc lints

Co-authored-by: vrx00 <vrx00@proton.me>
Co-authored-by: TheArchitect108 <TheArchitect108@protonmail.com>
2022-12-03 18:38:02 -05:00