For logging purposes, I added code to handle negative time till start. I forgot
to only sleep on positive time till start.
Should fix the recent CI failure.
It was improperly implemented, as it assumed rounds had a constant time
interval, which they do not. It also is against the spec and was meant to
absolve us of issues with poor performance when post-starting blockchains. The
new, and much more proper, workaround for the latter is a 120-second delay
between the Substrate time and the Tributary start time.
It's largely unoptimized, and not yet exclusive to validators, yet has basic
sanity (using message content for ID instead of sender + index).
Fixes bugs as found. Notably, we used a time in milliseconds where the
Tributary expected seconds.
Also has Tributary::new jump to the presumed round number. This reduces slashes
when starting new chains (whose times will be before the current time) and was
the only way I was able to observe successful confirmations given current
surrounding infrastructure.
Reduces lock contention.
Additionally changes block_key to include the genesis. While not technically
needed, the lack of genesis introduced a side effect where any Tributary on the
the database could return the block of any other Tributary. While that wasn't a
security issue, returning it suggested it was on-chain when it wasn't. This may
have been usable to create issues.
We had a race condition where'd we be informed of blocks 1 .. 3, and
immediately add 1 .. 3. Because we immediately tried to add 2 after 1, it'd
fail since the tip was still the genesis, yet 2 needs the tip to be 1.
Adding a channel, while ugly, was the simplest way to accomplish this.
Also has any added block be broadcasted. Else there's a race condition where a
node which syncs up to the most recent block does so, yet fails to add the next
block when it's committed to.
This defines the tart of a very complex series of locks I'm really unhappy
with. At the same time, there's not immediately a better solution. This also
should work without issue.
Impls a LocalP2p for testing.
Moves rebroadcasting into Tendermint, since it's what knows if a message is
fully valid + original.
Removes TributarySpec::validators() HashMap, as its non-determinism caused
different instances to have different round robin schedules. It was already
prior moved to a Vec for this issue, so I'm unsure why this remnant existed.
Also renames the GH no-std workflow from the prior commit.
Necessary as our Tributary chains needed to agree when a Serai block has
occurred, and when a Monero block has occurred. Since those could happen at the
same time, some validators may put SeraiBlock before ExternalBlock and vice
versa, causing a chain halt. Now they can have distinct ordering queues.
The existing code was almost entirely applicable. It just needed to be scoped
with an ID. While the handle function is now a bit convoluted, I don't see a
better option.
Step moved a step forward after an externally synced/added block. This created
a race condition to add the block between the sync process and the Tendermint
machine. Now that the block routes through Tendermint, there is no such race
condition.
Previously, Tendermint needed to be live more than it needed to be correct.
Under the original intention for it, correctness would fail if any coin
desynced, which would cause the node to fail entirely. By accepting a
supermajority's view of state, despite its own, a single coin's failure would
only lead to inability to participate with that single coin.
Now that Tendermint is solely for Tributary, nodes should halt a coin-specific
chain if their view of the chain differs. They are unable to meaningless
participate regardless.
This also means a supermajority of validators can no longer fake messages from
other validators, allowing the Tributary chain to use uniform weights with much
less impact. There is still enough impact they can't be used (ability to cause
a fork), yet they should allow uniform block production (as that's solely a DoS
concern).
While we prior could've simply additionally checked signatures, add_block's
lack of a failure case would've meant it had to panic. This would've been a DoS
possible a minority-weight *which affected the entire coordinator* and
therefore *the entire validator for all coins*.