cuprate/database
hinto-janai 240e579066
database: replace sanakirja with redb (#80)
* cargo: replace `sanakirja` with `redb`

* database: update docs `sanakirja` -> `redb`

* lib: add TODO for `ConcreteEnv` generic replacement

* database: split `trait Database` -> `trait Database{Read,Write}`

* heed: add `struct HeedTable{Ro,Rw}` to match `redb` behavior

* ops: remove imports for now

* env: fix `&mut` bound on RwTx

* database: impl `redb`, type-checks

* fix heed trait impls, `Database{Read,Write}` -> `Database{Ro,Rw}`

* redb: impl `From<_>` for `RuntimeError`

* update readme

* heed: document `HeedTableR{o,w}` types

* env: doc `sync()` invariant

* database: document data & lock filenames

* misc docs, `redb` durability impl, `'db` -> `'env`

* redb: fixes

* misc docs and fixes

* Update database/README.md

Co-authored-by: Boog900 <boog900@tutanota.com>

---------

Co-authored-by: Boog900 <boog900@tutanota.com>
2024-02-29 17:40:15 +00:00
..
src database: replace sanakirja with redb (#80) 2024-02-29 17:40:15 +00:00
Cargo.toml database: replace sanakirja with redb (#80) 2024-02-29 17:40:15 +00:00
README.md database: replace sanakirja with redb (#80) 2024-02-29 17:40:15 +00:00

Database

Cuprate's database implementation.

  1. Documentation
  2. File Structure
  3. Backends
  4. Layers
  5. Resizing
  6. Flushing
  7. (De)serialization

Documentation

In general, documentation for database/ is split into 3:

Documentation location Purpose
database/README.md High level design of cuprate-database
cuprate-database Practical usage documentation/warnings/notes/etc
Source file // comments Implementation-specific details (e.g, how many reader threads to spawn?)

This README serves as the overview/design document.

For actual practical usage, cuprate-database's types and general usage are documented via standard Rust tooling.

Run:

cargo doc --package cuprate-database --open

at the root of the repo to open/read the documentation.

If this documentation is too abstract, refer to any of the source files, they are heavily commented. There are many // Regular comments that explain more implementation specific details that aren't present here or in the docs. Use the file reference below to find what you're looking for.

The code within src/ is also littered with some grep-able comments containing some keywords:

Word Meaning
INVARIANT This code makes an assumption that must be upheld for correctness
SAFETY This unsafe code is okay, for x,y,z reasons
FIXME This code works but isn't ideal
HACK This code is a brittle workaround
PERF This code is weird for performance reasons
TODO This must be implemented; There should be 0 of these in production code
SOMEDAY This should be implemented... someday

File Structure

A quick reference of the structure of the folders & files in cuprate-database.

Note that lib.rs/mod.rs files are purely for re-exporting/visibility/lints, and contain no code. Each sub-directory has a corresponding mod.rs.

src/

The top-level src/ files.

File Purpose
config.rs Database Env configuration
constants.rs General constants used throughout cuprate-database
database.rs Abstracted database; trait DatabaseR{o,w}
env.rs Abstracted database environment; trait Env
error.rs Database error types
free.rs General free functions (related to the database)
key.rs Abstracted database keys; trait Key
pod.rs Data (de)serialization; trait Pod
table.rs Database table abstraction; trait Table
tables.rs All the table definitions used by cuprate-database
transaction.rs Database transaction abstraction; trait TxR{o,w}

src/ops/

This folder contains the cupate_database::ops module.

TODO: more detailed descriptions.

File Purpose
alt_block.rs Alternative blocks
block.rs Blocks
blockchain.rs Blockchain-related
output.rs Outputs
property.rs Properties
spent_key.rs Spent keys
tx.rs Transactions

src/service/

This folder contains the cupate_database::service module.

File Purpose
free.rs General free functions used (related to cuprate_database::service)
read.rs Read thread-pool definitions and logic
request.rs Read/write Requests to the database
response.rs Read/write Response's from the database
tests.rs Thread-pool tests and test helper functions
write.rs Write thread-pool definitions and logic

src/backend/

This folder contains the actual database crates used as the backend for cuprate-database.

Each backend has its own folder.

Folder Purpose
heed/ Backend using using forked heed
sanakirja/ Backend using sanakirja

All backends follow the same file structure:

File Purpose
database.rs Implementation of trait DatabaseR{o,w}
env.rs Implementation of trait Env
error.rs Implementation of backend's errors to cuprate_database's error types
transaction.rs Implementation of trait TxR{o,w}
types.rs Type aliases for long backend-specific types

Backends

cuprate-database's traits abstract over various actual databases.

Each database's implementation is located in its respective file in src/backend/${DATABASE_NAME}.rs.

heed

The default database used is a modified fork of heed (LMDB), located at Cuprate/heed.

To generate documentation of the fork for local use:

git clone --recursive https://github.com/Cuprate/heed
cargo doc

LMDB should not need to be installed as heed has a build script that pulls it in automatically.

heed's filenames inside Cuprate's database folder (~/.local/share/cuprate/database/) are:

Filename Purpose
data.mdb Main data file
lock.mdb Database lock file

TODO: document max readers limit: 059028a30a/src/blockchain_db/lmdb/db_lmdb.cpp (L1372). Other potential processes (e.g. xmrblocks) that are also reading the data.mdb file need to be accounted for.

redb

The 2nd database backend is the 100% Rust redb.

The upstream versions from crates.io are used.

redb's filenames inside Cuprate's database folder (~/.local/share/cuprate/database/) are:

Filename Purpose
data.redb Main data file

sanakirja

sanakirja was a candidate as a backend, however there were problems with maximum value sizes.

The default maximum value size is 1012 bytes which was too small for our requirements. Using sanakirja::Slice and sanakirja::UnsizedStorage was attempted, but there were bugs found when inserting a value in-between 512..=4096 bytes.

As such, it is not implemented.

MDBX

MDBX was a candidate as a backend, however MDBX deprecated the custom key/value comparison functions, this makes it a bit trickier to implement dup tables. It is also quite similar to the main backend LMDB (of which it was originally a fork of).

As such, it is not implemented (yet).

Layers

TODO: update with accurate information when ready, update image.

Database

Trait

ConcreteEnv

Thread

Service

Resizing

TODO: document resize algorithm:

  • Exactly when it occurs
  • How much bytes are added

All backends follow the same algorithm.

Flushing

TODO: document disk flushing behavior.

  • Config options
  • Backend-specific behavior

(De)serialization

TODO: document Pod and how databases use (de)serialize objects when storing/fetching, essentially using <[u8], [u8]>.