cuprate/database

Database

Cuprate's database implementation.


1. Documentation

Documentation for database/ is split into 3 locations:

| Documentation location | Purpose |
|---|---|
| database/README.md | High level design of cuprate-database |
| cuprate-database | Practical usage documentation/warnings/notes/etc |
| Source file // comments | Implementation-specific details (e.g., how many reader threads to spawn?) |

This README serves as the overview/design document.

For actual practical usage, cuprate-database's types and general usage are documented via standard Rust tooling.

Run:

cargo doc --package cuprate-database --open

at the root of the repo to open/read the documentation.

If this documentation is too abstract, refer to the source files themselves; they are heavily commented. Many regular // comments explain implementation-specific details that are not covered here or in the docs. Use the file reference below to find what you're looking for.

The code within src/ also contains grep-able comments marked with the following keywords:

| Word | Meaning |
|---|---|
| INVARIANT | This code makes an assumption that must be upheld for correctness |
| SAFETY | This unsafe code is okay, for x,y,z reasons |
| FIXME | This code works but isn't ideal |
| HACK | This code is a brittle workaround |
| PERF | This code is weird for performance reasons |
| TODO | This must be implemented; There should be 0 of these in production code |
| SOMEDAY | This should be implemented... someday |

2. File Structure

A quick reference of the structure of the folders & files in cuprate-database.

Note that lib.rs/mod.rs files are purely for re-exporting/visibility/lints, and contain no code. Each sub-directory has a corresponding mod.rs.

2.1 src/

The top-level src/ files.

| File | Purpose |
|---|---|
| constants.rs | General constants used throughout cuprate-database |
| database.rs | Abstracted database; trait DatabaseR{o,w} |
| env.rs | Abstracted database environment; trait Env |
| error.rs | Database error types |
| free.rs | General free functions (related to the database) |
| key.rs | Abstracted database keys; trait Key |
| resize.rs | Database resizing algorithms |
| storable.rs | Data (de)serialization; trait Storable |
| table.rs | Database table abstraction; trait Table |
| tables.rs | All the table definitions used by cuprate-database |
| tests.rs | Utilities for cuprate_database testing |
| transaction.rs | Database transaction abstraction; trait TxR{o,w} |
| types.rs | Database-specific types |
| unsafe_unsendable.rs | Marker type to impl Send for objects not Send |

2.2 src/backend/

This folder contains the implementation for actual databases used as the backend for cuprate-database.

Each backend has its own folder.

| Folder/File | Purpose |
|---|---|
| heed/ | Backend using heed (LMDB) |
| redb/ | Backend using redb |
| tests.rs | Backend-agnostic tests |

All backends follow the same file structure:

| File | Purpose |
|---|---|
| database.rs | Implementation of trait DatabaseR{o,w} |
| env.rs | Implementation of trait Env |
| error.rs | Conversion of the backend's errors into cuprate_database's error types |
| storable.rs | Compatibility layer between cuprate_database::Storable and backend-specific (de)serialization |
| transaction.rs | Implementation of trait TxR{o,w} |
| types.rs | Type aliases for long backend-specific types |

2.3 src/config/

This folder contains the cuprate_database::config module: configuration options for the database.

| File | Purpose |
|---|---|
| config.rs | Main database Config struct |
| reader_threads.rs | Reader thread configuration for service thread-pool |
| sync_mode.rs | Disk sync configuration for backends |

2.4 src/ops/

This folder contains the cuprate_database::ops module.

These are higher-level, Monero-related functions abstracted over the database.

| File | Purpose |
|---|---|
| block.rs | Block related (main functions) |
| blockchain.rs | Blockchain related (height, cumulative values, etc) |
| key_image.rs | Key image related |
| macros.rs | Macros specific to ops/ |
| output.rs | Output related |
| property.rs | Database properties (pruned, version, etc) |
| tx.rs | Transaction related |

2.5 src/service/

This folder contains the cuprate_database::service module.

This is the asynchronous request/response API that other Cuprate crates use instead of managing the database directly themselves.

| File | Purpose |
|---|---|
| free.rs | General free functions used (related to cuprate_database::service) |
| read.rs | Read thread-pool definitions and logic |
| tests.rs | Thread-pool tests and test helper functions |
| types.rs | cuprate_database::service-related type aliases |
| write.rs | Writer thread definitions and logic |

3. Backends

cuprate-database's traits abstract over the actual database, so that any particular backend can be used.

Each database's implementation of those traits is located in its respective folder in src/backend/${DATABASE_NAME}/.

3.1 heed

The default database used is heed (LMDB).

The upstream versions from crates.io are used.

LMDB should not need to be installed as heed has a build script that pulls it in automatically.

heed's filenames inside Cuprate's database folder (~/.local/share/cuprate/database/) are:

| Filename | Purpose |
|---|---|
| data.mdb | Main data file |
| lock.mdb | Database lock file |

heed-specific notes:

3.2 redb

The 2nd database backend is the 100% Rust redb.

The upstream versions from crates.io are used.

redb's filenames inside Cuprate's database folder (~/.local/share/cuprate/database/) are:

| Filename | Purpose |
|---|---|
| data.redb | Main data file |

3.3 redb-memory

This backend is identical to redb, except that it uses redb::backends::InMemoryBackend, a key-value store that resides completely in memory instead of in a file.

All other details about this should be the same as the normal redb backend.

3.4 sanakirja

sanakirja was a candidate backend; however, there were problems with maximum value sizes.

The default maximum value size is 1012 bytes which was too small for our requirements. Using sanakirja::Slice and sanakirja::UnsizedStorage was attempted, but there were bugs found when inserting a value in-between 512..=4096 bytes.

As such, it is not implemented.

3.5 MDBX

MDBX was a candidate backend; however, MDBX deprecated the custom key/value comparison functions, which makes it a bit trickier to implement duplicate tables. It is also quite similar to the main backend LMDB (of which it was originally a fork).

As such, it is not implemented (yet).

4. Layers

cuprate_database is logically abstracted into 5 layers, starting from the lowest:

  1. Backend
  2. Trait
  3. ConcreteEnv
  4. ops
  5. service

Each layer is built upon the last.

4.1 Backend

This is the actual database backend implementation (or a Rust shim over one).

Examples:

  • heed (LMDB)
  • redb

cuprate_database itself just uses a backend; it does not implement one.

All backends have the following attributes:

4.2 Trait

cuprate_database provides a set of traits that abstract over the various database backends.

This keeps function signatures and behavior the same while making it easier to swap out databases.

All common behavior of the backends is encapsulated here and used instead of using the backend directly.

For example, instead of calling LMDB or redb's get() function directly, DatabaseRo::get() is called.
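
As a rough illustration of this idea, the following std-only sketch writes caller code once against a trait and runs it over two stand-in backends. Every name here (DatabaseRoSketch, BackendA, BackendB, value_len) is hypothetical and much simpler than the real cuprate_database traits, which are generic over tables and return backend errors.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical, simplified stand-in for a read-only database trait.
trait DatabaseRoSketch {
    /// Fetch an owned copy of the value stored under `key`, if any.
    fn get(&self, key: u64) -> Option<Vec<u8>>;
}

/// Hypothetical backend A, backed by an ordered map.
struct BackendA(BTreeMap<u64, Vec<u8>>);

/// Hypothetical backend B, backed by a hash map.
struct BackendB(HashMap<u64, Vec<u8>>);

impl DatabaseRoSketch for BackendA {
    fn get(&self, key: u64) -> Option<Vec<u8>> {
        self.0.get(&key).cloned()
    }
}

impl DatabaseRoSketch for BackendB {
    fn get(&self, key: u64) -> Option<Vec<u8>> {
        self.0.get(&key).cloned()
    }
}

/// Caller code is written once against the trait, not against a backend.
fn value_len<D: DatabaseRoSketch>(db: &D, key: u64) -> Option<usize> {
    db.get(key).map(|value| value.len())
}
```

Swapping the backend then only changes which type is passed to value_len(); the calling code stays the same.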

4.3 ConcreteEnv

This is the non-generic, concrete struct provided by cuprate_database that contains all the data necessary to operate the database. The actual database backend ConcreteEnv will use internally depends on which backend feature is used.

ConcreteEnv implements trait Env, which opens the door to all the other traits.

The equivalent objects in the backends themselves are:

This is the main object used when handling the database directly. Users of the service layer do not strictly need to interact with it.

4.4 ops

These are Monero-specific functions that use the abstracted trait forms of the database.

Instead of dealing with the database directly (get(), delete()), the ops layer provides more abstract functions that deal with commonly used Monero operations (add_block(), pop_block()).
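
To make the difference in abstraction level concrete, here is a minimal sketch in which add_block() and pop_block() are built from lower-level put/delete primitives. The trait BlockTableSketch and its methods are hypothetical, and the real ops functions do far more (transactions, outputs, key images, error handling); this only shows the layering.

```rust
use std::collections::BTreeMap;

/// Hypothetical low-level table: block height -> serialized block bytes.
trait BlockTableSketch {
    fn put(&mut self, height: u64, block: Vec<u8>);
    fn delete(&mut self, height: u64) -> Option<Vec<u8>>;
    /// Number of blocks currently stored.
    fn count(&self) -> u64;
}

impl BlockTableSketch for BTreeMap<u64, Vec<u8>> {
    fn put(&mut self, height: u64, block: Vec<u8>) {
        self.insert(height, block);
    }
    fn delete(&mut self, height: u64) -> Option<Vec<u8>> {
        self.remove(&height)
    }
    fn count(&self) -> u64 {
        self.len() as u64
    }
}

/// Append a block at the next free height and return that height.
fn add_block<T: BlockTableSketch>(table: &mut T, block: Vec<u8>) -> u64 {
    let height = table.count();
    table.put(height, block);
    height
}

/// Remove the top block, returning its height and bytes, or `None` if empty.
fn pop_block<T: BlockTableSketch>(table: &mut T) -> Option<(u64, Vec<u8>)> {
    let height = table.count().checked_sub(1)?;
    let block = table.delete(height)?;
    Some((height, block))
}
```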

4.5 service

The final layer abstracts the database completely into a Monero-specific async request/response API, using tower::Service.

It handles the database using a separate writer thread & reader thread-pool, and uses the previously mentioned ops functions when responding to requests.

Instead of handling the database directly, this layer provides read/write handles that allow:

  • Sending requests for data (e.g. Outputs)
  • Receiving responses

For more information on the backing thread-pool, see Thread model.
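
The request/response shape can be sketched with nothing but std channels. This is a synchronous, single-threaded stand-in and not the tower::Service API: Request, Response, Handle and init() are all hypothetical names, and the real service uses separate read/write handles backed by a writer thread and a reader thread-pool.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Hypothetical request type.
enum Request {
    /// Store `value` under `key`.
    Put(u64, u64),
    /// Fetch the value under `key`.
    Get(u64),
}

/// Hypothetical response type.
#[derive(Debug, PartialEq)]
enum Response {
    PutOk,
    Value(Option<u64>),
}

/// Handle used by callers instead of touching the database directly.
struct Handle {
    sender: Sender<(Request, Sender<Response>)>,
}

impl Handle {
    /// Send a request and block until the database thread replies.
    fn call(&self, request: Request) -> Response {
        let (reply_tx, reply_rx) = channel();
        self.sender
            .send((request, reply_tx))
            .expect("database thread exited");
        reply_rx.recv().expect("database thread dropped the reply")
    }
}

/// Spawn the database thread and return a handle to it.
fn init() -> Handle {
    let (sender, receiver) = channel::<(Request, Sender<Response>)>();
    thread::spawn(move || {
        let mut db = HashMap::new();
        // This loop ends once the `Handle` (the sender side) is dropped.
        for (request, reply) in receiver {
            let response = match request {
                Request::Put(key, value) => {
                    db.insert(key, value);
                    Response::PutOk
                }
                Request::Get(key) => Response::Value(db.get(&key).copied()),
            };
            // The requester may have gone away; that is not an error here.
            let _ = reply.send(response);
        }
    });
    Handle { sender }
}
```

Dropping the handle closes the channel, which lets the database thread fall out of its loop and exit on its own.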

5. Syncing

cuprate_database's database has 5 disk syncing modes.

  1. FastThenSafe
  2. Safe
  3. Async
  4. Threshold
  5. Fast

The default mode is Safe.

This means that upon each transaction commit, all the data that was written will be fully synced to disk. This is the slowest, but safest mode of operation.

Note that upon any database Drop, whether via service or dropping the database directly, the current implementation will sync to disk regardless of any configuration.

For more information on the other modes, see the sync mode documentation (src/config/sync_mode.rs).
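
A configuration with Safe as its default could be modelled roughly as follows. SyncModeSketch and ConfigSketch are hypothetical mirrors written for this README, not the crate's own types, and the usize payload on Threshold is an assumption.

```rust
/// Hypothetical mirror of the five syncing modes listed above.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum SyncModeSketch {
    FastThenSafe,
    /// Fully sync to disk on every transaction commit (the default).
    #[default]
    Safe,
    Async,
    /// Assumption: the threshold is a number of bytes written.
    Threshold(usize),
    Fast,
}

/// Hypothetical config struct holding the chosen mode.
#[derive(Debug, Default)]
struct ConfigSketch {
    sync_mode: SyncModeSketch,
}
```

With this shape, ConfigSketch::default() selects Safe, and a caller who prefers another mode sets the sync_mode field explicitly.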

6. Thread model

As noted in the Layers section, the base database abstractions themselves are not concerned with parallelism; they are mostly functions to be called from a single thread.

However, the actual API cuprate_database exposes for practical usage for the main cuprated binary (and other async use-cases) is the asynchronous service API, which does have a thread model backing it.

As such, when cuprate_database::service's initialization function is called, threads will be spawned and maintained until the user drops (disconnects) the returned handles.

The current behavior is:

  • 1 writer thread
  • As many reader threads as there are system threads

For example, on a system with 32 threads, cuprate_database will spawn:

  • 1 writer thread
  • 32 reader threads

whose sole responsibility is to listen for database requests, access the database (potentially in parallel), and return a response.

Note that the 1 system thread = 1 reader thread model is only the default setting; the reader thread count can be configured by the user to be any number between 1 .. amount_of_system_threads.
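
One way to resolve a user-requested reader thread count against the system's thread count is sketched below. Both function names are hypothetical, and the sketch assumes the upper bound of the range is inclusive and that out-of-range requests are clamped rather than rejected.

```rust
use std::thread::available_parallelism;

/// Number of threads the OS reports, falling back to 1 if unknown.
fn system_threads() -> usize {
    available_parallelism().map(|n| n.get()).unwrap_or(1)
}

/// Hypothetical helper: resolve the reader thread count.
///
/// `None` means "use the default" (one reader per system thread).
/// `Some(n)` is clamped into `1..=system_threads`.
fn reader_thread_count(requested: Option<usize>, system_threads: usize) -> usize {
    let system_threads = system_threads.max(1);
    match requested {
        None => system_threads,
        Some(n) => n.clamp(1, system_threads),
    }
}
```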

The reader threads are managed by rayon.

For an example of where multiple reader threads are used: given a request that asks if any key-image within a set already exists, cuprate_database will split that work between the threads with rayon.
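
The key-image check described above can be approximated with scoped std threads. This sketch deliberately uses std::thread::scope instead of rayon so that it stays dependency-free; the function name, the HashSet standing in for the key-image table, and the chunking strategy are all assumptions.

```rust
use std::collections::HashSet;
use std::thread;

/// Returns `true` if any key image in `candidates` is already in `spent`.
/// The candidates are split into at most `threads` chunks, one per worker.
fn any_key_image_exists(
    spent: &HashSet<[u8; 32]>,
    candidates: &[[u8; 32]],
    threads: usize,
) -> bool {
    if candidates.is_empty() {
        return false;
    }
    let threads = threads.max(1);
    // Ceiling division, so that no more than `threads` chunks are produced.
    let chunk_size = (candidates.len() + threads - 1) / threads;

    thread::scope(|scope| {
        // Spawn every worker first, then collect their results.
        let workers: Vec<_> = candidates
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || chunk.iter().any(|key_image| spent.contains(key_image)))
            })
            .collect();

        workers
            .into_iter()
            .any(|worker| worker.join().expect("reader worker panicked"))
    })
}
```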

Once the handles to these threads are dropped, the backing writer thread and reader thread-pool will exit gracefully and automatically.

7. Resizing

Database backends that require manual resizing will, by default, use an algorithm similar to monerod's.

Note that this only relates to the service module, where the database is handled by cuprate_database itself, not the user. A user who uses cuprate_database directly decides how to resize.

Within service, the resizing logic does the following:

  • If there's not enough space to fit a write request's data, start a resize
  • Each resize adds around 1_073_745_920 bytes to the current map size
  • A resize will be attempted 3 times before failing

There are other resizing algorithms that define how the database's memory map grows, although currently the behavior of monerod is closely followed.
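
The steps above can be sketched as follows. This is one reading of the description, not the crate's code: the names are hypothetical, the page size is passed in as a parameter, growth is assumed to be 1 GiB rounded up to a page-size multiple, and "attempted 3 times" is interpreted as at most three resizes per write request.

```rust
/// Bytes added per resize before page alignment (assumed to be 1 GiB).
const ADD_SIZE: u64 = 1 << 30;
/// Maximum number of resizes per write request.
const MAX_RESIZE_ATTEMPTS: u32 = 3;

/// Grow the map by `ADD_SIZE`, rounded up to a multiple of `page_size`.
fn grow_map_size(current: u64, page_size: u64) -> u64 {
    assert!(page_size > 0, "page size must be non-zero");
    let grown = current + ADD_SIZE;
    match grown % page_size {
        0 => grown,
        remainder => grown + page_size - remainder,
    }
}

/// Hypothetical write error.
#[derive(Debug, PartialEq)]
enum WriteError {
    /// The memory map is too small for the data being written.
    MapFull,
}

/// Run `write`, resizing and retrying while it reports a full map.
fn write_with_resize<F>(map_size: &mut u64, page_size: u64, mut write: F) -> Result<(), WriteError>
where
    F: FnMut(u64) -> Result<(), WriteError>,
{
    let mut resizes = 0;
    loop {
        match write(*map_size) {
            Ok(()) => return Ok(()),
            Err(WriteError::MapFull) if resizes < MAX_RESIZE_ATTEMPTS => {
                *map_size = grow_map_size(*map_size, page_size);
                resizes += 1;
            }
            Err(error) => return Err(error),
        }
    }
}
```

Under these assumptions and a 4096-byte page, growing a 1-byte map yields exactly 1_073_745_920 bytes, which matches the figure quoted in the list.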

8. (De)serialization

All types stored inside the database are either bytes already, or are perfectly bitcast-able.

As such, they do not incur heavy (de)serialization costs when storing/fetching them from the database. The main (de)serialization used is bytemuck's traits and casting functions.

Note that the data stored in the tables are still type-safe; we still refer to the key and values within our tables by the type.

The main (de)serialization trait for database storage is: cuprate_database::Storable.

When a type is cast into bytes, the reference is cast, i.e. this is zero-cost serialization.

However, it is worth noting that when bytes are cast into the type, they are copied. This is due to byte alignment guarantee issues with both backends.

Without this copy, bytemuck will panic with TargetAlignmentGreaterAndInputNotAligned when casting.

Copying the bytes fixes this problem, although it is more costly than necessary. However, in the main use-case for cuprate_database (the service module) the bytes would need to be owned regardless as the Request/Response API uses owned data types (T, Vec<T>, HashMap<K, V>, etc).

Practically speaking, this means lower-level database functions that normally look like such:

fn get(key: &Key) -> &Value;

end up looking like this in cuprate_database:

fn get(key: &Key) -> Value;
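
The borrowed-out, owned-in asymmetry can be illustrated without bytemuck. In this std-only sketch, StorableSketch and KeyImage are hypothetical stand-ins (the real Storable trait and its implementations differ), and read_u64_unaligned shows why copying sidesteps the alignment problem: the source slice may start at any address, while the copied buffer is converted by value.

```rust
/// Hypothetical, simplified stand-in for a (de)serialization trait.
trait StorableSketch: Sized {
    /// Type -> bytes: borrows the existing memory, no copy.
    fn as_bytes(&self) -> &[u8];
    /// Bytes -> type: copies into a new, owned value.
    fn from_bytes(bytes: &[u8]) -> Self;
}

/// Hypothetical 32-byte key image.
#[derive(Debug, PartialEq)]
struct KeyImage([u8; 32]);

impl StorableSketch for KeyImage {
    fn as_bytes(&self) -> &[u8] {
        &self.0
    }
    fn from_bytes(bytes: &[u8]) -> Self {
        let mut out = [0u8; 32];
        // Panics if `bytes` is not exactly 32 bytes long.
        out.copy_from_slice(bytes);
        KeyImage(out)
    }
}

/// Read a `u64` from the first 8 bytes of a slice with arbitrary alignment.
/// Copying into a local buffer avoids reinterpreting unaligned memory.
fn read_u64_unaligned(bytes: &[u8]) -> u64 {
    let mut buffer = [0u8; 8];
    buffer.copy_from_slice(&bytes[..8]);
    u64::from_ne_bytes(buffer)
}
```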

Since each backend has its own (de)serialization methods, our types are wrapped in compatibility types that map our Storable functions into whatever is required for the backend, e.g:

Compatibility structs also exist for any Storable containers:

Again, it's unfortunate that these must be owned, although in service's use-case, they would have to be owned anyway.