mirror of
https://github.com/monero-project/monero.git
synced 2025-01-05 10:29:34 +00:00
Changes to PORTABLE_STORAGE.md
* More information about array entries (especially nesting)
* Varint encoding examples
* Expanded string and integer encoding information
parent
34941ac3e1
commit
fc9b77d855
2 changed files with 78 additions and 36 deletions
|
@ -15,15 +15,19 @@ documentation. Unfortunately, whilst the rest of the library is fairly
|
||||||
straightforward to decipher, the Portable Storage is less-so. Hence this
|
straightforward to decipher, the Portable Storage is less-so. Hence this
|
||||||
document.
|
document.
|
||||||
|
|
||||||
## Preliminaries
|
## String and Integer Encoding
|
||||||
|
|
||||||
### String and integer encoding
|
### Integers
|
||||||
|
|
||||||
#### varint
|
With few exceptions, integers serialized in epee portable storage format are serialized
|
||||||
|
as little-endian.
|
||||||
|
|
||||||
Varints are used to pack integers in an portable and space optimized way. The
|
### Varints
|
||||||
lowest 2 bits store the amount of bytes required, which means the largest value
|
|
||||||
integer that can be packed into 1 byte is 63 (6 bits).
|
Varints are used to pack integers in a portable and space-optimized way. Varints are stored as little-endian integers, with the lowest 2 bits storing the number of bytes required, which means the largest integer that can be packed into 1 byte is 63
|
||||||
|
(6 bits).
|
||||||
|
|
||||||
|
#### Byte Sizes
|
||||||
|
|
||||||
| Lowest 2 bits | Size value | Value range |
|
| Lowest 2 bits | Size value | Value range |
|
||||||
|---------------|---------------|-----------------------------------|
|
|---------------|---------------|-----------------------------------|
|
||||||
|
@ -32,20 +36,47 @@ integer that can be packed into 1 byte is 63 (6 bits).
|
||||||
| b10 | 4 bytes | 16384 to 1073741823 |
|
| b10 | 4 bytes | 16384 to 1073741823 |
|
||||||
| b11 | 8 bytes | 1073741824 to 4611686018427387903 |
|
| b11 | 8 bytes | 1073741824 to 4611686018427387903 |
|
||||||
|
|
||||||
#### string
|
#### Representations of Example Values
|
||||||
|
| Value | Byte Representation (hex) |
|
||||||
|
|----------------------|---------------------------|
|
||||||
|
| 0 | 00 |
|
||||||
|
| 7 | 1c |
|
||||||
|
| 101 | 95 01 |
|
||||||
|
| 17,000 | A2 09 01 00 |
|
||||||
|
| 7,942,319,744 | 03 BA 98 65 07 00 00 00 |
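
The example values above can be reproduced with a short sketch. The names `encode_varint` and `decode_varint` are illustrative and are not taken from epee; the 1-byte and 2-byte size markers (`b00`, `b01`) are assumed from the example rows for 0, 7 and 101.

```python
from typing import Tuple

# (size marker stored in the low 2 bits, encoded length in bytes)
_SIZES = ((0b00, 1), (0b01, 2), (0b10, 4), (0b11, 8))


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer using the smallest size that fits."""
    if value < 0:
        raise ValueError("varints are unsigned")
    for marker, size in _SIZES:
        # Two bits of the encoded integer are taken by the size marker.
        if value < 1 << (size * 8 - 2):
            return ((value << 2) | marker).to_bytes(size, "little")
    raise ValueError("value does not fit in 62 bits")


def decode_varint(data: bytes) -> Tuple[int, int]:
    """Return (value, bytes consumed) for the varint at the start of data."""
    if not data:
        raise ValueError("empty input")
    size = 1 << (data[0] & 0b11)  # marker 0..3 -> 1, 2, 4, 8 bytes
    if len(data) < size:
        raise ValueError("truncated varint")
    return int.from_bytes(data[:size], "little") >> 2, size


for n in (0, 7, 101, 17_000, 7_942_319_744):
    print(n, encode_varint(n).hex(" "))
```

Because the size marker lives in the lowest bits of a little-endian integer, it always ends up in the first byte on the wire, so a decoder can tell how many bytes to read after looking at one byte.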
|
||||||
|
|
||||||
These are simply length (varint) prefixed char strings.
|
### Strings
|
||||||
|
|
||||||
## Packet format
|
These are simply length (varint) prefixed char strings without a null
|
||||||
|
terminator (though one can always add one if desired). There is no
|
||||||
|
specific encoding enforced, and in fact, many times binary blobs are
|
||||||
|
stored as these strings. This type should not be confused with the keys
|
||||||
|
in sections, as those are restricted to a maximum length of 255 and
|
||||||
|
do not use varints to encode the length.
|
||||||
|
|
||||||
|
"Howdy" => 14 48 6F 77 64 79
|
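
As a rough sketch of the string rule, the snippet below prefixes a byte string with its varint-encoded length. `encode_string` and the compact `encode_varint` helper are hypothetical names used only for illustration.

```python
def encode_varint(value: int) -> bytes:
    """Minimal varint encoder: the low 2 bits hold the size marker."""
    for marker, size in ((0, 1), (1, 2), (2, 4), (3, 8)):
        if 0 <= value < 1 << (size * 8 - 2):
            return ((value << 2) | marker).to_bytes(size, "little")
    raise ValueError("value out of range for a varint")


def encode_string(payload: bytes) -> bytes:
    """Varint length prefix, then the raw bytes; no null terminator."""
    return encode_varint(len(payload)) + payload


print(encode_string(b"Howdy").hex(" "))
```

The payload is taken as `bytes` rather than `str` because no text encoding is enforced and binary blobs are stored the same way.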
||||||
|
|
||||||
|
### Section Keys
|
||||||
|
|
||||||
|
These are similar to strings except that they are length limited to 255
|
||||||
|
bytes, and use a single byte at the front of the string to describe the
|
||||||
|
length (as opposed to a varint).
|
||||||
|
|
||||||
|
"Howdy" => 05 48 6F 77 64 79
|
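
For comparison with strings, a section key can be sketched as follows. `encode_section_key` is an illustrative name, not an epee function.

```python
def encode_section_key(name: bytes) -> bytes:
    """Single length byte followed by the key bytes (at most 255 of them)."""
    if len(name) > 255:
        raise ValueError("section keys are limited to 255 bytes")
    return bytes([len(name)]) + name


print(encode_section_key(b"Howdy").hex(" "))
```

Note that the length byte is the plain length (`05`), not the length shifted left by two bits as in a varint (`14`).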
||||||
|
|
||||||
|
## Binary Format Specification
|
||||||
|
|
||||||
### Header
|
### Header
|
||||||
|
|
||||||
A packet starts with a header:
|
The format must always start with the following header:
|
||||||
|
|
||||||
| Header | Type | Value |
|
| Field | Type | Value |
|
||||||
|---------------|-----------|-----------------------|
|
|------------------|----------|------------|
|
||||||
| Signature | 8 bytes | 0x0111010101010201| |
|
| Signature Part A | UInt32 | 0x01011101 |
|
||||||
| Version | byte | 0x01 |
|
| Signature Part B | UInt32 | 0x01020101 |
|
||||||
|
| Version | UInt8 | 0x01 |
|
||||||
|
|
||||||
|
In total, the 9 byte header will look like this (in hex): `01 11 01 01 01 01 02 01 01`
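
A minimal sketch of building and checking the header, assuming only the three fields in the table above. The constant and function names are made up for this example.

```python
import struct

SIGNATURE_A = 0x01011101     # Signature Part A (UInt32)
SIGNATURE_B = 0x01020101     # Signature Part B (UInt32)
FORMAT_VERSION = 0x01        # Version (UInt8)


def build_header() -> bytes:
    """Pack the two signature words and the version, little-endian, unpadded."""
    return struct.pack("<IIB", SIGNATURE_A, SIGNATURE_B, FORMAT_VERSION)


def has_valid_header(data: bytes) -> bool:
    """True if data starts with the expected 9-byte header."""
    return data[:9] == build_header()


print(build_header().hex(" "))
```

The `<` prefix in the format string selects little-endian byte order and disables alignment padding, which is why the result is exactly 9 bytes.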
|
||||||
|
|
||||||
### Section
|
### Section
|
||||||
|
|
||||||
|
@ -63,18 +94,12 @@ Which is followed by the section's name-value [entries](#Entry) sequentially:
|
||||||
|
|
||||||
| Entry | Type |
|
| Entry | Type |
|
||||||
|-------------------|-----------------------|
|
|-------------------|-----------------------|
|
||||||
| Name | string<sup>1</sup> |
|
| Name | section key |
|
||||||
| Type | byte |
|
| Type | byte |
|
||||||
| Count<sup>2</sup> | varint |
|
| Count<sup>1</sup> | varint |
|
||||||
| Value(s) | (type dependent data) |
|
| Value(s) | (type dependent data) |
|
||||||
|
|
||||||
|
<sup>1</sup> Note, this is only present if the entry type has the array flag
|
||||||
<sup>1</sup> Note, the string used for the entry name is not prefixed with a
|
|
||||||
varint, it is prefixed with a single byte to specify the length of the name.
|
|
||||||
This means an entry name cannot be more that 255 chars, which seems a reasonable
|
|
||||||
restriction.
|
|
||||||
|
|
||||||
<sup>2</sup> Note, this is only present if the entry type has the array flag
|
|
||||||
(see below).
|
(see below).
|
||||||
|
|
||||||
#### Entry types
|
#### Entry types
|
||||||
|
@ -90,7 +115,7 @@ The types defined are:
|
||||||
#define SERIALIZE_TYPE_UINT32 6
|
#define SERIALIZE_TYPE_UINT32 6
|
||||||
#define SERIALIZE_TYPE_UINT16 7
|
#define SERIALIZE_TYPE_UINT16 7
|
||||||
#define SERIALIZE_TYPE_UINT8 8
|
#define SERIALIZE_TYPE_UINT8 8
|
||||||
#define SERIALIZE_TYPE_DUOBLE 9
|
#define SERIALIZE_TYPE_DOUBLE 9
|
||||||
#define SERIALIZE_TYPE_STRING 10
|
#define SERIALIZE_TYPE_STRING 10
|
||||||
#define SERIALIZE_TYPE_BOOL 11
|
#define SERIALIZE_TYPE_BOOL 11
|
||||||
#define SERIALIZE_TYPE_OBJECT 12
|
#define SERIALIZE_TYPE_OBJECT 12
|
||||||
|
@ -103,11 +128,14 @@ The entry type can be bitwise OR'ed with a flag:
|
||||||
#define SERIALIZE_FLAG_ARRAY 0x80
|
#define SERIALIZE_FLAG_ARRAY 0x80
|
||||||
```
|
```
|
||||||
|
|
||||||
This signals there are multiple *values* for the entry. When we are dealing with
|
This signals there are multiple *values* for the entry. Since only one bit is
|
||||||
an array, the next value is a varint specifying the array length followed by
|
reserved for specifying an array, we cannot directly represent nested arrays.
|
||||||
the array item values. For example:
|
However, you can place each of the inner arrays inside of a section, and make
|
||||||
|
the outer array type `SERIALIZE_TYPE_OBJECT | SERIALIZE_FLAG_ARRAY`. Immediately following the type code byte is a varint specifying the length of the array.
|
||||||
|
Finally, all the elements are serialized in sequence with no padding and
|
||||||
|
without any type information. For example:
|
||||||
|
|
||||||
<p style="padding-left:1em; font:italic larger serif">name, type, count,
|
<p style="padding-left:1em; font:italic larger serif">type, count,
|
||||||
value<sub>1</sub>, value<sub>2</sub>,..., value<sub>n</sub></p>
|
value<sub>1</sub>, value<sub>2</sub>,..., value<sub>n</sub></p>
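
To make the type, count, values layout concrete, here is a hedged sketch that serializes an array of 32-bit unsigned integers. `encode_uint32_array` and `encode_varint` are illustrative names, and the entry name that precedes the type byte inside a section is left out.

```python
import struct
from typing import Sequence

SERIALIZE_TYPE_UINT32 = 6
SERIALIZE_FLAG_ARRAY = 0x80


def encode_varint(value: int) -> bytes:
    """Minimal varint encoder: the low 2 bits hold the size marker."""
    for marker, size in ((0, 1), (1, 2), (2, 4), (3, 8)):
        if 0 <= value < 1 << (size * 8 - 2):
            return ((value << 2) | marker).to_bytes(size, "little")
    raise ValueError("value out of range for a varint")


def encode_uint32_array(values: Sequence[int]) -> bytes:
    """Type byte with the array flag, varint count, then packed elements."""
    out = bytes([SERIALIZE_TYPE_UINT32 | SERIALIZE_FLAG_ARRAY])
    out += encode_varint(len(values))
    for v in values:
        # Elements carry no per-item type byte and no padding.
        out += struct.pack("<I", v)
    return out


print(encode_uint32_array([1, 2, 3]).hex(" "))
```

The type byte here is `0x86` (6 OR'ed with `0x80`), and the count 3 is written as the varint `0c`.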
|
||||||
|
|
||||||
#### Entry values
|
#### Entry values
|
||||||
|
@ -123,18 +151,32 @@ Note, I have not yet seen the type `SERIALIZE_TYPE_ARRAY` in use. My assumption
|
||||||
is this would be used for *untyped* arrays and so subsequent entries could be of
|
is this would be used for *untyped* arrays and so subsequent entries could be of
|
||||||
any type.
|
any type.
|
||||||
|
|
||||||
|
### Overall example
|
||||||
|
|
||||||
|
Let's put it all together and see what an entire object would look like serialized. To represent our data, let's create a JSON object (since it's a format
|
||||||
|
that most will be familiar with):
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"short_quote": "Give me liberty or give me death!",
|
||||||
|
"long_quote": "Monero is more than just a technology. It's also what the technology stands for.",
|
||||||
|
"signed_32bit_int": 20140418,
|
||||||
|
"array_of_bools": [true, false, true, true],
|
||||||
|
"nested_section": {
|
||||||
|
"double": -6.9,
|
||||||
|
"unsigned_64bit_int": 11111111111111111111
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This would translate to:
|
||||||
|
|
||||||
|
![Epee binary storage format example](/docs/images/storage_binary_example.png)
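
As a partial illustration only, the sketch below serializes the single `short_quote` entry from the JSON object using the rules described earlier: a section key, a type byte, and a varint-prefixed string. It does not reproduce the whole picture, and the function names are hypothetical.

```python
SERIALIZE_TYPE_STRING = 10

NAME = b"short_quote"
QUOTE = b"Give me liberty or give me death!"


def encode_varint(value: int) -> bytes:
    """Minimal varint encoder: the low 2 bits hold the size marker."""
    for marker, size in ((0, 1), (1, 2), (2, 4), (3, 8)):
        if 0 <= value < 1 << (size * 8 - 2):
            return ((value << 2) | marker).to_bytes(size, "little")
    raise ValueError("value out of range for a varint")


def encode_string_entry(name: bytes, value: bytes) -> bytes:
    """Section key, type byte, then the varint-length-prefixed value."""
    if len(name) > 255:
        raise ValueError("section keys are limited to 255 bytes")
    key = bytes([len(name)]) + name
    return key + bytes([SERIALIZE_TYPE_STRING]) + encode_varint(len(value)) + value


entry = encode_string_entry(NAME, QUOTE)
print(entry.hex(" "))
```

No count field appears because the type byte does not carry the array flag.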
|
||||||
|
|
||||||
## Monero specifics
|
## Monero specifics
|
||||||
|
|
||||||
### Entry values
|
### Entry values
|
||||||
|
|
||||||
#### Strings
|
|
||||||
|
|
||||||
These are prefixed with a varint to specify the string length.
|
|
||||||
|
|
||||||
#### Integers
|
|
||||||
|
|
||||||
These are stored little endian byte order.
|
|
||||||
|
|
||||||
#### Hashes, Keys, Blobs
|
#### Hashes, Keys, Blobs
|
||||||
|
|
||||||
These are stored as strings, `SERIALIZE_TYPE_STRING`.
|
These are stored as strings, `SERIALIZE_TYPE_STRING`.
|
||||||
|
|
BIN
docs/images/storage_binary_example.png
Normal file
Binary file not shown.
Size: 526 KiB