Home

TytoDB Documentation (v0.1.0-beta)

Getting started

To get started you must download the database using the command below

curl -fsSL https://tytodb.pages.dev/installer.sh | bash

To use it, you must write commands, queries, or transactions by using connection handlers like the TytoDB-Rust-Client. To configure the database or manage the program files manually, the files are usually in ~/TytoDB, since it uses $HOME/TytoDB to store its files.

Compatibility

The database is written in Rust and C, which means that it is portable across processors and systems. However, the program relies on io_uring, which means that the kernel must be version 5.1 or newer. If you are on an ARM device or want to compile the program yourself, the C part of the code (io.c) uses a global include #include , which must be available during compilation.

Overview

TytoDB is a high-performance, non-relational column-row database designed with a strong emphasis on ACID compliance. It excels in equality searches involving the primary key and is optimized for speed in database scans, as well as in insert and delete operations.

Containers

In TytoDB, data is stored in containers, which are files where the database data is stored. The disk layout of the containers involves three key components: the metadata layout, the row structure, and tombstones.

Metadata

Container files store metadata at the start of the file, organized as a tuple of two arrays:

  • The first array is a string array storing column names.
  • The second array is a byte array storing the type ID of each column.

When reading the metadata, the two arrays are interpreted by "matching" both arrays: the first column has the element zero as its name and the first element of the byte array as its type, and so on for the other columns.

Row Layout and Pattern

Rows are stored right after the metadata as an array of elements of the same size (determined by the type size). They are placed as elements in a sequential list (akin to a Vec or Array), which means that reads from the disk can be performed using a relative pointer (index of that pseudo-array).

Diagram of a data storage layout showing a metadata block followed by a sequential array of fixed-size rows, each indexed for relative pointer access.

Tombstones

After deleting a row, the space on the disk is filled with "full" bytes (0xFF, also represented as 255u8), marking that row as a tombstone. Afterwards, the pointer to that row is pushed into an in-memory BTreeSet, called the graveyard, so the space can be recycled later by having new data overwrite it. However, this does not guarantee that the space will be reused with every insert. When a row is inserted, it either gets the index at the end of the file if the graveyard is empty or pops a pointer from the structure. This pointer can be lost if the program rolls back or crashes. An operation called Vacuum is available to clean up the tombstones left behind.

Diagram illustrating a data storage layout with a metadata block followed by a sequential array of fixed-size rows, some marked as tombstones with full bytes (0xFF). Includes an in-memory BTreeSet graveyard storing up to 12,500 pointers to tombstones for reuse, with new rows either appended or overwriting graveyard pointers, and a Vacuum operation to clean up tombstones.

Alba Types

Referring to data types, Alba types is the name given to all the individual types a column can have in the database (Alba was the name of the query language the database once had in its first versions). The table below shows all the types available to be used within the program and their properties.

AlbaType ID Type Name Max Content Size Disk Size Description
0 NONE 0 bytes Absence of value (used only in metadata)
1 CHAR 1 Unicode scalar (4 bytes) 4 bytes Rust char (UTF-32)
2 INT 32-bit signed integer 4 bytes Rust i32
3 BIGINT 64-bit signed integer 8 bytes Rust i64
4 BOOL Boolean 1 byte Rust bool
5 FLOAT 64-bit IEEE-754 8 bytes Rust f64
6 TEXT up to MAX_STR_LEN dynamic String, variable-length (deprecated)
7 NANO-STRING up to 10 bytes 10 + 8 = 18 bytes Small fixed-max string with 8-byte length prefix
8 SMALL-STRING up to 100 bytes 100 + 8 = 108 bytes Fixed-max small string with length prefix
9 MEDIUM-STRING up to 500 bytes 500 + 8 = 508 bytes Fixed-max medium string with length prefix
10 BIG-STRING up to 2,000 bytes 2,000 + 8 = 2,008 bytes Fixed-max large string with length prefix
11 LARGE-STRING up to 3,000 bytes 3,000 + 8 = 3,008 bytes Fixed-max very large string with length prefix
12 NANO-BYTES up to 10 bytes 10 + 8 = 18 bytes Small fixed-max blob with length prefix
13 SMALL-BYTES up to 1,000 bytes 1,000 + 8 = 1,008 bytes Fixed-max small blob with length prefix
14 MEDIUM-BYTES up to 10,000 bytes 10,000 + 8 = 10,008 bytes Fixed-max medium blob with length prefix
15 BIG-BYTES up to 100,000 bytes 100,000 + 8 = 100,008 bytes Fixed-max large blob with length prefix
16 LARGE-BYTES up to 1,000,000 bytes 1,000,000 + 8 = 1,000,008 bytes Fixed-max very large blob with length prefix

Indexing

TytoDB performs its indexing with an on-disk hashmap, which stores two-item u64 tuples: the first 64-bit unsigned integer is the primary key (hashed using the ahash algorithm), and the second is the pointer to the row in the container. Indexing is used in most operations like insert, delete, edit, and search. It is employed during an insert to guarantee the uniqueness of a primary key (which is crucial for the indexing to keep working) and, for edit and delete operations, it can sometimes (depending on the query plan) be used to determine which rows must be affected by the operation.

Primary Keys

The primary key is always the first column of any container.

Operations

Create a Container

Creates a container using as arguments a tuple of arrays: the first one being a string array and the second one a byte array for generating the metadata. The last argument is a string that specifies the container name.

Create Row

Creates a row in a container using the following arguments: an array of strings, an array of values, and a string. The first and second arrays implement the row layout; it is not necessary to mention all the values of all columns to create a row. When not mentioned, the value's zero is assigned. This makes the explicit definition of primary keys desirable, although not required.

Edit Row

Makes a search in the specified container and edits the rows (edits go into the MVCC) returned by the search operation. This operation has four parameters: the first is an array of text which contains the names of the columns being edited, the second is the new values, the third is the name of the container that the operation will run on, and the last is a condition set for the search operation to filter the rows that match these conditions and edit only those rows.

Delete Row

Performs a search in the specified container and deletes the rows the search returns. It requires two arguments: the container name to specify where the operation will run, and the condition set for the search.

Delete Container

Deletes a container. It receives a container name as an argument.

Uses the query_type internal function to determine what sort of search must be done based on the condition set; the possible sorts are either scans or indexed searches. An indexed search uses the index hashmap to get a row, while a scan searches the dataset, returning the results of the rows matching the conditions (all rows if no conditions are entered). The operation requires three arguments: the first one is a string array that determines what columns are going to be fetched, the second is a string (container name) that determines the container where the operation will be run.

Commit

The Commit operation flushes data from the MVCC to the container file. It first iterates through the MVCC, gathering the data within it and detecting what must be written. There is a flag on every element in the MVCC structure that determines whether it is a deletion or not. If it is a deletion, the operation will write into the file at the row location an array of "full" bytes (0xFF or 255u8) with the same size as the original data and remove entries from the index file with the same key as the deleted row. Insertions can be either an insertion or an edit (the process is very similar in practice). For insertions, the commit writes the serialized row at the location it must be at and inserts into the hashmap the pointer to that row using the primary key as the key. Edits are similar to insertions; the only difference is that before writing the index into the hashmap, it removes entries with the old primary key to avoid collisions of any sort.

Rollback

Clears the MVCC and graveyard, effectively discarding all pending changes and reverting the database to its state before the transaction began.

Vacuum

Vacuum is a function that, when called by the database, runs in a container and cleans the tombstones in its file. This execution starts by scanning the entire dataset and pushing a boolean value that indicates whether the row is or is not a tombstone into a bitmap. Afterwards, two variables are declared: the first one is "cursor" and the other "back-cursor"; the cursor starts with a value of zero, and the back-cursor starts at N-1, where N is the bitmap length (count of elements). A loop runs until cursor > back-cursor, increasing the cursor value by one until it finds a tombstone in the bitmap at the index the cursor holds. Then, the loop state changes, and it starts decreasing the value of back-cursor until it finds a non-tombstone. After finding both a tombstone and a non-tombstone, both values are pushed into a vector that stores tuples of numbers (cursor and back-cursor positions).

After the loop involving direct interaction with the variables cursor and back-cursor ends, a for loop starts iterating through the vector of tuples, making swaps in the disk and the bitmap. After each swap, the references in the index map are updated, followed by an fsync. Although this slows the process down, it ensures data safety since a crash or power outage during this operation could be catastrophic, especially while there is no backup system like the one commit has with the MVCC.

MVCC

MVCC (Multi-Version Concurrency Control) is a mutex-protected hashmap that every container has in memory. Insertions into this structure are done so that the Commit operation can flush the data changes from it to the container file. After every insertion into the MVCC, appends are made to the MVCC WAL (Write-Ahead Log), which gets an fsync after every append. Both the append and fsync are run asynchronously to avoid blocking the operation that performed the push.

Networking

The database is network-based and listens on a port on a host (configured in the configurations), using TCP to operate. It employs FalcoTCP to handle the low-level networking alongside the cryptography of data being transmitted and connections. When a client sends bytes to the server, it tries to interpret them, and that is how commands are submitted to TytoDB. To handle those "bytes" the database receives and interprets, a connection handler built on top of FalcoTCP must be used. That handler will use an API/UI (whichever method is chosen) to help generate the byte instructions.

Configurations

The configurations change the way the database does certain things. The configuration file is found in the database's directory ($HOME/TytoDB/), named settings.yaml, which contains several elements:

  • min_columns: The minimum count of columns every container must have on creation (will not affect existing containers).
  • max_columns: The maximum count of columns every container can have on creation (will not affect existing containers).
  • ip: The host the database will listen to.
  • port: The port on the host the database will be listening to.
  • vacuum: The schedule of the "Vacuum" function.

Vacuum Scheduling

This setting in the YAML file is a list where every element is a string tuple. The first item is the container name (where the vacuum will happen), and the second is when the vacuum must happen, which can follow 4 different patterns:

  • "X unit": Where X is a number and unit is a unit of time (available units: "seconds", "minutes", "hours", "years", or "decades"). When using this schedule pattern, the function will be run every X * unit. For example, if X is 10 and the unit is "years", then a vacuum will happen every 10 years.
  • "HH:MM:SS": Schedules a vacuum to happen every day at the specified hour. For example, if set to "01:00:00", the database will run a vacuum every day at 1 AM.
  • "MM/DD HH:MM:SS": Schedules a vacuum to happen every year at the specified month, day, and time of day. For example, if set to "01/01 01:00:00", the database will run a vacuum every year on January 1st at 1 AM.
  • "once": Does not schedule a vacuum to happen repeatedly but runs a vacuum once on database startup.

General Example:

vacuum:
- ["users", "once"]
- ["logs", "12:00:00"]
- ["data", "5 minutes"]

Byte Instruction Protocol Specification

This document specifies the low-level binary protocol for TytoDB operations. It is derived directly from the serialization and deserialization logic in the Rust TytoDB Client source code (`commands.rs`, `types.rs`, `dynamic_int.rs`) and is intended for developers creating compatible connection handlers or alternative database implementations.

Fundamental Encoding Structures

DynamicInteger Encoding

To efficiently encode the size of variable-length data, TytoDB uses a custom `DynamicInteger` format. This structure consists of a 1-byte flag that signals the size of the subsequent integer, which holds the actual length. This allows for compact representation of small numbers while supporting lengths up to 2^64-1. All integers are written in little-endian byte order.

The encoding is determined as follows:

  • If the length fits within a u8 (0-255), the stream will contain the flag 0x00 followed by the 1-byte length.
  • If the length fits within a u16, the stream will contain the flag 0x01 followed by the 2-byte length.
  • If the length fits within a u32, the stream will contain the flag 0x02 followed by the 4-byte length.
  • For all larger lengths, the stream will contain the flag 0x03 followed by the 8-byte u64 length.

AlbaTypes Serialization

All data values are serialized using the `AlbaTypes` format. Each serialized value begins with a 1-byte Type ID, immediately followed by the encoded data. All multi-byte numerical values are encoded in little-endian byte order.

  • String (ID 0x00): A 1-byte Type ID (0x00), followed by a DynamicInteger representing the string's byte length, followed by the raw UTF-8 bytes of the string.
  • Bytes (ID 0x0b): A 1-byte Type ID (0x0b), followed by a DynamicInteger for the length, followed by the raw bytes.
  • Integers (U8-U128, I32, I64): A 1-byte Type ID (e.g., 0x01 for U8, 0x0a for I64), followed by the fixed-size, little-endian bytes of the number (1, 2, 4, 8, or 16 bytes).
  • Floats (F32, F64): A 1-byte Type ID (0x06 or 0x07), followed by the 4 or 8 bytes of the IEEE 754 representation.
  • Bool (ID 0x08): A 1-byte Type ID (0x08), followed by a single byte: 0x01 for true, 0x00 for false.

Condition Block Structure

This reusable block is used to define filters for `EditRow`, `DeleteRow`, and `Search` commands. It is a sequence of two distinct parts: the conditions themselves and the logic that combines them.

First, a single u8 byte specifies the number of condition tuples (D). This is followed by D condition tuples. Each tuple is serialized by writing the condition's column name (as a u8-prefixed string), followed by the 1-byte ID of the logical operator (e.g., `Equal`, `Higher`), followed by the value for comparison, which is serialized as a complete `AlbaType` structure.

After all condition tuples, a second u8 byte specifies the number of logic operators (L). This is followed by L logic tuples. Each logic tuple consists of a 1-byte index pointing to a condition and a 1-byte operator (0x01 for 'AND', 0x00 for 'OR') that links it to the preceding condition.

Operation Structures

1. Create Container Opcode: 0x00

The byte stream begins with the opcode 0x00. It is followed by the container name, encoded as a single u8 for its length and then the name's raw ASCII bytes. Next, a single u8 specifies the number of columns (C). This is followed by a sequence of C column name definitions, where each name is individually prefixed by its u8 length. Finally, a contiguous block of C bytes is appended, where each byte is the u8 Type ID for the corresponding column in the order they were defined.

2. Create Row Opcode: 0x01

The stream starts with opcode 0x01. It is followed by the container name, encoded as a u8-prefixed string. Next, a u8 indicates the number of columns (C) for which values are being provided. This is followed by C column names, each encoded as a u8-prefixed string. Finally, a sequence of C values are appended, with each value being fully serialized according to its `AlbaType` structure (Type ID + data).

3. Edit Row Opcode: 0x02

The stream begins with opcode 0x02, followed by the u8-prefixed container name. After the name, a u8 specifies the number of changes (C). This is followed by C pairs of a u8-prefixed column name and a fully serialized `AlbaType` value. After the changes block, a complete Condition Block is appended to specify which rows to update.

4. Delete Row Opcode: 0x03

The instruction starts with opcode 0x03, followed by the u8-prefixed container name. A single byte flag follows; if it is 0x01, a complete Condition Block is appended. If the flag is 0x00, no further data is present, indicating a deletion of all rows (if supported by the server's query plan).

5. Delete Container Opcode: 0x04

This instruction starts with opcode 0x04. It is immediately followed by the raw ASCII bytes of the container name. It is critical to note that this is an exceptional case: the container name is not prefixed with its length.

6. Search Opcode: 0x05

The search instruction is uniquely structured. It begins with opcode 0x05. This is followed by a u8 indicating the number of columns to fetch (C), then C instances of u8-prefixed column names. After the column names, a complete Condition Block is appended. Following the conditions, an 8-byte u64 specifies the length (L) of the final data block. This final block contains the target container, encoded as a standard u8-prefixed string.

7. Commit Opcode: 0x06

Starts with opcode 0x06. A 1-byte flag follows: if 0x01, a u8-prefixed container name is appended to commit changes only for that container. If 0x00, no more data follows, and the commit applies to all containers with pending changes.

8. Rollback Opcode: 0x07

Starts with opcode 0x07. A 1-byte flag follows: if 0x01, a u8-prefixed container name is appended to roll back changes for that container only. If 0x00, the rollback is for all containers.

9. Batch Create Rows Opcode: 0x08

The instruction begins with opcode 0x08, followed by a u8-prefixed container name. Next, a u8 specifies the number of columns (C), followed by C u8-prefixed column names. After the schema information, a 4-byte u32 indicates the number of rows (R) in the batch. Finally, a flattened sequence of R * C values, each serialized as a complete `AlbaType`, is appended.

10. Batch Opcode: 0x09

This instruction starts with opcode 0x09. It is followed by a 4-byte signed integer (i32). The absolute value of this integer represents the number of commands (C) in the batch. If the integer is negative, the entire batch should be executed as a single transaction. The integer is followed by C command blocks. Each block consists of a 4-byte u32 indicating the length of the subsequent command's byte stream, followed by the complete byte stream of that command (including its own opcode and data).

Implementation Constraints

The serialization logic imposes the following maximum sizes:

  • Container names are limited to 100 bytes (MAX_CONTAINER_NAME_LENGTH).
  • Column names are limited to 25 bytes (MAX_CONTAINER_COLUMN_LENGTH).
  • The number of columns in a container is limited to 255 (u8::MAX).
  • The number of conditions in a single operation is limited to 255 (u8::MAX).