DVault is a secure, encrypted file vault format designed for:
- Streaming: Ability to read individual files or parts of files without downloading the entire vault.
- Random Access: Efficiently seek to any byte in any file.
- Small File Efficiency: Efficient storage of many small files.
- Browser Compatibility: Designed to work with a Dart Virtual File System backed by SQLite in the browser.
The vault is divided into fixed-size "Pages" (e.g., 64KB).
- Independent Decryption: Each page can be decrypted independently. This allows fetching and decrypting only the needed parts of the vault.
- Streaming: The browser can request specific pages as needed.
- Caching: Pages can be cached locally (e.g., in SQLite) to avoid re-downloading.
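
Because pages are fixed-size, the byte range of any page within the vault file can be computed directly, which is what makes on-demand fetching possible. The following is a minimal Python sketch, not the Dart implementation. It assumes pages start immediately after the header and uses a hypothetical `HEADER_SIZE`, since the header size is not specified here.

```python
PAGE_SIZE = 64 * 1024  # encrypted page size, as in the 64KB example
HEADER_SIZE = 128      # hypothetical fixed header size (assumption, not part of the spec)


def page_range_header(page_index: int) -> dict:
    """Return an HTTP Range header covering exactly one encrypted page."""
    if page_index < 0:
        raise ValueError("page_index must be non-negative")
    start = HEADER_SIZE + page_index * PAGE_SIZE
    end = start + PAGE_SIZE - 1  # HTTP byte ranges are inclusive on both ends
    return {"Range": f"bytes={start}-{end}"}
```

A client would attach this header to a GET request for the vault and decrypt the returned page on its own.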
- The vault acts as a backing store for a VFS.
- The VFS maintains a mapping of `File Path` -> `Vault Offset + Length`.
- The VFS reads from the vault by calculating which Pages contain the requested byte range.
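
The page calculation can be sketched as follows, in Python for brevity. This assumes the configured page size covers the whole encrypted page (nonce, ciphertext, and tag), so each page carries `PAGE_SIZE - 12 - 16` bytes of the packed stream, and that offsets are positions in that decrypted stream. The names `PageSlice` and `pages_for_range` are illustrative only.

```python
from typing import Iterator, NamedTuple

PAGE_SIZE = 65536
NONCE_SIZE = 12
TAG_SIZE = 16
PAYLOAD_SIZE = PAGE_SIZE - NONCE_SIZE - TAG_SIZE  # plaintext bytes per page (assumption)


class PageSlice(NamedTuple):
    page_index: int  # which page to fetch and decrypt
    start: int       # first wanted byte within the decrypted payload
    end: int         # one past the last wanted byte


def pages_for_range(offset: int, length: int) -> Iterator[PageSlice]:
    """Yield the pages, and the slice within each, covering [offset, offset + length)."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    pos, stop = offset, offset + length
    while pos < stop:
        page_index, start = divmod(pos, PAYLOAD_SIZE)
        end = min(PAYLOAD_SIZE, start + (stop - pos))
        yield PageSlice(page_index, start, end)
        pos += end - start
```

A read of 20 bytes starting 8 bytes before a page boundary yields two slices: the last 8 bytes of one page and the first 12 bytes of the next.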
- Algorithm: AES-256-GCM (or XChaCha20-Poly1305) is recommended for authenticated encryption.
- Key Derivation: Argon2id or Scrypt for deriving keys from a passphrase.
- Page Security: Each page must have a unique Nonce/IV.
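- Nonce Generation: Either a deterministic nonce derived from Page Index + Salt, or a random nonce stored with the page. A deterministic nonce is preferred for space efficiency with XChaCha20 or similar large-nonce ciphers, since a 24-byte nonce would otherwise be stored with every page. For AES-GCM (12-byte nonce), the nonce must either be stored with the page or derived carefully, because reusing a nonce under the same key breaks GCM's security guarantees.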
- Authentication: Each page is authenticated (GCM or Poly1305 tag) to prevent tampering.
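
As a rough illustration of the key derivation and deterministic nonce options, here is a Python sketch using only the standard library. The scrypt cost parameters and the nonce layout (a 4-byte per-vault prefix followed by an 8-byte page index) are assumptions chosen for the example, not part of the format.

```python
import hashlib
import struct


def derive_key(passphrase: str, salt: bytes) -> bytes:
    """Derive a 256-bit key with scrypt (cost parameters are illustrative)."""
    return hashlib.scrypt(
        passphrase.encode("utf-8"), salt=salt, n=2**14, r=8, p=1, dklen=32
    )


def page_nonce(nonce_prefix: bytes, page_index: int) -> bytes:
    """Build a 12-byte nonce: 4-byte prefix + 8-byte big-endian page index."""
    if len(nonce_prefix) != 4:
        raise ValueError("nonce_prefix must be exactly 4 bytes")
    return nonce_prefix + struct.pack(">Q", page_index)
```

A nonce derived this way is unique per page index, but only as long as a given page index is encrypted at most once under a given key. Rewriting a page in place under the same key would reuse its nonce, so a scheme like this would need a fresh prefix or key on rewrite, or a stored random nonce instead.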
Fixed-size header containing:
- Magic Bytes: `DVAULT`
- Version: Format version (e.g., `2`)
- KDF Parameters: Salt, Iterations, Memory cost, etc.
- Page Size: Size of each encrypted page (e.g., 65536 bytes).
- TOC Pointer: Offset to the Table of Contents (usually at the end of the file).
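
One possible encoding of such a header is sketched below in Python. The field widths, field order, and big-endian byte order are all assumptions made for illustration. Only the field list itself comes from the description above.

```python
import struct
from dataclasses import dataclass

MAGIC = b"DVAULT"
# Assumed layout: magic(6) version(u16) salt(16) iterations(u32)
# memory_kib(u32) page_size(u32) toc_offset(u64), big-endian, no padding.
_HEADER = struct.Struct(">6sH16sIIIQ")


@dataclass(frozen=True)
class Header:
    version: int
    salt: bytes
    iterations: int
    memory_kib: int
    page_size: int
    toc_offset: int

    def pack(self) -> bytes:
        if len(self.salt) != 16:
            raise ValueError("salt must be exactly 16 bytes in this sketch")
        return _HEADER.pack(MAGIC, self.version, self.salt, self.iterations,
                            self.memory_kib, self.page_size, self.toc_offset)

    @classmethod
    def unpack(cls, data: bytes) -> "Header":
        magic, *fields = _HEADER.unpack(data[:_HEADER.size])
        if magic != MAGIC:
            raise ValueError("not a DVault file (bad magic bytes)")
        return cls(*fields)
```

A reader would fetch the first `_HEADER.size` bytes, check the magic bytes and version, and then follow `toc_offset` to locate the Table of Contents.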
The body of the vault consists of a sequence of Pages.
- Page Structure: `[Nonce (12 bytes)] [Ciphertext (N bytes)] [Auth Tag (16 bytes)]`
- Content: Pages contain a continuous stream of data. Files are packed into this stream.
- Packing: Small files are concatenated. A file may start in the middle of Page X and end in the middle of Page Y.
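
The packing step can be illustrated with a short Python sketch. It builds the whole stream in memory, which a real writer would likely avoid, and it leaves out encryption entirely. `pack_files` and its return shape are hypothetical names for this example.

```python
def pack_files(files: dict, payload_size: int):
    """Concatenate files into one stream and cut it into page payloads.

    Returns (index, pages), where index maps path -> (offset, length)
    in the stream and pages is the list of plaintext page payloads.
    """
    stream = bytearray()
    index = {}
    for path, data in files.items():
        index[path] = (len(stream), len(data))  # file starts where the stream ends
        stream += data
    pages = [
        bytes(stream[i:i + payload_size])
        for i in range(0, len(stream), payload_size)
    ]
    return index, pages
```

With a 16-byte payload and files of 10, 10, and 5 bytes, the second file starts at offset 10 in the first page and ends 4 bytes into the second page, which is the "middle of Page X to middle of Page Y" case.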
The TOC stores the metadata for all files in the vault.
- Location: Stored at the end of the vault (allows appending).
- Encryption: The TOC itself is stored in one or more encrypted pages.
- Content:
  - List of File Entries:
    - Path (UTF-8)
  - Directory Structure: Implicit (Full Paths).
    - The TOC stores full paths (e.g., `photos/2023/image.jpg`).
    - Directories are inferred from the paths.
  - Environment Variables:
    - A `Map<String, String>` stored in the TOC.
    - Allows fast access/update without modifying file content pages.
    - Encrypted along with the rest of the TOC.
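
Inferring directories from full paths can be sketched in a few lines of Python. This assumes paths use `/` as the separator and are relative to the vault root. The function name is illustrative.

```python
from pathlib import PurePosixPath


def infer_directories(paths):
    """Return the sorted set of directories implied by a list of full file paths."""
    dirs = set()
    for path in paths:
        for parent in PurePosixPath(path).parents:
            if str(parent) != ".":  # skip the implicit vault root
                dirs.add(str(parent))
    return sorted(dirs)
```

For the example path `photos/2023/image.jpg`, this yields `photos` and `photos/2023`. One consequence of implicit directories is that an empty directory cannot be represented.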
Both the Browser and CLI will use a simple In-Memory Index for the Table of Contents (TOC).
- TOC Loading:
  - On open, the application reads the TOC (located at the end of the file) into memory.
  - It builds a `Map<String, FileEntry>` or a lightweight Tree structure.
- Memory Usage:
  - A vault with 100,000 files is expected to consume roughly 50-100MB of RAM (about 0.5-1KB per entry), which is acceptable for modern Browsers and Desktop environments.
- Caching:
  - Browser: Can optionally cache decrypted Pages in memory (LRU Cache) to improve performance when seeking/reading small chunks.
  - CLI: Relies on OS file buffering.
- Simplicity:
  - No external dependencies (such as SQLite).
  - Pure Dart implementation.
  - Identical logic for both platforms.
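
The optional LRU cache of decrypted pages could look roughly like the Python sketch below. The `load_page` callback stands in for "fetch and decrypt page N" and is an assumption of this example, as is the default capacity.

```python
from collections import OrderedDict
from typing import Callable


class PageCache:
    """Keep the most recently used decrypted pages in memory."""

    def __init__(self, load_page: Callable[[int], bytes], capacity: int = 64):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._load_page = load_page
        self._capacity = capacity
        self._pages = OrderedDict()  # page_index -> decrypted payload, oldest first

    def get(self, page_index: int) -> bytes:
        if page_index in self._pages:
            self._pages.move_to_end(page_index)  # mark as most recently used
            return self._pages[page_index]
        data = self._load_page(page_index)
        self._pages[page_index] = data
        if len(self._pages) > self._capacity:
            self._pages.popitem(last=False)  # evict the least recently used page
        return data
```

With 64KB pages, a capacity of 64 bounds the cache at about 4MB, which is small next to the in-memory index estimate above.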
The CLI tool `dvault` must support:
- Creating a vault from a directory.
- Extracting a vault.
- Mounting a vault (FUSE) or serving it via HTTP (for browser testing).
- Listing contents.
- Open question: Which methods should the user be able to use to pass in the password?
By packing files into a continuous stream, we avoid padding overhead for small files.
- Overhead: Only the per-page overhead (Nonce + Tag) and the TOC entry size.
- Example: 1000 files of 100 bytes each = 100KB payload.
  - Stored in ~2 Pages (64KB each).
  - Overhead: ~2 * (12+16) bytes = 56 bytes for encryption + TOC size.
  - Very efficient compared to 1000 separate encrypted blobs.
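
The arithmetic in this example can be checked with a small Python helper. It assumes the page size includes the nonce and tag, and it counts only the encryption overhead, not the TOC or any padding of the final page.

```python
import math


def encryption_overhead(payload_bytes: int, page_size: int = 65536,
                        nonce_size: int = 12, tag_size: int = 16):
    """Return (pages_needed, overhead_bytes) for a packed payload."""
    per_page = nonce_size + tag_size
    payload_per_page = page_size - per_page
    pages = math.ceil(payload_bytes / payload_per_page)
    return pages, pages * per_page
```

For 1000 files of 100 bytes, `encryption_overhead(1000 * 100)` gives 2 pages and 56 bytes. Encrypting each file as its own blob with the same nonce and tag sizes would cost 1000 * 28 = 28,000 bytes.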