diff --git a/README.md b/README.md
index 600c8f63..83afec5e 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@

- Project logo + Project logo

@@ -50,7 +50,13 @@ Encoderfiles can run as: - CLI for batch processing - MCP server (Model Context Protocol)
-![Build Diagram](docs/assets/diagram.png)
+
+<picture>
+  <source media="(prefers-color-scheme: dark)" srcset="docs/assets/encoderfile-dark.svg">
+  <source media="(prefers-color-scheme: light)" srcset="docs/assets/encoderfile-light.svg">
+  <img alt="Architecture Diagram" src="docs/assets/encoderfile-light.svg">
+</picture>

 ### Supported Architectures
@@ -105,6 +111,7 @@ cargo build --bin encoderfile --release
 ### Step 1: Prepare Your Model
 
 First, you need an ONNX-exported model. Export any HuggingFace model:
+> Requires Python 3.13+ for ONNX export
 
 ```bash
 # Install optimum for ONNX export
diff --git a/docs/assets/diagram.mmd b/docs/assets/diagram.mmd
deleted file mode 100644
index a8475057..00000000
--- a/docs/assets/diagram.mmd
+++ /dev/null
@@ -1,42 +0,0 @@
-flowchart TD
-    subgraph Build Process
-        subgraph Inputs ["1. Input Assets"]
-            direction TB
-            Onnx["ONNX Model<br/>(.onnx)"]:::asset
-            Tok["Tokenizer Data<br/>(tokenizer.json)"]:::asset
-            Config["Runtime Config<br/>(config.yml)"]:::asset
-        end
-
-        subgraph Compile ["2. Compile Phase"]
-            Compiler["Encoderfile Compiler<br/>(CLI Tool)"]:::asset
-        end
-
-        subgraph Build ["3. Build Phase"]
-            direction TB
-            Builder["Wrapper Process<br/>(Embeds Assets + Runtime)"]:::process
-        end
-
-        subgraph Output ["4. Artifact"]
-            Binary["Single Binary Executable<br/>(Static File)"]:::artifact
-        end
-
-        subgraph Runtime ["5. Runtime Phase"]
-            direction TB
-            %% Added fa:fa-server icons
-            Grpc["fa:fa-server gRPC Server<br/>(Protobuf)"]:::service
-            Http["fa:fa-server HTTP Server<br/>(JSON)"]:::service
-            MCP["fa:fa-server MCP Server<br/>(MCP)"]:::service
-            %% Added fa:fa-cloud icon
-            Client["fa:fa-cloud Client Apps /<br/>MCP Agent"]:::client
-        end
-
-        %% Connections
-        Onnx & Tok & Config --> Builder
-        Compiler -.->|"Orchestrates"| Builder
-        Builder -->|"Outputs"| Binary
-
-        %% Runtime Connections
-        Binary -.->|"Executes"| Grpc
-        Binary -.->|"Executes"| Http
-        Grpc & Http & MCP -->|"Responds to"| Client
-    end
diff --git a/docs/assets/diagram.png b/docs/assets/diagram.png
deleted file mode 100644
index 616da537..00000000
Binary files a/docs/assets/diagram.png and /dev/null differ
diff --git a/docs/assets/encoderfile-dark.svg b/docs/assets/encoderfile-dark.svg
new file mode 100644
index 00000000..20cea943
--- /dev/null
+++ b/docs/assets/encoderfile-dark.svg
@@ -0,0 +1,1202 @@
+[1,202 lines of SVG markup omitted: dark-theme rendering of the architecture diagram (Input Assets → Compile Phase → Build Phase → Artifact → Runtime Phase)]
\ No newline at end of file
diff --git a/docs/assets/encoderfile-light.svg b/docs/assets/encoderfile-light.svg
new file mode 100644
index 00000000..e8b0e3bb
--- /dev/null
+++ b/docs/assets/encoderfile-light.svg
@@ -0,0 +1,1202 @@
+[1,202 lines of SVG markup omitted: light-theme rendering of the same architecture diagram]
\ No newline at end of file
diff --git a/docs/building_encoderfiles/docker.md b/docs/building_encoderfiles/docker.md
index cedd4c5f..f1bf208c 100644
--- a/docs/building_encoderfiles/docker.md
+++ b/docs/building_encoderfiles/docker.md
@@ -11,7 +11,7 @@ docker pull ghcr.io/mozilla-ai/encoderfile:latest
 ```
 !!! note "Note on Architecture"
-    Images are published for both `x86_64` and `arm64`. If you're on a more exotic architecture, you'll need to build the Encoderfile CLI from source — see our guide on [Building from Source](../reference/building.md) for more details.
+    Images are published for both `x86_64` and `arm64`. If you're on a more exotic architecture, you'll need to build the encoderfile CLI from source — see our guide on [Building from Source](../reference/building.md) for more details.
 ## Mounting Assets
@@ -84,8 +84,7 @@ Your path in config.yml doesn’t match where the file appears inside the container.
 Most of the time this is a missing -v "$(pwd):/opt/encoderfile" or a mismatched working directory.
 ### “cargo not found”
-You’re not using the correct image.
-Use ghcr.io/mozilla-ai/encoderfile:latest — it includes the full Rust toolchain needed for builds.
+You’re not using the correct image. Make sure you are using `ghcr.io/mozilla-ai/encoderfile:latest`.
 ### Paths behave differently on Windows
 Use absolute paths or WSL. Docker-for-Windows path translation varies by shell.
diff --git a/docs/getting-started.md b/docs/getting-started.md
index 67deb671..f6116c5d 100644
--- a/docs/getting-started.md
+++ b/docs/getting-started.md
@@ -4,7 +4,7 @@ This quick-start guide will help you build and run your first encoderfile in und
 ## Prerequisites
-### Encoderfile CLI Tool
+### encoderfile CLI Tool
 You need the `encoderfile` CLI tool installed:
@@ -14,7 +14,7 @@ curl -fsSL https://raw.githubusercontent.com/mozilla-ai/encoderfile/main/install
 ```
 - **Build from source** - Required for Windows, or for latest development features
-  - See [our guide on building Encoderfile CLI from source](reference/building.md)
+  - See [our guide on building encoderfile CLI from source](reference/building.md)
 - **Docker** - Best for CI/CD or isolated builds without installing dependencies
   - Check out our guide on [Building Encoderfiles with Docker](building_encoderfiles/docker.md)
diff --git a/docs/index.md b/docs/index.md
index cad0e2df..d1599b68 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -100,7 +100,11 @@ See the [API Reference](reference/api-reference.md) for complete endpoint docume
 Encoderfile compiles your model into a self-contained binary by embedding ONNX weights, tokenizer, and config directly into Rust code. The result is a portable executable with zero runtime dependencies.
-![Encoderfile architecture diagram illustrating the build process: compiling ONNX models, tokenizers, and configs into a single binary executable that runs as a zero-dependency gRPC, HTTP, or MCP server.](assets/diagram.png "Encoderfile Architecture")
+<picture>
+  <source media="(prefers-color-scheme: dark)" srcset="assets/encoderfile-dark.svg">
+  <source media="(prefers-color-scheme: light)" srcset="assets/encoderfile-light.svg">
+  <img alt="Encoderfile architecture diagram illustrating the build process: compiling ONNX models, tokenizers, and configs into a single binary executable that runs as a zero-dependency gRPC, HTTP, or MCP server." src="assets/encoderfile-light.svg">
+</picture>
+
 ## Documentation
diff --git a/docs/reference/building.md b/docs/reference/building.md
index 49e735b5..4cf257bb 100644
--- a/docs/reference/building.md
+++ b/docs/reference/building.md
@@ -6,9 +6,12 @@ This guide explains how to build custom encoderfile binaries from HuggingFace tr
 Before building encoderfiles, ensure you have:
-- [Rust](https://rust-lang.org/tools/install/) - For building the CLI tool and binaries
 - [Python 3.13+](https://www.python.org/downloads/) - For exporting models to ONNX
 - [uv](https://docs.astral.sh/uv/getting-started/installation/) - Python package manager
+
+If you are compiling the encoderfile CLI from source, make sure you also have:
+
+- [Rust](https://rust-lang.org/tools/install/) - For building the CLI tool and binaries
 - [protoc](https://protobuf.dev/installation/) - Protocol Buffer compiler
 ### Installing Prerequisites
@@ -214,13 +217,15 @@ encoderfile build -f config.yml
 The build process will:
-1. Load and validate the configuration
-2. Check for required model files
-3. Validate the ONNX model structure
-4. Generate a Rust project with embedded assets
-5. Compile the project into a self-contained binary
+1. Detect your system platform and download the base runtime binary
+2. Load and validate the configuration
+3. Check for required model files
+4. Validate the ONNX model structure
+5. Format assets and append them to the base binary
 6. Output the binary to the specified path (or `./.encoderfile` if not specified)
+For more information on the encoderfile file format and build process, see our page on the [Encoderfile File Format](./file_format.md).
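The "format assets and append" step above can be sketched in Python. This is a minimal illustration, not the real implementation: the function name and the 32-byte footer layout (8-byte magic, u64 manifest offset, u64 manifest length, u32 flags, u32 format version, little-endian) are assumptions for this sketch; the authoritative definitions live in the encoderfile source.

```python
import struct

MAGIC = b"ENCFILE\0"   # 8-byte magic from the format description
FOOTER_LEN = 32        # fixed-size trailer

def build_encoderfile(base_binary: bytes, manifest: bytes, artifacts: list[bytes]) -> bytes:
    """Sketch of the append step: base binary + manifest + artifacts + footer.

    The footer layout used here is an assumption for illustration; the real
    definition lives in encoderfile/src/format/footer.rs.
    """
    out = bytearray(base_binary)
    manifest_offset = len(out)
    out += manifest                  # self-describing protobuf manifest
    for blob in artifacts:
        out += blob                  # raw artifact blobs (weights, tokenizer, ...)
    # Footer: magic + manifest offset/length + flags + format version.
    out += MAGIC + struct.pack("<QQII", manifest_offset, len(manifest), 0, 1)
    return bytes(out)
```

Because the footer sits at a fixed offset from the end of the file, the appended sections never disturb the machine code that the OS loader maps and executes.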
+
 **Build output:**
 ```
 ./build/my-model.encoderfile
 ```
@@ -298,6 +303,33 @@ encoderfile:
 ## Advanced Features
+### Cross-compilation
+
+Specify a target architecture for your encoderfile by using the `--platform` argument:
+
+```bash
+encoderfile build -f encoderfile.yml --platform <target>
+```
+
+Encoderfile releases pre-built base binaries for the following architectures:
+
+- `x86_64-unknown-linux-gnu`
+- `aarch64-unknown-linux-gnu`
+- `x86_64-apple-darwin`
+- `aarch64-apple-darwin`
+
+If you want to build the base binary locally, you can also point the CLI at a local path. For example:
+
+```bash
+# build encoderfile base binary from source (will be at ./target/release/encoderfile-runtime)
+cargo build -p encoderfile-runtime --release
+
+# create encoderfile
+encoderfile build \
+  -f encoderfile.yml \
+  --base-binary-path ./target/release/encoderfile-runtime
+```
+
 ### Lua Transforms
 Add custom post-processing with Lua scripts:
diff --git a/docs/reference/cli.md b/docs/reference/cli.md
index edf78f0d..54af7101 100644
--- a/docs/reference/cli.md
+++ b/docs/reference/cli.md
@@ -1,4 +1,4 @@
-# Encoderfile CLI Documentation
+# encoderfile CLI Documentation
 ## Overview
@@ -33,10 +33,15 @@ encoderfile build -f <config.yml> [OPTIONS]
 | Option | Short | Type | Required | Description |
 |--------|-------|------|----------|-------------|
-| `--config` | `-f` | Path | Yes | Path to YAML configuration file |
+| - | `-f` | Path | Yes | Path to YAML configuration file |
 | `--output-dir` | - | Path | No | Override output directory from config |
 | `--cache-dir` | - | Path | No | Override cache directory from config |
 | `--no-build` | - | Flag | No | Generate project files without building |
+| `--base-binary-path` | - | Path | No | Specify a custom local base binary |
+| `--platform` | - | Option | No | Target platform for the compiled binary (e.g., `aarch64-apple-darwin`, `x86_64-unknown-linux-gnu`). Equivalent to Cargo's `--target`. Defaults to the host machine's architecture. |
+| `--version` | - | Option | No | Override default encoderfile version |
+| `--no-download` | - | Flag | No | Disable downloading of the base binary |
+
 #### Configuration File Format
@@ -160,12 +165,8 @@ The `build` command performs the following steps:
 - `tokenizer.json` - Tokenizer configuration (or path specified in config)
 - `config.json` - Model configuration (or path specified in config)
 3. **Validates ONNX model** - Checks the ONNX model structure and compatibility
-4. **Generates project** - Creates a new Rust project in the cache directory with:
-   - `main.rs` - Generated from Tera templates
-   - `Cargo.toml` - Generated with proper dependencies
-5. **Embeds assets** - Uses the `factory!` macro to embed model files at compile time
-6. **Compiles binary** - Runs `cargo build --release` on the generated project
-7. **Outputs binary** - Copies the binary to the specified output path
+4. **Embeds assets** - Appends embedded artifacts to a pre-built base binary
+5. **Outputs binary** - Copies the binary to the specified output path
 #### Output
@@ -191,9 +192,12 @@ This binary is completely self-contained and includes:
 Before building, ensure you have:
+- Valid ONNX model files
+
+If you are compiling the encoderfile CLI from source, make sure you also have:
+
 - [Rust](https://rustup.rs/) toolchain
 - [protoc](https://protobuf.dev/) Protocol Buffer compiler
-- Valid ONNX model files
 #### Troubleshooting
@@ -215,12 +219,6 @@
 Solution: The path specified in the config file doesn't exist.
 Check the path value in your YAML config.
 ```
-**Error: "cargo build failed"**
-```
-Solution: Check that Rust and required system dependencies are installed.
-Run: rustc --version && cargo --version
-```
-
 **Error: "Cannot locate cache directory"**
 ```
 Solution: System cannot determine the cache directory.
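When `--platform` is omitted, the CLI falls back to the host machine's architecture. A rough sketch of that fallback follows; the mapping table and function are hypothetical illustrations, and the real CLI's detection logic may differ.

```python
# Hypothetical sketch of default-platform detection: map (machine, OS) --
# e.g. platform.machine() / platform.system() -- to one of the Rust target
# triples that ship with pre-built base binaries.
_ARCH = {"x86_64": "x86_64", "amd64": "x86_64", "arm64": "aarch64", "aarch64": "aarch64"}
_OS = {"Linux": "unknown-linux-gnu", "Darwin": "apple-darwin"}

def default_platform(machine: str, system: str) -> str:
    """Return the Rust target triple used when --platform is omitted."""
    try:
        return f"{_ARCH[machine.lower()]}-{_OS[system]}"
    except KeyError:
        raise ValueError(f"no pre-built base binary for {machine}/{system}") from None
```

For example, `default_platform("arm64", "Darwin")` yields `aarch64-apple-darwin`, matching the pre-built binary list above.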
diff --git a/docs/reference/file_format.md b/docs/reference/file_format.md
new file mode 100644
index 00000000..464005b3
--- /dev/null
+++ b/docs/reference/file_format.md
@@ -0,0 +1,59 @@
+# Encoderfile File Format
+
+Encoderfiles are essentially Rust binary executables with a custom appended section containing metadata and inference assets. At runtime, an encoderfile reads its own executable and pulls embedded data as needed.
+
+Encoderfiles are composed of four parts (in order):
+
+- **Rust binary:** Machine code that is actually executed at runtime
+- **Encoderfile manifest:** A protobuf containing encoderfile metadata and the lengths, offsets, and hashes of model artifacts
+- **Model artifacts:** Appended raw binary blobs containing model weights, tokenizer information, transforms, etc.
+- **Footer:** A fixed-size (32-byte) footer that contains a magic number (`b"ENCFILE\0"`), the location of the manifest, flags, and the format version
+
+This approach has a few significant advantages:
+
+- No language toolchain is required for building encoderfiles
+- Encoderfiles are forward-compatible by design: a versioned footer plus a self-describing protobuf manifest allow new artifact types and metadata to be added without changing the binary layout or breaking older runtimes
+
+The official file extension for encoderfiles is `.encoderfile`.
+
+For implementation details, see the [Protobuf specification for the encoderfile manifest](https://github.com/mozilla-ai/encoderfile/blob/main/encoderfile/proto/manifest.proto) and the [footer](https://github.com/mozilla-ai/encoderfile/blob/main/encoderfile/src/format/footer.rs).
+
+## Base Binaries
+
+The source code for the base binary to which model artifacts are appended can be found in the [encoderfile-runtime](https://github.com/mozilla-ai/encoderfile/tree/main/encoderfile-runtime) crate. By default, the encoderfile CLI pulls pre-built binaries from GitHub Releases.
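The read-back side of the format described above (seek to the fixed-size footer, verify the magic, then locate the manifest) can be sketched in Python. The exact footer field layout assumed here (8-byte magic, then u64 manifest offset, u64 manifest length, u32 flags, u32 format version, little-endian) is an illustration only, not the real `footer.rs` definition.

```python
import os
import struct

MAGIC = b"ENCFILE\0"
FOOTER_LEN = 32  # fixed-size footer at the very end of the file

def read_manifest(path: str) -> bytes:
    """Sketch: locate and return the embedded manifest bytes of an encoderfile.

    The footer field layout below is assumed for illustration; see
    encoderfile/src/format/footer.rs for the authoritative definition.
    """
    with open(path, "rb") as f:
        f.seek(-FOOTER_LEN, os.SEEK_END)        # footer is the last 32 bytes
        footer = f.read(FOOTER_LEN)
        if footer[:8] != MAGIC:
            raise ValueError("not an encoderfile: bad magic")
        offset, length, _flags, _version = struct.unpack("<QQII", footer[8:])
        f.seek(offset)
        return f.read(length)                   # protobuf-encoded manifest
```

Reading from a fixed offset relative to the end is what lets the runtime find its embedded data without any knowledge of the base binary's size.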
Currently, we offer pre-built binaries for `aarch64` and `x86_64` architectures of `unknown-linux-gnu` and `apple-darwin`.
+
+Base binaries are built in a `debian:bookworm` image and are compatible with glibc ≥ 2.36. If you are using an older version of glibc, see the instructions on compiling custom base binaries below.
+
+### Cross-compilation & Custom Base Binaries
+
+Pre-built binaries make cross-compilation for major platforms and operating systems trivial. When building encoderfiles, just specify the target platform with the `--platform` argument. For example:
+
+```bash
+encoderfile build \
+  -f encoderfile.yml \
+  --platform x86_64-unknown-linux-gnu
+```
+
+Platform identifiers use Rust target triples. If you do not specify a platform identifier, the encoderfile CLI will auto-detect your machine's architecture and download the corresponding base binary (if it is not already cached).
+
+If your target platform is not covered by our pre-built binaries, you can build a custom base binary from source and point the encoderfile CLI at it.
To build the base binary using Cargo:
+
+```bash
+cargo build -p encoderfile-runtime --release
+```
+
+Then, assuming your base binary is at `target/release/encoderfile-runtime`:
+
+```bash
+encoderfile build \
+  -f encoderfile.yml \
+  --base-binary-path target/release/encoderfile-runtime
+```
+
+If you do not want to download base binaries and instead rely on cached binaries or a custom binary, you can pass the `--no-download` flag like this:
+
+```bash
+encoderfile build \
+  -f encoderfile.yml \
+  --no-download
+```
diff --git a/docs/transforms/index.md b/docs/transforms/index.md
index 18e17782..4b29143c 100644
--- a/docs/transforms/index.md
+++ b/docs/transforms/index.md
@@ -40,7 +40,7 @@ If you don't see an op that you need, please don't hesitate to [create an issue]
 ## Creating a New Transform
-To create a new transform, use the Encoderfile CLI:
+To create a new transform, use the encoderfile CLI:
 ```
 encoderfile new-transform --model-type [embedding|sequence_classification|etc.] > /path/to/your/transform/file.lua
diff --git a/mkdocs.yml b/mkdocs.yml
index dc78291e..d008c9f8 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -30,6 +30,7 @@ nav:
   - Transforms: transforms/index.md
   - Reference: transforms/reference.md
   - Reference:
+    - Encoderfile File Format: reference/file_format.md
     - CLI Reference: reference/cli.md
     - API Reference: reference/api-reference.md
     - Building Guide: reference/building.md