intent-recognizer-c

A zero-dependency C implementation of semantic intent recognition using EmbeddingGemma-300M sentence embeddings.

The library registers trigger phrases, embeds them with a Gemma 3 transformer, and matches incoming utterances against them by cosine similarity. All weights are stored as float32.

Getting Started

1. Install Python dependencies

pip install -r scripts/requirements.txt

2. Export model weights

python scripts/export-weights.py

The script writes its output to models/embeddinggemma/, which contains:

embedding.bin — float32 transformer + projection weights (~1.2 GB)
tokenizer.bin — SentencePiece tokenizer (262K vocab)

3. Build and test

make
./test_embedding models/embeddinggemma

C API

#include "embedding.h"

// Load model (immutable, thread-safe, load once)
embedding_model *model = embedding_model_load("models/embeddinggemma");

// Create per-thread state (mutable scratch buffers)
// Second arg caps sequence length: 128 ≈ 5.7 MB, 0 = model max (2048 ≈ 92 MB)
embedding_state *state = embedding_state_create(model, 128);

// Get a 768-dim L2-normalized embedding
float emb[768];
embedding_model_embed(model, state, "turn on the lights", emb, 768);

// Intent recognition (the third argument, 0.7f, is presumably the minimum cosine similarity for a match)
intent_recognizer *ir = intent_recognizer_create(model, state, 0.7f);
intent_recognizer_register(ir, "turn on the lights", my_callback, NULL);
intent_recognizer_register(ir, "what is the weather", my_callback, NULL);

// Process from any thread (with its own state)
intent_recognizer_process(ir, model, state, "switch on the lights");

intent_recognizer_free(ir);
embedding_state_free(state);
embedding_model_free(model);
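
The matching step can be pictured with a small standalone sketch. This is not the library's actual code: `dot` and `best_match` are hypothetical helpers, and the row-major layout of the stored intent embeddings is an assumption. It relies only on the fact that for L2-normalized vectors, cosine similarity reduces to a plain dot product.

```c
#include <stddef.h>

/* Dot product of two vectors of length n. For L2-normalized inputs
 * this equals their cosine similarity. */
float dot(const float *a, const float *b, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}

/* Hypothetical helper: scan n_intents stored embeddings (row-major,
 * n_intents x dim) and return the index of the one most similar to
 * query, or -1 if no score reaches threshold. */
int best_match(const float *query, const float *intents,
               size_t n_intents, size_t dim, float threshold) {
    int best = -1;
    float best_score = threshold;
    for (size_t i = 0; i < n_intents; i++) {
        float score = dot(query, intents + i * dim, dim);
        if (score >= best_score) {
            best_score = score;
            best = (int)i;
        }
    }
    return best;
}
```

With the 768-dim embeddings above, `dim` would be 768 and `threshold` would play the role of the 0.7f passed to intent_recognizer_create, assuming that argument is a similarity cutoff.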

Concurrency Model

embedding_model: immutable after load, so a single instance can be shared across threads.
embedding_state: mutable scratch buffers; create one per concurrent call. Pass max_seq to control memory usage (~5.7 MB at 128 vs ~92 MB at 2048). For intent recognition, a max_seq of 64-128 tokens is typically sufficient.
intent_recognizer: register intents during single-threaded setup; after that, intent_recognizer_process is thread-safe as long as each thread passes its own embedding_state.
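
The two memory figures above work out to roughly 45 KB of scratch space per token (92 MB / 2048 ≈ 5.7 MB / 128), which suggests the cost grows about linearly with max_seq. Under that assumption, a rough budget for several threads can be sketched as follows. Both functions are hypothetical helpers for back-of-the-envelope sizing, not part of embedding.h.

```c
#include <stddef.h>

/* Rough size in MB of one embedding_state for a given max_seq.
 * Assumes linear scaling, calibrated on the documented figure of
 * about 92 MB at 2048 tokens. */
double estimate_state_mb(size_t max_seq) {
    const double mb_per_token = 92.0 / 2048.0; /* about 0.045 MB */
    return mb_per_token * (double)max_seq;
}

/* Rough total in MB when each of n_threads owns its own state. */
double estimate_total_state_mb(size_t n_threads, size_t max_seq) {
    return (double)n_threads * estimate_state_mb(max_seq);
}
```

For example, four worker threads at max_seq = 128 would need about 23 MB of scratch space on top of the shared model weights.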

Architecture

EmbeddingGemma-300M is a bidirectional Gemma 3 transformer. The embedding pipeline runs in six steps:

1. Tokenize text (SentencePiece, 262K vocab)
2. Embed tokens and scale by sqrt(768)
3. Run 24 transformer layers (grouped-query attention with 3 query heads and 1 KV head, head_dim=256, RMSNorm, gated GELU MLP, RoPE)
4. Mean pool across the sequence
5. Apply dense projections 768 → 3072 → 768
6. L2 normalize
