Yeti

Yeti is a face recognition proof of concept with three layers in one repo:

  • A Python engine that runs detection, quality checks, optional liveness, alignment, embedding, and matching
  • A FastAPI service that exposes that engine over HTTP
  • A Next.js web app for camera-based inspection, enrollment, identification, and comparison

The project is designed to be configurable, local-first, and easy to inspect. Models are loaded from disk, runtime behavior is controlled by YAML, and enrolled templates are stored in a simple local numpy/JSON store under .yeti-store/.

What This Project Does

Yeti supports four core workflows:

  • inspect: analyze one image and return detections, quality metrics, optional liveness, and timing breakdowns
  • onboard: enroll a subject by extracting and storing a face embedding with optional metadata
  • identify: compare a probe image against enrolled templates and return top matches
  • compare: compare two images directly without using the enrollment store

In practice, the typical flow is:

  1. Detect a face in an image
  2. Reject low-quality captures
  3. Optionally reject spoof attacks
  4. Align the face to a canonical pose
  5. Extract a 512-dimensional embedding
  6. Match that embedding against stored templates using cosine similarity

Repo Layout

.
├── apps/
│   ├── api/        # FastAPI wrapper around the engine
│   └── web/        # Next.js camera-first demo app
├── docs/           # Earlier project notes and reference docs
├── examples/       # Example engine config and model manifest
├── models/         # Local ONNX models downloaded at setup time
├── scripts/        # Small repository utilities
├── src/yeti/       # Core engine implementation
├── tests/          # Python tests
├── pyproject.toml  # Python package config
└── uv.lock         # Locked Python dependency graph for uv users

Stack

  • Python 3.12+
  • FastAPI
  • OpenCV
  • ONNX Runtime
  • InsightFace
  • Next.js 15
  • React 18
  • TypeScript

Requirements

Model binaries are not meant to be committed. Download them from the checked-in manifest at examples/model-manifest.yaml into a local models/ directory.

The checked-in config at examples/config.yaml already points to the default filenames that the manifest downloads.

Setup

This repo is set up for uv and targets Python 3.12+ via requires-python = ">=3.12" in pyproject.toml.

Install uv

See the official docs: https://docs.astral.sh/uv/

On macOS and Linux:

curl -LsSf https://astral.sh/uv/install.sh | sh

uv project basics

If you are creating a brand new project from scratch, the usual bootstrap flow is:

uv init --python 3.12

That step is not needed for this repo because pyproject.toml and uv.lock already exist.

Python engine and API with uv

uv python install 3.12
uv sync --extra dev
uv run yeti-download-models --manifest examples/model-manifest.yaml --output-dir models

What this does:

  • uv python install 3.12 ensures a compatible interpreter is available
  • uv sync --extra dev creates the project environment if needed, resolves against uv.lock, and installs the dev dependencies
  • uv run yeti-download-models ... downloads the ONNX model files into your local models/ directory

The downloader reads a YAML manifest with this structure:

models:
  model-name:
    url: https://example.com/path/to/model.onnx

The output filename defaults to the URL basename. You can override it per entry with filename: your-model.onnx.
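The filename rule above can be sketched in a few lines of Python. This is not the downloader's actual code: `resolve_filename` and `plan_downloads` are hypothetical helper names, and the manifest is written as an already-parsed dict so the example needs no YAML library.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def resolve_filename(entry: dict) -> str:
    """Return the explicit `filename` override, else the URL basename."""
    override = entry.get("filename")
    if override:
        return override
    return PurePosixPath(urlparse(entry["url"]).path).name


def plan_downloads(manifest: dict, output_dir: str) -> dict[str, Path]:
    """Map each model name to the local path it would be written to."""
    return {
        name: Path(output_dir) / resolve_filename(entry)
        for name, entry in manifest["models"].items()
    }


# The manifest as it would look after parsing the YAML structure above.
manifest = {
    "models": {
        "model-name": {"url": "https://example.com/path/to/model.onnx"},
        "renamed": {
            "url": "https://example.com/path/to/model.onnx",
            "filename": "your-model.onnx",
        },
    }
}
plan = plan_downloads(manifest, "models")
```

Here `plan["model-name"]` falls back to the URL basename, while `plan["renamed"]` uses the per-entry override.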

About the virtual environment:

  • You do not need to create a virtual environment manually when using uv
  • uv sync, uv run, and similar project commands will create and manage the project environment automatically, typically at .venv/
  • You also do not need to activate it to use the project; uv run ... is usually the better default

If you want an activated shell anyway:

source .venv/bin/activate

If you prefer the script path instead of the console entrypoint:

uv run python scripts/download_models.py --manifest examples/model-manifest.yaml --output-dir models

Web app

cd apps/web
npm install
cd ../..

How To Run

CLI

Run commands through uv so you do not depend on shell activation.

Inspect an image:

uv run yeti inspect --config examples/config.yaml --image path/to/image.jpg

Enroll a subject:

uv run yeti onboard --config examples/config.yaml --subject-id alice --image path/to/alice.jpg

Enroll with metadata:

uv run yeti onboard \
  --config examples/config.yaml \
  --subject-id alice \
  --image path/to/alice.jpg \
  --metadata '{"team":"demo","role":"admin"}'

Identify a subject:

uv run yeti identify --config examples/config.yaml --image path/to/query.jpg --top-k 3

Enable liveness explicitly on any command:

uv run yeti inspect --config examples/config.yaml --image path/to/image.jpg --liveness

API

Start the API:

uv run uvicorn apps.api.main:app --reload

By default the API reads examples/config.yaml. Override that with:

export YETI_CONFIG=/absolute/path/to/config.yaml

Health check:

curl -s http://127.0.0.1:8000/api/v1/health

Inspect:

curl -s -X POST \
  -F image=@path/to/image.jpg \
  -F run_liveness=false \
  http://127.0.0.1:8000/api/v1/inspect

Onboard:

curl -s -X POST \
  -F image=@path/to/alice.jpg \
  -F subject_id=alice \
  -F 'metadata={"team":"demo"}' \
  http://127.0.0.1:8000/api/v1/onboard

Identify:

curl -s -X POST \
  -F image=@path/to/query.jpg \
  -F top_k=3 \
  -F run_liveness=true \
  http://127.0.0.1:8000/api/v1/identify

Compare:

curl -s -X POST \
  -F source_image=@path/to/source.jpg \
  -F target_image=@path/to/target.jpg \
  -F run_liveness=false \
  http://127.0.0.1:8000/api/v1/compare

List enrolled subjects:

curl -s http://127.0.0.1:8000/api/v1/subjects

Web app

Start the frontend:

cd apps/web
npm run dev

Open http://localhost:3000.

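If the API is not running on http://127.0.0.1:8000, set NEXT_PUBLIC_YETI_API_BASE to its actual base URL before starting the frontend. The value below is the default, shown only for format: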

export NEXT_PUBLIC_YETI_API_BASE=http://127.0.0.1:8000

The web app is a camera-first Next.js app that lets you:

  • inspect a live frame or uploaded image
  • capture burst frames for enrollment
  • identify a live or uploaded face
  • compare two images side by side
  • browse enrolled subject records

Configuration Model

All engine behavior comes from the YAML config file, which is examples/config.yaml unless you point the CLI or API at another one. The main sections are:

  • runtime: execution providers and runtime logging
  • models: local ONNX paths
  • detection: detector thresholds, input size, face-selection rule
  • quality: gating thresholds for image usability
  • liveness: whether liveness runs by default and what spoof threshold is used
  • alignment: aligned crop size, fixed at 112
  • embedding: embedding dimension and normalization behavior
  • matcher: similarity backend, threshold, and top_k
  • storage: local template and metadata persistence paths
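A quick way to catch a truncated or hand-edited config is to check that every section listed above is present before starting the engine. The sketch below assumes the sections are top-level keys of the parsed YAML; `missing_sections` is a hypothetical helper, not part of Yeti.

```python
REQUIRED_SECTIONS = (
    "runtime", "models", "detection", "quality", "liveness",
    "alignment", "embedding", "matcher", "storage",
)


def missing_sections(config: dict) -> list[str]:
    """Return the required section names absent from a parsed config."""
    return [name for name in REQUIRED_SECTIONS if name not in config]


# A deliberately incomplete config, as it might look after YAML parsing.
partial_config = {"runtime": {}, "models": {}, "matcher": {}}
problems = missing_sections(partial_config)
```

An empty list means all nine sections are present; it says nothing about whether the values inside them are valid.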

The default matcher uses cosine similarity over normalized embeddings and stores data locally in:

  • .yeti-store/templates.npz
  • .yeti-store/metadata.json

How embeddings are stored

The current matcher is LocalNumpyMatcher, which keeps enrollment data in two files:

  • .yeti-store/templates.npz: a numpy archive containing one embeddings array
  • .yeti-store/metadata.json: subject records, metadata, and storage version

The embeddings array is stored as float32 with shape:

(N, 512)

Where:

  • N is the number of successfully onboarded templates
  • 512 is the embedding dimension produced by glintr100.onnx

Each successful onboard call appends one row to that array and one matching record to metadata.json. The matcher validates on load that the number of embedding rows matches the number of metadata records.
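The append-and-validate behavior can be sketched with plain numpy and JSON. This is an illustration, not LocalNumpyMatcher itself: the archive key `embeddings`, the `records` and `version` fields in the JSON, and the function name `append_template` are all assumptions.

```python
import json
from pathlib import Path

import numpy as np

EMBEDDING_DIM = 512


def append_template(store_dir: Path, embedding, record: dict) -> int:
    """Append one embedding row and one metadata record; return the new N."""
    store_dir.mkdir(parents=True, exist_ok=True)
    templates_path = store_dir / "templates.npz"
    metadata_path = store_dir / "metadata.json"

    if templates_path.exists():
        with np.load(templates_path) as archive:
            embeddings = archive["embeddings"]  # assumed key name
        metadata = json.loads(metadata_path.read_text())
    else:
        embeddings = np.empty((0, EMBEDDING_DIM), dtype=np.float32)
        metadata = {"version": 1, "records": []}  # assumed layout

    row = np.asarray(embedding, dtype=np.float32).reshape(1, EMBEDDING_DIM)
    embeddings = np.vstack([embeddings, row])
    metadata["records"].append(record)

    # Same invariant the text describes: one metadata record per row.
    if embeddings.shape[0] != len(metadata["records"]):
        raise ValueError("embedding rows and metadata records are out of sync")

    np.savez(templates_path, embeddings=embeddings)
    metadata_path.write_text(json.dumps(metadata, indent=2))
    return embeddings.shape[0]
```

Rewriting the whole archive on every enrollment is fine for a small local gallery, which matches the proof-of-concept scope described here.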

How cosine similarity is calculated

Embeddings are normalized during extraction when embedding.normalize: true, which is the default config behavior.

That means each embedding vector is scaled to unit length:

v_normalized = v / ||v||

Once vectors are unit-normalized, cosine similarity becomes the same as a dot product:

cosine(a, b) = a · b

That is exactly how matching is implemented:

  • identification scores all enrolled templates with self._embeddings @ query
  • pairwise comparison scores two images with source @ target

This is efficient because the enrolled embeddings are already stacked in one numpy matrix, so one query can be scored against the full gallery in a single matrix-vector multiply.
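The equivalence is easy to check numerically. The sketch below uses random vectors in place of real embeddings; `l2_normalize` and `cosine` are illustrative helpers, not Yeti functions.

```python
import numpy as np


def l2_normalize(vectors: np.ndarray) -> np.ndarray:
    """Scale each vector (last axis) to unit length."""
    norms = np.linalg.norm(vectors, axis=-1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Textbook cosine similarity, with explicit norms."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


rng = np.random.default_rng(0)
gallery = l2_normalize(rng.normal(size=(4, 512)).astype(np.float32))
query = l2_normalize(rng.normal(size=512).astype(np.float32))

# One matrix-vector multiply scores the query against every row.
scores = gallery @ query
# The same scores computed one pair at a time, for comparison.
reference = np.array([cosine(row, query) for row in gallery])
```

`scores` and `reference` agree to floating-point tolerance, which is why the normalized dot product can stand in for cosine similarity.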

The match decision is then:

  • sort candidates by descending score
  • take the top k
  • accept only if the best score is greater than or equal to matcher.threshold
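A minimal sketch of that decision rule, assuming unit-normalized inputs. The function name `identify`, the tiny 2-D vectors, and the threshold values are made up for illustration; real embeddings are 512-dimensional and the real threshold comes from matcher.threshold.

```python
import numpy as np


def identify(gallery: np.ndarray, query: np.ndarray, top_k: int, threshold: float):
    """Return the top-k (row index, score) pairs and the accept decision."""
    scores = gallery @ query
    order = np.argsort(-scores)[:top_k]  # descending score
    candidates = [(int(i), float(scores[i])) for i in order]
    accepted = bool(candidates) and candidates[0][1] >= threshold
    return candidates, accepted


# Tiny unit vectors standing in for enrolled embeddings.
gallery = np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]])
query = np.array([0.8, 0.6])

candidates, accepted = identify(gallery, query, top_k=2, threshold=0.9)
```

With these inputs the best candidate is row 2 at a score of about 0.96, so the match is accepted at a threshold of 0.9 and would be rejected at 0.97. An empty gallery yields no candidates and no match.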

How The Pipeline Works

1. Detection

Yeti uses YuNet through OpenCV FaceDetectorYN. For each detected face it returns:

  • bounding box
  • confidence score
  • five landmarks: left eye, right eye, nose, left mouth corner, right mouth corner

If multiple faces are found, the pipeline keeps one face for downstream processing based on detection.face_selection, which supports:

  • largest_face
  • highest_score

2. Quality Gate

Before Yeti spends compute on embedding or matching, it checks whether the face is usable. The quality gate measures:

  • face size
  • brightness
  • sharpness
  • inter-eye distance
  • yaw
  • pitch
  • roll

It can reject captures for reasons such as:

  • face_too_small
  • underexposed
  • overexposed
  • blurry
  • eyes_too_close
  • yaw_too_large
  • pitch_too_large
  • roll_too_large
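One way to structure such a gate is a list of named checks over already-measured metrics. In the sketch below the rejection reasons come from the list above, but every threshold value, field name, and the `quality_rejections` function are hypothetical; the real limits live in the quality section of the config.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityThresholds:
    # All values are illustrative placeholders, not Yeti defaults.
    min_face_size: float = 80.0
    min_brightness: float = 40.0
    max_brightness: float = 220.0
    min_sharpness: float = 50.0
    min_inter_eye_distance: float = 30.0
    max_abs_yaw: float = 30.0
    max_abs_pitch: float = 25.0
    max_abs_roll: float = 25.0


def quality_rejections(metrics: dict, thresholds=QualityThresholds()) -> list[str]:
    """Return every rejection reason that applies; empty means usable."""
    t = thresholds
    checks = [
        ("face_too_small", metrics["face_size"] < t.min_face_size),
        ("underexposed", metrics["brightness"] < t.min_brightness),
        ("overexposed", metrics["brightness"] > t.max_brightness),
        ("blurry", metrics["sharpness"] < t.min_sharpness),
        ("eyes_too_close", metrics["inter_eye_distance"] < t.min_inter_eye_distance),
        ("yaw_too_large", abs(metrics["yaw"]) > t.max_abs_yaw),
        ("pitch_too_large", abs(metrics["pitch"]) > t.max_abs_pitch),
        ("roll_too_large", abs(metrics["roll"]) > t.max_abs_roll),
    ]
    return [reason for reason, failed in checks if failed]


good_capture = {
    "face_size": 160.0, "brightness": 128.0, "sharpness": 120.0,
    "inter_eye_distance": 62.0, "yaw": 4.0, "pitch": -3.0, "roll": 1.5,
}
```

Returning all failing reasons rather than only the first makes it easier to tell a user what to fix in one retry.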

This is a practical face recognition (FR) lesson: many recognition errors start with poor input quality, not with the matcher.

3. Liveness

Yeti can optionally run anti-spoofing with 2.7_80x80_MiniFASNetV2.onnx. That model is downloaded by the manifest and used to distinguish a real face from a replay or print attack.

Current implementation details:

  • the liveness crop is larger than the tight face crop
  • input is converted from BGR to RGB
  • the crop is resized to 80x80
  • pixel values are normalized to [0, 1]
  • class index 2 is treated as the live class
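The steps in that list can be sketched with numpy alone. Several things here are assumptions rather than facts about Yeti: the crop scale passed to `expand_box`, the NCHW input layout, the model output being raw logits, and all function names. The resize is a simple nearest-neighbour index lookup to keep the example dependency-free; it is not the interpolation a production pipeline would normally use.

```python
import numpy as np


def expand_box(box, scale, image_w, image_h):
    """Grow an (x, y, w, h) box around its center and clamp to the image."""
    x, y, w, h = box
    cx, cy = x + w / 2, y + h / 2
    new_w, new_h = w * scale, h * scale
    x0 = int(max(0, round(cx - new_w / 2)))
    y0 = int(max(0, round(cy - new_h / 2)))
    x1 = int(min(image_w, round(cx + new_w / 2)))
    y1 = int(min(image_h, round(cy + new_h / 2)))
    return x0, y0, x1, y1


def preprocess(crop_bgr: np.ndarray, size: int = 80) -> np.ndarray:
    """BGR uint8 crop -> float32 RGB tensor in [0, 1], shape (1, 3, size, size)."""
    rgb = crop_bgr[..., ::-1]  # reverse the channel axis: BGR -> RGB
    rows = np.arange(size) * rgb.shape[0] // size  # nearest-neighbour rows
    cols = np.arange(size) * rgb.shape[1] // size  # nearest-neighbour columns
    resized = rgb[rows][:, cols]
    scaled = resized.astype(np.float32) / 255.0
    return scaled.transpose(2, 0, 1)[np.newaxis]  # NCHW layout is assumed


def live_probability(logits: np.ndarray, live_index: int = 2) -> float:
    """Softmax over the class scores; index 2 is treated as the live class."""
    shifted = logits - np.max(logits)
    probs = np.exp(shifted) / np.sum(np.exp(shifted))
    return float(probs[live_index])
```

A typical call order would be `expand_box` on the detector's bounding box, slice the image with the returned corners, `preprocess` the crop, run the model, then pass its output row to `live_probability`.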

The engine reports:

  • spoof_score
  • passed
  • label as live or spoof

Liveness is disabled by default in the example config for inspect, onboard, and identify, but it can be enabled per request or via config.

4. Alignment

Recognition models are sensitive to pose. Yeti aligns each accepted face to a standard 112x112 view using the five landmarks and an affine warp. This is what makes embeddings from two different captures comparable.

Conceptually, alignment reduces variation from:

  • head rotation
  • scale
  • translation
  • small pose differences

Without alignment, even a strong embedding model becomes much less stable.
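A common way to do five-point alignment is to fit a similarity transform (rotation, uniform scale, translation) from the detected landmarks to a fixed 112x112 template. The sketch below assumes that approach and uses the reference coordinates widely used with ArcFace-style models; whether Yeti uses exactly these coordinates or this fitting method is not stated here. `estimate_similarity` and `apply_transform` are illustrative names.

```python
import numpy as np

# Widely used 112x112 five-point template (assumed, not read from Yeti):
# left eye, right eye, nose, left mouth corner, right mouth corner.
REFERENCE_LANDMARKS_112 = np.array([
    [38.2946, 51.6963],
    [73.5318, 51.5014],
    [56.0252, 71.7366],
    [41.5493, 92.3655],
    [70.7299, 92.2041],
])


def estimate_similarity(src, dst) -> np.ndarray:
    """Least-squares 2x3 similarity matrix mapping src points onto dst."""
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    n = len(src)
    # Unknowns (a, c, tx, ty) with u = a*x - c*y + tx and v = c*x + a*y + ty.
    A = np.zeros((2 * n, 4))
    A[0::2] = np.column_stack([src[:, 0], -src[:, 1], np.ones(n), np.zeros(n)])
    A[1::2] = np.column_stack([src[:, 1], src[:, 0], np.zeros(n), np.ones(n)])
    b = dst.reshape(-1)  # interleaved u0, v0, u1, v1, ...
    a, c, tx, ty = np.linalg.lstsq(A, b, rcond=None)[0]
    return np.array([[a, -c, tx], [c, a, ty]])


def apply_transform(matrix: np.ndarray, points) -> np.ndarray:
    """Apply a 2x3 affine matrix to an (N, 2) array of points."""
    points = np.asarray(points, dtype=np.float64)
    return points @ matrix[:, :2].T + matrix[:, 2]
```

The resulting 2x3 matrix has the shape OpenCV's `cv2.warpAffine(image, matrix, (112, 112))` expects, so the same matrix that maps landmarks can warp the image.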

5. Embedding

Yeti uses glintr100.onnx to convert the aligned face crop into a 512-dimensional embedding vector.

This is the core FR idea: the model does not classify people directly. Instead, it maps each face into a vector space where:

  • same-person vectors should be close together
  • different-person vectors should be farther apart

The implementation normalizes embeddings so cosine similarity can be used directly as a dot product.

6. Matching

The local matcher stores embeddings in numpy format and compares a query vector against all stored templates.

Two matching modes exist:

  • identification: one query against the enrolled gallery
  • verification/comparison: one image against one other image

For identification, Yeti:

  1. scores the query against all stored embeddings
  2. sorts by descending score
  3. returns the top k candidates
  4. accepts the match only if the best score meets matcher.threshold

Face Recognition Concepts You Need To Master

If you want to understand face recognition beyond “run the model,” these are the core concepts this repo exposes directly.

Detection is not recognition

Finding a face in an image is a different task from telling whose face it is. A system can detect well and still recognize poorly.

Landmarks are structural anchors

The five landmark points are not just visualization data. They drive pose estimation, quality checks, and alignment.

Input quality dominates outcomes

Recognition systems degrade quickly on:

  • tiny faces
  • blur
  • harsh brightness
  • strong yaw, pitch, or roll
  • low inter-eye resolution

Good gating often improves production behavior more than lowering a match threshold.

Liveness and recognition solve different problems

A recognition embedding is built to ignore nuisance factors and preserve identity. A liveness model is built to notice texture and presentation artifacts. One does not replace the other.

Embeddings are metric-space representations

Modern FR systems usually do not identify users by classification at inference time. They compare embeddings by similarity. That is why threshold tuning matters.

Thresholds define the operating point

The matcher threshold controls the balance between false accepts and false rejects. Lower thresholds are more permissive; higher thresholds are stricter. There is no universally correct value without evaluation data.
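The trade-off can be measured once you have labeled score pairs. The sketch below computes the false accept rate (FAR) and false reject rate (FRR) at a given threshold; the scores are made-up toy values, not measurements from Yeti, and `error_rates` is an illustrative helper.

```python
import numpy as np


def error_rates(genuine_scores, impostor_scores, threshold: float):
    """Return (FAR, FRR) for an accept rule of score >= threshold."""
    genuine = np.asarray(genuine_scores, dtype=np.float64)
    impostor = np.asarray(impostor_scores, dtype=np.float64)
    far = float(np.mean(impostor >= threshold))  # impostors wrongly accepted
    frr = float(np.mean(genuine < threshold))    # genuine pairs wrongly rejected
    return far, frr


# Toy similarity scores, invented for illustration only.
genuine = [0.72, 0.65, 0.58, 0.41]   # same-person pairs
impostor = [0.10, 0.22, 0.35, 0.45]  # different-person pairs

permissive = error_rates(genuine, impostor, threshold=0.4)
strict = error_rates(genuine, impostor, threshold=0.6)
```

On this toy data the permissive threshold accepts one impostor pair and rejects no genuine pairs, while the strict threshold accepts no impostors but rejects half the genuine pairs. Sweeping the threshold over real evaluation data is how an operating point is chosen.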

Enrollment quality matters as much as query quality

If you enroll bad templates, the gallery becomes noisy. A recognition system cannot recover from poor enrollment examples.

Identification and verification are different tasks

  • verification asks: “do these two images belong to the same person?”
  • identification asks: “which enrolled person is this?”

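The risk profile is different because identification compares the probe against every template in the gallery rather than a single claimed identity, so the chance of at least one false accept grows with gallery size.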
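To put a rough number on the gallery effect, the sketch below assumes each comparison is an independent trial with the same per-comparison false accept rate. Real comparisons are not independent, so treat the result as an order-of-magnitude guide; the function name is illustrative.

```python
def gallery_false_accept_rate(per_comparison_far: float, gallery_size: int) -> float:
    """Chance that at least one of N independent comparisons falsely accepts."""
    return 1.0 - (1.0 - per_comparison_far) ** gallery_size


single = gallery_false_accept_rate(0.001, 1)
large_gallery = gallery_false_accept_rate(0.001, 1000)
```

Under this simplified model, a threshold that gives a 0.1% false accept rate for one-to-one verification yields roughly a 63% chance of at least one false accept against 1000 enrolled templates, which is why identification usually needs a stricter threshold than verification.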

Local storage design affects product behavior

This repo stores one embedding per successful onboard call. That keeps the design simple, but it means:

  • duplicate enrollments create multiple templates
  • no clustering or template consolidation happens automatically
  • metadata is descriptive only; it is not used for scoring

Testing

Run Python tests:

uv run pytest

Run frontend checks:

cd apps/web
npm run typecheck
npm run build

Troubleshooting

  • If the models directory is empty, rerun uv run yeti-download-models --manifest examples/model-manifest.yaml --output-dir models.
  • If the detector fails to initialize, verify that models/face_detection_yunet_2023mar.onnx is a real ONNX model file and not a bad download.
  • If glintr100.onnx fails to load, verify the file is complete and not truncated.
  • If every liveness result looks wrong, check the chosen model file, the configured crop contract, and whether you actually enabled liveness for that request.
  • If the web app cannot reach the API, confirm the API is serving at http://127.0.0.1:8000 or set NEXT_PUBLIC_YETI_API_BASE.
  • If onboarding appears to work but later identification finds nothing, inspect .yeti-store/templates.npz and .yeti-store/metadata.json to confirm templates were written where you expect.
