
H-Ismael/ollamonym

ØllamØnym

Template-driven, session-stable anonymization and pseudonymization for teams that want to use LLM workflows while reducing data leakage risk.

Quick Demo Video

Why This Project Exists

Most teams want LLM-powered automation, but they cannot expose raw sensitive text to third-party providers without creating legal, security, and reputational risk.

ØllamØnym gives you a practical middle path:

  • Detect sensitive entities locally
  • Replace them deterministically with reversible tokens
  • Keep text structurally usable for downstream LLM tasks
  • Restore originals only when authorized

In short: LLM utility without shipping raw private identifiers by default.

A practical positioning for teams under GDPR/compliance pressure: pseudonymization is often more useful than hard redaction/anonymization for LLM workflows, because it preserves semantic continuity and referential meaning across long or distant queries while still reducing direct identifier exposure.

This service sits between your data and hosted or proprietary LLMs (OpenAI, Gemini, Claude, Grok, and similar providers), reducing direct exposure of sensitive text.

Where Sensitive Text Blocks Adoption of Hosted or Proprietary LLMs

Many enterprise teams already have valid LLM use cases, but security/legal controls prevent sending raw text to external model providers. Typical blocked scenarios:

  • Customer support and CRM copilots: ticket bodies include names, emails, phones, and addresses.
  • Sales and customer success assistants: notes include account identifiers, contract values, and personal contacts.
  • Legal contract analysis: drafts expose counterparties, clauses, obligations, and signatures.
  • Healthcare/insurance operations: notes and claims can contain direct identifiers and case-sensitive details.
  • Security/SOC workflows: incident narratives may include user IDs, endpoints, and internal infrastructure references.
  • Internal knowledge assistants: enterprise docs frequently contain confidential project, vendor, or employee data.

This creates a common enterprise bottleneck: product teams want high-end model quality, while governance teams require strong controls on sensitive text exposure.

ØllamØnym is designed to be the privacy middleware layer that resolves this tension:

  • Transform sensitive entities before model calls.
  • Preserve enough semantic structure to keep LLM output useful.
  • Keep deanonymization controlled and reversible only where authorized.

Core Value Proposition

  • Leakage-risk reduction: sensitive fields are transformed before downstream processing.
  • Operational realism: anonymized text remains readable and coherent.
  • GDPR-aligned pseudonymization value: preserve contextual meaning for downstream LLM tasks while reducing exposure of direct identifiers.
  • Reversible by design: exact restoration via mapping when needed.
  • Template extensibility: add domain entities without changing core pipeline.
  • Session consistency: repeated mentions stay stable within a session.
  • Provider flexibility: run with local quantized models today, swap model/provider config as needs evolve.

Key Features

  • Hybrid Detection Engine (Deterministic + Local Quantized LLM)
    • Deterministic rule extraction for patterned data (EMAIL, PHONE, links, etc.).
    • Local Ollama-hosted quantized LLM extraction for contextual entities (PERSON, ORG, domain entities).
    • Configurable model/runtime via template and environment.
  • Template-Driven Entity Taxonomy
    • Define PERSON, ORG, LINKS, PRODUCT, or any custom class in JSON.
  • Deterministic Placeholder Mode
    • Example token: <<PERSON:K7D2QH>>
  • Realistic Rendering Mode
    • Optional fake values for human-readable anonymized output.
  • Generic Post-Pass Alias Propagation
    • Moving-window + token-overlap propagation can link full and partial mentions in-session (e.g., Jensen Huang and Huang).
  • No LLM Offsets Required
    • LLM returns only (entity_id, text); span resolution is deterministic in code.
  • Robust Span Resolution
    • Boundary-safe matching prevents substring corruption (for example, avoids replacing com inside company).
  • Chunked + Bounded Parallel Inference
    • Handles long documents with predictable concurrency.
  • Model Runtime Observability
    • Response metadata includes requested/resolved model and quantization info.
  • Dockerized Deployment
    • FastAPI + Ollama stack with persistent model volume.
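The boundary-safe matching called out above can be illustrated with a small sketch (illustrative only; the mapping here is hypothetical and the real resolver handles many more cases):

```python
import re

def replace_spans(text: str, mapping: dict) -> str:
    """Replace detected entities with tokens using boundary-safe matching,
    so a short entity never corrupts a longer word that contains it."""
    # Longest originals first, so "Jensen Huang" is consumed before "Huang".
    for original, token in sorted(mapping.items(), key=lambda kv: -len(kv[0])):
        # Word-boundary lookarounds: match "com" standalone, not inside "company".
        pattern = r"(?<!\w)" + re.escape(original) + r"(?!\w)"
        text = re.sub(pattern, token, text)
    return text

mapping = {"com": "<<ORG:K7D2QH>>", "Jensen Huang": "<<PERSON:A2N72P>>"}
out = replace_spans("Jensen Huang left com; the company stayed.", mapping)
```

After the pass, `out` replaces the standalone `com` but leaves `company` untouched.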

⚠️ Caution:

  • Several components of this project were generated and/or refactored with agentic AI. Tests are in place, but caution is still required.

High-Level Architecture

  1. Detection Plane
    • Template compilation
    • Hybrid extraction: deterministic rules + local quantized LLM
    • Normalization and deduplication
  2. Transformation Plane
    • Deterministic span resolution
    • Placeholder insertion
    • Session-aware alias propagation (moving window + overlap policy)
    • Optional realistic rendering
  3. Reversal Plane
    • Token/fake back-mapping to exact original text
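The exact token-ID scheme is not documented here, but one plausible sketch of session-stable, secret-keyed derivation (using the PSEUDONYM_SECRET and TOKEN_ID_LEN concepts from the configuration section; assumption, not the actual implementation):

```python
import base64
import hashlib
import hmac

def make_token(secret: str, session_id: str, entity_class: str,
               original: str, id_len: int = 6) -> str:
    # HMAC keyed by the secret over (session, class, original text) gives
    # deterministic, session-stable IDs without storing a counter.
    msg = f"{session_id}|{entity_class}|{original}".encode()
    digest = hmac.new(secret.encode(), msg, hashlib.sha256).digest()
    ident = base64.b32encode(digest).decode()[:id_len]
    return f"<<{entity_class}:{ident}>>"

t1 = make_token("s3cret", "case-42", "PERSON", "Jensen Huang")
t2 = make_token("s3cret", "case-42", "PERSON", "Jensen Huang")
```

Repeated mentions in the same session map to the same token (`t1 == t2`), which is what makes the mapping reversible.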

Input / Output Example

Input (POST /v2/anonymize)

{
  "session_id": "case-42",
  "template_id": "default-pii-v1",
  "text": "Jensen Huang leads NVIDIA. Visit www.tech-private.com",
  "render_mode": "structural",
  "language": "auto"
}

Output (structural)

{
  "anonymized_text": "<<PERSON:XXXXXX>> leads <<ORG:XXXXXX>>. Visit <<LINKS:XXXXXX>>",
  "mapping": {
    "token_to_original": {
      "<<PERSON:XXXXXX>>": "Jensen Huang",
      "<<ORG:XXXXXX>>": "NVIDIA",
      "<<LINKS:XXXXXX>>": "www.tech-private.com"
    },
    "meta": {
      "session_id": "case-42",
      "template_id": "default-pii-v1",
      "template_version": 3,
      "render_mode": "structural"
    }
  }
}
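In structural mode only placeholder tokens need to cross the trust boundary; the mapping stays local and is applied to the model's answer on the way back. A client-side sketch of that round trip (hypothetical helper names, stubbed provider call):

```python
def call_llm(prompt: str) -> str:
    # Stand-in for the external provider call; echoes for the sketch.
    return f"Summary: {prompt}"

def restore(text: str, token_to_original: dict) -> str:
    # Tokens are unambiguous literals, so plain replacement is safe here.
    for token, original in token_to_original.items():
        text = text.replace(token, original)
    return text

mapping = {"<<PERSON:XXXXXX>>": "Jensen Huang", "<<ORG:XXXXXX>>": "NVIDIA"}
anonymized = "<<PERSON:XXXXXX>> leads <<ORG:XXXXXX>>."
result = restore(call_llm(anonymized), mapping)  # only tokens left the boundary
```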

Input (POST /v2/anonymize, realistic)

{
  "session_id": "string_id_test_23",
  "template_id": "default-pii-v1",
  "text": "AI Overview Jensen Huang is the co-founder, President, and CEO of NVIDIA ... you can find more at www.tech-private.com ... for now Jensen Huang is doing great",
  "render_mode": "realistic",
  "language": "auto"
}

Output (realistic)

{
  "anonymized_text": "AI Overview William Adams is the co-founder, President, and CEO of Elliott, Wilson and Terry and father of Shannon Gomez MD and Maria Thompson ... A pivotal figure in the AI revolution, Adams has guided Elliott, Wilson and Terry ... you can find more at www.blue-connect.com ... for now William Adams is doing great",
  "mapping": {
    "token_to_original": {
      "<<PERSON:A2N72P>>": "Jensen Huang",
      "<<ORG:T5YKPW>>": "NVIDIA",
      "<<PERSON:XG6QHD>>": "Huang",
      "<<LINKS:XA4JRC>>": "www.tech-private.com"
    },
    "token_to_fake": {
      "<<PERSON:A2N72P>>": "William Adams",
      "<<ORG:T5YKPW>>": "Elliott, Wilson and Terry",
      "<<PERSON:XG6QHD>>": "Adams",
      "<<LINKS:XA4JRC>>": "www.blue-connect.com"
    },
    "fake_to_token": {
      "William Adams": "<<PERSON:A2N72P>>",
      "Elliott, Wilson and Terry": "<<ORG:T5YKPW>>",
      "Adams": "<<PERSON:XG6QHD>>",
      "www.blue-connect.com": "<<LINKS:XA4JRC>>"
    },
    "meta": {
      "session_id": "string_id_test_23",
      "template_id": "default-pii-v1",
      "template_version": 3,
      "render_mode": "realistic",
      "model_runtime": {
        "requested_model": "llama3.1:8b-instruct-q4_K_M",
        "resolved_model": "llama3.1:latest",
        "quantization_level": "Q4_K_M"
      }
    }
  }
}
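In realistic mode reversal is a two-hop lookup, fake → token → original, which is what the fake_to_token and token_to_original maps above encode. A client-side sketch of those semantics (the service's POST /v2/deanonymize presumably does the equivalent; this is an assumption for illustration):

```python
def restore_realistic(text: str, fake_to_token: dict,
                      token_to_original: dict) -> str:
    # Longest fakes first, so "William Adams" is handled before "Adams".
    for fake in sorted(fake_to_token, key=len, reverse=True):
        text = text.replace(fake, fake_to_token[fake])
    # Second hop: tokens back to the exact originals.
    for token, original in token_to_original.items():
        text = text.replace(token, original)
    return text

fake_to_token = {"William Adams": "<<PERSON:A2N72P>>", "Adams": "<<PERSON:XG6QHD>>"}
token_to_original = {"<<PERSON:A2N72P>>": "Jensen Huang", "<<PERSON:XG6QHD>>": "Huang"}
out = restore_realistic("William Adams praised Adams.", fake_to_token, token_to_original)
```

Because full and partial mentions carry distinct tokens, the alias pair (`William Adams` / `Adams`) round-trips back to (`Jensen Huang` / `Huang`) exactly.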

Where This Is Most Useful

  • LLM preprocessing gateway for enterprise copilots
  • Support and CRM text handling before summarization/classification
  • Medical/legal document workflows requiring controlled exposure
  • Model evaluation datasets needing repeatable anonymization
  • Cross-team AI enablement where security/compliance gate AI usage

Quick Start

Prerequisites

  • Docker + Docker Compose
  • PSEUDONYM_SECRET environment variable

Run

export PSEUDONYM_SECRET="replace-with-a-strong-secret"
docker compose up --build

Health

curl http://localhost:8000/health

API Endpoints

  • POST /v2/anonymize
  • POST /v2/anonymize/stream (SSE-style stream for UI visualization)
  • POST /v2/deanonymize
  • GET /v2/templates
  • GET /v2/templates/{template_id}
  • POST /v2/templates/validate
  • POST /v2/templates/save (create/update custom templates)
  • DELETE /v2/templates/{template_id} (delete custom templates)
  • GET / (web UX demo)

Web pages:

  • / -> landing page
  • /web/demo.html -> interactive demo studio

Compare Llama vs Qwen

This repo includes model-specific templates that you can select via template_id:

  • default-pii-v1 (Llama baseline)
  • default-pii-qwen2.5-7b-v1
  • default-pii-qwen2.5-14b-v1

Pull models in Ollama (once):

ollama pull qwen2.5:7b-instruct-q4_K_M
ollama pull qwen2.5:14b-instruct-q4_K_M

Run the same text against different templates to compare extraction quality/latency:

curl -X POST http://localhost:8000/v2/anonymize \
  -H "Content-Type: application/json" \
  -d '{
    "session_id":"bench-1",
    "template_id":"default-pii-qwen2.5-14b-v1",
    "text":"Jensen Huang leads NVIDIA. Contact: john.doe@example.com",
    "render_mode":"structural",
    "language":"auto"
  }'

Configuration Highlights

Important env vars:

  • TEMPLATES_DIR (bundled/read-only templates source)
  • CUSTOM_TEMPLATES_DIR (writable directory for create/update/delete template APIs)
  • OLLAMA_BASE_URL
  • OLLAMA_FALLBACK_URLS
  • LLM_MODEL
  • OLLAMA_KEEP_ALIVE
  • LLM_NUM_PREDICT
  • LLM_TEMPERATURE
  • LLM_WARMUP_ENABLED (default true, runs warmup at startup)
  • LLM_WARMUP_MODEL (optional override; defaults to LLM_MODEL)
  • LLM_WARMUP_TIMEOUT (seconds for warmup request)
  • LLM_WARMUP_NUM_PREDICT (token budget for warmup call)
  • LLM_CONCURRENCY
  • CHUNK_CHAR_TARGET
  • CHUNK_MAX_PARALLEL
  • RULE_PREEXTRACT_ENABLED
  • TOKEN_ID_LEN
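A hedged example of wiring a few of these together (values illustrative only, not the service defaults; check docker-compose.yml before relying on them):

```sh
# Illustrative values; PSEUDONYM_SECRET is the only strictly required variable.
export PSEUDONYM_SECRET="replace-with-a-strong-secret"
export OLLAMA_BASE_URL="http://ollama:11434"
export LLM_MODEL="llama3.1:8b-instruct-q4_K_M"
export LLM_WARMUP_ENABLED=true
export LLM_CONCURRENCY=2
export CHUNK_CHAR_TARGET=1500
export CHUNK_MAX_PARALLEL=4
export TOKEN_ID_LEN=6   # matches the 6-character IDs in the examples above
```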

Template controls:

  • entity definitions and examples
  • placeholder format and pseudonym providers
  • per-entity fake strategy override (entity.fake_provider) and pseudo-entity pools (entity.use_pseudo_entities, entity.pseudo_entities)
  • post-pass alias policy (postpass_alias)
  • per-template model selection (template.llm.model)

Security and Compliance Positioning

  • Keeps sensitive originals out of downstream prompts by default.
  • Pseudonymization can be a strong GDPR/compliance selling point when full anonymization would destroy the context needed for useful LLM responses.
  • Preserves semantic continuity across distant mentions/queries better than blunt redaction, which improves downstream LLM utility.
  • Supports self-hosted/local inference stacks.
  • Allows strict control of what is reversible and by whom (through mapping handling policy).

Note: this reduces leakage risk materially, but final security posture still depends on infrastructure, access control, logging policy, and secret management.

Adaptability by Design

  • Add new entity classes in template JSON without pipeline rewrites.
  • Tune matching behavior through policy knobs:
    • postpass_alias.window_size
    • postpass_alias.min_overlap_tokens
    • postpass_alias.min_token_len
    • postpass_alias.entity_ids
  • Mix strict deterministic extraction with contextual LLM extraction per use case.
  • Keep consistent anonymization across non-exact mentions in the same session.
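The overlap policy above can be sketched as a token-set test (assumed semantics: a later mention links to an earlier in-window entity when they share enough sufficiently long tokens):

```python
def overlapping_alias(known_entity: str, candidate: str,
                      min_overlap_tokens: int = 1,
                      min_token_len: int = 3) -> bool:
    """Sketch of the post-pass overlap check; the real pass also applies
    the moving window and entity_ids filter before this comparison."""
    known = {t.lower() for t in known_entity.split() if len(t) >= min_token_len}
    cand = {t.lower() for t in candidate.split() if len(t) >= min_token_len}
    return len(known & cand) >= min_overlap_tokens

# "Huang" links back to "Jensen Huang"; a too-short fragment does not.
assert overlapping_alias("Jensen Huang", "Huang")
assert not overlapping_alias("Jensen Huang", "Jen")
```

Raising min_token_len or min_overlap_tokens makes linking stricter; lowering them recovers more partial mentions at the cost of false links.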

What’s Next

Product / UX

  • UX polish and extension for:
    • richer live token analytics and traces
    • collaborative template lifecycle workflows
    • advanced policy presets by compliance profile

Metrics and Governance

  • Detection quality dashboard (precision/recall by entity class)
  • Latency and throughput dashboards (p50/p95/p99)
  • Drift alerts for template/model changes
  • Session-memory effectiveness metrics (alias recovery success)
  • Audit-friendly anonymization/deanonymization event logs

Platform Evolution

  • Test on-premises servers (better hardware) to evaluate detection accuracy, generation fidelity when the LLM faker is enabled, and latency.
  • Pluggable provider strategies for advanced fake generation by entity family
  • Multi-tenant policy isolation
  • Optional external session memory (e.g., Redis) for horizontal scale

Repository Notes

  • Detailed implementation notes: README_IMPLEMENTATION.md
  • Build summary: IMPLEMENTATION.md
  • Full technical spec: spec_readme.md