# Architecture

This document describes the internal design of the AI Anonymizing Proxy: how requests flow through the system, how PII detection works, and the rationale behind key design choices.

## System overview

```mermaid
flowchart TD
    subgraph Client side
        APP[Application]
    end

    subgraph Proxy ["AI Anonymizing Proxy (127.0.0.1:8080)"]
        direction TB
        PRXY[proxy.go\nrequest router]
        ANON[anonymizer.go\npack-based PII detection]
        MITM[mitm/\ncert.go · mitm.go]
        REG[(DomainRegistry)]
        MET[metrics.go]
    end

    subgraph Mgmt ["Management API (127.0.0.1:8081)"]
        API[management.go\n/status /metrics /domains]
    end

    subgraph Backends
        AIAPI[AI API\nOpenAI · Anthropic · …]
        OTHER[Other HTTPS]
        OLL[Ollama\nlocal LLM]
    end

    APP -->|HTTP_PROXY| PRXY
    PRXY -->|AI domain| MITM
    MITM -->|plaintext body| ANON
    ANON -->|async cache miss| OLL
    ANON -->|anonymized body| AIAPI
    AIAPI -->|response| ANON
    ANON -->|de-anonymized| APP

    PRXY -->|other domain| OTHER
    API -->|read/write| REG
    PRXY -->|lookup| REG
    PRXY --> MET
    ANON --> MET
```

## Request lifecycle

### HTTPS CONNECT to an AI API domain (MITM path)

```mermaid
sequenceDiagram
    participant C as Client
    participant P as proxy.go
    participant CA as mitm/cert.go
    participant A as anonymizer.go
    participant API as AI API

    C->>P: CONNECT api.openai.com:443
    P->>P: DomainRegistry.Has(domain) → true
    P->>C: 200 Connection Established
    P->>CA: CertFor("api.openai.com")
    CA-->>P: leaf cert signed by proxy CA
    Note over C,P: TLS handshake — client uses proxy CA cert
    Note over P,API: Proxy opens a separate real TLS connection to the AI API

    loop each request over the tunnel
        C->>P: POST /v1/messages (plaintext to proxy)
        P->>P: isAuthRequest? → No
        P->>A: AnonymizeJSON(body, sessionID)
        A-->>P: anonymized body + token map stored
        P->>API: POST /v1/messages (anonymized, real TLS)
        API-->>P: response
        alt SSE / text/event-stream
            P->>A: StreamingDeanonymize(body, sessionID, domain)
            A-->>C: token replacements streamed on-the-fly
        else buffered response
            P->>A: DeanonymizeText(body, sessionID)
            A-->>P: restored text
            P-->>C: response with original values
        end
        P->>A: DeleteSession(sessionID)
    end
```

### HTTPS CONNECT to a non-AI domain (opaque tunnel)

```mermaid
sequenceDiagram
    participant C as Client
    participant P as proxy.go
    participant D as ssrfSafeDialContext
    participant S as Destination server

    C->>P: CONNECT other-site.com:443
    P->>P: DomainRegistry.Has → false
    P->>P: isPrivateHost? → No
    P->>D: Dial tcp other-site.com:443
    D->>D: Resolve hostname → check IPs against private CIDRs
    D-->>P: net.Conn (or blocked if private IP)
    P->>C: 200 Connection Established
    Note over C,S: Raw bytes copied bidirectionally — no inspection
```

### Plain HTTP to an AI API domain

```mermaid
sequenceDiagram
    participant C as Client
    participant P as proxy.go
    participant A as anonymizer.go
    participant API as AI API

    C->>P: POST http://api.openai.com/v1/chat (plain HTTP)
    P->>P: isAuthRequest? → No
    P->>A: AnonymizeJSON(body, sessionID)
    A-->>P: anonymized body
    P->>API: POST (anonymized)
    API-->>P: response
    P->>A: DeanonymizeText(response, sessionID)
    A-->>P: restored response
    P-->>C: response
    P->>A: DeleteSession(sessionID)
```

## Anonymization pipeline

```mermaid
flowchart TD
    IN([Request body]) --> PARSE{Valid JSON?}
    PARSE -->|Yes| WALK[Walk string leaves\nrecursively]
    PARSE -->|No| PLAIN[Treat as plain text]
    WALK --> RX
    PLAIN --> RX

    RX[Pack-based regex pass\npatterns loaded from enabled packs\nwith confidence scores] --> MATCH{Any match?}

    MATCH -->|No| NOCONF[effectiveConfidence = 0.0\ntext may still contain\nAI-detectable PII]
    MATCH -->|Yes| MINC[track minConfidence\nacross all matches]

    MINC --> THRESH{minConfidence\n≥ aiThreshold?}
    THRESH -->|Yes, regex is sufficient| OUT

    THRESH -->|No| AI{useAI enabled?}
    NOCONF --> AI

    AI -->|No| OUT
    AI -->|Yes| CACHE{Cache hit\nfor PII value?}

    CACHE -->|Hit| APPLY[Use cached\ntoken immediately]
    APPLY --> OUT

    CACHE -->|Miss| ASYNC[Dispatch background\nOllama goroutine\nvia inflight dedup map]
    ASYNC --> OUT

    OUT([Return anonymized text\ntoken map stored in session])

    ASYNC -.->|populates cache\nfor next request| CACHE
```

## Pack-based pattern loading

Patterns are organized into named packs that self-register via init() in internal/anonymizer/packs/. The anonymizer loads patterns only from packs listed in the enabledPacks configuration. Packs enabled by default: GLOBAL, DE, SECRETS.

A positional decay multiplier reduces confidence for patterns from packs later in the list, where position is the pack's 1-based index in enabledPacks:

effectiveConfidence = baseConfidence × (1.0 - (position - 1) × packDecayRate)

Patterns with a Validate function (e.g. Luhn for credit cards, ISO 7064 for Steuer-ID) reject regex matches that fail checksum validation, reducing false positives.
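A minimal sketch of this self-registration shape (the Pattern struct, Register helper, and the exact regex below are illustrative assumptions, not the project's actual API):

```go
package packs

import "regexp"

// Pattern pairs a compiled regex with a base confidence score and an
// optional checksum validator that rejects false-positive matches.
type Pattern struct {
	Type       string
	Regexp     *regexp.Regexp
	Confidence float64
	Validate   func(match string) bool // nil when no checksum applies
}

var registry = map[string][]Pattern{}

// Register is called from each pack's init(), so importing the packs
// package is enough to make every pack's patterns discoverable.
func Register(pack string, ps ...Pattern) {
	registry[pack] = append(registry[pack], ps...)
}

func init() {
	Register("GLOBAL", Pattern{
		Type:       "CREDITCARD",
		Regexp:     regexp.MustCompile(`\b(?:\d[ -]?){12,18}\b`),
		Confidence: 0.85,      // base value; positional decay applies later
		Validate:   luhnValid, // drop matches that fail the Luhn checksum
	})
}

// luhnValid runs the standard Luhn check over the digits of s,
// ignoring spaces and dashes.
func luhnValid(s string) bool {
	sum, double := 0, false
	for i := len(s) - 1; i >= 0; i-- {
		c := s[i]
		if c < '0' || c > '9' {
			continue
		}
		d := int(c - '0')
		if double {
			d *= 2
			if d > 9 {
				d -= 9
			}
		}
		sum += d
		double = !double
	}
	return sum%10 == 0
}
```

Registering from init() means a pack becomes available simply by being imported; the enabledPacks list then filters which registered packs the anonymizer actually loads.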

Pattern confidence scores (base values, before positional decay):

| PII type | Pack | Example | Confidence | Validator |
|----------|------|---------|------------|-----------|
| Email | GLOBAL | `user@example.com` | 0.95 | |
| API key | GLOBAL | `Bearer sk-abc…` (≥ 20 chars) | 0.90 | |
| Credit card | GLOBAL | `4111 1111 1111 1111` | 0.85 | Luhn |
| Steuer-ID | DE | `65929970489` | 0.70 | ISO 7064 |
| SVNR | DE | `12150385A123` | 0.80 | |
| KFZ | DE | `B AB 1234` | 0.75 | |
| SSH key | SECRETS | `-----BEGIN RSA PRIVATE KEY-----` | 0.99 | |
| JWT | SECRETS | `eyJhbGci...` | 0.95 | |
| Bearer token | SECRETS | `Bearer <token>` | 0.92 | |
| DB connection | SECRETS | `postgresql://user:pass@host` | 0.93 | |
| AWS key | SECRETS | `AKIAIOSFODNN7EXAMPLE` | 0.97 | |
| GitHub token | SECRETS | `ghp_ABC...` | 0.97 | |

Tokens are formatted as [PII_<TYPE>_<16hex>] — e.g. [PII_EMAIL_c160f8cc4b2e1a3d] — where <TYPE> is the uppercased PII type name and <16hex> is the first 16 hex digits of md5(original). Maximum token length: 33 bytes ([PII_CREDITCARD_XXXXXXXXXXXXXXXX]). The type label gives the LLM semantic context without revealing the original value; the system instruction injected into every anonymized request prohibits the LLM from substituting plausible-looking values in place of tokens. The hash is deterministic so the same value always produces the same token within and across requests.
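A minimal sketch of the token construction under the format described above (tokenFor is a hypothetical helper name, not necessarily the one in anonymizer.go):

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"strings"
)

// tokenFor builds the deterministic placeholder described above: the
// same original value always maps to the same token.
func tokenFor(piiType, original string) string {
	sum := md5.Sum([]byte(original))
	// The first 16 hex digits are the first 8 bytes of the digest.
	return fmt.Sprintf("[PII_%s_%s]", strings.ToUpper(piiType), hex.EncodeToString(sum[:8]))
}

func main() {
	fmt.Println(tokenFor("email", "user@example.com")) // [PII_EMAIL_<16 hex digits>]
}
```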

## Ollama cache states

Each PII value (not the full request body) progresses through three states. The cache is keyed by the original value string, so a recurring email address or phone number gets a hit regardless of which message body it appears in. The self-transition on Inflight is the in-flight deduplication: a second request containing the same value while the Ollama query is still running reuses the running goroutine rather than spawning a new one.

```mermaid
stateDiagram-v2
    [*] --> Uncached : new PII value

    Uncached --> Inflight : cache miss — goroutine dispatched\ncurrent request uses fallback token immediately

    Inflight --> Inflight : duplicate request for same value\ninflight dedup — no second goroutine\ncurrent request uses fallback token immediately

    Inflight --> Cached : Ollama query succeeded\ndetections stored in cache

    Inflight --> Uncached : Ollama query failed\nor semaphore full (request dropped)\nnext request will retry dispatch

    Cached --> Cached : cache hit — AI detections\napplied to current request immediately

    note right of Cached
        S3-FIFO eviction when
        capacity (50 000 entries)
        is reached. Evicted entries
        are deleted from bbolt so
        disk usage stays bounded.
    end note
```
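A hedged sketch of the dedup logic behind this state machine (struct and method names are illustrative; the real code also handles the semaphore and bbolt persistence mentioned above):

```go
package anonymizer

import "sync"

// Detection is a stand-in for whatever an Ollama query returns per value.
type Detection struct{ Type, Value string }

type ollamaCache struct {
	mu       sync.Mutex
	cached   map[string][]Detection // original value → AI detections
	inflight map[string]struct{}    // values with a running Ollama query
}

// lookupOrDispatch returns cached detections on a hit. On a miss it
// launches at most one background query per value; a concurrent request
// for the same value falls through to the fallback token instead of
// spawning a second goroutine.
func (c *ollamaCache) lookupOrDispatch(value string, query func(string) ([]Detection, error)) ([]Detection, bool) {
	c.mu.Lock()
	if d, ok := c.cached[value]; ok {
		c.mu.Unlock()
		return d, true // Cached: apply immediately
	}
	if _, running := c.inflight[value]; running {
		c.mu.Unlock()
		return nil, false // Inflight → Inflight: dedup, no new goroutine
	}
	c.inflight[value] = struct{}{} // Uncached → Inflight
	c.mu.Unlock()

	go func() {
		d, err := query(value)
		c.mu.Lock()
		defer c.mu.Unlock()
		delete(c.inflight, value)
		if err != nil {
			return // Inflight → Uncached: the next request retries
		}
		c.cached[value] = d // Inflight → Cached
	}()
	return nil, false // this request uses the fallback token
}
```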

## De-anonymization and streaming

Each request gets a random sessionID. The token→original map is stored in anonymizer.sessions[sessionID] during anonymization and deleted after the response is delivered.
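A sketch of that session store's shape, using the DeanonymizeText and DeleteSession names from the diagrams above (the concrete struct in anonymizer.go likely holds more state):

```go
package anonymizer

import (
	"strings"
	"sync"
)

// Anonymizer holds the per-session token→original maps.
type Anonymizer struct {
	mu       sync.RWMutex
	sessions map[string]map[string]string // sessionID → token → original
}

// DeanonymizeText restores every recorded token for the session in a
// single pass using strings.Replacer.
func (a *Anonymizer) DeanonymizeText(text, sessionID string) string {
	a.mu.RLock()
	pairs := make([]string, 0, 2*len(a.sessions[sessionID]))
	for token, original := range a.sessions[sessionID] {
		pairs = append(pairs, token, original)
	}
	a.mu.RUnlock()
	return strings.NewReplacer(pairs...).Replace(text)
}

// DeleteSession drops the map once the response has been delivered.
func (a *Anonymizer) DeleteSession(sessionID string) {
	a.mu.Lock()
	delete(a.sessions, sessionID)
	a.mu.Unlock()
}
```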

For SSE (Content-Type: text/event-stream), StreamingDeanonymize wraps the response body in a pipe-based reader. AI API providers deliver text content in different SSE formats, and a single token like [PII_EMAIL_c160f8cc4b2e1a3d] frequently arrives split across multiple events.

The streaming system uses a provider-aware StreamingDeanonymizer interface to handle each provider's SSE format. The domain parameter selects the appropriate implementation:

| Provider | Text field | Domains |
|----------|------------|---------|
| Anthropic | `delta.text` / `delta.thinking` | api.anthropic.com |
| OpenAI | `choices[0].delta.content` | api.openai.com, api.mistral.ai, api.together.xyz, api.perplexity.ai, api.huggingface.co |
| Gemini | `candidates[0].content.parts[0].text` | generativelanguage.googleapis.com |
| Cohere | `delta.message.content.text` | api.cohere.ai |
| Replicate | Plain text in `data` | api.replicate.com |
| Passthrough | Raw replacement | Unknown domains |

The shared framework in internal/anonymizer/streaming.go handles SSE framing:

| Helper | Responsibility |
|--------|----------------|
| `readLoop` | Top-level goroutine: reads chunks from the source and dispatches complete lines |
| `assembleLines` | Splits raw bytes on newlines, strips `\r`, dispatches to `processLine` |
| `processLine` | Classifies each SSE line (comment, non-data, data) and delegates data payloads to the provider |
| `safeCutPoint` | Calculates how many accumulated bytes can be flushed without splitting a partial token |
| `handleStreamEnd` | Flushes partial lines and calls `provider.Flush()` at EOF or on read error |

Provider-specific implementations live in separate files (streaming_anthropic.go, streaming_openai.go, etc.) and handle JSON parsing, text accumulation, and re-serialization.

The 33-byte suffix guard (tokenSuffixLen) is retained in the accumulator — enough to cover the longest possible token ([PII_CREDITCARD_XXXXXXXXXXXXXXXX] = 33 chars). Non-delta events (ping, message_start, etc.) also pass through the replacer so tokens embedded in any part of the SSE stream are deanonymized.
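A sketch of the cut-point reasoning, assuming tokens are the only bracketed spans that matter (the real safeCutPoint in streaming.go may differ in detail):

```go
package anonymizer

import "bytes"

const tokenSuffixLen = 33 // len("[PII_CREDITCARD_XXXXXXXXXXXXXXXX]")

// safeCutPoint reports how many accumulated bytes can be flushed
// without splitting a token. Because a complete token is at most
// tokenSuffixLen bytes, only an unclosed '[' inside the trailing
// window can still be the start of an incomplete token; everything
// before that point is safe to emit.
func safeCutPoint(buf []byte) int {
	start := 0
	if len(buf) >= tokenSuffixLen {
		start = len(buf) - (tokenSuffixLen - 1)
	}
	if i := bytes.LastIndexByte(buf[start:], '['); i >= 0 {
		abs := start + i
		if bytes.IndexByte(buf[abs:], ']') < 0 {
			return abs // a token may be in progress: hold back from here
		}
	}
	return len(buf) // no partial token possible: flush everything
}
```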

## SSRF protection

```mermaid
flowchart TD
    REQ([Dial request\nhostname:port]) --> ISIP{Literal IP\nin request?}
    ISIP -->|Yes, private| BLOCK([Block — log + error])
    ISIP -->|No| RESOLVE[Resolve hostname\nnet.DefaultResolver]
    RESOLVE --> CHECK{Any resolved IP\nin private CIDRs?}
    CHECK -->|Yes| BLOCK
    CHECK -->|No| DIAL[Dial first resolved IP\ndirectly]
    DIAL --> CONN([net.Conn])
```

Blocked CIDRs: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, 127.0.0.0/8, 169.254.0.0/16, ::1/128, fc00::/7, fe80::/10.

The check runs at dial time (not at request-parse time) to close the TOCTOU gap exploited by DNS rebinding, where a hostname resolves to a public IP during the check but switches to a private IP when the TCP connection is established.
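A hedged sketch of such a dial-time check in the spirit of ssrfSafeDialContext (function names and error handling are illustrative):

```go
package proxy

import (
	"context"
	"fmt"
	"net"
)

var privateBlocks []*net.IPNet

func init() {
	for _, cidr := range []string{
		"10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.0/8",
		"169.254.0.0/16", "::1/128", "fc00::/7", "fe80::/10",
	} {
		_, block, _ := net.ParseCIDR(cidr)
		privateBlocks = append(privateBlocks, block)
	}
}

func isPrivate(ip net.IP) bool {
	for _, block := range privateBlocks {
		if block.Contains(ip) {
			return true
		}
	}
	return false
}

// ssrfSafeDial resolves, vets, and dials in one step. Dialing the
// vetted IP directly closes the rebinding window: a second resolution
// cannot swap in a private address after the check has passed.
func ssrfSafeDial(ctx context.Context, network, addr string) (net.Conn, error) {
	host, port, err := net.SplitHostPort(addr)
	if err != nil {
		return nil, err
	}
	// Literal IPs pass through LookupIP unchanged, so one code path
	// covers both the "literal IP in request" and hostname branches.
	ips, err := net.DefaultResolver.LookupIP(ctx, "ip", host)
	if err != nil {
		return nil, err
	}
	for _, ip := range ips {
		if isPrivate(ip) {
			return nil, fmt.Errorf("blocked private address %s for %s", ip, host)
		}
	}
	var d net.Dialer
	return d.DialContext(ctx, network, net.JoinHostPort(ips[0].String(), port))
}
```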

## TLS / MITM cert lifecycle

```mermaid
flowchart LR
    START([Proxy startup]) --> LOAD{ca-cert.pem\nca-key.pem\nexist?}
    LOAD -->|Yes| PARSE[Parse CA cert + key]
    LOAD -->|No| GEN[GenerateCA\nRSA-4096, 10 yr validity]
    GEN --> PARSE
    PARSE --> CA[(mitm.CA\ncert + key + cache)]

    REQ([CONNECT host]) --> CCHECK{cache has\ncert for host?}
    CCHECK -->|Hit, not expired| TLSCFG
    CCHECK -->|Miss or expired| SIGN[GenerateKey RSA-2048\nSignCert 7 day validity]
    SIGN --> STORE[Store in cache\nmax 10 000 entries\nfull clear on overflow]
    STORE --> TLSCFG[tls.Config.GetCertificate]
    TLSCFG --> ALPN{ALPN negotiated?}
    ALPN -->|h2| H2[http2.Server.ServeConn]
    ALPN -->|http/1.1| H1[http.Server\nsingleConnListener]
```
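A sketch of the cache-then-sign lookup in the flowchart (CertFor appears in the sequence diagram above; signLeaf is a hypothetical stand-in for key generation and signing):

```go
package mitm

import (
	"crypto/tls"
	"sync"
	"time"
)

// CA holds the signing material and the per-host leaf cache; the
// concrete struct in mitm/cert.go may differ.
type CA struct {
	mu    sync.Mutex
	cache map[string]*tls.Certificate // host → signed leaf
}

// CertFor returns a cached leaf for host or signs a fresh one
// (RSA-2048 key, 7-day validity per the flowchart).
func (ca *CA) CertFor(host string) (*tls.Certificate, error) {
	ca.mu.Lock()
	defer ca.mu.Unlock()
	if c, ok := ca.cache[host]; ok && c.Leaf != nil && time.Now().Before(c.Leaf.NotAfter) {
		return c, nil // hit and not yet expired
	}
	leaf, err := ca.signLeaf(host)
	if err != nil {
		return nil, err
	}
	if len(ca.cache) >= 10000 {
		ca.cache = make(map[string]*tls.Certificate) // full clear on overflow
	}
	ca.cache[host] = leaf
	return leaf, nil
}

// signLeaf stands in for the real key-generation and signing step;
// the x509 template and x509.CreateCertificate call are omitted here.
func (ca *CA) signLeaf(host string) (*tls.Certificate, error) {
	return nil, nil
}
```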

## Domain registry and persistence

```mermaid
flowchart TD
    START([Proxy startup]) --> FILE{ai-domains.json\nexists?}
    FILE -->|Yes| LOAD[Load persisted domains\ntakes precedence]
    FILE -->|No / corrupt| CFG[Load from\nproxy-config.json]
    LOAD --> REG[(DomainRegistry\nmap + RWMutex)]
    CFG --> REG

    REG -->|DomainRegistry.Has| PROXY[proxy: intercept or tunnel?]

    ADD[POST /domains/add] --> LOCK[Lock → mutate map → snapshot]
    RM[POST /domains/remove] --> LOCK
    LOCK --> ATOMIC[Write temp file\nos.Rename → ai-domains.json]
    ATOMIC --> REG
```

Writes use an atomic rename (write to a temp file, then os.Rename) so the persisted file is never partially written. The DomainRegistry mutex is released before the write; Has calls are never blocked by disk I/O.
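A minimal sketch of that pattern (persistDomains is a hypothetical name; the actual JSON layout of ai-domains.json may differ):

```go
package management

import (
	"encoding/json"
	"os"
)

// persistDomains writes the snapshot to a temp file and renames it
// into place.
func persistDomains(path string, domains []string) error {
	data, err := json.MarshalIndent(domains, "", "  ")
	if err != nil {
		return err
	}
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, data, 0o600); err != nil {
		return err
	}
	// os.Rename is atomic on POSIX filesystems: readers observe either
	// the old file or the complete new one, never a partial write.
	return os.Rename(tmp, path)
}
```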

## Packages

| Package | Responsibility |
|---------|----------------|
| `cmd/proxy` | Entry point: wires config, shared registry, metrics, both HTTP servers |
| `internal/config` | Layered config loading: defaults → proxy-config.json → env vars |
| `internal/anonymizer` | Pack-based PII detection, token replacement, session maps, streaming de-anon |
| `internal/anonymizer/packs` | Self-registering PII detection pattern packs (GLOBAL, DE, US, SECRETS, etc.) |
| `internal/proxy` | Request router: MITM tunnel, opaque tunnel, plain-HTTP forwarding, SSRF |
| `internal/mitm` | CA management, per-host leaf cert generation/caching, TLS handshake, ALPN |
| `internal/management` | Management HTTP API + persistent DomainRegistry |
| `internal/metrics` | Atomic request/error/token counters; latency stats; JSON snapshot |
| `internal/logger` | Structured, level-gated logger (debug/info/warn/error) → stderr |

## Metrics architecture

All hot-path counters (RequestsTotal, TokensReplaced, etc.) are sync/atomic.Int64 — no mutex in the request path. Latency accumulators use one sync.Mutex each, updated once per request at the call site.
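A sketch of this layout (only RequestsTotal and TokensReplaced are names taken from this document; the rest is illustrative):

```go
package metrics

import (
	"sync"
	"sync/atomic"
	"time"
)

// Metrics separates lock-free hot-path counters from the
// mutex-guarded latency accumulators.
type Metrics struct {
	RequestsTotal  atomic.Int64 // hot path: no mutex
	TokensReplaced atomic.Int64

	latencyMu    sync.Mutex // guards the accumulators below
	latencySum   time.Duration
	latencyCount int64
}

// ObserveRequest is invoked once per request at the call site.
func (m *Metrics) ObserveRequest(d time.Duration, tokens int64) {
	m.RequestsTotal.Add(1)
	m.TokensReplaced.Add(tokens)
	m.latencyMu.Lock()
	m.latencySum += d
	m.latencyCount++
	m.latencyMu.Unlock()
}
```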

### Anonymizer cache counters

The piiTokens block in the /metrics snapshot exposes counters for the low-confidence detection path:

| Counter | Where incremented | What it signals |
|---------|-------------------|-----------------|
| `cacheHits[<type>]` | `tokenForMatch` — cache hit | Cache is warm for this PII type |
| `cacheMisses[<type>]` | `tokenForMatch` — cache miss | Value not yet seen by Ollama |
| `cacheFallbacks` | `tokenForMatch` — cache miss | Fallback token used; increments with every miss |
| `ollamaDispatches` | `dispatchOllamaAsync` — before goroutine launch | Goroutine was spawned |
| `ollamaErrors` | `dispatchOllamaAsync` — semaphore full or HTTP error | Ollama unavailable or overloaded |

Per-type counters are pre-allocated for all known PII types (including pack-added types) at startup; zero-count types are omitted from the JSON output. Counter maps are written only during initialization, so concurrent reads in Snapshot() require no additional lock.

Cache effectiveness signal: cacheFallbacks / ollamaDispatches trending toward 0 after warm-up means recurring values are now served from cache. A ratio near 1 after warm-up indicates Ollama is unreachable, values are high-cardinality, or aiConfidenceThreshold is routing too many matches through the low-confidence path.