feat: add ai-peyeeye plugin for PII redaction & rehydration #13300

Open
tim-peyeeye wants to merge 1 commit into apache:master from tim-peyeeye:feat/ai-peyeeye-plugin

tim-peyeeye commented Apr 26, 2026

What this PR adds

A new AI plugin, ai-peyeeye, that performs PII redaction on the request body sent upstream to an LLM and rehydrates the LLM's response before it reaches the client. It is designed to sit alongside ai-proxy / ai-proxy-multi on the same route, at priority 1074 (ahead of ai-proxy's 1040), and calls the peyeeye.ai HTTP API:

  • POST /v1/redact — pre-call, replaces detected entities with deterministic tokens (e.g. [EMAIL_1], [CARD_2]).
  • POST /v1/rehydrate — post-call, swaps the tokens in the LLM's response back to the original values.
  • DELETE /v1/sessions/<id> — best-effort cleanup for stateful sessions.
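To illustrate the redact/rehydrate round trip described above, here is a minimal local sketch in Python (the plugin itself is Lua). It is a toy stand-in, not the peyeeye engine: it only detects email addresses with a simple regex, and the function names `redact` and `rehydrate` are hypothetical. It shows the property the token scheme relies on, namely that the same value always maps to the same token so the response can be mapped back.

```python
import re

# Assumption: a deliberately simple email pattern, for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(texts):
    """Toy stand-in for POST /v1/redact (emails only).

    Returns the redacted texts and a token -> original value map.
    """
    token_map = {}  # token -> original value
    seen = {}       # original value -> token, so repeats reuse one token

    def substitute(match):
        value = match.group(0)
        if value not in seen:
            token = f"[EMAIL_{len(seen) + 1}]"
            seen[value] = token
            token_map[token] = value
        return seen[value]

    return [EMAIL_RE.sub(substitute, text) for text in texts], token_map


def rehydrate(text, token_map):
    """Toy stand-in for POST /v1/rehydrate: swap tokens back to values."""
    for token, value in token_map.items():
        text = text.replace(token, value)
    return text
```

In this sketch the token map is returned to the caller; in the real service it is either held under a session id or sealed into a blob, depending on the session mode.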

Behavior

  • Pre-call redact (access phase): extracts text from the OpenAI-style messages[].content field (a string or an array of content parts with text), batches it to /v1/redact, and rewrites the body in place. Detection runs on the peyeeye side using regexes plus checksum and shape validation (Luhn for cards, mod-97 for IBANs, shape checks for SSNs and IPs), so the gateway only sends and receives strings.
  • Post-call rehydrate (body_filter phase): buffers the upstream response, then calls /v1/rehydrate with the buffered body and the session id (or sealed skey_… blob).
  • Two session modes:
    • stateful (default): peyeeye returns a ses_… session id; the plugin stores it on the request context and DELETEs it after rehydrate.
    • stateless: peyeeye returns an AES-GCM-sealed skey_… blob containing the token map; nothing is retained server-side.
  • Fail-closed length-guard: if /v1/redact returns a different number of texts than were sent, the request is failed with HTTP 500. Unredacted text is never forwarded upstream.
  • Fail-closed shape-guard: if /v1/redact returns an unexpected response shape (missing texts, missing session/key for the chosen mode), the request is failed with HTTP 500.
  • Best-effort rehydrate: if /v1/rehydrate fails (network, 5xx), the redacted output is preserved rather than risking PII leakage by falling back to the raw upstream response.
  • Auth required: missing api_key (in config or PEYEEYE_API_KEY env var) fails schema validation.
  • Empty-body short-circuit: requests with no extractable text skip the redact call entirely.
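The extraction step and the two fail-closed guards above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the plugin's Lua code: the helper names `extract_texts`, `apply_redaction`, and `RedactError` are hypothetical, and only the `texts` response field is taken from the description.

```python
class RedactError(Exception):
    """Raised when the redact response cannot be trusted (fail closed)."""


def extract_texts(body):
    """Collect (message_index, part_index_or_None, text) for every text slot."""
    slots = []
    for i, message in enumerate(body.get("messages", [])):
        content = message.get("content")
        if isinstance(content, str):
            slots.append((i, None, content))
        elif isinstance(content, list):
            for j, part in enumerate(content):
                is_text_part = isinstance(part, dict) and part.get("type") == "text"
                if is_text_part and isinstance(part.get("text"), str):
                    slots.append((i, j, part["text"]))
    return slots


def apply_redaction(body, slots, response):
    """Write redacted texts back in place, or raise instead of forwarding."""
    texts = response.get("texts")
    if not isinstance(texts, list):
        raise RedactError("unexpected response shape")  # shape-guard
    if len(texts) != len(slots):
        raise RedactError("text count mismatch")        # length-guard
    for (i, j, _), redacted in zip(slots, texts):
        if j is None:
            body["messages"][i]["content"] = redacted
        else:
            body["messages"][i]["content"][j]["text"] = redacted
    return body
```

A caller would map `RedactError` to an HTTP 500 and skip the redact call entirely when `extract_texts` returns an empty list, which mirrors the empty-body short-circuit.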

Files added or modified

  • apisix/plugins/ai-peyeeye.lua — the plugin (546 lines)
  • apisix/cli/config.lua — registration in the default plugin list
  • conf/config.yaml.example — example entry
  • t/admin/plugins.t — admin plugin list assertion
  • t/plugin/ai-peyeeye.t — test suite (445 lines), follows the ai-prompt-guard / ai-aliyun-content-moderation pattern: mocks the peyeeye HTTP API and a fake LLM upstream so the tests have no external dependencies. Covers:
    • schema validation (3 cases)
    • stateful redact + rehydrate end-to-end
    • stateless mode (sealed skey_…)
    • length-guard fail-closed branch
    • unexpected-response-shape fail-closed branch
    • empty-body short-circuit
  • docs/en/latest/plugins/ai-peyeeye.md + docs/en/latest/config.json
  • docs/zh/latest/plugins/ai-peyeeye.md + docs/zh/latest/config.json

Test status — please read

I want to be transparent here: I was not able to run t/plugin/ai-peyeeye.t locally. APISIX's test framework requires a custom apisix-runtime build (custom OpenResty plus a toolkit.json Lua module) that I could not reproduce on macOS. toolkit.json is not packaged in any public OpenResty/LuaRocks artifact I could find, and the make deps path against stock OpenResty fails before the test runner can start. I'm relying on this PR's CI (apache/apisix's test workflow) to actually exercise the suite.

Static checks that did pass locally:

  • luacheck apisix/plugins/ai-peyeeye.lua t/plugin/ai-peyeeye.t — clean, no warnings.
  • luajit -bl apisix/plugins/ai-peyeeye.lua — parses cleanly (no syntax errors).

If CI surfaces issues, I'll iterate on this PR. If a maintainer can point me at a working local test setup for macOS (or a CI job I can self-trigger on the fork), I'd appreciate it.

Adds an `ai-peyeeye` AI plugin that redacts PII from prompts before they
reach the upstream LLM and rehydrates the model's response so the client
sees the original values. The plugin calls the peyeeye.ai HTTP API
(`/v1/redact`, `/v1/rehydrate`, `DELETE /v1/sessions/<id>`) and is
designed to sit alongside `ai-proxy` / `ai-proxy-multi` on the same
route, at priority 1074 (ahead of `ai-proxy`'s 1040).

Behavior invariants:

- Length-guard: if `/v1/redact` returns a different number of texts than
  were sent, or returns an unexpected response shape, the request is
  failed with HTTP 500. Unredacted text is never forwarded upstream.
- Auth required: missing `api_key` (in config or `PEYEEYE_API_KEY` env
  var) fails schema validation.
- Best-effort rehydrate: if `/v1/rehydrate` fails the redacted output is
  preserved rather than risking PII leakage.
- Best-effort cleanup: stateful sessions are `DELETE`'d after rehydrate;
  failures are logged only.
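
The two best-effort invariants above can be sketched like this. It is an illustrative Python sketch, not the plugin's Lua code: `finish_response` and its callable parameters are hypothetical, and attempting cleanup even after a failed rehydrate is an assumption of the sketch rather than something the description states.

```python
import logging

log = logging.getLogger("ai-peyeeye-sketch")


def finish_response(redacted_body, session_id, rehydrate_call, delete_call):
    """Return the body to send to the client.

    rehydrate_call and delete_call are hypothetical callables standing in
    for the /v1/rehydrate and DELETE /v1/sessions/<id> HTTP requests.
    """
    try:
        result = rehydrate_call(redacted_body, session_id)
    except Exception as exc:
        # Best-effort rehydrate: keep the redacted output on failure.
        log.warning("rehydrate failed, returning redacted output: %s", exc)
        result = redacted_body
    try:
        # Assumption: cleanup is attempted whether or not rehydrate succeeded.
        delete_call(session_id)
    except Exception as exc:
        # Best-effort cleanup: a failed DELETE is logged, never surfaced.
        log.warning("session cleanup failed: %s", exc)
    return result
```

Neither failure path ever returns anything other than the redacted body or its rehydrated form, so a peyeeye outage degrades the response without exposing data the client did not already send.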

Two session modes are supported: `stateful` (default; peyeeye holds the
token-to-value map under a `ses_…` id) and `stateless` (peyeeye returns
a sealed `skey_…` blob and retains nothing).

Includes English and Chinese documentation, plugin registration in
`apisix/cli/config.lua`, `conf/config.yaml.example`, the docs sidebars,
and the admin plugin list (`t/admin/plugins.t`). Plugin tests under
`t/plugin/ai-peyeeye.t` mock the peyeeye HTTP API and a fake LLM
upstream so they run with no external dependencies, exercising:
schema validation (3 cases), the stateful redact+rehydrate end-to-end
flow, the stateless mode, the length-guard branch, the
unexpected-response-shape branch, and the empty-body short-circuit.

dosubot bot added the labels size:XXL (this PR changes 1000+ lines, ignoring generated files), enhancement (new feature or request), and plugin on Apr 26, 2026.