feat: move redaction timing to before job enqueue in record! (#47) #57

Closed
dpaluy wants to merge 4 commits into master from feature/move-redaction-timing-to-before-job-enqueue-in-record

Conversation

dpaluy (Owner) commented Jan 7, 2026

Summary

Critical security fix: Apply PII redaction BEFORE job enqueue to ensure raw PII never enters the job queue (Redis/Sidekiq/SQS).

  • record! now calls apply_redaction() before PersistInteractionJob.perform_later
  • apply_redaction() serializes actor and runs RedactionPipeline inline
  • RedactionAudit excluded from job payload (not ActiveJob serializable)
  • PersistInteractionJob uses assign_actor() for serialized actor data
  • Added comprehensive tests for redaction timing and actor serialization
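
The ordering described above can be illustrated with a minimal sketch. This is not the PR's code: `FakeQueue`, `Recorder`, and the `EMAIL` regex are hypothetical stand-ins (an in-memory array replaces the real job backend, and a single email pattern replaces RedactionPipeline), used only to show that the queue receives the redacted copy.

```ruby
# Hypothetical stand-ins: none of these names are Tracebook's real classes.
EMAIL = /\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b/

# In-memory stand-in for a job backend such as Redis or SQS.
class FakeQueue
  attr_reader :payloads

  def initialize
    @payloads = []
  end

  def push(payload)
    @payloads << payload
  end
end

class Recorder
  def initialize(queue)
    @queue = queue
  end

  # Redact inline first, then hand only the redacted copy to the queue.
  def record!(attrs)
    redacted = apply_redaction(attrs)
    @queue.push(redacted)
    redacted
  end

  private

  # Simplified: redacts string values with one pattern only.
  def apply_redaction(attrs)
    attrs.transform_values do |value|
      value.is_a?(String) ? value.gsub(EMAIL, "[EMAIL]") : value
    end
  end
end

queue = FakeQueue.new
Recorder.new(queue).record!(prompt: "Contact jane@example.com", tokens: 12)
puts queue.payloads.first.inspect
```

Because redaction runs synchronously inside `record!`, its cost lands on the caller, which is presumably why the test plan includes a latency check.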

Test plan

  • Test verifies PII is redacted before job enqueue
  • Test verifies no raw PII in persisted data
  • Test verifies actor serialization for job-safe persistence
  • Performance test confirms redaction is under 100ms for large payloads
  • Full test suite passes (171 tests)

Closes #47

dpaluy added 4 commits January 6, 2026 23:33
Add public method to convert ActiveRecord actors to job-safe serialized
format for background job enqueueing. Supports GlobalID extraction with
fallback to type/id tuple for objects without GlobalID support.

Closes #42
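
A rough sketch of the GlobalID-with-fallback idea follows. The helper name `serialize_actor` and the plain-Ruby `User` and `ApiClient` structs are assumptions made so the example runs without Rails; only `to_global_id` (provided by `GlobalID::Identification` on real models) and the `actor_type`, `actor_id`, and `actor_gid` field names come from the gem ecosystem and this PR.

```ruby
# Hypothetical helper; the name and the returned hash shape are assumptions.
def serialize_actor(actor)
  return nil if actor.nil?

  # The type/id pair is always available and serves as the fallback.
  data = { actor_type: actor.class.name, actor_id: actor.id.to_s }
  # Add the GlobalID string only when the object supports it.
  data[:actor_gid] = actor.to_global_id.to_s if actor.respond_to?(:to_global_id)
  data
end

# Plain-Ruby stand-ins so the sketch runs without Rails.
User = Struct.new(:id) do
  def to_global_id
    "gid://app/User/#{id}"
  end
end
ApiClient = Struct.new(:id)

p serialize_actor(User.new(7))
p serialize_actor(ApiClient.new("abc"))
```

The result contains only strings and symbols, so it can travel through a job queue without the queue having to load the actor record.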

Add pattern-based redaction DSL to Config class:
- config.redact :email, :phone - enables individual patterns
- config.redact_group :api_keys - enables pattern groups
- config.redact_pattern(/regex/, "[REPLACEMENT]") - custom patterns
- config.active_patterns - returns all enabled Pattern objects

Also includes:
- T2: Validators module with Luhn and SSN range validation
- T3: PATTERNS hash with 16 built-in patterns (email, phone, credit_card,
  ssn, openai_key, anthropic_key, aws_key, stripe_key, github_token,
  github_pat, bearer_token, basic_auth, private_key, ipv4, ipv6, jwt)
- PATTERN_GROUPS for convenient batch enabling (pii, financial, api_keys,
  auth, network, crypto)
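
For readers unfamiliar with Luhn validation, here is a standalone sketch of the checksum. The method name `luhn_valid?` is an assumption; the Validators module in this commit may expose a different interface.

```ruby
# Luhn checksum: double every second digit from the right, subtract 9 from
# any result above 9, and require the total to be divisible by 10.
def luhn_valid?(number)
  digits = number.to_s.scan(/\d/).map(&:to_i)
  return false if digits.length < 2

  sum = digits.reverse.each_with_index.sum do |digit, index|
    next digit if index.even?

    doubled = digit * 2
    doubled > 9 ? doubled - 9 : doubled
  end
  (sum % 10).zero?
end

puts luhn_valid?("4111 1111 1111 1111")
puts luhn_valid?("4111 1111 1111 1112")
```

A check like this lets a credit-card pattern skip digit runs that merely look like card numbers, which reduces false-positive redactions.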

Invalid pattern names raise ConfigurationError at config time for
early validation.
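
The DSL and its early validation might look roughly like the sketch below. It is heavily simplified and rests on assumptions: the registry holds two patterns instead of 16, the group memberships are guesses, and `active_patterns` returns a hash of regexes rather than Pattern objects.

```ruby
class ConfigurationError < StandardError; end

class Config
  # Tiny stand-in registry; regexes and group contents are illustrative only.
  PATTERNS = {
    email: /\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b/,
    ipv4: /\b(?:\d{1,3}\.){3}\d{1,3}\b/
  }.freeze
  PATTERN_GROUPS = { pii: %i[email], network: %i[ipv4] }.freeze

  attr_reader :active_patterns

  def initialize
    @active_patterns = {}
  end

  # Enables individual patterns; unknown names fail immediately.
  def redact(*names)
    names.each do |name|
      regex = PATTERNS.fetch(name) { raise ConfigurationError, "unknown pattern: #{name}" }
      @active_patterns[name] = regex
    end
  end

  # Enables every pattern in a named group.
  def redact_group(group)
    names = PATTERN_GROUPS.fetch(group) { raise ConfigurationError, "unknown group: #{group}" }
    redact(*names)
  end
end

config = Config.new
config.redact :email
config.redact_group :network
p config.active_patterns.keys
```

Raising inside `redact` means a misspelled pattern name surfaces when the application boots rather than silently leaving data unredacted at runtime.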

Update RedactionPipeline to support both interface styles:
- New Pattern-based: call(text, audit:, field_path:) returns [text, audit]
- Legacy lambda: call(text) returns text

The pipeline auto-detects which interface to use:
- Pattern objects are detected by class check
- Lambdas with `audit:` keyword param bypass wrapping
- Legacy single-arg lambdas get wrapped with audit tracking

Also includes:
- NormalizedInteraction extended with actor_type, actor_id, actor_gid,
  redaction_audit fields (T4 dependencies)
- Patterns applied first, then custom_redactors
- RedactionAudit populated on result.redaction_audit

Backwards compatible with existing custom_redactors configuration.
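
One way to auto-detect the interface of a lambda is Ruby's `Proc#parameters`, sketched below. The helper names `audit_aware?` and `run_redactors`, the array-based audit, and the fixed `"prompt"` field path are assumptions for illustration; the real pipeline also handles Pattern objects by class check, which this sketch omits.

```ruby
# True when the callable declares an `audit:` keyword parameter.
def audit_aware?(callable)
  callable.parameters.any? { |type, name| %i[key keyreq].include?(type) && name == :audit }
end

# Runs each redactor in order, wrapping legacy single-argument lambdas
# with basic audit tracking. New-style lambdas are assumed to accept
# both `audit:` and `field_path:` and to return [text, audit].
def run_redactors(text, redactors)
  audit = []
  redactors.each do |redactor|
    if audit_aware?(redactor)
      text, audit = redactor.call(text, audit: audit, field_path: "prompt")
    else
      before = text
      text = redactor.call(text)
      audit << { redactor: "legacy", changed: true } if text != before
    end
  end
  [text, audit]
end

legacy = ->(text) { text.gsub(/\d{3}-\d{2}-\d{4}/, "[SSN]") }
modern = lambda do |text, audit:, field_path:|
  redacted = text.gsub(/sk-\w+/, "[KEY]")
  audit << { redactor: "key", field: field_path } if redacted != text
  [redacted, audit]
end

text, audit = run_redactors("ssn 123-45-6789 key sk-abc123", [modern, legacy])
puts text
p audit
```

Inspecting parameters instead of requiring a new base class is what keeps existing single-argument custom_redactors working unchanged.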

Critical security fix: Apply PII redaction BEFORE job enqueue to ensure
raw PII never enters the job queue (Redis/Sidekiq/SQS).

Changes:
- record! now calls apply_redaction() before PersistInteractionJob.perform_later
- apply_redaction() serializes actor and runs RedactionPipeline inline
- RedactionAudit excluded from job payload (not ActiveJob serializable)
- PersistInteractionJob uses assign_actor() for serialized actor data
- Added comprehensive tests for redaction timing and actor serialization

dpaluy force-pushed the master branch 8 times, most recently from 0a23bb3 to d2366f3 on March 26, 2026 14:19

dpaluy (Owner, Author) commented Apr 26, 2026

Closing as obsolete after Tracebook v1.0.0 moved to the RubyLLM Chat/Message architecture and removed the old interaction-recorder paths this PR was based on. Current OPF redaction backend work is tracked by #68/#69.

dpaluy closed this on Apr 26, 2026
Development

Successfully merging this pull request may close these issues.

T10: Move redaction timing to before job enqueue in record!
