feat: move redaction timing to before job enqueue in record! (#47) #57
Closed
Add public method to convert ActiveRecord actors to job-safe serialized format for background job enqueueing. Supports GlobalID extraction with fallback to type/id tuple for objects without GlobalID support. Closes #42
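A minimal sketch of the GlobalID-with-fallback idea described in this commit, written without Rails so it runs standalone. `ActorSerializer`, `FakeGid`, `GidUser`, and `PlainUser` are hypothetical names used only for illustration; the actual method name and hash layout in this PR may differ.

```ruby
# Hypothetical helper, not the gem's actual API.
module ActorSerializer
  module_function

  # Returns a hash of plain strings/integers that any job backend can serialize.
  def serialize(actor)
    return nil if actor.nil?

    if actor.respond_to?(:to_global_id)
      # Preferred path: the object supports GlobalID.
      { "gid" => actor.to_global_id.to_s }
    else
      # Fallback: a type/id tuple for objects without GlobalID support.
      { "type" => actor.class.name, "id" => actor.id }
    end
  end
end

# Stand-ins so the sketch runs without ActiveRecord or the globalid gem.
FakeGid = Struct.new(:uri) do
  def to_s
    uri
  end
end

class GidUser
  attr_reader :id

  def initialize(id)
    @id = id
  end

  def to_global_id
    FakeGid.new("gid://app/GidUser/#{id}")
  end
end

PlainUser = Struct.new(:id)
```

Duck-typing on `to_global_id` rather than checking for a specific module keeps the fallback path reachable for plain Ruby objects.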
Add pattern-based redaction DSL to Config class:

- `config.redact :email, :phone` enables individual patterns
- `config.redact_group :api_keys` enables pattern groups
- `config.redact_pattern(/regex/, "[REPLACEMENT]")` adds custom patterns
- `config.active_patterns` returns all enabled Pattern objects

Also includes:

- T2: Validators module with Luhn and SSN range validation
- T3: PATTERNS hash with 16 built-in patterns (email, phone, credit_card, ssn, openai_key, anthropic_key, aws_key, stripe_key, github_token, github_pat, bearer_token, basic_auth, private_key, ipv4, ipv6, jwt)
- PATTERN_GROUPS for convenient batch enabling (pii, financial, api_keys, auth, network, crypto)

Invalid pattern names raise ConfigurationError at config time for early validation.
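The DSL described above could look roughly like the sketch below. It is a simplified illustration, not the code from this PR: only two of the 16 patterns are included, the regexes and replacement strings are placeholders chosen for the example, and the `Pattern` struct layout is an assumption.

```ruby
class ConfigurationError < StandardError; end

# Assumed shape of a pattern object: a name, a regex, and a replacement string.
Pattern = Struct.new(:name, :regex, :replacement)

class Config
  # Illustrative subset; the regexes here are placeholders, not the PR's.
  PATTERNS = {
    email: Pattern.new(:email, /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/, "[EMAIL]"),
    ipv4: Pattern.new(:ipv4, /\b(?:\d{1,3}\.){3}\d{1,3}\b/, "[IPV4]")
  }.freeze

  PATTERN_GROUPS = { pii: [:email], network: [:ipv4] }.freeze

  def initialize
    @enabled = []
    @custom = []
  end

  # Enable built-in patterns by name; unknown names fail at config time.
  def redact(*names)
    names.each do |name|
      raise ConfigurationError, "unknown pattern: #{name.inspect}" unless PATTERNS.key?(name)

      @enabled << name unless @enabled.include?(name)
    end
    self
  end

  # Enable every pattern in a named group.
  def redact_group(group)
    names = PATTERN_GROUPS.fetch(group) do
      raise ConfigurationError, "unknown group: #{group.inspect}"
    end
    redact(*names)
  end

  # Register a custom regex with its replacement text.
  def redact_pattern(regex, replacement)
    @custom << Pattern.new(:custom, regex, replacement)
    self
  end

  # Built-in patterns in the order they were enabled, then custom ones.
  def active_patterns
    @enabled.map { |name| PATTERNS.fetch(name) } + @custom
  end
end
```

Raising inside `redact` rather than at redaction time means a typo such as `config.redact :emial` surfaces during boot instead of silently leaving a field unredacted.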
Update RedactionPipeline to support both interface styles:

- New Pattern-based: call(text, audit:, field_path:) returns [text, audit]
- Legacy lambda: call(text) returns text

The pipeline auto-detects which interface to use:

- Pattern objects are detected by class check
- Lambdas with `audit:` keyword param bypass wrapping
- Legacy single-arg lambdas get wrapped with audit tracking

Also includes:

- NormalizedInteraction extended with actor_type, actor_id, actor_gid, redaction_audit fields (T4 dependencies)
- Patterns applied first, then custom_redactors
- RedactionAudit populated on result.redaction_audit

Backwards compatible with existing custom_redactors configuration.
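One way the interface auto-detection could work is sketched below, using `Proc#parameters` to spot an `audit:` keyword. This is an illustration under assumptions: the audit is modeled as a plain array of hashes rather than the real RedactionAudit class, the `Pattern` struct is a stand-in, and audit-aware lambdas are assumed to accept `field_path:` as well.

```ruby
# Stand-in Pattern implementing the new interface: returns [text, audit].
Pattern = Struct.new(:name, :regex, :replacement) do
  def call(text, audit:, field_path:)
    redacted = text.gsub(regex) do
      audit << { pattern: name, field_path: field_path }
      replacement
    end
    [redacted, audit]
  end
end

class RedactionPipeline
  def initialize(patterns: [], custom_redactors: [])
    # Patterns are applied first, then custom redactors.
    @steps = patterns + custom_redactors
  end

  def call(text, field_path: nil)
    audit = []
    @steps.each do |step|
      text, audit = run_step(step, text, audit, field_path)
    end
    [text, audit]
  end

  private

  def run_step(step, text, audit, field_path)
    if step.is_a?(Pattern) || audit_aware?(step)
      # New-style interface: pass the audit through unchanged.
      step.call(text, audit: audit, field_path: field_path)
    else
      # Legacy single-argument lambda: wrap it and record whether it changed anything.
      result = step.call(text)
      audit << { pattern: :custom, field_path: field_path } if result != text
      [result, audit]
    end
  end

  # True when the callable declares an `audit:` keyword parameter.
  def audit_aware?(callable)
    callable.respond_to?(:parameters) &&
      callable.parameters.any? { |kind, name| %i[key keyreq].include?(kind) && name == :audit }
  end
end
```

Because legacy lambdas are wrapped rather than rejected, an existing `custom_redactors` list keeps working while still contributing entries to the audit.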
Critical security fix: Apply PII redaction BEFORE job enqueue to ensure raw PII never enters the job queue (Redis/Sidekiq/SQS).

Changes:

- record! now calls apply_redaction() before PersistInteractionJob.perform_later
- apply_redaction() serializes actor and runs RedactionPipeline inline
- RedactionAudit excluded from job payload (not ActiveJob serializable)
- PersistInteractionJob uses assign_actor() for serialized actor data
- Added comprehensive tests for redaction timing and actor serialization
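The ordering this commit enforces can be sketched as follows. Everything here is a simplified stand-in: `FakeQueue` replaces `PersistInteractionJob.perform_later`, `Recorder` and `DemoUser` are hypothetical names, and a single email regex stands in for the full RedactionPipeline. Only the sequence (redact, then serialize, then enqueue) reflects the change described above.

```ruby
# Stand-in for a job backend; records exactly what would reach the queue.
class FakeQueue
  attr_reader :payloads

  def initialize
    @payloads = []
  end

  def perform_later(payload)
    @payloads << payload
  end
end

# Stand-in actor without GlobalID support.
DemoUser = Struct.new(:id)

class Recorder
  # Placeholder regex for the example; the real pipeline has many patterns.
  EMAIL = /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/

  def initialize(queue)
    @queue = queue
  end

  def record!(prompt:, actor: nil)
    # 1. Redact inline, before anything is handed to the queue.
    redacted, audit = apply_redaction(prompt)

    # 2. Build a payload of plain values only; the audit stays out of it.
    payload = { "prompt" => redacted, "actor" => serialize_actor(actor) }

    # 3. Enqueue the already-redacted payload.
    @queue.perform_later(payload)
    audit
  end

  private

  def apply_redaction(text)
    audit = []
    redacted = text.gsub(EMAIL) do |match|
      audit << { pattern: :email, length: match.length }
      "[EMAIL]"
    end
    [redacted, audit]
  end

  def serialize_actor(actor)
    return nil if actor.nil?

    { "type" => actor.class.name, "id" => actor.id }
  end
end
```

Inspecting `FakeQueue#payloads` in a test is one way to assert that no raw PII is present at the moment of enqueue, which is the property the timing change is meant to guarantee.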
Summary
Critical security fix: Apply PII redaction BEFORE job enqueue to ensure raw PII never enters the job queue (Redis/Sidekiq/SQS).
- `record!` now calls `apply_redaction()` before `PersistInteractionJob.perform_later`
- `apply_redaction()` serializes actor and runs `RedactionPipeline` inline
- `RedactionAudit` excluded from job payload (not ActiveJob serializable)
- `PersistInteractionJob` uses `assign_actor()` for serialized actor data

Test plan
Closes #47