A powerful REST API built with Node.js and TypeScript for masking and unmasking documents. The API accepts file uploads, masks specified keywords, and provides secure recovery mechanisms.
- File-based processing: Upload/download files instead of string content
- Multiple format support: plain text (TXT), Markdown, JSON, and XML files
- Flexible keyword input: Space/comma separated with quoted phrases
- Secure recovery: UUID v4 keys without embedded keyword information
- Minimal storage: Redis database for keyword mappings only
- Case handling: Case-insensitive masking, uppercase restoration
Purpose: Upload document, mask keywords, return masked file with recovery key
Request:
- Method: POST
- Content-Type: multipart/form-data
- File Field: `document` (file to mask)
- Form Field: `keywords` (keyword string)
Example Keywords Input:
Hello world "Boston Red Sox", 'Pepperoni Pizza', 'Cheese Pizza', beer
Parsed Keywords:
- Hello
- world
- beer
- Boston Red Sox
- Pepperoni Pizza
- Cheese Pizza
Response:
- Content-Type: application/octet-stream (download file)
- Headers:
  - X-Recovery-Key: [UUID v4]
  - Content-Disposition: attachment; filename="masked_[original_filename]"
- Body: Masked document file (keywords → XXXXX)
Purpose: Upload masked document, restore original content using recovery key
Request:
- Method: POST
- Content-Type: multipart/form-data
- File Field: `maskedDocument` (masked file)
- Form Field: `recoveryKey` (UUID v4 from masking)
Response:
- Content-Type: application/octet-stream (download file)
- Headers:
  - Content-Disposition: attachment; filename="original_[original_filename]"
- Body: Restored original document (XXXXX → original keywords in UPPERCASE)
Key Structure: keyword_map:{recovery_key}
Value Format: JSON array of mappings
[
[1, "Hello"],
[1, "world"],
[3, "Boston Red Sox"],
[5, "Pepperoni Pizza"],
[7, "Cheese Pizza"],
[10, "beer"]
]
Array Format: [lineNumber, originalText]
- Index 0: Line number (integer)
- Index 1: Original keyword text (string)
TTL: Optional expiration (keys are deleted automatically by the Redis instance after 1 year)
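
The key structure and value format above can be sketched as a few small helpers. This is a minimal illustration, not the project's actual storage code: `mappingKey`, `serializeMappings`, `parseMappings`, and `ONE_YEAR_SECONDS` are hypothetical names, and the Redis call itself (a `SET` with an expiry) is deliberately left out so the sketch runs without a server.

```typescript
// One stored entry: [lineNumber, originalText], as described in the spec.
type Mapping = [lineNumber: number, originalText: string];

// Assumed TTL value for the 1-year expiration (365 days, in seconds).
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60;

// Build the Redis key for a recovery key: keyword_map:{recovery_key}
function mappingKey(recoveryKey: string): string {
  return `keyword_map:${recoveryKey}`;
}

// Serialize the mappings into the JSON array stored as the Redis value.
function serializeMappings(mappings: Mapping[]): string {
  return JSON.stringify(mappings);
}

// Parse a stored value back into mappings, rejecting malformed entries.
function parseMappings(raw: string): Mapping[] {
  const data: unknown = JSON.parse(raw);
  if (!Array.isArray(data)) {
    throw new Error("Expected a JSON array of mappings");
  }
  return data.map((entry): Mapping => {
    const valid =
      Array.isArray(entry) &&
      entry.length === 2 &&
      Number.isInteger(entry[0]) &&
      typeof entry[1] === "string";
    if (!valid) {
      throw new Error("Expected [lineNumber, originalText]");
    }
    return [entry[0], entry[1]];
  });
}
```

Validating on read keeps a corrupted or hand-edited value from reaching the restoration step.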
- Split by spaces and commas
- Extract quoted phrases (single and double quotes)
- Case-insensitive matching
- Preserve quoted phrases as single keywords
- Nested single or double quotes are not allowed
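
The parsing rules above can be sketched in two passes: extract the quoted phrases first, then split the remainder on spaces and commas. This is an illustrative sketch; `parseKeywords` is a hypothetical name, and it assumes quotes are never nested and that keywords contain no apostrophes (a word like `don't` would be misread as the start of a quoted phrase).

```typescript
// Hypothetical helper: turns the raw `keywords` form field into a keyword list.
function parseKeywords(input: string): string[] {
  const quoted: string[] = [];

  // Pass 1: pull out "double-quoted" and 'single-quoted' phrases.
  const remainder = input.replace(
    /"([^"]*)"|'([^']*)'/g,
    (_match: string, doubleQuoted?: string, singleQuoted?: string) => {
      const phrase = (doubleQuoted ?? singleQuoted ?? "").trim();
      if (phrase.length > 0) {
        quoted.push(phrase);
      }
      return " "; // leave a separator where the phrase was
    }
  );

  // Pass 2: split what is left on whitespace and commas.
  const bare = remainder.split(/[\s,]+/).filter((word) => word.length > 0);

  // Bare words first, then quoted phrases, each group in input order.
  return [...bare, ...quoted];
}
```

For the example input `Hello world "Boston Red Sox", 'Pepperoni Pizza', 'Cheese Pizza', beer`, this returns the six keywords in the order listed under Parsed Keywords.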
- Read uploaded file line by line
- For each line, find keyword matches (case-insensitive)
- Replace matches with "XXXXX"
- Store mapping: `[lineNumber, originalText]` in Redis
- Write each masked line to the output file as soon as it is read, so the output file is written line by line while the input file is read line by line
- Return file with recovery key in headers
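
The per-line masking step above might look like the following sketch. The names `buildMatcher` and `maskLine` are hypothetical, and two behaviours are assumptions rather than part of the spec: keywords match as plain substrings (so `beer` also matches inside `beers`), and the stored `originalText` is the text as it appeared in the document rather than the keyword as typed.

```typescript
type Mapping = [lineNumber: number, originalText: string];

const MASK = "XXXXX";

// Escape regex metacharacters so keywords are matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build one case-insensitive matcher for all keywords.
// Longer keywords come first so a phrase wins over a word it contains.
function buildMatcher(keywords: string[]): RegExp | null {
  if (keywords.length === 0) {
    return null;
  }
  const alternatives = [...keywords]
    .sort((a, b) => b.length - a.length)
    .map(escapeRegExp);
  return new RegExp(alternatives.join("|"), "gi");
}

// Mask one line and record a [lineNumber, originalText] mapping per match,
// in left-to-right order so restoration can replay them in sequence.
function maskLine(
  line: string,
  lineNumber: number,
  matcher: RegExp | null,
  mappings: Mapping[]
): string {
  if (matcher === null) {
    return line;
  }
  return line.replace(matcher, (match: string) => {
    mappings.push([lineNumber, match]);
    return MASK;
  });
}
```

With the keywords `Hello`, `world`, and `Boston Red Sox`, the line `hello World from the Boston red sox` becomes `XXXXX XXXXX from the XXXXX`, and three mappings are recorded for that line.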
- Validate recovery key exists in Redis
- Read masked file line by line
- Count XXXXX occurrences per line
- Replace each XXXXX with original text from Redis (UPPERCASE)
- Write restored line to output file
- Return restored file
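
The restoration steps above can be sketched by grouping the stored mappings by line number and replaying them in order. `groupByLine` and `unmaskLine` are hypothetical names; the sketch assumes the mappings for a line were stored in left-to-right order, and it leaves a placeholder untouched when no mapping remains for it.

```typescript
type Mapping = [lineNumber: number, originalText: string];

// Index the stored mappings by line number, preserving their stored order.
function groupByLine(mappings: Mapping[]): Map<number, string[]> {
  const byLine = new Map<number, string[]>();
  for (const [lineNumber, originalText] of mappings) {
    const originals = byLine.get(lineNumber) ?? [];
    originals.push(originalText);
    byLine.set(lineNumber, originals);
  }
  return byLine;
}

// Replace each XXXXX on the line with the next stored original, in UPPERCASE.
function unmaskLine(
  line: string,
  lineNumber: number,
  byLine: Map<number, string[]>
): string {
  const originals = byLine.get(lineNumber) ?? [];
  let next = 0;
  return line.replace(/XXXXX/g, (placeholder: string) => {
    const original = originals[next++];
    // Assumption: surplus placeholders stay as they are.
    return original === undefined ? placeholder : original.toUpperCase();
  });
}
```

One limitation of this approach is that a literal `XXXXX` already present in the uploaded document cannot be told apart from a mask, so it would consume a mapping meant for a later placeholder on the same line.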
- UUID v4 (128-bit entropy)
- No keyword information embedded
- Secure random generation
- Format: `xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx`
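
A format check for incoming recovery keys could be as small as one regular expression. In this sketch, `isValidRecoveryKey` is a hypothetical name; the pattern follows the format shown above, where the third group starts with `4` (the version) and `y` is one of `8`, `9`, `a`, or `b` (the variant).

```typescript
// UUID v4: 8-4-4-4-12 hex digits, version nibble 4, variant nibble 8, 9, a, or b.
const UUID_V4_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

// Returns true when the value has the shape of a UUID v4 recovery key.
function isValidRecoveryKey(value: string): boolean {
  return UUID_V4_PATTERN.test(value);
}
```

Running this check before any Redis lookup lets a malformed key fail fast with a 400 response, while a well-formed but unknown key proceeds to the lookup and results in a 404.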
- Temporary file storage during processing (the /tmp folder is writable in the AWS Lambda environment)
- Read files line by line instead of loading them whole into memory
- Automatic cleanup of temp files
- Input validation and sanitization
- Redis with authentication
- Key with 1 year expiration
- No storage of original documents
- Minimal data footprint
- Runtime: Node.js version 24
- Language: TypeScript version 5
- Framework: Express.js version 5
- Cloud Provider: AWS Lambda (zip deployment)
- File Upload: Multer version 2
- Database: Redis with Aiven or redis.io service providers
- UUID Generation: uuidv4()
{
"express": "^5.x",
"multer": "^2.x",
"redis": "^5.x",
"uuid": "^13.x",
"typescript": "^5.x"
}
- Line-by-line processing for all formats
- Keyword matching preserves format structure
- Masking maintains original file formatting
- Restoration maintains original file structure
- 400 Bad Request: Invalid file format
- 400 Bad Request: Missing keywords
- 400 Bad Request: Invalid recovery key format
- 422 Unprocessable Entity: Unsupported file format
- 500 Internal Server Error: Processing failures
- 404 Not Found: Invalid recovery key
- Optional: Basic rate limiting for abuse prevention
- Configurable limits per IP address
- Set up project structure
- Implement file upload/download endpoints
- Basic keyword parsing logic
- File processing and masking
- Recovery key generation
- Redis connection and configuration
- Keyword mapping storage
- Recovery key validation
- Data cleanup and TTL
- Multiple file format support
- Enhanced error handling
- Logging and monitoring
- Performance optimization
- Security hardening
- Documentation
- Testing suite
- Deployment configuration
- Keyword parsing logic
- File processing functions
- Recovery key generation
- Database operations
- End-to-end masking workflow
- End-to-end unmasking workflow
- Error scenarios
- File format compatibility (binary files and any other non-plain-text formats are not allowed)
- Large file processing
- Concurrent requests
- Memory usage optimization
REDIS_URL="redis://default:$REDIS_PARIS_PASSWORD@redis-11686.crce282.eu-west-3-1.ec2.cloud.redislabs.com:11686"
PORT=3000
TEMP_DIR=/tmp
FILE_SIZE_LIMIT=5MB # same as the AWS Lambda limits where this API will be deployed
- Multi-stage build for production
- Redis service dependency
- Health check endpoints
- Environment-based configuration
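
As an illustration of environment-based configuration, the `FILE_SIZE_LIMIT=5MB` setting has to be converted to a byte count before it can be used as an upload limit (Multer, for instance, takes its `limits.fileSize` option in bytes). The sketch below is one possible way to do that; `parseSizeLimit` is a hypothetical name, and treating 1 MB as 1024 × 1024 bytes is an assumption.

```typescript
// Assumed binary multipliers for the supported unit suffixes.
const SIZE_UNITS: Record<string, number> = {
  B: 1,
  KB: 1024,
  MB: 1024 ** 2,
  GB: 1024 ** 3,
};

// Convert a value such as "5MB" into a number of bytes.
function parseSizeLimit(value: string): number {
  const match = /^(\d+(?:\.\d+)?)\s*(B|KB|MB|GB)$/i.exec(value.trim());
  if (match === null) {
    throw new Error(`Invalid size limit: ${value}`);
  }
  const amount = Number(match[1]);
  const unit = match[2].toUpperCase();
  return Math.round(amount * SIZE_UNITS[unit]);
}
```

Failing loudly on an unparseable value at startup is usually preferable to silently falling back to a default limit.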
- Request/response logging
- Error tracking
- Performance metrics
- Redis connection monitoring
curl -X POST \
https://some-id.lambda-url.eu-west-3.on.aws/mask \
-F "document=@document.txt" \
-F "keywords=Hello world \"Boston Red Sox\", 'Pepperoni Pizza', beer"

curl -X POST \
https://some-id.lambda-url.eu-west-3.on.aws/unmask \
-F "maskedDocument=@masked_document.txt" \
-F "recoveryKey=550e8400-e29b-41d4-a716-446655440000"
- ✅ File-based upload/download functionality
- ✅ Support for plain-text document formats (XML, TXT, MD, JSON, CSV, etc.)
- ✅ Flexible keyword input parsing
- ✅ Secure UUID v4 recovery keys
- ✅ Case-insensitive masking with uppercase restoration
- ✅ Minimal Redis database storage
- ✅ Proper error handling and validation
- ✅ Performance with large files
- ✅ Security best practices
- ✅ Comprehensive testing coverage