This document outlines the security considerations, threat model, and mitigation strategies for the Falcon ML Inference Platform.
Security Posture: Defense in depth with multiple layers of protection.
- ML Model - Proprietary intellectual property
- User Data - Inference requests (potentially sensitive)
- Infrastructure - Servers, containers, credentials
- Logs - May contain PII or business-sensitive data
- Metrics - System performance data
- External Attackers - Attempting unauthorized access
- Malicious Users - Abusing the API
- Insider Threats - Compromised credentials
- Supply Chain - Compromised dependencies
- API Abuse - DDoS, resource exhaustion
- Injection Attacks - SQL injection, command injection
- Data Exfiltration - Stealing models or data
- Credential Compromise - Stolen secrets
- Container Escape - Breaking out of containerization
Current Implementation:
# app/models/schemas.py
class InferenceRequest(BaseModel):
text: str = Field(
...,
min_length=1,
max_length=10000, # Prevent extremely large inputs
)
@validator("text")
def validate_text(cls, v: str) -> str:
if not v.strip():
raise ValueError("Text cannot be empty")
return vProtection Against:
- Oversized payloads
- Empty/malformed requests
- Type confusion attacks
Recommendations:
# Additional validation
- Sanitize HTML/script tags if user input is displayed
- Validate encoding (UTF-8 only)
- Check for null bytes
- Rate limit by content hash to prevent abuseCurrent Implementation:
# nginx/conf.d/falcon.conf
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=100r/s;
limit_req zone=api_limit burst=20 nodelay;
limit_conn_zone $binary_remote_addr zone=conn_limit:10m;
limit_conn conn_limit 10;Protection Against:
- DDoS attacks
- Resource exhaustion
- Brute force attempts
Recommendations:
Production Configuration:
- Implement token bucket algorithm
- Use API keys with per-key limits
- Implement exponential backoff for repeated violators
- Add CAPTCHA for suspicious patterns
- Monitor rate_limit_exceeded metric
Current State:
⚠️ No authentication in demo (intended for internal network)
Production Requirements:
# Add to app/api/routes.py
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
security = HTTPBearer()
@router.post("/infer")
async def infer(
credentials: HTTPAuthorizationCredentials = Depends(security),
):
# Validate JWT token
token = credentials.credentials
user = validate_jwt(token)
# ... rest of inference logicOptions:
- API Keys - Simple, for service-to-service
- JWT Tokens - For user authentication
- mTLS - For strong service authentication
- OAuth 2.0 - For third-party integration
Current State:
# .env file (NOT FOR PRODUCTION)
POSTGRES_PASSWORD=falcon_dev_password_change_in_prodProduction Requirements:
DO: ✅ Use secret management service:
- AWS Secrets Manager
- Google Secret Manager
- HashiCorp Vault
- Kubernetes Secrets (with encryption at rest)
✅ Rotate secrets regularly:
- Database passwords: 90 days
- API keys: 90 days
- Certificates: Before expiry
✅ Principle of least privilege:
- Workers: Read-only model access
- No root in containers
- Separate credentials per service
DON'T: ❌ Commit secrets to git ❌ Log secrets ❌ Store secrets in environment variables (use secret store) ❌ Hardcode secrets in code
Implementation:
# Use secret manager
import boto3
def get_secret(secret_name):
client = boto3.client('secretsmanager')
response = client.get_secret_value(SecretId=secret_name)
return json.loads(response['SecretString'])
# In config.py
if settings.environment == "production":
secrets = get_secret("falcon/prod/database")
settings.postgres_password = secrets['password']Current Implementation:
# docker-compose.yml
networks:
falcon-network:
driver: bridge # Isolated networkProduction Requirements:
┌──────────────────────────────────────┐
│ Public Internet │
└────────────┬─────────────────────────┘
│
┌──────▼──────┐
│ Firewall │ ← Only ports 80, 443
│ (UFW/SG) │
└──────┬──────┘
│
┌──────▼──────┐
│ DMZ Zone │
│ - Nginx │ ← Reverse proxy only
└──────┬──────┘
│
┌──────▼──────────┐
│ Application Zone │
│ - Workers │ ← No direct internet access
└──────┬──────────┘
│
┌──────▼──────────┐
│ Data Zone │
│ - Redis │ ← Strict ACLs
│ - Postgres │
└─────────────────┘
Firewall Rules:
# UFW example
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow 80/tcp # HTTP
sudo ufw allow 443/tcp # HTTPS
sudo ufw allow 22/tcp # SSH (restrict to bastion)
sudo ufw enableAWS Security Groups:
# ALB Security Group
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
# Worker Security Group
ingress {
from_port = 8000
to_port = 8000
protocol = "tcp"
security_groups = [aws_security_group.alb.id]
}
# Database Security Group
ingress {
from_port = 5432
to_port = 5432
protocol = "tcp"
security_groups = [aws_security_group.workers.id]
}Current:
# HTTP only (demo)
listen 80;Production:
# HTTPS with TLS 1.3
server {
listen 443 ssl http2;
ssl_protocols TLSv1.3 TLSv1.2;
ssl_ciphers 'TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384';
ssl_prefer_server_ciphers off;
ssl_certificate /etc/nginx/ssl/cert.pem;
ssl_certificate_key /etc/nginx/ssl/key.pem;
# Redirect HTTP to HTTPS
if ($scheme != "https") {
return 301 https://$host$request_uri;
}
}Postgres:
# Enable encryption
docker run -e POSTGRES_INITDB_ARGS="--data-checksums" \
-e PGDATA=/var/lib/postgresql/data/pgdata \
-v /encrypted/volume:/var/lib/postgresql/data \
postgres:15Redis:
# Use encrypted EBS/disk
# Redis doesn't encrypt at rest natively, rely on volume encryptionLogs:
# Redact sensitive data
import re
def redact_pii(log_message):
# Email addresses
log_message = re.sub(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b',
'[REDACTED_EMAIL]', log_message)
# Credit cards
log_message = re.sub(r'\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b',
'[REDACTED_CC]', log_message)
return log_messageCurrent Best Practices:
# Dockerfile
FROM python:3.11-slim # Minimal base image
# Don't run as root
RUN groupadd -r falcon && useradd -r -g falcon falcon
# ... install dependencies ...
USER falcon # Switch to non-root user
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
CMD curl -f http://localhost:8000/healthz || exit 1Additional Recommendations:
# Multi-stage build to reduce attack surface
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --user -r requirements.txt
FROM python:3.11-slim
COPY --from=builder /root/.local /root/.local
COPY app/ /app/
USER 1000:1000 # Numeric UID (more portable)Security Scanning:
# Scan for vulnerabilities
docker scan falcon-worker:latest
# Trivy
trivy image falcon-worker:latest
# Anchore
anchore-cli image add falcon-worker:latest
anchore-cli image vuln falcon-worker:latest allRuntime Security:
# Read-only root filesystem
docker run --read-only falcon-worker
# Drop capabilities
docker run --cap-drop=ALL --cap-add=NET_BIND_SERVICE falcon-worker
# Security profiles
docker run --security-opt=no-new-privileges:true falcon-workerSecurity Logging:
# Log security events
logger.warning(
"Rate limit exceeded",
extra={
"client_ip": client_ip,
"endpoint": "/infer",
"request_count": count,
"security_event": "rate_limit",
}
)
logger.error(
"Invalid authentication token",
extra={
"client_ip": client_ip,
"security_event": "auth_failure",
}
)Security Metrics:
from prometheus_client import Counter
auth_failures = Counter(
"auth_failures_total",
"Total authentication failures",
["reason"]
)
rate_limit_exceeded = Counter(
"rate_limit_exceeded_total",
"Rate limit exceeded events",
["endpoint"]
)Alerts:
# prometheus/alerts/security_alerts.yml
- alert: HighAuthFailureRate
expr: rate(auth_failures_total[5m]) > 10
for: 5m
labels:
severity: warning
annotations:
summary: "High authentication failure rate detected"
- alert: RateLimitAbuse
expr: rate(rate_limit_exceeded_total[5m]) > 50
for: 10m
labels:
severity: warning
annotations:
summary: "Potential API abuse detected"Current:
# requirements.txt with pinned versions
fastapi==0.109.0
uvicorn==0.27.0Security Practices:
# Check for vulnerabilities
pip-audit
# Or use safety
safety check -r requirements.txt
# Update dependencies regularly
pip list --outdatedAutomated Scanning:
# .github/workflows/security.yml
- name: Run pip-audit
run: |
pip install pip-audit
pip-audit -r requirements.txtSupply Chain Security:
# Verify package hashes
pip install --require-hashes -r requirements.txt
# Use private PyPI mirror
pip install --index-url https://private-pypi.company.com/simpleSecurity Incident Playbook:
-
Detection
- Monitor security alerts
- Review logs for anomalies
- User reports
-
Containment
# Isolate affected container docker network disconnect falcon-network falcon-worker-1 # Block malicious IP at firewall sudo ufw deny from <ip_address> # Rotate credentials if compromised
-
Investigation
# Collect logs docker logs falcon-worker-1 > incident-logs.json # Check database for unauthorized access SELECT * FROM inference_logs WHERE created_at > 'incident-time' ORDER BY created_at;
-
Recovery
- Patch vulnerabilities
- Restore from clean backup
- Rotate all credentials
- Update firewall rules
-
Post-Incident
- Write postmortem (see POSTMORTEM_TEMPLATE.md)
- Update security controls
- Add detection for similar incidents
- Enable HTTPS/TLS with valid certificates
- Implement authentication (API keys/JWT)
- Move secrets to secret manager
- Enable firewall with minimal ports
- Set up network segmentation
- Enable database encryption at rest
- Configure security headers (CSP, HSTS, etc.)
- Set up security monitoring/alerts
- Scan containers for vulnerabilities
- Run SAST/DAST security scans
- Perform penetration testing
- Review and sign off security checklist
- Monitor security alerts daily
- Review access logs weekly
- Update dependencies monthly
- Rotate credentials quarterly
- Conduct security audits quarterly
- Test incident response annually
- Review and update security docs
Add to Nginx:
# Security headers
add_header X-Frame-Options "SAMEORIGIN" always;
add_header X-Content-Type-Options "nosniff" always;
add_header X-XSS-Protection "1; mode=block" always;
add_header Referrer-Policy "no-referrer-when-downgrade" always;
add_header Content-Security-Policy "default-src 'self'; script-src 'self'; style-src 'self' 'unsafe-inline';" always;
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
# Remove version disclosure
server_tokens off;Last Updated: 2026-02-12
Next Review: 2026-05-12
Owner: Security Team