Security Documentation

Overview

The Code Interpreter API implements multiple layers of security to ensure safe code execution and protect against common web application vulnerabilities.

Authentication

API Key Authentication

All API endpoints (except health checks and documentation) require authentication using an API key.

Providing API Key

The API key can be provided in two ways:

x-api-key header (recommended):

curl -H "x-api-key: your-api-key" https://api.example.com/sessions

Authorization header:

curl -H "Authorization: Bearer your-api-key" https://api.example.com/sessions

Configuration

Set the API key in your environment:

export API_KEY="your-secure-api-key-here"

Or in your .env file:

API_KEY=your-secure-api-key-here

Important: Use a strong, randomly generated API key in production.

Rate Limiting

The API implements rate limiting to prevent abuse:

Authentication failures: Max 10 failed attempts per IP per hour
API key validation: Results are cached for 5 minutes to improve performance
Request rate limiting: Additional rate limiting can be configured per endpoint

Security Middleware

Security Headers

All responses include security headers:

X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
Strict-Transport-Security: max-age=31536000; includeSubDomains
Content-Security-Policy: default-src 'self' (varies by endpoint)
Referrer-Policy: strict-origin-when-cross-origin
Permissions-Policy: geolocation=(), microphone=(), camera=()

Request Validation

Content-Type validation: Only allowed content types are accepted
Request size limits: Configurable maximum request size
Input sanitization: All user inputs are validated and sanitized

File Security

Filename Validation

Uploaded files are validated for:

Path traversal prevention: ../ and \ characters are blocked
Null byte injection: Null bytes in filenames are rejected
File extension whitelist: Only allowed file extensions are accepted
Filename length limits: Maximum 255 characters
Suspicious characters: Special characters that could be dangerous are blocked

Allowed File Extensions

.txt, .csv, .json, .xml, .yaml, .yml,
.py, .js, .ts, .go, .java, .c, .cpp, .h, .hpp,
.rs, .php, .rb, .r, .f90, .d,
.md, .rst, .html, .css,
.png, .jpg, .jpeg, .gif, .svg,
.pdf, .doc, .docx, .xls, .xlsx

Code Execution Security

Code Validation

Code is analyzed for potentially dangerous patterns:

System imports: import os, import subprocess, etc.
Dangerous functions: eval(), exec(), __import__(), etc.
File operations: open(), file(), etc.
Input functions: input(), raw_input(), etc.

Note: Dangerous patterns generate warnings but don't block execution, as the code runs in isolated Kubernetes pods.

Pod Isolation

Kubernetes pods: All code runs in isolated Kubernetes pods
Resource limits: Memory and CPU limits enforced via Kubernetes
Network isolation: NetworkPolicy denies all egress by default
Security context: Pods run as non-root (runAsUser: 65532)
Ephemeral execution: Pods destroyed immediately after execution

Execution Pod Security Model

Each execution pod runs a single container with an embedded Go runner binary. The runner executes code via subprocess within the same container. This design requires zero elevated privileges.

Pod Settings:

spec:
  containers:
  - name: execution
    securityContext:
      runAsUser: 65532
      runAsNonRoot: true
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]

Why This Is Secure:

Zero elevated privileges: No SYS_PTRACE, SYS_ADMIN, SYS_CHROOT capabilities. No allowPrivilegeEscalation. No shareProcessNamespace.
Single cgroup: User code runs in the same container as the runner, so Kubernetes resource limits (CPU, memory) apply directly to user code with no cgroup mismatch.
Non-root execution: The container runs as non-root (runAsUser: 65532).
No host namespace access: No capabilities that could be used to access host processes or namespaces.
Hardened runtime compatible: The zero-privilege model is compatible with any hardened Kubernetes runtime (gVisor, Kata Containers, etc.).

Pod Hardening (Host Info Protection)

Execution pods are hardened to prevent information leakage about the host infrastructure. This prevents reconnaissance attacks that could reveal details about your cloud provider, Kubernetes cluster, or internal network configuration.

Currently Implemented:

Feature	Protection
Generic hostname	All pods use hostname "sandbox" instead of revealing node info
Non-root execution	Pods run as `runAsUser: 65532` with `runAsNonRoot: true`
Network policies	Egress denied by default, blocks cloud metadata endpoints
Public DNS only	Execution pods use only public DNS (8.8.8.8, 1.1.1.1)

Note: Kernel version (/proc/version) and CPU/memory info (/proc/cpuinfo, /proc/meminfo) remain accessible because many libraries depend on them. The pod security context and network policies address the primary concern of revealing cloud provider and internal network details.

Network Isolation

Execution pods are isolated via Kubernetes NetworkPolicy:

Configuration

# In helm values.yaml
execution:
  networkPolicy:
    enabled: true      # Enable NetworkPolicy enforcement
    denyEgress: true   # Block all egress (default: true)

Network Modes

Full Isolation (default): denyEgress: true
- Blocks all outbound connections
- Pods cannot access internet or cluster services
- Maximum security for untrusted code
Selective Egress: denyEgress: false
- Allows DNS (UDP 53) and HTTPS (TCP 443/80)
- Enables package downloads (pip, npm, etc.)
- Note: Does not block private IP ranges

Security Considerations

NetworkPolicy Required: Your Kubernetes cluster must have a CNI that supports NetworkPolicy (Calico, Cilium, etc.).
Default Deny: All egress is blocked by default for maximum security.
No Inter-Pod Communication: NetworkPolicy denies all ingress from other pods.

State Persistence Security

Python state persistence introduces additional security considerations:

Serialization Security

Serialization inside pods: State is serialized within the isolated execution pod, not on the API server. The API never unpickles user data.
cloudpickle usage: We use cloudpickle for serialization. While pickle-based formats can execute code during deserialization, this only occurs inside the sandboxed pod.
Compression: State is compressed with lz4 before storage, providing minor obfuscation and reducing attack surface.
Base64 encoding: Final storage uses base64 encoding for safe transport.

Storage Security

Redis encryption: Consider enabling Redis TLS in production for encrypted state storage
MinIO encryption: Enable server-side encryption for archived states
TTL-based cleanup: States automatically expire (2 hours in Redis, 7 days in MinIO archives)
Size limits: STATE_MAX_SIZE_MB prevents denial-of-service via large states

Session Isolation

Session binding: State is bound to session_id, not directly accessible by other sessions
User scoping: Sessions are scoped by user_id and entity_id
No cross-session access: One user's session cannot access another user's state

Disabling State Persistence

If state persistence poses unacceptable risk for your use case:

STATE_PERSISTENCE_ENABLED=false

This ensures each execution starts with a clean namespace.

Audit Events

State persistence operations are logged:

State save (size, session_id)
State load (session_id, source: redis/minio)
State archive (session_id)
State size limit exceeded (warning)

Security Monitoring

Audit Logging

All security-relevant events are logged:

Authentication attempts: Success and failure
File operations: Upload, download, delete
Code execution: Language, warnings, success/failure
Rate limiting: When limits are exceeded

Log Format

{
  "event_type": "authentication",
  "success": true,
  "api_key_prefix": "abc123...",
  "client_ip": "192.168.1.100",
  "endpoint": "GET /sessions",
  "timestamp": "2024-01-15T10:30:00Z"
}

Monitoring Endpoints

Authentication stats: Get authentication failure statistics
Rate limit status: Check current rate limit status
Security events: Query recent security events

Configuration

Environment Variables

# API Key (required)
API_KEY=your-secure-api-key

# Resource Limits
MAX_EXECUTION_TIME=30          # seconds
MAX_MEMORY_MB=512             # megabytes
MAX_FILE_SIZE_MB=10           # megabytes per file
MAX_FILES_PER_SESSION=50      # files per session
MAX_OUTPUT_FILES=10           # output files per execution

# Redis for caching and rate limiting
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_PASSWORD=optional-password

Security Best Practices

Use strong API keys: Generate cryptographically secure random keys
Enable HTTPS: Always use HTTPS in production
Monitor logs: Regularly review security logs for suspicious activity
Update dependencies: Keep all dependencies up to date
Network isolation: Deploy in a private network when possible
Resource monitoring: Monitor resource usage and set appropriate limits

Incident Response

Authentication Failures

If you see repeated authentication failures:

Check the source IP in logs
Verify the API key is correct
Consider blocking suspicious IPs at the network level
Rotate API keys if compromise is suspected

Suspicious Code Execution

If dangerous code patterns are detected:

Review the code content in logs
Check the session and user context
Consider additional code validation rules
Monitor pod resource usage via Kubernetes metrics

File Upload Issues

For suspicious file uploads:

Check filename validation logs
Review file content if necessary
Verify file size and type restrictions
Monitor storage usage

Security Updates

This security documentation should be reviewed and updated regularly as new threats emerge and security measures are enhanced.

FilesExpand file tree

SECURITY.md

Latest commit

History

SECURITY.md

File metadata and controls

Security Documentation

Overview

Authentication

API Key Authentication

Providing API Key

Configuration

Rate Limiting

Security Middleware

Security Headers

Request Validation

File Security

Filename Validation

Allowed File Extensions

Code Execution Security

Code Validation

Pod Isolation

Execution Pod Security Model

Pod Hardening (Host Info Protection)

Network Isolation

Configuration

Network Modes

Security Considerations

State Persistence Security

Serialization Security

Storage Security

Session Isolation

Disabling State Persistence

Audit Events

Security Monitoring

Audit Logging

Log Format

Monitoring Endpoints

Configuration

Environment Variables

Security Best Practices

Incident Response

Authentication Failures

Suspicious Code Execution

File Upload Issues

Security Updates