This module implements Azure Blob Storage as the document repository for TrustGuard. All uploaded documents (PDFs, images, audio, etc.) are stored here and remain the source of truth.
User Upload
↓
FastAPI (routes/documents.py)
↓
BlobStorageService (services/blob_storage.py)
↓
Azure Blob Storage Containers:
- trustguard-documents (raw uploads)
- trustguard-chunks (processed text chunks)
- trustguard-temp (temporary processing files)
File: infrastructure/main.bicep
Creates:
- Storage Account (Standard_LRS, Hot tier)
- 3 Containers:
  - trustguard-documents - Raw uploads
  - trustguard-chunks - Processed chunks
  - trustguard-temp - Temp files (auto-deleted after 7 days)
- Lifecycle policy for auto-cleanup
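For orientation, the auto-cleanup rule for trustguard-temp could look roughly like the structure below. This is a sketch of an Azure lifecycle management policy written as a Python dictionary, not the actual contents of main.bicep. The rule name and the blockBlob filter are assumptions; only the container name and the 7-day window come from this module.

```python
import json

TEMP_CONTAINER = "trustguard-temp"  # container name from this module
RETENTION_DAYS = 7                  # auto-delete window from this module


def build_temp_cleanup_policy(container=TEMP_CONTAINER, days=RETENTION_DAYS):
    """Build a lifecycle policy that deletes blobs in `container` after `days`."""
    return {
        "rules": [
            {
                "enabled": True,
                "name": "delete-temp-files",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    "filters": {
                        "blobTypes": ["blockBlob"],  # assumption: block blobs only
                        # A prefix starts with the container name; the trailing
                        # slash limits the rule to blobs inside that container.
                        "prefixMatch": [f"{container}/"],
                    },
                    "actions": {
                        "baseBlob": {
                            "delete": {"daysAfterModificationGreaterThan": days}
                        }
                    },
                },
            }
        ]
    }


if __name__ == "__main__":
    # Print the policy as JSON so it can be compared with the deployed rule.
    print(json.dumps(build_temp_cleanup_policy(), indent=2))
```

Keeping the retention window in one constant makes it easy to check that the documentation and the deployed rule agree.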
# Deploy
cd infrastructure
./deploy.sh trustguard-rg eastus
File: backend/services/blob_storage.py
Core functionality:
- upload_blob() - Upload files with metadata
- download_blob() - Retrieve file content
- list_blobs() - List documents with filtering
- delete_blob() - Remove documents
- generate_blob_sas_url() - Create temporary download links
- generate_container_sas_url() - Container-level SAS
- get_blob_properties() - Retrieve metadata
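As an illustration of the upload path, here is a minimal sketch of how a file might be written to a container with the azure-storage-blob SDK. It is not the code in blob_storage.py: the helper names (make_blob_name, guess_content_type, upload_document) are hypothetical, and the folder/filename naming scheme is inferred from the blob names shown in the API responses.

```python
import mimetypes


def make_blob_name(folder, filename):
    """Join folder and filename into a blob name such as 'documents/claim_123.pdf'."""
    return f"{folder.strip('/')}/{filename}"


def guess_content_type(filename):
    """Guess a MIME type from the file extension, with a generic fallback."""
    content_type, _ = mimetypes.guess_type(filename)
    return content_type or "application/octet-stream"


def upload_document(blob_service_client, container_name, folder, filename,
                    data, metadata=None):
    """Upload `data` and return the blob URL.

    `blob_service_client` is expected to be an azure.storage.blob
    BlobServiceClient. The SDK is imported lazily so the pure helpers
    above can be used without it installed.
    """
    from azure.storage.blob import ContentSettings

    blob_client = blob_service_client.get_blob_client(
        container=container_name,
        blob=make_blob_name(folder, filename),
    )
    blob_client.upload_blob(
        data,
        overwrite=True,  # assumption: re-uploading a file replaces the old blob
        metadata=metadata,
        content_settings=ContentSettings(
            content_type=guess_content_type(filename)
        ),
    )
    return blob_client.url
```

Setting the content type at upload time lets browsers handle the file correctly when it is later fetched through a temporary download link.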
Authentication: Uses either:
- Managed Identity (production) - No credentials needed
- Storage Account Key (development)
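The choice between the two authentication modes can be sketched as follows. This is an assumption about how the service might decide, not its actual implementation: it treats an empty STORAGE_ACCOUNT_KEY as the signal to use Managed Identity (via DefaultAzureCredential from azure-identity), and the function names are hypothetical.

```python
import os


def storage_account_url(account_name):
    """Return the blob endpoint URL for a storage account."""
    return f"https://{account_name}.blob.core.windows.net"


def select_auth_mode(env):
    """Return 'account_key' if STORAGE_ACCOUNT_KEY is set, else 'managed_identity'."""
    key = env.get("STORAGE_ACCOUNT_KEY", "")
    return "account_key" if key.strip() else "managed_identity"


def create_blob_service_client(env=None):
    """Create a BlobServiceClient using whichever credential is configured.

    The Azure SDK packages are imported lazily so the helpers above can be
    tested without them installed.
    """
    env = os.environ if env is None else env
    from azure.storage.blob import BlobServiceClient

    account_url = storage_account_url(env["STORAGE_ACCOUNT_NAME"])
    if select_auth_mode(env) == "account_key":
        # Development: authenticate with the shared account key.
        return BlobServiceClient(
            account_url=account_url,
            credential=env["STORAGE_ACCOUNT_KEY"],
        )

    # Production: no secret in configuration; DefaultAzureCredential picks up
    # the Managed Identity assigned to the hosting resource.
    from azure.identity import DefaultAzureCredential

    return BlobServiceClient(
        account_url=account_url,
        credential=DefaultAzureCredential(),
    )
```

Keeping the decision in one small function means the rest of the service never needs to know which credential type is in use.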
File: backend/routes/documents.py
Endpoints:
- POST /api/v1/documents/upload - Upload document
- GET /api/v1/documents/list - List documents
- GET /api/v1/documents/download/{folder}/{filename} - Download
- POST /api/v1/documents/sas-url/{folder}/{filename} - Generate temp URL
- DELETE /api/v1/documents/{folder}/{filename} - Delete document
- GET /api/v1/documents/{folder}/{filename}/properties - Get metadata
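For callers that prefer Python over curl, the endpoint paths above can be assembled with a few small helpers. This is a sketch using only the standard library; the helper names are hypothetical, the base URL assumes the local development server, and the shape of the list response is not assumed beyond being JSON.

```python
import json
import urllib.request
from urllib.parse import quote, urlencode

# Assumption: FastAPI is running locally on the default uvicorn port.
BASE_URL = "http://localhost:8000/api/v1/documents"


def list_url(folder=None):
    """URL for GET /list, optionally filtered by folder."""
    query = f"?{urlencode({'folder': folder})}" if folder else ""
    return f"{BASE_URL}/list{query}"


def download_url(folder, filename):
    """URL for GET /download/{folder}/{filename}."""
    return f"{BASE_URL}/download/{quote(folder)}/{quote(filename)}"


def sas_url_endpoint(folder, filename, expires_in_hours=24):
    """URL for POST /sas-url/{folder}/{filename}."""
    query = urlencode({"expires_in_hours": expires_in_hours})
    return f"{BASE_URL}/sas-url/{quote(folder)}/{quote(filename)}?{query}"


def properties_url(folder, filename):
    """URL for GET /{folder}/{filename}/properties."""
    return f"{BASE_URL}/{quote(folder)}/{quote(filename)}/properties"


def list_documents(folder=None):
    """Call the list endpoint and return the parsed JSON body."""
    with urllib.request.urlopen(list_url(folder)) as response:
        return json.load(response)
```

Percent-encoding the folder and filename keeps names with spaces or other special characters from breaking the path.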
curl -X POST "http://localhost:8000/api/v1/documents/upload?folder=documents" \
-F "file=@claim_123.pdf"
Response:
{
"blob_name": "documents/claim_123.pdf",
"container_name": "trustguard-documents",
"blob_uri": "https://trustguardstg.blob.core.windows.net/trustguard-documents/documents/claim_123.pdf",
"size": 1024576,
"created": "2025-11-23T10:30:00+00:00"
}
curl "http://localhost:8000/api/v1/documents/list?folder=documents"
curl -X POST "http://localhost:8000/api/v1/documents/sas-url/documents/claim_123.pdf?expires_in_hours=24"
Response:
{
"blob_name": "documents/claim_123.pdf",
"sas_url": "https://trustguardstg.blob.core.windows.net/trustguard-documents/documents/claim_123.pdf?sv=2023-01-01&...",
"expires_in_hours": 24
}
✅ What we've implemented:
- No public blob access (allowBlobPublicAccess: false)
- HTTPS-only (supportsHttpsTrafficOnly: true)
- Managed Identity for production
- SAS tokens for temporary access
- Encryption at rest (default)
- RBAC roles (Storage Blob Data Contributor)
- Network restrictions (Firewall, VNets)
- Soft delete policy
- Private endpoints
- Audit logging to Log Analytics
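The "SAS tokens for temporary access" item in the list above can be illustrated with a short sketch. It assumes the account-key path and uses generate_blob_sas from azure-storage-blob. The function names here are hypothetical and are not the service's generate_blob_sas_url().

```python
from datetime import datetime, timedelta, timezone


def sas_expiry(expires_in_hours, now=None):
    """Return the UTC expiry time for a token valid for `expires_in_hours`."""
    now = now or datetime.now(timezone.utc)
    return now + timedelta(hours=expires_in_hours)


def build_blob_sas_url(account_name, container_name, blob_name, sas_token):
    """Append a SAS token to the blob's HTTPS URL."""
    return (
        f"https://{account_name}.blob.core.windows.net/"
        f"{container_name}/{blob_name}?{sas_token}"
    )


def generate_read_sas_url(account_name, account_key, container_name,
                          blob_name, expires_in_hours=24):
    """Create a read-only, time-limited download URL for one blob.

    The SDK is imported lazily so the helpers above work without it.
    """
    from azure.storage.blob import BlobSasPermissions, generate_blob_sas

    token = generate_blob_sas(
        account_name=account_name,
        container_name=container_name,
        blob_name=blob_name,
        account_key=account_key,
        permission=BlobSasPermissions(read=True),  # read-only access
        expiry=sas_expiry(expires_in_hours),
    )
    return build_blob_sas_url(account_name, container_name, blob_name, token)
```

With Managed Identity there is no account key to sign with. In that case the SDK can sign the token with a user delegation key obtained from BlobServiceClient.get_user_delegation_key() and passed to generate_blob_sas as user_delegation_key.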
# Storage Account
STORAGE_ACCOUNT_NAME=trustguardstg # Set by deployment
STORAGE_ACCOUNT_KEY= # Leave empty for Managed Identity
BLOB_CONTAINER_NAME=trustguard-documents
- Enable Managed Identity on Container Apps
- Assign role: "Storage Blob Data Contributor"
- Leave STORAGE_ACCOUNT_KEY empty in the environment
cd infrastructure
./deploy.sh trustguard-rg eastus
az group create --name trustguard-rg --location eastus
az deployment group create \
--resource-group trustguard-rg \
--template-file main.bicep \
--parameters parameters.json
# Install dependencies
cd backend
pip install -r ../requirements.txt
# Run FastAPI
uvicorn main:app --reload
# Upload test file
curl -X POST "http://localhost:8000/api/v1/documents/upload?folder=documents" \
-F "file=@sample.pdf"
# List documents
curl "http://localhost:8000/api/v1/documents/list"
Import the collection from docs/postman/trustguard-module1.postman_collection.json (we'll create this next).
Monthly estimate (Standard_LRS, Hot tier):
- Storage: $0.018 per GB per month
- Transactions: ~$0.0004 per 10K operations
- Example: 100 GB + 1M transactions = (100 × $0.018) + (100 × $0.0004) = ~$1.84/month at the rates above
This is well within Azure student credits!
Module 2 will implement Azure Functions to:
- Trigger on blob uploads
- Extract text using Document Intelligence
- Store chunks in the trustguard-chunks container
Learning Outcomes:
- ✅ Understand Azure Blob Storage architecture
- ✅ Implement storage service with Python SDK
- ✅ Create FastAPI endpoints for file operations
- ✅ Generate SAS tokens for temporary access
- ✅ Deploy infrastructure with Bicep