Merged
10 changes: 10 additions & 0 deletions .env.example
@@ -30,6 +30,16 @@ LOG_EACH_QUERY=false
RESERVATION_TTL_SECONDS=600
EMBEDDING_MODEL=text-embedding-3-small

+# ===== Ingestion Limits =====
+MAX_FILE_SIZE_BYTES=10485760
+MAX_PDF_PAGE_COUNT=10
+MIN_EXTRACTED_TEXT_CHARS=1
+MAX_BULK_UPLOAD_FILES=50
+EMBEDDING_BATCH_SIZE=32
+OPENAI_EMBEDDING_TIMEOUT_SECONDS=300
+INGEST_EXTRACT_JOB_TIMEOUT_SECONDS=900
+INGEST_INDEX_JOB_TIMEOUT_SECONDS=1800

# ===== Client (React) =====
VITE_API_URL=http://localhost:8000
VITE_SUPABASE_URL=https://your-project.supabase.co
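Reviewer note: the new ingestion variables are all positive integers, so the server could load and validate them in one place. The following is a minimal sketch, assuming a plain dataclass and `os.environ`; the names `IngestionLimits` and `load_ingestion_limits` are hypothetical and not taken from this PR. The defaults mirror the values in `.env.example` above.

```python
import os
from dataclasses import dataclass, fields
from typing import Mapping


@dataclass(frozen=True)
class IngestionLimits:
    """Hypothetical container; defaults mirror .env.example."""
    max_file_size_bytes: int = 10_485_760  # 10 MiB
    max_pdf_page_count: int = 10
    min_extracted_text_chars: int = 1
    max_bulk_upload_files: int = 50
    embedding_batch_size: int = 32
    openai_embedding_timeout_seconds: int = 300
    ingest_extract_job_timeout_seconds: int = 900
    ingest_index_job_timeout_seconds: int = 1800


def load_ingestion_limits(env: Mapping[str, str] = os.environ) -> IngestionLimits:
    """Read each limit from the environment, falling back to the default.

    The environment variable name is the upper-cased field name.
    Raises ValueError for non-integer or non-positive values.
    """
    defaults = IngestionLimits()
    values = {}
    for field in fields(IngestionLimits):
        raw = env.get(field.name.upper())
        value = int(raw) if raw not in (None, "") else getattr(defaults, field.name)
        if value <= 0:
            raise ValueError(f"{field.name.upper()} must be positive, got {value}")
        values[field.name] = value
    return IngestionLimits(**values)
```

Failing at startup on a zero or negative limit is a design choice in this sketch; the actual application config may handle bad values differently.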
7 changes: 4 additions & 3 deletions .github/workflows/deploy.yml
@@ -7,7 +7,7 @@ on:
description: 'Pull request number'
required: true
push:
-branches: [ main ]
+branches: [ main, develop ]
pull_request:
types: [opened, synchronize, reopened, ready_for_review]

@@ -37,10 +37,11 @@ jobs:
run: mkdir -p trivy-reports

- name: Run Trivy FS Scan
-uses: aquasecurity/trivy-action@0.24.0
+uses: aquasecurity/trivy-action@v0.36.0
with:
scan-type: 'fs'
scan-ref: '.'
+version: 'v0.70.0'
scanners: 'vuln,misconfig,secret,license'
ignore-unfixed: true
format: 'table'
@@ -95,4 +96,4 @@ jobs:
with:
name: bandit-report
path: bandit-report.html
-retention-days: 30
+retention-days: 30
27 changes: 24 additions & 3 deletions README.md
@@ -174,10 +174,11 @@ sequenceDiagram

What happens in practice:
- `upload-prepare` validates file size, content type, workspace limits, and idempotency.
+- `upload-prepare-batch` can prepare many small PDFs as one ingestion run and returns per-file results.
- The API stores a placeholder document record and returns a signed storage URL.
-- `upload-complete` confirms the object exists in storage and enqueues extraction.
+- `upload-complete` and `upload-complete-batch` confirm uploaded objects and enqueue extraction jobs.
- `ingest_extract` downloads the PDF and writes extracted page text into `document_pages`.
-- `ingest_index` chunks page text, generates embeddings, stores vectors, and marks the document ready.
+- `ingest_index` chunks page text, generates embeddings in batches, stores vectors, and marks the document ready.
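Reviewer note: the "generates embeddings in batches" step can be pictured as slicing the chunk texts into groups of `EMBEDDING_BATCH_SIZE` and making one embedding call per group. This is a minimal sketch under that assumption; `iter_batches`, `embed_in_batches`, and the `embed_batch` callable are hypothetical names, and the real `ingest_index` job may be structured differently.

```python
from typing import Callable, Iterator, List, Sequence


def iter_batches(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most batch_size items, preserving order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[Sequence[str]], List[List[float]]],
    batch_size: int = 32,  # mirrors EMBEDDING_BATCH_SIZE=32
) -> List[List[float]]:
    """Call embed_batch once per batch and return vectors in input order.

    embed_batch is an assumed stand-in for the embedding client call;
    it must return exactly one vector per input text.
    """
    vectors: List[List[float]] = []
    for batch in iter_batches(texts, batch_size):
        batch_vectors = embed_batch(batch)
        if len(batch_vectors) != len(batch):
            raise RuntimeError("embedding call returned a mismatched number of vectors")
        vectors.extend(batch_vectors)
    return vectors
```

With 70 chunks and the default batch size, this makes three calls of 32, 32, and 6 texts.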

Primary files:
- `server/app/api/documents.py`
@@ -284,7 +285,11 @@ enterprise-rag/
- `GET /documents/{document_id}`
- `GET /documents/{document_id}/pages/{page_number}`
- `POST /documents/upload-prepare`
+- `POST /documents/upload-prepare-batch`
- `POST /documents/upload-complete`
+- `POST /documents/upload-complete-batch`
+- `GET /documents/ingestion-runs/{run_id}`
+- `GET /documents/ingestion-queues`
- `POST /documents/{document_id}/retry`
- `POST /documents/{document_id}/reindex`
- `DELETE /documents/{document_id}`
@@ -314,6 +319,7 @@ enterprise-rag/
Core tables in the current implementation:

- `workspaces`: tenant root for all user content
+- `ingestion_runs`: batch upload and ingestion progress grouping
- `documents`: uploaded PDF metadata and pipeline status
- `document_pages`: extracted page text
- `chunks`: page-bounded text chunks used for retrieval
@@ -337,7 +343,9 @@ Current enforced limits from the application config and rate limiter:

- `1` workspace per user
- up to `100` documents per workspace
-- maximum file size: `20 MB`
+- maximum file size: `10 MB`
+- maximum PDF page count: `10`
+- maximum files per backend batch upload: `50`
- supported upload type: `application/pdf`
- maximum query length: `500` characters
- retrieval depth: `top_k = 5`
@@ -347,6 +355,10 @@ Current enforced limits from the application config and rate limiter:
- upload complete rate limit: `20` requests per minute per workspace
- query rate limit: `100` requests per minute per workspace

+Bulk ingestion endpoints have separate rate limits from single-file endpoints:
+- batch upload prepare: `5` requests per minute per workspace
+- batch upload complete: `10` requests per minute per workspace
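Reviewer note: the documented limits suggest what a batch prepare check might look like before any signed URLs are issued. The sketch below is illustrative only: `FileRequest`, `FileResult`, and `validate_batch` are hypothetical names, and rejecting an oversized batch as a whole is an assumption rather than confirmed behavior. Page count is not checked here because the PDF has not been uploaded at prepare time.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

MAX_FILE_SIZE_BYTES = 10_485_760   # 10 MB limit from the README
MAX_BULK_UPLOAD_FILES = 50         # maximum files per backend batch upload
ALLOWED_CONTENT_TYPE = "application/pdf"


@dataclass(frozen=True)
class FileRequest:
    filename: str
    content_type: str
    size_bytes: int


@dataclass(frozen=True)
class FileResult:
    filename: str
    accepted: bool
    reason: Optional[str] = None


def validate_batch(files: Sequence[FileRequest]) -> List[FileResult]:
    """Return one result per file; raise if the batch itself is invalid."""
    if not files:
        raise ValueError("batch must contain at least one file")
    if len(files) > MAX_BULK_UPLOAD_FILES:
        raise ValueError(f"batch exceeds {MAX_BULK_UPLOAD_FILES} files")

    results: List[FileResult] = []
    for f in files:
        if f.content_type != ALLOWED_CONTENT_TYPE:
            results.append(FileResult(f.filename, False, "unsupported content type"))
        elif f.size_bytes <= 0:
            results.append(FileResult(f.filename, False, "empty file"))
        elif f.size_bytes > MAX_FILE_SIZE_BYTES:
            results.append(FileResult(f.filename, False, "file too large"))
        else:
            results.append(FileResult(f.filename, True))
    return results
```

Returning per-file results instead of failing the whole request lets the client upload the accepted files and report the rejected ones, which matches the per-file results described for `upload-prepare-batch`.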

### Token Budget Model

The token budget is managed with reservation semantics so concurrent requests do not overspend the daily allowance.
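Reviewer note: reservation semantics can be illustrated with an in-memory model. A request reserves its estimated tokens up front, later commits the actual usage, and unclaimed reservations expire after a TTL. This is a sketch under stated assumptions: the `TokenBudget` class and its methods are hypothetical, the real implementation presumably persists state in the database, and daily reset is omitted. The defaults mirror `DAILY_TOKEN_LIMIT=100000` and `RESERVATION_TTL_SECONDS=600`.

```python
import threading
import time
import uuid
from typing import Callable, Dict, Optional, Tuple


class TokenBudget:
    """In-memory illustration of reserve/commit/release with expiring holds."""

    def __init__(
        self,
        daily_limit: int = 100_000,
        ttl_seconds: float = 600,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = daily_limit
        self._ttl = ttl_seconds
        self._clock = clock
        self._spent = 0
        # reservation id -> (reserved tokens, expiry timestamp)
        self._held: Dict[str, Tuple[int, float]] = {}
        self._lock = threading.Lock()

    def _purge_expired(self, now: float) -> None:
        for rid in [r for r, (_, exp) in self._held.items() if exp <= now]:
            del self._held[rid]

    def reserve(self, estimated_tokens: int) -> Optional[str]:
        """Hold tokens for a request; return None if the budget cannot cover it."""
        if estimated_tokens <= 0:
            raise ValueError("estimated_tokens must be positive")
        with self._lock:
            now = self._clock()
            self._purge_expired(now)
            held = sum(tokens for tokens, _ in self._held.values())
            if self._spent + held + estimated_tokens > self._limit:
                return None
            rid = uuid.uuid4().hex
            self._held[rid] = (estimated_tokens, now + self._ttl)
            return rid

    def commit(self, reservation_id: str, actual_tokens: int) -> None:
        """Replace the hold with the tokens actually consumed."""
        with self._lock:
            self._held.pop(reservation_id, None)
            self._spent += actual_tokens

    def release(self, reservation_id: str) -> None:
        """Drop a hold without spending, for example when a request fails."""
        with self._lock:
            self._held.pop(reservation_id, None)
```

Because the check and the hold happen under one lock, two concurrent requests cannot both pass the limit check against the same remaining allowance.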
@@ -487,6 +499,14 @@ DAILY_TOKEN_LIMIT=100000
RESERVATION_TTL_SECONDS=600
LOG_EACH_QUERY=false
EMBEDDING_MODEL=text-embedding-3-small
+MAX_FILE_SIZE_BYTES=10485760
+MAX_PDF_PAGE_COUNT=10
+MIN_EXTRACTED_TEXT_CHARS=1
+MAX_BULK_UPLOAD_FILES=50
+EMBEDDING_BATCH_SIZE=32
+OPENAI_EMBEDDING_TIMEOUT_SECONDS=300
+INGEST_EXTRACT_JOB_TIMEOUT_SECONDS=900
+INGEST_INDEX_JOB_TIMEOUT_SECONDS=1800
VITE_API_URL=http://localhost:8000
VITE_SUPABASE_URL=https://your-project.supabase.co
VITE_SUPABASE_ANON_KEY=your-anon-key
@@ -511,6 +531,7 @@ Embeddings are stored in `chunk_embeddings.embedding` using `pgvector`. Retrieva
- failed documents can be retried with `POST /documents/{document_id}/retry`
- already processed documents can be reindexed with `POST /documents/{document_id}/reindex`
- stale token reservations can be cleared by the maintenance job
+- RQ extraction/indexing jobs use explicit timeouts and a failure callback so timed-out jobs mark documents `failed` instead of leaving them stuck in processing
- document deletion removes metadata first, then attempts storage cleanup
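Reviewer note: the timeout-plus-callback behavior can be sketched as follows. RQ failure callbacks receive `job, connection, type, value, traceback`, and `Queue.enqueue` accepts `job_timeout` and `on_failure`. Everything else here is an assumption for illustration: the `DOCUMENT_STATUS` dictionary stands in for the `documents` table, `mark_document_failed` is a hypothetical name, and passing `document_id` as a keyword argument to the job is assumed. The block does not import RQ so it can run on its own.

```python
from types import SimpleNamespace
from typing import Any, Dict

# Hypothetical in-memory stand-in for the documents table.
DOCUMENT_STATUS: Dict[str, Dict[str, Any]] = {}


def mark_document_failed(job, connection, exc_type, exc_value, traceback) -> None:
    """Failure callback: record the error so the document leaves 'processing'.

    Matches the argument order RQ uses for on_failure callbacks.
    Assumes the job was enqueued with a document_id keyword argument.
    """
    document_id = job.kwargs.get("document_id")
    if document_id is None:
        return  # nothing to update if the job carries no document id
    record = DOCUMENT_STATUS.setdefault(document_id, {})
    record["status"] = "failed"
    record["error"] = f"{exc_type.__name__}: {exc_value}"


# Assumed wiring with a real RQ queue (shown as a comment, not executed here):
#   queue.enqueue(
#       ingest_extract,
#       document_id=doc_id,
#       job_timeout=900,                  # INGEST_EXTRACT_JOB_TIMEOUT_SECONDS
#       on_failure=mark_document_failed,
#   )

# Simulated invocation with a fake job object in place of an RQ Job.
DOCUMENT_STATUS["doc-1"] = {"status": "processing"}
fake_job = SimpleNamespace(kwargs={"document_id": "doc-1"})
mark_document_failed(fake_job, None, TimeoutError, TimeoutError("job exceeded 900s"), None)
```

A document marked `failed` this way becomes eligible for `POST /documents/{document_id}/retry` instead of staying in processing indefinitely.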

### Frontend Application Areas
22 changes: 16 additions & 6 deletions client/README.md
@@ -90,14 +90,19 @@ Primary files:
Upload UX is centered in `UploadPage` and `components/upload/UploadPanel`.

Flow:
-1. request `POST /documents/upload-prepare`
-2. upload the PDF directly to the signed URL
-3. notify the backend with `POST /documents/upload-complete`
-4. poll the document list while processing is active
-5. let the user delete documents from the table
+1. select and review up to 50 PDFs as one upload run
+2. request `POST /documents/upload-prepare-batch`
+3. upload accepted PDFs directly to their signed URLs
+4. notify the backend with `POST /documents/upload-complete-batch`
+5. poll documents, queue status, and the active ingestion run while processing is active
+6. separate active, failed, and ready documents for recovery and review

Current behavior:
-- shows live document status in a table
+- shows selected files before upload starts
+- shows accepted/rejected/processing/ready/failed counts for the current run
+- shows live document status grouped by workflow state
+- supports search, status filtering, and sorting for larger document sets
+- supports retrying failed documents individually or in a retryable batch
- refreshes every 4 seconds while documents are processing
- treats `ready` and `indexed` as successful ingest states

@@ -145,7 +150,12 @@ The client currently uses these backend APIs:
- `GET /documents/{document_id}/pages/{page_number}`
- `POST /documents/upload-prepare`
- `POST /documents/upload-complete`
+- `POST /documents/upload-prepare-batch`
+- `POST /documents/upload-complete-batch`
+- `GET /documents/ingestion-runs/{run_id}`
+- `GET /documents/ingestion-queues`
- `DELETE /documents/{document_id}`
+- `POST /documents/{document_id}/retry`
- `POST /query/stream`
- `GET /citations/{chunk_id}`
- `GET /queries`
2 changes: 1 addition & 1 deletion client/src/components/layout/AppShell.tsx
@@ -69,7 +69,7 @@ export default function AppShell() {
}

try {
-const nextDocuments = await apiGetDocuments(accessToken);
+const nextDocuments = await apiGetDocuments(accessToken, { limit: 100 });
setDocuments(nextDocuments);
} catch {
setDocuments([]);