feat: optimize document upload process and enhance memory management #849

Merged: MODSetter merged 2 commits into main from dev on Mar 1, 2026
@MODSetter (Owner) commented Mar 1, 2026

Description

Motivation and Context

FIX #

Screenshots

API Changes

  • This PR includes API changes

Change Type

  • Bug fix
  • New feature
  • Performance improvement
  • Refactoring
  • Documentation
  • Dependency/Build system
  • Breaking change
  • Other (specify):

Testing Performed

  • Tested locally
  • Manual/QA verification

Checklist

  • Follows project coding standards and conventions
  • Documentation updated as needed
  • Dependencies updated as needed
  • No lint/build errors or new warnings
  • All relevant tests are passing

High-level PR Summary

This PR optimizes memory management across the application by tuning garbage collection thresholds, disabling unnecessary LiteLLM logging and telemetry, and adding explicit cleanup of circular references in chat streaming. It also improves the document upload process by processing files concurrently with temporary file buffering to avoid blocking the event loop, implementing client-side batching (5 files per request) to prevent proxy timeouts, and consolidating Celery session maker creation into a single shared function. Additionally, database connection pooling is configured with production-ready settings, and enhanced performance monitoring tracks RSS memory deltas and garbage collection statistics.
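
The summary mentions tracking RSS memory deltas and garbage collection statistics but does not show how. A minimal Python sketch of that idea follows. `MemoryTracker`, `Snapshot`, and `read_rss_linux` are hypothetical names chosen for illustration, not identifiers taken from `surfsense_backend/app/utils/perf.py`, and the default RSS reader assumes a Linux `/proc` filesystem.

```python
import gc
import os
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


def read_rss_linux() -> int:
    """Return the current resident set size in bytes (Linux only, via /proc)."""
    with open("/proc/self/statm") as fh:
        # The second field of statm is the number of resident pages.
        resident_pages = int(fh.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE")


@dataclass
class Snapshot:
    rss_bytes: int
    rss_delta_bytes: int          # change since the previous snapshot
    gc_counts: Tuple[int, ...]    # current collection counts per generation
    gc_collections: List[int]     # completed collections per generation


class MemoryTracker:
    """Hypothetical helper: remembers the last RSS reading to report deltas."""

    def __init__(self, read_rss: Callable[[], int] = read_rss_linux) -> None:
        self._read_rss = read_rss
        self._last_rss: Optional[int] = None

    def snapshot(self) -> Snapshot:
        rss = self._read_rss()
        # The first snapshot has no baseline, so its delta is reported as 0.
        delta = 0 if self._last_rss is None else rss - self._last_rss
        self._last_rss = rss
        return Snapshot(
            rss_bytes=rss,
            rss_delta_bytes=delta,
            gc_counts=gc.get_count(),
            gc_collections=[gen["collections"] for gen in gc.get_stats()],
        )
```

Taking one snapshot before and one after a request would give the RSS change over that window, although allocator behaviour means a positive delta alone does not prove a leak.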

⏱️ Estimated Review Time: 30-90 minutes

💡 Review Order Suggestion
Order File Path
1 surfsense_backend/app/utils/perf.py
2 surfsense_backend/app/app.py
3 surfsense_backend/app/services/llm_service.py
4 surfsense_backend/app/services/llm_router_service.py
5 surfsense_backend/app/tasks/chat/stream_new_chat.py
6 surfsense_backend/app/db.py
7 surfsense_backend/app/tasks/celery_tasks/__init__.py
8 surfsense_backend/app/tasks/celery_tasks/document_tasks.py
9 surfsense_backend/app/tasks/celery_tasks/connector_tasks.py
10 surfsense_backend/app/tasks/celery_tasks/document_reindex_tasks.py
11 surfsense_backend/app/tasks/celery_tasks/podcast_tasks.py
12 surfsense_backend/app/tasks/celery_tasks/schedule_checker_task.py
13 surfsense_backend/app/tasks/celery_tasks/stale_notification_cleanup_task.py
14 surfsense_backend/app/routes/documents_routes.py
15 surfsense_web/lib/apis/documents-api.service.ts
16 surfsense_web/components/sources/DocumentUploadTab.tsx

- Increased maximum file upload limit from 10 to 50 to improve user experience.
- Implemented batch processing for document uploads to avoid proxy timeouts, splitting files into manageable chunks.
- Enhanced garbage collection in chat streaming functions to prevent memory leaks and improve performance.
- Added memory delta tracking in system snapshots for better monitoring of resource usage.
- Updated LLM router and service configurations to prevent unbounded internal accumulation and improve efficiency.
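
The batching described in the list above can be sketched as follows. The PR implements it client-side, in the TypeScript files named in the review order; this Python version only illustrates the splitting logic. The batch size of 5 comes from the PR summary and the limit of 50 from the list above, read here as a cap on the number of files, which is an assumption. `split_into_batches` and the constant names are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

MAX_FILES_PER_UPLOAD = 50   # assumed to be a cap on file count
FILES_PER_REQUEST = 5       # batch size stated in the PR summary


def split_into_batches(
    files: Sequence[T],
    batch_size: int = FILES_PER_REQUEST,
    max_files: int = MAX_FILES_PER_UPLOAD,
) -> List[List[T]]:
    """Split files into consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if len(files) > max_files:
        raise ValueError(f"at most {max_files} files may be uploaded at once")
    # Each slice becomes one upload request; the last batch may be shorter.
    return [list(files[i:i + batch_size]) for i in range(0, len(files), batch_size)]
```

Sending one request per batch keeps each request short enough to finish before a proxy timeout, at the cost of more round trips for large uploads.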

vercel bot commented Mar 1, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: surf-sense-frontend
Deployment: Building
Actions: Preview, Comment
Updated (UTC): Mar 1, 2026 1:26am

@MODSetter MODSetter merged commit 4298257 into main Mar 1, 2026
5 of 7 checks passed

@recurseml recurseml bot left a comment

Review by RecurseML

🔍 Review performed on f2f15f6..b08e8da

✨ No bugs found, your code is sparkling clean

✅ Files analyzed, no issues (15)

surfsense_backend/app/app.py
surfsense_backend/app/db.py
surfsense_backend/app/services/llm_router_service.py
surfsense_backend/app/services/llm_service.py
surfsense_backend/app/tasks/celery_tasks/__init__.py
surfsense_backend/app/tasks/celery_tasks/connector_tasks.py
surfsense_backend/app/tasks/celery_tasks/document_reindex_tasks.py
surfsense_backend/app/tasks/celery_tasks/document_tasks.py
surfsense_backend/app/tasks/celery_tasks/podcast_tasks.py
surfsense_backend/app/tasks/celery_tasks/schedule_checker_task.py
surfsense_backend/app/tasks/celery_tasks/stale_notification_cleanup_task.py
surfsense_backend/app/tasks/chat/stream_new_chat.py
surfsense_backend/app/utils/perf.py
surfsense_web/components/sources/DocumentUploadTab.tsx
surfsense_web/lib/apis/documents-api.service.ts

@recurseml recurseml bot left a comment

Review by RecurseML

🔍 Review performed on b08e8da..b08e8da

✨ No files to analyze
