Open
Description
Currently, models are loaded on the first request, which causes significant delays (30-60 seconds) for that request. Implement a comprehensive model preloading and caching system so that all AI4Bharat services respond quickly from the first request onward.
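As a starting point for discussion, here is a minimal sketch of the preload-and-cache idea. It is not the existing service code: `ModelRegistry`, its methods, and the zero-argument loader callables are hypothetical names introduced only for illustration, and the real loaders would wrap whatever each service uses to load its model.

```python
import threading
from typing import Any, Callable, Dict, Iterable


class ModelRegistry:
    """Caches loaded models so each one is loaded at most once (hypothetical helper)."""

    def __init__(self, loaders: Dict[str, Callable[[], Any]]) -> None:
        # name -> zero-argument function that loads and returns the model (assumption)
        self._loaders = loaders
        self._models: Dict[str, Any] = {}
        self._lock = threading.Lock()

    def get(self, name: str) -> Any:
        """Return the cached model, loading it first if it is not cached yet."""
        with self._lock:
            if name not in self._models:
                self._models[name] = self._loaders[name]()
            return self._models[name]

    def preload(self, names: Iterable[str]) -> None:
        """Load the given models eagerly, e.g. from an application startup hook."""
        for name in names:
            self.get(name)

    def is_loaded(self, name: str) -> bool:
        """Report whether a model is already in the cache."""
        with self._lock:
            return name in self._models
```

Calling `preload([...])` at startup moves the 30-60 second load out of the request path, and later `get` calls return the cached object. The single lock keeps the sketch simple but serializes loads; a per-model lock would let different models load in parallel.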
Acceptance Criteria
- Implement model preloading at application startup
- Add model caching mechanism to prevent reloading
- Add model warmup endpoints for manual preloading
- Implement model health checks and auto-reloading
- Add model memory management and cleanup
- Add configuration for model preloading behavior
- Add model loading progress indicators
- Add model versioning and updates
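For the memory management and health check items, one possible approach is a bounded cache that evicts the least recently used model and can report which models are loaded. This is only a sketch under assumptions: `LRUModelCache`, `max_models`, and `status` are hypothetical names, and capping by model count (rather than by measured memory) is a simplification chosen for illustration.

```python
from collections import OrderedDict
from typing import Any, Callable, Dict


class LRUModelCache:
    """Keeps at most `max_models` models in memory (hypothetical helper)."""

    def __init__(self, loaders: Dict[str, Callable[[], Any]], max_models: int = 2) -> None:
        if max_models < 1:
            raise ValueError("max_models must be at least 1")
        self._loaders = loaders            # name -> zero-argument loader (assumption)
        self._max_models = max_models
        self._models: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, name: str) -> Any:
        """Return a model, loading it if needed and evicting the oldest if over the cap."""
        if name in self._models:
            self._models.move_to_end(name)  # mark as most recently used
            return self._models[name]
        model = self._loaders[name]()
        self._models[name] = model
        while len(self._models) > self._max_models:
            self._models.popitem(last=False)  # drop the least recently used model
        return model

    def status(self) -> Dict[str, str]:
        """Per-model state, suitable as the payload of a health check endpoint."""
        return {
            name: "loaded" if name in self._models else "not_loaded"
            for name in self._loaders
        }
```

An evicted model is simply reloaded on its next `get`, which covers a basic form of auto-reloading. The sketch is not thread-safe; a real service would guard `get` with a lock and might also release accelerator memory explicitly when a model is evicted.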