-
Notifications
You must be signed in to change notification settings - Fork 548
Fix ONNX Session Memory Leak in Face Search API #1050
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: WHOIM1205 <rathourprateek8@gmail.com>
|
|
📝 WalkthroughWalkthroughThe change relocates face detector and face recognition model initialization from upfront creation to a guarded try-catch block, adds dedicated exception handling for initialization failures, and improves resource cleanup by checking None directly instead of inspecting locals. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@backend/app/utils/faceSearch.py`:
- Around line 105-110: The current broad exception handler returns "Failed to
initialize models" for any error in the entire try block; narrow the scope or
make the message generic: either wrap only the model initialization code in its
own try/except and keep the existing initialization-specific message there, and
use a separate try/except (or let exceptions propagate) around calls like
fn.get_embedding(), get_all_face_embeddings(), and the similarity processing
(the block handling cosine similarity and building GetAllImagesResponse), or
change the outer except to return a generic failure message (e.g., "Failed to
process face search") so errors from fn.get_embedding(),
get_all_face_embeddings(), or the similarity logic do not misreport as
initialization failures; reference GetAllImagesResponse, fn.get_embedding(),
get_all_face_embeddings(), and the similarity processing loop when applying the
fix.
|
Hi @rahulharpal1603 I’d appreciate a review, especially around:
Thanks in advance for your time |
Summary
This PR fixes a critical ONNX Runtime memory leak in the Face Search API that occurred when model initialization failed during request handling.
Previously, if
FaceDetector()was successfully created butFaceNet()failed during initialization, the allocated ONNX inference sessions were never released. This caused permanent CPU/GPU memory leaks and backend crashes under load.This change ensures all ONNX resources are always cleaned up, even when initialization or processing fails.
Problem Description
In
perform_face_search(), model resources were initialized outside thetryblock:FaceDetector()allocates multiple ONNX inference sessionsFaceNet()may throw (missing model file, GPU OOM, corrupted model, etc.)FaceNet()failed, execution never entered thetryblockfinallycleanup block was never executedONNX sessions allocate native memory outside Python GC, so the leak accumulated silently on every failed request.
Impact
This affects production deployments with:
Steps to Reproduce
Scenario 1: Missing FaceNet model