@WHOIM1205 WHOIM1205 commented Jan 18, 2026

Summary

This PR fixes a critical ONNX Runtime memory leak in the Face Search API that occurred when model initialization failed during request handling.

Previously, if FaceDetector() was created successfully but FaceNet() then failed during initialization, the ONNX inference sessions already allocated by FaceDetector() were never released. This caused permanent CPU/GPU memory leaks and backend crashes under load.

This change ensures all ONNX resources are always cleaned up, even when initialization or processing fails.


Problem Description

In perform_face_search(), model resources were initialized outside the try block:

  • FaceDetector() allocates multiple ONNX inference sessions
  • FaceNet() may throw (missing model file, GPU OOM, corrupted model, etc.)
  • If FaceNet() failed, execution never entered the try block
  • The finally cleanup block was never executed
  • ONNX Runtime sessions (native CPU/GPU memory) were leaked permanently

ONNX sessions allocate native memory outside Python GC, so the leak accumulated silently on every failed request.
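
The failure mode and the fix can be illustrated with a minimal sketch. It does not reproduce the actual code in backend/app/utils/faceSearch.py: `FakeSession`, the stand-in `FaceDetector` and `FaceNet` classes, and their `close()` methods are assumptions made for illustration, and a simple counter stands in for native ONNX Runtime memory.

```python
class FakeSession:
    """Hypothetical stand-in for a native ONNX Runtime session."""

    open_count = 0  # number of sessions currently allocated

    def __init__(self):
        FakeSession.open_count += 1
        self.closed = False

    def close(self):
        if not self.closed:
            self.closed = True
            FakeSession.open_count -= 1


class FaceDetector:
    """Stand-in detector: initialization succeeds and holds a session."""

    def __init__(self):
        self.session = FakeSession()

    def close(self):
        self.session.close()


class FaceNet:
    """Stand-in model: always fails, as if the model file were missing."""

    def __init__(self):
        raise FileNotFoundError("FaceNet model file not found")

    def close(self):
        pass


def leaky_search():
    # Both models are created before the try block. When FaceNet() raises,
    # the finally clause is never reached and fd.close() is never called.
    fd = FaceDetector()
    fn = FaceNet()
    try:
        return "ok"
    finally:
        fd.close()
        fn.close()


def safe_search():
    # Names are bound to None first, so the finally clause can always
    # tell which resources were actually created.
    fd = None
    fn = None
    try:
        fd = FaceDetector()
        fn = FaceNet()
        return "ok"
    except Exception as exc:
        return f"error: {exc}"
    finally:
        if fd is not None:
            fd.close()
        if fn is not None:
            fn.close()
```

In this sketch, calling `leaky_search()` raises and leaves one session open, while `safe_search()` returns an error string and leaves none open.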


Impact

  • Permanent memory leak per failed request (~200–500 MB)
  • GPU memory exhaustion when using CUDA execution provider
  • Backend crashes under moderate concurrency
  • Silent degradation (API returns 500 but memory continues to grow)

This affects production deployments with:

  • Misconfigured or missing model files
  • GPU-constrained environments
  • Concurrent face search requests
  • Partial or broken installations

Steps to Reproduce

Scenario 1: Missing FaceNet model

mv backend/app/models/ONNX_Exports/FaceNet_128D.onnx /tmp/
cd backend && python main.py


## Summary by CodeRabbit

* **Bug Fixes**
  * Enhanced error handling for the face search feature to provide clearer feedback when initialization fails.
  * Improved resource management to ensure better reliability and proper cleanup of the face detection service.

Signed-off-by: WHOIM1205 <rathourprateek8@gmail.com>
@github-actions
Contributor

⚠️ No issue was linked in the PR description.
Please make sure to link an issue (e.g., 'Fixes #issue_number')

@coderabbitai
Contributor

coderabbitai bot commented Jan 18, 2026

Walkthrough

The change moves face detector and face recognition model initialization from before the try block into a guarded try/except block, adds a dedicated exception handler for initialization failures, and simplifies resource cleanup by checking for None directly instead of inspecting locals().

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Model Initialization Refactoring**: `backend/app/utils/faceSearch.py` | Moves FaceDetector and FaceNet initialization into a guarded try block with upfront None assignment; adds a dedicated exception handler for initialization failures that returns an error response; improves cleanup logic in finally by checking for None directly |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

🐰 A model waits with patient care,
No eager rush through startup's air,
When init fails, we catch the fall,
And cleanup whispers soft to all,
Safe resources dance, None keeps watch—
Error handling without a botch! 🌟

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main change: fixing an ONNX Session Memory Leak in the Face Search API by restructuring initialization and cleanup logic. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@backend/app/utils/faceSearch.py`:
- Around line 105-110: The current broad exception handler returns "Failed to
initialize models" for any error in the entire try block; narrow the scope or
make the message generic: either wrap only the model initialization code in its
own try/except and keep the existing initialization-specific message there, and
use a separate try/except (or let exceptions propagate) around calls like
fn.get_embedding(), get_all_face_embeddings(), and the similarity processing
(the block handling cosine similarity and building GetAllImagesResponse), or
change the outer except to return a generic failure message (e.g., "Failed to
process face search") so errors from fn.get_embedding(),
get_all_face_embeddings(), or the similarity logic do not misreport as
initialization failures; reference GetAllImagesResponse, fn.get_embedding(),
get_all_face_embeddings(), and the similarity processing loop when applying the
fix.
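
One way to apply the first option from the comment above is sketched here. It does not reproduce the code in faceSearch.py: `SearchResponse` is a hypothetical stand-in for GetAllImagesResponse, and `make_detector`, `make_facenet`, and `process` are illustrative callables standing in for the model constructors and for the embedding and similarity steps. The models are assumed to expose a `close()` method.

```python
from dataclasses import dataclass, field


@dataclass
class SearchResponse:
    """Hypothetical stand-in for GetAllImagesResponse."""

    success: bool
    message: str
    data: list = field(default_factory=list)


def run_search(make_detector, make_facenet, process):
    fd = None
    fn = None
    try:
        # Only model construction is covered by the
        # initialization-specific message.
        try:
            fd = make_detector()
            fn = make_facenet()
        except Exception as exc:
            return SearchResponse(False, f"Failed to initialize models: {exc}")

        # Embedding and similarity errors get a generic message so they
        # are not misreported as initialization failures.
        try:
            matches = process(fd, fn)
        except Exception as exc:
            return SearchResponse(False, f"Failed to process face search: {exc}")

        return SearchResponse(True, "ok", matches)
    finally:
        # The outer finally runs on every path, including the early returns.
        for model in (fn, fd):
            if model is not None:
                model.close()
```

Keeping a single outer finally means the cleanup guarantee does not depend on which of the two inner handlers fired.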

@WHOIM1205
Author

Hi @rahulharpal1603,
This PR fixes a critical ONNX Runtime memory leak in the face search endpoint, caused by initializing model resources outside the try/finally block.

I’d appreciate a review, especially around:

  • Resource lifecycle and cleanup guarantees
  • Error handling during model initialization
  • Any edge cases I may have missed in production scenarios

Thanks in advance for your time.
