Refactor AI form fill task and add sentry to catch errors#23
Refactor AI form fill task and add sentry to catch errors#23
Conversation
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughChanges touch three modules: Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Pull request overview
This PR refactors the AI form fill task to improve error handling and monitoring by introducing a custom ScribeError exception class and adding Sentry integration for production error tracking. The refactoring consolidates error handling into a single try-except block, eliminates early returns in favor of raising exceptions, and improves logging consistency throughout the task.
Key changes:
- Introduced
ScribeErrorexception class with processing context support - Added Sentry integration for error capture in production environments
- Refactored error handling to use exceptions instead of early returns with status updates
- Added
tnc_hashproperty to plugin settings andlast_modified_dateproperty to ScribeQuota model
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| care_scribe/tasks/scribe.py | Major refactoring of the AI form fill task with new error handling, custom exception class, improved logging format, and consolidated error processing |
| care_scribe/settings.py | Added tnc_hash attribute to plugin_settings for caching the hash of terms and conditions |
| care_scribe/models/scribe_quota.py | Added last_modified_date property and reorganized imports |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
care_scribe/models/scribe_quota.py (1)
14-15: Module-level date computation causes stale month boundaries.
start_of_monthandend_of_monthare computed once at import time. In long-running processes (e.g., Celery workers), these values become stale when the month changes, causingcalculate_used()to query incorrect date ranges by default.Move the computation inside
calculate_used()or make them functions:-start_of_month = timezone.now().replace(day=1, hour=0, minute=0, second=0, microsecond=0) -end_of_month = (start_of_month + timezone.timedelta(days=31)).replace(day=1) - timezone.timedelta(seconds=1) + +def get_month_boundaries(): + start = timezone.now().replace(day=1, hour=0, minute=0, second=0, microsecond=0) + end = (start + timezone.timedelta(days=31)).replace(day=1) - timezone.timedelta(seconds=1) + return start, endThen update
calculate_used:- def calculate_used(self, from_date=start_of_month, to_date=end_of_month): + def calculate_used(self, from_date=None, to_date=None): + if from_date is None or to_date is None: + from_date, to_date = get_month_boundaries()care_scribe/tasks/scribe.py (1)
467-481: Missingprocessing_contextin additional error paths.These
ScribeErrorraises should includeprocessing_context=processingfor consistent error tracking:if retry > 0: - raise ScribeError( - f"AI response was malformed, please retry: {str(ai_resp.candidates[0].finish_message)}" - ) + raise ScribeError( + f"AI response was malformed, please retry: {ai_resp.candidates[0].finish_message!s}", + processing_context=processing, + ) else: processing["retries"] = retry + 1 return generate_response(retry + 1) return ai_resp ai_response = generate_response() if ai_response.candidates[0].finish_reason != types.FinishReason.STOP: - raise ScribeError( - f"AI response did not finish successfully: {str(ai_response.candidates[0].finish_reason)}: " - f"{str(ai_response.candidates[0].finish_message)}" - ) + raise ScribeError( + f"AI response did not finish successfully: {ai_response.candidates[0].finish_reason!s}: " + f"{ai_response.candidates[0].finish_message!s}", + processing_context=processing, + )Note: Also using
!sconversion flag as suggested by RUF010 instead of explicitstr()calls.
🧹 Nitpick comments (2)
care_scribe/tasks/scribe.py (2)
399-408: Refactor cache fetching to avoid raising exception for control flow.The current pattern raises
Exception("Cache not found")inside a try block just to trigger the except handler. This is an anti-pattern flagged by linter rules TRY002/TRY301.cached_content = None - try: - logger.info( - f"Scribe[{form.external_id}] status: fetching cache for scribe_{chat_model}_{output_schema_hash}" - ) - cached_content = client.caches.get(name=f"scribe_{chat_model}_{output_schema_hash}") - if not cached_content: - raise Exception("Cache not found") - except Exception as e: - logger.error(f"Scribe[{form.external_id}] error fetching cache: {e}") + logger.info( + f"Scribe[{form.external_id}] status: fetching cache for scribe_{chat_model}_{output_schema_hash}" + ) + try: + cached_content = client.caches.get(name=f"scribe_{chat_model}_{output_schema_hash}") + except Exception: + logger.exception(f"Scribe[{form.external_id}] error fetching cache") + cached_content = None + + if not cached_content: + logger.info(f"Scribe[{form.external_id}] cache not found, will create new cache")
621-629: Uselogger.exceptionfor automatic stack trace inclusion.
logger.errordoesn't include the stack trace automatically. Since this is the main error handler, uselogger.exceptionfor better debugging context (especially useful when Sentry is not configured).except Exception as e: - logger.error(f"Scribe[{form.external_id}] status: error occurred while processing form: {e}") + logger.exception(f"Scribe[{form.external_id}] status: error occurred while processing form") if settings.SENTRY_DSN: import sentry_sdk sentry_sdk.capture_exception(e) else: processing["error_trace"] = "".join(traceback.format_exception(None, e, e.__traceback__))
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
care_scribe/models/scribe_quota.py(2 hunks)care_scribe/settings.py(2 hunks)care_scribe/tasks/scribe.py(9 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
care_scribe/settings.py (1)
care_scribe/utils.py (1)
hash_string(3-5)
🪛 Ruff (0.14.8)
care_scribe/tasks/scribe.py
150-153: Abstract raise to an inner function
(TRY301)
150-153: Avoid specifying long messages outside the exception class
(TRY003)
168-171: Abstract raise to an inner function
(TRY301)
168-171: Avoid specifying long messages outside the exception class
(TRY003)
174-174: Abstract raise to an inner function
(TRY301)
174-174: Avoid specifying long messages outside the exception class
(TRY003)
177-180: Abstract raise to an inner function
(TRY301)
177-180: Avoid specifying long messages outside the exception class
(TRY003)
183-186: Abstract raise to an inner function
(TRY301)
183-186: Avoid specifying long messages outside the exception class
(TRY003)
203-206: Abstract raise to an inner function
(TRY301)
203-206: Avoid specifying long messages outside the exception class
(TRY003)
209-212: Abstract raise to an inner function
(TRY301)
209-212: Avoid specifying long messages outside the exception class
(TRY003)
225-225: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
225-225: Avoid specifying long messages outside the exception class
(TRY003)
341-341: Avoid specifying long messages outside the exception class
(TRY003)
406-406: Abstract raise to an inner function
(TRY301)
406-406: Create your own exception
(TRY002)
406-406: Avoid specifying long messages outside the exception class
(TRY003)
407-407: Do not catch blind exception: Exception
(BLE001)
408-408: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
428-430: Avoid specifying long messages outside the exception class
(TRY003)
461-461: Avoid specifying long messages outside the exception class
(TRY003)
467-469: Abstract raise to an inner function
(TRY301)
467-469: Avoid specifying long messages outside the exception class
(TRY003)
468-468: Use explicit conversion flag
Replace with conversion flag
(RUF010)
478-481: Abstract raise to an inner function
(TRY301)
478-481: Avoid specifying long messages outside the exception class
(TRY003)
479-479: Use explicit conversion flag
Replace with conversion flag
(RUF010)
480-480: Use explicit conversion flag
Replace with conversion flag
(RUF010)
581-581: Avoid specifying long messages outside the exception class
(TRY003)
586-586: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
587-587: Avoid specifying long messages outside the exception class
(TRY003)
621-621: Do not catch blind exception: Exception
(BLE001)
622-622: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
🔇 Additional comments (3)
care_scribe/tasks/scribe.py (3)
28-53: Well-structured prompt template and error handling.The centralized
BASE_PROMPTimproves maintainability, andScribeErrorwithprocessing_contextprovides good structured error handling for tracking processing state during failures.
427-430: Missingprocessing_contextin cache creation error.if match and "constraint-is-too-big" in match.group(1): - raise ScribeError( - "The form is too large for Scribe. Please try again with a smaller form." - ) from e + raise ScribeError( + "The form is too large for Scribe. Please try again with a smaller form.", + processing_context=processing, + ) from eLikely an incorrect or invalid review comment.
202-212: No changes needed — quota comparison logic is correct as written.The code intentionally compares
user_quota.usedagainstfacility_quota.tokens_per_user(the facility's current per-user token limit) rather thanuser_quota.tokens. This is the correct behavior. Whileuser_quota.tokensis initialized tofacility_quota.tokens_per_userat quota creation time, the facility'stokens_per_usersetting can be updated dynamically by administrators via the API. By comparing against the facility's current limit, the code ensures that quota enforcement reflects the facility's current policy, allowing admins to adjust per-user limits and have those changes apply to all users going forward.
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (5)
care_scribe/tasks/scribe.py (5)
225-251: Verify schema key mapping is stable and matches UI field order.
You buildprocessed_fields_no_keysvia enumeration ofprocessed_fields.values()and later convert back via enumeration ofprocessed_fields.keys(). This is order-dependent; it’s OK only if insertion order is guaranteed to match the form’s field order (and never changes across recursion).# Confirm field ordering is deterministic and matches the frontend form order. # Expect: `process_fields()` traverses in a stable order and `form.form_data` preserves order. rg -n "def process_fields|processed_fields_no_keys|converted_response" -C 3Also applies to: 582-587
199-206: Provider override uses!= ""and can mis-route when key isNone(and lacks exception chaining).
This repeats a prior review point: use truthiness for the key, and useraise ... from Noneinside theexcept ValueError:to satisfy B904.try: api_provider, chat_model = form.chat_model.split("/") - if api_provider == "openai" and plugin_settings.SCRIBE_AZURE_API_KEY != "": + if api_provider == "openai" and plugin_settings.SCRIBE_AZURE_API_KEY: api_provider = "azure" except ValueError: - raise ScribeError("Invalid chat model format. Use 'provider/model_name'.") + raise ScribeError("Invalid chat model format. Use 'provider/model_name'.") from None
379-420: Cache name/display_name should be sanitized (chat_model may contain/), and cache-fetch flow has redundant raise/logging.
This repeats prior feedback:scribe_{chat_model}_...can break naming ifchat_modelincludes slashes; alsoraise Exception("Cache not found")then swallowing it is unnecessary, andlogger.errorshould belogger.exceptionif you care about the traceback.- logger.info( - f"Scribe[{form.external_id}] status: fetching cache for scribe_{chat_model}_{output_schema_hash}" - ) - cached_content = client.caches.get(name=f"scribe_{chat_model}_{output_schema_hash}") - if not cached_content: - raise Exception("Cache not found") - except Exception as e: - logger.error(f"Scribe[{form.external_id}] error fetching cache: {e}") + safe_model = re.sub(r"[^a-zA-Z0-9_.-]", "_", chat_model) + cache_key = f"scribe_{safe_model}_{output_schema_hash}" + logger.info(f"Scribe[{form.external_id}] status: fetching cache for {cache_key}") + cached_content = client.caches.get(name=cache_key) + except Exception: + logger.exception(f"Scribe[{form.external_id}] error fetching cache") ... - cached_content = client.caches.create( + cached_content = client.caches.create( model=chat_model, config=types.CreateCachedContentConfig( - display_name=f"scribe_{chat_model}_{output_schema_hash}", + display_name=cache_key,
463-472:next(...)can raiseStopIterationif Gemini returns no function_call part.
This repeats prior feedback: add a default and raise a clearScribeError.- ai_response_json = next( - part.function_call.args for part in ai_response.candidates[0].content.parts if part.function_call - ) + function_call_part = next( + (part for part in ai_response.candidates[0].content.parts if part.function_call), + None, + ) + if not function_call_part: + raise ScribeError("AI response did not contain a function call") + ai_response_json = function_call_part.function_call.args
563-568: Uselogger.exceptionfor parse failures (and avoid dumping full response).
This repeats prior feedback:logger.error(... {ai_response})loses stack trace and may print PHI; preferlogger.exceptionwith identifiers only.except Exception as e: - logger.error(f"Scribe[{form.external_id}] error parsing response: {ai_response}") + logger.exception(f"Scribe[{form.external_id}] error parsing non-google AI response JSON") raise ScribeError("Error parsing AI response") from e
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
care_scribe/tasks/scribe.py(9 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
care_scribe/tasks/scribe.py (3)
care_scribe/models/scribe.py (4)
Scribe(103-154)Status(104-111)audio_file_ids(134-141)document_file_ids(144-151)care_scribe/models/scribe_quota.py (4)
ScribeQuota(17-72)last_modified_date(35-36)calculate_used(50-61)save(66-72)care_scribe/utils.py (1)
hash_string(3-5)
🪛 Ruff (0.14.8)
care_scribe/tasks/scribe.py
148-148: Abstract raise to an inner function
(TRY301)
148-148: Avoid specifying long messages outside the exception class
(TRY003)
163-163: Abstract raise to an inner function
(TRY301)
163-163: Avoid specifying long messages outside the exception class
(TRY003)
166-166: Abstract raise to an inner function
(TRY301)
166-166: Avoid specifying long messages outside the exception class
(TRY003)
169-169: Abstract raise to an inner function
(TRY301)
169-169: Avoid specifying long messages outside the exception class
(TRY003)
172-172: Abstract raise to an inner function
(TRY301)
172-172: Avoid specifying long messages outside the exception class
(TRY003)
189-189: Abstract raise to an inner function
(TRY301)
189-189: Avoid specifying long messages outside the exception class
(TRY003)
192-192: Abstract raise to an inner function
(TRY301)
192-192: Avoid specifying long messages outside the exception class
(TRY003)
205-205: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
205-205: Avoid specifying long messages outside the exception class
(TRY003)
321-321: Avoid specifying long messages outside the exception class
(TRY003)
386-386: Abstract raise to an inner function
(TRY301)
386-386: Create your own exception
(TRY002)
386-386: Avoid specifying long messages outside the exception class
(TRY003)
387-387: Do not catch blind exception: Exception
(BLE001)
388-388: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
408-410: Avoid specifying long messages outside the exception class
(TRY003)
441-441: Avoid specifying long messages outside the exception class
(TRY003)
447-449: Abstract raise to an inner function
(TRY301)
447-449: Avoid specifying long messages outside the exception class
(TRY003)
448-448: Use explicit conversion flag
Replace with conversion flag
(RUF010)
458-461: Abstract raise to an inner function
(TRY301)
458-461: Avoid specifying long messages outside the exception class
(TRY003)
459-459: Use explicit conversion flag
Replace with conversion flag
(RUF010)
460-460: Use explicit conversion flag
Replace with conversion flag
(RUF010)
561-561: Avoid specifying long messages outside the exception class
(TRY003)
566-566: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
567-567: Avoid specifying long messages outside the exception class
(TRY003)
601-601: Do not catch blind exception: Exception
(BLE001)
602-602: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
🔇 Additional comments (1)
care_scribe/tasks/scribe.py (1)
14-17: Verify OpenAI SDK 2.2.0 support forresponse_formatwithjson_schemaandstrict: true, and Azure OpenAI parity.The code at lines 426–431 correctly handles Google GenAI's constraint:
tool_configandtoolsare set toNonewhencached_contentis used, and vice versa, which matches the API requirement. TheGenerateContentConfigcall at lines 426–431 properly conditions these parameters.However, confirm that
openai==2.2.0supportsresponse_format={"type":"json_schema", ..., "strict": true}(feature added August 2024) and that Azure OpenAI parity exists for this SDK version. Azure structured outputs support was added in SDK releases around Jan–Feb 2025; verify this version covers your Azure deployment.
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (4)
care_scribe/settings.py (1)
120-131:tnc_hashbecomes stale afterplugin_settings.reload(); recompute it (or clear it) inreload().
reload()clears cached attrs, buttnc_hashis neither cleared nor recomputed, so runtime updates toSCRIBE_TNCwill not be reflected.def reload(self) -> None: @@ if hasattr(self, "_user_settings"): delattr(self, "_user_settings") + + # Keep `tnc_hash` in sync with the latest SCRIBE_TNC after reload + self.tnc_hash = hash_string(self.SCRIBE_TNC)care_scribe/tasks/scribe.py (3)
294-325: High compliance risk: logs/Sentry can capture PHI (transcript, full AI response, raw dumps). Redact/gate content logging and avoid storing raw responses.At minimum, remove direct content logs and avoid persisting
ai_response_raw(or gate behind an explicit allowlist flag defaulting off).- logger.info(f"Scribe[{form.external_id}] transcript: {transcript}") + logger.info( + f"Scribe[{form.external_id}] transcript generated (chars={len(transcript)})" + ) @@ - processing["ai_response_raw"] = ai_response.model_dump() + # Beware PHI: avoid storing full raw model responses in meta by default + # processing["ai_response_raw"] = ai_response.model_dump() @@ - processing["ai_response_raw"] = ai_response.model_dump() + # Beware PHI: avoid storing full raw model responses in meta by default + # processing["ai_response_raw"] = ai_response.model_dump() @@ - logger.info(f"AI response: {ai_response_json}") + logger.info( + f"Scribe[{form.external_id}] AI response received (filled_fields={len(converted_response)})" + )Also consider a Sentry scrubber / event processor before
capture_exception(e)if exceptions can include request/meta payloads.Also applies to: 566-570, 582-593, 607-613
133-139: Exception handler can crash ifScribe.objects.get(...)fails; move theget()insidetry(or guardform).@shared_task def process_ai_form_fill(external_id): current_time = timezone.now() - - form = Scribe.objects.get(external_id=external_id, status=Scribe.Status.READY) + form = None processing = { "created_date": current_time.isoformat(), } try: + form = Scribe.objects.get(external_id=external_id, status=Scribe.Status.READY) @@ - except Exception as e: - logger.error(f"Scribe[{form.external_id}] status: error occurred while processing form: {e}") + except Exception as e: + logger.exception( + f"Scribe[{getattr(form, 'external_id', external_id)}] status: error occurred while processing form" + ) @@ - form.meta["processings"] = [*form.meta.get("processings", []), processing] - form.status = Scribe.Status.FAILED - form.save() + if form: + form.meta["processings"] = [*form.meta.get("processings", []), processing] + form.status = Scribe.Status.FAILED + form.save()Also applies to: 604-617
147-173:audio_file_ids/document_file_idsare methods; using them as attributes breaks checks/filters. Call once and reuse.(Your
care_scribe/models/scribe.pysnippets show these are methods returning QuerySets.)try: @@ - if not form.audio_file_ids and not form.document_file_ids: + audio_ids = list(form.audio_file_ids()) + document_ids = list(form.document_file_ids()) + + if not audio_ids and not document_ids: raise ScribeError("No audio or documents associated with the Scribe. Your upload might have failed.") @@ - if not facility_quota.allow_ocr and not user_quota.allow_ocr and len(form.document_file_ids) > 0: + if not facility_quota.allow_ocr and not user_quota.allow_ocr and len(document_ids) > 0: raise ScribeError("OCR is not enabled for this user or facility.") @@ - audio_files = ScribeFile.objects.filter(external_id__in=form.audio_file_ids) + audio_files = ScribeFile.objects.filter(external_id__in=audio_ids) @@ - if (has_images := len(form.document_file_ids) > 0) or total_audio_duration > (3 * 60 * 1000): + if (has_images := len(document_ids) > 0) or total_audio_duration > (3 * 60 * 1000): @@ - if api_provider != "google" and len(form.document_file_ids) == 0: + if api_provider != "google" and len(document_ids) == 0: del output_schema["properties"]["__scribe__transcription"] output_schema["required"].remove("__scribe__transcription") @@ - document_file_objects = ScribeFile.objects.filter(external_id__in=form.document_file_ids) + document_file_objects = ScribeFile.objects.filter(external_id__in=document_ids)Also applies to: 222-224, 255-266, 333-334
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
care_scribe/settings.py(2 hunks)care_scribe/tasks/scribe.py(9 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
care_scribe/tasks/scribe.py (6)
care_scribe/models/scribe.py (4)
Scribe(103-154)Status(104-111)audio_file_ids(134-141)document_file_ids(144-151)care_scribe/models/scribe_file.py (2)
ScribeFile(14-56)save(42-53)care_scribe/models/scribe_quota.py (4)
ScribeQuota(17-72)last_modified_date(35-36)calculate_used(50-61)save(66-72)care_scribe/utils.py (1)
hash_string(3-5)care_scribe/serializers/scribe.py (1)
save(94-144)care_scribe/serializers/scribe_file.py (1)
create(50-67)
care_scribe/settings.py (1)
care_scribe/utils.py (1)
hash_string(3-5)
🪛 Ruff (0.14.8)
care_scribe/tasks/scribe.py
148-148: Abstract raise to an inner function
(TRY301)
148-148: Avoid specifying long messages outside the exception class
(TRY003)
163-163: Abstract raise to an inner function
(TRY301)
163-163: Avoid specifying long messages outside the exception class
(TRY003)
166-166: Abstract raise to an inner function
(TRY301)
166-166: Avoid specifying long messages outside the exception class
(TRY003)
169-169: Abstract raise to an inner function
(TRY301)
169-169: Avoid specifying long messages outside the exception class
(TRY003)
172-172: Abstract raise to an inner function
(TRY301)
172-172: Avoid specifying long messages outside the exception class
(TRY003)
189-189: Abstract raise to an inner function
(TRY301)
189-189: Avoid specifying long messages outside the exception class
(TRY003)
192-192: Abstract raise to an inner function
(TRY301)
192-192: Avoid specifying long messages outside the exception class
(TRY003)
205-205: Avoid specifying long messages outside the exception class
(TRY003)
321-321: Avoid specifying long messages outside the exception class
(TRY003)
386-386: Do not catch blind exception: Exception
(BLE001)
387-387: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
407-409: Avoid specifying long messages outside the exception class
(TRY003)
440-440: Avoid specifying long messages outside the exception class
(TRY003)
446-448: Abstract raise to an inner function
(TRY301)
446-448: Avoid specifying long messages outside the exception class
(TRY003)
447-447: Use explicit conversion flag
Replace with conversion flag
(RUF010)
457-460: Abstract raise to an inner function
(TRY301)
457-460: Avoid specifying long messages outside the exception class
(TRY003)
458-458: Use explicit conversion flag
Replace with conversion flag
(RUF010)
459-459: Use explicit conversion flag
Replace with conversion flag
(RUF010)
473-473: Abstract raise to an inner function
(TRY301)
473-473: Avoid specifying long messages outside the exception class
(TRY003)
564-564: Avoid specifying long messages outside the exception class
(TRY003)
569-569: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
570-570: Avoid specifying long messages outside the exception class
(TRY003)
604-604: Do not catch blind exception: Exception
(BLE001)
605-605: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
care_scribe/settings.py
120-120: Do not call getattr with a constant attribute value. It is not any safer than normal property access.
Replace getattr with attribute access
(B009)
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (3)
care_scribe/tasks/scribe.py (3)
137-137: CRITICAL: Exception handler will crash ifScribe.objects.get()fails.If line 137 raises
Scribe.DoesNotExistor any other exception, theexceptblock at lines 604-618 will attempt to accessform.external_id(line 605), causing a secondaryNameErrorthat masks the original error.This was flagged in past reviews. Initialize
form = Nonebefore the query and guard allformreferences in the exception handler:@shared_task def process_ai_form_fill(external_id): current_time = timezone.now() - + form = None form = Scribe.objects.get(external_id=external_id, status=Scribe.Status.READY)And in the exception handler:
except Exception as e: - logger.error(f"Scribe[{form.external_id}] status: error occurred while processing form at line {e.__traceback__.tb_lineno}: {e}") + form_id = form.external_id if form else external_id + logger.exception(f"Scribe[{form_id}] status: error occurred while processing form at line {e.__traceback__.tb_lineno}") ... - form.meta["processings"] = [*form.meta.get("processings", []), processing] - form.status = Scribe.Status.FAILED - form.save() + if form: + form.meta["processings"] = [*form.meta.get("processings", []), processing] + form.status = Scribe.Status.FAILED + form.save()
386-387: Uselogger.exceptionto preserve stack traces.Line 387 (and also 569, 605) uses
logger.errorwhich discards the stack trace. Uselogger.exceptionin exception handlers to preserve debugging context.except Exception as e: - logger.error(f"Scribe[{form.external_id}] error fetching cache: {e}") + logger.exception(f"Scribe[{form.external_id}] error fetching cache")Apply the same change at lines 569 and 605.
As per coding guidelines from past reviews,
logger.exceptionshould always be used in exception handlers.
324-324: MAJOR: Transcript logging exposes PHI.Logging the full transcript (line 324) and potentially logging it again in error handlers exposes Protected Health Information. Similarly, line 582 logs the full AI response JSON which likely contains patient data.
Remove content logging or gate it behind an explicit configuration flag (default off):
- logger.info(f"Scribe[{form.external_id}] transcript: {transcript}") + logger.info(f"Scribe[{form.external_id}] transcript length: {len(transcript)} chars")And at line 582:
- logger.info(f"AI response: {ai_response_json}") + logger.info(f"Scribe[{form.external_id}] AI response: {len(ai_response_json)} fields populated")Based on learnings, past reviews flagged PHI exposure in logs and Sentry traces. Ensure Sentry integration is configured with proper scrubbing for PHI fields.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
care_scribe/tasks/scribe.py(9 hunks)
🧰 Additional context used
🪛 Ruff (0.14.8)
care_scribe/tasks/scribe.py
148-148: Abstract raise to an inner function
(TRY301)
148-148: Avoid specifying long messages outside the exception class
(TRY003)
163-163: Abstract raise to an inner function
(TRY301)
163-163: Avoid specifying long messages outside the exception class
(TRY003)
166-166: Abstract raise to an inner function
(TRY301)
166-166: Avoid specifying long messages outside the exception class
(TRY003)
169-169: Abstract raise to an inner function
(TRY301)
169-169: Avoid specifying long messages outside the exception class
(TRY003)
172-172: Abstract raise to an inner function
(TRY301)
172-172: Avoid specifying long messages outside the exception class
(TRY003)
189-189: Abstract raise to an inner function
(TRY301)
189-189: Avoid specifying long messages outside the exception class
(TRY003)
192-192: Abstract raise to an inner function
(TRY301)
192-192: Avoid specifying long messages outside the exception class
(TRY003)
205-205: Avoid specifying long messages outside the exception class
(TRY003)
321-321: Avoid specifying long messages outside the exception class
(TRY003)
386-386: Do not catch blind exception: Exception
(BLE001)
387-387: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
407-409: Avoid specifying long messages outside the exception class
(TRY003)
440-440: Avoid specifying long messages outside the exception class
(TRY003)
446-448: Abstract raise to an inner function
(TRY301)
446-448: Avoid specifying long messages outside the exception class
(TRY003)
447-447: Use explicit conversion flag
Replace with conversion flag
(RUF010)
457-460: Abstract raise to an inner function
(TRY301)
457-460: Avoid specifying long messages outside the exception class
(TRY003)
458-458: Use explicit conversion flag
Replace with conversion flag
(RUF010)
459-459: Use explicit conversion flag
Replace with conversion flag
(RUF010)
473-473: Abstract raise to an inner function
(TRY301)
473-473: Avoid specifying long messages outside the exception class
(TRY003)
564-564: Avoid specifying long messages outside the exception class
(TRY003)
569-569: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
570-570: Avoid specifying long messages outside the exception class
(TRY003)
604-604: Do not catch blind exception: Exception
(BLE001)
605-605: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
🔇 Additional comments (2)
care_scribe/tasks/scribe.py (2)
139-141: CRITICAL:processingdict contains non-JSON-serializable datetime.Line 140 stores
current_time.isoformat(), but if any error occurs before the processing dict is finalized, the datetime objects inprocessingwill cause "Object of type datetime is not JSON serializable" errors when saving toform.metaat lines 593 or 615.Based on learnings, user shivankacker reported this exact error in past reviews. Ensure all values in
processingare JSON-serializable:processing = { - "created_date": current_time.isoformat(), + "created_date": str(current_time.isoformat()), }Also verify that no other datetime objects are added to
processingwithout serialization.Likely an incorrect or invalid review comment.
147-148: Remove this comment - the code is correct as written.
audio_file_idsanddocument_file_idsare Python@propertydecorated methods that return Django QuerySets. They are properly accessed as attributes (without parentheses) throughout the code, and QuerySets support all operations being performed: truthiness checks,len(), and iteration in ORM filters. Calling them as methods (form.audio_file_ids()) would fail.Likely an incorrect or invalid review comment.
shivankacker
left a comment
There was a problem hiding this comment.
lgtm, clankers have something to say
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (5)
care_scribe/tasks/scribe.py (5)
386-387: Uselogger.exceptionto preserve stack trace.Per static analysis hint TRY400, replace
logger.error(f"... {e}")withlogger.exception(...)to automatically capture the full stack trace.except Exception as e: - logger.error(f"Scribe[{form.external_id}] error fetching cache: {e}") + logger.exception(f"Scribe[{form.external_id}] error fetching cache")
571-573: Uselogger.exceptionto preserve stack trace.Per static analysis hint TRY400 and past review, replace
logger.errorwithlogger.exceptionto capture the full traceback.except Exception as e: - logger.error(f"Scribe[{form.external_id}] error parsing response: {ai_response}") + logger.exception(f"Scribe[{form.external_id}] error parsing response") raise ScribeError("Error parsing AI response") from e
134-148: CRITICAL: Exception handler will crash ifScribe.objects.get()fails.The
formretrieval at line 137 is outside thetryblock. If the query raisesDoesNotExistor any other exception, theexceptblock at line 607-620 will fail withNameErrorwhen accessingform.external_idandform.meta.🔎 Recommended fix
@shared_task def process_ai_form_fill(external_id): current_time = timezone.now() + form = None - form = Scribe.objects.get(external_id=external_id, status=Scribe.Status.READY) + try: + form = Scribe.objects.get(external_id=external_id, status=Scribe.Status.READY) + except Scribe.DoesNotExist: + logger.warning(f"Scribe[{external_id}] not found or not READY; skipping") + return processing = { "created_date": current_time.isoformat(), } try:And update the exception handler:
except Exception as e: - logger.error(f"Scribe[{form.external_id}] status: error occurred ...") + logger.exception(f"Scribe[{form.external_id if form else external_id}] status: error occurred ...") ... - form.meta["processings"] = [*form.meta.get("processings", []), processing] - form.status = Scribe.Status.FAILED - form.save() + if form: + form.meta["processings"] = [*form.meta.get("processings", []), processing] + form.status = Scribe.Status.FAILED + form.save()
324-324: PHI risk: transcript content logged.Logging the full transcript may expose patient health information (PHI) in log files. Consider logging only metadata (e.g., transcript length) or gating content logs behind a debug flag.
585-585: PHI risk: AI response logged.Logging the full
ai_response_jsonmay expose patient health information. Consider logging only metadata or removing this log in production.
🧹 Nitpick comments (2)
care_scribe/tasks/scribe.py (2)
199-205: Consider logging the provider override.When
api_providerisopenaibutSCRIBE_AZURE_API_KEYis configured, the provider silently switches toazure. This could cause confusion during debugging.🔎 Suggested improvement
if api_provider == "openai" and plugin_settings.SCRIBE_AZURE_API_KEY: + logger.info(f"Scribe[{form.external_id}] overriding openai provider to azure") api_provider = "azure"
607-620: Uselogger.exceptioninstead of manual formatting.Line 608 manually formats the exception with line number, which loses the full stack trace. Use
logger.exceptionfor automatic traceback capture.except Exception as e: - logger.error(f"Scribe[{form.external_id}] status: error occurred while processing form at line {e.__traceback__.tb_lineno}: {e}") + logger.exception(f"Scribe[{form.external_id if form else external_id}] status: error occurred while processing form")Note: Also requires the earlier fix to initialize
form = Noneand guard the meta access, as flagged in the critical issue above.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
care_scribe/tasks/scribe.py
🧰 Additional context used
🪛 Ruff (0.14.8)
care_scribe/tasks/scribe.py
148-148: Abstract raise to an inner function
(TRY301)
148-148: Avoid specifying long messages outside the exception class
(TRY003)
163-163: Abstract raise to an inner function
(TRY301)
163-163: Avoid specifying long messages outside the exception class
(TRY003)
166-166: Abstract raise to an inner function
(TRY301)
166-166: Avoid specifying long messages outside the exception class
(TRY003)
169-169: Abstract raise to an inner function
(TRY301)
169-169: Avoid specifying long messages outside the exception class
(TRY003)
172-172: Abstract raise to an inner function
(TRY301)
172-172: Avoid specifying long messages outside the exception class
(TRY003)
189-189: Abstract raise to an inner function
(TRY301)
189-189: Avoid specifying long messages outside the exception class
(TRY003)
192-192: Abstract raise to an inner function
(TRY301)
192-192: Avoid specifying long messages outside the exception class
(TRY003)
205-205: Avoid specifying long messages outside the exception class
(TRY003)
321-321: Avoid specifying long messages outside the exception class
(TRY003)
386-386: Do not catch blind exception: Exception
(BLE001)
387-387: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
407-409: Avoid specifying long messages outside the exception class
(TRY003)
440-440: Avoid specifying long messages outside the exception class
(TRY003)
446-448: Abstract raise to an inner function
(TRY301)
446-448: Avoid specifying long messages outside the exception class
(TRY003)
447-447: Use explicit conversion flag
Replace with conversion flag
(RUF010)
457-460: Abstract raise to an inner function
(TRY301)
457-460: Avoid specifying long messages outside the exception class
(TRY003)
458-458: Use explicit conversion flag
Replace with conversion flag
(RUF010)
459-459: Use explicit conversion flag
Replace with conversion flag
(RUF010)
473-473: Abstract raise to an inner function
(TRY301)
473-473: Avoid specifying long messages outside the exception class
(TRY003)
567-567: Avoid specifying long messages outside the exception class
(TRY003)
572-572: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
573-573: Avoid specifying long messages outside the exception class
(TRY003)
607-607: Do not catch blind exception: Exception
(BLE001)
608-608: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
🔇 Additional comments (12)
care_scribe/tasks/scribe.py (12)
28-47: Well-structured base prompt constant.Extracting the prompt into a module-level constant improves maintainability and makes it easier to test or override. The placeholder
{current_date_time}is appropriately replaced at runtime.
54-85: LGTM on AI client factory.The provider-based client instantiation is clean. The Google credentials handling with base64 decoding is appropriate for containerized deployments.
88-130: Clean multi-provider message construction.The function properly abstracts provider-specific message formats. The file handling with base64 encoding for OpenAI/Azure and native bytes for Google is correct.
174-186: Good edge-case handling for quota recalculation across months.The logic correctly recalculates used quota when the first request of a new month arrives, preventing edge cases where stale quota data from the previous month would incorrectly block requests.
225-251: Good approach for normalizing field IDs in the schema.The
qNkey mapping avoids potential issues with special characters in original field IDs when used in the JSON schema. The reverse mapping at lines 589-593 correctly restores the original keys.
420-452: Retry logic is sound with single retry attempt.The
generate_responsefunction correctly handles the known Gemini issue with malformed function calls by retrying once. The exception chaining is properly implemented.
468-474: Good fix for StopIteration issue.The
next()call now properly uses a default value ofNoneand explicitly checks for the missing function call case with a descriptive error message. This addresses the previous review concern.
478-541: Comprehensive token accounting.The detailed breakdown of token usage by modality (audio, image, text) and cache status enables accurate quota tracking and cost analysis. The safe fallbacks to
Nonewhen details aren't available are appropriate.
588-598: Correct response key mapping restoration.The dictionary comprehension properly maps the
qNkeys back to original field IDs using enumeration order, maintaining consistency with the schema construction at line 239.
610-615: Good defensive check for Sentry configuration.Using
getattr(settings, "SENTRY_DSN", None)preventsAttributeErrorwhen Sentry isn't configured. The fallback to store the traceback in processing metadata ensures error details are preserved in all environments.
544-567: No action needed. The model names "gpt-5", "gpt-5-mini", and "gpt-5-nano" are all valid GPT-5 API model sizes, and temperature is indeed unsupported with these reasoning models. The code's conditional handling of the temperature parameter is correct and necessary.
147-147: No action required.audio_file_idsanddocument_file_idsare Python properties and are correctly accessed as attributes throughout the code. The @Property decorator allows them to execute method logic while being accessed without parentheses.Likely an incorrect or invalid review comment.
Summary by CodeRabbit
Improvements
New
✏️ Tip: You can customize this high-level summary in your review settings.