Release 0.58.3 by odlbot · Pull Request #3069 · mitodl/mit-learn

odlbot · 2026-03-19T17:38:27Z

Shankar Ambady

AskTim canvas ai contentfile ingestion issue (#3059) (03e59af1)

Carey P Gumaer

use v3 enrollments endpoint on the dashboard (#3049) (2bf27b36)

* Update everything to use the v3 enrollments endpoint
* upgrade api temporarily to branch build and use new upgrade product fields
* if a course run is passed in, get the b2b contract ID directly from the v3 run data
* fix typecheck issue with missing upgrade product props
* fix is_upgradable check
* properly handle upgrade deadline
* switch back to release api client
* fix test mock after rebase
* copilot suggestion regarding checking product id before rendering upgrade banner
* address feedback

* logic fix
* adding test
* fixing test
* move check outside of ocr method
* Revert "move check outside of ocr method"
  This reverts commit ced5536.
* move check outside of ocr
* fix for pull request finding
  Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* catch all pdf errors
* fix test
* fix test
* fix check
* fix tests
* switch logging statements

---------

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

sentry · 2026-03-19T17:41:01Z

learning_resources/etl/utils.py

+
+    page_count = len(PdfReader(file_path).pages)
+    if page_count > settings.OCR_PDF_MAX_PAGE_THRESHOLD and not is_tutor_problem:
        return None
+    return {
+        "content": _pdf_to_markdown(file_path),
+        "content_title": "",


Bug: The _extract_content_with_ocr function lacks exception handling. A PDF that is valid on its first page but corrupted on a later page will cause an unhandled exception.
Severity: HIGH

Suggested Fix

Wrap the PDF processing logic within the _extract_content_with_ocr function in a try/except block to catch potential exceptions from pypdf, such as FileNotDecryptedError or PdfReadError. This will prevent a single malformed file from crashing the entire ETL task and allow it to be skipped gracefully.
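A minimal sketch of what that wrapping could look like, under stated assumptions. The function name `extract_pdf_content` and the parameters `count_pages`, `to_markdown`, and `max_pages` are hypothetical stand-ins so the example runs without pypdf or Django settings. In the diff above they would correspond to `len(PdfReader(file_path).pages)`, `_pdf_to_markdown`, and `settings.OCR_PDF_MAX_PAGE_THRESHOLD`. The returned dict mirrors only the keys visible in the diff, and the broad `except Exception` is one possible choice, not necessarily what the project adopted.

```python
import logging
from typing import Callable, Optional

log = logging.getLogger(__name__)


def extract_pdf_content(
    file_path: str,
    count_pages: Callable[[str], int],   # hypothetical: e.g. len(PdfReader(path).pages)
    to_markdown: Callable[[str], str],   # hypothetical: stands in for _pdf_to_markdown
    max_pages: int,                      # hypothetical: stands in for the settings threshold
    is_tutor_problem: bool = False,
) -> Optional[dict]:
    """Return extracted content, or None if the PDF is too long or unreadable."""
    try:
        # Counting pages can itself raise on a corrupted or encrypted file,
        # so it sits inside the try block along with the conversion.
        page_count = count_pages(file_path)
        if page_count > max_pages and not is_tutor_problem:
            return None
        return {
            "content": to_markdown(file_path),
            "content_title": "",
        }
    except Exception:
        # Log with traceback and skip this file instead of failing the whole task.
        log.exception("Skipping PDF that could not be processed: %s", file_path)
        return None
```

With this shape, a reader that raises on a later page results in `None` for that one file, and the caller can move on to the next file. Catching narrower exception types (for example the pypdf errors named above) would be stricter but risks missing other failure modes.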

Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid.

Location: learning_resources/etl/utils.py#L594-L600

Potential issue: The `pdf_is_valid` function only checks the first page of a PDF. A file with a valid first page but a corrupted or encrypted subsequent page will pass this initial validation. When `_extract_content_with_ocr` is later called, it attempts to count all pages via `len(PdfReader(file_path).pages)`. Because a `try/except` block was removed in this function, an exception raised by `pypdf` (e.g., `FileNotDecryptedError`, `PdfReadError`) on a subsequent bad page will be unhandled. This will crash the entire ETL ingestion task for that file, whereas previously it would have been gracefully skipped.


gumaerc and others added 4 commits March 18, 2026 15:24

Merge branch 'release'

4eca54f

Release 0.58.3

833c2cb

odlbot added the deploying to rc label Mar 19, 2026

sentry bot reviewed Mar 19, 2026


odlbot added the "waiting for checkboxes" and "all checkboxes checked" labels and removed the "deploying to rc" and "waiting for checkboxes" labels Mar 19, 2026

odlbot merged commit 833c2cb into release Mar 19, 2026
17 checks passed

odlbot added deploying to prod and removed all checkboxes checked labels Mar 19, 2026

odlbot merged 4 commits into release from release-candidate

odlbot commented Mar 19, 2026 • edited by ChristopherChudzicki


3 participants
