Skip to content

Fix: guard against None contents in S3 download and upload listing#13

Merged
SeaCelo merged 1 commit intoEAPD-DRB:mainfrom
pigeio:feature/fix-s3-download-none-guard
Mar 2, 2026
Merged

Fix: guard against None contents in S3 download and upload listing#13
SeaCelo merged 1 commit intoEAPD-DRB:mainfrom
pigeio:feature/fix-s3-download-none-guard

Conversation

@pigeio
Copy link
Contributor

@pigeio pigeio commented Feb 23, 2026

Summary

  • What changed: Added if contents: guard before iterating over S3 list_objects_v2 response in SyncS3.downloadSync() and the duplicated pagination logic in UploadRoute.py. This prevents a TypeError crash when the S3 prefix matches no objects.

  • Why: AWS list_objects_v2 omits the Contents key entirely (rather than returning an empty list) when no objects match. The code iterated directly over None, causing TypeError: 'NoneType' is not iterable.

Related issues

Validation

  • Tests added/updated (or not applicable)
  • Validation steps documented
  • Evidence attached (logs/screenshots/output as relevant)

Tested with mock AWS responses simulating both normal and empty S3 results:
=== Testing with NORMAL response ===
Keys: ['case1/data.json']
Dirs: ['case1/subfolder/']
PASSED
=== Testing with EMPTY response (the bug scenario) ===
Keys: []
Dirs: []
PASSED
=== Proving the BUGGY version crashes ===
Confirmed bug: 'NoneType' object is not iterable

Documentation

  • Docs updated in this PR (or not applicable)
  • Any setup/workflow changes reflected in repo docs

No documentation changes needed — this is a bug fix with no API or workflow changes.

Scope check

  • No unrelated refactors
  • Implemented from a feature branch
  • [x]Change is deliverable without upstream OSeMOSYS/MUIO dependency
  • Base repo/branch is EAPD-DRB/MUIOGO:main (not upstream)

@sanvishukla
Copy link
Contributor

Hi @pigeio,
It is a great fix, the Contents guard correctly resolves the crash when no objects match the S3 prefix, and the validation steps are very clear.
I have one minor robustness suggestion: although AWS guarantees the presence of Key in each Contents entry, would it make sense to also guard against unexpected None or empty values before accessing k[-1] (for example, if k and not k.endswith('/'))?
I realize this may be overly defensive given AWS’s API contract, but I’m curious what you think, especially from a long-term maintainability perspective or in mocked test scenarios.

@pigeio
Copy link
Contributor Author

pigeio commented Feb 25, 2026

Hi @pigeio, It is a great fix, the Contents guard correctly resolves the crash when no objects match the S3 prefix, and the validation steps are very clear. I have one minor robustness suggestion: although AWS guarantees the presence of Key in each Contents entry, would it make sense to also guard against unexpected None or empty values before accessing k[-1] (for example, if k and not k.endswith('/'))? I realize this may be overly defensive given AWS’s API contract, but I’m curious what you think, especially from a long-term maintainability perspective or in mocked test scenarios.

Hi @sanvishukla,
Thank you for reviewing the PR! That is a fantastic point. Even though AWS API contracts are strict, adding this guard makes the code much more resilient—especially if the project ever uses mock frameworks like moto for testing, or connects to alternative S3-compatible storage where edge cases might arise.
Furthermore, updating it to if k and not k.endswith('/') is definitely more Pythonic and completely avoids a potential IndexError if k is ever an empty string.
I completely agree from a maintainability perspective. I will push a commit to this branch right now to include this extra guard, and I'll add you as a Co-author on the commit to give you credit for the great catch! Thanks again!

@sanvishukla
Copy link
Contributor

Thanks @pigeio for being open to the suggestion, and I appreciate the co-author credit! Looking forward to the updated commit!

@SeaCelo SeaCelo changed the base branch from main to mac-port February 25, 2026 17:58
@SeaCelo
Copy link
Collaborator

SeaCelo commented Feb 25, 2026

Retargeted to mac-port per the macOS integration track: #51.
We are validating Mac/cross-platform changes in mac-port first, then merging mac-port -> main.

@pigeio
Copy link
Contributor Author

pigeio commented Feb 28, 2026

@SeaCelo This PR is ready for review — reviewer feedback has been addressed
and the branch has no conflicts with mac-port. Would you be able to take a
look when you get a chance? Thanks!

@pigeio pigeio force-pushed the feature/fix-s3-download-none-guard branch from 46c9b0e to 75a13c0 Compare February 28, 2026 12:12
@SeaCelo SeaCelo changed the base branch from mac-port to main March 1, 2026 19:27
@SeaCelo
Copy link
Collaborator

SeaCelo commented Mar 1, 2026

mac-port has now been merged into main, so I’ve retargeted this PR to main.

This still looks like a valid small follow-up on the current mainline, since the S3 None-guard fix is independent of the old branch structure.

@pigeio
Copy link
Contributor Author

pigeio commented Mar 1, 2026

@SeaCelo Makes perfect sense! Glad to see the macOS integration successfully merged into main. The PR is clean and ready to go whenever you are doing the next batch of merges. Thanks for the update!

@SeaCelo
Copy link
Collaborator

SeaCelo commented Mar 2, 2026

Merging.
Thanks @pigeio and @sanvishukla for the fix and the review iteration — the None guard and the improved k.endswith('/') check are both correct.

Note: S3 sync is currently disabled (AWS_SYNC = 0, empty credentials in Config). PR #148 is working on wiring up env-based S3 credential loading, but that's still under review. S3-related work is low priority for now — the immediate focus is on the core app bug fixes and security hardening.

@SeaCelo SeaCelo merged commit 8e8e9eb into EAPD-DRB:main Mar 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] SyncS3.downloadSync() crashes with TypeError when S3 prefix returns no objects

3 participants