Skip to content

Luigi S3 module now uses boto3 by default#29

Open
andresvelasquez2-affirm wants to merge 2 commits intorafay/BATCH-3679-luigi-py-312-upgradefrom
andres.velasquez2/BATCH-3679-luigi-py-312-upgrade-boto3-only
Open

Luigi S3 module now uses boto3 by default#29
andresvelasquez2-affirm wants to merge 2 commits intorafay/BATCH-3679-luigi-py-312-upgradefrom
andres.velasquez2/BATCH-3679-luigi-py-312-upgrade-boto3-only

Conversation

@andresvelasquez2-affirm
Copy link
Copy Markdown
Collaborator

@andresvelasquez2-affirm andresvelasquez2-affirm commented May 6, 2026

Summary

This PR makes boto3 the default backend for Luigi's S3 module, finishing the boto3 migration that PR #28 started. The legacy S3ClientBoto1 class stays defined and importable for callers that still need boto1 explicitly, but unparameterized S3Target(...), S3PathTask(...), etc. now resolve to boto3.

This PR is built on top of rafay/BATCH-3679-luigi-py-312-upgrade (PR #28).

JIRA Ticket

BATCH-3679

Changes

S3 Client Default Flipped to boto3

  • luigi/contrib/s3.py:
    • S3Client = S3ClientBoto3 (was S3ClientBoto1); same for ReadableS3File = ReadableS3FileBoto3.
    • S3ClientBoto1 and ReadableS3FileBoto1 remain defined and importable — opt-in only.
    • Reverted the client= constructor parameter on S3PathTask, S3EmrTask, and S3FlagTask (added in 05c71137 to let callers opt into boto3 while the default was still boto1) — the injection plumbing is redundant now that the default itself is boto3.

Test Refactor — test/contrib/s3_test.py

  • Dropped the dual-mode try/except ImportError HAS_BOTO shim that, post-flip, would have routed boto1-style calls (e.g. client.s3.create_bucket('mybucket')) into what is now actually a boto3 client whenever boto1 happened to be installed.
  • Re-added boto3 versions of the previously boto1-only tests using boto3 idioms:
    • ServerSideEncryption='AES256' instead of encrypt_key=True.
    • Credentials introspected via s3.meta.client._request_signer._credentials.
    • STS-assumed-role assertions check shape (ASIA-prefixed key + token populated) rather than canonical example values that newer moto no longer returns.
  • Added bonus deprecation tests — test_put_encrypt_key_raises, test_put_string_encrypt_key_raises, test_put_multipart_encrypt_key_raises — verifying S3ClientBoto3 rejects the boto1 encrypt_key= parameter with DeprecatedBotoClientException.
  • Added TestS3TargetBoto1 and TestS3ClientBoto1 test classes that exercise S3ClientBoto1 / ReadableS3FileBoto1 directly (basic put/get round-trip, the original boto.s3.key.Key.BufferSize line-buffering test, boto1 credential introspection, all four encrypt_key=True multipart variants). Gated on BOTO1_AVAILABLE = HAS_BOTO_1 and HAS_BOTO_1_MOTO — both boto<3 and moto's mock_s3_deprecated/mock_sts_deprecated decorators must be importable; skip cleanly otherwise.
  • Added a MOTO_LT_2 skip gate — moto<2 doesn't decode boto3's Transfer-Encoding: chunked upload bodies, so any round-trip through put_multipart / upload_fileobj / copy stores chunk-size markers (b"4f\n...real-data...\n0\n") instead of real content. Class-level @unittest.skipIf(MOTO_LT_2, ...) on TestS3Target, self.skipTest(...) guards inside _run_multipart_test / _run_copy_test / _run_multipart_copy_test, plus @unittest.skipIf on test_get and test_get_as_string.

Dependency Updates

  • setup.py: version bump 2.7.5+affirm.1.4.9.rc82.7.5+affirm.1.4.9.rc9.

Documentation

  • CLAUDE.md: New "S3 Module: boto3-Only Default + Boto1 Legacy Shim" section with uv run --no-project invocations for running the S3 suite, the moto/boto compatibility matrix, the MOTO_LT_2 and BOTO1_AVAILABLE flag explanations, and the macOS arm64 + arch -x86_64 note for Python 3.9.
  • PLAN_Py39_P312_DUAL_COMPATIBILITY.md: Added a "May 6 2026 — Boto3-Only Default + Test Refactor" section. Annotated the original April 13 checklist step 5 as reverted (preserving the rationale verbatim as a quoted block). Updated Migration Guide patterns 3 and 4 + the Migration Impact Summary to reflect that S3PathTask/S3EmrTask/S3FlagTask no longer accept client=.

Testing

The S3 test file was validated under uv run --no-project across five Python + moto + boto combinations:

# Env Result
1 py3.12 + moto1 + boto + boto3 Broken environment — moto1 + py3.12 incompatible (ssl.wrap_socket removed; boto1 vendored six.moves fails to import). Documented; not addressable from test code.
2 py3.12 + moto5 + boto3 60 passed, 13 skipped, 0 failed
3 py3.9 + moto1 + boto + boto3 33 passed, 40 skipped, 0 failed ✅ (boto3 round-trip path skipped under MOTO_LT_2)
4 py3.9 + moto1 + boto3 (no boto) Identical to #3moto==1.3.16 declares boto as a hard runtime dep, transitively pulled in.
5 py3.9 + moto5 + boto3 60 passed, 13 skipped, 0 failed

Full-suite results via ./run_tests.sh:

  • py3.12: 1382 passed, 39 failed, 36 skipped, 18 errors39 failures match the documented baseline (no regressions). 18 errors are pre-existing SQLAlchemy 'Engine' object has no attribute … issues.
  • py3.9 (arch -x86_64 ./run_tests.sh --py39): 1355 passed, 39 failed, 63 skipped, 18 errors — same matching baseline.

Verification

# Modern stack — py3.12 + moto5 + boto3
PYTHONPATH=test uv run --python 3.12 --no-project --with-editable . \
  --with pytest --with sqlalchemy --with mock --with hypothesis --with pygments \
  --with 'moto>=5,<6' --with boto3 \
  python -m pytest test/contrib/s3_test.py -q --override-ini addopts=''

# Legacy stack — py3.9 + moto1 + boto1 + boto3 (Apple Silicon needs `arch -x86_64`)
arch -x86_64 env PYTHONPATH=test uv run --python 3.9 --no-project --with-editable . \
  --with pytest --with sqlalchemy --with mock --with hypothesis --with pygments \
  --with 'moto==1.3.16' --with boto3 --with boto \
  python -m pytest test/contrib/s3_test.py -q --override-ini addopts=''

andresvelasquez2-affirm and others added 2 commits May 6, 2026 13:17
Flip `S3Client = S3ClientBoto3` (was boto1) and revert the `client=`
parameter on `S3PathTask` / `S3EmrTask` / `S3FlagTask` (now redundant).
Drop the boto1/boto3 dual-mode shim from `s3_test.py`; add boto3 ports
of the SSE / credential / STS tests, boto1-gated test classes, and a
`MOTO_LT_2` skip for boto3 round-trips moto<2 corrupts via raw chunked
Transfer-Encoding. Document the new default and compatibility matrix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@andresvelasquez2-affirm andresvelasquez2-affirm added the Python-3.12-Migration For all code changes that relate to the upgrade to python v3.12 label May 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Python-3.12-Migration For all code changes that relate to the upgrade to python v3.12

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant