Support S3 unsigned body #3323

jterapin · 2025-12-06T21:41:46Z

This PR intends to improve performance for S3's PutObject and UploadPart.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

To make sure we include your contribution in the release notes, please add a description entry for your changes in the "unreleased changes" section of the CHANGELOG.md file (at the corresponding gem). The entry should fit on one line and start with Feature or Issue in the correct format.
For generated code changes, please check out the instructions below first:
https://github.com/aws/aws-sdk-ruby/blob/version-3/CONTRIBUTING.md

Thank you for your contribution!

github-actions · 2025-12-07T22:23:18Z

Detected 1 possible performance regressions:

aws-sdk-s3.put_object_small_allocated_kb - z-score regression: 81.42 -> 83.44. Z-score: 31.52

jterapin · 2025-12-07T22:32:59Z

build_tools/services.rb

-    MINIMUM_CORE_VERSION = "3.239.1"
+    MINIMUM_CORE_VERSION = "3.240.0"

    # Minimum `aws-sdk-core` version for new S3 gem builds
-    MINIMUM_CORE_VERSION_S3 = "3.234.0"
+    MINIMUM_CORE_VERSION_S3 = "3.240.0"


Self-reminder to update the minimum core version when this is ready for release.

jterapin · 2025-12-07T22:39:18Z

gems/aws-sdk-core/lib/aws-sdk-core/plugins/checksum_algorithm.rb

-          if @trailer_io
-            return @trailer_io.read(length, buf)
-          end
+        def read(_length = nil, buf = nil)


The length parameter is ignored since we always expect 16KB (default) or whatever is set by @chunk_size. This is an internal implementation accessed by either our patch or Net::HTTP.

I could make this more dynamic but that gets tricky... the #size method needs to know the chunk size ahead of time to calculate the overhead bytes plus actual IO size. Supporting variable-length reads would break that calculation.

What would be the benefit of it being dynamic? If we're always expecting 16KB and this is an established expectation, I think it's fine that we don't use the length input. Not sure how feasible this is, but we could consider removing the parameter altogether in the future so as not to cause confusion when using this method.

Well, I don't think we could remove the parameter altogether because of the interface: this IO-like object is read by the HTTP client (net-http in our case), which passes length as an argument. Removing the parameter would break that call.

> What would be the benefit of it being dynamic?

In the scenario where we don't use the patch, net-http could change the way its reads work, for example by increasing the chunk size. We are capping reads at 16KB here. We won't know if things have changed unless net-http checks the length on its end.
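
To make the trade-off in this thread concrete, here is a minimal sketch of an IO wrapper that frames data in fixed-size chunks. It is not the SDK's TrailerIO: the class name `FixedChunkIO`, its constructor, and the plain chunked framing (hex length, CRLF, data, CRLF, with no checksum trailer) are assumptions made for illustration. It shows why `#size` can only be computed up front when the chunk size is fixed, and why `#read` ignores the caller's length.

```ruby
require 'stringio'

# Hypothetical illustration only; not the SDK's TrailerIO.
class FixedChunkIO
  CRLF = "\r\n"
  TERMINATOR = "0\r\n\r\n"

  def initialize(io, chunk_size: 16 * 1024)
    @io = io
    # Framing bytes around a chunk: hex length prefix plus two CRLFs.
    overhead = chunk_size.to_s(16).bytesize + 2 * CRLF.bytesize
    # Bytes read from the underlying IO per chunk, so that a framed
    # chunk never exceeds chunk_size.
    @payload_size = chunk_size - overhead
    @done = false
  end

  # The requested length is ignored: every call returns one framed chunk.
  def read(_length = nil, buf = nil)
    return nil if @done

    data = @io.read(@payload_size)
    chunk =
      if data.nil? || data.empty?
        @done = true
        TERMINATOR
      else
        "#{data.bytesize.to_s(16)}#{CRLF}#{data}#{CRLF}"
      end
    buf ? buf.replace(chunk) : chunk
  end

  # Total framed size, known before any read because chunks are fixed.
  def size
    full, rest = @io.size.divmod(@payload_size)
    total = full * framed_size(@payload_size)
    total += framed_size(rest) if rest.positive?
    total + TERMINATOR.bytesize
  end

  private

  def framed_size(bytes)
    bytes.to_s(16).bytesize + bytes + 2 * CRLF.bytesize
  end
end
```

If `#read` honored an arbitrary length instead, the number of chunks (and therefore the number of framing bytes) would depend on how the caller chose to read, and `#size` could no longer be derived from the underlying IO's size alone.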

jterapin · 2025-12-07T22:41:30Z

gems/aws-sdk-core/lib/seahorse/client/net_http/patches.rb

+          # See: https://github.com/ruby/net-http/issues/205
          def supply_default_content_type
            return if Thread.current[:net_http_skip_default_content_type]


I think we could eventually get rid of this, given that net-http no longer sets the default content type as of their recent patch. We may want to remove this for later versions of Ruby, once the default net-http gem version is the recent one.

jterapin · 2025-12-07T22:50:03Z

gems/aws-sdk-s3/spec/client_spec.rb

+          # Need to discuss
+          expect(resp.context.http_request.body.instance_variable_get(:@io).read).to eq(data)


I realized the body gets replaced with TrailerIO when we use the unsigned body path. I'm wondering if customers ever access this body object directly? If they do, TrailerIO doesn't have all the methods that StringIO would have, which could break their code.

One option is to swap the body back to the original IO object after we finish sending the request. That way customers would see the original object if they need access later.

If it's feasible, it may be better to swap the body back to the original IO object to minimize potentially breaking customers. However, I'm assuming most of these details should be API private?

Should be easy to swap back to the original - given that TrailerIO has access to the original io object. On that question... I would assume so but you never know!
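
As a rough sketch of the swap-back idea discussed above: the names `SketchRequest`, `WrapperIO`, and `with_wrapped_body` are hypothetical stand-ins, not SDK classes or methods. The point is only that an `ensure` clause can restore the original IO object whether or not sending succeeds.

```ruby
require 'stringio'

# Hypothetical stand-in for an HTTP request object with a body.
SketchRequest = Struct.new(:body)

# Hypothetical wrapper that keeps a reference to the original IO.
class WrapperIO
  attr_reader :io

  def initialize(io)
    @io = io
  end

  def read(*args)
    @io.read(*args)
  end
end

# Replaces the body with a wrapper for the duration of the block, then
# restores the original IO so later callers see the object they passed in.
def with_wrapped_body(request)
  original = request.body
  request.body = WrapperIO.new(original)
  yield request
ensure
  request.body = original
  original.rewind if original.respond_to?(:rewind)
end
```

With this shape, code that inspects the request body after the call would get the original StringIO (or File) back rather than the wrapper, so methods the wrapper does not implement remain available.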

jterapin · 2025-12-08T17:40:07Z

build_tools/customizations.rb

+        next unless %w[PutObject UploadPart].include?(key)
+
+        operation['authType'] = 'v4-unsigned-body'
+        operation['unsignedPayload'] = true


I didn't add test cases for unsigned body since it's already covered by our test cases here: https://github.com/aws/aws-sdk-ruby/blob/version-3/gems/aws-sdk-core/spec/aws/plugins/checksum_algorithm_spec.rb#L24

richardwang1124

Nice! I think it looks good overall to me. Added a few comments.

richardwang1124 · 2025-12-09T16:46:56Z

gems/aws-sdk-core/lib/aws-sdk-core/plugins/checksum_algorithm.rb

          headers[header_name] = calculate_checksum(
            checksum_properties[:algorithm],
-            body
+            context.http_request.body


Was this change from context.http_request.body_contents to context.http_request.body intentional?

Great question. And yes! This was the root cause of the high memory usage during testing. Calling #body_contents on the request loads the entire body into memory. When I asked around, nobody knew why it had been done that way, so I think just passing the body here makes sense (I do a rewind much later, before we start calculating checksums).

Do we need to do a rewind after calculating as well?
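
For context on the memory and rewind points in this thread, the following is a minimal sketch of computing a digest by streaming an IO in fixed-size reads rather than loading the whole body into one string. The helper name `streaming_sha256` and the use of SHA-256 are assumptions for illustration; the SDK's `calculate_checksum` may work differently.

```ruby
require 'digest'
require 'stringio'

# Hypothetical helper: digests an IO in fixed-size reads so that only
# one chunk is held in memory at a time.
def streaming_sha256(io, chunk_size: 16 * 1024)
  digest = Digest::SHA256.new
  io.rewind
  buf = String.new
  # IO#read(length, buf) returns nil at EOF, which ends the loop.
  digest.update(buf) while io.read(chunk_size, buf)
  # Rewind afterwards so the body can still be sent from the start.
  io.rewind
  digest.base64digest
end
```

Rewinding both before and after is the conservative choice in this sketch: the first guards against a body that was already partially read, and the second leaves the IO at position 0 for whoever reads it next.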

richardwang1124 · 2025-12-09T16:51:20Z

gems/aws-sdk-core/lib/aws-sdk-core/plugins/checksum_algorithm.rb

-          if @trailer_io
-            return @trailer_io.read(length, buf)
-          end
+        def read(_length = nil, buf = nil)


What would be the benefit of it being dynamic? If we're always expecting 16KB and this is an established expectation, I think it's fine that we don't use the length input. Not sure how feasible this is, but we could consider removing the parameter altogether in the future so as not to cause confusion when using this method.

richardwang1124 · 2025-12-09T17:04:13Z

gems/aws-sdk-s3/spec/client_spec.rb

+          # Need to discuss
+          expect(resp.context.http_request.body.instance_variable_get(:@io).read).to eq(data)


If it's feasible, it may be better to swap the body back to the original IO object to minimize potentially breaking customers. However, I'm assuming most of these details should be API private?

alextwoods · 2025-12-09T19:26:29Z

gems/aws-sdk-core/lib/aws-sdk-core/plugins/checksum_algorithm.rb

          headers[header_name] = calculate_checksum(
            checksum_properties[:algorithm],
-            body
+            context.http_request.body


Do we need to do a rewind after calculating as well?

alextwoods · 2025-12-09T19:29:50Z

gems/aws-sdk-core/lib/aws-sdk-core/plugins/checksum_algorithm.rb

+          @digest = ChecksumAlgorithm.digest_for_algorithm(@algorithm)
+          @chunk_size = Thread.current[:net_http_override_body_stream_chunk] || MIN_CHUNK_SIZE
+          @overhead_bytes = calculate_overhead(@chunk_size)
+          @max_chunk_size = @chunk_size - @overhead_bytes


I find max_chunk_size to be a slightly confusing name. Why is it less than the chunk size? This is actually the chunk size that we're reading from the underlying IO, right? So maybe something like base_chunk_size or underlying_chunk_size?

jterapin added 13 commits November 17, 2025 12:22

Update checksum plugin to not load body into memory

da5692a

Add customization

20bd2b8

Merge branch 'version-3' into unsigned-body

392d536

Fix trailer impl to include overhead bytes

1a761ec

Merge branch 'version-3' into unsigned-body

c09bb65

Update net http patches

dc9b603

Merge branch 'version-3' into unsigned-body

192d651

Update trailer impl with patch

5d453b3

Update MPU with custom chunk size

81dfafd

Generated S3 gem

14b7f34

Fix failing tests

06e08f2

Add temp changelog

ad14d42

Add http_chunk_size spec

de72cf1

aws deleted a comment from github-actions bot Dec 7, 2025

jterapin added 2 commits December 7, 2025 14:32

Improve documentation

4abf3f8

Fix changelog entries

4fbff2c

jterapin commented Dec 7, 2025

jterapin marked this pull request as ready for review December 8, 2025 17:36

jterapin changed the title ~~[WIP] Support S3 unsigned body~~ Support S3 unsigned body Dec 8, 2025

jterapin commented Dec 8, 2025

richardwang1124 approved these changes Dec 9, 2025

alextwoods approved these changes Dec 9, 2025

3 participants
