swift 6.0+ performance tweaks 3x-6x #9067

Open
blindspotbounty wants to merge 4 commits into google:master from ordo-one:swift-performance-tweaks

@blindspotbounty
Contributor

We were experimenting with the latest changes to the Swift FlatBuffers runtime and found several tweaks that make it significantly faster:

  1. Remove exclusivity checks by making Blob a let
  2. Mark Blob as @frozen for library evolution mode
  3. Mark Blob as ~Copyable to avoid compiler-inserted copies (Swift 6.0+)
  4. Add BitwiseCopyable annotation for directly read types (Swift 6.0+)
  5. Add exclusivity(unchecked) for FlatbuffersBuilder, assuming it is always accessed exclusively anyway
  6. Fix func duplicate so that it does not crash in debug builds with the default value
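
To make tweaks 1 to 4 concrete, here is a minimal sketch of how the annotations fit together. The `Blob` type below is a hypothetical stand-in written for illustration, not the runtime's actual `Blob`, and `readFirstScalar` is an invented helper. It assumes a Swift 6.0+ compiler, because `BitwiseCopyable` is not available earlier.

```swift
// Hypothetical stand-in for illustration only; not the FlatBuffers runtime type.
@frozen
public struct Blob: ~Copyable {
  // Stored as `let`: the value is never mutated after init,
  // so reads do not need dynamic exclusivity enforcement.
  public let bytes: [UInt8]

  public init(bytes: [UInt8]) {
    self.bytes = bytes
  }

  // `T: BitwiseCopyable` limits reads to types that can be copied bit by bit.
  @inline(__always)
  public borrowing func read<T: BitwiseCopyable>(_ type: T.Type, at offset: Int) -> T {
    precondition(offset >= 0 && offset + MemoryLayout<T>.size <= bytes.count)
    return bytes.withUnsafeBytes {
      $0.loadUnaligned(fromByteOffset: offset, as: T.self)
    }
  }
}

// Invented helper: reads a little-endian UInt32 from the start of a buffer.
func readFirstScalar() -> UInt32 {
  let blob = Blob(bytes: [0x2A, 0x00, 0x00, 0x00])
  // `blob` is only borrowed by `read`; being ~Copyable, it cannot be copied implicitly.
  return UInt32(littleEndian: blob.read(UInt32.self, at: 0))
}
```

Because the struct is `~Copyable`, any accidental copy becomes a compile-time error instead of a silent retain/release pair, which is the effect tweak 3 is after.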

Benchmarks give many false positive/negative results when comparing main against main on my machine (i.e. the deviation exceeds ±5%), I guess mainly due to allocations.
However, here are the main improvements:

==============================================================
Threshold deviations for FlatbuffersBenchmarks:Reading Doubles
==============================================================
╒══════════════════════════════════════════╤═════════════════╤═════════════════╤═════════════════╤═════════════════╕
│ Time (wall clock) (ms, %)                │          tweaks │            main │    Difference % │     Threshold % │
╞══════════════════════════════════════════╪═════════════════╪═════════════════╪═════════════════╪═════════════════╡
│ p25                                      │               5 │              52 │            1051 │               5 │
├──────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ p50                                      │               5 │              52 │            1039 │               5 │
├──────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ p75                                      │               5 │              53 │            1036 │               5 │
╘══════════════════════════════════════════╧═════════════════╧═════════════════╧═════════════════╧═════════════════╛

╒══════════════════════════════════════════╤═════════════════╤═════════════════╤═════════════════╤═════════════════╕
│ Time (total CPU) (ms, %)                 │          tweaks │            main │    Difference % │     Threshold % │
╞══════════════════════════════════════════╪═════════════════╪═════════════════╪═════════════════╪═════════════════╡
│ p25                                      │               5 │              52 │            1050 │               5 │
├──────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ p50                                      │               5 │              53 │            1040 │               5 │
├──────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ p75                                      │               5 │              53 │            1036 │               5 │
╘══════════════════════════════════════════╧═════════════════╧═════════════════╧═════════════════╧═════════════════╛

╒══════════════════════════════════════════╤═════════════════╤═════════════════╤═════════════════╤═════════════════╕
│ Releases (K, %)                          │          tweaks │            main │    Difference % │     Threshold % │
╞══════════════════════════════════════════╪═════════════════╪═════════════════╪═════════════════╪═════════════════╡
│ p25                                      │               0 │            1000 │        50000000 │               5 │
├──────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ p50                                      │               0 │            1000 │        50000000 │               5 │
├──────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ p75                                      │               0 │            1000 │        50000000 │               5 │
╘══════════════════════════════════════════╧═════════════════╧═════════════════╧═════════════════╧═════════════════╛

For serialization I probably need to add one more test. However the difference is quite noticeable.

Was:
image

Became:
image

All improvements are mainly due to removing Swift runtime calls: exclusivity checks and runtime metadata access.
In practice, the improvements together give up to a 3x-6x gain at scale.
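
As a rough illustration of where exclusivity checks come from, the sketch below contrasts a `var` and a `let` stored property on a class. Both types and both `sum` functions are hypothetical and not part of the FlatBuffers runtime. Accesses to a class's `var` property are subject to dynamic exclusivity enforcement unless the optimizer can prove them safe, while a `let` property can never be modified and so needs no such check.

```swift
// Hypothetical types for illustration; not part of the FlatBuffers runtime.
final class MutableHeader {
  var count: Int  // `var` on a class: accesses may be checked dynamically
  init(count: Int) { self.count = count }
}

final class ImmutableHeader {
  let count: Int  // `let`: immutable after init, no exclusivity check needed
  init(count: Int) { self.count = count }
}

// Reads the `var` property in a loop; each read is a potential checked access.
func sum(_ header: MutableHeader, iterations: Int) -> Int {
  var total = 0
  for _ in 0..<iterations { total &+= header.count }
  return total
}

// Same loop over the `let` property; the results are identical.
func sum(_ header: ImmutableHeader, iterations: Int) -> Int {
  var total = 0
  for _ in 0..<iterations { total &+= header.count }
  return total
}
```

Both functions return the same value; any difference would only show up in the generated code and timings, which this sketch does not measure.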

@mustiikhalil I put all tweaks that we have currently together.
Let me know if I should split them or if something is going to be implemented another way (or if I need to make separate cases for those improvements).

@mustiikhalil
Collaborator

@blindspotbounty

1- is there any chance that you also add or investigate the same changes to the flexbuffer implementation?
2- Can we wait on #8983? It will reduce the number of #if

@hassila
Contributor

hassila commented Apr 28, 2026

Sorry for the extra ping, I was a bit trigger-happy there and missed that you were already in, @mustiikhalil. I will let @blindspotbounty respond.

@hassila
Contributor

hassila commented Apr 28, 2026

> 2- Can we wait on #8983? It will reduce the number of #if

When do you think it will go in approx? Looks like a great update, 5.10 and older is more or less gone now I think.

@mustiikhalil
Collaborator

mustiikhalil commented Apr 28, 2026

> When do you think it will go in approx? Looks like a great update, 5.10 and older is more or less gone now I think.

Not sure, I pinged for help to review it, so let's see! If it doesn't get reviewed today or tomorrow, then we can merge this and update the other branch.

@mustiikhalil mustiikhalil self-requested a review April 28, 2026 15:32
@blindspotbounty
Contributor Author

> @blindspotbounty
>
> 1- is there any chance that you also add or investigate the same changes to the flexbuffer implementation?
> 2- Can we wait on #8983? It will reduce the number of #if

Yeah, why not!
I'll take a look at #8983.
I can also look at FlexBuffers (can't promise this week, though).

Collaborator

@mustiikhalil mustiikhalil left a comment

Amazing work! Some minor changes are needed, and of course the CI hasn't passed.

Comment on lines +50 to +51
@usableFromInline
var serializeDefaults: Bool
Collaborator

I wonder, can this simply be a let?

Comment on lines +43 to +46
case .data(let data):
self = .data(data)
case .bytes(let contiguousBytes):
self = .bytes(contiguousBytes)
Collaborator

This requires the "OS is not Wasm" flag.

Comment on lines +348 to +359
#if compiler(>=6.0)
  @inline(__always)
  init(
    blob: borrowing Storage.Blob,
    count: Int,
    removing removeBytes: Int)
  {
    _storage = Storage(blob: blob, capacity: count)
    _readerIndex = removeBytes
    capacity = count
  }
#else
Collaborator

I'm wondering if we can reduce the number of #if compiler checks, especially for language features that are already available in 5.10, for example borrowing.

Contributor Author

I think I can wait for #8983 and remove all swift>=6.0 annotations altogether.
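
The point raised in this thread can be sketched as follows: the `borrowing` parameter modifier (SE-0377) and `~Copyable` structs (SE-0390) have been available since Swift 5.9, so only declarations that depend on `BitwiseCopyable` (SE-0426, Swift 6.0) need a compiler guard. The `Storage` type and both functions below are hypothetical examples, not the code under review.

```swift
// Hypothetical type for illustration; not the runtime's Storage.
struct Storage: ~Copyable {
  let bytes: [UInt8]
}

// `borrowing` compiles on Swift 5.9 and later, so no `#if compiler(>=6.0)` is needed here.
@inline(__always)
func remainingBytes(in storage: borrowing Storage, removing removeBytes: Int) -> Int {
  storage.bytes.count - removeBytes
}

#if compiler(>=6.0)
// BitwiseCopyable only exists from Swift 6.0, so this declaration keeps its guard.
func scalarSize<T: BitwiseCopyable>(_ type: T.Type) -> Int {
  MemoryLayout<T>.size
}
#else
// Fallback for older compilers, without the constraint.
func scalarSize<T>(_ type: T.Type) -> Int {
  MemoryLayout<T>.size
}
#endif

// Invented helper that exercises both functions.
func guardDemo() -> (remaining: Int, size: Int) {
  let storage = Storage(bytes: [1, 2, 3, 4, 5, 6, 7, 8])
  return (remainingBytes(in: storage, removing: 3), scalarSize(Double.self))
}
```

Once the minimum supported toolchain is Swift 6.0, the `#else` branch and the guard itself can be dropped.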

Comment thread swift/Sources/FlatBuffers/ByteBuffer.swift
@mustiikhalil
Collaborator

mustiikhalil commented May 6, 2026

@blindspotbounty The Swift upgrade PR has been merged! Please pull and update the current PR with the required changes.

We will need to check if the gRPC changes will run with the new code change that you've done.

Then it would be interesting to look at the changes for flexbuffers (I use those more than flatbuffers nowadays)
