part: optimize node16 findIndex with SIMD instructions #145

Draft
dylandreimerink wants to merge 1 commit into main from feature/part-simd

@dylandreimerink
Member

Go v1.26 introduced the experimental simd/archsimd package. This package provides a convenient API for using SIMD instructions in Go on supported architectures.

The academic paper on which our part implementation is based suggested using SIMD instructions to optimize the findIndex method of node16. This commit implements this optimization using the simd/archsimd package.

The algorithm loads the node's 16 key bytes into a SIMD register, then broadcasts the search key across all lanes of a second register so that all 16 bytes in that register hold the search key. A SIMD equality comparison between the two registers produces a mask in which each bit indicates whether the corresponding lane matched. Because a node16 may hold fewer than 16 keys, we mask off the bits for unused lanes. If the resulting mask is zero, we repeat the comparison with the greater-than operation to determine the index at which the search key would have to be inserted to maintain sorted order. In both the equality and greater-than cases, counting the trailing zeros of the mask yields the array index.
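To make the mask logic concrete, here is a minimal scalar sketch of the same steps in portable Go. It does not use simd/archsimd: the per-lane loop stands in for the two SIMD comparisons, and the function name `findIndex16`, its signature, and the explicit `n` parameter are assumptions for illustration rather than the code in this PR.

```go
package main

import (
	"fmt"
	"math/bits"
)

// findIndex16 emulates the SIMD search with plain integer masks.
// keys holds up to 16 sorted key bytes, of which the first n are in use.
// It returns the index of key and true when key is present, otherwise the
// index at which key would be inserted to keep the keys sorted, and false.
func findIndex16(keys *[16]byte, n int, key byte) (int, bool) {
	// Build one bit per lane for "equal" and "greater than".
	// In the SIMD version each mask comes from a single compare instruction.
	var eq, gt uint16
	for i, k := range keys {
		if k == key {
			eq |= 1 << i
		}
		if k > key {
			gt |= 1 << i
		}
	}

	// Keep only the bits of lanes that actually hold keys.
	// For n == 16 the shift wraps to 0 and the subtraction yields 0xFFFF.
	valid := uint16(1)<<n - 1

	if m := eq & valid; m != 0 {
		// The lowest set bit is the matching lane.
		return bits.TrailingZeros16(m), true
	}
	if m := gt & valid; m != 0 {
		// The lowest set bit is the first key larger than the search key.
		return bits.TrailingZeros16(m), false
	}
	// No key is larger, so the search key belongs at the end.
	return n, false
}

func main() {
	keys := [16]byte{2, 4, 6, 8}
	fmt.Println(findIndex16(&keys, 4, 6)) // exact match at index 2
	fmt.Println(findIndex16(&keys, 4, 5)) // not found, would be inserted at index 2
	fmt.Println(findIndex16(&keys, 4, 9)) // not found, would be appended at index 4
}
```

Note that the unused lanes in this example are zero, so a search for key 0 sets equality bits in lanes 4 to 15; masking with `valid` is what keeps those from being reported as matches.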

Since the simd/archsimd package is experimental, the implementation has to live in a separate file guarded by a build tag. This should no longer be necessary once the simd/archsimd package is stabilized and can be used without a build tag. In the meantime, this setup lets anyone take advantage of the optimization by building with GOEXPERIMENT=simd.
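As a rough illustration of the file split, the fallback side could look like the sketch below. The constraint name `goexperiment.simd` follows Go's convention of exposing each enabled GOEXPERIMENT as a `goexperiment.<name>` build tag; the function name `findIndex16Generic`, its signature, and the linear scan are assumptions for illustration, not the code in this PR.

```go
//go:build !goexperiment.simd

// This file is compiled only when the build does NOT enable GOEXPERIMENT=simd.
// A sibling file constrained by "goexperiment.simd" would provide the SIMD
// version of the same function, so exactly one definition is ever built.
package main

import "fmt"

// findIndex16Generic is a hypothetical scalar fallback. It scans the sorted
// keys and returns the index of key and true when found, otherwise the
// insertion index that keeps the keys sorted, and false.
func findIndex16Generic(keys []byte, key byte) (int, bool) {
	for i, k := range keys {
		if k >= key {
			return i, k == key
		}
	}
	return len(keys), false
}

func main() {
	keys := []byte{2, 4, 6, 8}
	fmt.Println(findIndex16Generic(keys, 6))
	fmt.Println(findIndex16Generic(keys, 5))
}
```

With this layout a plain `go build` picks the fallback, while `GOEXPERIMENT=simd go build` would select the SIMD file instead.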

Benchmarks on my machine show findIndex on 16-byte nodes taking about 53% less time per operation (7.346 ns down to 3.454 ns).

benchstat before.txt after.txt
goos: linux
goarch: amd64
pkg: github.com/cilium/statedb/part
cpu: 13th Gen Intel(R) Core(TM) i7-13800H
                │ before.txt  │              after.txt              │
                │   sec/op    │   sec/op     vs base                │
_findIndex16-20   7.346n ± 1%   3.454n ± 2%  -52.98% (p=0.000 n=10)

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
@github-actions

$ make
go build ./...
go: downloading go1.26.0 (linux/amd64)
go: downloading go.yaml.in/yaml/v3 v3.0.3
go: downloading github.com/cilium/hive v0.0.0-20250731144630-28e7a35ed227
go: downloading golang.org/x/time v0.5.0
go: downloading github.com/spf13/cobra v1.8.0
go: downloading github.com/spf13/pflag v1.0.5
go: downloading github.com/cilium/stream v0.0.0-20240209152734-a0792b51812d
go: downloading github.com/liggitt/tabwriter v0.0.0-20181228230101-89fcab3d43de
go: downloading github.com/spf13/viper v1.18.2
go: downloading go.uber.org/dig v1.17.1
go: downloading golang.org/x/term v0.16.0
go: downloading github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc
go: downloading github.com/mitchellh/mapstructure v1.5.0
go: downloading golang.org/x/sys v0.17.0
go: downloading golang.org/x/tools v0.17.0
go: downloading github.com/spf13/cast v1.6.0
go: downloading github.com/fsnotify/fsnotify v1.7.0
go: downloading github.com/sagikazarmark/slog-shim v0.1.0
go: downloading github.com/spf13/afero v1.11.0
go: downloading github.com/subosito/gotenv v1.6.0
go: downloading github.com/hashicorp/hcl v1.0.0
go: downloading gopkg.in/ini.v1 v1.67.0
go: downloading github.com/magiconair/properties v1.8.7
go: downloading github.com/pelletier/go-toml/v2 v2.1.0
go: downloading gopkg.in/yaml.v3 v3.0.1
go: downloading golang.org/x/text v0.14.0
STATEDB_VALIDATE=1 go test ./... -cover -vet=all -test.count 1
go: downloading github.com/stretchr/testify v1.8.4
go: downloading go.uber.org/goleak v1.3.0
go: downloading golang.org/x/exp v0.0.0-20240119083558-1b970713d09a
go: downloading github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2
# github.com/cilium/statedb/reconciler/benchmark
# internal/goarch
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/byteorder
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/unsafeheader
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/coverage/rtcov
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/godebugs
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/goexperiment
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/cpu
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/goos
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/profilerecord
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# math/bits
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/msan
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/asan
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/trace/tracev2
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/runtime/pprof/label
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# cmp
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# sync/atomic
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# unicode/utf8
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# log/internal
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# unicode
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/coverage
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/coverage/calloc
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/coverage/uleb128
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# encoding
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# github.com/cilium/statedb/reconciler/example
# internal/goarch
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/unsafeheader
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/byteorder
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/coverage/rtcov
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/godebugs
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/goos
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/goexperiment
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/cpu
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/profilerecord
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# math/bits
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/asan
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/runtime/pprof/label
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/msan
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/trace/tracev2
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# unicode
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# unicode/utf8
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# sync/atomic
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# cmp
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# log/internal
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/coverage
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/coverage/uleb128
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# encoding
compile: version "go1.26.0" does not match go tool version "go1.25.6"
# internal/coverage/calloc
compile: version "go1.26.0" does not match go tool version "go1.25.6"
ok  	github.com/cilium/statedb	417.878s	coverage: 78.6% of statements
ok  	github.com/cilium/statedb/index	0.007s	coverage: 33.7% of statements
ok  	github.com/cilium/statedb/internal	0.045s	coverage: 42.9% of statements
ok  	github.com/cilium/statedb/lpm	4.458s	coverage: 77.6% of statements
ok  	github.com/cilium/statedb/part	62.937s	coverage: 87.2% of statements
ok  	github.com/cilium/statedb/reconciler	0.257s	coverage: 91.9% of statements
make: *** [Makefile:9: test] Error 1
