betteralign

About

betteralign is a tool that detects structs that would use less memory if their fields were reordered, and optionally rewrites them.

It is a fork of the official Go fieldalignment tool, with the bulk of the alignment logic unchanged. Its notable differences group into a few themes:

Why this exists

  • preserves comments (field comments, doc comments, and floating comments) — the original fieldalignment erases them on rewrite.

Safety and robustness

  • detects positional composite literals (T{1, 2, 3} rather than T{a: 1, b: 2, c: 3}) anywhere in the package and reports the would-be saving without rewriting the struct — reordering its fields would silently re-map the literal's elements at the next build,
  • analyzes only named struct types declared with type T struct { ... }; anonymous structs (nested struct-typed fields, struct literals, var x struct{...} declarations) are skipped, so it never rewrites the riskier unnamed shapes,
  • when -apply rewrites a struct, fields declared with multiple names (A, B int) are kept together as a single grouped declaration and moved as a unit — the byte-span reorder never splits them,
  • performs atomic file writes on rewrite to prevent corruption or data loss (except on Windows),
  • uses a stable sort for field ordering, so fields of equal sort rank preserve their original relative order; upstream's sort.Slice is unstable and may permute equal-ranked fields differently between runs,
  • fixes a crash on packages with generic struct fields — upstream's ptrdata implementation ends with panic("impossible"), which fires when an uninstantiated type parameter appears as a field type (because (*types.TypeParam).Underlying() returns itself, falling through every switch case); betteralign replaces the panic with a conservative one-word fallback.
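
The positional-literal hazard described in the list above can be reproduced in a few lines. This is a minimal sketch: PairBefore and PairAfter are hypothetical names standing for one struct before and after a reorder, and the literal uses untyped constants so that both versions compile.

```go
package main

import "fmt"

// PairBefore is a hypothetical struct in its declared field order.
type PairBefore struct {
	A int32
	B int64
	C int32
}

// PairAfter holds the same fields after a descending-alignment reorder.
type PairAfter struct {
	B int64
	A int32
	C int32
}

// build constructs both variants from the same positional literal and
// returns the value that ends up in field A of each.
func build() (before, after int32) {
	b := PairBefore{1, 2, 3} // A=1, B=2, C=3
	a := PairAfter{1, 2, 3}  // B=1, A=2, C=3: same literal, different mapping
	return b.A, a.A
}

func main() {
	before, after := build()
	fmt.Println(before, after) // prints "1 2"
}
```

With a keyed literal such as PairBefore{A: 1, B: 2, C: 3}, the mapping would not depend on declaration order, which is why converting to keyed form makes the reorder safe.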

Opt-in / opt-out controls

  • skips generated files, identified either by a known suffix (_generated.go, _gen.go, .gen.go, .pb.go, .pb.gw.go) or by a package-level comment containing Code generated by ... DO NOT EDIT.,
  • skips test files (_test.go suffix),
  • skips structs annotated with // betteralign:ignore placed inside the struct body,
  • supports opt-in mode, where only structs annotated with // betteralign:check on the type declaration are checked; placing the directive on a surrounding type ( ... ) block opts in every spec inside the group.
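
As a rough illustration of where the two directives go, the sketch below follows the placement described in the list above. The type names are hypothetical, and the exact position the tool expects within a struct body or doc comment is an assumption here.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Legacy is a hypothetical struct whose declared order must be kept,
// so the ignore directive sits inside the struct body.
type Legacy struct {
	// betteralign:ignore
	Flag  bool
	ID    int64
	Count int32
}

// Hot is a hypothetical struct opted in for checking under -opt_in.
// betteralign:check
type Hot struct {
	Flag  bool
	ID    int64
	Count int32
}

// betteralign:check
type (
	// The directive on the surrounding block opts in both specs.
	First  struct{ A, B int64 }
	Second struct{ C, D int64 }
)

// legacySize reports the size of Legacy in its declared layout.
func legacySize() uintptr { return unsafe.Sizeof(Legacy{}) }

func main() {
	// Both structs still have the unoptimized layout in this file.
	fmt.Println(legacySize(), unsafe.Sizeof(Hot{})) // 24 24 on a 64-bit platform
}
```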

Other improvements

  • caches per-type size, alignment, and pointer-scan results within each analysis pass, avoiding redundant recomputation for types that recur across many struct fields (common in protobuf-generated code),
  • prefixes diagnostics with the actual byte saving (8 bytes saved: struct of size 24 could be 16) rather than only the before/after sizes,
  • includes more thorough tests comparing expected versus golden output,
  • adapts automatically to CPU and memory constraints in containerized environments (Docker, Kubernetes, LXC, LXD, etc.).

Comment preservation is handled by a small in-tree package, internal/dstmin. It decorates the parser's *ast.File with byte-range spans for each field's lead-doc, body and trailing blanks, then reprints the file by byte-splicing the synthesized struct bodies into the original source and finishing with go/format.Source for column realignment. It replaces a previous dependency on sirkon/dst with ~900 lines of focused code; see internal/dstmin/README.md for the design rationale and benchmark comparison.

Because each rewrite regenerates the whole file, partial rewrites via SuggestedFixes are not possible. The -fix flag from the analysis package is therefore treated as an alias for -apply, and auto-fix integration with golangci-lint is not supported.
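
The splice-then-format idea can be shown with a toy example. This is not the internal/dstmin code: the reorder helper below is hypothetical, works on whole lines rather than AST-derived byte spans, and assumes a single struct with one field per line. It only demonstrates that trailing comments travel with their fields and that go/format.Source normalizes the result.

```go
package main

import (
	"fmt"
	"go/format"
	"strings"
)

const src = `package p

type Record struct {
	Flag  bool  // set when active
	ID    int64 // primary key
	Count int32 // hits
}
`

// reorder splices the struct's field lines back into source in the given
// order, then lets go/format.Source realign the columns.
func reorder(source string, order []int) (string, error) {
	start := strings.Index(source, "{\n") + 2 // first byte of the struct body
	end := strings.LastIndex(source, "}")     // closing brace of the struct
	lines := strings.Split(strings.TrimSuffix(source[start:end], "\n"), "\n")

	var body strings.Builder
	for _, i := range order {
		body.WriteString(lines[i]) // the trailing comment moves with its field
		body.WriteByte('\n')
	}
	out, err := format.Source([]byte(source[:start] + body.String() + source[end:]))
	return string(out), err
}

func main() {
	out, err := reorder(src, []int{1, 2, 0}) // ID, Count, Flag
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```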

Go's standard AST does not associate comments with nodes — it only stores byte offsets, so the original fieldalignment tool erases all comments. A proposed fix exists as an open CL but has not yet been merged.

Note: This is a single-pass tool. Achieving a fully optimal layout may require running it more than once.

This tool builds upon the following prior work:

Deep dive

betteralign pursues two goals:

  1. Minimize struct size by sorting fields in descending alignment order and placing zero-sized fields first, reducing internal padding (and avoiding the one-byte tail the compiler adds when a struct ends in a zero-sized field).
  2. Reduce GC pointer scan overhead by grouping pointer-bearing fields before pointer-free ones (the GC stops scanning at the last pointer in a value).

With those goals in mind, fields are sorted stably by the following criteria, in order:

  1. Zero-sized types first.
  2. Higher alignment first.
  3. Pointer-bearing types before pointer-free types.
  4. Among pointer-bearing types, those with fewer trailing non-pointer bytes first (minimizing GC ptrdata).
  5. Larger size first.
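
The five criteria above can be expressed as a single stable comparison. The following is a minimal sketch rather than the tool's implementation: the field type and its hand-entered size, alignment, and ptrdata values (for a 64-bit platform) are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// field is a hypothetical summary of one struct field.
type field struct {
	name    string
	size    int64 // bytes occupied
	align   int64 // required alignment
	ptrdata int64 // leading bytes the GC may scan; 0 means pointer-free
}

// order returns the field names sorted stably by the five criteria.
func order(fields []field) []string {
	fs := append([]field(nil), fields...) // sort a copy
	sort.SliceStable(fs, func(i, j int) bool {
		a, b := fs[i], fs[j]
		// 1. Zero-sized types first.
		if (a.size == 0) != (b.size == 0) {
			return a.size == 0
		}
		// 2. Higher alignment first.
		if a.align != b.align {
			return a.align > b.align
		}
		// 3. Pointer-bearing types before pointer-free types.
		if (a.ptrdata != 0) != (b.ptrdata != 0) {
			return a.ptrdata != 0
		}
		// 4. Among pointer-bearing types, fewer trailing non-pointer bytes first.
		if a.ptrdata != 0 {
			if ta, tb := a.size-a.ptrdata, b.size-b.ptrdata; ta != tb {
				return ta < tb
			}
		}
		// 5. Larger size first.
		return a.size > b.size
	})
	names := make([]string, len(fs))
	for i, f := range fs {
		names[i] = f.name
	}
	return names
}

func main() {
	fmt.Println(order([]field{
		{"Flag", 1, 1, 0},
		{"ID", 8, 8, 0},
		{"Count", 4, 4, 0},
	})) // [ID Count Flag]
	fmt.Println(order([]field{
		{"IP", 24, 8, 8},   // []byte
		{"Zone", 16, 8, 8}, // string
	})) // [Zone IP]
}
```

Using sort.SliceStable means fields that tie on all five criteria keep their declared relative order.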

Size: sorting by descending alignment

type Record struct {
	Flag  bool  // 1 byte,  1-byte aligned
	ID    int64 // 8 bytes, 8-byte aligned
	Count int32 // 4 bytes, 4-byte aligned
}

As declared, Record is 24 bytes on a 64-bit platform: Flag sits at offset 0, then 7 padding bytes appear so ID can land on an 8-byte boundary; Count follows at offset 16, and the struct is rounded up to the next multiple of its highest alignment (8), adding 4 trailing padding bytes.

After optimalOrder sorts by descending alignment, the layout becomes ID, Count, Flag and shrinks to 16 bytes: ID at offset 0, Count at offset 8, Flag at offset 12, plus 3 trailing padding bytes to round to 16. No internal padding is needed because each field already lands on a correctly-aligned offset. Same data, 33% less memory.
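
The numbers above can be checked with unsafe.Sizeof and unsafe.Offsetof. In this small sketch RecordSorted is a hypothetical name for the reordered layout, and the values in the comments assume a 64-bit platform such as amd64 or arm64.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Record is the layout as declared in the example.
type Record struct {
	Flag  bool
	ID    int64
	Count int32
}

// RecordSorted holds the same fields in descending alignment order.
type RecordSorted struct {
	ID    int64
	Count int32
	Flag  bool
}

// sizes returns the size in bytes of both layouts.
func sizes() (declared, sorted uintptr) {
	return unsafe.Sizeof(Record{}), unsafe.Sizeof(RecordSorted{})
}

func main() {
	var r Record
	var s RecordSorted
	// Offsets of Flag, ID, Count as declared: 0 8 16
	fmt.Println(unsafe.Offsetof(r.Flag), unsafe.Offsetof(r.ID), unsafe.Offsetof(r.Count))
	// Offsets of ID, Count, Flag after sorting: 0 8 12
	fmt.Println(unsafe.Offsetof(s.ID), unsafe.Offsetof(s.Count), unsafe.Offsetof(s.Flag))
	d, o := sizes()
	fmt.Println(d, o) // 24 16
}
```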

GC scan: ordering pointer-bearing fields by trailing non-pointer bytes

type IPAddr struct {
	Zone string // 16 bytes, 8-byte aligned
	IP   []byte // 24 bytes, 8-byte aligned
}

Zone comes first because string has fewer trailing non-pointer bytes than []byte: Sizeof(string) - ptrdata(string) = 8 versus Sizeof([]byte) - ptrdata([]byte) = 16. The struct is 40 bytes either way on a 64-bit platform, but with Zone first the GC only scans the first 24 bytes for pointers; reversing the fields would push that to 32.
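
Go has no public API for ptrdata, so the sketch below approximates it with reflect in order to check the 24 versus 32 figures. The ptrdata helper is a simplified, hypothetical reimplementation (not the tool's code), and IPAddrReversed is a hypothetical name for the reversed layout; a 64-bit platform is assumed.

```go
package main

import (
	"fmt"
	"reflect"
)

type IPAddr struct {
	Zone string
	IP   []byte
}

// IPAddrReversed holds the same fields in the opposite order.
type IPAddrReversed struct {
	IP   []byte
	Zone string
}

// ptrdata approximates how many leading bytes of a value of type t the
// garbage collector may have to scan for pointers.
func ptrdata(t reflect.Type) uintptr {
	word := reflect.TypeOf(uintptr(0)).Size()
	switch t.Kind() {
	case reflect.Ptr, reflect.UnsafePointer, reflect.Chan, reflect.Map, reflect.Func:
		return word
	case reflect.String, reflect.Slice:
		return word // the data pointer is the first word; len and cap are not pointers
	case reflect.Interface:
		return 2 * word
	case reflect.Array:
		if t.Len() == 0 {
			return 0
		}
		if p := ptrdata(t.Elem()); p != 0 {
			return uintptr(t.Len()-1)*t.Elem().Size() + p
		}
		return 0
	case reflect.Struct:
		var end uintptr
		for i := 0; i < t.NumField(); i++ {
			f := t.Field(i)
			if p := ptrdata(f.Type); p != 0 {
				end = f.Offset + p // the last pointer-bearing field decides
			}
		}
		return end
	}
	return 0 // integers, floats, bools, and other pointer-free kinds
}

// scanBytes returns the pointer bytes of both field orders.
func scanBytes() (zoneFirst, ipFirst uintptr) {
	return ptrdata(reflect.TypeOf(IPAddr{})), ptrdata(reflect.TypeOf(IPAddrReversed{}))
}

func main() {
	z, i := scanBytes()
	fmt.Println(reflect.TypeOf(IPAddr{}).Size(), z, i) // 40 24 32
}
```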

Runtime tuning with GOGC

The analyzer itself is allocation-light, but loading and type-checking a package graph (go/types, the AST inspector, and the parser) produces a large volume of short-lived garbage — on a real-world run, well over 1 GiB of churn against only a few MiB of live heap at exit. Because betteralign is a short-lived batch process, the default GOGC=100 spends CPU collecting garbage that would be freed at exit anyway. Running with garbage collection disabled trades that work for higher peak memory:

GOGC=off betteralign ./...

On a large warm-cache run, GOGC=off measured roughly 15–16% faster in wall-clock time, and GOGC=400 about 13% faster, since most of the runtime's GC work simply disappears.

Caution

GOGC=off lets the heap grow unbounded for the duration of the run. In memory-constrained containers (e.g. a Kubernetes pod with a low memory limit) this can trigger an OOM kill on a large package graph. betteralign already sets a soft GOMEMLIMIT (90% of the cgroup/system memory) which acts as a backstop, but on tight limits prefer a high finite value such as GOGC=400 over fully disabling collection, or leave the default.
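
For readers unfamiliar with these knobs, the same settings can be applied from inside a Go program through runtime/debug. The sketch below only shows the mechanism: the softLimit helper and the 2 GiB total are hypothetical, and reading the real cgroup or system memory limit is not shown.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// softLimit returns 90% of total bytes, the ratio mentioned above.
func softLimit(total int64) int64 {
	return total / 10 * 9
}

func main() {
	const total = 2 << 30 // hypothetical 2 GiB container memory limit

	// Equivalent to setting GOMEMLIMIT: a soft cap that makes the GC work
	// harder as the heap approaches the limit.
	debug.SetMemoryLimit(softLimit(total))

	// Equivalent to GOGC=400; returns the previous percentage.
	old := debug.SetGCPercent(400)

	// A negative argument leaves the limit unchanged and returns it.
	fmt.Println(debug.SetMemoryLimit(-1), old)
}
```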

Installation

Manual: Download the appropriate binary from the releases page and place it in a directory on your PATH, typically as /usr/local/bin/betteralign.

Via go install:

go install github.com/dkorunic/betteralign/cmd/betteralign@latest

Usage

betteralign: find structs that would use less memory if their fields were sorted

Usage: betteralign [-flag] [package]

This analyzer find structs that can be rearranged to use less memory, and provides
a suggested edit with the most compact order.

Note that there are two different diagnostics reported. One checks struct size,
and the other reports "pointer bytes" used. Pointer bytes is how many bytes of the
object that the garbage collector has to potentially scan for pointers, for example:

        struct { uint32; string }

have 16 pointer bytes because the garbage collector has to scan up through the string's
inner pointer.

        struct { string; *uint32 }

has 24 pointer bytes because it has to scan further through the *uint32.

        struct { string; uint32 }

has 8 because it can stop immediately after the string pointer.

Be aware that the most compact order is not always the most efficient.
In rare cases it may cause two variables each updated by its own goroutine
to occupy the same CPU cache line, inducing a form of memory contention
known as "false sharing" that slows down both goroutines.

Unlike most analyzers, which report likely mistakes, the diagnostics
produced by betteralign very rarely indicate a significant problem,
so the analyzer is not included in typical suites such as vet or
gopls. Use this standalone command to run it on your code:

   $ go install github.com/dkorunic/betteralign/cmd/betteralign@latest
   $ betteralign [packages]

Only named struct types declared with the `type T struct { ... }` form
are analyzed. Anonymous structs (nested struct-typed fields, struct literals,
`var x struct{...}` declarations, and similar unnamed forms) are
skipped. To enable analysis of one of those, lift it into a named type
declaration.

If a struct is constructed somewhere in the package with a positional
composite literal (`T{1, 2, 3}` rather than `T{a: 1, b: 2, c: 3}`),
the reorder is reported but never applied: rewriting the field order would
re-map the literal's elements to different fields, breaking the build (or
worse, silently mis-assigning values when the new field types still happen
to accept the old element types). Convert the literal to keyed form and
rerun to enable the reorder.



Flags:
  -V    print version and exit
  -all
        no effect (deprecated)
  -apply
        apply suggested fixes
  -c int
        display offending line with this many lines of context (default -1)
  -cpuprofile string
        write CPU profile to this file
  -debug string
        debug flags, any subset of "fpstv"
  -diff
        with -fix, don't update the files, but print a unified diff
  -exclude_dirs value
        exclude directories matching a pattern
  -exclude_files value
        exclude files matching a pattern
  -fix
        apply all suggested fixes
  -flags
        print analyzer flags in JSON
  -generated_files
        also check and fix generated files
  -json
        emit JSON output
  -memprofile string
        write memory profile to this file
  -opt_in
        opt-in mode on per-struct basis with 'betteralign:check' in comment
  -source
        no effect (deprecated)
  -tags string
        no effect (deprecated)
  -test
        indicates whether test files should be analyzed, too (default true)
  -test_files
        also check and fix test files
  -trace string
        write trace log to this file
  -v    no effect (deprecated)

To check all packages in the current module:

betteralign ./...

To automatically rewrite files (excluding test and generated files):

betteralign -apply ./...

Generated and test files can be included with the -generated_files and -test_files flags respectively. Use -exclude_dirs and -exclude_files to skip specific paths. Use -opt_in to check only structs explicitly annotated with // betteralign:check.

Structs constructed via positional composite literals are reported but never rewritten under -apply; the diagnostic points at the offending literal so it can be converted to keyed form (T{Field: value}), after which a rerun will enable the reorder.
