locodedb: allocate less, don't leave references to csv data#60
Merged
roman-khimov merged 2 commits into master on Oct 21, 2025
1. We keep references to data allocated by the csv reader code, which means
we're wasting memory and have a lot of useless active objects (this can
be clearly seen in `inuse_objects` of a NeoFS node).
2. A string header alone is 16 bytes (on 64-bit), while we can avoid storing
it at all by writing 3 bytes more into the data blob.
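The idea in point 2 can be illustrated with a minimal sketch. It is not the code from this PR: the `packedDB` type, the record layout (3-byte key, one length byte, payload) and the names `pack` and `get` are all assumptions made for the example. The point is only that the key lives inside the blob, so no per-record string (and no reference into csv-allocated memory) has to be kept.

```go
package main

import (
	"fmt"
	"sort"
)

// keyLen is the assumed size of a key stored inline in the blob.
const keyLen = 3

// packedDB is a hypothetical layout: every record in blob is
// key (keyLen bytes) + payload length (1 byte) + payload,
// and offsets holds record start positions sorted by key.
type packedDB struct {
	blob    []byte
	offsets []uint32
}

// pack copies keys and values into a single blob. Records with a key
// of the wrong size or a value longer than 255 bytes are skipped.
func pack(records map[string]string) packedDB {
	keys := make([]string, 0, len(records))
	for k, v := range records {
		if len(k) == keyLen && len(v) < 256 {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)

	var db packedDB
	for _, k := range keys {
		v := records[k]
		db.offsets = append(db.offsets, uint32(len(db.blob)))
		db.blob = append(db.blob, k...) // the 3 extra bytes per record
		db.blob = append(db.blob, byte(len(v)))
		db.blob = append(db.blob, v...)
	}
	return db
}

// get finds a record by comparing against key bytes stored in the blob.
func (db packedDB) get(key string) (string, bool) {
	i := sort.Search(len(db.offsets), func(i int) bool {
		off := db.offsets[i]
		return string(db.blob[off:off+keyLen]) >= key
	})
	if i == len(db.offsets) {
		return "", false
	}
	off := db.offsets[i]
	if string(db.blob[off:off+keyLen]) != key {
		return "", false
	}
	n := uint32(db.blob[off+keyLen])
	start := off + keyLen + 1
	return string(db.blob[start : start+n]), true
}

func main() {
	db := pack(map[string]string{"MOW": "Moscow", "LED": "Saint Petersburg"})
	fmt.Println(db.get("MOW"))
	fmt.Println(db.get("XXX"))
}
```

The trade-off matches the one described in this PR: lookups do a little more work because keys are read from the blob, while the number of live heap objects no longer grows with the number of records.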
benchstat:
goos: linux
goarch: amd64
pkg: github.com/nspcc-dev/locode-db/pkg/locodedb
cpu: AMD Ryzen 7 PRO 7840U w/ Radeon 780M Graphics
          │  code.old   │              code.new              │
          │   sec/op    │   sec/op     vs base               │
Unpack-16   132.5m ± 2%   130.4m ± 3%       ~ (p=0.052 n=10)
Get-16      351.4n ± 2%   385.9n ± 1%  +9.83% (p=0.000 n=10)
geomean     215.8µ        224.3µ       +3.96%

          │   code.old   │              code.new               │
          │     B/op     │     B/op      vs base               │
Unpack-16   35.63Mi ± 0%   30.24Mi ± 0%  -15.13% (p=0.000 n=10)
Get-16       5.000 ± 0%     5.000 ± 0%        ~ (p=1.000 n=10) ¹
geomean     13.35Ki        12.30Ki        -7.88%
¹ all samples are equal

          │  code.old   │              code.new              │
          │  allocs/op  │  allocs/op   vs base               │
Unpack-16   191.4k ± 0%   191.4k ± 0%  +0.02% (p=0.000 n=10)
Get-16      1.000 ± 0%    1.000 ± 0%        ~ (p=1.000 n=10) ¹
geomean     437.5         437.5        +0.01%
¹ all samples are equal
Get clearly costs a bit more, but 15% less memory is much more valuable
here because this data is not frequently accessed.
No traces of csv-allocated strings left, before:
File: locodedb.test
Build ID: 3b6646cf8a2a939e7bb4055621d8b1b1d4ae096f
Type: inuse_objects
Time: 2025-10-21 12:52:13 MSK
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 86170, 100% of 86171 total
Dropped 1 node (cum <= 430)
Showing top 10 nodes out of 32
flat flat% sum% cum cum%
47515 55.14% 55.14% 47515 55.14% encoding/csv.(*Reader).readRecord
32768 38.03% 93.17% 32768 38.03% runtime.(*timers).addHeap
3277 3.80% 96.97% 3277 3.80% strings.NewReplacer
2308 2.68% 99.65% 2308 2.68% runtime.allocm
302 0.35% 100% 47818 55.49% github.com/nspcc-dev/locode-db/pkg/locodedb.unpackLocodesData
0 0% 100% 47515 55.14% encoding/csv.(*Reader).Read
0 0% 100% 47818 55.49% github.com/nspcc-dev/locode-db/pkg/locodedb.Get
0 0% 100% 47818 55.49% github.com/nspcc-dev/locode-db/pkg/locodedb.initLocodeData
0 0% 100% 47818 55.49% github.com/nspcc-dev/locode-db/pkg/locodedb.initLocodeData.func1
0 0% 100% 47818 55.49% github.com/nspcc-dev/locode-db/pkg/locodedb_test.TestGet.func1
After:
File: locodedb.test
Build ID: 7c43b87dd117ccb052a1a569160cbe68e44dbe25
Type: inuse_objects
Time: 2025-10-21 12:52:02 MSK
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 69083, 100% of 69094 total
Dropped 12 nodes (cum <= 345)
Showing top 10 nodes out of 18
flat flat% sum% cum cum%
65537 94.85% 94.85% 65537 94.85% runtime.(*timers).addHeap
2521 3.65% 98.50% 2521 3.65% net/http.init
1025 1.48% 100% 1025 1.48% runtime.allocm
0 0% 100% 65537 94.85% runtime.(*scavengerState).sleep
0 0% 100% 65537 94.85% runtime.(*timer).maybeAdd
0 0% 100% 65537 94.85% runtime.(*timer).modify
0 0% 100% 65537 94.85% runtime.(*timer).reset (inline)
0 0% 100% 65537 94.85% runtime.bgscavenge
0 0% 100% 2521 3.65% runtime.doInit
0 0% 100% 2521 3.65% runtime.doInit1
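For reference, a heap profile like the ones above can be captured from any Go program with `runtime/pprof`. This is a minimal sketch and an assumption about the workflow: the profiles in this PR come from `locodedb.test`, so they may well have been produced by `go test` profiling flags instead. The `heapProfile` helper name is made up for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// heapProfile is a hypothetical helper that returns the current heap
// profile in the format understood by `go tool pprof`.
func heapProfile() ([]byte, error) {
	// Run a GC first so the profile reflects up-to-date statistics.
	runtime.GC()

	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := heapProfile()
	if err != nil {
		fmt.Println("profile error:", err)
		return
	}
	fmt.Println("heap profile size in bytes:", len(data))
}
```

Once written to a file, the profile can be inspected with `go tool pprof -sample_index=inuse_objects`, which selects the same sample type as shown in the listings here.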
Signed-off-by: Roman Khimov <roman@nspcc.ru>
Codecov Report❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master      #60      +/-   ##
==========================================
+ Coverage   25.88%   26.04%   +0.15%
==========================================
  Files          20       20
  Lines         935      937       +2
==========================================
+ Hits          242      244       +2
  Misses        672      672
  Partials       21       21
It returns boolean flag for a reason, so 9bb4b26 could do a bit better here
and avoid useless comparisons.
goos: linux
goarch: amd64
pkg: github.com/nspcc-dev/locode-db/pkg/locodedb
cpu: AMD Ryzen 7 PRO 7840U w/ Radeon 780M Graphics
       │ search.old  │             search.new             │
       │   sec/op    │   sec/op     vs base               │
Get-16   383.6n ± 1%   375.6n ± 2%  -2.07% (p=0.012 n=10)

       │ search.old │           search.new           │
       │    B/op    │    B/op     vs base            │
Get-16   5.000 ± 0%   5.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

       │ search.old │           search.new           │
       │ allocs/op  │ allocs/op   vs base            │
Get-16   1.000 ± 0%   1.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal
Signed-off-by: Roman Khimov <roman@nspcc.ru>
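The commit message does not name the search function, so the following is only an illustration of the general point, assuming a standard-library binary search such as `slices.BinarySearchFunc`, which returns both a position and a "found" flag. The helper names `findWithRecheck` and `findWithFlag` are made up for the example.

```go
package main

import (
	"fmt"
	"slices"
	"sort"
	"strings"
)

// findWithRecheck uses sort.SearchStrings, which returns only a position,
// so the caller has to compare the key once more to learn whether it was found.
func findWithRecheck(sorted []string, key string) (int, bool) {
	i := sort.SearchStrings(sorted, key)
	return i, i < len(sorted) && sorted[i] == key
}

// findWithFlag relies on the boolean returned by slices.BinarySearchFunc,
// so no extra comparison is needed after the search.
func findWithFlag(sorted []string, key string) (int, bool) {
	return slices.BinarySearchFunc(sorted, key, strings.Compare)
}

func main() {
	codes := []string{"HEL", "LED", "MOW"} // must be sorted
	fmt.Println(findWithRecheck(codes, "LED"))
	fmt.Println(findWithFlag(codes, "LED"))
	fmt.Println(findWithFlag(codes, "AAA"))
}
```

Both helpers return the same results; the second one just skips the final equality check, which is the kind of saving the benchmark above measures.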
End-rey approved these changes on Oct 21, 2025
carpawell approved these changes on Oct 21, 2025