Skip to content

!perf: Use string for iteration and error reporting#24

Open
romshark wants to merge 2 commits intomainfrom
string-iterator
Open

!perf: Use string for iteration and error reporting#24
romshark wants to merge 2 commits intomainfrom
string-iterator

Conversation

@romshark
Copy link
Copy Markdown
Owner

@romshark romshark commented Dec 17, 2023

The string type is smaller than a []byte and can be copied much faster. Copying structs <= 32 bytes on modern 64-bit systems is essentially free whereas Error[[]byte] is 40 bytes wide
making jscan work faster with string than with []byte. This PR eliminates the overhead at the cost of backward compatibility improving performance by about 14%.

  • 👎🏻 This change requires the use of unsafe.
  • 👎🏻 Requires backward incompatible changes to type Error:
    • Removed type parameter.
    • Removed field Src.
  • 👍🏻 Slightly better performance overall.
  • 👍🏻 Type Error is no longer required to be type-parameterized.

An alternative solution is presented in #25.

Benchmark results:

goos: darwin
goarch: arm64
pkg: github.com/romshark/jscan/v2
                                    │  main.bench  │        string-iterator.bench        │
                                    │    sec/op    │   sec/op     vs base                │
CalcStats/miniscule_1b__________-10    20.63n ± 0%   14.28n ± 0%  -30.80% (p=0.000 n=10)
CalcStats/tiny_8b_______________-10    27.84n ± 0%   25.92n ± 0%   -6.90% (p=0.000 n=10)
CalcStats/small_336b____________-10    311.8n ± 0%   317.2n ± 0%   +1.75% (p=0.000 n=10)
CalcStats/large_26m_____________-10    13.84m ± 0%   13.79m ± 0%   -0.35% (p=0.000 n=10)
CalcStats/nasa_SxSW_2016_125k___-10    115.4µ ± 0%   114.2µ ± 0%   -1.09% (p=0.000 n=10)
CalcStats/escaped_3k____________-10    1.399µ ± 0%   1.397µ ± 0%   -0.14% (p=0.000 n=10)
CalcStats/array_int_1024_12k____-10    13.83µ ± 0%   13.31µ ± 0%   -3.72% (p=0.000 n=10)
CalcStats/array_dec_1024_10k____-10    11.73µ ± 4%   11.91µ ± 7%        ~ (p=0.105 n=10)
CalcStats/array_nullbool_1024_5k-10    7.482µ ± 1%   7.480µ ± 1%        ~ (p=0.436 n=10)
CalcStats/array_str_1024_639k___-10    146.4µ ± 0%   146.2µ ± 0%   -0.15% (p=0.000 n=10)
Valid/deeparray_____________-10       15.680n ± 1%   4.966n ± 0%  -68.33% (p=0.000 n=10)
Valid/unwind_stack__________-10       15.665n ± 1%   4.966n ± 0%  -68.30% (p=0.000 n=10)
Valid/miniscule_1b__________-10       11.430n ± 0%   7.760n ± 0%  -32.11% (p=0.000 n=10)
Valid/tiny_8b_______________-10        17.38n ± 0%   16.76n ± 0%   -3.57% (p=0.000 n=10)
Valid/small_336b____________-10        236.2n ± 0%   225.9n ± 0%   -4.34% (p=0.000 n=10)
Valid/large_26m_____________-10        11.09m ± 0%   11.00m ± 0%   -0.80% (p=0.000 n=10)
Valid/nasa_SxSW_2016_125k___-10        85.25µ ± 0%   83.16µ ± 0%   -2.45% (p=0.000 n=10)
Valid/escaped_3k____________-10        1.387µ ± 0%   1.386µ ± 0%   -0.07% (p=0.001 n=10)
Valid/array_int_1024_12k____-10        9.135µ ± 1%   9.447µ ± 0%   +3.42% (p=0.000 n=10)
Valid/array_dec_1024_10k____-10        8.672µ ± 0%   8.661µ ± 0%        ~ (p=0.280 n=10)
Valid/array_nullbool_1024_5k-10        3.415µ ± 5%   3.243µ ± 4%   -5.04% (p=0.001 n=10)
Valid/array_str_1024_639k___-10        142.3µ ± 0%   142.4µ ± 0%        ~ (p=0.105 n=10)
Yo/bytes-10                                          13.04n ± 0%
Yo/string-10                                         12.89n ± 1%
geomean                                3.038µ        1.681µ       -13.86%

@romshark romshark self-assigned this Dec 17, 2023
@romshark romshark added major Requires backward incompatible changes performance Improves performance labels Dec 17, 2023
@romshark romshark added this to the v3 milestone Dec 17, 2023
@romshark
Copy link
Copy Markdown
Owner Author

Alternatively, the error reporting can be changed to avoid needing Src in the first place.
The downside would be slightly less convenient error messages:

error at index 6: illegal control character

instead of:

error at index 6 (0x02): illegal control character

But this can still be replicated by the API user if necessary.

The string type is smaller than a byte slice and can be
copied much faster than []byte on modern 64-bit systems.
Drawbacks: this change requires the use of unsafe.
Type Error is no longer required to be type-parameterized but
it's preserved as Error[S] in order to remain backward compatible.
The type parameter can be removed in a new major version.
Remove type parameter because the Error type
requires backward incompatible changes anyway.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

major Requires backward incompatible changes performance Improves performance

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant