Skip to content

Conversation

@dolmen
Copy link
Contributor

@dolmen dolmen commented Jun 9, 2023

In C.Run bypass check of the signature of the Run method of c.TB by using a prefilled table for the common cases of *testing.T, *testing.B and *quicktest.C.

This serie of patches shows the developement process:

  • refactor C.Run to extract getRunFuncSignature
  • add unit test for getRunFuncSignature
  • add a dummy implementation of a caching version (which does no caching at this point) of getRunFuncSignature and add a BenchmarkCRunGetFuncSig to compare them
  • implement the real version of getRunFuncSignatureCache (see commit for results)
  • enable the use of getRunFuncSignatureCache instead of getRunFuncSignature in C.Run
$ go test -bench BenchmarkCRunGetFuncSig -benchmem
    BenchmarkCRunGetFuncSig/*testing.T-no-cache-4          562579   2142     ns/op   168 B/op             4 allocs/op
    BenchmarkCRunGetFuncSig/*testing.T-with-cache-4     183638598      7.514 ns/op     0 B/op             0 allocs/op
    BenchmarkCRunGetFuncSig/*testing.B-no-cache-4          608715   2626     ns/op   168 B/op             4 allocs/op
    BenchmarkCRunGetFuncSig/*testing.B-with-cache-4     100000000     11.00  ns/op     0 B/op             0 allocs/op
    BenchmarkCRunGetFuncSig/*quicktest.C-no-cache-4        636226   2523     ns/op   168 B/op             4 allocs/op
    BenchmarkCRunGetFuncSig/*quicktest.C-with-cache-4    68991796     48.91  ns/op     0 B/op             0 allocs/op

@dolmen
Copy link
Contributor Author

dolmen commented Jun 9, 2023

@frankban Please merge #157 first. I'll do the rebase of #160 on top.

@dolmen dolmen force-pushed the faster-Run branch 2 times, most recently from 7295bd0 to 3482a30 Compare June 12, 2023 08:01
@dolmen
Copy link
Contributor Author

dolmen commented Jun 12, 2023

@frankban Rebased.

This important optimization (with my other fixes) are worth a release.

@dolmen
Copy link
Contributor Author

dolmen commented Jun 26, 2023

ping @frankban

@frankban
Copy link
Owner

frankban commented Aug 1, 2023

This looks very interesting. We are discussing with @rogpeppe about other possible ways for introducing this optimization. Thanks for this work!

@dolmen
Copy link
Contributor Author

dolmen commented Aug 1, 2023

I could go further and completely remove the getRunFuncSignature. At this point the cache is statically filled and getRunFuncSignature is only used at runtime for error reporting in the case of a bad usage (and also for the benchmark that would go too): this could be replaced by a simple message displaying the signature of the received function and the list of acceptable ones.

An alternate implementation would use generics to enforce the signature at compile time. But are you ready to push the minimum Go version required from 1.13 to 1.18?

@dolmen
Copy link
Contributor Author

dolmen commented Aug 1, 2023

Well, Go doesn't have generic methods yet, so I don't see how a compile-time check of the subtest signature could be enforced.

dolmen added 5 commits August 1, 2023 14:41
Extract signature check of the Run method. This will allow to cache the
result in future commits.
Add a dummy implementation getRunFuncSignature (it will be replaced by
concrete implementation in future commits) called getRunFuncSignature.

Add a benchmark that allows to compare the performance of
getRunFuncSignatureCache with getRunFuncSignature. Of course the results
are currently useless.
This will give a real boost to C.Run by removing most uses of reflect on
each Run.

Benchmark results:

go test -bench BenchmarkCRunGetFuncSig -benchmem
BenchmarkCRunGetFuncSig/*testing.T-no-cache-4          562579   2142     ns/op   168 B/op	      4 allocs/op
BenchmarkCRunGetFuncSig/*testing.T-with-cache-4     183638598      7.514 ns/op     0 B/op	      0 allocs/op
BenchmarkCRunGetFuncSig/*testing.B-no-cache-4          608715   2626     ns/op   168 B/op	      4 allocs/op
BenchmarkCRunGetFuncSig/*testing.B-with-cache-4     100000000     11.00  ns/op     0 B/op	      0 allocs/op
BenchmarkCRunGetFuncSig/*quicktest.C-no-cache-4        636226   2523     ns/op	 168 B/op	      4 allocs/op
BenchmarkCRunGetFuncSig/*quicktest.C-with-cache-4    68991796     48.91  ns/op     0 B/op	      0 allocs/op
@dolmen
Copy link
Contributor Author

dolmen commented Aug 1, 2023

Rebased on top of v1.14.6.

@dolmen
Copy link
Contributor Author

dolmen commented Aug 1, 2023

Replaced by #165 which completely bypass reflect in the common cases for which we used a cache here.

@dolmen dolmen closed this Aug 1, 2023
rogpeppe added a commit to rogpeppe-contrib/quicktest that referenced this pull request Aug 1, 2023
This is an alternative to PRs frankban#160 and frankban#165.
It's essentially the same as PR frankban#165 except that it uses
generics to reduce the amount of duplicated code.

Instead of just amortizing the checking of the type, when
the argument type of the function passed to `Run` is known,
it bypasses the reflect-based code altogether.
We don't bother implementing the optimization on pre-generics
Go versions because those are end-of-lifetime anyway.

I've added an implementation-independent benchmark.

```
goos: linux
goarch: amd64
pkg: github.com/frankban/quicktest
cpu: Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz
                           │      base      │               thisPR               │
                           │     sec/op     │   sec/op     vs base               │
CNewAndRunWithCustomType-8    1077.5n ±  5%   136.8n ± 6%  -87.30% (p=0.002 n=6)
CRunWithCustomType-8         1035.00n ± 11%   66.43n ± 3%  -93.58% (p=0.002 n=6)
geomean                        1.056µ         95.33n       -90.97%
```
rogpeppe added a commit to rogpeppe-contrib/quicktest that referenced this pull request Aug 1, 2023
This is an alternative to PRs frankban#160 and frankban#165.
It's essentially the same as PR frankban#165 except that it uses
generics to reduce the amount of duplicated code.

Instead of just amortizing the checking of the type, when
the argument type of the function passed to `Run` is known,
it bypasses the reflect-based code altogether.
We don't bother implementing the optimization on pre-generics
Go versions because those are end-of-lifetime anyway.

I've added an implementation-independent benchmark.

```
goos: linux
goarch: amd64
pkg: github.com/frankban/quicktest
cpu: Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz
                           │      base      │               thisPR               │
                           │     sec/op     │   sec/op     vs base               │
CNewAndRunWithCustomType-8    1077.5n ±  5%   136.8n ± 6%  -87.30% (p=0.002 n=6)
CRunWithCustomType-8         1035.00n ± 11%   66.43n ± 3%  -93.58% (p=0.002 n=6)
geomean                        1.056µ         95.33n       -90.97%
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants