Cat splice perf #10835
|
Still slower than GNU |
|
How about splice threshold + clap bypass? |
|
GNU testsuite comparison: |
|
please run the benchmark with /usr/bin/cat too |
|
Now running hyperfine with --show-output, because without it the perf comparison does not make much sense when it comes to writes (every command takes about the same time). The downside is that it seems to introduce significant noise:
hyperfine -w 10 -L coreutils "target/release/cat_16k","target/release/cat_64k","target/release/cat_bypass","target/release/cat_base","target/release/cat_bypass_8k","/usr/bin/cat" "{coreutils} /tmp/threshold_test/file_16K*" --show-output
Benchmark 1: target/release/cat_16k /tmp/threshold_test/file_16K*
Time (mean ± σ): 347.5 ms ± 209.5 ms [User: 1.3 ms, System: 2.0 ms]
Range (min … max): 123.6 ms … 713.3 ms 10 runs
Benchmark 2: target/release/cat_64k /tmp/threshold_test/file_16K*
Time (mean ± σ): 125.3 ms ± 134.1 ms [User: 0.3 ms, System: 1.8 ms]
Range (min … max): 68.3 ms … 704.3 ms 21 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Benchmark 3: target/release/cat_bypass /tmp/threshold_test/file_16K*
Time (mean ± σ): 77.5 ms ± 16.7 ms [User: 0.4 ms, System: 0.9 ms]
Range (min … max): 44.6 ms … 101.3 ms 33 runs
Benchmark 4: target/release/cat_base /tmp/threshold_test/file_16K*
Time (mean ± σ): 109.1 ms ± 221.0 ms [User: 0.3 ms, System: 1.1 ms]
Range (min … max): 44.7 ms … 1710.4 ms 55 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Benchmark 5: target/release/cat_bypass_8k /tmp/threshold_test/file_16K*
Time (mean ± σ): 82.8 ms ± 23.6 ms [User: 0.3 ms, System: 1.3 ms]
Range (min … max): 52.9 ms … 154.0 ms 18 runs
Benchmark 6: /usr/bin/cat /tmp/threshold_test/file_16K*
Time (mean ± σ): 142.9 ms ± 212.9 ms [User: 0.3 ms, System: 1.2 ms]
Range (min … max): 54.0 ms … 1130.2 ms 27 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Summary
target/release/cat_bypass /tmp/threshold_test/file_16K* ran
1.07 ± 0.38 times faster than target/release/cat_bypass_8k /tmp/threshold_test/file_16K*
1.41 ± 2.87 times faster than target/release/cat_base /tmp/threshold_test/file_16K*
1.62 ± 1.77 times faster than target/release/cat_64k /tmp/threshold_test/file_16K*
1.84 ± 2.78 times faster than /usr/bin/cat /tmp/threshold_test/file_16K*
4.48 ± 2.87 times faster than target/release/cat_16k /tmp/threshold_test/file_16K* |
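To make the Summary figures easier to interpret, here is a minimal sketch of how a "times faster" ratio and its uncertainty can be derived from two mean ± σ pairs. It assumes standard first-order error propagation for a quotient of independent measurements; that formula is an assumption on my part, although it does reproduce the numbers printed above. The `Timing` struct and `relative_speed` function are hypothetical names, not part of hyperfine or this PR.

```rust
/// Mean and standard deviation of one benchmark, in milliseconds.
#[derive(Clone, Copy)]
struct Timing {
    mean: f64,
    stddev: f64,
}

/// Returns (ratio, uncertainty) for `slower.mean / fastest.mean`.
/// Assumption: first-order error propagation for a quotient of two
/// independent measurements (relative errors add in quadrature).
fn relative_speed(fastest: Timing, slower: Timing) -> (f64, f64) {
    let ratio = slower.mean / fastest.mean;
    let relative_error = ((fastest.stddev / fastest.mean).powi(2)
        + (slower.stddev / slower.mean).powi(2))
    .sqrt();
    (ratio, ratio * relative_error)
}

fn main() {
    // Values copied from the benchmark output above.
    let bypass = Timing { mean: 77.5, stddev: 16.7 };
    let others = [
        ("cat_bypass_8k", Timing { mean: 82.8, stddev: 23.6 }),
        ("cat_base", Timing { mean: 109.1, stddev: 221.0 }),
        ("/usr/bin/cat", Timing { mean: 142.9, stddev: 212.9 }),
    ];
    for (name, timing) in others {
        let (ratio, err) = relative_speed(bypass, timing);
        println!("{ratio:.2} ± {err:.2} times faster than {name}");
    }
}
```

Read this way, a row such as 1.41 ± 2.87 has an uncertainty larger than the ratio itself, so the ordering in this run is probably not statistically meaningful, which matches the noise mentioned above.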
|
Does this fix #9609 too? |
|
Merging this PR will not alter performance
|
As I understand it, this should fix the specific example presented, but I frankly don't know whether a similar case could arise in the future. All it takes is for stat() to wrongly report a size larger than the defined threshold. |
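To illustrate the concern, here is a rough sketch of a size-gated choice between a plain read/write loop and splice. This is not the PR's actual code: `SPLICE_THRESHOLD`, `CopyStrategy`, `choose_strategy`, and `reported_and_actual_len` are hypothetical names, and the strict `>` comparison is an assumption. The point is only that the decision rests entirely on the size reported by stat(), which the sketch compares against the number of bytes actually read.

```rust
use std::fs::{self, File};
use std::io;
use std::path::Path;

/// Hypothetical cutoff mirroring the 16 KB value from the PR description.
const SPLICE_THRESHOLD: u64 = 16 * 1024;

#[derive(Debug, PartialEq)]
enum CopyStrategy {
    /// Ordinary read/write loop, intended for small files.
    ReadWrite,
    /// splice-based fast path, intended for larger files.
    Splice,
}

/// Picks a strategy from the size reported by stat().
/// Assumption: splice is used only when the size exceeds the threshold.
fn choose_strategy(reported_len: u64) -> CopyStrategy {
    if reported_len > SPLICE_THRESHOLD {
        CopyStrategy::Splice
    } else {
        CopyStrategy::ReadWrite
    }
}

/// Returns the size reported by stat() and the number of bytes actually read.
fn reported_and_actual_len(path: &Path) -> io::Result<(u64, u64)> {
    let reported = fs::metadata(path)?.len();
    let mut file = File::open(path)?;
    let actual = io::copy(&mut file, &mut io::sink())?;
    Ok((reported, actual))
}

fn main() -> io::Result<()> {
    // Hypothetical scratch file used only for this demonstration.
    let path = std::env::temp_dir().join("splice_threshold_demo.bin");
    fs::write(&path, vec![b'x'; 4096])?;
    let (reported, actual) = reported_and_actual_len(&path)?;
    println!(
        "reported={reported} actual={actual} -> {:?}",
        choose_strategy(reported)
    );
    fs::remove_file(&path)?;
    Ok(())
}
```

For a regular file the two sizes agree, but nothing in `choose_strategy` can detect a file whose reported size is wrong, so a misreported size above the cutoff would still select the splice path.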
|
Based on #10832
Avoids using splice on small files.
May result in a small perf improvement.
Threshold set to 16KB, determined by trial and error: