snappy: optimize UnalignedCopy64 and IncrementalCopy for RISC-V #3360

Open
Felix-Gong wants to merge 1 commit into apache:master from Felix-Gong:riscv-snappy-opt

Summary

  • Optimize UnalignedCopy64() with direct ld/sd inline assembly for 8-byte copy
  • Optimize IncrementalCopy() with 8-byte bulk copies when source/dest don't overlap
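
The two changes above can be illustrated with a minimal sketch. This is not the code from the patch: the `Sketch`-suffixed names are hypothetical, the inline assembly is guarded by the toolchain-defined `__riscv` / `__riscv_xlen` macros with a `memcpy` fallback elsewhere, and the "no overlap" condition is assumed here to mean that the destination is at least 8 bytes ahead of the source. Whether a misaligned `ld`/`sd` is fast depends on the hardware and execution environment, so the speedup may not carry over to every RISC-V core.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for UnalignedCopy64(): copy 8 bytes from src to dst.
inline void UnalignedCopy64Sketch(const void* src, void* dst) {
#if defined(__riscv) && (__riscv_xlen == 64)
  // One 64-bit load followed by one 64-bit store.
  // Assumes the platform handles misaligned ld/sd acceptably.
  uint64_t tmp;
  __asm__ volatile(
      "ld %0, 0(%1)\n\t"
      "sd %0, 0(%2)"
      : "=&r"(tmp)
      : "r"(src), "r"(dst)
      : "memory");
#else
  // Portable fallback so the sketch builds on other architectures.
  uint64_t tmp;
  std::memcpy(&tmp, src, sizeof(tmp));
  std::memcpy(dst, &tmp, sizeof(tmp));
#endif
}

// Hypothetical stand-in for IncrementalCopy(): copy len bytes from src to op
// with byte-by-byte semantics, where op > src and the ranges may overlap
// (an overlapping copy repeats the pattern that precedes op).
inline void IncrementalCopySketch(const char* src, char* op,
                                  std::ptrdiff_t len) {
  assert(len > 0);
  assert(op > src);
  // Assumed fast-path condition: with a gap of at least 8 bytes, every
  // 8-byte load reads only bytes that are already final, so bulk copies
  // give the same result as the byte loop.
  if (op - src >= 8) {
    while (len >= 8) {
      UnalignedCopy64Sketch(src, op);
      src += 8;
      op += 8;
      len -= 8;
    }
  }
  // Tail bytes, or the whole copy when the gap is shorter than 8 bytes.
  while (len > 0) {
    *op++ = *src++;
    --len;
  }
}
```

The byte loop is kept as the fallback because a short gap (for example, a run of one repeated byte) relies on each written byte being visible to the next read, which an 8-byte load would break.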

Performance Improvement (direct function benchmark)

| Scenario | Baseline | Optimized | Improvement |
|---|---|---|---|
| Decompress compressible-256K | 728 MB/s | 2205 MB/s | +203% |
| Decompress zeros-256K | 543 MB/s | 1462 MB/s | +169% |

Changes

  • src/butil/third_party/snappy/snappy-stubs-internal.h: +11 lines
  • src/butil/third_party/snappy/snappy.cc: +24 lines

Test Plan

  • brpc_snappy_compress_unittest passed (7/7)
  • Build verified on RISC-V (SOPHGO SG2044)

Use RISC-V inline assembly (ld/sd) for 8-byte copy operations instead
of the generic macro-based implementation.

Changes:
- UnalignedCopy64: direct ld/sd pair for 8-byte copy
- IncrementalCopy: 8-byte bulk copies when source/dest don't overlap

Performance improvement (direct function benchmark):
- Decompress compressible-256K: 728 MB/s -> 2205 MB/s (+203%)
- Decompress zeros-256K: 543 MB/s -> 1462 MB/s (+169%)

Tests: brpc_snappy_compress_unittest passed (7/7)

Signed-off-by: Felix-Gong <gongxiaofei24@iscas.ac.cn>