bthread: implement lock-free AtomicInteger128 for RISC-V by Felix-Gong · Pull Request #3355 · apache/brpc

Felix-Gong · 2026-06-24T09:05:18Z

Summary

Replace mutex-based fallback with seqlock implementation for RISC-V platform, bringing it in line with x86 (SSE) and ARM (NEON) lock-free implementations.

Changes

Implement seqlock-based atomic 128-bit load/store for RISC-V
Use fence instructions for proper memory ordering
Maintain backward compatibility with x86/ARM (no changes to existing code paths)

Seqlock Algorithm

Reader: read sequence → read data → verify sequence, retry if stale
Writer: increment sequence → write data → increment sequence
All memory accesses use fence r, rw (acquire) and fence w, w (release) for ordering

Performance Improvement

Tested on SOPHGO SG2044 (RISC-V rv64gcv):

bthread_start_urgent latency: ~1.5-2% reduction
adding_func throughput: ~25% improvement (541ns vs 722ns)

Test Plan

RISC-V cross-compilation verification
Native compilation on RISC-V server
Official brpc unit tests (bthread_unittest, test_butil, brpc_server_unittest, etc.)
Performance benchmark on RISC-V hardware

Notes

All changes are wrapped in #elif defined(__riscv) preprocessor guards
x86/ARM behavior is completely unchanged
The _seq member is added to AtomicInteger128 class but only used on RISC-V (other platforms have (void)_seq; to suppress warnings)

Copilot

Pull request overview

Implements a lock-free (seqlock-based) 128-bit load/store path for bthread::AtomicInteger128 on RISC-V, replacing the prior mutex fallback to improve performance when reading/writing _cpu_time_stat in TaskGroup.

Changes:

Adds a sequence counter (_seq) to AtomicInteger128 to support seqlock on RISC-V.
Updates AtomicInteger128::load() / store() with RISC-V inline-asm implementations and fences.
Keeps existing x86_64 (SSE) and ARM NEON code paths unchanged.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File	Description
src/bthread/task_group.h	Adds `_seq` member and updates the mutex comment for non-lock-free architectures.
src/bthread/task_group.cpp	Implements RISC-V seqlock-based 128-bit load/store paths alongside existing x86/ARM implementations.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Felix-Gong · 2026-06-25T07:13:22Z

+    __asm__ volatile(
+        "fence w, w\n\t"
+        "sd %0, %1\n\t"
+        :
+        : "r"(new_seq), "m"(_seq)
+        : "memory"
+    );
+    __asm__ volatile("fence w, w\n\t" ::: "memory");
+    __asm__ volatile(
+        "sd %0, %2\n\t"
+        "sd %1, %3\n\t"
+        :
+        : "r"(value.v1), "r"(value.v2),
+          "m"(_value.v1), "m"(_value.v2)
+        : "memory"
+    );
+    __asm__ volatile("fence w, w\n\t" ::: "memory");
+    new_seq++;
+    __asm__ volatile(
+        "sd %0, %1\n\t"
+        :
+        : "r"(new_seq), "m"(_seq)
+        : "memory"
+    );


Fixed. For sd (store) instructions, memory operands are now correctly declared as outputs using "=m" constraints. For ld (load) instructions, memory operands remain as inputs ("m") since ld only reads memory.

sd %1, %0 with : "=m"(_seq) : "r"(new_seq) - memory is output

ld %0, %1 with : "=r"(seq0) : "m"(_seq) - memory is input

This correctly informs the compiler about which memory locations are modified.

Felix-Gong · 2026-06-25T07:11:01Z

    return {value[0], value[1]};
-#else // __x86_64__ || __ARM_NEON
-    // RISC-V and other architectures use mutex fallback
+#elif defined(__riscv)


Fixed. Added && __riscv_xlen == 64 guard to restrict this path to RV64 only. RV32 will fall back to the mutex implementation.

Felix-Gong · 2026-06-25T07:13:10Z

+    (void)_seq;
    int64x2_t v = vld1q_s64(reinterpret_cast<int64_t*>(&value));
    vst1q_s64(reinterpret_cast<int64_t*>(&_value), v);
+#elif defined(__riscv)


Fixed. Added && __riscv_xlen == 64 guard for store() as well. RV32 builds will use the mutex fallback.

Replace mutex-based fallback with seqlock implementation for RISC-V platform. This brings RISC-V in line with x86 (SSE) and ARM (NEON) lock-free implementations. Seqlock algorithm: - Reader: read sequence, read data, verify sequence, retry if stale - Writer: increment sequence, write data, increment sequence - All memory accesses use fence instructions for ordering Performance improvement: - bthread_start_urgent latency: 1.5-2% reduction - adding_func throughput: 25% improvement (541ns vs 722ns)

- Use "m" (input) for ld instructions reading memory - Use "=m" (output) for sd instructions writing memory - Swap operand order in sd templates to match RISC-V syntax (register, memory) The previous "+m" constraints in input operand positions caused compilation errors on RISC-V. This fix ensures proper constraint usage: - ld: memory is input "m", register is output "=r" - sd: register is input "r", memory is output "=m"

Felix-Gong force-pushed the riscv-atomic128-opt branch 2 times, most recently from 2a4b97f to 5cd6d6a Compare June 24, 2026 09:50

chenBright requested a review from Copilot June 24, 2026 11:17

Copilot started reviewing on behalf of chenBright June 24, 2026 11:18 View session

Copilot AI reviewed Jun 24, 2026

View reviewed changes

Felix-Gong force-pushed the riscv-atomic128-opt branch from 5cd6d6a to aee927b Compare June 25, 2026 00:22

Felix-Gong force-pushed the riscv-atomic128-opt branch from 4fa5409 to d7841bf Compare June 25, 2026 05:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

bthread: implement lock-free AtomicInteger128 for RISC-V#3355

bthread: implement lock-free AtomicInteger128 for RISC-V#3355
Felix-Gong wants to merge 2 commits into
apache:masterfrom
Felix-Gong:riscv-atomic128-opt

Felix-Gong commented Jun 24, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Felix-Gong Jun 25, 2026

Uh oh!

Felix-Gong Jun 25, 2026

Uh oh!

Felix-Gong Jun 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

Felix-Gong commented Jun 24, 2026

Summary

Changes

Seqlock Algorithm

Performance Improvement

Test Plan

Notes

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Felix-Gong Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

Felix-Gong Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

Felix-Gong Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants