Add Bazel benchmark target and CodSpeed CI workflow (frankslin/OpenCC#31) #1361

Open
frankslin wants to merge 2 commits into BYVoid:master from frankslin:merged-dicts

@frankslin

Collaborator
  • Add Bazel benchmark target and CodSpeed CI workflow

Performance.cpp path resolution now checks env vars before compile-time macros, and in Bazel builds uses Runfiles::Create(argv0) to locate configs, dictionaries, test data, and the opencc CLI binary via the .runfiles/ directory produced by bazel build. Jieba benchmarks are excluded (OPENCC_BENCHMARK_JIEBA_CONFIG_DIR undefined); text_json variants are skipped gracefully during global init when runfiles are not yet available.

Detailed Changes:

  • Performance.cpp:
    • Add GetEnv() helper; add file-scope g_argv0/g_runfiles initialized in main() before benchmark execution.
    • All path functions follow: env var → Runfiles::Rlocation (BAZEL) → compile-time macro (CMake) → safe default.
    • Initialize() passes g_argv0 to SimpleConverter so its internal Runfiles::Create(argv0) resolves configs and dictionaries.
    • BuildBenchmarkConfigs() catches Exception& so text_json creation failures during global init are silently skipped.
  • src/benchmark/BUILD.bazel: new cc_binary; deps include google-benchmark, opencc libs, and @bazel_tools//tools/cpp/runfiles; data covers configs, dictionaries, test data, and command_line binary.
  • test/benchmark/BUILD.bazel: export zuozhuan.txt as benchmark_data.
  • .github/workflows/codspeed.yml: single Ubuntu job; bazel build //src/benchmark:performance; run binary directly via runfiles with only OPENCC_BENCHMARK_TEMP_DIR set; uses CODSPEED_TOKEN_FRANKSLIN secret.
  • Extend CodSpeed workflow trigger to opencc-wasm-develop branch
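
The path-resolution order listed for Performance.cpp above (env var, then runfiles lookup, then compile-time macro, then a safe default) could be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `ResolveDir`, `RunfilesLookup`, and the parameter names are hypothetical, the body of `GetEnv` is assumed, and the runfiles step is represented by a plain callback rather than the real `Runfiles::Rlocation` so that the snippet stays self-contained.

```cpp
#include <cstdlib>
#include <functional>
#include <string>

// Assumed shape of the GetEnv() helper: returns "" when the variable is unset.
std::string GetEnv(const char* name) {
  const char* value = std::getenv(name);
  return value ? std::string(value) : std::string();
}

// Hypothetical stand-in for a runfiles lookup such as Runfiles::Rlocation.
// It returns "" when the key cannot be resolved.
using RunfilesLookup = std::function<std::string(const std::string&)>;

// Tries each source in priority order and returns the first non-empty result.
std::string ResolveDir(const char* envName,
                       const RunfilesLookup& rlocation,
                       const std::string& runfilesKey,
                       const std::string& compiledDefault,
                       const std::string& safeDefault) {
  // 1. An explicit environment variable always wins.
  const std::string fromEnv = GetEnv(envName);
  if (!fromEnv.empty()) return fromEnv;

  // 2. In a Bazel build, ask the runfiles tree (if one is available).
  if (rlocation) {
    const std::string fromRunfiles = rlocation(runfilesKey);
    if (!fromRunfiles.empty()) return fromRunfiles;
  }

  // 3. In a CMake build, fall back to the value baked in at compile time.
  if (!compiledDefault.empty()) return compiledDefault;

  // 4. Last resort.
  return safeDefault;
}
```

Passing an empty `RunfilesLookup` models a non-Bazel build, in which case the function falls through to the compile-time value or the safe default.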

  • Wire CodSpeed compat library and fix benchmark registration order

Switch the benchmark binary to @codspeed_google_benchmark_compat (simulation mode) so CodSpeed can identify individual benchmark boundaries; add the required --compilation_mode=dbg and --copt=-O2 flags to the CI workflow. Move RegisterBenchmarks() out of the anonymous namespace and call it in main() after Bazel runfiles are initialised, so text_json variants are no longer silently skipped in Bazel builds.
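
The registration-order issue described here can be illustrated with a small self-contained sketch. It assumes that the text_json variants need a resolved runfiles location at registration time; the `example` namespace, `Registry()`, and `g_runfiles_root` are hypothetical stand-ins, not the identifiers used in Performance.cpp.

```cpp
#include <string>
#include <vector>

namespace example {

// Stand-in for the runfiles handle: null until main() has set it up.
const std::string* g_runfiles_root = nullptr;

// Stand-in for the benchmark registry.
std::vector<std::string>& Registry() {
  static std::vector<std::string> names;
  return names;
}

// Registers the variants that can be built right now. The text_json variant
// depends on the runfiles root, so it is skipped when the root is missing.
void RegisterBenchmarks() {
  Registry().push_back("ocd2");
  if (g_runfiles_root != nullptr) {
    Registry().push_back("text_json");
  }
}

// Problematic pattern (shown only as a comment): a namespace-scope global
// whose initializer calls RegisterBenchmarks() runs before main(), that is,
// before g_runfiles_root can be set, so text_json is never registered:
//
//   const bool kRegistered = (RegisterBenchmarks(), true);

}  // namespace example
```

Calling `example::RegisterBenchmarks()` from `main()` after assigning `g_runfiles_root` registers both variants, which mirrors the ordering fix described above.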

Detailed Changes:

  • MODULE.bazel:
    • Add bazel_dep for codspeed_google_benchmark_compat 2.1.0 (dev_dependency).
  • src/benchmark/BUILD.bazel:
    • Replace //deps/google-benchmark:benchmark with @codspeed_google_benchmark_compat//:benchmark.
  • src/benchmark/Performance.cpp:
    • Remove kBenchmarksRegistered global; add DoRegisterBenchmarks() wrapper in opencc namespace.
    • Call opencc::DoRegisterBenchmarks() in main() after g_runfiles is initialised.
  • .github/workflows/codspeed.yml:
    • Pass --@codspeed_google_benchmark_compat//:codspeed_mode=simulation, --compilation_mode=dbg, --copt=-O2 to bazel build.
  • Fix CodSpeed benchmark naming and suppress DEBUG timing warning

Add Performance:: namespace prefix to all registered benchmark names so CodSpeed can identify their source file instead of falling back to unknown_file::. Update BenchmarkGroupName to strip the :: prefix before matching group labels. Add --copt=-DNDEBUG to the workflow build flags so NDEBUG is defined alongside --compilation_mode=dbg, eliminating Google Benchmark's spurious "Library was built as DEBUG" timing warning.
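
As a rough illustration of the naming change, the sketch below prepends a `Performance::` prefix and strips it again before deriving a group label. The helper names (`PrefixedName`, `StripPrefix`, `GroupNameSketch`) and the rule that a group label is the text before the first `/` are assumptions made for this example; the actual BenchmarkGroupName logic may differ.

```cpp
#include <string>

// Prepends the file-like prefix, e.g. "BM_Convert/s2t" becomes
// "Performance::BM_Convert/s2t".
std::string PrefixedName(const std::string& name) {
  return "Performance::" + name;
}

// Drops everything up to and including the first "::"; names without a
// prefix are returned unchanged.
std::string StripPrefix(const std::string& name) {
  const std::string::size_type pos = name.find("::");
  return pos == std::string::npos ? name : name.substr(pos + 2);
}

// Assumed grouping rule: the label is the unprefixed name up to the first '/'.
std::string GroupNameSketch(const std::string& registeredName) {
  const std::string bare = StripPrefix(registeredName);
  return bare.substr(0, bare.find('/'));
}
```

With this rule, a prefixed name and its unprefixed form map to the same group label, so existing group matching keeps working after the rename.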

frankslin and others added 2 commits June 27, 2026 16:55
* Add Bazel benchmark target and CodSpeed CI workflow

Performance.cpp path resolution now checks env vars before compile-time
macros, and in Bazel builds uses Runfiles::Create(argv0) to locate
configs, dictionaries, test data, and the opencc CLI binary via the
.runfiles/ directory produced by bazel build. Jieba benchmarks are
excluded (OPENCC_BENCHMARK_JIEBA_CONFIG_DIR undefined); text_json
variants are skipped gracefully during global init when runfiles are not
yet available.

Detailed Changes:
- **Performance.cpp**:
  - Add GetEnv() helper; add file-scope g_argv0/g_runfiles initialized
    in main() before benchmark execution.
  - All path functions follow: env var → Runfiles::Rlocation (BAZEL) →
    compile-time macro (CMake) → safe default.
  - Initialize() passes g_argv0 to SimpleConverter so its internal
    Runfiles::Create(argv0) resolves configs and dictionaries.
  - BuildBenchmarkConfigs() catches Exception& so text_json creation
    failures during global init are silently skipped.
- **src/benchmark/BUILD.bazel**: new cc_binary; deps include
  google-benchmark, opencc libs, and @bazel_tools//tools/cpp/runfiles;
  data covers configs, dictionaries, test data, and command_line binary.
- **test/benchmark/BUILD.bazel**: export zuozhuan.txt as benchmark_data.
- **.github/workflows/codspeed.yml**: single Ubuntu job; bazel build
  //src/benchmark:performance; run binary directly via runfiles with only
  OPENCC_BENCHMARK_TEMP_DIR set; uses CODSPEED_TOKEN_FRANKSLIN secret.

* Extend CodSpeed workflow trigger to opencc-wasm-develop branch

* Wire CodSpeed compat library and fix benchmark registration order

Switch the benchmark binary to @codspeed_google_benchmark_compat (simulation
mode) so CodSpeed can identify individual benchmark boundaries; add the
required --compilation_mode=dbg and --copt=-O2 flags to the CI workflow.
Move RegisterBenchmarks() out of the anonymous namespace and call it in
main() after Bazel runfiles are initialised, so text_json variants are no
longer silently skipped in Bazel builds.

Detailed Changes:
- **MODULE.bazel**:
  - Add bazel_dep for codspeed_google_benchmark_compat 2.1.0 (dev_dependency).
- **src/benchmark/BUILD.bazel**:
  - Replace //deps/google-benchmark:benchmark with @codspeed_google_benchmark_compat//:benchmark.
- **src/benchmark/Performance.cpp**:
  - Remove kBenchmarksRegistered global; add DoRegisterBenchmarks() wrapper in opencc namespace.
  - Call opencc::DoRegisterBenchmarks() in main() after g_runfiles is initialised.
- **.github/workflows/codspeed.yml**:
  - Pass --@codspeed_google_benchmark_compat//:codspeed_mode=simulation, --compilation_mode=dbg, --copt=-O2 to bazel build.

* Fix CodSpeed benchmark naming and suppress DEBUG timing warning

Add Performance:: namespace prefix to all registered benchmark names so
CodSpeed can identify their source file instead of falling back to
unknown_file::. Update BenchmarkGroupName to strip the :: prefix before
matching group labels. Add --copt=-DNDEBUG to the workflow build flags so
NDEBUG is defined alongside --compilation_mode=dbg, eliminating Google
Benchmark's spurious "Library was built as DEBUG" timing warning.

Add LongTextConversionConfigs() covering all 14 conversion directions and
use it for BM_ConvertLongText, so every conversion chain (s2t, s2twp,
t2s, s2hk, jp2t, etc.) gets its own regression-detectable benchmark.
Restrict the CodSpeed workflow to BM_ConvertLongText.*ocd2. With the ocd2
format the dictionaries are already loaded, so the benchmark measures only
the CPU cost of mmseg segmentation and marisa-trie lookup, not I/O. Initialization,
BM_Convert, and command-line benchmarks remain available for local runs
but are excluded from CodSpeed instrumentation.
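
To see which benchmarks a pattern like `BM_ConvertLongText.*ocd2` would select, the sketch below applies it as an unanchored regular-expression search, which is how a filter such as Google Benchmark's `--benchmark_filter` is generally understood to behave. The benchmark names in the usage note are hypothetical; the exact names registered by Performance.cpp are not shown in this PR.

```cpp
#include <regex>
#include <string>
#include <vector>

// Returns the subset of `names` that contain a match for `filter`.
// The search is unanchored, so a prefix such as "Performance::" does not
// prevent a match.
std::vector<std::string> SelectBenchmarks(const std::vector<std::string>& names,
                                          const std::string& filter) {
  const std::regex pattern(filter);
  std::vector<std::string> selected;
  for (const std::string& name : names) {
    if (std::regex_search(name, pattern)) {
      selected.push_back(name);
    }
  }
  return selected;
}
```

Given the hypothetical names `Performance::BM_ConvertLongText/s2t/ocd2`, `Performance::BM_ConvertLongText/s2t/text_json`, and `Performance::BM_Convert/s2t/ocd2`, only the first is selected: the second lacks `ocd2` and the third lacks `BM_ConvertLongText`.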
@BYVoid

BYVoid commented Jun 28, 2026

Owner

What does CodSpeed do

@frankslin

Collaborator Author

> What does CodSpeed do

CodSpeed helps detect performance regressions in code changes by running instrumented benchmarks in CI and comparing results over time.
