134 changes: 134 additions & 0 deletions AGENTS.md
@@ -0,0 +1,134 @@
# TiFlash Agent Guide

This document provides essential information for agentic coding tools operating in the TiFlash repository.
It focuses on the fastest safe path to build, test, and navigate the codebase.

## 🚀 Quick Start
- **Configure (preset):** `cmake --preset dev`
- **Build:**
  * tiflash binary: `cmake --build --preset dev`
  * unit test binary: `cmake --build --preset unit-tests`
- **Run one test:** `cmake-build-debug/dbms/gtests_dbms --gtest_filter=TestName.*`

## 🛠 Build & Development

TiFlash uses CMake with presets for configuration and building.

### Build Presets
Common presets defined in `CMakePresets.json`:
- `dev`: DEBUG build with tests enabled. (Recommended for development)
- `release`: RELWITHDEBINFO build without tests.
- `asan`: AddressSanitizer build.
- `tsan`: ThreadSanitizer build.
- `benchmarks`: RELEASE build with benchmarks.

### Dependencies & Versions
- **CMake/Ninja/Clang/LLVM/Python/Rust**: Use versions supported by your platform toolchain.
- **Linux vs macOS**: Toolchains live under `release-linux-llvm/` and `release-darwin/` respectively.
- **Submodules/third-party**: Ensure any required submodules are initialized before building.

### Common Commands
- **Configure & Build (Dev):** `cmake --workflow --preset dev`
- **Preset-only (recommended):**
  - Configure: `cmake --preset dev`
  - Build: `cmake --build --preset dev`
- **Manual Build:**
```bash
mkdir cmake-build-debug && cd cmake-build-debug
cmake .. -GNinja -DCMAKE_BUILD_TYPE=DEBUG -DENABLE_TESTS=ON
ninja tiflash
```
- **Linting & Formatting:**
  - Format diff: `python3 format-diff.py --diff_from $(git merge-base upstream/master HEAD)`
    (Use `origin/master` or another base if `upstream` is not configured.)
  - Clang-Tidy: `python3 release-linux-llvm/scripts/run-clang-tidy.py -p cmake-build-debug`

## 🧪 Testing

### Unit Tests (Google Test)
Build targets: `gtests_dbms`, `gtests_libdaemon`, `gtests_libcommon`.

- **Run all:** `cmake-build-debug/dbms/gtests_dbms`
- **Run single test:** `cmake-build-debug/dbms/gtests_dbms --gtest_filter=TestName.*`
- **List tests:** `cmake-build-debug/dbms/gtests_dbms --gtest_list_tests`
- **Parallel runner:**
```bash
python3 tests/gtest_10x.py cmake-build-debug/dbms/gtests_dbms
```
- **Other targets:**
  - `cmake-build-debug/dbms/gtests_libdaemon`
  - `cmake-build-debug/dbms/gtests_libcommon`

### Integration Tests
See `tests/AGENTS.md` for prerequisites and usage.

### Sanitizers
When running with ASAN/TSAN, use suppression files:
```bash
LSAN_OPTIONS="suppressions=tests/sanitize/asan.suppression" ./dbms/gtests_dbms
TSAN_OPTIONS="suppressions=tests/sanitize/tsan.suppression" ./dbms/gtests_dbms
```

## 🎨 Code Style (C++)

TiFlash follows a style based on Google, enforced by `clang-format` 17.0.0+.

### General
- **Naming:**
  - Classes/Structs: `PascalCase` (e.g., `StorageDeltaMerge`)
  - Methods/Variables: `camelCase` (e.g., `readBlock`, `totalBytes`)
  - Files: `PascalCase` matching class name (e.g., `StorageDeltaMerge.cpp`)
- **Namespaces:** Primary code resides in `namespace DB`.
- **Headers:** Use `#pragma once`. Use relative paths from `dbms/src` (e.g., `#include <Core/Types.h>`).

### Types & Error Handling
- **Types:** Use explicit-width types from `dbms/src/Core/Types.h`: `UInt8`, `UInt32`, `Int64`, `Float64`, `String`.
- **Smart Pointers:** Prefer `std::shared_ptr` and `std::unique_ptr`. Use `std::make_shared` and `std::make_unique`.
- **Error Handling:**
  - Use `DB::Exception`.
  - Prefer the fmt-style constructor with error code first: `throw Exception(ErrorCodes::SOME_CODE, "Message with {}", arg);`
  - For fixed strings without formatting, `throw Exception("Message", ErrorCodes::SOME_CODE);` is still acceptable.
  - Error codes are defined in `dbms/src/Common/ErrorCodes.cpp` and `errors.toml`.
  - In broad `catch (...)` paths, prefer `tryLogCurrentException(log, "context")` to avoid duplicated exception-formatting code.
- **Logging:** Use macros like `LOG_INFO(log, "message {}", arg)`. `log` is usually a `DB::LoggerPtr`.
  - When only log level differs by runtime condition, prefer `LOG_IMPL(log, level, ...)` (with `Poco::Message::Priority`) instead of duplicated `if/else` log blocks.

### Modern C++ Practices
- Prefer `auto` for complex iterators/templates.
- Use `std::string_view` for read-only string parameters.
- Use `fmt::format` for string construction.
- Prefer `FmtBuffer` for complex string building in performance-critical paths.

## 🦀 Rust Code
Rust is used in:
- `contrib/tiflash-proxy`: The proxy layer between TiFlash and TiKV.
- `contrib/tiflash-proxy-next-gen`: Disaggregated architecture components.

Follow standard Rust idioms and `cargo fmt`. Use `cargo clippy` for linting.

## 📚 Module-Specific Guides
For more detailed information on specific subsystems, refer to:
- **Storage Engine**: `dbms/src/Storages/AGENTS.md` (DeltaMerge, KVStore, PageStorage)
- **Computation Engine**: `dbms/src/Flash/AGENTS.md` (Planner, MPP, Pipeline)
- **TiDB Integration**: `dbms/src/TiDB/AGENTS.md` (Schema Sync, Decoding, Collation)
- **Testing Utilities**: `dbms/src/TestUtils/AGENTS.md` (Base classes, Mocking, Data generation)

## 📂 Directory Structure
- `dbms/src`: Main TiFlash C++ source code.
- `libs/`: Shared libraries used by TiFlash.
- `tests/`: Integration and unit test utilities.
- `docs/`: Design and development documentation.
- `release-linux-llvm/`: Build scripts and environment configurations for Linux.

## 💡 Debugging Tips
- **LLDB:** Use to debug crashes or hangs.
- **Coredumps:** Ensure coredumps are enabled in your environment.
- **Failpoints:** TiFlash uses failpoints and syncpoints for testing error paths.
  - Search for `FAIL_POINT_TRIGGER_EXCEPTION` or `FAIL_POINT_PAUSE` for failpoints in the code.
  - Search for `SyncPointCtl` or `SYNC_FOR` for syncpoints in the code.
- **Build artifacts:** If `compile_commands.json` is missing, ensure you configured with a preset.

## 📖 References
- `docs/DEVELOPMENT.md`: General engineering practices.
- `docs/design/`: Design documents for major features.
- [TiDB Developer Guide](https://pingcap.github.io/tidb-dev-guide/): General TiDB ecosystem information.
9 changes: 9 additions & 0 deletions dbms/src/Common/TiFlashMetrics.h
@@ -845,6 +845,15 @@ static_assert(RAFT_REGION_BIG_WRITE_THRES * 4 < RAFT_REGION_BIG_WRITE_MAX, "Inva
       F(type_to_finished, {"type", "to_finished"}), \
       F(type_to_error, {"type", "to_error"}), \
       F(type_to_cancelled, {"type", "to_cancelled"})) \
+    M(tiflash_storage_s3_lock_mgr_status, "S3 Lock Manager", Gauge, F(type_prelock_keys, {{"type", "prelock_keys"}})) \
+    M(tiflash_storage_s3_lock_mgr_counter, \
+      "S3 Lock Manager Counter", \
+      Counter, \
+      F(type_create_lock_local, {{"type", "create_lock_local"}}), \
+      F(type_create_lock_ingest, {{"type", "create_lock_ingest"}}), \
+      F(type_clean_lock, {{"type", "clean_lock"}}), \
+      F(type_clean_lock_erase_hit, {{"type", "clean_lock_erase_hit"}}), \
+      F(type_clean_lock_erase_miss, {{"type", "clean_lock_erase_miss"}})) \
     M(tiflash_storage_s3_gc_status, \
       "S3 GC status", \
       Gauge, \
@@ -1283,9 +1283,9 @@ SegmentPtr DeltaMergeStore::segmentMergeDelta(

     if (!isSegmentValid(lock, segment))
     {
-        LOG_DEBUG(
+        LOG_INFO(
             log,
-            "MergeDelta - Give up segmentMergeDelta because segment not valid, segment={}",
+            "MergeDelta - Give up segmentMergeDelta because segment not valid, reason=concurrent_update segment={}",
             segment->simpleInfo());
         wbs.setRollback();
         return {};
25 changes: 14 additions & 11 deletions dbms/src/Storages/Page/V3/PageDirectory.cpp
@@ -38,6 +38,7 @@
 #include <shared_mutex>
 #include <type_traits>
 #include <utility>
+#include <vector>


 #ifdef FIU_ENABLE
@@ -1612,6 +1613,13 @@ std::unordered_set<String> PageDirectory<Trait>::apply(PageEntriesEdit && edit,
     CurrentMetrics::Increment pending_writer_size{CurrentMetrics::PSPendingWriterNum};
     Writer w;
     w.edit = &edit;
+    // Capture this writer's checkpoint data_file_ids before write-group merge.
+    // Followers' edit objects are cleared by the owner during merge.
+    for (const auto & r : edit.getRecords())
+    {
+        if (r.entry.checkpoint_info.has_value())
+            w.applied_data_files.emplace(*r.entry.checkpoint_info.data_location.data_file_id);
+    }

     Stopwatch watch;
     std::unique_lock apply_lock(apply_mutex);
@@ -1639,9 +1647,9 @@ std::unordered_set<String> PageDirectory<Trait>::apply(PageEntriesEdit && edit,
                 throw Exception(ErrorCodes::LOGICAL_ERROR, "Unknown exception");
             }
         }
-        // the `applied_data_files` will be returned by the write
-        // group owner, others just return an empty set.
-        return {};
+        // Return per-writer ids instead of merged-group ids, so upper-layer
+        // lock cleanup can always clean locks created by this writer.
+        return std::move(w.applied_data_files);
     }

     /// This thread now is the write group owner, build the group. It will merge the
@@ -1703,7 +1711,6 @@ std::unordered_set<String> PageDirectory<Trait>::apply(PageEntriesEdit && edit,
     });

     SYNC_FOR("before_PageDirectory::apply_to_memory");
-    std::unordered_set<String> applied_data_files;
     {
         std::unique_lock table_lock(table_rw_mutex);

@@ -1775,12 +1782,6 @@ std::unordered_set<String> PageDirectory<Trait>::apply(PageEntriesEdit && edit,
                         "should not handle edit with invalid type, type={}",
                         magic_enum::enum_name(r.type));
                 }
-
-                // collect the applied remote data_file_ids
-                if (r.entry.checkpoint_info.has_value())
-                {
-                    applied_data_files.emplace(*r.entry.checkpoint_info.data_location.data_file_id);
-                }
             }
             catch (DB::Exception & e)
             {
@@ -1800,7 +1801,9 @@ std::unordered_set<String> PageDirectory<Trait>::apply(PageEntriesEdit && edit,
     }

     success = true;
-    return applied_data_files;
+    // Even for write-group owner, return only this writer's pre-captured ids.
+    // Other writers return their own ids in the `w.done` branch above.
+    return std::move(w.applied_data_files);
 }

 template <typename Trait>
3 changes: 3 additions & 0 deletions dbms/src/Storages/Page/V3/PageDirectory.h
@@ -628,6 +628,9 @@ class PageDirectory
     struct Writer
     {
         PageEntriesEdit * edit;
+        // Keep per-writer checkpoint lock keys before write-group merge so
+        // followers can still return their own applied ids for lock cleanup.
+        std::unordered_set<String> applied_data_files;
         bool done = false; // The work has been performed by other thread
         bool success = false; // The work complete successfully
         std::unique_ptr<DB::Exception> exception;
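
The per-writer capture in `PageDirectory::apply` can be illustrated with a simplified, single-threaded sketch. It is not the TiFlash implementation: `Record`, `Edit`, `makeWriter`, `mergeGroup`, and `takeResult` are hypothetical stand-ins, and the locking and write-group queueing of the real code are left out. The only behavior taken from the diff is that each writer records its own checkpoint `data_file_id`s before the group owner merges and clears the followers' edits, and that each writer later returns only its own set.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins for the PageDirectory types.
struct Record
{
    // Set only for entries that reference a remote checkpoint data file.
    std::optional<std::string> data_file_id;
};

struct Edit
{
    std::vector<Record> records;
};

struct Writer
{
    Edit * edit = nullptr;
    // Ids captured from this writer's own edit before any merge happens.
    std::unordered_set<std::string> applied_data_files;
};

// Capture ids up front, mirroring the loop added at the top of apply().
Writer makeWriter(Edit & edit)
{
    Writer w;
    w.edit = &edit;
    for (const auto & r : edit.records)
    {
        if (r.data_file_id.has_value())
            w.applied_data_files.emplace(*r.data_file_id);
    }
    return w;
}

// Assumed merge step: the owner appends the followers' records to its own
// edit and clears the followers' edits, so they can no longer be inspected.
void mergeGroup(Writer & owner, const std::vector<Writer *> & followers)
{
    for (Writer * f : followers)
    {
        auto & src = f->edit->records;
        owner.edit->records.insert(owner.edit->records.end(), src.begin(), src.end());
        src.clear();
    }
}

// Owner and followers alike return only their pre-captured ids.
std::unordered_set<std::string> takeResult(Writer & w)
{
    return std::move(w.applied_data_files);
}
```

Because the ids are captured before the merge, a follower whose edit has been emptied by the owner still reports the data files it referenced, which is what the upper-layer lock cleanup described in the diff comments relies on.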