Adds Cache Groups concepts to Cripts by zwoop · Pull Request #12743 · apache/trafficserver

zwoop · 2025-12-10T00:07:56Z

This is a second version, since half of the original patch was merged in a separate PR.

Copilot

Pull request overview

This PR introduces Cache Groups functionality to Cripts, providing infrastructure for managing associations between cache entries using custom identifiers. This implementation follows an emerging RFC draft for cache group invalidation patterns in HTTP caching.

Key changes include:

New Cache::Group class with disk persistence, rotating hash maps, and configurable aging policies
Thread-safe operations using shared_mutex with automatic periodic syncing to disk
Example implementation and comprehensive documentation for using Cache Groups in Cripts

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 11 comments.

File	Description
src/cripts/CacheGroup.cc	Core implementation with hash-based storage, disk I/O, transaction logging, and Manager singleton for lifecycle management
include/cripts/CacheGroup.hpp	Public API defining the Group class with Insert/Lookup methods and nested Manager class
src/cripts/CMakeLists.txt	Adds CacheGroup.hpp to the list of public headers
include/cripts/Matcher.hpp	Includes algorithm header (duplicate include)
example/cripts/cache_groups.cc	Working example demonstrating Cache Groups for cache invalidation workflows
doc/developer-guide/cripts/cripts-misc.en.rst	Documentation explaining Cache Groups concept and usage patterns

bryancall

Thanks for adding Cache Groups to Cripts! This is a useful feature for implementing cache group invalidation patterns. I have a few observations:

Critical Bug: Iterator increment after erase()

In _cripts_cache_group_sync (line ~50-60), there's an iterator invalidation bug:

for (auto it = groups.begin(); it != groups.end() && processed < max_to_process; ++it) {
    if (auto group = it->second.lock()) {
      // ...
    } else {
      it = groups.erase(it);  // Returns next iterator, then loop does ++it, skipping an element
    }
}

When erase() returns the next valid iterator and then the loop increments again, an element gets skipped. Consider:

for (auto it = groups.begin(); it != groups.end() && processed < max_to_process; ) {
    if (auto group = it->second.lock()) {
      // ...
      ++it;
    } else {
      it = groups.erase(it);  // Don't increment here
    }
}
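
For reference, here is a minimal, self-contained sketch of the corrected erase-while-iterating pattern. The `Registry` alias and `prune_expired` function are hypothetical stand-ins, not the names used in the patch; the sketch only assumes a map whose values are `std::weak_ptr` and a cap on the number of live entries handled per pass.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the registry of groups; names are illustrative only.
using Registry = std::unordered_map<std::string, std::weak_ptr<int>>;

// Visits up to max_to_process live entries and erases every expired entry it
// encounters along the way. Returns the number of expired entries removed.
std::size_t
prune_expired(Registry &groups, std::size_t max_to_process)
{
  std::size_t processed = 0;
  std::size_t removed   = 0;

  // Note: no increment in the for-header; each branch advances the iterator itself.
  for (auto it = groups.begin(); it != groups.end() && processed < max_to_process;) {
    if (auto group = it->second.lock()) {
      ++processed; // live entry: the real per-group work would happen here
      ++it;
    } else {
      it = groups.erase(it); // erase() already returns the next valid iterator
      ++removed;
    }
  }
  assert(processed <= max_to_process); // postcondition: the bound was honored
  return removed;
}
```

Because the loop is bounded by `max_to_process`, a plain `std::erase_if` (C++20) would not be a drop-in replacement here; the manual loop keeps the per-pass limit.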

Error Handling

Missing file read error checking in LoadFromDisk(): The file.read() calls (lines ~229-232 and ~241) don't check if reads succeeded. If the file is corrupted or truncated, uninitialized data gets used.
clearLog() called unconditionally in WriteToDisk(): If syncMap() fails, the transaction log is still cleared, which could lead to data loss. Consider only clearing the log after all syncs succeed.
Inconsistent error reporting: Line 318 uses std::cerr while the rest of the code uses TSWarning. Should be consistent.
Missing filesystem error handling in Initialize() (lines ~85-86): create_directories and permissions can throw or fail silently. The clearLog() method (line 363) shows the correct pattern using error_code overloads.
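
A rough sketch of what the first and last points could look like in practice. The helper names (`read_exact`, `parse_count`, `ensure_private_dir`) and the choice of `owner_all` permissions are assumptions made for illustration, not code from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <istream>
#include <optional>
#include <sstream>
#include <string>
#include <system_error>
#include <type_traits>

// Reads exactly sizeof(T) bytes into value; returns false on a short or failed read,
// so a truncated file is detected instead of leaving value partly uninitialized.
template <typename T>
bool
read_exact(std::istream &in, T &value)
{
  static_assert(std::is_trivially_copyable_v<T>, "raw reads need a trivially copyable type");
  in.read(reinterpret_cast<char *>(&value), sizeof(T));
  return in.good() && in.gcount() == static_cast<std::streamsize>(sizeof(T));
}

// Example caller: parses a 4-byte count from a buffer, or nothing if it is truncated.
std::optional<std::uint32_t>
parse_count(const std::string &bytes)
{
  std::istringstream in(bytes);
  std::uint32_t      count = 0;

  if (!read_exact(in, count)) {
    return std::nullopt;
  }
  return count;
}

// Creates dir (and any parents) and restricts it to the owner, reporting failures
// through std::error_code rather than exceptions.
bool
ensure_private_dir(const std::filesystem::path &dir, std::error_code &ec)
{
  assert(!dir.empty());
  std::filesystem::create_directories(dir, ec);
  if (ec) {
    return false;
  }
  std::filesystem::permissions(dir, std::filesystem::perms::owner_all, std::filesystem::perm_options::replace, ec);
  return !ec;
}
```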

Documentation & Style

Spelling: "hodling" → "holding" (line 289), "assosication" → "association" (docs line 421)
Duplicate #include <algorithm> in Matcher.hpp
The magic number 63072000 (2 years in seconds) appears multiple times - consider a named constant with documentation explaining the choice
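
One possible shape for such a constant, as a sketch. The name `kTwoYears` is hypothetical; the only thing taken from the review is the value 63072000.

```cpp
#include <chrono>

// Hypothetical name. Two 365-day years, expressed so the derivation is visible;
// leap days are deliberately ignored, matching the literal used in the patch.
inline constexpr std::chrono::seconds kTwoYears = std::chrono::hours{24} * 365 * 2;

static_assert(kTwoYears.count() == 63072000, "must match the original literal");
```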

Testing

This is a significant feature with complex persistence logic (disk serialization, transaction log replay, crash recovery). Would be good to have automated test coverage for these code paths.

Minor API Note

The Factory() returning void* requiring manual delete in do_delete_instance() is a bit awkward, but I understand this may be constrained by the Cripts plugin interface.

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 15 comments.

- cleans up the notion around cached URLs and headers, and cache keys.
- adds APIs to set the lookup status as well

bryancall

Thanks for the updates here. After another pass, I still see two blocking concerns that were previously raised.

Durability and data-loss risk in write path

WriteToDisk updates last_sync before syncMap confirms success, and clearLog is called unconditionally at the end.
If map sync or rename fails, this can both suppress retries and drop transaction-log recovery state.
Please gate last_sync advancement and log truncation on successful persistence of all required slots.
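
The requested gating could be sketched roughly as follows. `Slot`, `write_all`, and the callback parameters are hypothetical and heavily simplified (timestamps are plain integers); the point is only the ordering: a slot's `last_sync` advances after its sync succeeds, and the log is cleared only when every dirty slot was persisted.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical slot state: a slot is dirty when last_write is newer than last_sync.
struct Slot {
  long last_write = 0;
  long last_sync  = 0;
};

// Syncs every dirty slot via sync_one and calls clear_log only if all of them
// succeeded. Returns true when the log was cleared.
bool
write_all(std::vector<Slot> &slots, const std::function<bool(std::size_t)> &sync_one, const std::function<void()> &clear_log)
{
  assert(sync_one && clear_log);
  bool all_ok = true;

  for (std::size_t i = 0; i < slots.size(); ++i) {
    Slot &slot = slots[i];

    if (slot.last_write <= slot.last_sync) {
      continue; // clean slot, nothing to persist
    }
    if (sync_one(i)) {
      slot.last_sync = slot.last_write; // advance only after confirmed persistence
    } else {
      all_ok = false; // slot stays dirty, so a later pass retries it
    }
  }
  if (all_ok) {
    clear_log(); // safe: everything the log protects is now on disk
  }
  return all_ok;
}
```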

Missing automated coverage for persistence and recovery semantics

This feature adds significant on-disk behavior (serialization, map rotation, transaction replay, crash and restart recovery), but there is still no targeted test coverage for these paths.
Please add tests for truncated or corrupt map files, sync or rename failure handling, and log replay correctness across restart.

The iterator-erase issue in the periodic sync path looks fixed; thanks for addressing that.

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

Duplicate malformed review; superseded by latest review.

WriteToDisk previously updated last_sync and cleared the transaction log before confirming syncMap succeeded. On rename failure the log would be lost even though the map was never written to disk. syncMap now returns bool; WriteToDisk reverts last_sync on failure and only calls clearLog when all dirty slots have been successfully synced. Adds Catch2 unit tests covering basic insert/lookup, persist-and-reload, transaction log replay across restarts, corrupt/truncated/wrong-version map file handling, sync failure log preservation, and map rotation. More unit tests will be done in a future PR.
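
To illustrate the idea of a sync step that reports failure, here is a small sketch of a write-to-temporary-then-rename helper that returns `bool`. It is not the patch's `syncMap`: the function name `atomic_write`, the `.tmp` suffix, and the string payload are assumptions, and the sketch does not attempt to flush data to stable storage.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

// Writes data to "<target>.tmp" and renames it over target. Returns false on any
// failure, in which case an existing target file is left untouched.
bool
atomic_write(const std::filesystem::path &target, const std::string &data)
{
  assert(!target.empty());
  std::filesystem::path tmp = target;
  tmp += ".tmp";

  {
    std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    out.flush();
    if (!out) {
      std::error_code ignored;
      std::filesystem::remove(tmp, ignored); // best-effort cleanup, never throws
      return false;
    }
  } // stream is closed here, before the rename

  std::error_code ec;
  std::filesystem::rename(tmp, target, ec);
  if (ec) {
    std::error_code ignored;
    std::filesystem::remove(tmp, ignored);
    return false;
  }
  return true;
}
```

A caller can then keep the transaction log whenever this returns `false`.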

Add missing standard headers in CacheGroup.cc. Use std::error_code overload for filesystem::remove in syncMap error paths. Add <filesystem> and <system_error> to CacheGroup.hpp. Use fixed-width types in _MapHeader for a stable on-disk format. Fix max_to_process calculation in the sync continuation to spread groups evenly across the sync window.
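
As an illustration of the fixed-width header point, a sketch of what such a struct might look like. The field names and the four-field layout are assumptions for this example and are not taken from the patch's `_MapHeader`.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical on-disk header; every field has an explicit width, so the layout
// does not change with the platform's size_t or time representation.
struct MapHeader {
  std::uint64_t version;       // format version, checked on load
  std::uint64_t created_ts;    // seconds since epoch
  std::uint64_t last_write_ts; // seconds since epoch
  std::uint64_t count;         // number of entries that follow
};

static_assert(std::is_trivially_copyable_v<MapHeader>, "safe to write as raw bytes");
static_assert(sizeof(MapHeader) == 4 * sizeof(std::uint64_t), "no padding between fields");
```

Fixed-width fields pin down sizes only; byte order still follows the host, which is acceptable when files never move between machines.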

Addressed.

bryancall

Everything else looks good -- persistence tests are solid, error handling is thorough, and all 8 existing tests pass clean with ASAN on Fedora 43. Just one bug in the revert logic that needs fixing.

Other minor observations (non-blocking, could be follow-up):

_Entry::timestamp (time_point) and _Entry::length (size_t) are written directly to disk but aren't fixed-width types like _MapHeader uses. The VERSION field provides a migration path, so this is fine for now.
The Cache Groups RST section is missing a .. _cripts-misc-cache-groups: anchor label for consistency with other sections.

zwoop · 2026-03-06T18:16:54Z

[approve ci freebsd]

zwoop · 2026-03-06T18:47:24Z

Everything else looks good -- persistence tests are solid, error handling is thorough, and all 8 existing tests pass clean with ASAN on Fedora 43. Just one bug in the revert logic that needs fixing.

Other minor observations (non-blocking, could be follow-up):

_Entry::timestamp (time_point) and _Entry::length (size_t) are written directly to disk but aren't fixed-width types like _MapHeader uses. The VERSION field provides a migration path, so this is fine for now.

The Cache Groups RST section is missing a .. _cripts-misc-cache-groups: anchor label for consistency with other sections.

Added the docs link, ignoring the first one (these files are transient, and most certainly won't go between hosts or architectures or ATS versions).

Fix sync retry logic: revert last_sync to its previous value on syncMap failure instead of setting it to last_write. The old code made the slot appear clean, preventing retries until a new Insert bumped last_write again.
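
The difference between the two revert strategies can be shown with a small sketch. `Slot` and `sync_slot` are hypothetical simplifications with integer timestamps; the sketch assumes, as the description above suggests, that `last_sync` is updated before the persistence attempt and must be restored afterwards on failure.

```cpp
#include <cassert>

// Hypothetical slot; a slot is dirty while last_write is newer than last_sync.
struct Slot {
  long last_write = 0;
  long last_sync  = 0;

  bool
  dirty() const
  {
    return last_write > last_sync;
  }
};

// Optimistically marks the slot as synced, then restores the previous last_sync
// if persist() reports failure, so the slot is retried on the next pass.
template <typename Persist>
bool
sync_slot(Slot &slot, Persist persist)
{
  assert(slot.dirty()); // only dirty slots need syncing
  const long previous = slot.last_sync;

  slot.last_sync = slot.last_write;
  if (!persist()) {
    slot.last_sync = previous; // restoring to last_write here would hide the failure
    return false;
  }
  return true;
}
```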

zwoop added this to the 10.2.0 milestone Dec 10, 2025

zwoop self-assigned this Dec 10, 2025

zwoop added the Cripts label Dec 10, 2025

bryancall requested review from bryancall, cmcfarlen and Copilot December 15, 2025 22:50

Copilot started reviewing on behalf of bryancall December 15, 2025 22:51

bryancall added the New Feature label Dec 15, 2025

Copilot AI reviewed Dec 15, 2025

bryancall requested changes Dec 16, 2025

zwoop force-pushed the CacheGroup branch from f55de47 to 12c5cc5 Compare December 22, 2025 17:38

zwoop requested a review from Copilot December 22, 2025 17:38

Copilot started reviewing on behalf of zwoop December 22, 2025 17:38

Copilot AI reviewed Dec 22, 2025

zwoop force-pushed the CacheGroup branch from 12c5cc5 to 6eb4079 Compare December 30, 2025 22:19

zwoop added 2 commits February 3, 2026 13:43

Adds Cache Groups concepts to Cripts

a7d4224

Changes from code review

982fc6d

zwoop force-pushed the CacheGroup branch from 6eb4079 to 982fc6d Compare February 3, 2026 21:10

cmcfarlen previously approved these changes Feb 9, 2026

cmcfarlen added this to ATS v10.2.x Feb 10, 2026

bryancall requested review from bryancall and Copilot February 26, 2026 23:29

Copilot started reviewing on behalf of bryancall February 26, 2026 23:30

bryancall previously requested changes Feb 26, 2026

Copilot AI reviewed Feb 26, 2026

zwoop dismissed cmcfarlen’s stale review via cb279c4 February 27, 2026 16:23

cmcfarlen requested review from bryancall and cmcfarlen March 4, 2026 14:27

bryancall requested changes Mar 5, 2026

zwoop force-pushed the CacheGroup branch from 5ec23ba to 52078ee Compare March 6, 2026 18:46

Address bryancall's review comments

8e4703c

zwoop force-pushed the CacheGroup branch from 52078ee to 8e4703c Compare March 6, 2026 19:28

zwoop requested a review from bryancall March 7, 2026 03:59

4 participants