Skip to content

Adds Cache Groups concepts to Cripts#12743

Open
zwoop wants to merge 5 commits intoapache:masterfrom
zwoop:CacheGroup
Open

Adds Cache Groups concepts to Cripts#12743
zwoop wants to merge 5 commits intoapache:masterfrom
zwoop:CacheGroup

Conversation

@zwoop
Copy link
Contributor

@zwoop zwoop commented Dec 10, 2025

This is a second version, since half of the original patch was merged in a separate PR.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces Cache Groups functionality to Cripts, providing infrastructure for managing associations between cache entries using custom identifiers. This implementation follows an emerging RFC draft for cache group invalidation patterns in HTTP caching.

Key changes include:

  • New Cache::Group class with disk persistence, rotating hash maps, and configurable aging policies
  • Thread-safe operations using shared_mutex with automatic periodic syncing to disk
  • Example implementation and comprehensive documentation for using Cache Groups in Cripts

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
src/cripts/CacheGroup.cc Core implementation with hash-based storage, disk I/O, transaction logging, and Manager singleton for lifecycle management
include/cripts/CacheGroup.hpp Public API defining the Group class with Insert/Lookup methods and nested Manager class
src/cripts/CMakeLists.txt Adds CacheGroup.hpp to the list of public headers
include/cripts/Matcher.hpp Includes algorithm header (duplicate include)
example/cripts/cache_groups.cc Working example demonstrating Cache Groups for cache invalidation workflows
doc/developer-guide/cripts/cripts-misc.en.rst Documentation explaining Cache Groups concept and usage patterns

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

@bryancall bryancall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding Cache Groups to Cripts! This is a useful feature for implementing cache group invalidation patterns. I have a few observations:

Critical Bug: Iterator increment after erase()

In _cripts_cache_group_sync (line ~50-60), there's an iterator invalidation bug:

for (auto it = groups.begin(); it != groups.end() && processed < max_to_process; ++it) {
    if (auto group = it->second.lock()) {
      // ...
    } else {
      it = groups.erase(it);  // Returns next iterator, then loop does ++it, skipping an element
    }
}

When erase() returns the next valid iterator and then the loop increments again, an element gets skipped. Consider:

for (auto it = groups.begin(); it != groups.end() && processed < max_to_process; ) {
    if (auto group = it->second.lock()) {
      // ...
      ++it;
    } else {
      it = groups.erase(it);  // Don't increment here
    }
}

Error Handling

  1. Missing file read error checking in LoadFromDisk(): The file.read() calls (lines ~229-232 and ~241) don't check if reads succeeded. If the file is corrupted or truncated, uninitialized data gets used.

  2. clearLog() called unconditionally in WriteToDisk(): If syncMap() fails, the transaction log is still cleared, which could lead to data loss. Consider only clearing the log after all syncs succeed.

  3. Inconsistent error reporting: Line 318 uses std::cerr while the rest of the code uses TSWarning. Should be consistent.

  4. Missing filesystem error handling in Initialize() (lines ~85-86): create_directories and permissions can throw or fail silently. The clearLog() method (line 363) shows the correct pattern using error_code overloads.

Documentation & Style

  1. Spelling: "hodling" → "holding" (line 289), "assosication" → "association" (docs line 421)
  2. Duplicate #include <algorithm> in Matcher.hpp
  3. The magic number 63072000 (2 years in seconds) appears multiple times - consider a named constant with documentation explaining the choice

Testing

This is a significant feature with complex persistence logic (disk serialization, transaction log replay, crash recovery). Would be good to have automated test coverage for these code paths.

Minor API Note

The Factory() returning void* requiring manual delete in do_delete_instance() is a bit awkward, but I understand this may be constrained by the Cripts plugin interface.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 15 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

zwoop added 2 commits February 3, 2026 13:43
- cleans up the notion around cached URLs and headers, and cache keys.
- adds APIs to set the lookup status as well
cmcfarlen
cmcfarlen previously approved these changes Feb 9, 2026
bryancall

This comment was marked as outdated.

bryancall

This comment was marked as outdated.

Copy link
Contributor

@bryancall bryancall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates here. After another pass, I still see two blocking concerns that were previously raised.

  1. Durability and data-loss risk in write path
  • WriteToDisk updates last_sync before syncMap confirms success, and clearLog is called unconditionally at the end.
  • If map sync or rename fails, this can both suppress retries and drop transaction-log recovery state.
  • Please gate last_sync advancement and log truncation on successful persistence of all required slots.
  1. Missing automated coverage for persistence and recovery semantics
  • This feature adds significant on-disk behavior (serialization, map rotation, transaction replay, crash and restart recovery), but there is still no targeted test coverage for these paths.
  • Please add tests for truncated or corrupt map files, sync or rename failure handling, and log replay correctness across restart.

The iterator-erase issue in the periodic sync path looks fixed; thanks for addressing that.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

@bryancall bryancall dismissed stale reviews from themself February 26, 2026 23:36

Duplicate malformed review; superseded by latest review.

WriteToDisk previously updated last_sync and cleared the transaction
log before confirming syncMap succeeded. On rename failure the log
would be lost even though the map was never written to disk.

syncMap now returns bool; WriteToDisk reverts last_sync on failure and
only calls clearLog when all dirty slots have been successfully synced.

Adds Catch2 unit tests covering basic insert/lookup, persist-and-reload,
transaction log replay across restarts, corrupt/truncated/wrong-version
map file handling, sync failure log preservation, and map rotation.

More unit tests will be done in a future PR.
Add missing standard headers in CacheGroup.cc. Use std::error_code
overload for filesystem::remove in syncMap error paths. Add <filesystem>
and <system_error> to CacheGroup.hpp. Use fixed-width types in _MapHeader
for a stable on-disk format. Fix max_to_process calculation in the sync
continuation to spread groups evenly across the sync window.
Copy link
Contributor

@bryancall bryancall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything else looks good -- persistence tests are solid, error handling is thorough, and all 8 existing tests pass clean with ASAN on Fedora 43. Just one bug in the revert logic that needs fixing.

Other minor observations (non-blocking, could be follow-up):

  • _Entry::timestamp (time_point) and _Entry::length (size_t) are written directly to disk but aren't fixed-width types like _MapHeader uses. The VERSION field provides a migration path, so this is fine for now.
  • The Cache Groups RST section is missing a .. _cripts-misc-cache-groups: anchor label for consistency with other sections.

@zwoop
Copy link
Contributor Author

zwoop commented Mar 6, 2026

[approve ci freebsd]

@zwoop
Copy link
Contributor Author

zwoop commented Mar 6, 2026

Everything else looks good -- persistence tests are solid, error handling is thorough, and all 8 existing tests pass clean with ASAN on Fedora 43. Just one bug in the revert logic that needs fixing.

Other minor observations (non-blocking, could be follow-up):

  • _Entry::timestamp (time_point) and _Entry::length (size_t) are written directly to disk but aren't fixed-width types like _MapHeader uses. The VERSION field provides a migration path, so this is fine for now.
  • The Cache Groups RST section is missing a .. _cripts-misc-cache-groups: anchor label for consistency with other sections.

Added the docs link, ignoring the first one (these files are transient, and most certainly won't go between hosts or architectures or ATS versions.

Fix sync retry logic: revert last_sync to its previous value on
syncMap failure instead of setting it to last_write. The old code
made the slot appear clean, preventing retries until a new Insert
bumped last_write again.
@zwoop zwoop requested a review from bryancall March 7, 2026 03:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

4 participants