Context
The Entireio CLI uses 256-bucket sharded storage for checkpoint metadata: <id[:2]>/<id[2:]>/metadata.json. This distributes files across directories to avoid performance degradation from large flat directory listings in Git.
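As a minimal sketch of the layout above: this assumes checkpoint IDs are lowercase hexadecimal strings, which is what makes a two-character prefix yield 256 buckets. The function name `sharded_metadata_path` is hypothetical and not taken from the Entireio CLI.

```python
from pathlib import PurePosixPath


def sharded_metadata_path(checkpoint_id: str) -> PurePosixPath:
    """Map a checkpoint ID to <id[:2]>/<id[2:]>/metadata.json.

    Assumption: IDs are hex strings at least 3 characters long, so the
    first two characters select one of 256 shard directories.
    """
    if len(checkpoint_id) < 3:
        raise ValueError("checkpoint ID must be at least 3 characters long")
    # POSIX-style separators, since these are paths inside a Git tree.
    return PurePosixPath(checkpoint_id[:2]) / checkpoint_id[2:] / "metadata.json"


print(sharded_metadata_path("a1b2c3d4e5f6"))  # a1/b2c3d4e5f6/metadata.json
```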
Current State
Our checkpoint system stores data on the egg/checkpoints/v2 branch with a multi-dimensional index structure. As the number of checkpoints grows (especially with multi-agent pipelines producing many sessions per issue), the branch could accumulate significant data.
Proposal
Evaluate and potentially adopt sharded storage for checkpoint metadata:
- Shard by the first 2 characters of the checkpoint ID (256 buckets if IDs are hexadecimal)
- Keeps directory sizes manageable for Git operations
- Improves git ls-tree and git checkout -- path performance
- Consider also adding checkpoint pruning/archival for old data
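To get a rough feel for how prefix sharding affects directory sizes, the sketch below groups synthetic IDs by their first two characters and compares the flat entry count with the largest shard. The IDs are generated from SHA-1 purely for illustration; real checkpoint IDs may be distributed differently, and `bucket_sizes` is a hypothetical helper.

```python
import hashlib
from collections import Counter


def bucket_sizes(checkpoint_ids):
    """Count how many IDs fall into each two-character prefix bucket."""
    return Counter(cid[:2] for cid in checkpoint_ids)


# Synthetic hex IDs standing in for checkpoint IDs (assumption).
ids = [hashlib.sha1(str(n).encode()).hexdigest()[:12] for n in range(10_000)]
sizes = bucket_sizes(ids)

print(len(ids), "entries if stored in one flat directory")
print(len(sizes), "top-level shard directories")
print(max(sizes.values()), "entries in the largest shard")
```

With evenly distributed hex IDs, each shard holds roughly 1/256 of the total, so a tree listing touches far fewer entries per directory.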
This is a scalability concern — not urgent but worth addressing before the checkpoint branch becomes unwieldy.
Reference
See entireio/cli — sharded path structure in checkpoint storage.
Authored-by: egg