libCacheSim is fast, with features inherited from the underlying libCacheSim library:
- High performance - over 20M requests/sec for realistic trace replay
- High memory efficiency - predictable and small memory footprint
- Parallelism out of the box - uses multiple CPU cores to speed up trace analysis and cache simulations
libCacheSim is flexible and easy to use, with:
- Seamless integration with an open-source cache dataset consisting of thousands of traces hosted on S3
- High-throughput simulation via the underlying libCacheSim library
- Fine-grained control over cache requests and other internal data
- Custom plugin cache development without any compilation
Supported environments:
- OS: Linux / macOS
- Python: 3.9 -- 3.13
Binary installers for the latest released version are available at the Python Package Index (PyPI).
```bash
pip install libcachesim
```

It's recommended to use uv, a very fast Python environment manager, to create and manage Python environments:

```bash
uv venv --python 3.12 --seed
source .venv/bin/activate
uv pip install libcachesim
```

For users who want to run the LRB, ThreeLCache, and GLCache eviction algorithms:
!!! important
    If uv cannot find built wheels for your machine, the build system will skip these algorithms by default.
    To enable them, you need to install all third-party dependencies first:
```bash
git clone https://github.com/cacheMon/libCacheSim-python.git
cd libCacheSim-python
bash scripts/install_deps.sh

# If you cannot install software directly (e.g., no sudo access)
bash scripts/install_deps_user.sh
```

Then you can reinstall libcachesim with the following commands (you may need to add `--no-cache-dir` to force a build from scratch):
```bash
# Enable LRB
CMAKE_ARGS="-DENABLE_LRB=ON" uv pip install libcachesim

# Enable ThreeLCache
CMAKE_ARGS="-DENABLE_3L_CACHE=ON" uv pip install libcachesim

# Enable GLCache
CMAKE_ARGS="-DENABLE_GLCACHE=ON" uv pip install libcachesim
```

If there are no wheels suitable for your environment, consider building from source.
```bash
bash scripts/install.sh
```

Run all tests to ensure the package works:
```bash
python -m pytest tests/
```

With libcachesim installed, you can start simulating eviction algorithms on cache traces:
```python
import libcachesim as lcs

# Step 1: Fetch one trace from the S3 bucket
URI = "cache_dataset_oracleGeneral/2007_msr/msr_hm_0.oracleGeneral.zst"
dl = lcs.DataLoader()
dl.load(URI)

# Step 2: Open the trace for efficient processing
reader = lcs.TraceReader(
    trace=dl.get_cache_path(URI),
    trace_type=lcs.TraceType.ORACLE_GENERAL_TRACE,
    reader_init_params=lcs.ReaderInitParam(ignore_obj_size=False),
)

# Step 3: Initialize a cache
cache = lcs.S3FIFO(cache_size=1024 * 1024)

# Step 4: Process the entire trace efficiently (C++ backend)
obj_miss_ratio, byte_miss_ratio = cache.process_trace(reader)
print(f"Object miss ratio: {obj_miss_ratio:.4f}, Byte miss ratio: {byte_miss_ratio:.4f}")

# Step 4.1: Process only a limited number of requests
cache = lcs.S3FIFO(cache_size=1024 * 1024)
obj_miss_ratio, byte_miss_ratio = cache.process_trace(
    reader,
    start_req=0,
    max_req=1000,
)
print(f"Object miss ratio: {obj_miss_ratio:.4f}, Byte miss ratio: {byte_miss_ratio:.4f}")
```

You can also feed individual requests to a cache:

```python
import libcachesim as lcs

# Create a cache
cache = lcs.LRU(cache_size=1024 * 1024)  # 1MB cache

# Process requests
req = lcs.Request()
req.obj_id = 1
req.obj_size = 100
print(cache.get(req))  # False (first access)
print(cache.get(req))  # True (second access)
```

Here is an example demonstrating how to use TraceAnalyzer:
```python
import libcachesim as lcs

# Step 1: Fetch one trace from the S3 bucket
URI = "cache_dataset_oracleGeneral/2007_msr/msr_hm_0.oracleGeneral.zst"
dl = lcs.DataLoader()
dl.load(URI)

reader = lcs.TraceReader(
    trace=dl.get_cache_path(URI),
    trace_type=lcs.TraceType.ORACLE_GENERAL_TRACE,
    reader_init_params=lcs.ReaderInitParam(ignore_obj_size=False),
)

analysis_option = lcs.AnalysisOption(
    req_rate=True,                   # Keep basic request rate analysis
    access_pattern=False,            # Disable access pattern analysis
    size=True,                       # Keep size analysis
    reuse=False,                     # Disable reuse analysis for small datasets
    popularity=False,                # Disable popularity analysis for small datasets (< 200 objects)
    ttl=False,                       # Disable TTL analysis
    popularity_decay=False,          # Disable popularity decay analysis
    lifetime=False,                  # Disable lifetime analysis
    create_future_reuse_ccdf=False,  # Disable experimental feature
    prob_at_age=False,               # Disable experimental feature
    size_change=False,               # Disable size change analysis
)

analysis_param = lcs.AnalysisParam()

analyzer = lcs.TraceAnalyzer(
    reader, "example_analysis", analysis_option=analysis_option, analysis_param=analysis_param
)
analyzer.run()
```

libCacheSim lets you develop your own cache eviction algorithms and test them via the plugin system, without any C/C++ compilation required.
The `PluginCache` class lets you define custom caching behavior through Python callbacks. You need to implement the following hook functions:
| Function | Signature | Description |
|---|---|---|
| `init_hook` | `(common_cache_params: CommonCacheParams) -> Any` | Initialize your data structure |
| `hit_hook` | `(data: Any, request: Request) -> None` | Handle cache hits |
| `miss_hook` | `(data: Any, request: Request) -> None` | Handle cache misses |
| `eviction_hook` | `(data: Any, request: Request) -> int` | Return the object ID to evict |
| `remove_hook` | `(data: Any, obj_id: int) -> None` | Clean up when an object is removed |
| `free_hook` | `(data: Any) -> None` | (Optional) Final cleanup |
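To make the contract concrete, here is a minimal pure-Python sketch of how these hooks might compose on each request. The dispatch order shown is an assumption for illustration, not libCacheSim's actual C++ implementation, and `Req` is a hypothetical stand-in for `Request`:

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import Any


@dataclass
class Req:
    """Hypothetical stand-in for libcachesim's Request."""
    obj_id: int
    obj_size: int


def init_hook(_params: Any = None) -> Any:
    return OrderedDict()  # obj_id -> obj_size, least recently used first


def hit_hook(data: OrderedDict, req: Req) -> None:
    data.move_to_end(req.obj_id)  # LRU: promote on hit


def miss_hook(data: OrderedDict, req: Req) -> None:
    data[req.obj_id] = req.obj_size


def eviction_hook(data: OrderedDict, _req: Req) -> int:
    return next(iter(data))  # LRU victim: least recently used object


def remove_hook(data: OrderedDict, obj_id: int) -> None:
    data.pop(obj_id, None)


def simulate(requests: list, cache_size: int) -> int:
    """Assumed dispatch: hit or miss, then evict while over capacity."""
    data = init_hook()
    used = hits = 0
    for req in requests:
        if req.obj_id in data:
            hits += 1
            hit_hook(data, req)
        else:
            miss_hook(data, req)
            used += req.obj_size
            while used > cache_size:
                victim = eviction_hook(data, req)
                used -= data[victim]
                remove_hook(data, victim)
    return hits


reqs = [Req(i, 1) for i in (1, 2, 3, 1, 2, 4, 1)]
print(simulate(reqs, cache_size=3))  # 3 hits out of 7 requests
```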
The following implements LRU as a plugin cache and compares it against the built-in `LRU`:

```python
from collections import OrderedDict
from typing import Any

import libcachesim as lcs
from libcachesim import PluginCache, LRU, CommonCacheParams, Request


def init_hook(_: CommonCacheParams) -> Any:
    return OrderedDict()


def hit_hook(data: Any, req: Request) -> None:
    data.move_to_end(req.obj_id, last=True)


def miss_hook(data: Any, req: Request) -> None:
    data[req.obj_id] = req.obj_size


def eviction_hook(data: Any, _: Request) -> int:
    return data.popitem(last=False)[0]


def remove_hook(data: Any, obj_id: int) -> None:
    data.pop(obj_id, None)


def free_hook(data: Any) -> None:
    data.clear()


plugin_lru_cache = PluginCache(
    cache_size=128,
    cache_init_hook=init_hook,
    cache_hit_hook=hit_hook,
    cache_miss_hook=miss_hook,
    cache_eviction_hook=eviction_hook,
    cache_remove_hook=remove_hook,
    cache_free_hook=free_hook,
    cache_name="Plugin_LRU",
)

reader = lcs.SyntheticReader(num_objects=1000, num_of_req=10000, obj_size=1)
req_miss_ratio, byte_miss_ratio = plugin_lru_cache.process_trace(reader)
ref_req_miss_ratio, ref_byte_miss_ratio = LRU(128).process_trace(reader)
print(f"plugin req miss ratio {req_miss_ratio}, ref req miss ratio {ref_req_miss_ratio}")
print(f"plugin byte miss ratio {byte_miss_ratio}, ref byte miss ratio {ref_byte_miss_ratio}")
```

By defining custom hook functions for cache initialization, hit, miss, eviction, removal, and cleanup, you can easily prototype and test your own cache eviction algorithms.
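For example, switching such a plugin from LRU to FIFO only requires changing the hit behavior: a FIFO hit does not promote the object. The `fifo_*` names below are hypothetical, and the hooks are exercised directly on an `OrderedDict` with `SimpleNamespace` standing in for `Request`; wiring them into `PluginCache` (together with the same `remove_hook` as above) works exactly as in the LRU example:

```python
from collections import OrderedDict
from types import SimpleNamespace


def fifo_hit_hook(data: OrderedDict, req) -> None:
    pass  # unlike LRU, a FIFO hit does not reorder anything


def fifo_miss_hook(data: OrderedDict, req) -> None:
    data[req.obj_id] = req.obj_size


def fifo_eviction_hook(data: OrderedDict, _req) -> int:
    return next(iter(data))  # evict in insertion order


# Exercise the hooks directly: even after a hit on object 1,
# FIFO still evicts object 1 first, because insertion order wins.
d = OrderedDict()
for oid in (1, 2, 3):
    fifo_miss_hook(d, SimpleNamespace(obj_id=oid, obj_size=1))
fifo_hit_hook(d, SimpleNamespace(obj_id=1, obj_size=1))
print(fifo_eviction_hook(d, None))  # 1
```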
- Check project documentation for detailed guides
- Open issues on GitHub
- Review examples in the main repository
Please cite the following papers if you use libCacheSim.
```bibtex
@inproceedings{yang2020-workload,
  author = {Juncheng Yang and Yao Yue and K. V. Rashmi},
  title = {A large-scale analysis of hundreds of in-memory cache clusters at Twitter},
  booktitle = {14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20)},
  year = {2020},
  isbn = {978-1-939133-19-9},
  pages = {191--208},
  url = {https://www.usenix.org/conference/osdi20/presentation/yang},
  publisher = {USENIX Association},
}

@inproceedings{yang2023-s3fifo,
  title = {FIFO Queues Are All You Need for Cache Eviction},
  author = {Juncheng Yang and Yazhuo Zhang and Ziyue Qiu and Yao Yue and K.V. Rashmi},
  isbn = {9798400702297},
  publisher = {Association for Computing Machinery},
  booktitle = {Symposium on Operating Systems Principles (SOSP'23)},
  pages = {130--149},
  numpages = {20},
  year = {2023},
}

@inproceedings{yang2023-qdlp,
  author = {Juncheng Yang and Ziyue Qiu and Yazhuo Zhang and Yao Yue and K.V. Rashmi},
  title = {FIFO Can Be Better than LRU: The Power of Lazy Promotion and Quick Demotion},
  year = {2023},
  isbn = {9798400701955},
  publisher = {Association for Computing Machinery},
  doi = {10.1145/3593856.3595887},
  booktitle = {Proceedings of the 19th Workshop on Hot Topics in Operating Systems (HotOS23)},
  pages = {70--79},
  numpages = {10},
}
```
See LICENSE for details.