Introduction
I’m looking for a cache simulator or benchmark suite suitable for evaluating the kind of tiered, ephemeral caching systems that large language model (LLM) providers use. One example is Anthropic’s 4-tier prompt cache, where context is distributed across multiple tiers with different costs, residency windows, and eviction rules.
Current Situation
I’ve already tried libCacheSim. It is a robust tool for simulating classical cache policies such as LRU, FIFO, ARC, SIEVE, S3-FIFO, and W-TinyLFU, plus a Belady oracle, and it supports plugins and synthetic traces. However, it is a poor fit for tiered caching systems:
- libCacheSim focuses on single caches rather than a hierarchy of multiple tiers with different costs.
- It lacks support for partial or multi-tier residency of objects across different cache levels.
- The miss cost model is uniform, whereas LLM caching requires distinguishing between misses at different cache levels (e.g., tier 1 versus tier 3), each with its own cost.
- The trace model consists of atomic get/put operations rather than edit streams in which cached objects can mutate in place.
- There’s no first-class support for token-weighted object sizes, which are crucial for LLM caches.
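To make the gaps listed above concrete, here is a minimal sketch of what a simulator would need to model: several tiers, each with its own residency window (TTL) and per-token hit cost, and objects sized in tokens. Everything in it is hypothetical and illustrative only: the `Tier` and `TieredCache` names, the inclusive admission policy, the LRU eviction, and all cost numbers are assumptions, not libCacheSim’s API or any provider’s actual cache design.

```python
from dataclasses import dataclass, field


@dataclass
class Tier:
    """One cache tier (hypothetical model, not a real provider's design)."""
    name: str
    capacity_tokens: int  # capacity is measured in tokens, not object count
    ttl: float            # residency window in seconds
    hit_cost: float       # cost per token when served from this tier
    entries: dict = field(default_factory=dict)  # key -> (tokens, last_access)

    def used(self):
        return sum(tokens for tokens, _ in self.entries.values())

    def expire(self, now):
        # Drop entries whose residency window has elapsed.
        stale = [k for k, (_, ts) in self.entries.items() if now - ts > self.ttl]
        for k in stale:
            del self.entries[k]

    def put(self, key, tokens, now):
        if tokens > self.capacity_tokens:
            return  # object can never fit in this tier
        self.entries.pop(key, None)
        # Assumed policy: evict least recently used entries until it fits.
        while self.used() + tokens > self.capacity_tokens:
            victim = min(self.entries, key=lambda k: self.entries[k][1])
            del self.entries[victim]
        self.entries[key] = (tokens, now)


class TieredCache:
    def __init__(self, tiers, miss_cost):
        self.tiers = tiers          # ordered from fastest/cheapest to slowest
        self.miss_cost = miss_cost  # cost per token on a full miss
        self.total_cost = 0.0

    def _charge(self, outcome, cost):
        self.total_cost += cost
        return outcome, cost

    def get(self, key, tokens, now):
        for tier in self.tiers:
            tier.expire(now)
        for i, tier in enumerate(self.tiers):
            if key in tier.entries:
                # Refresh this tier and promote into every faster tier.
                for upper in self.tiers[: i + 1]:
                    upper.put(key, tokens, now)
                return self._charge(tier.name, tokens * tier.hit_cost)
        # Full miss: admit into every tier (assumed inclusive hierarchy).
        for tier in self.tiers:
            tier.put(key, tokens, now)
        return self._charge("miss", tokens * self.miss_cost)


cache = TieredCache(
    [Tier("hot", 1_000, ttl=5, hit_cost=0.25),
     Tier("warm", 10_000, ttl=60, hit_cost=0.5)],
    miss_cost=1.0,
)
for t in (0, 3, 20, 100):
    print(t, cache.get("prompt-a", 100, now=t))
```

The point of the sketch is that the outcome of a lookup is not a binary hit or miss: the same key costs a different amount depending on which tier still holds it when the request arrives.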
Request for Guidance
Does anyone know of cache-testing software specifically designed to evaluate tiered caching systems like those used by LLM providers? Ideally it would model multiple tiers with per-tier cost and residency policies, tokenised objects, and edit-driven workloads. Academic code and research prototypes are equally welcome. Even partial matches (e.g., key-value cache simulators for inference servers) would be helpful.
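As an illustration of what “edit-driven workload” means here, the sketch below replays a stream of edits against a tokenised prompt and counts how many tokens could be reused from the previous version versus recomputed. It assumes prefix-style reuse, where only the unchanged leading tokens of the previous version count as cached. The function names, the `(position, delete_count, inserted_tokens)` edit format, and the use of plain strings as stand-ins for tokens are all hypothetical.

```python
def common_prefix_len(a, b):
    """Number of leading tokens shared by two token lists."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def apply_edit(tokens, pos, delete_count, inserted):
    """Return a new token list with `delete_count` tokens at `pos` replaced."""
    return tokens[:pos] + list(inserted) + tokens[pos + delete_count:]


def replay(initial, edits):
    """Replay edits; return (reused_tokens, recomputed_tokens).

    Assumption: only the longest common prefix with the previous version
    is reusable, so an edit near the start invalidates everything after it.
    """
    cached = list(initial)
    reused_total, recomputed_total = 0, len(cached)  # first request is all miss
    for pos, delete_count, inserted in edits:
        current = apply_edit(cached, pos, delete_count, inserted)
        reused = common_prefix_len(cached, current)
        reused_total += reused
        recomputed_total += len(current) - reused
        cached = current
    return reused_total, recomputed_total


prompt = ["sys", "doc1", "doc2", "q1"]
edits = [
    (4, 0, ["a1", "q2"]),  # append a turn: the whole old prompt is reusable
    (1, 1, ["doc1b"]),     # edit an early document: most of the prompt is lost
]
print(replay(prompt, edits))
```

A get/put trace cannot express the second edit: the object keeps its identity but only part of it remains valid, which is exactly the behaviour a suitable benchmark would need to capture.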
Key Takeaways
- libCacheSim is a good starting point but lacks the features needed for tiered caching systems.
- No existing software specifically tailored to LLM-provider-style caches has been found yet.
- There’s a need for more research and development in this area to create suitable benchmarking tools.

![Cache-testing software for LLM-provider-style tiered ephemeral caches? [D]](https://ai-maestro.online/wp-content/uploads/2026/05/cache-testing-software-for-llm-provider-style-tiered-ephemer-1024x576.jpg)


