Response Caching¶
AgentOpt caches LLM responses at the HTTP level to avoid redundant API calls during model selection.
How It Works¶
graph LR
A[LLM Call] --> B{In cache?}
B -->|Yes| C[Return cached response]
B -->|No| D[Call API]
D --> E[Store in memory]
E --> F[Background flush to SQLite]
| Property | Detail |
|---|---|
| Cache key | SHA-256 of the request body (model + messages + params), excluding stream |
| In-memory | Thread-safe dict — always active when caching is on |
| On disk | Optional SQLite database (cache.db), flushed every 10 seconds by a background thread |
Cached responses include the original latency measurement, so cost and latency comparisons remain fair.
Why It Matters¶
During model selection, many LLM calls are identical:
Shared model calls
If two combinations use the same planner model, the planner call for each datapoint is identical. With 9 combinations and 3 distinct planners, you pay for 3 unique planner calls per datapoint — not 9.
Re-runs are free
Tweak your eval function and re-run? Every LLM call hits the cache. Zero API cost, instant results.
Crash recovery
If a long run is interrupted, cached responses survive on disk. Resume without re-calling the API.
Enabling Disk Cache¶
By default, caching is in-memory only (lost when the process exits). To persist:
from agentopt.proxy import LLMTracker
tracker = LLMTracker(cache_dir="./llm_cache")
selector = BruteForceModelSelector(
...,
tracker=tracker,
)
results = selector.select_best()
# Cache automatically flushed to ./llm_cache/cache.db
On subsequent runs with the same cache_dir, entries are loaded from disk at startup.
Cache Lifecycle¶
| Event | What Happens |
|---|---|
LLMTracker(cache_dir=...) |
Creates DB if needed, loads existing entries into memory |
| LLM call (cache miss) | Response stored in memory, marked dirty |
| Background flush (every 10s) | Dirty entries written to SQLite |
tracker.stop() / select_best() returns |
Final flush to disk |
tracker.clear_cache() |
Clears memory and deletes all DB rows |
Disabling Cache¶
Inspecting the Cache¶
The cache is a standard SQLite database: