# Query Result Caching
Matchy includes a built-in LRU (Least Recently Used) cache for query results, providing 2-10x performance improvements for workloads with repeated queries.
## Overview
The cache stores query results in memory, eliminating the need to re-execute database lookups for previously seen queries. This is particularly valuable for:
- Web APIs serving repeated requests
- Firewalls checking the same IPs frequently
- Real-time threat detection with hot patterns
- High-traffic services with predictable query patterns
## Performance
Cache performance depends on the hit rate (percentage of queries found in cache):
| Hit Rate | Speedup vs Uncached | Use Case |
|---|---|---|
| 0% | 1.0x (no benefit) | Batch processing, unique queries |
| 50% | 1.5-2x | Mixed workload |
| 80% | 3-5x | Web API, typical firewall |
| 95% | 5-8x | High-traffic service |
| 99% | 8-10x | Repeated pattern checking |
**Zero overhead when disabled:** The cache uses compile-time optimization, so disabling it has no performance cost.
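These ranges follow from a simple expected-cost model: if a cache hit costs `t_hit`, a miss costs `t_miss`, and the hit rate is `h`, the average query time is `h * t_hit + (1 - h) * t_miss`. The sketch below illustrates this; the nanosecond figures are illustrative assumptions (a hit roughly 10x cheaper than a full lookup), not measured Matchy numbers.

```rust
// Back-of-envelope speedup model. The timings are illustrative
// assumptions, not Matchy benchmarks.
fn expected_speedup(hit_rate: f64, t_hit_ns: f64, t_miss_ns: f64) -> f64 {
    t_miss_ns / (hit_rate * t_hit_ns + (1.0 - hit_rate) * t_miss_ns)
}

fn main() {
    for h in [0.0, 0.5, 0.8, 0.95, 0.99] {
        // Assume ~25 ns per cache hit vs ~250 ns per uncached lookup
        println!("hit rate {:>3.0}% → ~{:.1}x", h * 100.0, expected_speedup(h, 25.0, 250.0));
    }
}
```

With those assumptions the model reproduces the table above: roughly 1.8x at 50%, 3.6x at 80%, 6.9x at 95%, and 9.2x at 99%.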
## Configuration

### Enabling the Cache
Use the builder API to configure cache capacity:
```rust
use matchy::Database;

// Enable the cache with a 10,000-entry capacity
let db = Database::from("threats.mxy")
    .cache_capacity(10_000)
    .open()?;

// Use the database normally - caching is transparent
if let Some(result) = db.lookup("evil.com")? {
    println!("Match: {:?}", result);
}
```
### Disabling the Cache
Explicitly disable caching for memory-constrained environments:
```rust
let db = Database::from("threats.mxy")
    .no_cache() // Disable caching
    .open()?;
```
**Default behavior:** If you don’t specify any cache configuration, a reasonable default cache is enabled.
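That is, opening a database with neither `cache_capacity` nor `no_cache` still gets the default cache:

```rust
// No explicit cache configuration: the default cache is enabled.
let db = Database::from("threats.mxy").open()?;
```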
## Cache Management

### Inspecting Cache Size
Check how many entries are currently cached:
```rust
println!("Cache entries: {}", db.cache_size());
```
### Clearing the Cache
Clear all cached entries:
```rust
db.clear_cache();
println!("Cache cleared: {}", db.cache_size()); // 0
```
This is useful for:
- Memory management in long-running processes
- Testing with fresh cache state
- Resetting after configuration changes (sketched below)
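For that last case, a long-running service might pair a configuration reload with a cache reset. A minimal sketch, where `reload_config` is a hypothetical application hook rather than a Matchy API:

```rust
// `reload_config` is hypothetical application code, not part of Matchy.
fn on_config_reload(db: &matchy::Database) {
    reload_config(); // apply the new application configuration
    db.clear_cache(); // drop results computed under the old configuration
}
```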
## How It Works
The cache is an LRU (Least Recently Used) cache:
- **On first query:** the result is computed and stored in the cache
- **On repeated query:** the result is returned from the cache (fast!)
- **When the cache is full:** the least recently used entry is evicted
The cache is thread-safe using interior mutability, so multiple queries can safely share the same Database instance.
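A minimal sketch of this lookup-through-cache pattern, built on the `lru` crate with a `Mutex` for interior mutability. The names and types here are illustrative, not Matchy's internals:

```rust
use std::num::NonZeroUsize;
use std::sync::Mutex;

use lru::LruCache;

/// Illustrative lookup-through-cache wrapper; not Matchy's actual implementation.
struct CachedLookup {
    // Mutex gives interior mutability, so `lookup` can take &self
    // and the wrapper can be shared across threads.
    cache: Mutex<LruCache<String, Option<String>>>,
}

impl CachedLookup {
    fn new(capacity: usize) -> Self {
        let cap = NonZeroUsize::new(capacity).expect("capacity must be non-zero");
        Self { cache: Mutex::new(LruCache::new(cap)) }
    }

    fn lookup(&self, query: &str) -> Option<String> {
        let mut cache = self.cache.lock().unwrap();
        // Repeated query: served from the cache and marked most recently used.
        if let Some(result) = cache.get(query) {
            return result.clone();
        }
        // First query: compute, then store; a full cache evicts the LRU entry.
        let result = run_query(query);
        cache.put(query.to_string(), result.clone());
        result
    }
}

// Placeholder for the real database lookup.
fn run_query(query: &str) -> Option<String> {
    (query == "evil.com").then(|| "threat".to_string())
}
```

The `Mutex` is what lets `lookup` take `&self`, so a shared instance behaves like the `Database` described above.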
## Cache Capacity Guidelines
Choose cache capacity based on your workload:
| Workload | Recommended Capacity | Reasoning |
|---|---|---|
| Web API (< 1000 req/s) | 1,000 - 10,000 | Covers hot patterns |
| Firewall (medium traffic) | 10,000 - 50,000 | Covers recent IPs |
| High-traffic service | 50,000 - 100,000 | Maximize hit rate |
| Memory-constrained | Disable cache | Save memory |
**Memory usage:** Each cache entry uses ~100-200 bytes, so:
- 10,000 entries ≈ 1-2 MB
- 100,000 entries ≈ 10-20 MB
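To turn a memory budget into a capacity, divide by the per-entry estimate. A rough heuristic using the midpoint of the ~100-200 byte range above:

```rust
// Rough sizing heuristic; 150 bytes is the midpoint of the ~100-200 byte
// per-entry estimate above.
const APPROX_BYTES_PER_ENTRY: usize = 150;

fn capacity_for_budget(budget_bytes: usize) -> usize {
    budget_bytes / APPROX_BYTES_PER_ENTRY
}

// e.g. capacity_for_budget(4 * 1024 * 1024) → ~28,000 entries for a 4 MiB budget
```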
## When to Use Caching

### ✅ Use Caching For:
- Web APIs with repeated queries
- Firewalls checking the same IPs
- Real-time monitoring with hot patterns
- Long-running services with predictable queries
### ❌ Skip Caching For:
- Batch processing (all queries unique)
- One-time scans (no repeated queries)
- Memory-constrained environments
- Testing where you need fresh results
## Example: Web API with Caching
```rust
use std::sync::Arc;

use matchy::Database;

// Create a shared database with caching
let db = Arc::new(
    Database::from("threats.mxy")
        .cache_capacity(50_000) // High capacity for a web API
        .open()?,
);

// Share across request handlers
let db_clone = Arc::clone(&db);
tokio::spawn(async move {
    // `receive_request` and `send_response` stand in for your framework's I/O
    loop {
        let query = receive_request().await;
        // Cache hit on repeated queries! (lookup errors ignored for brevity)
        if let Ok(Some(result)) = db_clone.lookup(&query) {
            send_response(result).await;
        }
    }
});
```
## Benchmarking Cache Performance
Use the provided benchmark to measure cache performance on your workload:
```sh
# Run the cache demo
cargo run --release --example cache_demo

# Or run the comprehensive benchmark
cargo bench --bench cache_bench
```
See `examples/cache_demo.rs` for a complete working example.
## Comparison with No Cache
Here’s a typical performance comparison:
```rust
// Without cache (baseline)
let db_uncached = Database::from("db.mxy").no_cache().open()?;
// 10,000 queries: 2.5s → 4,000 QPS

// With cache (80% hit rate)
let db_cached = Database::from("db.mxy").cache_capacity(10_000).open()?;
// 10,000 queries: 0.8s → 12,500 QPS (3x faster!)
```
## Summary

- **Simple configuration:** Just add `.cache_capacity(size)` to the builder
- **Transparent operation:** No code changes after configuration
- **Significant speedup:** 2-10x for high hit rates
- **Zero overhead:** No cost when disabled
- **Thread-safe:** Safe to share across threads
Query result caching is one of the easiest ways to improve Matchy performance for real-world workloads.