# Query Result Caching
Matchy includes a built-in LRU (Least Recently Used) cache for query results, providing 2-10x performance improvements for workloads with repeated queries.
## Overview
The cache stores query results in memory, eliminating the need to re-execute database lookups for previously seen queries. This is particularly valuable for:
- Web APIs serving repeated requests
- Firewalls checking the same IPs frequently
- Real-time threat detection with hot patterns
- High-traffic services with predictable query patterns
## Performance
Cache performance depends on the hit rate (percentage of queries found in cache):
| Hit Rate | Speedup vs Uncached | Use Case |
|---|---|---|
| 0% | 1.0x (no benefit) | Batch processing, unique queries |
| 50% | 1.5-2x | Mixed workload |
| 80% | 3-5x | Web API, typical firewall |
| 95% | 5-8x | High-traffic service |
| 99% | 8-10x | Repeated pattern checking |
**Zero overhead when disabled:** The cache uses compile-time optimization, so disabling it has no performance cost.
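These ranges follow from a simple expected-cost model: if a cache hit costs `t_hit`, a miss costs `t_miss`, and the hit rate is `h`, the average query time is `h * t_hit + (1 - h) * t_miss`. The sketch below illustrates this; the nanosecond figures are illustrative assumptions (a hit roughly 10x cheaper than a full lookup), not measured Matchy numbers.

```rust
// Back-of-envelope speedup model. The timings are illustrative
// assumptions, not Matchy benchmarks.
fn expected_speedup(hit_rate: f64, t_hit_ns: f64, t_miss_ns: f64) -> f64 {
    t_miss_ns / (hit_rate * t_hit_ns + (1.0 - hit_rate) * t_miss_ns)
}

fn main() {
    for h in [0.0, 0.5, 0.8, 0.95, 0.99] {
        // Assume ~25 ns per cache hit vs ~250 ns per uncached lookup
        println!("hit rate {:>3.0}% → ~{:.1}x", h * 100.0, expected_speedup(h, 25.0, 250.0));
    }
}
```

With those assumptions the model reproduces the table above: roughly 1.8x at 50%, 3.6x at 80%, 6.9x at 95%, and 9.2x at 99%.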
## Configuration

### Enabling the Cache
Use the builder API to configure cache capacity:
```rust
use matchy::Database;

// Enable the cache with a 10,000-entry capacity
let db = Database::from("threats.mxy")
    .cache_capacity(10_000)
    .open()?;

// Use the database normally - caching is transparent
if let Some(result) = db.lookup("evil.com")? {
    println!("Match: {:?}", result);
}
```
### Disabling the Cache
Explicitly disable caching for memory-constrained environments:
```rust
let db = Database::from("threats.mxy")
    .no_cache() // Disable caching
    .open()?;
```
**Default behavior:** If you don’t specify any cache configuration, a reasonable default cache is enabled.
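That is, opening a database with neither `cache_capacity` nor `no_cache` still gets the default cache:

```rust
// No explicit cache configuration: the default cache is enabled.
let db = Database::from("threats.mxy").open()?;
```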
## Cache Management

### Inspecting Cache Size
Check how many entries are currently cached:
```rust
println!("Cache entries: {}", db.cache_size());
```
### Clearing the Cache
Clear all cached entries:
```rust
db.clear_cache();
println!("Cache cleared: {}", db.cache_size()); // 0
```
This is useful for:
- Memory management in long-running processes
- Testing with fresh cache state
- Resetting after configuration changes (sketched below)
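For that last case, a long-running service might pair a configuration reload with a cache reset. A minimal sketch, where `reload_config` is a hypothetical application hook rather than a Matchy API:

```rust
// `reload_config` is hypothetical application code, not part of Matchy.
fn on_config_reload(db: &matchy::Database) {
    reload_config(); // apply the new application configuration
    db.clear_cache(); // drop results computed under the old configuration
}
```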
## How It Works
The cache is an LRU (Least Recently Used) cache:
- **On first query:** the result is computed and stored in the cache
- **On repeated query:** the result is returned from the cache (fast!)
- **When the cache is full:** the least recently used entry is evicted
The cache is thread-safe using interior mutability, so multiple queries can safely share the same Database instance.
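A minimal sketch of this lookup-through-cache pattern, built on the `lru` crate with a `Mutex` for interior mutability. The names and types here are illustrative, not Matchy's internals:

```rust
use std::num::NonZeroUsize;
use std::sync::Mutex;

use lru::LruCache;

/// Illustrative lookup-through-cache wrapper; not Matchy's actual implementation.
struct CachedLookup {
    // Mutex gives interior mutability, so `lookup` can take &self
    // and the wrapper can be shared across threads.
    cache: Mutex<LruCache<String, Option<String>>>,
}

impl CachedLookup {
    fn new(capacity: usize) -> Self {
        let cap = NonZeroUsize::new(capacity).expect("capacity must be non-zero");
        Self { cache: Mutex::new(LruCache::new(cap)) }
    }

    fn lookup(&self, query: &str) -> Option<String> {
        let mut cache = self.cache.lock().unwrap();
        // Repeated query: served from the cache and marked most recently used.
        if let Some(result) = cache.get(query) {
            return result.clone();
        }
        // First query: compute, then store; a full cache evicts the LRU entry.
        let result = run_query(query);
        cache.put(query.to_string(), result.clone());
        result
    }
}

// Placeholder for the real database lookup.
fn run_query(query: &str) -> Option<String> {
    (query == "evil.com").then(|| "threat".to_string())
}
```

The `Mutex` is what lets `lookup` take `&self`, so a shared instance behaves like the `Database` described above.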
## Cache Capacity Guidelines
Choose cache capacity based on your workload:
| Workload | Recommended Capacity | Reasoning |
|---|---|---|
| Web API (< 1000 req/s) | 1,000 - 10,000 | Covers hot patterns |
| Firewall (medium traffic) | 10,000 - 50,000 | Covers recent IPs |
| High-traffic service | 50,000 - 100,000 | Maximize hit rate |
| Memory-constrained | Disable cache | Save memory |
**Memory usage:** Each cache entry uses ~100-200 bytes, so:
- 10,000 entries ≈ 1-2 MB
- 100,000 entries ≈ 10-20 MB
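To turn a memory budget into a capacity, divide by the per-entry estimate. A rough heuristic using the midpoint of the ~100-200 byte range above:

```rust
// Rough sizing heuristic; 150 bytes is the midpoint of the ~100-200 byte
// per-entry estimate above.
const APPROX_BYTES_PER_ENTRY: usize = 150;

fn capacity_for_budget(budget_bytes: usize) -> usize {
    budget_bytes / APPROX_BYTES_PER_ENTRY
}

// e.g. capacity_for_budget(4 * 1024 * 1024) → ~28,000 entries for a 4 MiB budget
```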
## When to Use Caching

### ✅ Use Caching For:
- Web APIs with repeated queries
- Firewalls checking the same IPs
- Real-time monitoring with hot patterns
- Long-running services with predictable queries
### ❌ Skip Caching For:
- Batch processing (all queries unique)
- One-time scans (no repeated queries)
- Memory-constrained environments
- Testing where you need fresh results
## Example: Web API with Caching
```rust
use std::sync::Arc;

use matchy::Database;

// Create a shared database with caching
let db = Arc::new(
    Database::from("threats.mxy")
        .cache_capacity(50_000) // High capacity for a web API
        .open()?,
);

// Share across request handlers
let db_clone = Arc::clone(&db);
tokio::spawn(async move {
    // `receive_request` and `send_response` stand in for your framework's I/O
    loop {
        let query = receive_request().await;
        // Cache hit on repeated queries! (lookup errors ignored for brevity)
        if let Ok(Some(result)) = db_clone.lookup(&query) {
            send_response(result).await;
        }
    }
});
```
## Benchmarking Cache Performance
Use the provided benchmark to measure cache performance on your workload:
```sh
# Run the cache demo
cargo run --release --example cache_demo

# Or run the comprehensive benchmark
cargo bench --bench cache_bench
```
See `examples/cache_demo.rs` for a complete working example.
## Comparison with No Cache
Here’s a typical performance comparison:
```rust
// Without cache (baseline)
let db_uncached = Database::from("db.mxy").no_cache().open()?;
// 10,000 queries: 2.5s → 4,000 QPS

// With cache (80% hit rate)
let db_cached = Database::from("db.mxy").cache_capacity(10_000).open()?;
// 10,000 queries: 0.8s → 12,500 QPS (3x faster!)
```
## Summary

- **Simple configuration:** Just add `.cache_capacity(size)` to the builder
- **Transparent operation:** No code changes after configuration
- **Significant speedup:** 2-10x for high hit rates
- **Zero overhead:** No cost when disabled
- **Thread-safe:** Safe to share across threads
Query result caching is one of the easiest ways to improve Matchy performance for real-world workloads.