C API Overview

Matchy provides a stable C API for integration with C, C++, and other languages that support C FFI.

See First Database with C for a tutorial.

Design Principles

The C API follows these principles:

  1. Opaque handles: All Rust types are wrapped in opaque pointers
  2. Integer error codes: Functions return int status codes
  3. No panics: All panics are caught at the FFI boundary
  4. Memory safety: Clear ownership semantics for all pointers
  5. ABI stability: Uses #[repr(C)] and extern "C"
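
The first two principles can be illustrated without Matchy itself. The following is a minimal sketch of the opaque-handle and integer-status pattern in plain C; the `counter` type, its functions, and the `COUNTER_*` codes are hypothetical names invented for this illustration and are not part of the Matchy API.

```c
#include <stdlib.h>

/* --- What a public header would expose (hypothetical names) --- */
typedef struct counter counter;      /* incomplete type: layout is hidden */

#define COUNTER_OK            0
#define COUNTER_INVALID_PARAM 1
#define COUNTER_NO_MEMORY     2

int  counter_new(counter **out);
int  counter_increment(counter *c, long *value_out);
void counter_free(counter *c);

/* --- What the implementation keeps private --- */
struct counter {
    long value;
};

/* Allocate a handle and hand it back through an out-parameter. */
int counter_new(counter **out) {
    if (out == NULL) return COUNTER_INVALID_PARAM;
    *out = calloc(1, sizeof **out);
    return *out ? COUNTER_OK : COUNTER_NO_MEMORY;
}

/* Every fallible operation validates its pointers and returns a status. */
int counter_increment(counter *c, long *value_out) {
    if (c == NULL || value_out == NULL) return COUNTER_INVALID_PARAM;
    c->value += 1;
    *value_out = c->value;
    return COUNTER_OK;
}

/* Release the handle; passing NULL is a harmless no-op. */
void counter_free(counter *c) {
    free(c);
}
```

Because callers only ever see the forward declaration, the library is free to change the struct layout without breaking compiled client code, which is the point of wrapping Rust types in opaque pointers.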

Header File

#include <matchy.h>

The header is auto-generated by cbindgen during release builds:

cargo build --release
# Generates include/matchy.h

Core Types

Opaque Handles

typedef struct matchy_database matchy_database;
typedef struct matchy_builder matchy_builder;
typedef struct matchy_result matchy_result;

These are opaque types: hold only pointers to them and never dereference those pointers.

Error Codes

typedef int matchy_error_t;

#define MATCHY_OK                    0
#define MATCHY_ERROR_INVALID_PARAM   1
#define MATCHY_ERROR_FILE_NOT_FOUND  2
#define MATCHY_ERROR_INVALID_FORMAT  3
#define MATCHY_ERROR_CORRUPT_DATA    4
#define MATCHY_ERROR_PATTERN_ERROR   5
#define MATCHY_ERROR_BUILD_FAILED    6
#define MATCHY_ERROR_UNKNOWN         99
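
Numeric codes are hard to read in logs, so applications often map them to text. Below is a minimal sketch of such a mapping. The helper `describe_matchy_error()` is a hypothetical name and is not provided by Matchy; the code values are copied from the list above so that the sketch compiles on its own.

```c
#include <stddef.h>

/* Values copied from the list above so this sketch is self-contained.
   When <matchy.h> is included first, the guard leaves its definitions alone. */
#ifndef MATCHY_OK
#define MATCHY_OK                    0
#define MATCHY_ERROR_INVALID_PARAM   1
#define MATCHY_ERROR_FILE_NOT_FOUND  2
#define MATCHY_ERROR_INVALID_FORMAT  3
#define MATCHY_ERROR_CORRUPT_DATA    4
#define MATCHY_ERROR_PATTERN_ERROR   5
#define MATCHY_ERROR_BUILD_FAILED    6
#define MATCHY_ERROR_UNKNOWN         99
#endif

/* Hypothetical helper (not part of the Matchy API): returns a static,
   human-readable description for a status code. Never returns NULL. */
static const char *describe_matchy_error(int code) {
    switch (code) {
    case MATCHY_OK:                   return "success";
    case MATCHY_ERROR_INVALID_PARAM:  return "invalid parameter";
    case MATCHY_ERROR_FILE_NOT_FOUND: return "file not found";
    case MATCHY_ERROR_INVALID_FORMAT: return "invalid database format";
    case MATCHY_ERROR_CORRUPT_DATA:   return "corrupt data";
    case MATCHY_ERROR_PATTERN_ERROR:  return "pattern error";
    case MATCHY_ERROR_BUILD_FAILED:   return "build failed";
    case MATCHY_ERROR_UNKNOWN:        return "unknown error";
    default:                          return "unrecognized error code";
    }
}
```

A call site could then log `describe_matchy_error(err)` next to the raw number, keeping the number available for anyone cross-referencing the header.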

Result Types

typedef enum {
    MATCHY_RESULT_IP = 1,
    MATCHY_RESULT_PATTERN = 2,
    MATCHY_RESULT_EXACT_STRING = 3,
} matchy_result_type;

Function Groups

The C API is organized into these groups:

Database Operations

  • matchy_open() - Open database (default settings)
  • matchy_open_with_options() - Open database with custom options
  • matchy_init_open_options() - Initialize option structure
  • matchy_open_trusted() - Open database (skip validation)
  • matchy_close() - Close database
  • matchy_query() - Query database (returns by value)
  • matchy_query_into() - Query database (writes to pointer, FFI-friendly)
  • matchy_get_stats() - Get database statistics
  • matchy_clear_cache() - Clear query cache

Builder Operations

  • matchy_builder_new() - Create builder
  • matchy_builder_add_ip() - Add IP entry
  • matchy_builder_add_pattern() - Add pattern entry
  • matchy_builder_add_exact() - Add exact string entry
  • matchy_builder_build() - Build database
  • matchy_builder_free() - Free builder

Result Operations

  • matchy_result_type() - Get result type
  • matchy_result_ip_prefix_len() - Get IP prefix length
  • matchy_result_pattern_count() - Get pattern count
  • matchy_result_free() - Free result

Extractor Operations

  • matchy_extractor_create() - Create extractor with flags
  • matchy_extractor_extract_chunk() - Extract patterns from data
  • matchy_extractor_free() - Free extractor
  • matchy_matches_free() - Free match results
  • matchy_item_type_name() - Get type name string

Error Handling Pattern

Functions that can fail return a matchy_error_t status code. Check it before using any output parameter:

matchy_database *db = NULL;
matchy_error_t err = matchy_open("database.mxy", &db);

if (err != MATCHY_OK) {
    fprintf(stderr, "Error opening database: %d\n", err);
    return 1;
}

// Use db...

matchy_close(db);

Memory Management

Ownership Rules

  1. Caller owns input strings - You must keep them valid during the call
  2. Matchy allocates output handles - Release each one with its matching function (matchy_close() for databases, the appropriate _free() function otherwise)
  3. Results must be freed - Always call matchy_result_free()

Example

// You own this string
const char *path = "database.mxy";

// Matchy allocates this handle on a successful open; you release it with matchy_close()
matchy_database *db = NULL;
if (matchy_open(path, &db) == MATCHY_OK) {
    // Use db...
    
    // Matchy allocates this result; you release it with matchy_result_free()
    matchy_result *result = NULL;
    if (matchy_lookup(db, "192.0.2.1", &result) == MATCHY_OK) {
        if (result != NULL) {
            // Use result...
            
            // You must free the result
            matchy_result_free(result);
        }
    }
    
    // You must close the database
    matchy_close(db);
}

Thread Safety

  • Database handles (matchy_database) are thread-safe for reading
  • Builder handles (matchy_builder) are NOT thread-safe
  • Result handles (matchy_result) should not be shared between threads

Multiple threads can safely call matchy_lookup() on the same database:

// Thread 1
matchy_result *r1 = NULL;
matchy_lookup(db, "query1", &r1);

// Thread 2 (safe!)
matchy_result *r2 = NULL;
matchy_lookup(db, "query2", &r2);

Opening with Cache Options

Basic Opening (Default Cache)

// Opens with default cache (10,000 entries)
matchy_t *db = matchy_open("database.mxy");
if (db == NULL) {
    fprintf(stderr, "Failed to open database\n");
    return 1;
}

Custom Cache Configuration

// Initialize options structure
matchy_open_options_t opts;
matchy_init_open_options(&opts);

// Configure cache
opts.cache_capacity = 100000;  // Large cache for high repetition

matchy_t *db = matchy_open_with_options("threats.mxy", &opts);
if (db == NULL) {
    fprintf(stderr, "Failed to open database\n");
    return 1;
}

No Cache

matchy_open_options_t opts;
matchy_init_open_options(&opts);
opts.cache_capacity = 0;  // Disable cache

matchy_t *db = matchy_open_with_options("database.mxy", &opts);

Get Statistics

matchy_stats_t stats;
matchy_get_stats(db, &stats);

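// uint64_t is not always unsigned long long, so cast to match %llu
printf("Total queries: %llu\n", (unsigned long long)stats.total_queries);
printf("Queries with match: %llu\n", (unsigned long long)stats.queries_with_match);
printf("IP queries: %llu\n", (unsigned long long)stats.ip_queries);
printf("String queries: %llu\n", (unsigned long long)stats.string_queries);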

// Calculate rates
double cache_hit_rate = 0.0;
if (stats.cache_hits + stats.cache_misses > 0) {
    cache_hit_rate = (double)stats.cache_hits / 
                     (stats.cache_hits + stats.cache_misses);
}

double match_rate = 0.0;
if (stats.total_queries > 0) {
    match_rate = (double)stats.queries_with_match / stats.total_queries;
}

printf("Cache hit rate: %.1f%%\n", cache_hit_rate * 100.0);
printf("Match rate: %.1f%%\n", match_rate * 100.0);

matchy_stats_t Structure

typedef struct {
    uint64_t total_queries;
    uint64_t queries_with_match;
    uint64_t queries_without_match;
    uint64_t cache_hits;
    uint64_t cache_misses;
    uint64_t ip_queries;
    uint64_t string_queries;
} matchy_stats_t;
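
For monitoring, the hit rate over a recent interval is often more useful than the lifetime average. The sketch below computes it from two snapshots, assuming the counters are cumulative (an assumption, not something this page states). The `cache_counters` type and `interval_hit_rate()` are hypothetical names that mirror only the two cache fields of matchy_stats_t, so the code compiles without the real header.

```c
#include <stdint.h>

/* Local mirror of the cache-related fields of matchy_stats_t.
   Hypothetical type name, chosen so it cannot clash with <matchy.h>. */
typedef struct {
    uint64_t cache_hits;
    uint64_t cache_misses;
} cache_counters;

/* Hit rate in [0, 1] for the interval between two snapshots.
   Assumes cumulative counters. If either counter went backwards
   (for example after a reset), the later snapshot is used on its own.
   Returns 0.0 when no cache lookups happened in the interval. */
static double interval_hit_rate(cache_counters before, cache_counters after) {
    uint64_t hits = after.cache_hits;
    uint64_t misses = after.cache_misses;

    if (after.cache_hits >= before.cache_hits &&
        after.cache_misses >= before.cache_misses) {
        hits -= before.cache_hits;
        misses -= before.cache_misses;
    }

    uint64_t total = hits + misses;
    return total == 0 ? 0.0 : (double)hits / (double)total;
}
```

In an application, each snapshot would be filled from the cache_hits and cache_misses fields reported by matchy_get_stats() at the start and end of the interval.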

Clear Cache

// Do some queries (fills cache)
matchy_result_t result = matchy_query(db, "example.com");
matchy_free_result(&result);

// Clear cache to force fresh lookups
matchy_clear_cache(db);

Complete Example

#include <matchy.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    matchy_error_t err;
    
    // Build database
    matchy_builder *builder = matchy_builder_new();
    if (!builder) {
        fprintf(stderr, "Failed to create builder\n");
        return 1;
    }
    
    err = matchy_builder_add_ip(builder, "192.0.2.1/32", NULL);
    if (err != MATCHY_OK) {
        fprintf(stderr, "Failed to add IP: %d\n", err);
        matchy_builder_free(builder);
        return 1;
    }
    
    err = matchy_builder_add_pattern(builder, "*.example.com", NULL);
    if (err != MATCHY_OK) {
        fprintf(stderr, "Failed to add pattern: %d\n", err);
        matchy_builder_free(builder);
        return 1;
    }
    
    // Build to file
    err = matchy_builder_build(builder, "database.mxy");
    matchy_builder_free(builder);
    
    if (err != MATCHY_OK) {
        fprintf(stderr, "Failed to build: %d\n", err);
        return 1;
    }
    
    // Open database
    matchy_database *db = NULL;
    err = matchy_open("database.mxy", &db);
    if (err != MATCHY_OK) {
        fprintf(stderr, "Failed to open: %d\n", err);
        return 1;
    }
    
    // Query
    const char *queries[] = {
        "192.0.2.1",
        "test.example.com",
        "notfound.com",
    };
    
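    // Derive the count from the array so it stays correct if queries change
    for (size_t i = 0; i < sizeof queries / sizeof queries[0]; i++) {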
        matchy_result *result = NULL;
        err = matchy_lookup(db, queries[i], &result);
        
        if (err != MATCHY_OK) {
            fprintf(stderr, "Lookup error for '%s': %d\n", queries[i], err);
            continue;
        }
        
        if (result == NULL) {
            printf("%s: Not found\n", queries[i]);
        } else {
            int type = matchy_result_type(result);  // enum value, printed as an integer
            printf("%s: Found (type %d)\n", queries[i], type);
            matchy_result_free(result);
        }
    }
    
    matchy_close(db);
    return 0;
}

Compilation

GCC/Clang

gcc -o myapp myapp.c \
    -I./include \
    -L./target/release \
    -lmatchy

Setting Library Path

# Linux
export LD_LIBRARY_PATH=./target/release:$LD_LIBRARY_PATH

# macOS
export DYLD_LIBRARY_PATH=./target/release:$DYLD_LIBRARY_PATH

Static Linking

# For static linking on Linux, you may need system libraries:
gcc -o myapp myapp.c \
    -I./include \
    ./target/release/libmatchy.a \
    -lpthread -ldl -lm

# On macOS, static linking usually just needs:
gcc -o myapp myapp.c \
    -I./include \
    ./target/release/libmatchy.a

Best Practices

1. Always Check Return Values

if (matchy_open(path, &db) != MATCHY_OK) {
    // Handle error
}

2. Initialize Pointers to NULL

matchy_database *db = NULL;  // Good
matchy_open(path, &db);

3. Free Resources in Reverse Order

matchy_result *result = NULL;
matchy_database *db = NULL;

matchy_open("db.mxy", &db);
matchy_lookup(db, "query", &result);

// Free in reverse order
if (result) matchy_result_free(result);  // result stays NULL when nothing matched
matchy_close(db);

4. Use Guards for Cleanup

matchy_database *db = NULL;
matchy_error_t err = matchy_open(path, &db);
if (err != MATCHY_OK) goto cleanup;

// ... use db ...

cleanup:
    if (db) matchy_close(db);
    return err;

Debugging

Valgrind

Check for memory leaks:

valgrind --leak-check=full --show-leak-kinds=all ./myapp

AddressSanitizer

Compile with sanitizer:

gcc -fsanitize=address -g -o myapp myapp.c \
    -I./include \
    -L./target/release \
    -lmatchy
./myapp

Extractor API

The extractor API provides high-performance pattern extraction from text data.

Extraction Flags

Use these flags with matchy_extractor_create() to specify what to extract:

MATCHY_EXTRACT_DOMAINS   // Domain names (e.g., "example.com")
MATCHY_EXTRACT_EMAILS    // Email addresses
MATCHY_EXTRACT_IPV4      // IPv4 addresses
MATCHY_EXTRACT_IPV6      // IPv6 addresses
MATCHY_EXTRACT_HASHES    // File hashes (MD5, SHA1, SHA256, SHA384, SHA512)
MATCHY_EXTRACT_BITCOIN   // Bitcoin addresses
MATCHY_EXTRACT_ETHEREUM  // Ethereum addresses
MATCHY_EXTRACT_MONERO    // Monero addresses
MATCHY_EXTRACT_ALL       // All of the above

Item Types

Match results include an item type:

MATCHY_ITEM_TYPE_DOMAIN    // Domain name
MATCHY_ITEM_TYPE_EMAIL     // Email address
MATCHY_ITEM_TYPE_IPV4      // IPv4 address
MATCHY_ITEM_TYPE_IPV6      // IPv6 address
MATCHY_ITEM_TYPE_MD5       // MD5 hash
MATCHY_ITEM_TYPE_SHA1      // SHA1 hash
MATCHY_ITEM_TYPE_SHA256    // SHA256 hash
MATCHY_ITEM_TYPE_SHA384    // SHA384 hash
MATCHY_ITEM_TYPE_SHA512    // SHA512 hash
MATCHY_ITEM_TYPE_BITCOIN   // Bitcoin address
MATCHY_ITEM_TYPE_ETHEREUM  // Ethereum address
MATCHY_ITEM_TYPE_MONERO    // Monero address

Functions

  • matchy_extractor_create(flags) - Create extractor with specified flags
  • matchy_extractor_extract_chunk(extractor, data, len, matches) - Extract patterns
  • matchy_extractor_free(extractor) - Free extractor
  • matchy_matches_free(matches) - Free match results
  • matchy_item_type_name(type) - Get string name for item type

Example

#include <matchy.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    // Create extractor for domains and IPs only
    matchy_extractor_t *ext = matchy_extractor_create(
        MATCHY_EXTRACT_DOMAINS | MATCHY_EXTRACT_IPV4 | MATCHY_EXTRACT_IPV6
    );
    if (!ext) {
        fprintf(stderr, "Failed to create extractor\n");
        return 1;
    }
    
    // Extract from text
    const char *text = "Check evil.com and 192.168.1.1";
    matchy_matches_t matches;
    
    int err = matchy_extractor_extract_chunk(
        ext,
        (const uint8_t *)text,
        strlen(text),
        &matches
    );
    
    if (err != MATCHY_SUCCESS) {
        fprintf(stderr, "Extraction failed: %d\n", err);
        matchy_extractor_free(ext);
        return 1;
    }
    
    // Process results
    for (size_t i = 0; i < matches.count; i++) {
        printf("%s: %s (bytes %zu-%zu)\n",
               matchy_item_type_name(matches.items[i].item_type),
               matches.items[i].value,
               matches.items[i].start,
               matches.items[i].end);
    }
    
    // Cleanup
    matchy_matches_free(&matches);
    matchy_extractor_free(ext);
    return 0;
}

Output:

Domain: evil.com (bytes 6-14)
IPv4: 192.168.1.1 (bytes 19-30)

Match Structure

typedef struct matchy_match_t {
    uint8_t item_type;      // MATCHY_ITEM_TYPE_* constant
    const char *value;      // Extracted value (null-terminated)
    size_t start;           // Start byte offset in input
    size_t end;             // End byte offset (exclusive)
} matchy_match_t;

typedef struct matchy_matches_t {
    const matchy_match_t *items;  // Array of matches
    size_t count;                  // Number of matches
} matchy_matches_t;
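
Because end is exclusive, the length of a match is end - start, and the same offsets can be used to slice the original input (for example, to log surrounding context). Here is a minimal sketch of a bounds-checked slice. `copy_span()` is a hypothetical helper written for this illustration, not a Matchy function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy input[start, end) into out as a
   NUL-terminated string. Returns 0 on success, or -1 if the range is
   invalid for the input or out is too small to hold it. */
static int copy_span(const char *input, size_t input_len,
                     size_t start, size_t end,
                     char *out, size_t out_size) {
    if (input == NULL || out == NULL) return -1;
    if (start > end || end > input_len) return -1;

    size_t n = end - start;          /* exclusive end: length is end - start */
    if (n >= out_size) return -1;    /* need room for n bytes plus the NUL */

    memcpy(out, input + start, n);
    out[n] = '\0';
    return 0;
}
```

With the input from the example above, offsets 6 and 14 yield "evil.com", and offsets 19 and 30 yield "192.168.1.1", matching the value fields the extractor reports.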

Thread Safety

  • Extractor handles (matchy_extractor_t*) are thread-safe for concurrent extraction
  • Multiple threads can safely call matchy_extractor_extract_chunk() on the same extractor
  • Each thread should have its own matchy_matches_t for results

See Also