Jasonisnthappy is designed for high performance. This guide covers optimization techniques, benchmarks, and best practices.

Performance characteristics

Key metrics

Based on production benchmarks:

  • Bulk inserts: ~19,000 docs/sec for 1000-doc batches
  • Single writes: ~8ms per insert with full ACID guarantees
  • Reads: sub-millisecond queries on collections with 2500+ documents
  • Concurrent scaling: linear scaling with thread count, up to the core count

ACID compliance

All operations include:
  • Full ACID transaction support
  • MVCC snapshot isolation
  • Write-ahead logging (WAL)
  • Crash recovery
Performance numbers include durability guarantees (fsync). Jasonisnthappy doesn’t sacrifice safety for speed.

Write optimization

Batch operations

Use bulk inserts instead of individual inserts.
// Insert 1000 documents in one transaction
let docs: Vec<Value> = (0..1000)
    .map(|i| json!({"id": i, "name": format!("User {}", i)}))
    .collect();

let ids = users.insert_many(docs)?;  // ~50ms
// Throughput: ~19,000 docs/sec
Optimal batch size: 100-1000 documents per insert_many call balances throughput and memory usage.
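When the input is larger than one batch, splitting it into fixed-size chunks keeps every call inside the recommended range. The sketch below uses only the standard library; `split_into_batches` is a hypothetical helper name, and in real code each returned batch would be handed to one insert_many call.

```rust
/// Split `items` into consecutive batches of at most `batch_size` elements.
/// Hypothetical helper: each batch is meant for a single `insert_many` call.
fn split_into_batches<T: Clone>(items: &[T], batch_size: usize) -> Vec<Vec<T>> {
    assert!(batch_size > 0, "batch_size must be non-zero");
    items
        .chunks(batch_size) // the last chunk may be shorter than batch_size
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

With 2,500 documents and a batch size of 1,000, this yields two full batches and one batch of 500.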

Bulk write operations

Use bulk_write for mixed operations.
use jasonisnthappy::Database;
use serde_json::json;

let users = db.collection("users");

// Efficient: single transaction for multiple operations
let result = users.bulk_write()
    .insert(json!({"name": "Alice"}))
    .insert(json!({"name": "Bob"}))
    .update_one("name is \"Alice\"", json!({"age": 30}))
    .delete_many("status is \"inactive\"")
    .execute()?;

println!("Inserted: {}, Updated: {}, Deleted: {}",
    result.inserted_count,
    result.updated_count,
    result.deleted_count
);

Unordered writes

For maximum throughput, use unordered bulk writes.
// Unordered: continue on errors, faster
let result = users.bulk_write()
    .ordered(false)  // Don't stop on first error
    .insert(json!({"name": "User1"}))
    .insert(json!({"name": "User2"}))
    // ... 100s of operations ...
    .execute()?;

if !result.errors.is_empty() {
    eprintln!("Some operations failed: {:?}", result.errors);
}

Checkpoint management

Control when the WAL is checkpointed.
// Default: auto-checkpoint every 1000 WAL frames
let db = Database::open("my.db")?;

// For bulk imports: increase threshold
db.set_auto_checkpoint_threshold(10000);  // Checkpoint less often

// ... bulk insert operations ...

// Manual checkpoint after bulk import
db.checkpoint()?;

// Restore default
db.set_auto_checkpoint_threshold(1000);
Increasing the checkpoint threshold reduces write amplification during bulk imports but uses more WAL space.
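To get a feel for the extra WAL space, a rough estimate can help. The sketch below assumes each WAL frame holds about one 4 KiB page; that figure is inferred from the cache sizing guide (50K pages = ~200MB) and is not confirmed for the WAL format. `approx_wal_bytes` is a hypothetical helper that ignores per-frame headers.

```rust
/// Assumed bytes per WAL frame (one 4 KiB page). This is an assumption,
/// not a documented property of the WAL format.
const ASSUMED_FRAME_BYTES: u64 = 4096;

/// Rough WAL size just before an auto-checkpoint fires at `threshold_frames`.
fn approx_wal_bytes(threshold_frames: u64) -> u64 {
    threshold_frames * ASSUMED_FRAME_BYTES
}
```

Under that assumption, the default threshold of 1000 frames corresponds to roughly 4MB of WAL, and a threshold of 10000 to roughly 40MB.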

Read optimization

Use indexes

Indexes are crucial for query performance.
// Slow: scans entire collection
let user = users.find("email is \"alice@example.com\"")?;
// Time: O(n) - 50ms for 10,000 documents
See the Indexes guide for details.
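The difference between a full scan and an indexed lookup can be illustrated with the standard library alone. This is a conceptual sketch, not jasonisnthappy's index implementation: the `User` struct and all function names are hypothetical, and a `BTreeMap` stands in for whatever structure the engine actually uses.

```rust
use std::collections::BTreeMap;

struct User {
    id: u32,
    email: String,
}

/// Full scan: compares every document until one matches, O(n).
fn find_by_scan<'a>(users: &'a [User], email: &str) -> Option<&'a User> {
    users.iter().find(|u| u.email == email)
}

/// Build an ordered index from email to position in `users`.
/// If two users share an email, the later one wins in this sketch.
fn build_email_index(users: &[User]) -> BTreeMap<String, usize> {
    users
        .iter()
        .enumerate()
        .map(|(pos, u)| (u.email.clone(), pos))
        .collect()
}

/// Indexed lookup: O(log n) tree search, then one direct access.
fn find_by_index<'a>(
    users: &'a [User],
    index: &BTreeMap<String, usize>,
    email: &str,
) -> Option<&'a User> {
    index.get(email).map(|&pos| &users[pos])
}
```

Both functions return the same document; the index pays a cost once at build time (and on every write) so that each lookup avoids touching the whole collection.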

Query optimization

Use the most specific query method.
// Best: Direct ID lookup
let user = users.find_by_id("user_123")?;  // 0.01ms

// Good: find_one with index
let user = users.find_one("email is \"alice@example.com\"")?;  // 0.02ms

// Slower: find (returns all matches)
let results = users.find("email is \"alice@example.com\"")?;  // 0.03ms
let user = results.first();

// Slowest: find_all then filter in code
let all_users = users.find_all()?;  // 2ms for 10,000 docs
let user = all_users.iter().find(|u| u["email"] == "alice@example.com");

Projection

Fetch only the fields you need.
let results = users.query()
    .filter("age > 25")
    .project(&["name", "email"])  // Only load 2 fields
    .execute()?;
// Less data transferred from disk

Typed operations

Use typed methods to avoid JSON parsing overhead.
use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize)]
struct User {
    name: String,
    email: String,
}

// Faster: direct deserialization
let user: Option<User> = users.find_by_id_typed("user_123")?;

// Slower: JSON Value then application-level parsing
let user_value = users.find_by_id("user_123")?;
let user: User = serde_json::from_value(user_value)?;

Transaction optimization

Minimize transaction scope

Keep transactions short and focused.
let mut tx = db.begin()?;
let users = tx.collection("users");
users.insert(json!({"name": "Alice"}))?;
tx.commit()?;  // Fast commit

Use retryable transactions

Use run_transaction to retry automatically when a transaction hits a conflict.
use jasonisnthappy::TransactionConfig;

// Configure retries
db.set_transaction_config(TransactionConfig {
    max_retries: 5,
    retry_backoff_base_ms: 1,
    max_retry_backoff_ms: 100,
});

// Run with automatic retry
let result = db.run_transaction(|tx| {
    let users = tx.collection("users");
    let count = users.count()?;
    
    users.insert(json!({"count": count}))?;
    Ok(count)
})?;
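The two backoff settings bound how long a retry waits. This guide does not state the exact schedule, so the sketch below assumes the delay doubles on every attempt and is capped at the maximum; `backoff_ms` is a hypothetical helper, and the real library may use a different curve or add jitter.

```rust
/// Delay before retry number `attempt` (0-based), assuming the delay starts
/// at `base_ms`, doubles each attempt, and never exceeds `max_ms`.
fn backoff_ms(attempt: u32, base_ms: u64, max_ms: u64) -> u64 {
    // 2^attempt, saturating for very large attempt numbers
    let factor = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
    base_ms.saturating_mul(factor).min(max_ms)
}
```

With a base of 1ms, a cap of 100ms, and 5 retries, this schedule waits 1, 2, 4, 8, and 16ms, about 31ms in total before giving up.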

MVCC benefits

Reads never block writes, and writes never block reads.
use std::thread;
use std::time::Duration;

// Reader thread (never blocked)
let db_clone = db.clone();
let reader = thread::spawn(move || {
    let users = db_clone.collection("users");
    loop {
        let count = users.count().unwrap();
        println!("Count: {}", count);
        thread::sleep(Duration::from_millis(10));
    }
});

// Writer thread (runs concurrently)
let db_clone = db.clone();
let writer = thread::spawn(move || {
    let users = db_clone.collection("users");
    loop {
        users.insert(json!({"data": "test"})).unwrap();
        thread::sleep(Duration::from_millis(100));
    }
});

// Both threads run concurrently without blocking
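The snapshot behaviour behind this can be sketched with a toy copy-on-write store. This is a conceptual illustration only, not how jasonisnthappy stores versions: `SnapshotStore` is a hypothetical type, and the short lock it takes to swap a pointer is an artifact of the sketch.

```rust
use std::sync::{Arc, RwLock};

/// Toy copy-on-write store: every write publishes a new immutable version.
struct SnapshotStore<T> {
    current: RwLock<Arc<Vec<T>>>,
}

impl<T: Clone> SnapshotStore<T> {
    fn new() -> Self {
        SnapshotStore { current: RwLock::new(Arc::new(Vec::new())) }
    }

    /// Readers take a handle to the current version; later writes never change it.
    fn snapshot(&self) -> Arc<Vec<T>> {
        let guard = self.current.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Writers copy the latest version, modify the copy, and publish it.
    fn insert(&self, item: T) {
        let mut guard = self.current.write().unwrap();
        let mut next = (**guard).clone();
        next.push(item);
        *guard = Arc::new(next);
    }
}
```

A reader that took its snapshot before an insert keeps seeing the old contents, while a snapshot taken afterwards includes the new item.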

Memory optimization

Page cache

Configure the page cache size.
use jasonisnthappy::DatabaseOptions;

let db = Database::open_with_options("my.db", DatabaseOptions {
    cache_size: 50_000,  // 50K pages = ~200MB
    ..Default::default()
})?;
Cache sizing guide:
  • Small DBs (< 100MB): 10,000 pages (~40MB)
  • Medium DBs (< 1GB): 25,000 pages (~100MB)
  • Large DBs (> 1GB): 50,000+ pages (~200MB+)
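The figures in this guide imply pages of about 4 KiB (50,000 pages for roughly 200MB). Assuming that page size, which is inferred rather than documented here, the conversion between a memory budget and a `cache_size` value is simple arithmetic. Both helper names below are hypothetical.

```rust
/// Assumed page size in bytes, inferred from "50K pages = ~200MB".
const ASSUMED_PAGE_BYTES: u64 = 4096;

/// Approximate memory used by a cache of `pages` pages.
fn cache_bytes(pages: u64) -> u64 {
    pages * ASSUMED_PAGE_BYTES
}

/// Largest page count that fits in `budget_bytes` (rounded down).
fn pages_for_budget(budget_bytes: u64) -> u64 {
    budget_bytes / ASSUMED_PAGE_BYTES
}
```

For example, a 100MB budget works out to about 24,400 pages, which matches the 25,000-page suggestion for medium databases.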

Limit result sets

Use pagination to avoid loading large result sets.
// Good: Paginate large result sets
let page_size = 100;
let results = users.query()
    .filter("status is \"active\"")
    .limit(page_size)
    .execute()?;

// Bad: Load all results into memory
let all_results = users.find("status is \"active\"")?;  // Could be millions

Garbage collection

Reclaim space from old MVCC versions.
// Periodically clean up old versions
let stats = db.garbage_collect()?;

println!("Versions removed: {}", stats.versions_removed);
println!("Pages freed: {}", stats.pages_freed);
println!("Bytes freed: {}", stats.bytes_freed);
Garbage collection is safe to run while the database is in use. It only removes versions no longer visible to any transaction.
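The visibility rule can be made concrete with a small sketch. It follows the general MVCC convention and is an assumption about the behaviour, not a description of jasonisnthappy's internals: for one document, the newest version committed at or before the oldest active snapshot must stay, and everything older than it can go. `removable_versions` is a hypothetical helper.

```rust
/// Commit ids of the versions of one document that can be dropped.
/// `versions` must be sorted ascending by commit id; `oldest_active` is the
/// snapshot id of the oldest transaction that is still running.
fn removable_versions(versions: &[u64], oldest_active: u64) -> Vec<u64> {
    // Versions committed at or before the oldest active snapshot.
    let visible_count = versions
        .iter()
        .take_while(|&&v| v <= oldest_active)
        .count();
    // The newest of those is still visible to that snapshot, so keep it;
    // all earlier ones are shadowed for every live transaction.
    versions[..visible_count.saturating_sub(1)].to_vec()
}
```

With versions committed at 1, 3, 5, and 8 and an oldest active snapshot of 6, versions 1 and 3 are removable, while 5 and 8 must stay.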

Disk I/O optimization

SSD vs HDD

Jasonisnthappy performs best on SSDs.
Storage  | Random reads | Sequential writes | Recommended
SSD      | Excellent    | Excellent         | ✅ Yes
NVMe SSD | Exceptional  | Exceptional       | ✅ Best
HDD      | Poor         | Good              | ⚠️ Limited

File placement

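Placing the database and the WAL on different disks is an advanced technique for spreading I/O. The current implementation keeps the WAL alongside the database file, so the practical step today is to put the database on the fastest available disk.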
// Database on fast SSD
let db = Database::open("/mnt/ssd/app.db")?;

// WAL on separate disk (if supported)
// Note: Current implementation keeps WAL with database

Batch commits

Jasonisnthappy batches commits automatically for better throughput.
use std::thread;

// Multiple concurrent writers are batched
let threads: Vec<_> = (0..4)
    .map(|i| {
        let db = db.clone();
        thread::spawn(move || {
            let users = db.collection("users");
            users.insert(json!({"thread": i})).unwrap();
        })
    })
    .collect();

// Wait for every writer to finish
for handle in threads {
    handle.join().unwrap();
}

// Commits are batched together for efficiency

Benchmarking

Running benchmarks

# Run all benchmarks
cargo run --release --example bench_all

# Results show:
# - Write throughput
# - Read latency
# - Bulk insert performance
# - Concurrent operations

Custom benchmarks

use std::time::Instant;

let db = Database::open("bench.db")?;
let users = db.collection("users");

// Benchmark inserts
let start = Instant::now();
for i in 0..1000 {
    users.insert(json!({"id": i}))?;
}
let elapsed = start.elapsed();
println!("1000 inserts: {:.2?}", elapsed);
println!("Throughput: {:.0} ops/sec", 1000.0 / elapsed.as_secs_f64());

// Benchmark queries
let start = Instant::now();
for i in 0..1000 {
    users.find_one(&format!("id is {}", i))?;
}
let elapsed = start.elapsed();
println!("1000 queries: {:.2?}", elapsed);
println!("Avg latency: {:.2}ms", elapsed.as_millis() as f64 / 1000.0);
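The timing logic above repeats for every operation, so it can be factored into a small helper. This is a sketch using only the standard library; `time_ops` is a hypothetical name, and the closure would wrap whichever database call is being measured.

```rust
use std::time::{Duration, Instant};

/// Run `op` `n` times and return (total elapsed, ops/sec, average latency in ms).
fn time_ops<F: FnMut(u32)>(n: u32, mut op: F) -> (Duration, f64, f64) {
    let start = Instant::now();
    for i in 0..n {
        op(i);
    }
    let elapsed = start.elapsed();
    let secs = elapsed.as_secs_f64();
    // Guard against a zero-length measurement on very fast closures.
    let ops_per_sec = if secs > 0.0 { n as f64 / secs } else { f64::INFINITY };
    // Guard against division by zero when no operations were requested.
    let avg_ms = if n > 0 { secs * 1000.0 / n as f64 } else { 0.0 };
    (elapsed, ops_per_sec, avg_ms)
}
```

Using `as_secs_f64` keeps sub-millisecond precision in the average, which matters when individual queries take only a few microseconds.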

Monitoring

Metrics

Track database performance with built-in metrics.
use jasonisnthappy::Database;

let db = Database::open("my.db")?;

// Get metrics snapshot
let metrics = db.metrics();

println!("Transactions:");
println!("  Begun: {}", metrics.transactions_begun);
println!("  Committed: {}", metrics.transactions_committed);
println!("  Rolled back: {}", metrics.transactions_rolled_back);
println!("  Conflicts: {}", metrics.transaction_conflicts);

println!("\nCache:");
println!("  Hits: {}", metrics.cache_hits);
println!("  Misses: {}", metrics.cache_misses);
let hit_rate = metrics.cache_hits as f64 / 
               (metrics.cache_hits + metrics.cache_misses) as f64 * 100.0;
println!("  Hit rate: {:.1}%", hit_rate);

println!("\nWAL:");
println!("  Frames written: {}", metrics.wal_frames_written);
println!("  Checkpoints: {}", metrics.checkpoints);

Performance tuning

Use metrics to identify bottlenecks.
1. Check cache hit rate

let metrics = db.metrics();
let hit_rate = metrics.cache_hits as f64 / 
               (metrics.cache_hits + metrics.cache_misses) as f64;

if hit_rate < 0.9 {
    println!("Cache too small - increase cache_size");
}

2. Monitor transaction conflicts

if metrics.transaction_conflicts > metrics.transactions_committed / 10 {
    println!("High conflict rate - review transaction patterns");
}

3. Track WAL growth

let frame_count = db.frame_count();
if frame_count > 10000 {
    println!("Large WAL - consider checkpoint");
    db.checkpoint()?;
}
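The three checks above can be combined into one function that is easy to unit test. The sketch below works on a plain struct rather than the library's metrics type: `Counters`, `hit_rate`, and `diagnose` are hypothetical names, and the thresholds are the ones used in this guide. It also handles the case where no cache accesses have happened yet, which would otherwise produce NaN.

```rust
/// Hypothetical plain copy of the counters used in the tuning steps.
struct Counters {
    cache_hits: u64,
    cache_misses: u64,
    transactions_committed: u64,
    transaction_conflicts: u64,
    wal_frames: u64,
}

/// Cache hit rate in 0.0..=1.0, or None when there were no cache accesses.
fn hit_rate(c: &Counters) -> Option<f64> {
    let total = c.cache_hits + c.cache_misses;
    if total == 0 {
        None
    } else {
        Some(c.cache_hits as f64 / total as f64)
    }
}

/// Return one finding per threshold that is exceeded.
fn diagnose(c: &Counters) -> Vec<&'static str> {
    let mut findings = Vec::new();
    if let Some(rate) = hit_rate(c) {
        if rate < 0.9 {
            findings.push("cache too small");
        }
    }
    if c.transaction_conflicts > c.transactions_committed / 10 {
        findings.push("high conflict rate");
    }
    if c.wal_frames > 10_000 {
        findings.push("large WAL");
    }
    findings
}
```

In an application, the struct would be filled from a metrics snapshot and the frame count, and the findings logged or exported to a monitoring system.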

Best practices summary

Do:
  • Use insert_many for bulk inserts (100-1000 docs)
  • Create indexes on frequently queried fields
  • Use find_one when you only need the first result
  • Project only needed fields
  • Keep transactions short
  • Use typed operations for hot paths
  • Configure adequate cache size
  • Run garbage collection periodically
Avoid:
  • Individual inserts in loops
  • Queries without indexes on large collections
  • Loading entire collections with find_all
  • Long-running transactions
  • Excessive checkpoint frequency
  • Too many indexes (slow writes)
  • Running on slow HDDs when SSD storage is available

Performance checklist

1. Initial setup
  • Choose SSD storage
  • Configure appropriate cache size
  • Set auto-checkpoint threshold
2. Development
  • Create indexes on query fields
  • Use batch operations
  • Project only needed fields
  • Use typed operations where possible
3. Production
  • Monitor metrics (cache hit rate, conflicts)
  • Schedule garbage collection
  • Checkpoint during low traffic
  • Benchmark critical operations
  • Profile slow queries
4. Optimization
  • Add missing indexes
  • Increase cache size if hit rate < 90%
  • Batch writes during bulk imports
  • Use read-only mode for replicas

Real-world performance

E-commerce application

// Product catalog: 100,000 products
// - Indexed: name, category, price
// - Cache: 50,000 pages (~200MB)
// - Storage: NVMe SSD

// Search products: 0.5-2ms
let results = products.query()
    .filter("category is \"electronics\" and price < 500")
    .sort_by("price", SortOrder::Asc)
    .limit(20)
    .execute()?;

// Add to cart: 8-12ms (with ACID)
let cart = db.collection("carts");
cart.insert(json!({
    "user_id": user_id,
    "product_id": product_id,
    "quantity": 1
}))?;

// Checkout (multi-operation transaction): 30-50ms
let result = db.run_transaction(|tx| {
    // Update inventory, create order, clear cart
    // ... multiple operations ...
    Ok(order_id)
})?;

Analytics dashboard

// Events: 1M+ documents
// - Aggregation: 100-500ms
// - Cached results: < 1ms

let hourly_stats = events.aggregate()
    .match_("timestamp > 1704067200")
    .group_by("event_type")
    .count("total")
    .execute()?;  // ~200ms for 1M events

Next steps

  • Indexes: Create indexes to speed up queries
  • Querying: Optimize query patterns
  • CRUD operations: Use efficient write operations
  • Backup and restore: Backup strategies for production