Performance - jasonisnthappy

Jasonisnthappy is designed for high performance. This guide covers optimization techniques, benchmarks, and best practices.

Performance characteristics

Key metrics

Based on production benchmarks:

Bulk inserts

~19,000 docs/sec for 1000-doc batches

Single writes

~8ms per insert with full ACID guarantees

Reads

Sub-millisecond queries on collections with 2500+ documents

Concurrent scaling

Linear scaling with thread count up to core count

ACID compliance

All operations include:

Full ACID transaction support
MVCC snapshot isolation
Write-ahead logging (WAL)
Crash recovery

Performance numbers include durability guarantees (fsync). Jasonisnthappy doesn’t sacrifice safety for speed.

Write optimization

Batch operations

Use bulk inserts instead of individual inserts.

// Insert 1000 documents in one transaction
let docs: Vec<Value> = (0..1000)
    .map(|i| json!({"id": i, "name": format!("User {}", i)}))
    .collect();

let ids = users.insert_many(docs)?;  // ~50ms
// Throughput: ~19,000 docs/sec

Optimal batch size: 100-1000 documents per insert_many call balances throughput and memory usage.

Bulk write operations

Use bulk_write for mixed operations.

use jasonisnthappy::Database;
use serde_json::json;

let users = db.collection("users");

// Efficient: single transaction for multiple operations
let result = users.bulk_write()
    .insert(json!({"name": "Alice"}))
    .insert(json!({"name": "Bob"}))
    .update_one("name is \"Alice\"", json!({"age": 30}))
    .delete_many("status is \"inactive\"")
    .execute()?;

println!("Inserted: {}, Updated: {}, Deleted: {}",
    result.inserted_count,
    result.updated_count,
    result.deleted_count
);

Unordered writes

For maximum throughput, use unordered bulk writes.

// Unordered: continue on errors, faster
let result = users.bulk_write()
    .ordered(false)  // Don't stop on first error
    .insert(json!({"name": "User1"}))
    .insert(json!({"name": "User2"}))
    // ... 100s of operations ...
    .execute()?;

if !result.errors.is_empty() {
    eprintln!("Some operations failed: {:?}", result.errors);
}

Checkpoint management

Control when WAL is checkpointed.

// Default: auto-checkpoint every 1000 WAL frames
let db = Database::open("my.db")?;

// For bulk imports: increase threshold
db.set_auto_checkpoint_threshold(10000);  // Checkpoint less often

// ... bulk insert operations ...

// Manual checkpoint after bulk import
db.checkpoint()?;

// Restore default
db.set_auto_checkpoint_threshold(1000);

Increasing the checkpoint threshold reduces write amplification during bulk imports but uses more WAL space.

Read optimization

Use indexes

Indexes are crucial for query performance.

Without index
With index

// Slow: scans entire collection
let user = users.find("email is \"alice@example.com\"")?;
// Time: O(n) - 50ms for 10,000 documents

// Fast: uses index lookup
db.create_index("users", "email_idx", "email", true)?;
let user = users.find("email is \"alice@example.com\"")?;
// Time: O(log n) - 0.02ms for 10,000 documents

See the Indexes guide for details.

Query optimization

Use the most specific query method.

// Best: Direct ID lookup
let user = users.find_by_id("user_123")?;  // 0.01ms

// Good: find_one with index
let user = users.find_one("email is \"alice@example.com\"")?;  // 0.02ms

// Slower: find (returns all matches)
let results = users.find("email is \"alice@example.com\"")?;  // 0.03ms
let user = results.first();

// Slowest: find_all then filter in code
let all_users = users.find_all()?;  // 2ms for 10,000 docs
let user = all_users.iter().find(|u| u["email"] == "alice@example.com");

Projection

Fetch only the fields you need.

let results = users.query()
    .filter("age > 25")
    .project(&["name", "email"])  // Only load 2 fields
    .execute()?;
// Less data transferred from disk

Typed operations

Use typed methods to avoid JSON parsing overhead.

use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize)]
struct User {
    name: String,
    email: String,
}

// Faster: direct deserialization
let user: Option<User> = users.find_by_id_typed("user_123")?;

// Slower: JSON Value then application-level parsing
let user_value = users.find_by_id("user_123")?;
let user: User = serde_json::from_value(user_value)?;

Transaction optimization

Minimize transaction scope

Keep transactions short and focused.

let mut tx = db.begin()?;
let users = tx.collection("users");
users.insert(json!({"name": "Alice"}))?;
tx.commit()?;  // Fast commit

Use retryable transactions

For automatic conflict resolution.

use jasonisnthappy::TransactionConfig;

// Configure retries
db.set_transaction_config(TransactionConfig {
    max_retries: 5,
    retry_backoff_base_ms: 1,
    max_retry_backoff_ms: 100,
});

// Run with automatic retry
let result = db.run_transaction(|tx| {
    let users = tx.collection("users");
    let count = users.count()?;
    
    users.insert(json!({"count": count}))?;
    Ok(count)
})?;

MVCC benefits

Reads never block writes, writes never block reads.

use std::thread;

// Reader thread (never blocked)
let db_clone = db.clone();
let reader = thread::spawn(move || {
    let users = db_clone.collection("users");
    loop {
        let count = users.count().unwrap();
        println!("Count: {}", count);
        thread::sleep(Duration::from_millis(10));
    }
});

// Writer thread (runs concurrently)
let db_clone = db.clone();
let writer = thread::spawn(move || {
    let users = db_clone.collection("users");
    loop {
        users.insert(json!({"data": "test"})).unwrap();
        thread::sleep(Duration::from_millis(100));
    }
});

// Both threads run concurrently without blocking

Memory optimization

Page cache

Configure the page cache size.

use jasonisnthappy::DatabaseOptions;

let db = Database::open_with_options("my.db", DatabaseOptions {
    cache_size: 50_000,  // 50K pages = ~200MB
    ..Default::default()
})?;

Cache sizing guide:

Small DBs (< 100MB): 10,000 pages (~40MB)
Medium DBs (< 1GB): 25,000 pages (~100MB)
Large DBs (> 1GB): 50,000+ pages (~200MB+)

Limit result sets

Use pagination to avoid loading large result sets.

// Good: Paginate large result sets
let page_size = 100;
let results = users.query()
    .filter("status is \"active\"")
    .limit(page_size)
    .execute()?;

// Bad: Load all results into memory
let all_results = users.find("status is \"active\"")?;  // Could be millions

Garbage collection

Reclaim space from old MVCC versions.

// Periodically clean up old versions
let stats = db.garbage_collect()?;

println!("Versions removed: {}", stats.versions_removed);
println!("Pages freed: {}", stats.pages_freed);
println!("Bytes freed: {}", stats.bytes_freed);

Garbage collection is safe to run while the database is in use. It only removes versions no longer visible to any transaction.

Disk I/O optimization

SSD vs HDD

Jasonisnthappy performs best on SSDs.

Storage	Random reads	Sequential writes	Recommended
SSD	Excellent	Excellent	✅ Yes
NVMe SSD	Exceptional	Exceptional	✅ Best
HDD	Poor	Good	⚠️ Limited

File placement

Separate database and WAL on different disks (advanced).

// Database on fast SSD
let db = Database::open("/mnt/ssd/app.db")?;

// WAL on separate disk (if supported)
// Note: Current implementation keeps WAL with database

Batch commits

Jasonisnthappy batches commits automatically for better throughput.

// Multiple concurrent writers are batched
let threads: Vec<_> = (0..4)
    .map(|i| {
        let db = db.clone();
        thread::spawn(move || {
            let users = db.collection("users");
            users.insert(json!({"thread": i})).unwrap();
        })
    })
    .collect();

// Commits are batched together for efficiency

Benchmarking

Running benchmarks

# Run all benchmarks
cargo run --release --example bench_all

# Results show:
# - Write throughput
# - Read latency
# - Bulk insert performance
# - Concurrent operations

Custom benchmarks

use std::time::Instant;

let db = Database::open("bench.db")?;
let users = db.collection("users");

// Benchmark inserts
let start = Instant::now();
for i in 0..1000 {
    users.insert(json!({"id": i}))?;
}
let elapsed = start.elapsed();
println!("1000 inserts: {:.2?}", elapsed);
println!("Throughput: {:.0} ops/sec", 1000.0 / elapsed.as_secs_f64());

// Benchmark queries
let start = Instant::now();
for i in 0..1000 {
    users.find_one(&format!("id is {}", i))?;
}
let elapsed = start.elapsed();
println!("1000 queries: {:.2?}", elapsed);
println!("Avg latency: {:.2}ms", elapsed.as_millis() as f64 / 1000.0);

Monitoring

Metrics

Track database performance with built-in metrics.

use jasonisnthappy::Database;

let db = Database::open("my.db")?;

// Get metrics snapshot
let metrics = db.metrics();

println!("Transactions:");
println!("  Begun: {}", metrics.transactions_begun);
println!("  Committed: {}", metrics.transactions_committed);
println!("  Rolled back: {}", metrics.transactions_rolled_back);
println!("  Conflicts: {}", metrics.transaction_conflicts);

println!("\nCache:");
println!("  Hits: {}", metrics.cache_hits);
println!("  Misses: {}", metrics.cache_misses);
let hit_rate = metrics.cache_hits as f64 / 
               (metrics.cache_hits + metrics.cache_misses) as f64 * 100.0;
println!("  Hit rate: {:.1}%", hit_rate);

println!("\nWAL:");
println!("  Frames written: {}", metrics.wal_frames_written);
println!("  Checkpoints: {}", metrics.checkpoints);

Performance tuning

Use metrics to identify bottlenecks.

Check cache hit rate

let metrics = db.metrics();
let hit_rate = metrics.cache_hits as f64 / 
               (metrics.cache_hits + metrics.cache_misses) as f64;

if hit_rate < 0.9 {
    println!("Cache too small - increase cache_size");
}

Monitor transaction conflicts

if metrics.transaction_conflicts > metrics.transactions_committed / 10 {
    println!("High conflict rate - review transaction patterns");
}

Track WAL growth

let frame_count = db.frame_count();
if frame_count > 10000 {
    println!("Large WAL - consider checkpoint");
    db.checkpoint()?;
}

Best practices summary

Do:

Use insert_many for bulk inserts (100-1000 docs)
Create indexes on frequently queried fields
Use find_one when you only need the first result
Project only needed fields
Keep transactions short
Use typed operations for hot paths
Configure adequate cache size
Run garbage collection periodically

Avoid:

Individual inserts in loops
Queries without indexes on large collections
Loading entire collections with find_all
Long-running transactions
Excessive checkpoint frequency
Too many indexes (slow writes)
Running on slow HDDs if possible

Performance checklist

Real-world performance

E-commerce application

// Product catalog: 100,000 products
// - Indexed: name, category, price
// - Cache: 50,000 pages (~200MB)
// - Storage: NVMe SSD

// Search products: 0.5-2ms
let results = products.query()
    .filter("category is \"electronics\" and price < 500")
    .sort_by("price", SortOrder::Asc)
    .limit(20)
    .execute()?;

// Add to cart: 8-12ms (with ACID)
let cart = db.collection("carts");
cart.insert(json!({
    "user_id": user_id,
    "product_id": product_id,
    "quantity": 1
}))?;

// Checkout (bulk write): 30-50ms
let result = db.run_transaction(|tx| {
    // Update inventory, create order, clear cart
    // ... multiple operations ...
    Ok(order_id)
})?;

Analytics dashboard

// Events: 1M+ documents
// - Aggregation: 100-500ms
// - Cached results: < 1ms

let hourly_stats = events.aggregate()
    .match_("timestamp > 1704067200")
    .group_by("event_type")
    .count("total")
    .execute()?;  // ~200ms for 1M events

Next steps

Indexes

Create indexes to speed up queries

Querying

Optimize query patterns

CRUD operations

Use efficient write operations

Backup and restore

Backup strategies for production

Backup and restore

Rust SDK

​Performance characteristics

​Key metrics

Bulk inserts

Single writes

Reads

Concurrent scaling

​ACID compliance

​Write optimization

​Batch operations

​Bulk write operations

​Unordered writes

​Checkpoint management

​Read optimization

​Use indexes

​Query optimization

​Projection

​Typed operations

​Transaction optimization

​Minimize transaction scope

​Use retryable transactions

​MVCC benefits

​Memory optimization

​Page cache

​Limit result sets

​Garbage collection

​Disk I/O optimization

​SSD vs HDD

​File placement

​Batch commits

​Benchmarking

​Running benchmarks

​Custom benchmarks

​Monitoring

​Metrics

​Performance tuning

​Best practices summary

​Performance checklist

​Real-world performance

​E-commerce application

​Analytics dashboard

​Next steps

Indexes

Querying

CRUD operations

Backup and restore

Performance characteristics

Key metrics

ACID compliance

Write optimization

Batch operations

Bulk write operations

Unordered writes

Checkpoint management

Read optimization

Use indexes

Query optimization

Projection

Typed operations

Transaction optimization

Minimize transaction scope

Use retryable transactions

MVCC benefits

Memory optimization

Page cache

Limit result sets

Garbage collection

Disk I/O optimization

SSD vs HDD

File placement

Batch commits

Benchmarking

Running benchmarks

Custom benchmarks

Monitoring

Metrics

Performance tuning

Best practices summary

Performance checklist

Real-world performance

E-commerce application

Analytics dashboard

Next steps