Performance & Tuning

Chassis is built on HNSW (Hierarchical Navigable Small World) graphs. You can tune the trade-off between build speed, search speed, and recall accuracy using IndexOptions.

Configuration Options

You can pass an IndexOptions object when opening the index.

from chassis import VectorIndex, IndexOptions

options = IndexOptions(
    max_connections=32,    # 'M' in HNSW papers
    ef_construction=400,   # Build quality
    ef_search=100          # Search quality
)

index = VectorIndex("tuned.chassis", dimensions=128, options=options)

Understanding Parameters

Parameter	Description	Default	Impact
`max_connections`	Max edges per node in the graph.	16	Higher = Better recall, higher memory usage.
`ef_construction`	Size of the dynamic candidate list during build.	200	Higher = Slower build, higher quality graph.
`ef_search`	Size of the dynamic candidate list during search.	50	Higher = Slower search, better recall.
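
To get a rough feel for the memory cost of raising max_connections, the sketch below applies a rule of thumb common to HNSW implementations: each vector stores its float32 data plus up to 2 × M four-byte neighbor ids at the base layer. This is an assumption about HNSW in general, not a description of how Chassis lays out its index, and `estimate_index_bytes` is a hypothetical helper rather than part of the library.

```python
def estimate_index_bytes(num_vectors: int, dimensions: int, max_connections: int) -> int:
    """Rough lower bound on index size in bytes (hypothetical helper).

    Assumes float32 vectors and 4-byte neighbor ids, with up to
    2 * max_connections links per node at the base layer. Upper layers
    and file metadata are ignored, so real usage will be somewhat higher.
    """
    vector_bytes = dimensions * 4          # float32 components
    link_bytes = 2 * max_connections * 4   # base-layer neighbor ids
    return num_vectors * (vector_bytes + link_bytes)


# Compare the default M=16 against M=32 for one million 128-dim vectors.
for m in (16, 32):
    size_mb = estimate_index_bytes(1_000_000, 128, m) / 1e6
    print(f"max_connections={m}: ~{size_mb:.0f} MB")
```

Under these assumptions, doubling max_connections doubles only the link storage, so the relative cost is larger for low-dimensional vectors than for high-dimensional ones.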

Batch Insertion Strategy

Calling flush() involves an fsync system call, which is expensive. For maximum write throughput:

Add vectors in batches (e.g., 1,000 to 10,000).
Call flush() only after the batch is complete.

# Bad: Slow due to excessive syscalls
for vec in huge_dataset:
    index.add(vec)
    index.flush()  # fsync after every single vector

# Good: High throughput
batch_size = 1000
for i, vec in enumerate(huge_dataset, start=1):
    index.add(vec)
    if i % batch_size == 0:  # Flush once per full batch, not after the first vector
        index.flush()
index.flush()  # Final flush for any remaining vectors
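
The same pattern can be wrapped in a small helper so the batching logic lives in one place. The sketch below relies only on the add() and flush() calls shown above; `add_in_batches`, `batched`, and the `CountingIndex` stand-in are hypothetical names used for illustration and are not part of Chassis.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Sequence


def batched(vectors: Iterable[Sequence[float]], batch_size: int) -> Iterator[List[Sequence[float]]]:
    """Yield consecutive lists of at most batch_size vectors."""
    it = iter(vectors)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def add_in_batches(index, vectors: Iterable[Sequence[float]], batch_size: int = 1000) -> int:
    """Add every vector, flushing once per batch. Returns the flush count."""
    flushes = 0
    for batch in batched(vectors, batch_size):
        for vec in batch:
            index.add(vec)
        index.flush()  # one flush per batch, including the final partial one
        flushes += 1
    return flushes


class CountingIndex:
    """Stand-in for a real index (assumption: only add() and flush() are needed)."""

    def __init__(self) -> None:
        self.added = 0
        self.flushed = 0

    def add(self, vec: Sequence[float]) -> None:
        self.added += 1

    def flush(self) -> None:
        self.flushed += 1


demo = CountingIndex()
flush_count = add_in_batches(demo, ([0.0] * 128 for _ in range(2500)), batch_size=1000)
print(demo.added, flush_count)
```

Because the helper flushes at the end of each batch, including a trailing partial batch, no separate final flush is needed, and an empty input triggers no flush at all.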