The vault to search.
query is required and must be non-empty. limit defaults to 10.
min_signals (default 1) requires at least that many rankers to have fired for a hit.
granularity: "note" (default) collapses to the best chunk per note;
"block" keeps each chunk distinct. graph_boost defaults to true.
Optional embedding_model?: string
Optional folder?: string
Optional granularity?: "block" | "note"
v2.2.0: "note" (default) returns one hit per note, picking the best chunk; "block" returns each chunk as a distinct hit, so you can see the multiple-paragraph case where one note covers a topic in two places.
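The granularity and min_signals options can be pictured as two small post-fusion filters. The following is a minimal sketch under stated assumptions, not the library's implementation: the ChunkHit shape, the block field, and the helper names filterByMinSignals and applyGranularity are hypothetical, and per_signal is assumed to hold one entry per ranker that fired.

```typescript
// Hypothetical hit shape for illustration; the real response type may differ.
interface ChunkHit {
  path: string; // note path
  block: string; // chunk identifier within the note (assumed field)
  score: number; // fused score, higher is better
  per_signal: Record<string, number>; // assumed: one entry per ranker that fired
}

// Keep only hits that at least `minSignals` rankers contributed to.
function filterByMinSignals(hits: ChunkHit[], minSignals = 1): ChunkHit[] {
  return hits.filter((h) => Object.keys(h.per_signal).length >= minSignals);
}

// "note": keep the best-scoring chunk per note. "block": keep every chunk.
function applyGranularity(
  hits: ChunkHit[],
  granularity: "note" | "block" = "note",
): ChunkHit[] {
  const sorted = [...hits].sort((a, b) => b.score - a.score);
  if (granularity === "block") return sorted;
  const seen = new Set<string>();
  return sorted.filter((h) => {
    if (seen.has(h.path)) return false; // a better chunk of this note was kept
    seen.add(h.path);
    return true;
  });
}

const hits: ChunkHit[] = [
  { path: "rag.md", block: "b1", score: 0.9, per_signal: { bm25: 0.5, embeddings: 0.4 } },
  { path: "rag.md", block: "b7", score: 0.6, per_signal: { tfidf: 0.6 } },
  { path: "misc.md", block: "b2", score: 0.3, per_signal: { tfidf: 0.3 } },
];

// One hit per note: rag.md#b1 and misc.md#b2.
console.log(applyGranularity(hits, "note").map((h) => `${h.path}#${h.block}`));
// Only the chunk that two rankers agreed on survives min_signals = 2.
console.log(filterByMinSignals(hits, 2).map((h) => `${h.path}#${h.block}`));
```

With "block", both rag.md chunks would be kept as separate hits, which is the two-places-in-one-note case the option is meant to expose.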
Optional graph_boost?: boolean
v2.3.0: post-RRF graph boost. Reranks hits by counting how many other top-K hits link to each one. Default true; set false to disable for diagnostic comparison (e.g. measuring whether the boost helped).
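As an illustration of the graph boost idea, here is a minimal sketch that counts, for each hit, how many other top-K hits link to it and then re-sorts. The multiplicative formula, the weight parameter, the links map, and the function name graphBoost are all assumptions made for this example; the source only states that inbound links from other top-K hits are counted.

```typescript
interface Hit {
  path: string;
  score: number;
}

// links maps a note path to the paths it links out to (hypothetical input).
// weight is an assumed tuning knob; the actual boost formula is not documented here.
function graphBoost(hits: Hit[], links: Map<string, string[]>, weight = 0.1): Hit[] {
  const topK = new Set(hits.map((h) => h.path));
  const inbound = new Map<string, number>();

  hits.forEach((source) => {
    // De-duplicate so a note linking twice to the same target counts once.
    const targets = new Set(links.get(source.path) ?? []);
    targets.forEach((target) => {
      // Only links between two distinct top-K hits count.
      if (target !== source.path && topK.has(target)) {
        inbound.set(target, (inbound.get(target) ?? 0) + 1);
      }
    });
  });

  return hits
    .map((h) => ({ ...h, score: h.score * (1 + weight * (inbound.get(h.path) ?? 0)) }))
    .sort((a, b) => b.score - a.score);
}

const boosted = graphBoost(
  [
    { path: "a.md", score: 0.5 },
    { path: "b.md", score: 0.48 },
    { path: "c.md", score: 0.4 },
  ],
  new Map<string, string[]>([
    ["a.md", ["b.md"]],
    ["c.md", ["b.md", "outside.md"]], // outside.md is not a hit, so it is ignored
  ]),
);

// b.md is linked from two other hits, so it moves ahead of a.md.
console.log(boosted.map((h) => h.path));
```

Running the same query with the boost disabled and comparing the two orderings is the kind of diagnostic comparison the option description mentions.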
Optional limit?: number
Optional min_signals?: number
Server-side context: ftsIndex (nullable), embedFile (path may not exist), optional reranker config, optional rerankerOverride (test injection point), and optional hnsw context for accelerated k-NN.
embedFile: path to the .embed.db (the file may or may not exist; this is checked at call time).
ftsIndex: the FTS5 index, if --persistent-index is enabled at server start.
Optional hnsw?: HnswSearchContext | null
v2.13.0: optional HNSW context for the embeddings-search arm. When passed, the embedding-side k-NN goes through the in-memory HNSW index (sub-10ms at any scale) instead of the O(n) brute-force cosine in EmbedDb.search(). Built on serve start; lives in ServerDeps.hnswContext. Null/undefined → brute-force fallback.
Optional reranker?: { alias?: string; topN?: number }
v2.9.0: optional cross-encoder reranker config. When set, the top-N hits from RRF (default 50) are re-scored by a BGE-style cross-encoder and re-sorted before truncation. Adds ~30-50ms per query on M1 CPU for a 50-candidate set.
alias resolves to a RERANKER_MODELS entry. topN defaults to 50. The model is lazy-loaded: the first call downloads it from HuggingFace (~25-110 MB depending on alias). Failures are swallowed and surfaced via signal_errors.reranker, so the whole search doesn't break on a model load issue.
Optional rerankerOverride?: { score(query: string, passages: readonly string[]): Promise<number[]> }
v2.9.0: test-only injection point. When set, this pre-loaded reranker is used instead of lazy-loading via loadReranker(alias). Lets unit tests validate the rerank-and-resort plumbing without pulling in the real ML model. Unused in production callers.
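To make the rerank-and-resort step and the swallowed-failure behaviour concrete, the sketch below re-scores the top-N candidates with an injected reranker and falls back to the original order on error. It is an illustrative approximation only: the Candidate shape, the function name rerankTopN, the fakeReranker stub, and the choice to append unreranked candidates after the reranked head are assumptions, and only the score() signature is taken from the rerankerOverride type above.

```typescript
// Same shape as the documented rerankerOverride type.
interface Reranker {
  score(query: string, passages: readonly string[]): Promise<number[]>;
}

// Hypothetical candidate shape for this sketch.
interface Candidate {
  path: string;
  text: string;
  score: number;
}

interface RerankResult {
  matches: Candidate[];
  signal_errors: { reranker?: string };
}

// candidates must already be sorted by fused score, best first.
async function rerankTopN(
  query: string,
  candidates: Candidate[],
  reranker: Reranker,
  topN = 50,
  limit = 10,
): Promise<RerankResult> {
  const head = candidates.slice(0, topN);
  const tail = candidates.slice(topN); // left in fused order (assumption)
  try {
    const scores = await reranker.score(query, head.map((c) => c.text));
    if (scores.length !== head.length) {
      throw new Error("reranker returned the wrong number of scores");
    }
    const rescored = head
      .map((c, i) => ({ ...c, score: scores[i] }))
      .sort((a, b) => b.score - a.score);
    // Re-sort first, truncate to the requested limit afterwards.
    return { matches: [...rescored, ...tail].slice(0, limit), signal_errors: {} };
  } catch (err) {
    // Swallow the failure: keep the fused order and report the error instead.
    return {
      matches: candidates.slice(0, limit),
      signal_errors: { reranker: String(err) },
    };
  }
}

// A stand-in reranker of the kind a unit test might inject:
// passages containing the query text score 1, everything else 0.
const fakeReranker: Reranker = {
  score: async (query, passages) => passages.map((p) => (p.indexOf(query) >= 0 ? 1 : 0)),
};
```

A test can pass fakeReranker (or one whose score() rejects) to check both the re-sorted path and the fallback path without downloading a model.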
A SearchHybridResponse with sorted matches, observability
in signals_used / signal_errors, and per-hit per_signal breakdown.
const result = await searchHybrid(
  vault,
  { query: "RAG hybrid retrieval", limit: 10, folder: "Reference" },
  { ftsIndex, embedFile: "/path/to/vault.embed.db" }
);
for (const hit of result.matches) {
  console.log(hit.path, hit.score, hit.per_signal);
}
console.log("Rankers fired:", result.signals_used);
Hybrid retrieval: fuses BM25, TF-IDF, and ML embeddings via Reciprocal Rank Fusion (Cormack et al., 2009). This is the recommended search entry point.
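For readers unfamiliar with Reciprocal Rank Fusion, the sketch below shows the core formula: each ranker contributes 1 / (k + rank) for every document it returns, and the contributions are summed. The function name reciprocalRankFusion, the FusedHit shape, and the input format are assumptions for illustration; k = 60 is the constant used in the Cormack et al. paper, and whether this library uses the same value is not stated in the source.

```typescript
interface FusedHit {
  id: string;
  score: number;
  per_signal: Record<string, number>; // each ranker's contribution to the sum
}

// rankings maps a ranker name to its result ids, best first.
// A ranker that is unavailable is simply left out of the map.
function reciprocalRankFusion(
  rankings: Record<string, string[]>,
  k = 60,
): FusedHit[] {
  const byId = new Map<string, FusedHit>();
  Object.keys(rankings).forEach((signal) => {
    rankings[signal].forEach((id, index) => {
      const contribution = 1 / (k + index + 1); // index is 0-based, rank is 1-based
      const entry: FusedHit = byId.get(id) ?? { id, score: 0, per_signal: {} };
      entry.score += contribution;
      entry.per_signal[signal] = contribution;
      byId.set(id, entry);
    });
  });
  return Array.from(byId.values()).sort((a, b) => b.score - a.score);
}

const fusedHits = reciprocalRankFusion({
  bm25: ["a", "b", "c"],
  tfidf: ["b", "a"],
  embeddings: ["b", "d"],
});

// "b" is ranked by all three signals, so it comes first: b, a, d, c.
console.log(fusedHits.map((h) => h.id));
```

Because the formula uses only ranks, not raw scores, rankers with incompatible score scales can be fused directly, and dropping one ranker from the input still yields a usable ordering.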
Most agents should call this rather than the single-ranker variants (searchText, semanticSearch, embeddingsSearch) because the umbrella auto-detects which signals are available and produces consistent recall across user setups. Gracefully degrades:
No FTS5 index (--persistent-index not enabled) → TF-IDF + embeddings (or just TF-IDF)
No embeddings (build-embeddings not run) → BM25 + TF-IDF
Two unique signal layers ride on top of RRF: