The vault. Used for path-exclusion filtering and to error on missing index with a guidance message.
query is required and must be non-empty. limit defaults to 10,
min_score to 0.3 (a relatively high cosine floor, because embedding
cosine scores have a tighter distribution than TF-IDF scores). model
overrides the embedder alias. hypothetical_answer enables HyDE.
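As a rough illustration of the defaults described above, the following sketch validates query and fills in limit and min_score. The EmbedSearchArgs interface and the normalizeArgs helper are hypothetical names inferred from the parameter list on this page, and treating a whitespace-only query as empty is an assumption, not documented behavior.

```typescript
// Hypothetical argument shape, inferred from the parameter list on this page.
interface EmbedSearchArgs {
  query: string;
  folder?: string;
  hypothetical_answer?: string;
  limit?: number;
  min_score?: number;
  model?: string;
}

// Documented defaults: limit 10, min_score 0.3.
const DEFAULT_LIMIT = 10;
const DEFAULT_MIN_SCORE = 0.3;

// Hypothetical helper: rejects an empty query and applies the defaults.
function normalizeArgs(args: EmbedSearchArgs) {
  // Assumption: a whitespace-only query counts as empty.
  if (typeof args.query !== "string" || args.query.trim() === "") {
    throw new Error("query is required and must be non-empty");
  }
  return {
    ...args,
    // `??` keeps an explicit 0 for min_score instead of replacing it.
    limit: args.limit ?? DEFAULT_LIMIT,
    min_score: args.min_score ?? DEFAULT_MIN_SCORE,
  };
}
```

Using `??` rather than `||` matters here: a caller who passes min_score: 0 to disable the floor keeps that value.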
Optional folder?: string
Optional hypothetical_answer?: string
v3.1.0: HyDE (Hypothetical Document Embeddings) augmentation.
When set, this string is embedded instead of query. The agent
generates a synthetic answer to its own question, embeds that,
and retrieves against the answer-shaped vector, which typically beats
raw-query retrieval on under-specified queries by +2-5 NDCG@10.
The query string is still echoed in the response for the caller's
audit trail; it does NOT influence retrieval when hypothetical_answer
is present.
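The selection rule can be sketched in a few lines. The selectEmbedText helper below is a hypothetical name, and treating a blank hypothetical_answer as "HyDE did not fire" is an assumption made for the sketch; the page only states that the answer replaces the query when it is set.

```typescript
// Hypothetical helper: decides which text gets embedded and whether HyDE fired.
function selectEmbedText(
  query: string,
  hypotheticalAnswer?: string,
): { query: string; text: string; hyde: boolean } {
  // Assumption: an undefined or blank answer falls back to the raw query.
  if (hypotheticalAnswer !== undefined && hypotheticalAnswer.trim() !== "") {
    // The query is echoed back but plays no part in retrieval.
    return { query, text: hypotheticalAnswer, hyde: true };
  }
  return { query, text: query, hyde: false };
}
```

For example, a caller asking "why is sync slow?" might pass a drafted answer such as "Sync is slow because the index is rebuilt on every save"; that drafted sentence is what gets embedded, while the original question only appears in the response.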
Optional limit?: number
Optional min_score?: number
Optional model?: string
Absolute path to the .embed.db. Existence is checked
before any model load so the error message is fast and clear.
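The ordering described here (check the file first, load the model second) might look like the sketch below. The function name, the injected fileExists and loadModel callbacks, and the error wording are all hypothetical; only the enquire-mcp build-embeddings command comes from this page.

```typescript
// Hypothetical helper: fail fast on a missing index before paying for model load.
function openIndexThenLoadModel<M>(
  embedDbPath: string,
  fileExists: (path: string) => boolean, // e.g. existsSync from node:fs
  loadModel: () => M, // the expensive step, only reached if the index exists
): M {
  if (!fileExists(embedDbPath)) {
    // Illustrative message; the real wording is not shown on this page.
    throw new Error(
      `No embeddings index found at ${embedDbPath}. ` +
        `Run "enquire-mcp build-embeddings" to create it.`,
    );
  }
  return loadModel();
}
```

Injecting the two callbacks keeps the sketch free of runtime dependencies and makes the ordering testable: a missing file must never trigger loadModel.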
Optional hnsw: HnswSearchContext | null
Optional HNSW index context. When passed, k-NN routes through HNSW instead of brute-force cosine.
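To make the routing concrete, here is a minimal sketch of a brute-force cosine k-NN with an optional approximate index in front of it. The KnnIndex interface is a hypothetical stand-in, since the real HnswSearchContext shape is not shown on this page, and the in-memory vectors array stands in for whatever EmbedDb actually stores.

```typescript
type Hit = { id: number; score: number };

// Hypothetical stand-in for an HNSW context; not the real HnswSearchContext.
interface KnnIndex {
  search(query: number[], k: number): Hit[];
}

// Cosine similarity; returns 0 when either vector has zero norm.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Brute force: score every stored vector, sort descending, keep the top k.
function bruteForceKnn(query: number[], vectors: number[][], k: number): Hit[] {
  return vectors
    .map((v, id) => ({ id, score: cosine(query, v) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Route through the index when one is supplied, otherwise fall back.
function knn(
  query: number[],
  vectors: number[][],
  k: number,
  hnsw?: KnnIndex | null,
): Hit[] {
  return hnsw ? hnsw.search(query, k) : bruteForceKnn(query, vectors, k);
}
```

The brute-force path costs one cosine per stored chunk, which is why an approximate index becomes attractive as the vault grows.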
An EmbedSearchResponse with chunk-level matches and a
hyde: true marker iff HyDE actually fired.
ML embeddings retrieval — k-NN over a persistent vector index.
Hits a .embed.db (SQLite) built by enquire-mcp build-embeddings. The index is opt-in and out-of-band: this function lazy-loads the @huggingface/transformers runtime + the embedder model only when called. If the user hasn't run build-embeddings, it returns a clean error pointing to the setup command instead of blocking inside model load.

Supports HyDE (Hypothetical Document Embeddings, Gao et al. 2023): pass hypothetical_answer and that text is embedded instead of query, typically +2-5 NDCG@10 on under-specified queries. Optional HNSW acceleration (sub-10ms k-NN at any scale) when an HnswSearchContext is provided; otherwise falls back to brute-force cosine in EmbedDb.

Privacy contract: hits are filtered through vault.isExcluded() before return, so entries in the .embed.db for paths now matched by --exclude-glob / --read-paths never leak through. To keep the returned count stable under normal exclude-glob use, the search over-fetches by 2× (brute-force) or 6× (HNSW). Under extreme configurations where the majority of the embed-db is excluded, fewer than limit results may be returned. This is accepted behavior: privacy takes precedence over result count.
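The over-fetch-then-filter step of the privacy contract can be sketched as follows. The filterAndTrim helper, the Match type, and the injected fetchTopK and isExcluded callbacks are hypothetical; only the 2× and 6× factors and the filter-before-return ordering come from the text above.

```typescript
type Match = { path: string; score: number };

// Hypothetical helper: over-fetch, drop excluded paths, then trim to `limit`.
function filterAndTrim(
  limit: number,
  usingHnsw: boolean,
  fetchTopK: (k: number) => Match[], // stand-in for the k-NN call
  isExcluded: (path: string) => boolean, // stand-in for vault.isExcluded()
): Match[] {
  // Over-fetch factors as documented: 2x for brute force, 6x for HNSW.
  const factor = usingHnsw ? 6 : 2;
  const candidates = fetchTopK(limit * factor);
  // Filtering happens before trimming, so excluded hits can never be returned.
  // The result may be shorter than `limit` when most candidates are excluded.
  return candidates.filter((m) => !isExcluded(m.path)).slice(0, limit);
}
```

With limit 10 on the brute-force path, 20 candidates are fetched, so up to half of them can be excluded before the caller sees fewer than 10 results.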