Search Overview

How QANATIX hybrid search works — semantic + keyword + identifier detection.

Search

QANATIX uses hybrid search that combines dense vectors (semantic), sparse vectors (keyword), and exact-match identifier detection to return the most relevant results in under 200 ms.

Query classification

Every query is automatically classified:

| Type | Detection | Strategy |
|---|---|---|
| Identifier | Part numbers, EANs, CAS numbers, DIN/ISO/EN standards, SKUs | Exact match on indexed identifiers |
| Keyword | 1-3 words | Sparse vector (BM25) weighted |
| Semantic | 4+ words, natural language | Dense + sparse fusion |

Examples:

  • SS-M8-40-A2: identifier (exact match, score = 1.0)
  • stainless bolt: keyword (BM25 weighted)
  • ISO 9001 certified M8 suppliers in Germany: semantic (hybrid fusion)
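
The classification rules above can be sketched in a few lines of Python. This is only an illustration: the regular expressions and the function name `classify_query` are assumptions, since the actual detection rules QANATIX applies to part numbers, EANs, CAS numbers, and standards are not documented on this page.

```python
import re

# Hypothetical identifier patterns (assumptions, not the documented QANATIX rules).
IDENTIFIER_PATTERNS = [
    re.compile(r"^\d{13}$"),                                 # EAN-13 barcode
    re.compile(r"^\d{2,7}-\d{2}-\d$"),                       # CAS registry number
    re.compile(r"^(DIN|ISO|EN)\s?\d+(-\d+)*$", re.I),        # bare standard reference
    re.compile(r"^(?=.*\d)[A-Z0-9]+(-[A-Z0-9]+)+$", re.I),   # hyphenated part number or SKU
]


def classify_query(query: str) -> str:
    """Return 'identifier', 'keyword', or 'semantic' for a raw query string."""
    text = query.strip()
    # Identifier detection takes priority over word counting.
    if any(pattern.match(text) for pattern in IDENTIFIER_PATTERNS):
        return "identifier"
    # 1-3 words: keyword; 4 or more words: semantic.
    if len(text.split()) <= 3:
        return "keyword"
    return "semantic"
```

Note that in this sketch a standard reference embedded in a longer sentence, such as the third example above, does not match the anchored patterns and falls through to the word-count rule.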

Hybrid fusion

For semantic and keyword queries, QANATIX runs two searches in parallel:

  1. Dense vector search — semantic similarity (OpenAI, Cohere, or BGE-M3)
  2. Sparse vector search — keyword relevance (BM25 or model-learned sparse)

Results are fused using Distribution-Based Score Fusion (DBSF), which normalizes each signal's scores by their statistical distribution rather than by min/max. Each signal prefetches 50 candidates.
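
To make the fusion step concrete, here is a minimal sketch of distribution-based normalization. It assumes the bounds are the mean plus or minus three standard deviations of each signal's scores, and it averages the normalized signals so the fused score stays in the 0-1 range; both choices, and the function names, are assumptions rather than the documented QANATIX implementation.

```python
from statistics import mean, pstdev


def dbsf_normalize(scores: dict[str, float]) -> dict[str, float]:
    """Rescale one signal's scores to [0, 1] using mean +/- 3 std as bounds."""
    if not scores:
        return {}
    mu = mean(scores.values())
    sigma = pstdev(scores.values())
    if sigma == 0:
        # All scores identical: no spread to normalize against.
        return {doc_id: 0.5 for doc_id in scores}
    low, high = mu - 3 * sigma, mu + 3 * sigma
    return {
        doc_id: min(1.0, max(0.0, (score - low) / (high - low)))
        for doc_id, score in scores.items()
    }


def dbsf_fuse(dense: dict[str, float], sparse: dict[str, float]) -> list[tuple[str, float]]:
    """Fuse two signals; a document missing from a signal contributes 0 for it."""
    signals = [dbsf_normalize(dense), dbsf_normalize(sparse)]
    doc_ids = set().union(*signals)
    fused = {
        doc_id: sum(signal.get(doc_id, 0.0) for signal in signals) / len(signals)
        for doc_id in doc_ids
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)
```

Because each signal is rescaled against its own distribution, raw BM25 scores (unbounded) and cosine similarities (bounded) become comparable before they are combined.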

With BGE-M3 embeddings, a third signal is available:

  3. ColBERT reranking: token-level late interaction for fine-grained matching (top 20 candidates)

Reranking

When enabled (default), results are re-scored with a cross-encoder model (BAAI/bge-reranker-v2-m3):

  • Triggered when 10+ results are returned and rerank=true
  • Over-fetches max(limit * 2, 20) candidates for the reranker pool
  • Final score: 0.3 * retrieval_score + 0.7 * reranker_score
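
The three reranking rules translate directly into arithmetic. The sketch below uses hypothetical helper names and assumes the reranker score has already been mapped to the 0-1 range.

```python
def rerank_pool_size(limit: int) -> int:
    """Number of candidates to over-fetch for the reranker pool."""
    return max(limit * 2, 20)


def should_rerank(result_count: int, rerank: bool = True) -> bool:
    """Reranking runs only when enabled and at least 10 results came back."""
    return rerank and result_count >= 10


def final_score(retrieval_score: float, reranker_score: float) -> float:
    """Blend the retrieval and cross-encoder scores with the 0.3 / 0.7 weights."""
    return 0.3 * retrieval_score + 0.7 * reranker_score
```

For example, a request with limit=5 still fetches 20 candidates, and a result with retrieval score 0.6 and reranker score 0.9 ends up at 0.3 * 0.6 + 0.7 * 0.9 = 0.81.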

Zero-result fallback

If hybrid search returns no results, QANATIX automatically:

  1. Relaxes filters (removes all filter constraints)
  2. Falls back to dense-only search
  3. Falls back to Postgres full-text search (degraded mode)

The metadata.search_mode field in the response indicates which strategy was used.
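
A fallback chain like this can be sketched as an ordered list of attempts. The example assumes the steps are tried in sequence until one returns results, and that the later steps run without filters. The search backends are passed in as plain callables, and the search_mode values ("hybrid", "hybrid_relaxed", "dense_only", "postgres_fts") are hypothetical labels, not the documented ones.

```python
from typing import Callable, Optional

# A search backend takes (query, filters) and returns a list of results.
SearchFn = Callable[[str, Optional[dict]], list]


def search_with_fallback(
    query: str,
    filters: Optional[dict],
    hybrid: SearchFn,
    dense_only: SearchFn,
    postgres_fts: SearchFn,
) -> dict:
    """Try each strategy in order and report which one produced the results."""
    attempts = [
        ("hybrid", hybrid, filters),          # normal path
        ("hybrid_relaxed", hybrid, None),     # step 1: drop all filters
        ("dense_only", dense_only, None),     # step 2: dense vectors only
        ("postgres_fts", postgres_fts, None), # step 3: degraded mode
    ]
    for mode, search, active_filters in attempts:
        results = search(query, active_filters)
        if results:
            return {"results": results, "metadata": {"search_mode": mode}}
    # Nothing matched anywhere: report the last strategy that was tried.
    return {"results": [], "metadata": {"search_mode": attempts[-1][0]}}
```

A client can then branch on metadata.search_mode, for instance to warn the user that filters were dropped.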

Scoring

Results are scored 0-1 (DBSF normalized). Higher is better.

| Score range | Interpretation |
|---|---|
| 0.8 - 1.0 | Excellent match |
| 0.6 - 0.8 | Good match |
| 0.3 - 0.6 | Partial match |
| < 0.3 | Weak match |

Results with score < 0.05 are filtered out.

Identifier matches always return score = 1.0.
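
On the client side, the score bands can be turned into labels with a small helper. The sketch below assumes each band includes its lower bound (the documented ranges share their endpoints) and that results are dictionaries with a "score" key; both are assumptions for illustration.

```python
MIN_SCORE = 0.05  # results below this are filtered out


def interpret_score(score: float) -> str:
    """Map a 0-1 score to its interpretation band (lower bound inclusive)."""
    if score >= 0.8:
        return "excellent"
    if score >= 0.6:
        return "good"
    if score >= 0.3:
        return "partial"
    return "weak"


def drop_low_scores(results: list[dict]) -> list[dict]:
    """Keep only results at or above the minimum score threshold."""
    return [result for result in results if result["score"] >= MIN_SCORE]
```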

Caching

| Cache | TTL | Key |
|---|---|---|
| Response cache | 30 seconds | tenant + vertical + normalized query |
| Query embedding cache | 1 hour | provider + model + dimensions + query hash |
| Entity embedding cache | 7 days | provider + model + dimensions + text hash |
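
To illustrate how keys of this shape might be built, the sketch below composes them from the listed components. The key layout, the normalization (lowercasing and collapsing whitespace), and the use of SHA-256 are assumptions; the page does not specify how QANATIX normalizes or hashes queries.

```python
import hashlib

# TTLs from the table above, in seconds.
TTL_SECONDS = {
    "response": 30,
    "query_embedding": 60 * 60,
    "entity_embedding": 7 * 24 * 60 * 60,
}


def normalize_query(query: str) -> str:
    """Assumed normalization: lowercase and collapse runs of whitespace."""
    return " ".join(query.lower().split())


def response_cache_key(tenant: str, vertical: str, query: str) -> str:
    """Key for the response cache: tenant + vertical + normalized query."""
    return f"resp:{tenant}:{vertical}:{normalize_query(query)}"


def embedding_cache_key(kind: str, provider: str, model: str, dimensions: int, text: str) -> str:
    """Key for the query or entity embedding cache; `kind` is 'query' or 'entity'."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return f"emb:{kind}:{provider}:{model}:{dimensions}:{digest}"
```

Including provider, model, and dimensions in the embedding keys means that switching embedding models never serves a stale vector of the wrong shape.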
