feat(qdrant): optimization — payload indexes, HNSW tuning, search params (v1)
Inspection findings:
- _ensure_collection() created collections with bare VectorParams (no HNSW/optimizer config)
- _do_search() had no SearchParams — used Qdrant defaults (ef often ~100, no indexed_only)
- No payload index management at all — filtered searches scanned unindexed fields every time
- collection_info() returned minimal data — impossible to inspect production state
- No way to create/ensure payload indexes via the API
Changes — qdrant/main.py:
- Add SEARCH_HNSW_EF env var (default 128, above Qdrant default for better recall)
- _ensure_collection(): configure HnswConfigDiff(m=16, ef_construct=200, on_disk=False)
  and OptimizersConfigDiff(indexing_threshold=20000, default_segment_number=4) on creation
- _do_search(): use SearchParams(hnsw_ef, exact, indexed_only) on every query
- SearchUrlRequest + SearchVectorRequest: expose hnsw_ef, exact, indexed_only per request
- collection_info(): expand to full HNSW/optimizer/quantization/segment/payload_schema detail
- GET /collections/{name}/indexes — list all payload indexes
- POST /collections/{name}/indexes — create a single payload index
- POST /collections/{name}/ensure-indexes — idempotent bulk index creation (skip existing)
- POST /collections/{name}/configure — apply HNSW/optimizer changes to existing collections
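The "skip existing" behaviour of ensure-indexes can be sketched roughly as follows. This is an illustration only, not the code in qdrant/main.py: plan_index_creation and its dict-based inputs are hypothetical names, and the real endpoint presumably reads the collection's payload_schema from Qdrant and issues one index-creation call per missing field.

```python
from typing import Dict, List, Tuple


def plan_index_creation(
    existing_schema: Dict[str, str],
    wanted: Dict[str, str],
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split the requested indexes into (to_create, skipped).

    existing_schema: field name -> schema type already indexed (hypothetical shape).
    wanted: field name -> schema type the caller asks for.
    """
    to_create: List[Tuple[str, str]] = []
    skipped: List[str] = []
    for field, schema in wanted.items():
        if field in existing_schema:
            # Already indexed: skip it, so repeating the call changes nothing.
            skipped.append(field)
        else:
            to_create.append((field, schema))
    return to_create, skipped


existing = {"is_public": "bool"}
wanted = {"is_public": "bool", "category_id": "integer"}
to_create, skipped = plan_index_creation(existing, wanted)
print(to_create)  # [('category_id', 'integer')]
print(skipped)    # ['is_public']
```

Because fields that are already indexed are only reported and never re-created, calling the endpoint twice with the same payload is safe.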
Changes — gateway/main.py:
- Expose the 4 new qdrant-svc endpoints under /vectors/collections/{name}/...
Changes — docker-compose.yml:
- Add SEARCH_HNSW_EF=128 to qdrant-svc environment
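One plausible way the env default and the per-request fields could combine is sketched below. The precedence (request value first, then SEARCH_HNSW_EF, then 128) and the False defaults for exact and indexed_only are assumptions, and resolve_search_params is a hypothetical helper rather than a function from this commit.

```python
import os
from typing import Any, Dict, Mapping, Optional

DEFAULT_HNSW_EF = 128  # default value stated in this commit


def resolve_search_params(
    hnsw_ef: Optional[int] = None,
    exact: Optional[bool] = None,
    indexed_only: Optional[bool] = None,
    env: Mapping[str, str] = os.environ,
) -> Dict[str, Any]:
    """Merge per-request overrides with the service-wide default."""
    # Service-wide default comes from the environment, falling back to 128.
    env_ef = int(env.get("SEARCH_HNSW_EF", DEFAULT_HNSW_EF))
    return {
        # Assumed precedence: an explicit request value wins over the env default.
        "hnsw_ef": hnsw_ef if hnsw_ef is not None else env_ef,
        # Assumed defaults: approximate search, unindexed segments included.
        "exact": exact if exact is not None else False,
        "indexed_only": indexed_only if indexed_only is not None else False,
    }


print(resolve_search_params(env={"SEARCH_HNSW_EF": "128"}))
print(resolve_search_params(hnsw_ef=256, exact=True, env={}))
```

The resulting dict carries the same three fields this commit passes to SearchParams on every query.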
Critical usage note for existing collections:
After deploying, call POST /vectors/collections/images/ensure-indexes with the
payload fields actually used in filter_metadata (is_public, category_id, etc.)
to add missing indexes. This is the highest-impact single action for filtered search.
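A sketch of preparing that call is shown below. The request body schema is not shown in this commit, so the "fields" / "field_name" / "field_schema" keys, the schema types chosen for each field, and the gateway address are all assumptions; check the request model in qdrant/main.py before using it.

```python
import json
from typing import Dict, Tuple


def build_ensure_indexes_request(
    gateway_url: str, collection: str, fields: Dict[str, str]
) -> Tuple[str, bytes]:
    """Return (url, json_body) for the ensure-indexes call (hypothetical body shape)."""
    url = f"{gateway_url.rstrip('/')}/vectors/collections/{collection}/ensure-indexes"
    body = json.dumps(
        {
            "fields": [
                {"field_name": name, "field_schema": schema}
                for name, schema in fields.items()
            ]
        }
    ).encode("utf-8")
    return url, body


# Assumed gateway address and field types; adjust to the real deployment.
url, body = build_ensure_indexes_request(
    "http://localhost:8000",
    "images",
    {"is_public": "bool", "category_id": "integer"},
)
print(url)
print(body.decode("utf-8"))
```

The pair can then be sent with any HTTP client as a POST with Content-Type: application/json.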
@@ -416,3 +416,33 @@ async def cards_render_meta(payload: Dict[str, Any]):
     """Return crop and layout metadata for a card render (no image produced)."""
     async with httpx.AsyncClient(timeout=VISION_TIMEOUT) as client:
         return await _post_json(client, f"{CARD_RENDERER_URL}/render/meta", payload)
+
+
+# ---- Qdrant administration endpoints (index management + collection config) ----
+
+@app.get("/vectors/collections/{name}/indexes")
+async def vectors_collection_indexes(name: str):
+    """List payload indexes for a collection."""
+    async with httpx.AsyncClient(timeout=VISION_TIMEOUT) as client:
+        return await _get_json(client, f"{QDRANT_SVC_URL}/collections/{name}/indexes")
+
+
+@app.post("/vectors/collections/{name}/indexes")
+async def vectors_create_payload_index(name: str, payload: Dict[str, Any]):
+    """Create a payload index on a field in a collection."""
+    async with httpx.AsyncClient(timeout=VISION_TIMEOUT) as client:
+        return await _post_json(client, f"{QDRANT_SVC_URL}/collections/{name}/indexes", payload)
+
+
+@app.post("/vectors/collections/{name}/ensure-indexes")
+async def vectors_ensure_indexes(name: str, payload: Dict[str, Any]):
+    """Idempotently ensure payload indexes exist for a list of fields."""
+    async with httpx.AsyncClient(timeout=VISION_TIMEOUT) as client:
+        return await _post_json(client, f"{QDRANT_SVC_URL}/collections/{name}/ensure-indexes", payload)
+
+
+@app.post("/vectors/collections/{name}/configure")
+async def vectors_configure_collection(name: str, payload: Dict[str, Any]):
+    """Update HNSW and optimizer configuration for a collection."""
+    async with httpx.AsyncClient(timeout=VISION_TIMEOUT) as client:
+        return await _post_json(client, f"{QDRANT_SVC_URL}/collections/{name}/configure", payload)