# Skinbase Vision Stack (CLIP + BLIP + YOLO + Qdrant + Card Renderer + Maturity + LLM) – Dockerized FastAPI

This repository provides internal AI services for image analysis, vector search, card rendering, moderation, and text generation behind a single **Gateway API**.

## Services & Ports

- `gateway` (exposed): `https://vision.klevze.net`
- `clip`: internal only
- `blip`: internal only
- `yolo`: internal only
- `qdrant`: vector DB (port `6333` exposed for direct access)
- `qdrant-svc`: internal Qdrant API wrapper
- `card-renderer`: internal card rendering service
- `maturity`: internal NSFW/maturity classifier service
- `llm`: internal text-generation service using a thin FastAPI shim over `llama-server` (profile-based, internal only)

## Run

```bash
docker compose up -d --build
```

That starts the default vision stack only. The LLM service is disabled by default so operators are not forced to run Qwen3 on the same host.

To also start the local llama.cpp service:

```bash
docker compose --profile llm up -d --build
```

Before enabling the `llm` profile locally, place the GGUF model file described in [models/qwen3/README.md](models/qwen3/README.md) and set `LLM_ENABLED=true` in `.env`.

If you use BLIP, create a `.env` file first.

Required variables:

```bash
API_KEY=your_api_key_here
HUGGINGFACE_TOKEN=your_huggingface_token_here
```

`HUGGINGFACE_TOKEN` is required when the configured BLIP model is private, gated, or otherwise requires Hugging Face authentication.

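
A quick preflight check can catch a missing variable before `docker compose` starts building images. The sketch below is only an illustration: `check_env_file` is a hypothetical helper, not part of this repository, and it only verifies that each listed key has a non-empty value (placeholder values such as `your_api_key_here` still pass).

```bash
# Hypothetical preflight helper (not part of this repository).
# Usage: check_env_file <env-file> KEY [KEY...]
# Prints every missing or empty key to stderr and returns non-zero if any is absent.
check_env_file() {
  env_file=$1
  shift
  missing=0
  for key in "$@"; do
    # A key counts as set only if a line reads KEY=<at least one character>.
    if ! grep -Eq "^${key}=.+" "$env_file" 2>/dev/null; then
      echo "missing or empty: ${key}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

One way to use it is `check_env_file .env API_KEY HUGGINGFACE_TOKEN && docker compose up -d --build`, so the stack only starts when both variables are present.
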
Optional maturity configuration (override in `.env` if needed):

```bash
MATURITY_MODEL=Falconsai/nsfw_image_detection
MATURITY_THRESHOLD_MATURE=0.80
MATURITY_THRESHOLD_REVIEW=0.60
MATURITY_ENABLED=true
```

Optional LLM configuration:

```bash
LLM_ENABLED=false
LLM_URL=http://llm:8080
LLM_DEFAULT_MODEL=qwen3-1.7b-instruct-q4_k_m
LLM_TIMEOUT=120
LLM_MAX_TOKENS_DEFAULT=256
LLM_MAX_TOKENS_HARD_LIMIT=1024
LLM_MAX_REQUEST_BYTES=65536

# Local llm profile only
MODEL_PATH=/models/Qwen3-1.7B-Instruct-Q4_K_M.gguf
LLM_CONTEXT_SIZE=4096
LLM_THREADS=4
LLM_GPU_LAYERS=0
```

Recommended production topology for the LLM: keep the gateway on the current vision host and point `LLM_URL` at a separate private machine or VPN-reachable container host. Running the full vision stack and Qwen3 together on a small 4-core / 8 GB VPS will usually degrade both.

Service startup now waits on container healthchecks, so first boot may take longer while models finish loading.

## Health

The `curl` examples that use `$API_KEY` expect that shell variable to hold the same value as `API_KEY` in `.env`.

```bash
curl -H "X-API-Key: $API_KEY" https://vision.klevze.net/health
```

LLM-specific gateway health:

```bash
curl -H "X-API-Key: $API_KEY" https://vision.klevze.net/ai/health
```

## LLM Smoke Test

Use this checklist on a Docker-capable host after provisioning the GGUF file and setting `LLM_ENABLED=true`.

1. Start the gateway and local LLM profile.

```bash
docker compose --profile llm up -d --build gateway llm
```

2. Confirm the LLM container is running and healthy.

```bash
docker compose ps llm
docker compose logs --tail=100 llm
```

3. Check the internal LLM health contract.

```bash
curl http://127.0.0.1:8080/health
```

Expected fields: `status`, `model`, `context_size`, `threads`.

4. Check gateway health and LLM reachability.

```bash
curl -H "X-API-Key: $API_KEY" http://127.0.0.1:8003/health
curl -H "X-API-Key: $API_KEY" http://127.0.0.1:8003/ai/health
```

5. Verify model discovery through the gateway.

```bash
curl -H "X-API-Key: $API_KEY" http://127.0.0.1:8003/v1/models
```

6. Run a short non-streaming chat completion.

```bash
curl -H "X-API-Key: $API_KEY" -X POST http://127.0.0.1:8003/ai/chat \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a concise assistant for Skinbase Nova."},
      {"role": "user", "content": "Write one sentence about an artist who creates cinematic sci-fi wallpaper packs."}
    ],
    "max_tokens": 80
  }'
```

7. If anything fails, inspect the two relevant services first.

```bash
docker compose logs --tail=200 llm
docker compose logs --tail=200 gateway
```

## Universal analyze (ALL)

### With URL

```bash
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/analyze/all \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","limit":5}'
```

### With file upload (multipart)

```bash
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/analyze/all/file \
  -F "file=@/path/to/image.webp" \
  -F "limit=5"
```

## Individual services (via gateway)

### CLIP tags

```bash
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/analyze/clip \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","limit":5}'
```

### CLIP tags (file)

```bash
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/analyze/clip/file \
  -F "file=@/path/to/image.webp" \
  -F "limit=5"
```

### BLIP caption

```bash
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/analyze/blip \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","variants":3}'
```

### BLIP caption (file)

```bash
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/analyze/blip/file \
  -F "file=@/path/to/image.webp" \
  -F "variants=3" \
  -F "max_length=60"
```

### YOLO detect

```bash
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/analyze/yolo \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","conf":0.25}'
```

### YOLO detect (file)

```bash
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/analyze/yolo/file \
  -F "file=@/path/to/image.webp" \
  -F "conf=0.25"
```

## Maturity / NSFW analysis

Analyzes an image and returns a normalized maturity signal for Nova moderation workflows.

### Analyze by URL

```bash
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/analyze/maturity \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp"}'
```

### Analyze from file upload

```bash
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/analyze/maturity/file \
  -F "file=@/path/to/image.webp"
```

Example response:

```json
{
  "maturity_label": "mature",
  "confidence": 0.94,
  "score": 0.94,
  "labels": ["nsfw"],
  "model": "Falconsai/nsfw_image_detection",
  "threshold_used": 0.80,
  "analysis_time_ms": 183.0,
  "source": "maturity-service",
  "action_hint": "flag_high",
  "advisory": "High-confidence mature content detected"
}
```

`action_hint` values: `safe`, `review`, `flag_high`. Nova should use these to decide blur/queue/flag behaviour.

## Vector DB (Qdrant) via gateway

Qdrant point IDs must be either:

- an unsigned integer
- a UUID string

If you send another string value, the wrapper may replace it with a generated UUID. In that case the original value is stored in the payload as `_original_id`.

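
A client can check ahead of time whether an ID falls under the two accepted forms, and so whether the wrapper is likely to keep it or swap it for a generated UUID. The function below is a minimal sketch under stated assumptions: `is_valid_point_id` is a hypothetical name, it accepts only the canonical hyphenated UUID layout (8-4-4-4-12 hex digits), and it does not check the integer range.

```bash
# Hypothetical client-side check (not part of this repository).
# Succeeds if "$1" is a plain unsigned integer or a hyphenated UUID string.
is_valid_point_id() {
  id=$1
  case $id in
    '')        return 1 ;;  # empty value is never valid
    *[!0-9]*)  ;;           # contains a non-digit: fall through to the UUID check
    *)         return 0 ;;  # digits only: unsigned integer (range not checked)
  esac
  # A canonical UUID is exactly 36 characters: 8-4-4-4-12 hex groups.
  [ "${#id}" -eq 36 ] || return 1
  printf '%s\n' "$id" |
    grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}
```

For example, `is_valid_point_id img-001` fails, which signals that the value would end up in `_original_id` rather than as the point ID itself.
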
You can fetch a stored point by its preserved original application ID:

```bash
curl -H "X-API-Key: $API_KEY" https://vision.klevze.net/vectors/points/by-original-id/img-001
```

### Store image embedding by URL

```bash
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/vectors/upsert \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","id":"550e8400-e29b-41d4-a716-446655440000","metadata":{"category":"wallpaper"}}'
```

### Store image embedding by file upload

```bash
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/vectors/upsert/file \
  -F "file=@/path/to/image.webp" \
  -F 'id=550e8400-e29b-41d4-a716-446655440001' \
  -F 'metadata_json={"category":"photo"}'
```

### Search similar images by URL

```bash
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/vectors/search \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","limit":5,"filter_metadata":{"is_public":true}}'
```

Optional search parameters: `hnsw_ef` (int), `exact` (bool), `indexed_only` (bool), `score_threshold` (float), `filter_metadata` (object).

### Search similar images by file upload

```bash
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/vectors/search/file \
  -F "file=@/path/to/image.webp" \
  -F "limit=5" \
  -F 'filter_metadata_json={"is_public":true}'
```

### List collections

```bash
curl -H "X-API-Key: $API_KEY" https://vision.klevze.net/vectors/collections
```

### Get collection info

```bash
curl -H "X-API-Key: $API_KEY" https://vision.klevze.net/vectors/collections/images
```

### Full diagnostic inspect

```bash
curl -H "X-API-Key: $API_KEY" https://vision.klevze.net/vectors/inspect
```

Returns HNSW config, optimizer config, quantization, segment count, payload index coverage, and RAM estimate for every collection.

### Payload index management

```bash
# List indexes
curl -H "X-API-Key: $API_KEY" https://vision.klevze.net/vectors/collections/images/indexes

# Create a single index
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/vectors/collections/images/indexes \
  -H "Content-Type: application/json" \
  -d '{"field":"is_public","type":"bool"}'

# Ensure multiple indexes (idempotent)
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/vectors/collections/images/ensure-indexes \
  -H "Content-Type: application/json" \
  -d '{"fields":[{"field":"is_public","type":"bool"},{"field":"category_id","type":"integer"}]}'
```

Supported index types: `keyword`, `integer`, `float`, `bool`, `geo`, `datetime`, `text`, `uuid`.

### Collection configuration (HNSW / optimizer / quantization)

```bash
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/vectors/collections/images/configure \
  -H "Content-Type: application/json" \
  -d '{"hnsw_m":16,"hnsw_ef_construct":200,"indexing_threshold":20000,"quantization_type":"int8"}'
```

### Delete points

```bash
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/vectors/delete \
  -H "Content-Type: application/json" \
  -d '{"ids":["550e8400-e29b-41d4-a716-446655440000","550e8400-e29b-41d4-a716-446655440001"]}'
```

If you let the wrapper generate a UUID, use the returned `id` value for later `get`, `search`, or `delete` operations.

## Card Renderer

### List available templates

```bash
curl -H "X-API-Key: $API_KEY" https://vision.klevze.net/cards/templates
```

### Render a card from a URL

```bash
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/cards/render \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","title":"Artwork Title","username":"@artist","template":"nova-artwork-v1"}'
```

Returns binary image bytes (WebP by default).

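
Because the response is raw image data rather than JSON, it is usually written straight to a file. The sketch below shows one way to do that and then confirm the file really is WebP before using it. Both function names are hypothetical, `API_KEY` is assumed to be exported in the shell, and the check only inspects the 12-byte RIFF/WEBP header, so it would reject a card rendered in any non-default format.

```bash
# Hypothetical helper: succeeds if the file starts with a RIFF....WEBP header.
is_webp() {
  [ "$(head -c 4 "$1" 2>/dev/null)" = "RIFF" ] &&
    [ "$(head -c 12 "$1" 2>/dev/null | tail -c 4)" = "WEBP" ]
}

# Hypothetical wrapper around the render request: saves the card to "$1"
# and fails on an HTTP error status (--fail) or if the output is not WebP.
render_card() {
  out=$1
  curl --fail --silent --show-error \
    -H "X-API-Key: ${API_KEY}" \
    -X POST https://vision.klevze.net/cards/render \
    -H "Content-Type: application/json" \
    -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","title":"Artwork Title","username":"@artist","template":"nova-artwork-v1"}' \
    --output "$out" &&
    is_webp "$out"
}
```

Calling `render_card card.webp` then leaves a verified image in `card.webp`, or returns non-zero so a calling script can stop early.
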
### Render a card from a file upload

```bash
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/cards/render/file \
  -F "file=@/path/to/image.webp" \
  -F "title=Artwork Title" \
  -F "username=@artist" \
  -F "template=nova-artwork-v1"
```

### Get card layout metadata (no image rendered)

```bash
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/cards/render/meta \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","title":"Artwork Title"}'
```

## LLM / Chat Completions

The gateway exposes stable text-generation endpoints backed by the internal `llm` service. They reuse the existing `X-API-Key` protection and keep the LLM container internal-only.

### OpenAI-style chat endpoint

```bash
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a concise assistant for Skinbase Nova."},
      {"role": "user", "content": "Write a short creator biography for an artist who just hit 10,000 followers."}
    ],
    "temperature": 0.7,
    "max_tokens": 220,
    "stream": false
  }'
```

### Project-friendly chat endpoint

```bash
curl -H "X-API-Key: $API_KEY" -X POST https://vision.klevze.net/ai/chat \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a concise assistant for Skinbase Nova."},
      {"role": "user", "content": "Suggest metadata tags for a cyberpunk wallpaper pack."}
    ],
    "max_tokens": 180
  }'
```

### List models

```bash
curl -H "X-API-Key: $API_KEY" https://vision.klevze.net/v1/models
curl -H "X-API-Key: $API_KEY" https://vision.klevze.net/ai/models
```

## Notes

- Models are loaded at service startup; initial container start can take 1–2 minutes as model weights are downloaded.
- Qdrant data is persisted in the project folder at `./data/qdrant`, so it survives container restarts and recreates.
- The local `llm` profile does **not** auto-download Qwen3 weights. Mount the GGUF file explicitly and let startup fail fast if it is missing.
- Remote image URLs are restricted to public `http`/`https` hosts. Localhost, private IP ranges, and non-image content types are rejected.
- The maturity service uses `Falconsai/nsfw_image_detection` (ViT-based). Thresholds are configurable via `.env`. The model handles photos and stylized digital art but should be calibrated against real Skinbase content before production use.
- For small VPS deployments, prefer `LLM_ENABLED=true` with `LLM_URL` pointing to a separate LLM host instead of running the `llm` profile on the same machine.
- For production: add auth, rate limits, and restrict gateway exposure (private network).
- GPU: you can add NVIDIA runtime later (compose profiles) if needed.
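
When calibrating the maturity thresholds mentioned above, it can help to preview how a given score would be bucketed under different settings. The sketch below is an assumption about the mapping, not the service's confirmed logic: it treats scores at or above `MATURITY_THRESHOLD_MATURE` as `flag_high`, scores at or above `MATURITY_THRESHOLD_REVIEW` as `review`, and everything else as `safe`. The function name `maturity_action_hint` is hypothetical, and the defaults mirror the optional `.env` values shown earlier.

```bash
# Hypothetical calibration helper (assumed mapping, not the service's code).
# Usage: maturity_action_hint <score>
# Reads thresholds from the environment, falling back to the README defaults.
maturity_action_hint() {
  score=$1
  mature=${MATURITY_THRESHOLD_MATURE:-0.80}
  review=${MATURITY_THRESHOLD_REVIEW:-0.60}
  # awk handles the floating-point comparison; "+ 0" forces numeric context.
  awk -v s="$score" -v m="$mature" -v r="$review" 'BEGIN {
    if (s + 0 >= m + 0) { print "flag_high" }
    else if (s + 0 >= r + 0) { print "review" }
    else { print "safe" }
  }'
}
```

Running `MATURITY_THRESHOLD_REVIEW=0.50 maturity_action_hint 0.55` shows how lowering the review threshold would move a borderline score into the review queue. The `action_hint` returned by the service remains the authoritative value.
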