llm: add FastAPI shim, gateway LLM endpoints, tests, and docs

2026-04-12 09:41:21 +02:00
parent baf497b015
commit 59c9584250
15 changed files with 1779 additions and 11 deletions

models/qwen3/README.md Normal file

@@ -0,0 +1,9 @@
Place the Qwen3 GGUF model file for the local llm profile in this directory.
Expected default filename:
- `Qwen3-1.7B-Instruct-Q4_K_M.gguf`
If you use a different filename, set `MODEL_PATH` in `.env` to the file's mounted path inside the container.
The model is intentionally not auto-downloaded at startup. Operators should provision it explicitly so container startup is predictable.
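Because the model is not downloaded automatically, a service can check for the file at startup and fail with a clear message if it is missing. The following is a minimal sketch of such a check, not the shim's actual code. The function name `resolve_model_path` and the default path `/models/qwen3/...` are assumptions made for illustration; only `MODEL_PATH` and the default filename come from this README.

```python
import os
from pathlib import Path

# Hypothetical default: the real mount point inside the container is not
# stated in this README, so adjust this to match your volume mapping.
DEFAULT_MODEL_PATH = "/models/qwen3/Qwen3-1.7B-Instruct-Q4_K_M.gguf"


def resolve_model_path(env=None):
    """Return the configured model path, or raise if the file is missing."""
    # Accept an explicit mapping so the check can be exercised without
    # touching the real process environment.
    env = os.environ if env is None else env
    path = Path(env.get("MODEL_PATH") or DEFAULT_MODEL_PATH)
    if not path.is_file():
        raise FileNotFoundError(
            f"Model file not found at {path}. Place the GGUF file in "
            "models/qwen3/ or set MODEL_PATH in .env."
        )
    return path
```

Calling `resolve_model_path()` once during application startup surfaces a missing or misnamed model file immediately, rather than on the first inference request.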