llm: add FastAPI shim, gateway LLM endpoints, tests, and docs

2026-04-12 09:41:21 +02:00
parent baf497b015
commit 59c9584250
15 changed files with 1779 additions and 11 deletions
--- a/models/qwen3/README.md
+++ b/models/qwen3/README.md
@@ -0,0 +1,9 @@
+Place the Qwen3 GGUF model file for the local llm profile in this directory.
+
+Expected default filename:
+
+- `Qwen3-1.7B-Instruct-Q4_K_M.gguf`
+
+If you use a different filename, set `MODEL_PATH` in `.env` to the file's mounted path inside the container.
+
+The model is intentionally not auto-downloaded at startup. Operators should provision it explicitly so container startup is predictable.