feat(ai-runtime): complete ai runtime policy refactor (ADR-035)
@@ -57,6 +57,12 @@ OLLAMA_EMBED_MODEL=nomic-embed-text
 OLLAMA_RAG_MODEL=typhoon2.5-np-dms:latest
 OLLAMA_URL=http://192.168.10.8:11434
 
+# VRAM, Residency & Concurrency settings (Feature-235 AI Runtime Policy)
+AI_VRAM_HEADROOM_THRESHOLD_MB=3000
+AI_GPU_MAIN_MODEL_PRESSURE_THRESHOLD_MB=12000
+AI_OCR_RESIDENCY_WINDOW_SECONDS=120
+AI_REALTIME_CONCURRENCY=2
+
 # Qdrant (ADR-023A)
 QDRANT_HOST=http://192.168.10.8:6333
 QDRANT_COLLECTION=lcbp3_documents
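
The four AI runtime settings in this hunk are plain integers, so a consumer would likely want to parse and validate them once at startup. The following is a minimal sketch of that idea, not the project's actual loader: the `AiRuntimePolicy` class, the `load_ai_runtime_policy` function, the fallback to the values shown in the hunk, and the "must be a positive integer" rule are all assumptions made for illustration. Only the variable names and their values come from the diff.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Fallback values copied from the hunk above. Using them as defaults
# is an assumption; the real service may require the variables to be set.
_DEFAULTS = {
    "AI_VRAM_HEADROOM_THRESHOLD_MB": 3000,
    "AI_GPU_MAIN_MODEL_PRESSURE_THRESHOLD_MB": 12000,
    "AI_OCR_RESIDENCY_WINDOW_SECONDS": 120,
    "AI_REALTIME_CONCURRENCY": 2,
}


@dataclass(frozen=True)
class AiRuntimePolicy:
    """Hypothetical container for the four settings (name is illustrative)."""

    vram_headroom_threshold_mb: int
    gpu_main_model_pressure_threshold_mb: int
    ocr_residency_window_seconds: int
    realtime_concurrency: int


def _positive_int(env: Mapping[str, str], key: str) -> int:
    """Read `key` from `env`, falling back to the default when unset or empty."""
    raw = env.get(key)
    if raw is None or raw.strip() == "":
        return _DEFAULTS[key]
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"{key} must be an integer, got {raw!r}") from None
    if value <= 0:
        # Assumed rule: none of these settings is meaningful at zero or below.
        raise ValueError(f"{key} must be positive, got {value}")
    return value


def load_ai_runtime_policy(env: Mapping[str, str] = os.environ) -> AiRuntimePolicy:
    """Build the policy object from an environment-like mapping."""
    return AiRuntimePolicy(
        vram_headroom_threshold_mb=_positive_int(env, "AI_VRAM_HEADROOM_THRESHOLD_MB"),
        gpu_main_model_pressure_threshold_mb=_positive_int(
            env, "AI_GPU_MAIN_MODEL_PRESSURE_THRESHOLD_MB"
        ),
        ocr_residency_window_seconds=_positive_int(env, "AI_OCR_RESIDENCY_WINDOW_SECONDS"),
        realtime_concurrency=_positive_int(env, "AI_REALTIME_CONCURRENCY"),
    )
```

Calling `load_ai_runtime_policy()` with no argument reads `os.environ`; passing a plain dict instead makes the parsing easy to exercise in tests. Failing fast on a malformed value is a design choice of this sketch, so a typo in the env file surfaces at startup rather than during a request.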