// File: specs/200-fullstacks/235-ai-runtime-policy-refactor/quickstart.md
// Change Log:
// - 2026-06-11: Verification quickstart for AI Runtime Policy Refactor

Quickstart: AI Runtime Policy Refactor — Verification Guide

Prerequisites

  • Backend running (pnpm run start:dev in backend/)
  • OCR sidecar running on Desk-5439 (docker compose up in ocr-sidecar/)
  • Ollama running with np-dms-ai and np-dms-ocr tags registered
  • Admin user token available in $TOKEN, plus a non-admin token in $NON_ADMIN_TOKEN for check 1D

Gate 1: Policy Contract Verification

1A. Reject model.key (should return 400)

curl -X POST http://localhost:3001/api/ai/jobs \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"type": "rag-query", "model": {"key": "typhoon2.5-np-dms:latest"}}' \
  | jq '.statusCode, .message'
# Expected: 400, with a message stating that model.key is not allowed

1B. Reject parameter overrides (should return 400)

curl -X POST http://localhost:3001/api/ai/jobs \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"type": "rag-query", "temperature": 0.9}' \
  | jq '.statusCode'
# Expected: 400

1C. Valid executionProfile (should return 201)

curl -X POST http://localhost:3001/api/ai/jobs \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"type": "rag-query", "executionProfile": "balanced", "documentPublicId": "<uuid>"}' \
  | jq '.data.modelUsed'
# Expected: "np-dms-ai"

1D. large-context by non-admin (should return 403)

curl -X POST http://localhost:3001/api/ai/jobs \
  -H "Authorization: Bearer $NON_ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"type": "rag-query", "executionProfile": "large-context"}' \
  | jq '.statusCode'
# Expected: 403
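
The four checks above can also be run as one table-driven pass. The following is a minimal sketch, not part of the project: the helper names `http_send` and `run_cases` are hypothetical, the payloads and expected status codes are copied from checks 1A to 1D, and it assumes `$TOKEN` belongs to an admin user. The `<uuid>` placeholder must be replaced with a real document public ID before running it against a backend.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://localhost:3001/api/ai/jobs"  # endpoint used by the curl commands

# (label, payload, token kind, expected HTTP status), copied from checks 1A-1D
GATE1_CASES = [
    ("1A model.key rejected",
     {"type": "rag-query", "model": {"key": "typhoon2.5-np-dms:latest"}}, "admin", 400),
    ("1B parameter override rejected",
     {"type": "rag-query", "temperature": 0.9}, "admin", 400),
    ("1C valid executionProfile",
     {"type": "rag-query", "executionProfile": "balanced",
      "documentPublicId": "<uuid>"}, "admin", 201),
    ("1D large-context by non-admin",
     {"type": "rag-query", "executionProfile": "large-context"}, "non-admin", 403),
]


def http_send(payload, token):
    """POST the payload as JSON and return the HTTP status code."""
    req = urllib.request.Request(
        BASE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # urllib raises on 4xx/5xx; the code is what we compare


def run_cases(cases, tokens, send=http_send):
    """Return (label, expected, actual) for every case whose status differs."""
    failures = []
    for label, payload, token_kind, expected in cases:
        actual = send(payload, tokens[token_kind])
        if actual != expected:
            failures.append((label, expected, actual))
    return failures
```

Against a running backend, `run_cases(GATE1_CASES, {"admin": os.environ["TOKEN"], "non-admin": os.environ["NON_ADMIN_TOKEN"]})` should return an empty list if Gate 1 holds. The `send` parameter is injectable so the table can be exercised without a server.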

Gate 2: Canonical Naming Verification

2A. Check audit log after job

SELECT metadata->>'$.modelUsed' FROM ai_audit_logs ORDER BY created_at DESC LIMIT 1;
-- Expected: "np-dms-ai" (not "typhoon2.5-np-dms:latest")

2B. Check Admin Console (Manual)

  1. Open /admin/ai in a browser
  2. Verify that all model labels show np-dms-ai and np-dms-ocr
  3. Verify that no typhoon* name appears in the UI
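
The naming rule checked in this gate can be written as a small predicate, which is handy for scanning exported audit rows or copied UI text. This is a sketch under assumptions: the function names are hypothetical, the canonical set contains only the two tags this guide mentions, and the regular expression is one guess at what counts as a legacy `typhoon*` name.

```python
import re

# The two canonical tags named in this guide.
CANONICAL_MODELS = {"np-dms-ai", "np-dms-ocr"}

# Assumed shape of a legacy name: "typhoon" followed by tag-like characters.
LEGACY_NAME = re.compile(r"typhoon[\w.:-]*", re.IGNORECASE)


def is_canonical(model_used):
    """True if the recorded model name is one of the canonical tags."""
    return model_used in CANONICAL_MODELS


def find_legacy_names(text):
    """Return every legacy-looking model name found in the text."""
    return LEGACY_NAME.findall(text)
```

For example, passing the latest `modelUsed` value from the audit log to `is_canonical` should give `True`, and `find_legacy_names` over text copied from `/admin/ai` should give an empty list.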

Gate 3: Adaptive OCR Residency Verification

3A. OCR under large-context profile

# Submit an OCR job while a large-context job is active
# Then check the sidecar log
docker logs ocr-sidecar --tail 20
# Expected log line: keep_alive=0 reason=large-context-active

3B. OCR with sufficient headroom

# Submit an OCR job when GPU headroom is high (no heavy model loaded)
docker logs ocr-sidecar --tail 20
# Expected log line: keep_alive=120 reason=headroom-sufficient
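
Reading the log by eye works, but the expected lines can also be matched programmatically. The sketch below assumes only what the expected lines show: that the sidecar writes space-separated `key=value` tokens somewhere on the line. The helper names `parse_fields` and `has_decision` are hypothetical.

```python
def parse_fields(line):
    """Collect key=value tokens from one log line into a dict."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep and key:  # skip tokens without "=" or with an empty key
            fields[key] = value
    return fields


def has_decision(log_text, **expected):
    """True if any single line carries all of the expected key=value pairs."""
    for line in log_text.splitlines():
        fields = parse_fields(line)
        if all(fields.get(k) == v for k, v in expected.items()):
            return True
    return False
```

The log text could come from `subprocess.run(["docker", "logs", "ocr-sidecar", "--tail", "20"], capture_output=True, text=True)`, joining `stdout` and `stderr` because `docker logs` replays the container's stderr on stderr. Check 3A then becomes `has_decision(text, keep_alive="0", reason="large-context-active")`, and 3B uses `keep_alive="120", reason="headroom-sufficient"`.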

Gate 4: Retrieval CPU Fallback Verification

4A. Force GPU pressure then run RAG

# 1. Force load large model
curl http://localhost:11434/api/generate -d '{"model":"np-dms-ai","prompt":"warmup","keep_alive":-1}'

# 2. Run RAG query
curl -X POST http://localhost:3001/api/ai/jobs \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"type":"rag-query","executionProfile":"balanced","documentPublicId":"<uuid>"}' \
  | jq '.data.status'
# Expected: "completed" (must not fail)

# 3. Check the sidecar log
docker logs ocr-sidecar --tail 20
# Expected: device=cpu reason=gpu-headroom-below-threshold
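
Before step 2 it can help to confirm that the warmup in step 1 really left the model resident, since the fallback is only exercised under GPU pressure. The sketch below assumes Ollama's `/api/ps` endpoint returns running models under a `models` array with a `name` field such as `np-dms-ai:latest`; check that shape against the Ollama version in use. The helper names are hypothetical.

```python
import json
import urllib.request

OLLAMA_PS_URL = "http://localhost:11434/api/ps"  # same host and port as the warmup call


def fetch_running_models(url=OLLAMA_PS_URL):
    """Fetch the list of currently loaded models from Ollama as a dict."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def is_resident(ps_response, model_name):
    """True if a running model is named model_name or model_name:<tag>."""
    for entry in ps_response.get("models", []):
        name = entry.get("name", "")
        if name == model_name or name.split(":", 1)[0] == model_name:
            return True
    return False
```

With Ollama running, `is_resident(fetch_running_models(), "np-dms-ai")` should be `True` after the warmup request. If it is `False`, the RAG query in step 2 may not be running under the pressure this gate is meant to test.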

Automated Test Suite

# Run from the repository root; each subshell leaves the working directory unchanged.

# Backend unit + integration tests
(cd backend && pnpm test -- --testPathPattern="ai-policy|ocr-residency|execution-profile")

# Sidecar tests
(cd specs/04-Infrastructure-OPS/04-00-docker-compose/Desk-5439/ocr-sidecar && pytest tests/ -v)

All tests must pass before the cutover gate is considered complete.