// File: specs/200-fullstacks/235-ai-runtime-policy-refactor/quickstart.md
// Change Log:
// - 2026-06-11: Verification quickstart for AI Runtime Policy Refactor
# Quickstart: AI Runtime Policy Refactor — Verification Guide
## Prerequisites
- Backend running (`pnpm run start:dev` in `backend/`)
- OCR sidecar running on Desk-5439 (`docker compose up` in `ocr-sidecar/`)
- Ollama running with `np-dms-ai` and `np-dms-ocr` tags registered
- Admin user token available
---
## Gate 1: Policy Contract Verification
### 1A. Reject model.key (should return 400)
```bash
curl -X POST http://localhost:3001/api/ai/jobs \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"type": "rag-query", "model": {"key": "typhoon2.5-np-dms:latest"}}' \
| jq '.statusCode, .message'
# Expected: 400, with a message stating that model.key is not allowed
```
### 1B. Reject parameter overrides (should return 400)
```bash
curl -X POST http://localhost:3001/api/ai/jobs \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"type": "rag-query", "temperature": 0.9}' \
| jq '.statusCode'
# Expected: 400
```
### 1C. Valid executionProfile (should return 201)
```bash
curl -X POST http://localhost:3001/api/ai/jobs \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"type": "rag-query", "executionProfile": "balanced", "documentPublicId": "<uuid>"}' \
| jq '.data.modelUsed'
# Expected: "np-dms-ai"
```
### 1D. large-context by non-admin (should return 403)
```bash
curl -X POST http://localhost:3001/api/ai/jobs \
-H "Authorization: Bearer $NON_ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"type": "rag-query", "executionProfile": "large-context"}' \
| jq '.statusCode'
# Expected: 403
```
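The four checks above read `.statusCode` from the response body, which only works if the error body carries that field. A minimal sketch of an alternative that compares the HTTP status line instead is shown below. The function names `check_status` and `post_job_status` and the `API_BASE` variable are hypothetical helpers introduced for illustration; they are not part of the project.

```bash
# Hypothetical helper: compare an observed HTTP status with the expected one.
# Prints PASS/FAIL and returns non-zero on mismatch.
check_status() {
  local label="$1" expected="$2" actual="$3"
  if [ "$actual" = "$expected" ]; then
    echo "PASS $label ($actual)"
  else
    echo "FAIL $label (expected $expected, got $actual)"
    return 1
  fi
}

# Hypothetical wrapper: POST a JSON body to the jobs endpoint and print
# only the HTTP status code (curl's -w '%{http_code}' write-out variable).
# API_BASE is an assumed override; it defaults to the URL used in this guide.
post_job_status() {
  local token="$1" body="$2"
  curl -s -o /dev/null -w '%{http_code}' \
    -X POST "${API_BASE:-http://localhost:3001}/api/ai/jobs" \
    -H "Authorization: Bearer $token" \
    -H "Content-Type: application/json" \
    -d "$body"
}
```

For example, check 1A could then be run as `check_status 1A 400 "$(post_job_status "$TOKEN" '{"type": "rag-query", "model": {"key": "typhoon2.5-np-dms:latest"}}')"`.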
---
## Gate 2: Canonical Naming Verification
### 2A. Check audit log after job
```sql
SELECT metadata->>'$.modelUsed' FROM ai_audit_logs ORDER BY created_at DESC LIMIT 1;
-- Expected: "np-dms-ai" (not "typhoon2.5-np-dms:latest")
```
### 2B. Check Admin Console (Manual)
1. Open `/admin/ai` in a browser
2. Verify that all model labels show `np-dms-ai` and `np-dms-ocr`
3. Verify that no `typhoon*` names appear in the UI
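The naming rule in this gate (only `np-dms-ai` and `np-dms-ocr` are acceptable) can also be checked mechanically against any list of model names, for instance values exported from the audit log. The sketch below assumes one name per line on stdin; `find_noncanonical_models` is a hypothetical helper name, not an existing project script.

```bash
# Hypothetical helper: reads one model name per line on stdin, prints each
# name that is not a canonical tag, and returns non-zero if any were found.
find_noncanonical_models() {
  local name bad=0
  while IFS= read -r name || [ -n "$name" ]; do
    case "$name" in
      ''|np-dms-ai|np-dms-ocr) ;;   # blank line or canonical tag: accept
      *) echo "non-canonical: $name"; bad=1 ;;
    esac
  done
  return "$bad"
}
```

Usage: `printf 'np-dms-ai\ntyphoon2.5-np-dms:latest\n' | find_noncanonical_models` prints the legacy name and exits non-zero.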
---
## Gate 3: Adaptive OCR Residency Verification
### 3A. OCR under large-context profile
```bash
# Submit an OCR job while a large-context job is active
# Then check the sidecar log
docker logs ocr-sidecar --tail 20
# Expected log line: keep_alive=0 reason=large-context-active
```
### 3B. OCR with sufficient headroom
```bash
# Submit an OCR job when GPU headroom is high (no heavy model loaded)
docker logs ocr-sidecar --tail 20
# Expected log line: keep_alive=120 reason=headroom-sufficient
```
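Reading the last 20 log lines by eye is easy to get wrong, and a plain substring search for `keep_alive=1` would also match `keep_alive=120`. A minimal sketch of a whole-token check follows. It assumes the sidecar writes space-separated `key=value` tokens on a single line, as in the expected lines above; `log_has_tokens` is a hypothetical helper name.

```bash
# Hypothetical helper: reads log text on stdin and succeeds if at least one
# line contains every expected token as a whole, space-delimited word.
log_has_tokens() {
  local line token ok
  while IFS= read -r line || [ -n "$line" ]; do
    ok=1
    for token in "$@"; do
      case " $line " in
        *" $token "*) ;;          # token present on this line
        *) ok=0; break ;;         # token missing: try the next line
      esac
    done
    if [ "$ok" -eq 1 ]; then
      return 0
    fi
  done
  return 1
}
```

Usage for 3A: `docker logs ocr-sidecar --tail 20 2>&1 | log_has_tokens keep_alive=0 reason=large-context-active && echo OK`. The `2>&1` is there because `docker logs` sends the container's stderr stream to stderr.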
---
## Gate 4: Retrieval CPU Fallback Verification
### 4A. Force GPU pressure then run RAG
```bash
# 1. Force load large model
curl http://localhost:11434/api/generate -d '{"model":"np-dms-ai","prompt":"warmup","keep_alive":-1}'
# 2. Run RAG query
curl -X POST http://localhost:3001/api/ai/jobs \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"type":"rag-query","executionProfile":"balanced","documentPublicId":"<uuid>"}' \
| jq '.data.status'
# Expected: "completed" (must not fail)
# 3. Check the sidecar log
docker logs ocr-sidecar --tail 20
# Expected: device=cpu reason=gpu-headroom-below-threshold
```
---
## Automated Test Suite
```bash
# Backend unit + integration tests
cd backend
pnpm test -- --testPathPattern="ai-policy|ocr-residency|execution-profile"
# Sidecar tests (the path is relative to the repository root, hence ../ after cd backend)
cd ../specs/04-Infrastructure-OPS/04-00-docker-compose/Desk-5439/ocr-sidecar
pytest tests/ -v
```
**All tests must pass** before the cutover gate is considered complete.