6.5 KiB
Implementation Plan: AI Model Revision (ADR-023A)
Branch: main | Date: 2026-05-15 | Spec: spec.md
Feature Dir: specs/300-others/302-ai-model-revision/
Summary
Implement ADR-023A AI Architecture Revision: เปลี่ยน model stack จาก 3-model (gemma4:9b + Typhoon + nomic-embed-text) เป็น 2-model (gemma4:e4b Q8_0 + nomic-embed-text), แยก BullMQ เป็น 2 queues (ai-realtime/ai-batch), เพิ่ม OCR auto-detection, enforce multi-tenant QdrantService, implement Legacy Migration pipeline และ migration_review_queue, และลบ Typhoon Cloud API ออกจาก codebase ทั้งหมด
Technical Context
Language/Version: TypeScript 5.x (strict mode) Primary Dependencies:
- Backend: NestJS 10, BullMQ 5, TypeORM 0.3, ioredis (Redis 7), @qdrant/js-client-rest
- AI Infrastructure: Ollama (Desk-5439), PaddleOCR, PyMuPDF (Python sidecar)
- Queue: Redis 7 (same instance as existing BullMQ) Storage: MariaDB (existing) + Qdrant (external vector DB) + Local Storage (existing) Testing: Jest (NestJS unit/integration) Target Platform: QNAP NAS (NestJS container) + Admin Desktop Desk-5439 (Ollama) Performance Goals: ai-suggest < 30s; rag-query < 10s (p95 dequeue-to-response) Constraints: VRAM ≤ 5GB peak, concurrency=1 per queue (prevent GPU overflow) Scale/Scope: ~20,000 legacy docs (migration), ~50 new docs/day (production)
Constitution Check
GATE: Must pass before Phase 0 research.
| Rule | Status | Notes |
|---|---|---|
| ADR-019 UUID: no parseInt on UUID | ✅ PASS | BullMQ payloads ใช้ publicId: string เสมอ |
| ADR-009: no TypeORM migrations | ✅ PASS | migration_review_queue ผ่าน SQL delta (#14) |
| ADR-016: RBAC on all endpoints | ✅ PASS | AI endpoints จะมี CASL guard: ai.manage |
| ADR-007: error handling layered | ✅ PASS | BullMQ failed jobs → dead-letter + log |
| ADR-008: BullMQ for async | ✅ PASS | Inference ทั้งหมดผ่าน BullMQ (ไม่มี inline) |
| ADR-023/023A: no direct Ollama | ✅ PASS | n8n → DMS API → BullMQ → Ollama เท่านั้น |
| ADR-023A: QdrantService required projectPublicId | ✅ PASS | Enforce ที่ TypeScript compile-time |
TypeScript strict: no any, no console.log |
✅ PASS | Enforced ผ่าน eslint |
| Typhoon Cloud API removal | ⚠️ PENDING | rag/typhoon.service.ts ต้อง delete (T002) |
Project Structure
Documentation (this feature)
specs/300-others/302-ai-model-revision/
├── spec.md ✅ done
├── plan.md ✅ this file
├── research.md ✅ done
├── data-model.md ✅ done
├── quickstart.md (Phase 1)
├── contracts/ (Phase 1)
│ ├── ai-jobs.yaml
│ └── migration-queue.yaml
├── checklists/
│ └── requirements.md ✅ done
└── tasks.md (Phase 2 — speckit-tasks)
Schema Delta (ADR-009)
specs/03-Data-and-Storage/deltas/
└── 14-add-migration-review-queue.sql # new
Source Code
backend/src/modules/ai/
├── ai.module.ts # update: register 2 queues, remove Typhoon
├── ai.controller.ts # update: add /migration/queue endpoint
├── ai.service.ts # update: routing logic, queue selection
├── processors/
│ ├── ai-realtime.processor.ts # new: ai-realtime consumer
│ └── ai-batch.processor.ts # new: ai-batch consumer (replaces existing)
├── services/
│ ├── ollama.service.ts # update: model → gemma4:e4b
│ ├── qdrant.service.ts # update: enforce projectPublicId param
│ ├── ocr.service.ts # new: OCR auto-detect + PaddleOCR routing
│ ├── migration.service.ts # new: Legacy Migration pipeline
│ └── embedding.service.ts # new: full-doc chunking + embed
├── dto/
│ ├── create-ai-job.dto.ts # update: queue discriminator field
│ ├── migration-queue-item.dto.ts # new
│ └── rag-query.dto.ts # new
├── entities/
│ └── migration-review-queue.entity.ts # new
└── rag/
├── rag.service.ts # update: remove typhoon ref, use QdrantService
└── typhoon.service.ts # DELETE ← Tier 1 critical
backend/src/config/
└── bullmq.config.ts # update: add ai-batch queue config
frontend/app/(dashboard)/ai-staging/
├── page.tsx # update: add migration queue tab
└── migration-review/
└── page.tsx # new: Admin Migration Review UI
frontend/components/ai/
├── ai-suggestion-field.tsx # update: confidence threshold display
├── migration-queue-table.tsx # new: queue list + approve/reject
└── AiStatusBanner.tsx # update: show queue status (ai-batch paused)
Phases
Phase 0: Cleanup & Foundation (Tier 1 Critical First)
Goal: ลบ Typhoon ออก, ตั้ง BullMQ 2-queue, สร้าง Schema Delta
Tasks: T001–T008
Phase 1: Core AI Pipeline
Goal: OCR auto-detect, gemma4:e4b integration, ai-suggest + embed-document flows
Tasks: T009–T022
Phase 2: RAG Pipeline
Goal: QdrantService multi-tenancy, chunking, rag-query endpoint
Tasks: T023–T030
Phase 3: Legacy Migration Pipeline
Goal: migration_review_queue, n8n API endpoint, Admin Review UI
Tasks: T031–T042
Phase 4: Monitoring & Threshold Management
Goal: Admin Dashboard AI metrics, threshold config, audit log delete permission
Tasks: T043–T050
Complexity Tracking
| Violation | Why Needed | Simpler Alternative Rejected Because |
|---|---|---|
| 2-queue BullMQ (vs single) | RAG SLA requires isolation from batch jobs | Single queue + priority ไม่ป้องกัน long-running job block |
| External Qdrant (vs SQL FTS) | Semantic search capability ไม่มีใน MariaDB FULLTEXT | MariaDB FTS ไม่รองรับ multilingual semantic similarity |
| Python sidecar OCR | PaddleOCR เป็น Python library ไม่มี Node.js binding | ไม่มีทางเลือก OCR ภาษาไทยที่เทียบเท่าใน Node.js ecosystem |