# Implementation Plan: AI Model Revision (ADR-023A)
**Branch**: `main` | **Date**: 2026-05-15 | **Spec**: [spec.md](./spec.md)
**Feature Dir**: `specs/300-others/302-ai-model-revision/`
---
## Summary
Implement the ADR-023A AI Architecture Revision: change the model stack from 3 models (gemma4:9b + Typhoon + nomic-embed-text) to 2 models (gemma4:e2b + nomic-embed-text), split BullMQ into 2 queues (`ai-realtime`/`ai-batch`), add OCR auto-detection, enforce multi-tenancy in QdrantService, implement the Legacy Migration pipeline and `migration_review_queue`, and remove the Typhoon Cloud API from the entire codebase.
---
## Technical Context
**Language/Version**: TypeScript 5.x (strict mode)
**Primary Dependencies**:
- Backend: NestJS 10, BullMQ 5, TypeORM 0.3, ioredis (Redis 7), @qdrant/js-client-rest
- AI Infrastructure: Ollama (Desk-5439), PaddleOCR, PyMuPDF (Python sidecar)
- Queue: Redis 7 (same instance as existing BullMQ)
**Storage**: MariaDB (existing) + Qdrant (external vector DB) + Local Storage (existing)
**Testing**: Jest (NestJS unit/integration)
**Target Platform**: QNAP NAS (NestJS container) + Admin Desktop Desk-5439 (Ollama)
**Performance Goals**: ai-suggest < 30s; rag-query < 10s (p95 dequeue-to-response)
**Constraints**: VRAM ≤ 3GB peak, concurrency=1 per queue (prevent GPU overflow)
**Scale/Scope**: ~20,000 legacy docs (migration), ~50 new docs/day (production)
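
The performance goals above are stated as p95 targets measured from dequeue to response. The following is a minimal sketch of how such a target could be checked against recorded latencies. The names `P95_TARGET_MS`, `percentile`, and `meetsP95Target` are hypothetical, and the nearest-rank percentile method is an assumption, since the plan does not say how p95 is computed.

```typescript
// Latency targets from this plan, in milliseconds (p95, dequeue-to-response).
const P95_TARGET_MS = {
  "ai-suggest": 30_000,
  "rag-query": 10_000,
} as const;

type SlaJobType = keyof typeof P95_TARGET_MS;

// Nearest-rank percentile (assumed method): the smallest sample such that
// at least `pct` percent of all samples are less than or equal to it.
function percentile(samplesMs: readonly number[], pct: number): number {
  if (samplesMs.length === 0) {
    throw new Error("percentile() needs at least one sample");
  }
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((pct * sorted.length) / 100));
  return sorted[rank - 1];
}

// True when the observed p95 is strictly below the target for that job type.
function meetsP95Target(jobType: SlaJobType, samplesMs: readonly number[]): boolean {
  return percentile(samplesMs, 95) < P95_TARGET_MS[jobType];
}
```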
---
## Constitution Check
_GATE: Must pass before Phase 0 research._
| Rule | Status | Notes |
|------|--------|-------|
| ADR-019 UUID: no parseInt on UUID | ✅ PASS | BullMQ payloads always use `publicId: string` |
| ADR-009: no TypeORM migrations | ✅ PASS | `migration_review_queue` is added via SQL delta (#14) |
| ADR-016: RBAC on all endpoints | ✅ PASS | AI endpoints will have a CASL guard: `ai.manage` |
| ADR-007: error handling layered | ✅ PASS | BullMQ failed jobs → dead-letter + log |
| ADR-008: BullMQ for async | ✅ PASS | All inference goes through BullMQ (no inline calls) |
| ADR-023/023A: no direct Ollama | ✅ PASS | n8n → DMS API → BullMQ → Ollama only |
| ADR-023A: QdrantService required projectPublicId | ✅ PASS | Enforced at TypeScript compile time |
| TypeScript strict: no `any`, no `console.log` | ✅ PASS | Enforced via eslint |
| **Typhoon Cloud API removal** | ⚠️ PENDING | `rag/typhoon.service.ts` must be deleted (T002) |
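
The ADR-019 and ADR-023A rows above describe two compile-time guarantees: UUIDs stay strings, and every Qdrant call carries a `projectPublicId`. One way this could look is sketched below. The branded type, the helper names, and the payload key `"projectPublicId"` are all assumptions for illustration, not the project's actual QdrantService API. The filter object follows Qdrant's `must` / `match.value` payload filter shape.

```typescript
// Hypothetical branded type: a plain string cannot be passed where a
// ProjectPublicId is required, so the tenant scope cannot be skipped by accident.
type ProjectPublicId = string & { readonly __brand: "ProjectPublicId" };

const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Validates the UUID as a string; it is never parsed as a number (ADR-019).
function toProjectPublicId(raw: string): ProjectPublicId {
  if (!UUID_PATTERN.test(raw)) {
    throw new Error("projectPublicId must be a UUID string");
  }
  return raw as ProjectPublicId;
}

// Shape of a Qdrant payload filter with exact-match conditions.
interface TenantFilter {
  must: Array<{ key: string; match: { value: string } }>;
}

// The parameter is required, so omitting it is a compile-time error.
// The payload key name "projectPublicId" is an assumption.
function buildTenantFilter(projectPublicId: ProjectPublicId): TenantFilter {
  return { must: [{ key: "projectPublicId", match: { value: projectPublicId } }] };
}
```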
---
## Project Structure
### Documentation (this feature)
```text
specs/300-others/302-ai-model-revision/
├── spec.md ✅ done
├── plan.md ✅ this file
├── research.md ✅ done
├── data-model.md ✅ done
├── quickstart.md (Phase 1)
├── contracts/ (Phase 1)
│ ├── ai-jobs.yaml
│ └── migration-queue.yaml
├── checklists/
│ └── requirements.md ✅ done
└── tasks.md (Phase 2 — speckit-tasks)
```
### Schema Delta (ADR-009)
```text
specs/03-Data-and-Storage/deltas/
└── 14-add-migration-review-queue.sql # new
```
### Source Code
```text
backend/src/modules/ai/
├── ai.module.ts # update: register 2 queues, remove Typhoon
├── ai.controller.ts # update: add /migration/queue endpoint
├── ai.service.ts # update: routing logic, queue selection
├── processors/
│ ├── ai-realtime.processor.ts # new: ai-realtime consumer
│ └── ai-batch.processor.ts # new: ai-batch consumer (replaces existing)
├── services/
│ ├── ollama.service.ts # update: model → gemma4:e2b
│ ├── qdrant.service.ts # update: enforce projectPublicId param
│ ├── ocr.service.ts # new: OCR auto-detect + PaddleOCR routing
│ ├── migration.service.ts # new: Legacy Migration pipeline
│ └── embedding.service.ts # new: full-doc chunking + embed
├── dto/
│ ├── create-ai-job.dto.ts # update: queue discriminator field
│ ├── migration-queue-item.dto.ts # new
│ └── rag-query.dto.ts # new
├── entities/
│ └── migration-review-queue.entity.ts # new
└── rag/
├── rag.service.ts # update: remove typhoon ref, use QdrantService
└── typhoon.service.ts # DELETE ← Tier 1 critical
backend/src/config/
└── bullmq.config.ts # update: add ai-batch queue config
frontend/app/(dashboard)/ai-staging/
├── page.tsx # update: add migration queue tab
└── migration-review/
└── page.tsx # new: Admin Migration Review UI
frontend/components/ai/
├── ai-suggestion-field.tsx # update: confidence threshold display
├── migration-queue-table.tsx # new: queue list + approve/reject
└── AiStatusBanner.tsx # update: show queue status (ai-batch paused)
```
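
The tree above assigns queue selection to `ai.service.ts` and a queue discriminator to `create-ai-job.dto.ts`. A minimal sketch of that routing follows. The mapping of job types to queues is an assumption (the plan only says RAG must be isolated from batch jobs), `migrate-document` is a hypothetical label for Legacy Migration work, and the sketch derives the queue from the job type rather than reading a discriminator field from the DTO.

```typescript
type AiQueueName = "ai-realtime" | "ai-batch";

// "ai-suggest", "rag-query" and "embed-document" are named in this plan;
// "migrate-document" is a hypothetical label for Legacy Migration work.
type AiJobType = "ai-suggest" | "rag-query" | "embed-document" | "migrate-document";

// Assumed routing: jobs with a latency target go to ai-realtime,
// bulk work goes to ai-batch.
const QUEUE_FOR_JOB: Record<AiJobType, AiQueueName> = {
  "ai-suggest": "ai-realtime",
  "rag-query": "ai-realtime",
  "embed-document": "ai-batch",
  "migrate-document": "ai-batch",
};

interface AiJobPayload {
  type: AiJobType;
  publicId: string; // UUID kept as a string (ADR-019)
  projectPublicId: string; // tenant scope required by ADR-023A
}

// Adding a new AiJobType without a queue entry fails to compile,
// because Record<AiJobType, ...> requires every key.
function selectQueue(job: AiJobPayload): AiQueueName {
  return QUEUE_FOR_JOB[job.type];
}
```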
---
## Phases
### Phase 0: Cleanup & Foundation (Tier 1 Critical First)
**Goal**: Remove Typhoon, set up the BullMQ 2-queue layout, create the Schema Delta
Tasks: T001-T008
### Phase 1: Core AI Pipeline
**Goal**: OCR auto-detect, gemma4:e2b integration, ai-suggest + embed-document flows
Tasks: T009-T022
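
The plan does not specify how OCR auto-detection decides between the embedded text layer and PaddleOCR. One common heuristic, sketched here purely as an assumption, is to count how many pages carry a usable text layer. The threshold of 50 characters, the "at least half of the pages" rule, and the function name are all illustrative.

```typescript
// Minimum non-whitespace characters before a page counts as "has text".
// The value 50 is an illustrative assumption.
const MIN_TEXT_CHARS_PER_PAGE = 50;

type ExtractionRoute = "text-layer" | "ocr";

// pageTexts: text taken from each page's embedded text layer (for example
// by the Python sidecar); an empty string means no text layer was found.
function detectExtractionRoute(pageTexts: readonly string[]): ExtractionRoute {
  if (pageTexts.length === 0) {
    return "ocr";
  }
  const pagesWithText = pageTexts.filter(
    (text) => text.replace(/\s+/g, "").length >= MIN_TEXT_CHARS_PER_PAGE,
  ).length;
  // Route to OCR when fewer than half of the pages carry a usable text layer.
  return pagesWithText * 2 >= pageTexts.length ? "text-layer" : "ocr";
}
```

A scanned PDF typically yields empty or near-empty page texts and would route to `"ocr"`, while a born-digital PDF would route to `"text-layer"`.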
### Phase 2: RAG Pipeline
**Goal**: QdrantService multi-tenancy, chunking, rag-query endpoint
Tasks: T023-T030
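
The chunking step named in this phase could be sketched as fixed-size character windows with overlap. This is an assumption: the plan does not state the chunking strategy or sizes, and a real implementation might split on tokens or sentence boundaries instead, which matters for Thai text where a character window can separate a base character from its combining marks.

```typescript
interface TextChunk {
  index: number;
  start: number; // character offset into the source text
  text: string;
}

// Splits `source` into windows of `chunkSize` characters, where each window
// repeats the last `overlap` characters of the previous one.
function chunkText(source: string, chunkSize: number, overlap: number): TextChunk[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const chunks: TextChunk[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < source.length; start += step) {
    chunks.push({ index: chunks.length, start, text: source.slice(start, start + chunkSize) });
    if (start + chunkSize >= source.length) {
      break; // this window already reached the end of the text
    }
  }
  return chunks;
}
```

For example, `chunkText("abcdefghij", 4, 1)` yields three chunks with the texts `"abcd"`, `"defg"`, and `"ghij"`.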
### Phase 3: Legacy Migration Pipeline
**Goal**: migration_review_queue, n8n API endpoint, Admin Review UI
Tasks: T031-T042
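
The Admin Review UI lists queue items with approve and reject actions. A minimal sketch of the state transition behind those actions is shown below. The status values `pending`, `approved`, and `rejected`, the field names, and the rule that only pending items can be reviewed are assumptions; the actual columns are defined in `data-model.md` and the SQL delta.

```typescript
type ReviewStatus = "pending" | "approved" | "rejected";
type ReviewAction = "approve" | "reject";

// Hypothetical minimal view of a migration_review_queue row.
interface MigrationReviewItem {
  publicId: string; // UUID kept as a string (ADR-019)
  reviewStatus: ReviewStatus;
}

// Returns a new item with the decision applied; an item that was
// already reviewed is rejected rather than silently overwritten.
function applyReview(item: MigrationReviewItem, action: ReviewAction): MigrationReviewItem {
  if (item.reviewStatus !== "pending") {
    throw new Error(`item ${item.publicId} was already ${item.reviewStatus}`);
  }
  return { ...item, reviewStatus: action === "approve" ? "approved" : "rejected" };
}
```

Returning a new object instead of mutating the input keeps the function easy to unit test with Jest.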
### Phase 4: Monitoring & Threshold Management
**Goal**: Admin Dashboard AI metrics, threshold config, audit log delete permission
Tasks: T043-T050
---
## Complexity Tracking
| Violation | Why Needed | Simpler Alternative Rejected Because |
|-----------|-----------|-------------------------------------|
| 2-queue BullMQ (vs single) | RAG SLA requires isolation from batch jobs | A single queue with priorities does not stop a long-running job from blocking the jobs behind it |
| External Qdrant (vs SQL FTS) | MariaDB FULLTEXT has no semantic search capability | MariaDB FTS does not support multilingual semantic similarity |
| Python sidecar OCR | PaddleOCR is a Python library with no Node.js binding | No equivalent Thai-language OCR option exists in the Node.js ecosystem |
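
The first row's argument is that priorities only affect which waiting job starts next; they do not interrupt a job that is already running. The toy simulation below illustrates this under stated assumptions: workers are non-preemptive with concurrency 1, job durations are invented for the example, and GPU contention between the two workers is ignored. With a 120-second batch job arriving at t=0 and a 4-second `rag-query` arriving at t=5, the single-worker run finishes the query at t=124, while the two-queue run finishes it at t=9.

```typescript
interface SimJob {
  label: string;
  queue: "ai-realtime" | "ai-batch";
  arrivalSec: number;
  durationSec: number;
}

// One non-preemptive worker: when idle it picks the waiting job with the
// highest priority (ai-realtime first), then the earliest arrival.
// Returns the finish time of each job, keyed by label.
function simulateSingleWorker(jobs: readonly SimJob[]): Record<string, number> {
  const pending = [...jobs];
  const finish: Record<string, number> = {};
  let clock = 0;
  while (pending.length > 0) {
    let ready = pending.filter((j) => j.arrivalSec <= clock);
    if (ready.length === 0) {
      // Worker is idle: jump to the next arrival.
      clock = Math.min(...pending.map((j) => j.arrivalSec));
      ready = pending.filter((j) => j.arrivalSec <= clock);
    }
    ready.sort((a, b) =>
      a.queue === b.queue ? a.arrivalSec - b.arrivalSec : a.queue === "ai-realtime" ? -1 : 1,
    );
    const next = ready[0];
    clock += next.durationSec; // runs to completion, no preemption
    finish[next.label] = clock;
    pending.splice(pending.indexOf(next), 1);
  }
  return finish;
}

// One worker per queue: each queue is simulated independently.
function simulateTwoWorkers(jobs: readonly SimJob[]): Record<string, number> {
  const realtime = simulateSingleWorker(jobs.filter((j) => j.queue === "ai-realtime"));
  const batch = simulateSingleWorker(jobs.filter((j) => j.queue === "ai-batch"));
  return { ...realtime, ...batch };
}

const exampleJobs: SimJob[] = [
  { label: "migrate", queue: "ai-batch", arrivalSec: 0, durationSec: 120 },
  { label: "rag-query", queue: "ai-realtime", arrivalSec: 5, durationSec: 4 },
];
```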
---
## 🔗 Cross-Spec Dependencies
### Dependencies on 204-rfa-approval-refactor
| Component | Impact | Coordination |
|-----------|--------|--------------|
| **BullMQ Queues** | `ai-realtime` and `ai-batch` will be used by RFA Reminder/Escalation | Verify that RFA jobs do not collide with AI jobs; use a queue name prefix or priority |
| **QdrantService** | May be used for RFA document context | Verify that RFA applies the projectPublicId filter correctly |
| **Ollama GPU** | Shared resource with RFA (if it has AI features) | Verify that there is no GPU contention |
### Shared Infrastructure
- **BullMQ**: 2 queues (`ai-realtime`, `ai-batch`); RFA may add an `rfa-reminders` queue
- **Redis**: same instance for both features; monitor memory usage
- **Audit Logs**: same table for both features; verify that action types do not overlap
### Deployment Sequence
1. Phase 0-1 of AI Model Revision (BullMQ 2-queue foundation)
2. Phase 1-3 of RFA Approval Refactor (uses the BullMQ setup already in place)
3. Test the integration before deploying both features together