feat(ai): unify AI architecture, implement RAG and legacy migration
This commit is contained in:
@@ -0,0 +1,156 @@
|
||||
# Implementation Plan: AI Model Revision (ADR-023A)
|
||||
|
||||
**Branch**: `main` | **Date**: 2026-05-15 | **Spec**: [spec.md](./spec.md)
|
||||
**Feature Dir**: `specs/300-others/302-ai-model-revision/`
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
Implement ADR-023A AI Architecture Revision: เปลี่ยน model stack จาก 3-model (gemma4:9b + Typhoon + nomic-embed-text) เป็น 2-model (gemma4:e4b Q8_0 + nomic-embed-text), แยก BullMQ เป็น 2 queues (`ai-realtime`/`ai-batch`), เพิ่ม OCR auto-detection, enforce multi-tenant QdrantService, implement Legacy Migration pipeline และ migration_review_queue, และลบ Typhoon Cloud API ออกจาก codebase ทั้งหมด
|
||||
|
||||
---
|
||||
|
||||
## Technical Context
|
||||
|
||||
**Language/Version**: TypeScript 5.x (strict mode)
|
||||
**Primary Dependencies**:
|
||||
- Backend: NestJS 10, BullMQ 5, TypeORM 0.3, ioredis (Redis 7), @qdrant/js-client-rest
|
||||
- AI Infrastructure: Ollama (Desk-5439), PaddleOCR, PyMuPDF (Python sidecar)
|
||||
- Queue: Redis 7 (same instance as existing BullMQ)
|
||||
**Storage**: MariaDB (existing) + Qdrant (external vector DB) + Local Storage (existing)
|
||||
**Testing**: Jest (NestJS unit/integration)
|
||||
**Target Platform**: QNAP NAS (NestJS container) + Admin Desktop Desk-5439 (Ollama)
|
||||
**Performance Goals**: ai-suggest < 30s; rag-query < 10s (p95 dequeue-to-response)
|
||||
**Constraints**: VRAM ≤ 5GB peak, concurrency=1 per queue (prevent GPU overflow)
|
||||
**Scale/Scope**: ~20,000 legacy docs (migration), ~50 new docs/day (production)
|
||||
|
||||
---
|
||||
|
||||
## Constitution Check
|
||||
|
||||
_GATE: Must pass before Phase 0 research._
|
||||
|
||||
| Rule | Status | Notes |
|
||||
|------|--------|-------|
|
||||
| ADR-019 UUID: no parseInt on UUID | ✅ PASS | BullMQ payloads ใช้ `publicId: string` เสมอ |
|
||||
| ADR-009: no TypeORM migrations | ✅ PASS | `migration_review_queue` ผ่าน SQL delta (#14) |
|
||||
| ADR-016: RBAC on all endpoints | ✅ PASS | AI endpoints จะมี CASL guard: `ai.manage` |
|
||||
| ADR-007: error handling layered | ✅ PASS | BullMQ failed jobs → dead-letter + log |
|
||||
| ADR-008: BullMQ for async | ✅ PASS | Inference ทั้งหมดผ่าน BullMQ (ไม่มี inline) |
|
||||
| ADR-023/023A: no direct Ollama | ✅ PASS | n8n → DMS API → BullMQ → Ollama เท่านั้น |
|
||||
| ADR-023A: QdrantService required projectPublicId | ✅ PASS | Enforce ที่ TypeScript compile-time |
|
||||
| TypeScript strict: no `any`, no `console.log` | ✅ PASS | Enforced ผ่าน eslint |
|
||||
| **Typhoon Cloud API removal** | ⚠️ PENDING | `rag/typhoon.service.ts` ต้อง delete (T002) |
|
||||
|
||||
---
|
||||
|
||||
## Project Structure
|
||||
|
||||
### Documentation (this feature)
|
||||
|
||||
```text
|
||||
specs/300-others/302-ai-model-revision/
|
||||
├── spec.md ✅ done
|
||||
├── plan.md ✅ this file
|
||||
├── research.md ✅ done
|
||||
├── data-model.md ✅ done
|
||||
├── quickstart.md (Phase 1)
|
||||
├── contracts/ (Phase 1)
|
||||
│ ├── ai-jobs.yaml
|
||||
│ └── migration-queue.yaml
|
||||
├── checklists/
|
||||
│ └── requirements.md ✅ done
|
||||
└── tasks.md (Phase 2 — speckit-tasks)
|
||||
```
|
||||
|
||||
### Schema Delta (ADR-009)
|
||||
|
||||
```text
|
||||
specs/03-Data-and-Storage/deltas/
|
||||
└── 14-add-migration-review-queue.sql # new
|
||||
```
|
||||
|
||||
### Source Code
|
||||
|
||||
```text
|
||||
backend/src/modules/ai/
|
||||
├── ai.module.ts # update: register 2 queues, remove Typhoon
|
||||
├── ai.controller.ts # update: add /migration/queue endpoint
|
||||
├── ai.service.ts # update: routing logic, queue selection
|
||||
├── processors/
|
||||
│ ├── ai-realtime.processor.ts # new: ai-realtime consumer
|
||||
│ └── ai-batch.processor.ts # new: ai-batch consumer (replaces existing)
|
||||
├── services/
|
||||
│ ├── ollama.service.ts # update: model → gemma4:e4b
|
||||
│ ├── qdrant.service.ts # update: enforce projectPublicId param
|
||||
│ ├── ocr.service.ts # new: OCR auto-detect + PaddleOCR routing
|
||||
│ ├── migration.service.ts # new: Legacy Migration pipeline
|
||||
│ └── embedding.service.ts # new: full-doc chunking + embed
|
||||
├── dto/
|
||||
│ ├── create-ai-job.dto.ts # update: queue discriminator field
|
||||
│ ├── migration-queue-item.dto.ts # new
|
||||
│ └── rag-query.dto.ts # new
|
||||
├── entities/
|
||||
│ └── migration-review-queue.entity.ts # new
|
||||
└── rag/
|
||||
├── rag.service.ts # update: remove typhoon ref, use QdrantService
|
||||
└── typhoon.service.ts # DELETE ← Tier 1 critical
|
||||
|
||||
backend/src/config/
|
||||
└── bullmq.config.ts # update: add ai-batch queue config
|
||||
|
||||
frontend/app/(dashboard)/ai-staging/
|
||||
├── page.tsx # update: add migration queue tab
|
||||
└── migration-review/
|
||||
└── page.tsx # new: Admin Migration Review UI
|
||||
|
||||
frontend/components/ai/
|
||||
├── ai-suggestion-field.tsx # update: confidence threshold display
|
||||
├── migration-queue-table.tsx # new: queue list + approve/reject
|
||||
└── AiStatusBanner.tsx # update: show queue status (ai-batch paused)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phases
|
||||
|
||||
### Phase 0: Cleanup & Foundation (Tier 1 Critical First)
|
||||
|
||||
**Goal**: ลบ Typhoon ออก, ตั้ง BullMQ 2-queue, สร้าง Schema Delta
|
||||
|
||||
Tasks: T001–T008
|
||||
|
||||
### Phase 1: Core AI Pipeline
|
||||
|
||||
**Goal**: OCR auto-detect, gemma4:e4b integration, ai-suggest + embed-document flows
|
||||
|
||||
Tasks: T009–T022
|
||||
|
||||
### Phase 2: RAG Pipeline
|
||||
|
||||
**Goal**: QdrantService multi-tenancy, chunking, rag-query endpoint
|
||||
|
||||
Tasks: T023–T030
|
||||
|
||||
### Phase 3: Legacy Migration Pipeline
|
||||
|
||||
**Goal**: migration_review_queue, n8n API endpoint, Admin Review UI
|
||||
|
||||
Tasks: T031–T042
|
||||
|
||||
### Phase 4: Monitoring & Threshold Management
|
||||
|
||||
**Goal**: Admin Dashboard AI metrics, threshold config, audit log delete permission
|
||||
|
||||
Tasks: T043–T050
|
||||
|
||||
---
|
||||
|
||||
## Complexity Tracking
|
||||
|
||||
| Violation | Why Needed | Simpler Alternative Rejected Because |
|
||||
|-----------|-----------|-------------------------------------|
|
||||
| 2-queue BullMQ (vs single) | RAG SLA requires isolation from batch jobs | Single queue + priority ไม่ป้องกัน long-running job block |
|
||||
| External Qdrant (vs SQL FTS) | Semantic search capability ไม่มีใน MariaDB FULLTEXT | MariaDB FTS ไม่รองรับ multilingual semantic similarity |
|
||||
| Python sidecar OCR | PaddleOCR เป็น Python library ไม่มี Node.js binding | ไม่มีทางเลือก OCR ภาษาไทยที่เทียบเท่าใน Node.js ecosystem |
|
||||
Reference in New Issue
Block a user