Implementation Plan: AI Model Revision (ADR-023A)

Branch: main | Date: 2026-05-15 | Spec: spec.md
Feature Dir: specs/300-others/302-ai-model-revision/


Summary

Implement the ADR-023A AI Architecture Revision: change the model stack from 3 models (gemma4:9b + Typhoon + nomic-embed-text) to 2 models (gemma4:e4b Q8_0 + nomic-embed-text), split BullMQ into 2 queues (ai-realtime/ai-batch), add OCR auto-detection, enforce multi-tenancy in QdrantService, implement the Legacy Migration pipeline and migration_review_queue, and remove the Typhoon Cloud API from the entire codebase.


Technical Context

Language/Version: TypeScript 5.x (strict mode)
Primary Dependencies:

  • Backend: NestJS 10, BullMQ 5, TypeORM 0.3, ioredis (Redis 7), @qdrant/js-client-rest
  • AI Infrastructure: Ollama (Desk-5439), PaddleOCR, PyMuPDF (Python sidecar)
  • Queue: Redis 7 (same instance as existing BullMQ)

Storage: MariaDB (existing) + Qdrant (external vector DB) + Local Storage (existing)
Testing: Jest (NestJS unit/integration)
Target Platform: QNAP NAS (NestJS container) + Admin Desktop Desk-5439 (Ollama)
Performance Goals: ai-suggest < 30s; rag-query < 10s (p95 dequeue-to-response)
Constraints: VRAM ≤ 5GB peak, concurrency=1 per queue (prevent GPU overflow)
Scale/Scope: ~20,000 legacy docs (migration), ~50 new docs/day (production)

Constitution Check

GATE: Must pass before Phase 0 research.

| Rule | Status | Notes |
| --- | --- | --- |
| ADR-019 UUID: no parseInt on UUID | PASS | BullMQ payloads always use publicId: string |
| ADR-009: no TypeORM migrations | PASS | migration_review_queue is added via SQL delta (#14) |
| ADR-016: RBAC on all endpoints | PASS | AI endpoints will have a CASL guard: ai.manage |
| ADR-007: error handling layered | PASS | BullMQ failed jobs → dead-letter + log |
| ADR-008: BullMQ for async | PASS | All inference goes through BullMQ (no inline calls) |
| ADR-023/023A: no direct Ollama | PASS | n8n → DMS API → BullMQ → Ollama only |
| ADR-023A: QdrantService required projectPublicId | PASS | Enforced at TypeScript compile time |
| TypeScript strict: no any, no console.log | PASS | Enforced via eslint |
| Typhoon Cloud API removal | ⚠️ PENDING | rag/typhoon.service.ts must be deleted (T002) |
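
To illustrate the compile-time enforcement of projectPublicId and the publicId-as-string rule from the gate above, here is a minimal sketch. The names TenantSearchParams, TenantFilter and buildTenantFilter are hypothetical, and the payload field name projectPublicId is an assumption about how points are tagged in Qdrant; only the must/key/match filter shape follows Qdrant's documented filter format.

```typescript
// Hypothetical sketch: type and function names are illustrative, not the project's actual code.

/** Search parameters; projectPublicId is deliberately non-optional. */
interface TenantSearchParams {
  projectPublicId: string; // UUID kept as a string (ADR-019: never parseInt)
  vector: number[];
  limit?: number;
}

/** Qdrant-style payload filter restricted to a single project. */
interface TenantFilter {
  must: Array<{ key: string; match: { value: string } }>;
}

const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function buildTenantFilter(params: TenantSearchParams): TenantFilter {
  // Runtime guard complements the compile-time check for untyped callers (e.g. raw JSON).
  if (!UUID_PATTERN.test(params.projectPublicId)) {
    throw new Error('projectPublicId must be a UUID string');
  }
  return {
    // Assumption: each stored point carries a "projectPublicId" payload field.
    must: [{ key: 'projectPublicId', match: { value: params.projectPublicId } }],
  };
}

// buildTenantFilter({ vector: [0.1] });
// ^ would not compile: property 'projectPublicId' is missing.
```

Making the parameter required in the type, rather than validating only at runtime, means a search call that forgets the tenant scope fails the build instead of silently querying across projects.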

Project Structure

Documentation (this feature)

specs/300-others/302-ai-model-revision/
├── spec.md              ✅ done
├── plan.md              ✅ this file
├── research.md          ✅ done
├── data-model.md        ✅ done
├── quickstart.md        (Phase 1)
├── contracts/           (Phase 1)
│   ├── ai-jobs.yaml
│   └── migration-queue.yaml
├── checklists/
│   └── requirements.md  ✅ done
└── tasks.md             (Phase 2 — speckit-tasks)

Schema Delta (ADR-009)

specs/03-Data-and-Storage/deltas/
└── 14-add-migration-review-queue.sql    # new

Source Code

backend/src/modules/ai/
├── ai.module.ts                         # update: register 2 queues, remove Typhoon
├── ai.controller.ts                     # update: add /migration/queue endpoint
├── ai.service.ts                        # update: routing logic, queue selection
├── processors/
│   ├── ai-realtime.processor.ts         # new: ai-realtime consumer
│   └── ai-batch.processor.ts            # new: ai-batch consumer (replaces existing)
├── services/
│   ├── ollama.service.ts                # update: model → gemma4:e4b
│   ├── qdrant.service.ts                # update: enforce projectPublicId param
│   ├── ocr.service.ts                   # new: OCR auto-detect + PaddleOCR routing
│   ├── migration.service.ts             # new: Legacy Migration pipeline
│   └── embedding.service.ts             # new: full-doc chunking + embed
├── dto/
│   ├── create-ai-job.dto.ts             # update: queue discriminator field
│   ├── migration-queue-item.dto.ts      # new
│   └── rag-query.dto.ts                 # new
├── entities/
│   └── migration-review-queue.entity.ts # new
└── rag/
    ├── rag.service.ts                   # update: remove typhoon ref, use QdrantService
    └── typhoon.service.ts               # DELETE ← Tier 1 critical

backend/src/config/
└── bullmq.config.ts                     # update: add ai-batch queue config

frontend/app/(dashboard)/ai-staging/
├── page.tsx                             # update: add migration queue tab
└── migration-review/
    └── page.tsx                         # new: Admin Migration Review UI

frontend/components/ai/
├── ai-suggestion-field.tsx              # update: confidence threshold display
├── migration-queue-table.tsx            # new: queue list + approve/reject
└── AiStatusBanner.tsx                   # update: show queue status (ai-batch paused)

Phases

Phase 0: Cleanup & Foundation (Tier 1 Critical First)

Goal: Remove Typhoon, set up the two BullMQ queues, create the Schema Delta

Tasks: T001–T008
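
The two-queue split could be expressed as a small routing table, sketched below. The queue names and the job types ai-suggest, rag-query and embed-document come from this plan; the job type legacy-migration is a hypothetical name, and the assignment of each job type to a queue is an assumption based on the stated latency goals, not a confirmed design.

```typescript
// Sketch only: the job-type-to-queue mapping is an assumption.
type AiQueueName = 'ai-realtime' | 'ai-batch';

// 'legacy-migration' is a hypothetical job type name used for illustration.
type AiJobType = 'ai-suggest' | 'rag-query' | 'embed-document' | 'legacy-migration';

interface AiJobPayload {
  publicId: string; // UUID string, never a parsed integer (ADR-019)
  jobType: AiJobType;
}

// Latency-sensitive jobs go to ai-realtime; long-running work goes to ai-batch.
const QUEUE_BY_JOB_TYPE: Record<AiJobType, AiQueueName> = {
  'ai-suggest': 'ai-realtime',
  'rag-query': 'ai-realtime',
  'embed-document': 'ai-batch',
  'legacy-migration': 'ai-batch',
};

// One job at a time per queue, per the concurrency=1 constraint in Technical Context.
const WORKER_CONCURRENCY: Record<AiQueueName, number> = {
  'ai-realtime': 1,
  'ai-batch': 1,
};

function selectQueue(job: AiJobPayload): AiQueueName {
  return QUEUE_BY_JOB_TYPE[job.jobType];
}
```

Because the table is typed as Record<AiJobType, AiQueueName>, adding a new job type without assigning it a queue becomes a compile error.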

Phase 1: Core AI Pipeline

Goal: OCR auto-detect, gemma4:e4b integration, ai-suggest + embed-document flows

Tasks: T009–T022
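
The plan does not say how OCR auto-detection decides between the embedded text layer and PaddleOCR. One common heuristic, sketched here purely as an assumption, is to measure how much text the sidecar extracted per page and fall back to OCR when there is too little. PageText, detectExtractionRoute and the threshold of 50 characters are all hypothetical.

```typescript
// Hypothetical heuristic: not confirmed by the plan or the spec.
interface PageText {
  pageNumber: number;
  text: string; // text layer as extracted by the Python sidecar (assumed input)
}

type ExtractionRoute = 'text-layer' | 'ocr';

// Assumed threshold; a real value would need tuning on sample documents.
const MIN_CHARS_PER_PAGE = 50;

function detectExtractionRoute(
  pages: PageText[],
  minCharsPerPage: number = MIN_CHARS_PER_PAGE,
): ExtractionRoute {
  // No pages extracted at all: treat the document as scanned.
  if (pages.length === 0) return 'ocr';

  // Count non-whitespace characters so blank or layout-only pages score zero.
  const totalChars = pages.reduce(
    (sum, page) => sum + page.text.replace(/\s+/g, '').length,
    0,
  );
  return totalChars / pages.length >= minCharsPerPage ? 'text-layer' : 'ocr';
}
```

For example, a scanned PDF whose pages yield empty strings would be routed to 'ocr', while a born-digital PDF with a few hundred characters per page would keep its text layer.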

Phase 2: RAG Pipeline

Goal: QdrantService multi-tenancy, chunking, rag-query endpoint

Tasks: T023–T030
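
As a rough illustration of the chunking step, the sketch below splits a document into fixed-size character windows with overlap before embedding. The plan does not specify a chunking strategy, so the character-based approach, the function name chunkText and the default sizes are assumptions.

```typescript
// Sketch of fixed-size chunking with overlap; strategy and sizes are assumptions.
interface Chunk {
  index: number;
  text: string;
}

function chunkText(text: string, chunkSize = 1000, overlap = 200): Chunk[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error('require chunkSize > 0 and 0 <= overlap < chunkSize');
  }
  const chunks: Chunk[] = [];
  const step = chunkSize - overlap; // each window starts this far after the previous one
  for (let start = 0; start < text.length; start += step) {
    chunks.push({ index: chunks.length, text: text.slice(start, start + chunkSize) });
    // Stop once a window has reached the end, to avoid a trailing duplicate tail.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

With chunkText('abcdefghij', 4, 1) the windows are 'abcd', 'defg' and 'ghij': each chunk repeats the last character of the previous one, so content at a boundary still appears whole in at least one chunk.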

Phase 3: Legacy Migration Pipeline

Goal: migration_review_queue, n8n API endpoint, Admin Review UI

Tasks: T031–T042

Phase 4: Monitoring & Threshold Management

Goal: Admin Dashboard AI metrics, threshold config, audit log delete permission

Tasks: T043–T050
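
One metric the dashboard might expose is whether the p95 dequeue-to-response latency stays within the performance goals (ai-suggest < 30s, rag-query < 10s). The sketch below uses the nearest-rank percentile method; the function names are hypothetical, and applying the p95 measure to both job types is an assumption about how the goals are meant to be read.

```typescript
// Sketch of a p95 SLA check; function names are illustrative.
const SLA_SECONDS = { 'ai-suggest': 30, 'rag-query': 10 } as const;
type SlaJobType = keyof typeof SLA_SECONDS;

/** Nearest-rank percentile: the smallest sample with at least p% of samples at or below it. */
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error('no samples');
  if (p <= 0 || p > 100) throw new Error('p must be in (0, 100]');
  const sorted = [...values].sort((a, b) => a - b);
  // Multiply before dividing to keep the rank computation exact for integer p.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1] as number;
}

/** True when the p95 latency (in seconds) is strictly below the goal for this job type. */
function meetsSla(jobType: SlaJobType, latenciesSeconds: number[]): boolean {
  return percentile(latenciesSeconds, 95) < SLA_SECONDS[jobType];
}
```

With only a handful of samples, nearest-rank p95 is simply the maximum, so a single slow job fails the check; the measure becomes meaningful once a queue has processed a few dozen jobs.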


Complexity Tracking

| Violation | Why Needed | Simpler Alternative Rejected Because |
| --- | --- | --- |
| 2-queue BullMQ (vs single) | RAG SLA requires isolation from batch jobs | Single queue + priority does not prevent a long-running job from blocking |
| External Qdrant (vs SQL FTS) | Semantic search capability is not available in MariaDB FULLTEXT | MariaDB FTS does not support multilingual semantic similarity |
| Python sidecar OCR | PaddleOCR is a Python library with no Node.js binding | No equivalent Thai-language OCR alternative exists in the Node.js ecosystem |