Last commit: 1564f8648d by admin, 2026-05-24 19:19:46 +07:00 (690524:1919 ADR-028-228-migration #04)

Implementation Plan: AI Model Revision (ADR-023A)

Branch: main | Date: 2026-05-15 | Spec: spec.md | Feature Dir: specs/300-others/302-ai-model-revision/

Summary

Implement the ADR-023A AI Architecture Revision: change the model stack from three models (gemma4:9b + Typhoon + nomic-embed-text) to two (gemma4:e2b + nomic-embed-text), split BullMQ into two queues (ai-realtime and ai-batch), add OCR auto-detection, enforce multi-tenancy in QdrantService, implement the Legacy Migration pipeline and migration_review_queue, and remove the Typhoon Cloud API from the entire codebase.

Technical Context

Language/Version: TypeScript 5.x (strict mode)

Primary Dependencies:

Backend: NestJS 10, BullMQ 5, TypeORM 0.3, ioredis (Redis 7), @qdrant/js-client-rest
AI Infrastructure: Ollama (Desk-5439), PaddleOCR, PyMuPDF (Python sidecar)

Queue: Redis 7 (same instance as existing BullMQ)
Storage: MariaDB (existing) + Qdrant (external vector DB) + Local Storage (existing)
Testing: Jest (NestJS unit/integration)
Target Platform: QNAP NAS (NestJS container) + Admin Desktop Desk-5439 (Ollama)
Performance Goals: ai-suggest < 30s; rag-query < 10s (p95 dequeue-to-response)
Constraints: VRAM ≤ 3GB peak, concurrency=1 per queue (prevent GPU overflow)
Scale/Scope: ~20,000 legacy docs (migration), ~50 new docs/day (production)
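
The two-queue split and the concurrency=1 constraint can be sketched as a small routing table. This is a minimal illustration, not the project's `ai.service.ts`: the job type `legacy-migration` and the assignment of each job type to a queue are assumptions made for the example (the plan only states that RAG must be isolated from batch work), and `selectQueue` is a hypothetical helper.

```typescript
// Job types: ai-suggest, rag-query and embed-document are named in the plan;
// "legacy-migration" is a hypothetical label for the migration pipeline.
type AiJobType = 'ai-suggest' | 'rag-query' | 'embed-document' | 'legacy-migration';
type AiQueueName = 'ai-realtime' | 'ai-batch';

interface QueueSettings {
  readonly name: AiQueueName;
  // The plan fixes concurrency at 1 per queue to keep peak VRAM bounded.
  readonly concurrency: 1;
}

const QUEUE_SETTINGS: Readonly<Record<AiQueueName, QueueSettings>> = {
  'ai-realtime': { name: 'ai-realtime', concurrency: 1 },
  'ai-batch': { name: 'ai-batch', concurrency: 1 },
};

// ASSUMPTION: user-facing jobs with a latency goal go to ai-realtime,
// long-running bulk work goes to ai-batch.
const JOB_ROUTING: Readonly<Record<AiJobType, AiQueueName>> = {
  'ai-suggest': 'ai-realtime',
  'rag-query': 'ai-realtime',
  'embed-document': 'ai-batch',
  'legacy-migration': 'ai-batch',
};

// Returns the settings of the queue a job of the given type should be added to.
function selectQueue(jobType: AiJobType): QueueSettings {
  return QUEUE_SETTINGS[JOB_ROUTING[jobType]];
}
```

Because `JOB_ROUTING` is typed as a full `Record<AiJobType, AiQueueName>`, adding a new job type without deciding its queue fails at compile time.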

Constitution Check

GATE: Must pass before Phase 0 research.

Rule	Status	Notes
ADR-019 UUID: no parseInt on UUID	✅ PASS	BullMQ payloads always use `publicId: string`
ADR-009: no TypeORM migrations	✅ PASS	`migration_review_queue` is added via SQL delta (#14)
ADR-016: RBAC on all endpoints	✅ PASS	AI endpoints will have a CASL guard: `ai.manage`
ADR-007: error handling layered	✅ PASS	Failed BullMQ jobs → dead-letter + log
ADR-008: BullMQ for async	✅ PASS	All inference goes through BullMQ (no inline calls)
ADR-023/023A: no direct Ollama	✅ PASS	n8n → DMS API → BullMQ → Ollama only
ADR-023A: QdrantService requires projectPublicId	✅ PASS	Enforced at TypeScript compile time
TypeScript strict: no `any`, no `console.log`	✅ PASS	Enforced via eslint
Typhoon Cloud API removal	⚠️ PENDING	`rag/typhoon.service.ts` must be deleted (T002)
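
The `projectPublicId` rule in the table above could be enforced roughly as follows. This is a sketch under assumptions: `TenantSearchParams`, `buildTenantSearchRequest`, the payload key `projectPublicId`, and the default limit are hypothetical names and values chosen for illustration. Only the `must` / `key` / `match.value` structure follows the shape of Qdrant's payload filters. The UUID stays a string throughout, in line with ADR-019.

```typescript
interface QdrantMatchCondition {
  readonly key: string;
  readonly match: { readonly value: string };
}

interface QdrantFilter {
  readonly must: readonly QdrantMatchCondition[];
}

// projectPublicId is NOT optional: omitting it is a compile-time error.
interface TenantSearchParams {
  readonly projectPublicId: string;
  readonly vector: readonly number[];
  readonly limit?: number;
}

interface TenantSearchRequest {
  readonly vector: number[];
  readonly limit: number;
  readonly filter: QdrantFilter;
}

// Validates the UUID as text; it is never converted with parseInt.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Builds a search request that is always scoped to one project.
// ASSUMPTION: each stored point carries a "projectPublicId" payload field.
function buildTenantSearchRequest(params: TenantSearchParams): TenantSearchRequest {
  if (!UUID_PATTERN.test(params.projectPublicId)) {
    throw new Error('projectPublicId must be a UUID string');
  }
  return {
    vector: [...params.vector],
    limit: params.limit ?? 5, // hypothetical default
    filter: {
      must: [{ key: 'projectPublicId', match: { value: params.projectPublicId } }],
    },
  };
}
```

The compile-time check covers a missing argument; the runtime check covers a malformed one, for example a numeric ID passed as a string.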

Project Structure

Documentation (this feature)

specs/300-others/302-ai-model-revision/
├── spec.md              ✅ done
├── plan.md              ✅ this file
├── research.md          ✅ done
├── data-model.md        ✅ done
├── quickstart.md        (Phase 1)
├── contracts/           (Phase 1)
│   ├── ai-jobs.yaml
│   └── migration-queue.yaml
├── checklists/
│   └── requirements.md  ✅ done
└── tasks.md             (Phase 2 — speckit-tasks)

Schema Delta (ADR-009)

specs/03-Data-and-Storage/deltas/
└── 14-add-migration-review-queue.sql    # new

Source Code

backend/src/modules/ai/
├── ai.module.ts                         # update: register 2 queues, remove Typhoon
├── ai.controller.ts                     # update: add /migration/queue endpoint
├── ai.service.ts                        # update: routing logic, queue selection
├── processors/
│   ├── ai-realtime.processor.ts         # new: ai-realtime consumer
│   └── ai-batch.processor.ts            # new: ai-batch consumer (replaces existing)
├── services/
│   ├── ollama.service.ts                # update: model → gemma4:e2b
│   ├── qdrant.service.ts                # update: enforce projectPublicId param
│   ├── ocr.service.ts                   # new: OCR auto-detect + PaddleOCR routing
│   ├── migration.service.ts             # new: Legacy Migration pipeline
│   └── embedding.service.ts             # new: full-doc chunking + embed
├── dto/
│   ├── create-ai-job.dto.ts             # update: queue discriminator field
│   ├── migration-queue-item.dto.ts      # new
│   └── rag-query.dto.ts                 # new
├── entities/
│   └── migration-review-queue.entity.ts # new
└── rag/
    ├── rag.service.ts                   # update: remove typhoon ref, use QdrantService
    └── typhoon.service.ts               # DELETE ← Tier 1 critical

backend/src/config/
└── bullmq.config.ts                     # update: add ai-batch queue config

frontend/app/(dashboard)/ai-staging/
├── page.tsx                             # update: add migration queue tab
└── migration-review/
    └── page.tsx                         # new: Admin Migration Review UI

frontend/components/ai/
├── ai-suggestion-field.tsx              # update: confidence threshold display
├── migration-queue-table.tsx            # new: queue list + approve/reject
└── AiStatusBanner.tsx                   # update: show queue status (ai-batch paused)

Phases

Phase 0: Cleanup & Foundation (Tier 1 Critical First)

Goal: Remove Typhoon, set up the two BullMQ queues, create the Schema Delta

Tasks: T001–T008

Phase 1: Core AI Pipeline

Goal: OCR auto-detect, gemma4:e2b integration, ai-suggest + embed-document flows

Tasks: T009–T022
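
The plan does not specify how OCR auto-detection decides, so the following is only one plausible heuristic: if most pages have almost no extractable text layer, treat the file as a scan and route it to OCR. The names `PageTextStats` and `detectExtractionRoute` and both thresholds are assumptions. In this setup the per-page character counts would presumably come from the Python sidecar.

```typescript
interface PageTextStats {
  readonly pageNumber: number;
  // Number of characters found in the page's embedded text layer.
  readonly extractedChars: number;
}

type ExtractionRoute = 'text-layer' | 'ocr';

// Hypothetical thresholds; real values would need tuning on sample documents.
const MIN_CHARS_PER_PAGE = 50;
const MAX_SPARSE_PAGE_RATIO = 0.5;

// Routes a document to OCR when more than maxSparseRatio of its pages
// have fewer than minCharsPerPage characters of embedded text.
function detectExtractionRoute(
  pages: readonly PageTextStats[],
  minCharsPerPage: number = MIN_CHARS_PER_PAGE,
  maxSparseRatio: number = MAX_SPARSE_PAGE_RATIO,
): ExtractionRoute {
  if (pages.length === 0) {
    return 'ocr'; // nothing extractable at all, fall back to OCR
  }
  const sparsePages = pages.filter((page) => page.extractedChars < minCharsPerPage).length;
  return sparsePages / pages.length > maxSparseRatio ? 'ocr' : 'text-layer';
}
```

A ratio rather than an all-or-nothing check keeps a mostly digital PDF with a few scanned attachment pages on the cheaper text-layer path. Whether that trade-off is acceptable depends on the documents.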

Phase 2: RAG Pipeline

Goal: QdrantService multi-tenancy, chunking, rag-query endpoint

Tasks: T023–T030
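
Chunking for the RAG pipeline might look like the fixed-size, overlapping splitter below. This is a sketch, not the planned `embedding.service.ts`: the function name, the character-based sizing, and the default sizes are assumptions, and a production splitter would more likely cut on sentence or token boundaries.

```typescript
interface TextChunk {
  readonly index: number; // position of the chunk in the document
  readonly start: number; // offset of the chunk's first character
  readonly text: string;
}

// Splits text into chunks of at most chunkSize characters, where each chunk
// repeats the last `overlap` characters of the previous one.
// Offsets are UTF-16 code units, so a cut can fall between a Thai base
// character and its combining mark; this sketch does not guard against that.
function chunkText(text: string, chunkSize: number = 1000, overlap: number = 200): TextChunk[] {
  if (chunkSize <= 0) {
    throw new Error('chunkSize must be positive');
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error('overlap must be in the range [0, chunkSize)');
  }
  const chunks: TextChunk[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push({ index: chunks.length, start, text: text.slice(start, start + chunkSize) });
    if (start + chunkSize >= text.length) {
      break; // this chunk already reaches the end of the text
    }
  }
  return chunks;
}
```

For example, `chunkText('abcdefghij', 4, 1)` yields the three chunks `abcd`, `defg`, and `ghij`, each sharing one character with its predecessor.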

Phase 3: Legacy Migration Pipeline

Goal: migration_review_queue, n8n API endpoint, Admin Review UI

Tasks: T031–T042
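
The approve/reject step of the Admin Review UI implies a small state machine over queue items. The sketch below assumes three statuses and a handful of field names; the actual columns are defined in data-model.md and the SQL delta, so `MigrationReviewItem`, `ReviewStatus`, and `applyReviewAction` are hypothetical.

```typescript
type ReviewStatus = 'pending' | 'approved' | 'rejected';
type ReviewAction = 'approve' | 'reject';

// ASSUMPTION: illustrative fields only, not the real migration_review_queue schema.
interface MigrationReviewItem {
  readonly publicId: string; // UUID string, per ADR-019
  readonly status: ReviewStatus;
  readonly reviewedBy?: string; // publicId of the reviewing admin
}

// Applies an admin decision to a pending item and returns the updated copy.
// Items that were already decided cannot be changed again.
function applyReviewAction(
  item: MigrationReviewItem,
  action: ReviewAction,
  reviewerPublicId: string,
): MigrationReviewItem {
  if (item.status !== 'pending') {
    throw new Error(`Item ${item.publicId} is already ${item.status}`);
  }
  return {
    ...item,
    status: action === 'approve' ? 'approved' : 'rejected',
    reviewedBy: reviewerPublicId,
  };
}
```

Rejecting a second decision on the same item keeps the review history unambiguous. Whether re-review should be allowed is a product decision this sketch does not settle.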

Phase 4: Monitoring & Threshold Management

Goal: Admin Dashboard AI metrics, threshold config, audit log delete permission

Tasks: T043–T050

Complexity Tracking

Violation	Why Needed	Simpler Alternative Rejected Because
2-queue BullMQ (vs single)	RAG SLA requires isolation from batch jobs	A single queue with priorities does not stop a long-running job from blocking the jobs behind it
External Qdrant (vs SQL FTS)	MariaDB FULLTEXT has no semantic search capability	MariaDB FTS does not support multilingual semantic similarity
Python sidecar OCR	PaddleOCR is a Python library with no Node.js binding	No equivalent Thai-language OCR option exists in the Node.js ecosystem
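
The first row's argument can be checked with a toy simulation of a non-preemptive worker with concurrency 1. Priority only reorders jobs that are still waiting, so it cannot interrupt a batch job that has already started. The job durations and the `simulateWorker` helper are made up for illustration, and this models scheduling only, not BullMQ itself.

```typescript
interface SimJob {
  readonly id: string;
  readonly arrivalSec: number;
  readonly durationSec: number;
  readonly priority: number; // lower number = more urgent
}

// Simulates one worker with concurrency 1 and returns each job's start time.
// Among the jobs that have arrived, the most urgent one runs next;
// a running job is never preempted.
function simulateWorker(jobs: readonly SimJob[]): Map<string, number> {
  const pending = [...jobs];
  const startTimes = new Map<string, number>();
  let clock = 0;
  while (pending.length > 0) {
    let ready = pending.filter((job) => job.arrivalSec <= clock);
    if (ready.length === 0) {
      // Worker is idle: jump ahead to the next arrival.
      clock = Math.min(...pending.map((job) => job.arrivalSec));
      ready = pending.filter((job) => job.arrivalSec <= clock);
    }
    ready.sort((a, b) => a.priority - b.priority || a.arrivalSec - b.arrivalSec);
    const next = ready[0];
    if (next === undefined) {
      break; // unreachable, kept for strict index checking
    }
    startTimes.set(next.id, clock);
    clock += next.durationSec;
    pending.splice(pending.indexOf(next), 1);
  }
  return startTimes;
}

// Illustrative workload: a 120 s batch job starts just before an urgent query.
const batchJob: SimJob = { id: 'batch', arrivalSec: 0, durationSec: 120, priority: 10 };
const ragJob: SimJob = { id: 'rag', arrivalSec: 5, durationSec: 3, priority: 1 };

// One shared queue: the query waits for the batch job despite its priority.
const sharedQueueStarts = simulateWorker([batchJob, ragJob]);
// Two queues, each with its own worker: the query starts on arrival.
const realtimeQueueStarts = simulateWorker([ragJob]);
```

With these numbers the query starts at t = 120 s on the shared queue, a wait of 115 s against a 10 s target, and at t = 5 s on its own queue. The simulation ignores GPU contention between the two workers, which the plan addresses separately through the VRAM constraint.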

🔗 Cross-Spec Dependencies

Dependencies on 204-rfa-approval-refactor

Component	Impact	Coordination
BullMQ Queues	`ai-realtime` and `ai-batch` will be used by RFA Reminder/Escalation	Verify that RFA jobs do not collide with AI jobs; use a queue name prefix or priority
QdrantService	May be used for RFA document context	Verify that RFA applies the projectPublicId filter correctly
Ollama GPU	Shared resource with RFA (if it has AI features)	Verify that there is no GPU contention

Shared Infrastructure

BullMQ: 2 queues (ai-realtime, ai-batch); RFA may add an rfa-reminders queue
Redis: same instance is shared; monitor memory usage
Audit Logs: same table is shared; verify that action types do not overlap

Deployment Sequence

Phase 0-1 of AI Model Revision (BullMQ 2-queue foundation)
Phase 1-3 of RFA Approval Refactor (uses the BullMQ setup already in place)
Run integration tests before deploying both features together
