690524:1919 ADR-028-228-migration #04

2026-05-24 19:19:46 +07:00
parent 93fd95a6b3
commit 1564f8648d
22 changed files with 1422 additions and 255 deletions
@@ -13,29 +13,29 @@
 - Backend: NestJS Module (IntentClassifierModule) with a Service for Pattern Matching and LLM Fallback
 - Database: tables `ai_intent_definitions` and `ai_intent_patterns` (SQL Delta per ADR-009)
 - Caching: Redis (TTL 5 minutes) for Patterns
-- AI: Ollama gemma4:e4b Q8_0 on Admin Desktop (Desk-5439) for LLM Fallback
+- AI: Ollama gemma4:e2b on Admin Desktop (Desk-5439) for LLM Fallback
 - Frontend: Admin UI for managing Intents and Patterns + Test Console

 ---

 ## Technical Context

-**Language/Version**: TypeScript 5.x (NestJS 11) + Next.js 16  
-**Primary Dependencies**: 
+**Language/Version**: TypeScript 5.x (NestJS 11) + Next.js 16
+**Primary Dependencies**:
 - Backend: NestJS, TypeORM, ioredis (Redis), axios (Ollama HTTP)
-- Frontend: React, TanStack Query, shadcn/ui components  
-**Storage**: MariaDB 11.8 (Intent Definitions/Patterns), Redis (Cache), Ollama (LLM)  
-**Testing**: Jest (Backend Unit/Integration), Vitest (Frontend Unit), Playwright (E2E)  
-**Target Platform**: QNAP NAS (Docker), Admin Desktop (Ollama)  
-**Project Type**: Web application (Backend + Frontend)  
-**Performance Goals**: 
+- Frontend: React, TanStack Query, shadcn/ui components
+**Storage**: MariaDB 11.8 (Intent Definitions/Patterns), Redis (Cache), Ollama (LLM)
+**Testing**: Jest (Backend Unit/Integration), Vitest (Frontend Unit), Playwright (E2E)
+**Target Platform**: QNAP NAS (Docker), Admin Desktop (Ollama)
+**Project Type**: Web application (Backend + Frontend)
+**Performance Goals**:
 - Pattern Match: < 10ms (cache hit), < 50ms (cache miss)
 - LLM Fallback: < 2000ms (including Pattern Check)
-**Constraints**: 
+**Constraints**:
 - GPU Budget: RTX 2060 Super 8GB (shared with RAG, OCR, Embedding)
 - LLM Semaphore: Max 3 concurrent calls
 - Bilingual Input: mixed Thai/English + typo tolerance
-**Scale/Scope**: 
+**Scale/Scope**:
 - 12 Intent Definitions (v1)
 - 50+ concurrent users
 - 70-80% Pattern Hit Rate target
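
The pattern-first flow described in this plan (keyword or regex match in ascending priority order, then an LLM fallback whose low-confidence answers become `FALLBACK`) could be sketched roughly as below. This is a minimal illustration under stated assumptions, not the module's actual code: the `IntentPattern` shape, the `LlmClassifier` callback, and the confidence of 1 assigned to pattern hits are all hypothetical; only the two pattern types, the priority ordering, and the 0.4 threshold come from the spec in this commit.

```typescript
// Hypothetical shapes: the real entity and column names are not shown in this diff.
type PatternType = "keyword" | "regex";

interface IntentPattern {
  intentCode: string;
  patternType: PatternType;
  pattern: string;
  priority: number; // lower value is checked first
}

interface ClassificationResult {
  intentCode: string;
  method: "pattern" | "llm";
  confidence: number;
}

// Assumed callback standing in for the Ollama call; not a real client API.
type LlmClassifier = (input: string) => Promise<{ intentCode: string; confidence: number }>;

const MIN_LLM_CONFIDENCE = 0.4; // threshold taken from the spec

function matchesPattern(input: string, p: IntentPattern): boolean {
  if (p.patternType === "keyword") {
    // keyword: case-insensitive substring check
    return input.toLowerCase().includes(p.pattern.toLowerCase());
  }
  try {
    return new RegExp(p.pattern).test(input);
  } catch {
    return false; // an invalid stored regex is skipped rather than treated as fatal
  }
}

function matchByPriority(input: string, patterns: IntentPattern[]): IntentPattern | undefined {
  // Copy before sorting so the (possibly cached) pattern list is not mutated.
  return [...patterns]
    .sort((a, b) => a.priority - b.priority)
    .find((p) => matchesPattern(input, p));
}

async function classify(
  input: string,
  patterns: IntentPattern[],
  llm: LlmClassifier,
): Promise<ClassificationResult> {
  const hit = matchByPriority(input, patterns);
  if (hit) {
    // Assumption: a pattern hit is reported with confidence 1.
    return { intentCode: hit.intentCode, method: "pattern", confidence: 1 };
  }
  const guess = await llm(input);
  const intentCode = guess.confidence < MIN_LLM_CONFIDENCE ? "FALLBACK" : guess.intentCode;
  return { intentCode, method: "llm", confidence: guess.confidence };
}
```

Sorting a copy on every call keeps the sketch short; a real service would more likely sort once when the pattern list is loaded into the cache.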
@@ -146,7 +146,7 @@ frontend/

 **Topics to Research**:
 1. Redis Cache Strategy for Patterns (TTL + Invalidation)
-2. Ollama HTTP API Integration (gemma4:e4b Q8_0)
+2. Ollama HTTP API Integration (gemma4:e2b)
 3. Semaphore Pattern in NestJS (p-limit or RxJS)
 4. Regex Validation in TypeORM/Class-Validator
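
For research topic 3, one possible shape of a semaphore that caps concurrent LLM calls is sketched below. It is a hand-rolled illustration rather than the p-limit or RxJS approach the list names, and the `Semaphore` class and `demoPeakConcurrency` helper are hypothetical names introduced only for this example; the limit of 3 comes from the constraints in this plan.

```typescript
// Minimal counting semaphore: at most `max` tasks run at once, the rest queue in FIFO order.
class Semaphore {
  private readonly max: number;
  private active = 0;
  private readonly waiters: Array<() => void> = [];

  constructor(max: number) {
    if (!Number.isInteger(max) || max < 1) {
      throw new RangeError("max must be a positive integer");
    }
    this.max = max;
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.max) {
      // Wait until a finishing task hands its slot over to us.
      await new Promise<void>((resolve) => this.waiters.push(resolve));
    } else {
      this.active++;
    }
    try {
      return await task();
    } finally {
      const next = this.waiters.shift();
      if (next) {
        next(); // slot is passed on directly, so `active` stays unchanged
      } else {
        this.active--;
      }
    }
  }
}

// Demo helper: runs `jobs` short tasks and reports the highest concurrency observed.
async function demoPeakConcurrency(limit: number, jobs: number): Promise<number> {
  const semaphore = new Semaphore(limit);
  let running = 0;
  let peak = 0;
  await Promise.all(
    Array.from({ length: jobs }, () =>
      semaphore.run(async () => {
        running++;
        peak = Math.max(peak, running);
        await new Promise<void>((resolve) => setTimeout(resolve, 5));
        running--;
      }),
    ),
  );
  return peak;
}
```

In a service, a single shared instance such as `new Semaphore(3)` would wrap each fallback call via `run(() => ...)`. Because a slot is released in `finally`, a failed or timed-out call does not leak capacity.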

@@ -1,13 +1,13 @@
 # Quick Start: Intent Classification System

-**Feature**: 224-intent-classification  
+**Feature**: 224-intent-classification
 **Date**: 2026-05-19

 ---

 ## Prerequisites

-- Ollama Server on Admin Desktop (Desk-5439) with Model `gemma4:e4b`
+- Ollama Server on Admin Desktop (Desk-5439) with Model `gemma4:e2b`
 - Redis Server up and running
 - Database Schema updated via SQL Delta

@@ -56,7 +56,7 @@ INSERT INTO ai_intent_definitions (intent_code, description_th, description_en,
 ```env
 # Ollama Configuration
 OLLAMA_BASE_URL=http://192.168.10.10:11434
-OLLAMA_MODEL=gemma4:e4b
+OLLAMA_MODEL=gemma4:e2b
 OLLAMA_TIMEOUT_MS=5000

 # Intent Classification
@@ -1,8 +1,8 @@
 # Feature Specification: Intent Classification System

-**Feature Branch**: `224-intent-classification`  
-**Created**: 2026-05-19  
-**Status**: Draft  
+**Feature Branch**: `224-intent-classification`
+**Created**: 2026-05-19
+**Status**: Draft
 **Input**: ADR-024 Intent Classification Strategy + CONTEXT.md AI Runtime Layer

 ---
@@ -93,7 +93,7 @@
 - **FR-004**: The system must support 2 Pattern Types: `keyword` (case-insensitive includes) and `regex` (RegExp.test)
 - **FR-005**: The system must have a Caching Layer on Redis (Key: `ai:intent:patterns:active`, TTL: 300 seconds) to reduce DB queries
 - **FR-006**: The system must run Pattern Matching in Priority order (ASC): Patterns with a lower priority value are checked first
-- **FR-007**: If no Pattern matches → the system must call the LLM Fallback (Ollama gemma4:e4b Q8_0) synchronously
+- **FR-007**: If no Pattern matches → the system must call the LLM Fallback (Ollama gemma4:e2b) synchronously
 - **FR-008**: LLM Fallback must use a Semaphore to limit Concurrent Calls to at most 3 at a time
 - **FR-009**: The system must validate the Confidence Score from the LLM and override the result to `FALLBACK` if confidence < 0.4
 - **FR-010**: The system must record every Classification Request in `ai_audit_logs` with: input, output, method, latency, projectPublicId, userPublicId
@@ -128,7 +128,7 @@

 ### Dependencies

-- Ollama Server on Admin Desktop (Desk-5439) with Model gemma4:e4b Q8_0
+- Ollama Server on Admin Desktop (Desk-5439) with Model gemma4:e2b
 - Redis Cache Server up and running
 - Database Schema: tables `ai_intent_definitions` and `ai_intent_patterns` (added via SQL Delta)
 - Existing AI Gateway Module (ADR-023A)
@@ -8,17 +8,17 @@

 ## Summary

-Refactor the migration architecture to align with ADR-023A: n8n calls go through BullMQ instead of calling Ollama directly, use `gemma4:e4b Q8_0`, OCR via PyMuPDF/PaddleOCR, create Backend endpoint `/api/ai/jobs`, SQL delta for `tags`/`correspondence_tags`, and Migration Review UI
+Refactor the migration architecture to align with ADR-023A: n8n calls go through BullMQ instead of calling Ollama directly, use `gemma4:e2b`, OCR via PyMuPDF/PaddleOCR, create Backend endpoint `/api/ai/jobs`, SQL delta for `tags`/`correspondence_tags`, and Migration Review UI

 ## Technical Context

-**Language/Version**: TypeScript 5.x, NestJS 10.x, Next.js 14.x  
-**Primary Dependencies**: BullMQ, TypeORM, CASL, TanStack Query, Zod  
-**Storage**: MariaDB (SQL delta via ADR-009), Qdrant (embedding), Redis (BullMQ)  
-**Testing**: Jest (Backend), Vitest (Frontend)  
-**Target Platform**: QNAP NAS (Backend + n8n), Admin Desktop Desk-5439 (Ollama + OCR Worker)  
-**Performance Goals**: Fast Path OCR < 5s/file; Slow Path OCR < 60s/file; AI inference < 30s  
-**Constraints**: VRAM peak ~4.3GB; BullMQ concurrency=1 (ai-batch); Token TTL ≤ 7 days  
+**Language/Version**: TypeScript 5.x, NestJS 10.x, Next.js 14.x
+**Primary Dependencies**: BullMQ, TypeORM, CASL, TanStack Query, Zod
+**Storage**: MariaDB (SQL delta via ADR-009), Qdrant (embedding), Redis (BullMQ)
+**Testing**: Jest (Backend), Vitest (Frontend)
+**Target Platform**: QNAP NAS (Backend + n8n), Admin Desktop Desk-5439 (Ollama + OCR Worker)
+**Performance Goals**: Fast Path OCR < 5s/file; Slow Path OCR < 60s/file; AI inference < 30s
+**Constraints**: VRAM peak ~2.5GB; BullMQ concurrency=1 (ai-batch); Token TTL ≤ 7 days
 **Scale/Scope**: 20,000 PDF documents; ~3 seconds/record → ~16.6 hours total

 ## Constitution Check
@@ -31,7 +31,7 @@ Refactor migration architecture to align with ADR-0
 | ADR-008 | BullMQ for background jobs | ✅ (ai-batch queue) |
 | ADR-023A | n8n → DMS API → BullMQ → Ollama (no direct calls) | ✅ |
 | ADR-007 | Layered error handling + user-friendly messages | ✅ |
-| ADR-023A | gemma4:e4b Q8_0 + nomic-embed-text only | ✅ |
+| ADR-023A | gemma4:e2b + nomic-embed-text only | ✅ |

 ## Project Structure

@@ -109,7 +109,7 @@ DBA or DevOps creates tables `tags` and `corresp
 - **FR-001b**: Backend must double-check `import_transactions` (document_number + batch_id + status != FAILED) before enqueueing to BullMQ; on a duplicate, return 409 with `existingJobId` (defense-in-depth, separate from Idempotency-Key)
 - **FR-002**: The system must provide the endpoint `GET /api/ai/jobs/:jobId` for polling status and retrieving AI output
 - **FR-003**: BullMQ Worker must run OCR auto-detect: PyMuPDF (extracted_chars > 100) or PaddleOCR + PyThaiNLP
-- **FR-004**: AI inference must use only `gemma4:e4b Q8_0` via Ollama on Desk-5439 (no other models)
+- **FR-004**: AI inference must use only `gemma4:e2b` via Ollama on Desk-5439 (no other models)
 - **FR-005**: Temp files must be auto-cleaned up within 24 hours after a job is `failed` or has no commit (Scheduled BullMQ job)
 - **FR-005a**: The cleanup scheduler must exclude temp files referenced by `migration_review_queue.status = PENDING`; never delete a file that is waiting in the review queue
 - **FR-005b**: PENDING records with no action within 30 days must auto-expire to `EXPIRED` + cleanup temp file + notify Admin (BullMQ notification job)
@@ -39,7 +39,7 @@

 - [x] T009 [US1] Create BullMQ Worker `MigrateDocumentWorker` in `backend/src/modules/ai/workers/migrate-document.worker.ts`; Step 1: fetch temp file from StorageService
 - [x] T010 [P] [US1] Add OCR routing logic to the Worker: PyMuPDF Fast Path (chars > 100) or PaddleOCR Slow Path, called via the OCR Service HTTP API (not direct Ollama)
-- [x] T011 [P] [US1] Add gemma4:e4b inference to the Worker: System Prompt + User Prompt for metadata extraction + classification + tagging
+- [x] T011 [P] [US1] Add gemma4:e2b inference to the Worker: System Prompt + User Prompt for metadata extraction + classification + tagging
 - [x] T012 [US1] Add JSON validation + error handling to the Worker (ADR-007): if the AI output is malformed → mark job failed + log in `ai_audit_logs`
 - [x] T013 [US1] Add `submitMigrationJob()` method in `backend/src/modules/ai/ai.service.ts`: (1) Idempotency-Key check; (2) double-check `import_transactions` (document_number + batch_id + status != FAILED) before enqueue → 409 with existingJobId on duplicate (FR-001b); (3) enqueue to the ai-batch queue
 - [x] T014 [US1] Add `POST /api/ai/jobs` endpoint in `backend/src/modules/ai/ai.controller.ts` (JwtAuthGuard + CaslAbilityGuard + Idempotency-Key header validation)