690419:1310 feat: update CI/CD to use SSH key authentication #03
@@ -71,6 +71,18 @@ jobs:
       - name: " Checkout"
         uses: actions/checkout@v4
 
+      - name: " Debug Connection Info"
+        run: |
+          echo "HOST length: ${#HOST_VAL}"
+          echo "PORT value: $PORT_VAL"
+          # Try to resolve the host's DNS name
+          nslookup "$HOST_VAL" 2>/dev/null || host "$HOST_VAL" 2>/dev/null || echo "Cannot resolve"
+          # Check whether the host responds on the port
+          nc -zv -w5 "$HOST_VAL" "$PORT_VAL" 2>&1 || true
+        env:
+          HOST_VAL: ${{ secrets.HOST }}
+          PORT_VAL: ${{ secrets.PORT }}
+
       - name: " Setup SSH Key and Deploy to QNAP"
         run: |
           # Setup SSH key authentication
@@ -0,0 +1,148 @@
# ADR-022: Retrieval-Augmented Generation (RAG) System

**Status:** Accepted
**Date:** 2026-04-19
**Decision Makers:** Development Team, System Architect
**Related Documents:**

- [RAG Implementation Guide v1.1.2](../08-Tasks/ADR-022-Retrieval-Augmented-Generation/LCBP3-RAG-Implementation-Guide-v1.1.2.md)
- [Implementation Plan](../08-Tasks/ADR-022-Retrieval-Augmented-Generation/plan.md)
- [Tasks](../08-Tasks/ADR-022-Retrieval-Augmented-Generation/tasks.md)
- [ADR-018: AI Boundary](./ADR-018-ai-boundary.md)
- [ADR-020: AI Intelligence Integration](./ADR-020-ai-intelligence-integration.md)

---

## 🎯 Gap Analysis & Purpose

### Gaps closed from existing documents:

- **ADR-020 AI Integration**: states that the DMS must support Document Q&A but defines no retrieval mechanism
  - Rationale: an LLM alone knows nothing about the project's documents, so context must be injected from the actual documents
- **01-03-modules**: users need to search Correspondence/RFA/Drawing documents in natural language
  - Rationale: full-text search only matches keywords; it understands neither semantics nor Thai

### Conflicts resolved:

- **ADR-018 AI Isolation** vs **RAG Cloud API**: the Typhoon API is a cloud LLM, which conflicts with the on-premises policy
  - Decision: CONFIDENTIAL documents → local Ollama only; PUBLIC/INTERNAL → Typhoon primary + Ollama failover

---

## Context and Problem Statement

LCBP3-DMS stores the documents of a large construction project (Correspondence, RFA, Drawing, Contract), all in Thai and English. Users need to search these documents and ask questions about them through conversational Q&A, receiving answers with citations that can be traced back to the source.

### Key Problems

1. **Semantic Gap:** Full-text search finds keywords but does not understand meaning; for example, "ปัญหาเรื่องการส่งมอบ" ("problems with delivery") does not match "delay in delivery"
2. **Thai Language:** MariaDB FULLTEXT does not tokenize Thai correctly
3. **Hallucination Risk:** An LLM without context answers from its training data, not from the actual project documents
4. **Security Isolation:** CONFIDENTIAL documents must never be sent to any cloud AI (ADR-018)
5. **Multi-tenancy:** Each project's documents must be strictly separated by query scope

---

## Decision Drivers

- **Accuracy:** Answers must come from the actual documents, with verifiable citations
- **Security (ADR-018):** CONFIDENTIAL → local Ollama only; cloud APIs are prohibited
- **Thai Language:** PyThaiNLP tokenization must run before embedding
- **Multi-tenancy:** Qdrant `project_public_id` payload filter (non-negotiable)
- **Performance SLO:** Typhoon primary p95 < 3s; Ollama fallback p95 < 10s

---

## Considered Options

### Option 1: Full-text Search Only (MariaDB FULLTEXT)

**Approach:** Use the existing MariaDB FULLTEXT index and prompt the LLM with the raw search results

**Pros:** No additional infrastructure; easy to deploy

**Cons:** Poor Thai tokenization, no semantic understanding, low recall

---

### Option 2: Vector Search Only (Qdrant + Ollama)

**Approach:** Embed every chunk → store in Qdrant → query by vector similarity only

**Pros:** Good semantic search; supports Thai

**Cons:** Exact keyword matching is worse than FULLTEXT; misses document numbers typed verbatim

---

### Option 3: Hybrid RAG (Vector + Keyword + Re-rank) ✅ **SELECTED**

**Approach:** Hybrid search (0.7 vector + 0.3 keyword) → score-based re-rank → context build → LLM generate

**Pros:**
- Best of both semantic and exact matching
- PyThaiNLP preprocessing before embedding → high Thai-language accuracy
- Qdrant tiered multitenancy → complete tenant isolation
- Typhoon primary + Ollama failover → high availability

**Cons:** Complex infrastructure (Qdrant + Redis + BullMQ + PyThaiNLP microservice)

---

## Decision Outcome

**Chosen: Option 3, Hybrid RAG**

### Core Architecture

| Component | Technology | Role |
|-----------|-----------|------|
| Vector Store | Qdrant v1.16+ | Tiered multitenancy (`project_public_id` is_tenant=true) |
| Embedding | nomic-embed-text (Ollama) | 768-dim Thai+English vectors |
| Thai NLP | PyThaiNLP microservice | Tokenize + normalize before embedding |
| Queue | BullMQ (rag:ocr, rag:thai-preprocess, rag:embedding) | Async ingestion pipeline |
| LLM Primary | Typhoon API | PUBLIC + INTERNAL (p95 < 3s) |
| LLM Fallback | Ollama local | CONFIDENTIAL + Typhoon down (p95 < 10s) |
| Cache | Redis | Query cache key: SHA256(question+projectPublicId+classificationCeiling) |

### Key Decisions (Clarified 2026-04-19)

| # | Decision | Rationale |
|---|----------|-----------|
| Q1 | Tiered multitenancy (single collection + is_tenant=true) | Qdrant v1.16+ native; no collection-per-project overhead |
| Q2 | Auto-failover Typhoon→Ollama | Availability > consistency for non-CONFIDENTIAL |
| Q3 | PyThaiNLP preprocessing | Thai tokenization critical for embed quality |
| Q4 | rag_status in `attachments` table | Single source of truth per file (ADR-009) |
| Q5 | Split SLO: Typhoon < 3s / Ollama < 10s | Realistic targets per LLM path |

### Edge Cases Enforced (Red Team 2026-04-19)

| EC | Rule |
|----|------|
| EC-RAG-001 | BullMQ jobId = attachmentId (native dedup; prevents concurrent ingestion) |
| EC-RAG-002 | reIngest() cleanup: DELETE chunks → DELETE Qdrant → PENDING (ordered) |
| EC-RAG-003 | QdrantService.OnModuleInit: auto-create collection; collectionReady flag + 503 |
| EC-RAG-004 | maxClassification must never be accepted from the client; it is derived server-side from the user role only |
| EC-RAG-005 | Cache key = SHA256(question+projectPublicId+classificationCeiling); CONFIDENTIAL bypass cache |

---

## Consequences

### Positive

- ✅ Thai-language Q&A over project documents with verifiable citations
- ✅ ADR-018 compliant: CONFIDENTIAL data never leaves the on-premises environment
- ✅ Multi-tenant isolation at the Qdrant payload-filter level
- ✅ Auto-failover → high availability

### Negative / Trade-offs

- ⚠️ Increased infrastructure complexity (Qdrant + PyThaiNLP microservice)
- ⚠️ Initial indexing takes time; existing documents must be batch re-indexed
- ⚠️ The Ollama fallback target of p95 < 10s may give CONFIDENTIAL users a poor UX

### Risks

- Qdrant volume loss → all vectors must be re-indexed (mitigated: Qdrant snapshot backup §19.7)
- PyThaiNLP microservice down → ingestion queue stuck (mitigated: 3-retry DLQ + FAILED status)
+1
-1
@@ -1,7 +1,7 @@
 
 **LCBP3 DMS**
 
-**RAG Implementation Guide**
+**RAG Implementation Guide by Claude**
 
 Retrieval-Augmented Generation for Document Management System
 
+1
-1
@@ -1,4 +1,4 @@
-This is the revised draft of the **LCBP3 RAG Implementation Guide (v1.1.0)** in Markdown format, focused on strengthening security, multi-project support (multi-tenancy), and answer truthfulness in line with the LCBP3 DMS system standards
+This is the revised draft of the **LCBP3 RAG Implementation Guide (v1.1.0) by Gemini** in Markdown format, focused on strengthening security, multi-project support (multi-tenancy), and answer truthfulness in line with the LCBP3 DMS system standards
 
 ---
 
+1
-1
@@ -1,4 +1,4 @@
-# 📋 Review: LCBP3 DMS RAG Implementation Guide (v1.1.0)
+# 📋 Review: LCBP3 DMS RAG Implementation Guide (v1.1.1 by Qwen)
 
 ---
 
+567
@@ -0,0 +1,567 @@
# LCBP3 DMS - RAG v1.1.2 Enterprise Implementation Guide by ChatGPT

Version: 1.1.2
Project: Laem Chabang Basin Phase 3
Stack: NestJS + Next.js + MariaDB + Qdrant + Redis + Ollama + Typhoon API

---

# 1. Overview

This guide is designed to raise the system to real production readiness, focusing on:

* Security (RBAC + Classification)
* Scalability (Async + Queue + Batch)
* Accuracy (Hybrid Search + Re-ranking)
* Observability (Logging + Metrics)

---

## Clarifications

### Session 2026-04-19

- Q: Qdrant multi-tenancy strategy: how should the RAG system separate data between projects in Qdrant? → A: Option C, Tiered Multitenancy: single collection + custom sharding + `is_tenant: true` payload index on `project_public_id` (Qdrant v1.16+)
- Q: Typhoon API failover: what should the system do when the Typhoon Cloud API is unavailable for non-CONFIDENTIAL queries? → A: Option B, automatic failover to the local Ollama LLM, notifying the user that a local model is in use (quality may be lower)
- Q: Thai language preprocessing: should preprocessing be added before embedding (after OCR)? → A: Option B, add PyThaiNLP tokenization + normalization (Thai numerals, abbreviations, section headers) before sending text to nomic-embed-text; the embedding model stays the same
- Q: `rag_status` field location: which table should hold the RAG ingestion status? → A: Option A, add `rag_status ENUM('PENDING','PROCESSING','INDEXED','FAILED')` and `rag_last_error TEXT` to the `attachments` table (ADR-009: edit the SQL directly; migrations are not allowed)
- Q: End-to-end RAG query latency SLO: what should the latency target for a RAG query be? → A: Option D, split the SLO by path: Typhoon primary p95 < 3s / Ollama fallback p95 < 10s

---

# 2. High-Level Architecture

## Flow

Upload → OCR → Normalize → Chunk → Embed → Store

Query → Auth → Rewrite → Hybrid Search → Re-rank → Context Build → LLM → Response

---

# 3. Core Components

## 3.1 Storage

* MariaDB: Document metadata + chunks
* Qdrant: Vector store using **Tiered Multitenancy** (single collection `lcbp3_vectors`, Qdrant v1.16+)
  * Use an `is_tenant: true` payload index on `project_public_id` (a separate collection per project is not allowed)
  * Custom sharding to reduce noisy-neighbor effects between projects
  * Every query must always include a `project_public_id` payload filter (non-negotiable)
* Redis: Cache + Queue

## 3.2 AI Layer

* Embedding: Ollama (nomic-embed-text)
* LLM (Primary): Typhoon API, for PUBLIC + INTERNAL
* LLM (Fallback): Local Ollama, with automatic failover when Typhoon times out or is down; the result carries the flag `used_fallback_model: true`
* LLM (CONFIDENTIAL): Local Ollama only (ADR-018, non-negotiable)
* Failover trigger: Typhoon API timeout > 5 seconds or an HTTP 5xx error

---

# 4. Data Model

## document_chunks

* id (UUID)
* document_id
* chunk_index
* content
* doc_type
* doc_number
* revision
* project_code
* classification
* version
* embedding_model
* created_at

## Indexes

* INDEX(document_id)
* INDEX(doc_number, revision)
* FULLTEXT(content)

## attachments (new RAG fields, ADR-009)

* `rag_status ENUM('PENDING','PROCESSING','INDEXED','FAILED') DEFAULT 'PENDING'`: tracks ingestion status at the file level
* `rag_last_error TEXT NULL`: stores the last error message when status = FAILED

> ⚠ Do not add `rag_status` to `document_chunks` (that table stores only chunks that are already indexed)

---

# 5. Security Model

## 5.1 RBAC

* Enforce the CASL permission `manage:rag` on every query
* Filter by user permission + `project_public_id` (tenant isolation)
* Every endpoint must have a CASL Guard; bypassing it is not allowed

## 5.2 Classification

* PUBLIC
* INTERNAL
* CONFIDENTIAL

CONFIDENTIAL → local LLM only

---

# 6. Chunking Strategy

| Type     | Strategy    | Size | Overlap |
| -------- | ----------- | ---- | ------- |
| CORR/MOM | Paragraph   | 500  | 50      |
| RFI/NCR  | Section     | 300  | 30      |
| CONTRACT | Table-aware | 400  | 40      |
| RPT      | Sliding     | 600  | 100     |
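
The Size and Overlap columns can be read as a sliding window over the text. The sketch below is a minimal illustration, assuming both values count pre-tokenized units (the guide does not state the unit). `ChunkRule`, `CHUNK_RULES`, and `slidingChunks` are hypothetical names. The paragraph-, section-, and table-aware strategies would also need boundary detection, which is not shown.

```typescript
// Sketch only: Size/Overlap are assumed to count tokens, not characters.
type ChunkRule = { size: number; overlap: number };

// Values copied from the chunking table.
const CHUNK_RULES: Record<string, ChunkRule> = {
  "CORR/MOM": { size: 500, overlap: 50 },
  "RFI/NCR": { size: 300, overlap: 30 },
  CONTRACT: { size: 400, overlap: 40 },
  RPT: { size: 600, overlap: 100 },
};

function slidingChunks(tokens: string[], rule: ChunkRule): string[][] {
  if (rule.overlap >= rule.size) {
    throw new Error("overlap must be smaller than size");
  }
  const step = rule.size - rule.overlap; // each window starts `step` tokens later
  const chunks: string[][] = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + rule.size));
    if (start + rule.size >= tokens.length) break; // this window reached the end
  }
  return chunks;
}
```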

---

# 7. Ingestion Pipeline

## Steps

1. Extract text (OCR with Tesseract, per ADR-017B)
2. **Thai Preprocessing (PyThaiNLP):** tokenize + normalize (Thai numerals, expansion of abbreviations such as "รฟม.", section header stripping)
3. Chunk (per the strategy in Section 6)
4. Send to queue (BullMQ)
5. Embed (in parallel, nomic-embed-text via Ollama)
6. Upsert Qdrant (batch, collection `lcbp3_vectors`)
7. Save DB (document_chunks + update rag_status)

## Queue Design (BullMQ, ADR-008)

| Queue | Consumer | Workers |
|-------|----------|---------|
| `rag:ocr` | OcrProcessor | 2 |
| `rag:thai-preprocess` | ThaiPreprocessProcessor | 4 |
| `rag:embedding` | EmbeddingProcessor | 3 |

---

# 8. Retrieval (Hybrid)

## Vector Search

Qdrant top 20

## Keyword Search

MariaDB FULLTEXT top 20

## Merge

score = 0.7 × vector score + 0.3 × keyword score

---

# 9. Re-ranking

Top 20 → Re-rank → Top 5

**Decision (MVP)**: Score-based sort. Order the results by `mergedScore` from Hybrid Search (0.7 vector + 0.3 keyword) in descending order and keep the top 5. The MVP makes no additional LLM re-ranking call.

**Future Enhancement**: Typhoon LLM cross-encoder re-ranking (deferred to Phase 2)
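
The merge and the MVP re-rank can be sketched together as one weighted sum followed by a sort. This is a minimal illustration, assuming both result lists carry scores already normalized to the range 0 to 1 (MariaDB FULLTEXT relevance is not bounded, so some normalization step is assumed upstream) and that a chunk missing from one list contributes 0 for that side. `Hit`, `MergedHit`, and `mergeAndRank` are hypothetical names.

```typescript
type Hit = { chunkId: string; score: number };
type MergedHit = { chunkId: string; mergedScore: number };

function mergeAndRank(
  vectorHits: Hit[],
  keywordHits: Hit[],
  finalK = 5,
  wVector = 0.7,
  wKeyword = 0.3,
): MergedHit[] {
  const merged = new Map<string, number>();
  // Accumulate the weighted score per chunk; a chunk may appear in one or both lists.
  const add = (hits: Hit[], weight: number): void => {
    for (const hit of hits) {
      merged.set(hit.chunkId, (merged.get(hit.chunkId) ?? 0) + weight * hit.score);
    }
  };
  add(vectorHits, wVector);
  add(keywordHits, wKeyword);
  // Score-based sort, descending, then cut to the final K (no LLM re-rank call).
  return Array.from(merged, ([chunkId, mergedScore]) => ({ chunkId, mergedScore }))
    .sort((a, b) => b.mergedScore - a.mergedScore)
    .slice(0, finalK);
}
```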

---

# 10. Context Builder

* Limit to 3–5 documents
* Limit to ~3000 tokens

Format:
[DOC_TYPE - DOC_NUMBER - REV]
Content snippet
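
A context builder that follows this format and both limits might look like the sketch below. It assumes the caller supplies a `countTokens` function, since the guide does not name a tokenizer and whitespace counting would be unreliable for Thai. `Chunk` and `buildContext` are hypothetical names.

```typescript
type Chunk = { docType: string; docNumber: string; revision: string; content: string };

function buildContext(
  chunks: Chunk[],
  countTokens: (text: string) => number,
  maxDocs = 5,
  maxTokens = 3000,
): string {
  const blocks: string[] = [];
  let used = 0;
  for (const chunk of chunks.slice(0, maxDocs)) {
    // Header line in the documented [DOC_TYPE - DOC_NUMBER - REV] format.
    const block = `[${chunk.docType} - ${chunk.docNumber} - ${chunk.revision}]\n${chunk.content}`;
    const cost = countTokens(block);
    if (used + cost > maxTokens) break; // stop before exceeding the token budget
    blocks.push(block);
    used += cost;
  }
  return blocks.join("\n\n");
}
```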

---

# 11. Prompt Design

System Prompt:

* Answer only from context
* If not found → "ไม่พบข้อมูลในเอกสาร" ("No information found in the documents")
* MUST cite source

---

# 12. Query Flow

1. Validate user
2. Embed question
3. Hybrid search
4. Filter by ACL
5. Re-rank
6. Build context
7. Generate answer
8. Return + sources

---

# 13. Performance

## Latency SLO (End-to-End RAG Query)

| Path | Target | Condition |
|------|--------|---------|
| **Typhoon primary** | p95 < 3s | PUBLIC + INTERNAL while Typhoon is available |
| **Ollama fallback** | p95 < 10s | Typhoon down, or CONFIDENTIAL |

## Cache

* embedding cache
* query result cache (TTL 5 min)

## Retry

* 3 attempts
* exponential backoff

## Timeout

* 5 seconds per service (individual service call limit)
* Typhoon API: timeout 5s → trigger failover to Ollama
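
The timeout-driven failover could be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: `Llm`, `withTimeout`, `shouldFailover`, and `generate` are hypothetical names, upstream errors are assumed to expose an HTTP `status` field, and a CONFIDENTIAL request is assumed to report `used_fallback_model: false` because Ollama is its designated model rather than a fallback.

```typescript
type Classification = "PUBLIC" | "INTERNAL" | "CONFIDENTIAL";
type Llm = (prompt: string) => Promise<string>;

// Reject with a tagged error if `work` does not settle within `ms`.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(Object.assign(new Error("timeout"), { code: "TIMEOUT" })),
      ms,
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Failover only on a timeout or an HTTP 5xx, as the guide specifies.
function shouldFailover(err: unknown): boolean {
  const e = err as { code?: string; status?: number } | null;
  return e?.code === "TIMEOUT" || (typeof e?.status === "number" && e.status >= 500);
}

async function generate(
  prompt: string,
  ceiling: Classification,
  typhoon: Llm,
  ollama: Llm,
  timeoutMs = 5000,
) {
  // ADR-018: CONFIDENTIAL never goes to the cloud, so Typhoon is not tried at all.
  if (ceiling === "CONFIDENTIAL") {
    return { answer: await ollama(prompt), used_fallback_model: false };
  }
  try {
    return { answer: await withTimeout(typhoon(prompt), timeoutMs), used_fallback_model: false };
  } catch (err) {
    if (!shouldFailover(err)) throw err; // e.g. a 4xx is a caller error, not an outage
    return { answer: await ollama(prompt), used_fallback_model: true };
  }
}
```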

---

# 14. Observability

## Logging

* user_id
* question
* retrieved docs
* answer
* latency

## Metrics

* latency
* QPS
* accuracy

---

# 15. Deployment

## Docker Services

* qdrant
* redis
* ollama
* api (NestJS)

---

# 16. Rollout Plan

> See `plan.md` for the full details of the rollout phases

| Phase | Name | Duration |
|-------|------|----------|
| 1 | Infrastructure | 2 days |
| 2 | Core Services | 3 days |
| 3 | RAG Query API (MVP) | 2 days |
| 4 | Ingestion Pipeline | 3 days |
| 5 | Frontend UI + Polish | 3 days |

---

# 17. Best Practices

* Use batch embedding
* Limit the context
* Log every query
* Test with a real dataset

---

# 18. Future Enhancements

* Re-ranking model tuning
* Feedback loop
* Active learning
* Multilingual support

---

# 19. Production Rollout Package (v2.1)

## 19.1 Docker Compose (Production Example)

```yaml
version: '3.9'
services:
  qdrant:
    image: qdrant/qdrant:latest
    restart: unless-stopped
    volumes:
      - qdrant_data:/qdrant/storage
    networks: [internal]

  redis:
    image: redis:7-alpine
    restart: unless-stopped
    volumes:
      - redis_data:/data
    networks: [internal]

  ollama:
    image: ollama/ollama
    restart: unless-stopped
    volumes:
      - ollama_data:/root/.ollama
    networks: [internal]

  api:
    build: .
    restart: unless-stopped
    depends_on:
      - qdrant
      - redis
      - ollama
    environment:
      - QDRANT_URL=http://qdrant:6333
      - REDIS_HOST=redis
      - OLLAMA_URL=http://ollama:11434
    networks: [internal]

networks:
  internal:
    driver: bridge

volumes:
  qdrant_data:
  redis_data:
  ollama_data:
```

---

## 19.2 Environment (Production)

```env
NODE_ENV=production

QDRANT_URL=http://qdrant:6333
REDIS_HOST=redis
REDIS_PORT=6379

OLLAMA_URL=http://ollama:11434

TYPHOON_API_KEY=***

RAG_TOPK=20
RAG_FINAL_K=5
RAG_TIMEOUT_MS=5000
```

---

## 19.3 Qdrant Optimization (Tiered Multitenancy)

* Use this HNSW config for the tenant-aware index:

```
payload_m: 16   # tenant HNSW index size
m: 0            # disable the global index (non-negotiable for is_tenant=true)
```

* Enable payload indexes:

  * `project_public_id` with `is_tenant: true` (required)
  * `doc_number`
  * `classification`

> ⚠️ Do not use `m=16` with a global index under tiered multitenancy; it stops is_tenant from working

---

## 19.4 Worker Scaling

* Separate worker containers:

  * ingestion-worker
  * embedding-worker

* scale:

```bash
docker compose up --scale embedding-worker=3
```

---

## 19.5 Security Hardening Checklist

* Enforce RBAC on every endpoint (`manage:rag` CASL guard)
* Validate input (class-validator + Zod frontend)
* Limit context length (≤3000 tokens)
* Strip sensitive fields before logging
* Idempotency-Key header on `POST /rag/query`
* Prevent prompt injection (research.md R7, 4 layers):
  1. System prompt boundary markers `<CONTEXT_START>` / `<CONTEXT_END>`
  2. Require JSON-only structured output
  3. Post-generation citation validation (check that each doc_number appears in the retrieved chunks)
  4. Fallback response: "ไม่พบข้อมูลที่ระบุ" ("The specified information was not found") when validation fails
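
Layers 3 and 4 of the prompt-injection defense can be sketched as a single post-generation check. The sketch assumes the model's structured JSON output has an `answer` field and a `citedDocNumbers` array (both hypothetical) and that an answer with no citation at all also fails validation, in line with the "MUST cite source" prompt rule. `enforceCitations` is a hypothetical name.

```typescript
const NOT_FOUND_MESSAGE = "ไม่พบข้อมูลที่ระบุ";

// Assumed shape of the model's JSON-only structured output.
type LlmOutput = { answer: string; citedDocNumbers: string[] };

function enforceCitations(output: LlmOutput, retrievedDocNumbers: string[]): string {
  const allowed = new Set(retrievedDocNumbers);
  // Every cited doc_number must come from the retrieved chunks, and at least one is required.
  const valid =
    output.citedDocNumbers.length > 0 &&
    output.citedDocNumbers.every((docNumber) => allowed.has(docNumber));
  return valid ? output.answer : NOT_FOUND_MESSAGE;
}
```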

---

## 19.6 Logging & Monitoring Stack

* Logs: JSON format
* Tools:

  * Loki / ELK
  * Prometheus + Grafana

Metrics:

* latency
* error rate
* retrieval accuracy

---

## 19.7 Backup Strategy

* MariaDB: daily dump
* Qdrant: snapshot volume
* Redis: optional (cache)

---

## 19.8 Deployment Strategy

* Blue/Green deployment

* Health check endpoint:
  /health

* Readiness:

  * Qdrant OK
  * Redis OK
  * Ollama OK

---

## 19.9 Testing Checklist

* Query accuracy
* Permission isolation
* Load test (concurrent users)
* Failover test

---

## 19.10 Go-Live Checklist

* Infra ready
* Env configured
* Index built
* RBAC tested
* Monitoring enabled

---

# 20. Edge Cases & Robustness Rules

> Added from the Red Team session on 2026-04-19 (`/util-speckit.quizme`)

## EC-RAG-001: Concurrent Ingestion Dedup

**Scenario**: The same file is committed twice in a row (retry / double-click) → `IngestionService.enqueue()` is called repeatedly

**Rule**: Use `attachmentId` as the BullMQ `jobId` on the `rag:ocr` queue for native dedup. If that job is already active/waiting, BullMQ silently ignores the new one and the logger records `'rag:ocr job already queued for {attachmentId}'`

**Defense in depth**: `OcrProcessor` must validate `rag_status !== PROCESSING` as the first double-check before it starts work. If the status is already PROCESSING, return `MoveToCompleted` without raising an error

---

## EC-RAG-002: Partial Ingestion Cleanup Before Re-ingest

**Scenario**: The embedding processor fails partway (some chunks were already upserted to Qdrant) → rag_status = FAILED → an Admin triggers re-ingest

**Rule**: `RagService.reIngest()` must always clean up before re-queueing, in this order:

1. `DELETE FROM document_chunks WHERE document_id = :attachmentId` (DB transaction; can be rolled back)
2. DELETE Qdrant points WITH payload filter `documentId = :attachmentId`
3. SET `rag_status = PENDING` + clear `rag_last_error`

**Failure handling**:
- Step 1 fails → throw `BusinessException`; do not continue
- Step 2 fails → log ERROR but continue to step 3 (the Qdrant cleanup happens implicitly once re-embedding succeeds, because the same chunk UUIDs are reused and the upsert overwrites them)
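
The ordering and the two failure rules can be sketched with injected dependencies. This is a minimal illustration: `ReIngestDeps` and its method names are hypothetical stand-ins for the real repository, Qdrant client, and logger, and a step 1 failure simply propagates here rather than being wrapped in `BusinessException`.

```typescript
interface ReIngestDeps {
  deleteChunks(attachmentId: string): Promise<void>;       // step 1: MariaDB
  deleteQdrantPoints(attachmentId: string): Promise<void>; // step 2: Qdrant
  markPending(attachmentId: string): Promise<void>;        // step 3: rag_status
  logError(message: string, err: unknown): void;
}

async function reIngest(attachmentId: string, deps: ReIngestDeps): Promise<void> {
  // Step 1: a failure here propagates and stops the whole operation.
  await deps.deleteChunks(attachmentId);
  // Step 2: a failure is logged but does not block step 3.
  try {
    await deps.deleteQdrantPoints(attachmentId);
  } catch (err) {
    deps.logError(`Qdrant cleanup failed for ${attachmentId}`, err);
  }
  // Step 3: reset the status so the file can be queued again.
  await deps.markPending(attachmentId);
}
```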

---

## EC-RAG-003: Qdrant Collection Auto-init on Startup

**Scenario**: The Qdrant container starts fresh or its volume is lost, and a user queries before an Admin has ever run the init endpoint

**Rule**: `QdrantService` must implement `OnModuleInit` to idempotently create/validate the `lcbp3_vectors` collection with HNSW config `payload_m:16, m:0` every time the app starts

**Failure handling**:
- Qdrant does not respond → log ERROR + set `collectionReady = false`; do not throw (the app must not crash)
- Query endpoints must check `collectionReady` before every call. If it is `false`, return 503 `{"message":"RAG service unavailable","code":"RAG_NOT_READY"}`

**Note**: `POST /rag/admin/init-collection` (T038) remains useful for forcing a re-init if the Qdrant volume is lost mid-production
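
The readiness flag and the 503 guard can be sketched independently of NestJS. In this illustration `QdrantReadiness`, `init`, and `guard` are hypothetical names, and `ensureCollection` stands in for whatever call creates or validates `lcbp3_vectors`; in the real service `init` would presumably run from `onModuleInit`.

```typescript
class QdrantReadiness {
  collectionReady = false;

  // Never throws: a Qdrant outage at startup must not crash the app.
  async init(
    ensureCollection: () => Promise<void>,
    logError: (err: unknown) => void,
  ): Promise<void> {
    try {
      await ensureCollection();
      this.collectionReady = true;
    } catch (err) {
      logError(err);
      this.collectionReady = false;
    }
  }

  // Returns the 503 payload when the collection is not ready, otherwise null.
  guard(): { status: number; body: { message: string; code: string } } | null {
    if (this.collectionReady) return null;
    return { status: 503, body: { message: "RAG service unavailable", code: "RAG_NOT_READY" } };
  }
}
```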

---

## EC-RAG-004: Classification Level Must Be Server-Derived (Never Client-Provided)

**Scenario**: A user with only PUBLIC access sends `maxClassification: "CONFIDENTIAL"` directly in the request body

**Rule**: `maxClassification` must not be accepted from the request body; remove it from `RagQueryDto` entirely

**Server-side classification derivation**: `RagService.query()` must derive the classification ceiling from the user's role in the project membership:

| Role | Classification Ceiling |
|------|----------------------|
| Admin / Manager | CONFIDENTIAL |
| Member | INTERNAL |
| Guest / Viewer | PUBLIC |

The Qdrant payload filter always uses the server-derived value, so the client has no way to bypass it

**LLM routing** (ADR-018): classification ceiling = CONFIDENTIAL → always use local Ollama and never send the request to the Typhoon API
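
The role table and the resulting payload filter might be derived as in the sketch below. It assumes roles arrive as the exact strings in the table, that an unknown role falls back to PUBLIC (least privilege), and that `classification` is stored as a keyword payload field. `ceilingForRole` and `buildQdrantFilter` are hypothetical names; the `must` / `match.value` / `match.any` shape follows Qdrant's filter syntax.

```typescript
type Classification = "PUBLIC" | "INTERNAL" | "CONFIDENTIAL";

// Ordered from least to most restricted.
const LEVELS: Classification[] = ["PUBLIC", "INTERNAL", "CONFIDENTIAL"];

// A Map avoids accidental hits on inherited object keys such as "constructor".
const ROLE_CEILING = new Map<string, Classification>([
  ["Admin", "CONFIDENTIAL"],
  ["Manager", "CONFIDENTIAL"],
  ["Member", "INTERNAL"],
  ["Guest", "PUBLIC"],
  ["Viewer", "PUBLIC"],
]);

function ceilingForRole(role: string): Classification {
  return ROLE_CEILING.get(role) ?? "PUBLIC"; // assumption: unknown roles get least privilege
}

function buildQdrantFilter(projectPublicId: string, role: string) {
  const ceiling = ceilingForRole(role);
  const allowed = LEVELS.slice(0, LEVELS.indexOf(ceiling) + 1);
  return {
    must: [
      { key: "project_public_id", match: { value: projectPublicId } }, // tenant isolation
      { key: "classification", match: { any: allowed } },              // server-derived ceiling
    ],
  };
}
```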

---

## EC-RAG-005: Query Cache Key Must Include Tenant + Classification (Cross-Tenant Leak Prevention)

**Scenario**: User A (Project LCBP3, CONFIDENTIAL access) asks the same question as User B (Project SAMPLE, PUBLIC only) → User B could hit User A's cache entry and see CONFIDENTIAL content from another project

**Rule**: The query result cache key must be:

```
cacheKey = SHA256(question + ":" + projectPublicId + ":" + classificationCeiling)
```

- Never use only `hash(question)`: it leaks across tenants
- **CONFIDENTIAL queries**: bypass the cache entirely; never read from or write to the Redis cache
- **PUBLIC / INTERNAL queries**: use the cache key above, TTL 5 minutes
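
The key rule and the CONFIDENTIAL bypass can be sketched with Node's built-in `crypto` module. `queryCacheKey` is a hypothetical name, and returning `null` to signal "do not touch the cache" is an assumption made for this illustration.

```typescript
import { createHash } from "node:crypto";

type Classification = "PUBLIC" | "INTERNAL" | "CONFIDENTIAL";

// Returns the Redis key, or null when the cache must be bypassed.
function queryCacheKey(
  question: string,
  projectPublicId: string,
  ceiling: Classification,
): string | null {
  if (ceiling === "CONFIDENTIAL") return null; // never read or write the cache
  return createHash("sha256")
    .update(`${question}:${projectPublicId}:${ceiling}`, "utf8")
    .digest("hex");
}
```

A caller would skip both the Redis read and the Redis write whenever the function returns `null`, and otherwise store the result under the returned key with a 5-minute TTL.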

---

# END

This version adds everything needed for a real deployment, including:

* ✅ docker-compose (production)
* ✅ worker scaling
* ✅ Qdrant tuning (HNSW + index)
* ✅ security hardening checklist
* ✅ monitoring (Loki / Prometheus / Grafana)
* ✅ backup strategy
* ✅ go-live checklist

---
@@ -0,0 +1,233 @@
|
|||||||
|
openapi: 3.1.0
|
||||||
|
info:
|
||||||
|
title: LCBP3 RAG API
|
||||||
|
description: Retrieval-Augmented Generation endpoints for LCBP3 DMS (ADR-022)
|
||||||
|
version: 1.0.0
|
||||||
|
|
||||||
|
servers:
|
||||||
|
- url: /api
|
||||||
|
|
||||||
|
security:
|
||||||
|
- bearerAuth: []
|
||||||
|
|
||||||
|
components:
|
||||||
|
securitySchemes:
|
||||||
|
bearerAuth:
|
||||||
|
type: http
|
||||||
|
scheme: bearer
|
||||||
|
bearerFormat: JWT
|
||||||
|
|
||||||
|
schemas:
|
||||||
|
RagQueryRequest:
|
||||||
|
type: object
|
||||||
|
required: [ question, projectPublicId ]
|
||||||
|
properties:
|
||||||
|
question:
|
||||||
|
type: string
|
||||||
|
maxLength: 500
|
||||||
|
description: คำถามภาษาไทย/อังกฤษ
|
||||||
|
example: "เอกสาร REF-2026-001 มีเนื้อหาอะไร?"
|
||||||
|
projectPublicId:
|
||||||
|
type: string
|
||||||
|
format: uuid
|
||||||
|
description: UUIDv7 ของโครงการ (mandatory tenant isolation)
|
||||||
|
maxClassification:
|
||||||
|
type: string
|
||||||
|
enum: [ PUBLIC, INTERNAL, CONFIDENTIAL ]
|
||||||
|
default: INTERNAL
|
||||||
|
description: ระดับ classification สูงสุดที่ user มีสิทธิ์เข้าถึง
|
||||||
|
|
||||||
|
Citation:
|
||||||
|
type: object
|
||||||
|
properties:
|
||||||
|
chunkId:
|
||||||
|
type: string
|
||||||
|
format: uuid
|
||||||
|
docNumber:
|
||||||
|
type: string
|
||||||
|
nullable: true
|
||||||
|
example: "REF-2026-001"
|
||||||
|
docType:
|
||||||
|
type: string
|
||||||
|
example: "CORR"
|
||||||
|
revision:
|
||||||
|
type: string
|
||||||
|
nullable: true
|
||||||
|
example: "Rev.A"
|
||||||
|
snippet:
|
||||||
|
type: string
|
||||||
|
description: ข้อความส่วนที่ตอบคำถาม (max 200 chars)
|
||||||
|
score:
|
||||||
|
type: number
|
||||||
|
format: float
|
||||||
|
minimum: 0
|
||||||
|
maximum: 1
|
||||||
|
|
||||||
|
RagQueryResponse:
|
||||||
|
type: object
|
||||||
|
properties:
|
||||||
|
answer:
|
||||||
|
type: string
|
||||||
|
description: คำตอบที่ AI สร้างขึ้นจาก context
|
||||||
|
citations:
|
||||||
|
type: array
|
||||||
|
items:
|
||||||
|
$ref: "#/components/schemas/Citation"
|
||||||
|
confidence:
|
||||||
|
type: number
|
||||||
|
format: float
|
||||||
|
minimum: 0
|
||||||
|
maximum: 1
|
||||||
|
description: ระดับความมั่นใจของคำตอบ (< 0.6 = fallback message)
|
||||||
|
fallbackUsed:
|
||||||
|
type: boolean
|
||||||
|
description: true = ใช้ Ollama local LLM แทน Typhoon API
|
||||||
|
latencyMs:
|
||||||
|
type: integer
|
||||||
|
description: เวลา end-to-end (ms)
|
||||||
|
|
||||||
|
RagIngestionStatus:
|
||||||
|
type: object
|
||||||
|
properties:
|
||||||
|
attachmentId:
|
||||||
|
type: string
|
||||||
|
format: uuid
|
||||||
|
ragStatus:
|
||||||
|
type: string
|
||||||
|
enum: [ PENDING, PROCESSING, INDEXED, FAILED ]
|
||||||
|
ragLastError:
|
||||||
|
type: string
|
||||||
|
nullable: true
|
||||||
|
chunkCount:
|
||||||
|
type: integer
|
||||||
|
description: จำนวน chunks ที่ indexed แล้ว (0 ถ้ายังไม่ INDEXED)
|
||||||
|
|
||||||
|
ErrorResponse:
|
||||||
|
type: object
|
||||||
|
properties:
|
||||||
|
statusCode:
|
||||||
|
type: integer
|
||||||
|
message:
|
||||||
|
type: string
|
||||||
|
userMessage:
|
||||||
|
type: string
|
||||||
|
description: Message shown to the user (in Thai)
|
||||||
|
recoveryAction:
|
||||||
|
type: string
|
||||||
|
nullable: true
|
||||||
|
|
||||||
|
paths:
|
||||||
|
/rag/query:
|
||||||
|
post:
|
||||||
|
summary: RAG Query - ask questions about project documents
|
||||||
|
description: |
|
||||||
|
Retrieves context from indexed documents and generates an answer with AI
|
||||||
|
- CONFIDENTIAL documents → local Ollama only (ADR-018)
|
||||||
|
- Every query must specify projectPublicId for tenant isolation
|
||||||
|
- The Idempotency-Key header is mandatory (AGENTS.md security rule)
|
||||||
|
tags: [ RAG ]
|
||||||
|
parameters:
|
||||||
|
- name: Idempotency-Key
|
||||||
|
in: header
|
||||||
|
required: true
|
||||||
|
schema:
|
||||||
|
type: string
|
||||||
|
format: uuid
|
||||||
|
description: UUIDv4 idempotency key — prevent duplicate query submissions
|
||||||
|
requestBody:
|
||||||
|
required: true
|
||||||
|
content:
|
||||||
|
application/json:
|
||||||
|
schema:
|
||||||
|
$ref: "#/components/schemas/RagQueryRequest"
|
||||||
|
responses:
|
||||||
|
"200":
|
||||||
|
description: Answer with citations
|
||||||
|
content:
|
||||||
|
application/json:
|
||||||
|
schema:
|
||||||
|
type: object
|
||||||
|
properties:
|
||||||
|
data:
|
||||||
|
$ref: "#/components/schemas/RagQueryResponse"
|
||||||
|
"400":
|
||||||
|
description: Invalid input (empty question or invalid projectPublicId)
|
||||||
|
content:
|
||||||
|
application/json:
|
||||||
|
schema:
|
||||||
|
$ref: "#/components/schemas/ErrorResponse"
|
||||||
|
"403":
|
||||||
|
description: No permission to access this project
|
||||||
|
content:
|
||||||
|
application/json:
|
||||||
|
schema:
|
||||||
|
$ref: "#/components/schemas/ErrorResponse"
|
||||||
|
"429":
|
||||||
|
description: Rate limit exceeded
|
||||||
|
"503":
|
||||||
|
description: RAG service unavailable (Qdrant, Ollama, and Typhoon are all down)
|
||||||
|
|
||||||
|
/rag/status/{attachmentId}:
|
||||||
|
get:
|
||||||
|
summary: Check the RAG ingestion status of an attachment
|
||||||
|
tags: [ RAG ]
|
||||||
|
parameters:
|
||||||
|
- name: attachmentId
|
||||||
|
in: path
|
||||||
|
required: true
|
||||||
|
schema:
|
||||||
|
type: string
|
||||||
|
format: uuid
|
||||||
|
description: publicId (UUIDv7) of the attachment
|
||||||
|
responses:
|
||||||
|
"200":
|
||||||
|
description: Current ingestion status
|
||||||
|
content:
|
||||||
|
application/json:
|
||||||
|
schema:
|
||||||
|
type: object
|
||||||
|
properties:
|
||||||
|
data:
|
||||||
|
$ref: "#/components/schemas/RagIngestionStatus"
|
||||||
|
"404":
|
||||||
|
description: Attachment not found
|
||||||
|
|
||||||
|
/rag/ingest/{attachmentId}:
|
||||||
|
post:
|
||||||
|
summary: Trigger manual re-ingestion (for files in FAILED state)
|
||||||
|
description: Re-enqueue rag:ocr job สำหรับ attachment; reset rag_status → PENDING
|
||||||
|
tags: [ RAG ]
|
||||||
|
parameters:
|
||||||
|
- name: attachmentId
|
||||||
|
in: path
|
||||||
|
required: true
|
||||||
|
schema:
|
||||||
|
type: string
|
||||||
|
format: uuid
|
||||||
|
responses:
|
||||||
|
"202":
|
||||||
|
description: Re-ingestion job queued
|
||||||
|
"400":
|
||||||
|
description: Attachment is not in FAILED state
|
||||||
|
"404":
|
||||||
|
description: Attachment not found
|
||||||
|
|
||||||
|
/rag/vectors/{attachmentId}:
|
||||||
|
delete:
|
||||||
|
summary: Delete an attachment's vectors from Qdrant (when the document is deleted)
|
||||||
|
description: |
|
||||||
|
Called by DocumentService when an attachment is soft-deleted.
|
||||||
|
Deletes all Qdrant points and document_chunks rows for the attachment.
|
||||||
|
tags: [ RAG ]
|
||||||
|
parameters:
|
||||||
|
- name: attachmentId
|
||||||
|
in: path
|
||||||
|
required: true
|
||||||
|
schema:
|
||||||
|
type: string
|
||||||
|
format: uuid
|
||||||
|
responses:
|
||||||
|
"200":
|
||||||
|
description: Vectors deleted successfully
|
||||||
|
"404":
|
||||||
|
description: Attachment not found
|
||||||
@@ -0,0 +1,235 @@
|
|||||||
|
# Data Model: ADR-022 RAG
|
||||||
|
|
||||||
|
**Date**: 2026-04-19 | **Phase**: 1 — Design
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 1. MariaDB Schema
|
||||||
|
|
||||||
|
### 1.1 Schema Delta: `attachments` table (ADR-009: edit SQL directly)
|
||||||
|
|
||||||
|
```sql
|
||||||
|
-- specs/03-Data-and-Storage/deltas/06-add-rag-status-to-attachments.sql
|
||||||
|
ALTER TABLE attachments
|
||||||
|
ADD COLUMN rag_status ENUM('PENDING', 'PROCESSING', 'INDEXED', 'FAILED')
|
||||||
|
NOT NULL DEFAULT 'PENDING'
|
||||||
|
COMMENT 'สถานะ RAG ingestion ระดับ file',
|
||||||
|
ADD COLUMN rag_last_error TEXT NULL
|
||||||
|
COMMENT 'Error message ล่าสุดเมื่อ rag_status = FAILED',
|
||||||
|
ADD INDEX idx_attachments_rag_status (rag_status);
|
||||||
|
```
|
||||||
|
|
||||||
|
**Lifecycle transitions**:
|
||||||
|
```
|
||||||
|
PENDING → PROCESSING → INDEXED
|
||||||
|
↘ FAILED (retry → PROCESSING → ... max 3 times)
|
||||||
|
```
|
||||||
|
|
||||||
|
### 1.2 New Table — `document_chunks`
|
||||||
|
|
||||||
|
```sql
|
||||||
|
-- specs/03-Data-and-Storage/deltas/06b-create-document-chunks.sql
|
||||||
|
CREATE TABLE document_chunks (
|
||||||
|
id CHAR(36) NOT NULL PRIMARY KEY COMMENT 'UUID = Qdrant point ID',
|
||||||
|
document_id CHAR(36) NOT NULL COMMENT 'FK → attachments.public_id (UUIDv7)',
|
||||||
|
chunk_index INT NOT NULL COMMENT 'ลำดับ chunk ภายใน document',
|
||||||
|
content TEXT NOT NULL COMMENT 'เนื้อหา chunk หลัง PyThaiNLP normalize',
|
||||||
|
doc_type VARCHAR(20) NOT NULL COMMENT 'CORR, RFA, DRAWING, CONTRACT, RPT, TRANS',
|
||||||
|
doc_number VARCHAR(100) NULL COMMENT 'หมายเลขเอกสาร เช่น REF-2026-001',
|
||||||
|
revision VARCHAR(20) NULL COMMENT 'Revision เช่น Rev.A',
|
||||||
|
project_code VARCHAR(50) NOT NULL COMMENT 'รหัสโครงการ (ใช้ filter)',
|
||||||
|
project_public_id CHAR(36) NOT NULL COMMENT 'UUIDv7 ของโครงการ (Qdrant tenant key)',
|
||||||
|
version VARCHAR(20) NULL COMMENT 'เวอร์ชันเอกสาร เช่น 1.0, 2.1 (ถ้ามี)',
|
||||||
|
classification ENUM('PUBLIC', 'INTERNAL', 'CONFIDENTIAL')
|
||||||
|
NOT NULL DEFAULT 'INTERNAL',
|
||||||
|
embedding_model VARCHAR(100) NOT NULL DEFAULT 'nomic-embed-text',
|
||||||
|
created_at DATETIME(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
|
||||||
|
|
||||||
|
INDEX idx_chunks_document_id (document_id),
|
||||||
|
INDEX idx_chunks_doc_number_rev (doc_number, revision),
|
||||||
|
INDEX idx_chunks_project (project_public_id),
|
||||||
|
FULLTEXT INDEX ft_chunks_content (content)
|
||||||
|
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
|
||||||
|
```
|
||||||
|
|
||||||
|
> ⚠️ `document_chunks` stores only chunks that are already indexed. **Do not** add `rag_status` to this table.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 2. Qdrant Vector Store
|
||||||
|
|
||||||
|
### 2.1 Collection: `lcbp3_vectors`
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
// VectorMetadata payload interface (TypeScript)
|
||||||
|
interface VectorMetadata {
|
||||||
|
chunk_id: string; // UUID = document_chunks.id
|
||||||
|
public_id: string; // UUIDv7 of the attachment (document)
|
||||||
|
project_public_id: string; // UUIDv7 of the project (tenant key, mandatory filter)
|
||||||
|
doc_type: string; // CORR, RFA, DRAWING, etc.
|
||||||
|
doc_number: string | null;
|
||||||
|
revision: string | null;
|
||||||
|
project_code: string;
|
||||||
|
classification: 'PUBLIC' | 'INTERNAL' | 'CONFIDENTIAL';
|
||||||
|
content_preview: string; // truncated preview for the UI (max 200 chars)
|
||||||
|
embedding_model: string; // nomic-embed-text
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### 2.2 Collection Setup
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
// qdrant.service.ts — createCollectionIfNotExists()
|
||||||
|
{
|
||||||
|
vectors: { size: 768, distance: 'Cosine' },
|
||||||
|
hnsw_config: {
|
||||||
|
payload_m: 16, // tenant-aware HNSW index
|
||||||
|
m: 0, // disable the global HNSW index
|
||||||
|
},
|
||||||
|
optimizers_config: { indexing_threshold: 10000 },
|
||||||
|
}
|
||||||
|
|
||||||
|
// Payload indexes
|
||||||
|
{ field_name: 'project_public_id', field_schema: { type: 'keyword', is_tenant: true } }
|
||||||
|
{ field_name: 'classification', field_schema: 'keyword' }
|
||||||
|
{ field_name: 'doc_type', field_schema: 'keyword' }
|
||||||
|
{ field_name: 'doc_number', field_schema: 'keyword' }
|
||||||
|
```
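
As a rough illustration of how the mandatory `project_public_id` filter could be combined with the `maxClassification` ceiling from the query DTO, the sketch below builds a Qdrant-style filter object. The helper name `buildTenantFilter` and the ordering PUBLIC < INTERNAL < CONFIDENTIAL are assumptions made for this example; the spec does not define either.

```typescript
type Classification = 'PUBLIC' | 'INTERNAL' | 'CONFIDENTIAL';

// Assumed ordering from least to most restricted (not stated in the spec).
const CLASSIFICATION_ORDER: Classification[] = ['PUBLIC', 'INTERNAL', 'CONFIDENTIAL'];

interface MatchCondition {
  key: string;
  match: { value: string } | { any: string[] };
}

interface TenantFilter {
  must: MatchCondition[];
}

// Hypothetical helper: always pins the tenant key and limits results to
// classifications at or below the requested ceiling.
function buildTenantFilter(
  projectPublicId: string,
  maxClassification: Classification = 'INTERNAL',
): TenantFilter {
  if (projectPublicId.trim() === '') {
    throw new Error('projectPublicId is mandatory for tenant isolation');
  }
  const ceiling = CLASSIFICATION_ORDER.indexOf(maxClassification);
  const allowed: string[] = CLASSIFICATION_ORDER.slice(0, ceiling + 1);
  return {
    must: [
      { key: 'project_public_id', match: { value: projectPublicId } },
      { key: 'classification', match: { any: allowed } },
    ],
  };
}
```

Building the filter in one place makes it harder for a caller to issue a search that forgets the tenant condition; whether the real `QdrantService` does this is not stated in the spec.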
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 3. TypeORM Entity
|
||||||
|
|
||||||
|
### 3.1 `DocumentChunk` Entity
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
// backend/src/modules/rag/entities/document-chunk.entity.ts
|
||||||
|
@Entity('document_chunks')
|
||||||
|
export class DocumentChunk {
|
||||||
|
@PrimaryColumn({ type: 'char', length: 36 })
|
||||||
|
id: string; // UUID = Qdrant point ID
|
||||||
|
|
||||||
|
@Column({ type: 'char', length: 36, name: 'document_id' })
|
||||||
|
documentId: string; // FK → attachments.public_id
|
||||||
|
|
||||||
|
@Column({ name: 'chunk_index' })
|
||||||
|
chunkIndex: number;
|
||||||
|
|
||||||
|
@Column({ type: 'text' })
|
||||||
|
content: string;
|
||||||
|
|
||||||
|
@Column({ length: 20, name: 'doc_type' })
|
||||||
|
docType: string;
|
||||||
|
|
||||||
|
@Column({ length: 100, name: 'doc_number', nullable: true })
|
||||||
|
docNumber: string | null;
|
||||||
|
|
||||||
|
@Column({ length: 20, nullable: true })
|
||||||
|
revision: string | null;
|
||||||
|
|
||||||
|
@Column({ length: 50, name: 'project_code' })
|
||||||
|
projectCode: string;
|
||||||
|
|
||||||
|
@Column({ type: 'char', length: 36, name: 'project_public_id' })
|
||||||
|
projectPublicId: string;
|
||||||
|
|
||||||
|
@Column({
|
||||||
|
type: 'enum',
|
||||||
|
enum: ['PUBLIC', 'INTERNAL', 'CONFIDENTIAL'],
|
||||||
|
default: 'INTERNAL',
|
||||||
|
})
|
||||||
|
classification: 'PUBLIC' | 'INTERNAL' | 'CONFIDENTIAL';
|
||||||
|
|
||||||
|
@Column({ length: 20, nullable: true })
|
||||||
|
version: string | null;
|
||||||
|
|
||||||
|
@Column({ length: 100, name: 'embedding_model', default: 'nomic-embed-text' })
|
||||||
|
embeddingModel: string;
|
||||||
|
|
||||||
|
@CreateDateColumn({ name: 'created_at', precision: 3 })
|
||||||
|
createdAt: Date;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 4. DTO Interfaces
|
||||||
|
|
||||||
|
### 4.1 RAG Query DTO
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
// rag-query.dto.ts
|
||||||
|
export class RagQueryDto {
|
||||||
|
@IsString()
|
||||||
|
@IsNotEmpty()
|
||||||
|
@MaxLength(500)
|
||||||
|
question: string;
|
||||||
|
|
||||||
|
@IsUUID()
|
||||||
|
projectPublicId: string; // mandatory — enforces tenant isolation
|
||||||
|
|
||||||
|
@IsOptional()
|
||||||
|
@IsEnum(['PUBLIC', 'INTERNAL', 'CONFIDENTIAL'])
|
||||||
|
maxClassification?: 'PUBLIC' | 'INTERNAL' | 'CONFIDENTIAL';
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### 4.2 RAG Response DTO
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
// rag-response.dto.ts
|
||||||
|
export class RagResponseDto {
|
||||||
|
answer: string;
|
||||||
|
citations: Array<{
|
||||||
|
chunkId: string;
|
||||||
|
docNumber: string | null;
|
||||||
|
docType: string;
|
||||||
|
revision: string | null;
|
||||||
|
snippet: string;
|
||||||
|
score: number;
|
||||||
|
}>;
|
||||||
|
confidence: number; // 0.0 – 1.0
|
||||||
|
fallbackUsed: boolean; // true = the Ollama fallback was used instead of Typhoon
|
||||||
|
latencyMs: number;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 5. State Transitions
|
||||||
|
|
||||||
|
### `rag_status` in `attachments`
|
||||||
|
|
||||||
|
```
|
||||||
|
[file committed to permanent storage]
|
||||||
|
↓
|
||||||
|
PENDING (default)
|
||||||
|
↓ (BullMQ: rag:ocr job created)
|
||||||
|
PROCESSING
|
||||||
|
↓
|
||||||
|
┌────────────┐
|
||||||
|
│ SUCCESS │ → INDEXED
|
||||||
|
└────────────┘
|
||||||
|
┌────────────┐
|
||||||
|
│ FAILURE │ → retry ≤ 3 → PROCESSING → ...
|
||||||
|
└────────────┘
|
||||||
|
↓ (after 3 failures)
|
||||||
|
FAILED (rag_last_error populated)
|
||||||
|
```
|
||||||
|
|
||||||
|
**Trigger**: a hook in `StorageService.commitFile()` enqueues the `rag:ocr` job
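
The lifecycle in the diagram above can be sketched as a small pure transition function. This is only an illustration: the event names, the `RagState` shape, and the choice to move to FAILED on the third failure are assumptions, not part of the spec.

```typescript
type RagStatus = 'PENDING' | 'PROCESSING' | 'INDEXED' | 'FAILED';
type RagEvent = 'JOB_STARTED' | 'JOB_SUCCEEDED' | 'JOB_FAILED';

interface RagState {
  status: RagStatus;
  failures: number; // failed attempts so far
  lastError: string | null; // mirrors rag_last_error
}

// Assumption: the file becomes FAILED on the third failed attempt.
const MAX_FAILURES = 3;

function nextRagState(state: RagState, event: RagEvent, error?: string): RagState {
  if (event === 'JOB_STARTED') {
    if (state.status !== 'PENDING') {
      throw new Error('cannot start a job from ' + state.status);
    }
    return { status: 'PROCESSING', failures: state.failures, lastError: state.lastError };
  }
  if (state.status !== 'PROCESSING') {
    throw new Error('job results are only valid while PROCESSING');
  }
  if (event === 'JOB_SUCCEEDED') {
    return { status: 'INDEXED', failures: state.failures, lastError: null };
  }
  // JOB_FAILED: below the limit the job is retried, so the file stays in PROCESSING.
  const failures = state.failures + 1;
  const lastError = error === undefined ? 'unknown error' : error;
  const status: RagStatus = failures >= MAX_FAILURES ? 'FAILED' : 'PROCESSING';
  return { status, failures, lastError };
}
```

Keeping the transitions in one function makes illegal moves (for example INDEXED back to PROCESSING) fail loudly instead of silently overwriting `rag_status`.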
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 6. Relationships
|
||||||
|
|
||||||
|
```
|
||||||
|
attachments (1) ──────────── (N) document_chunks
|
||||||
|
.public_id ←─── document_chunks.document_id
|
||||||
|
|
||||||
|
projects (1) ─────────────── (N) document_chunks
|
||||||
|
.public_id ←─── document_chunks.project_public_id
|
||||||
|
|
||||||
|
document_chunks (1) ──────── (1) Qdrant point
|
||||||
|
.id ←─── Qdrant payload.chunk_id
|
||||||
|
```
|
||||||
@@ -0,0 +1,144 @@
|
|||||||
|
# Implementation Plan: ADR-022 — RAG (Retrieval-Augmented Generation)
|
||||||
|
|
||||||
|
**Branch**: `main` (ADR-022 scope) | **Date**: 2026-04-19 | **Spec**: [v1.1.2 Guide](./LCBP3-RAG-Implementation-Guide-v1.1.2.md)
|
||||||
|
**Input**: `specs/06-Decision-Records/ADR-022-Retrieval-Augmented-Generation/LCBP3-RAG-Implementation-Guide-v1.1.2.md`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Summary
|
||||||
|
|
||||||
|
The RAG (Retrieval-Augmented Generation) system for LCBP3 DMS lets users run Thai-language Q&A over project documents through Hybrid Search (vector + keyword). Data is isolated between projects with Qdrant Tiered Multitenancy, and RBAC is enforced on every query.
|
||||||
|
|
||||||
|
**Ingestion Pipeline**: PDF → Tesseract OCR → PyThaiNLP normalize → Chunk → nomic-embed-text → Qdrant
|
||||||
|
**Query Pipeline**: Auth/RBAC → Embed question → Hybrid Search → Re-rank (top 5) → Build context → Typhoon API (primary) / Ollama local (fallback / CONFIDENTIAL) → Response + citations
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Technical Context
|
||||||
|
|
||||||
|
| Item | Value |
|
||||||
|
|------|-------|
|
||||||
|
| **Language/Version** | TypeScript 5.x (NestJS 10), Python 3.11 (PyThaiNLP microservice) |
|
||||||
|
| **Primary Dependencies** | `@qdrant/js-client-rest`, `bullmq`, `ioredis`, `@nestjs/bullmq`, `axios` (Typhoon + Ollama) |
|
||||||
|
| **Storage** | MariaDB (metadata + chunks index), Qdrant v1.16+ (vector store), Redis 7 (queue + cache) |
|
||||||
|
| **Testing** | Jest (NestJS), Vitest (frontend hooks) — coverage ≥ 80% business logic |
|
||||||
|
| **Target Platform** | QNAP NAS Docker Compose (NestJS, Qdrant, Redis), Admin Desktop Desk-5439 (Ollama, OCR, PyThaiNLP) |
|
||||||
|
| **Project Type** | Web application (NestJS backend + Next.js frontend) |
|
||||||
|
| **Performance Goals** | Typhoon primary p95 < 3s; Ollama fallback p95 < 10s; per-service timeout 5s |
|
||||||
|
| **Constraints** | CONFIDENTIAL → local Ollama only (ADR-018); rag_status in `attachments` (ADR-009); project_public_id filter mandatory |
|
||||||
|
| **Scale/Scope** | ~100K vectors initial; multi-project tiered multitenancy; parallel BullMQ workers |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Constitution Check (AGENTS.md / ADR compliance)
|
||||||
|
|
||||||
|
_GATE: Must pass before Phase 0 research._
|
||||||
|
|
||||||
|
| Gate | Status | Notes |
|
||||||
|
|------|--------|-------|
|
||||||
|
| 🔴 Security — Auth, RBAC, Validation | ✅ PASS | CASL Guard (`manage:rag`) on all RAG endpoints; Zod input validation |
|
||||||
|
| 🔴 UUID Strategy (ADR-019) | ✅ PASS | `publicId` (UUIDv7) in all API responses; no `parseInt` on IDs |
|
||||||
|
| 🔴 Database correctness (ADR-009) | ✅ PASS | Schema delta SQL for `rag_status`; no TypeORM migrations |
|
||||||
|
| 🔴 AI boundary (ADR-018) | ✅ PASS | Ollama on Admin Desktop only; no direct DB/storage access by AI |
|
||||||
|
| 🔴 Error handling (ADR-007) | ✅ PASS | `BusinessException` + `GlobalExceptionFilter` used throughout |
|
||||||
|
| 🔴 No `any` types | ✅ PASS | `VectorMetadata`, `RagQueryDto`, `RagResponseDto` fully typed |
|
||||||
|
| 🔴 No `console.log` | ✅ PASS | `NestJS Logger` in all services |
|
||||||
|
| 🟡 Background jobs (ADR-008) | ✅ PASS | BullMQ queues: `ocr`, `thai-preprocess`, `embedding` |
|
||||||
|
| 🟡 Cache invalidation | ✅ PASS | Query result cache TTL 5min; vector cache on embed |
|
||||||
|
| ⚠️ Local LLM model name | 🔶 DEFERRED | Marked "Confidential" in spec → see research.md; use env var `OLLAMA_RAG_MODEL` |
|
||||||
|
|
||||||
|
_Re-check post Phase 1 design: all gates remain PASS._
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Project Structure
|
||||||
|
|
||||||
|
### Documentation (this feature)
|
||||||
|
|
||||||
|
```text
|
||||||
|
specs/06-Decision-Records/ADR-022-Retrieval-Augmented-Generation/
|
||||||
|
├── LCBP3-RAG-Implementation-Guide-v1.1.2.md # Primary spec (clarified)
|
||||||
|
├── plan.md # This file
|
||||||
|
├── research.md # Phase 0 output
|
||||||
|
├── data-model.md # Phase 1 output
|
||||||
|
├── quickstart.md # Phase 1 output
|
||||||
|
└── contracts/
|
||||||
|
└── rag-api.yaml # OpenAPI contract
|
||||||
|
```
|
||||||
|
|
||||||
|
### Source Code (repository root)
|
||||||
|
|
||||||
|
```text
|
||||||
|
backend/src/modules/rag/
|
||||||
|
├── rag.module.ts
|
||||||
|
├── rag.controller.ts # POST /rag/query, GET /rag/status/:id, POST /rag/ingest/:id, DELETE /rag/vectors/:id
|
||||||
|
├── rag.service.ts # Query pipeline orchestration
|
||||||
|
├── ingestion.service.ts # Ingestion pipeline (BullMQ producer)
|
||||||
|
├── embedding.service.ts # Ollama nomic-embed-text wrapper
|
||||||
|
├── qdrant.service.ts # Qdrant CRUD + hybrid search
|
||||||
|
├── typhoon.service.ts # Typhoon API + auto-failover logic
|
||||||
|
├── dto/
|
||||||
|
│ ├── rag-query.dto.ts
|
||||||
|
│ └── rag-response.dto.ts
|
||||||
|
├── entities/
|
||||||
|
│ └── document-chunk.entity.ts
|
||||||
|
├── processors/
|
||||||
|
│ ├── ocr.processor.ts # BullMQ consumer: ocr queue
|
||||||
|
│ ├── thai-preprocess.processor.ts # BullMQ consumer: thai-preprocess queue
|
||||||
|
│ └── embedding.processor.ts # BullMQ consumer: embedding queue
|
||||||
|
└── __tests__/
|
||||||
|
├── rag.service.spec.ts
|
||||||
|
├── ingestion.service.spec.ts
|
||||||
|
└── qdrant.service.spec.ts
|
||||||
|
|
||||||
|
specs/03-Data-and-Storage/deltas/ # ADR-009: schema deltas live in specs/, not backend/
|
||||||
|
├── 06-add-rag-status-to-attachments.sql
|
||||||
|
└── 06b-create-document-chunks.sql
|
||||||
|
|
||||||
|
frontend/components/rag/
|
||||||
|
├── rag-search-bar.tsx
|
||||||
|
├── rag-result-card.tsx
|
||||||
|
└── rag-fallback-badge.tsx
|
||||||
|
|
||||||
|
frontend/app/(dashboard)/rag/
|
||||||
|
└── page.tsx
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Rollout Phases
|
||||||
|
|
||||||
|
> ⚠️ **Note**: tasks.md orders US1 (Query API) before US2 (Ingestion) for MVP priority, so that the query pipeline's performance can be validated before time is spent on full ingestion.
|
||||||
|
|
||||||
|
### Phase 1: Infrastructure (2 days)
|
||||||
|
- Qdrant + Redis services in docker-compose (QNAP)
|
||||||
|
- Ollama model pull on Admin Desktop
|
||||||
|
- Schema delta: rag_status in attachments (ADR-009)
|
||||||
|
|
||||||
|
### Phase 2: Core NestJS Services (3 days)
|
||||||
|
- `EmbeddingService`, `QdrantService`, `TyphoonService`
|
||||||
|
- BullMQ queue setup (3 queues: rag:ocr, rag:thai-preprocess, rag:embedding)
|
||||||
|
- Unit tests
|
||||||
|
|
||||||
|
### Phase 3: RAG Query API, **MVP** (2 days)
|
||||||
|
- `RagController` + `RagService`
|
||||||
|
- Hybrid search + score-based re-ranking (top 5) + context build
|
||||||
|
- Typhoon failover logic + prompt injection defense
|
||||||
|
- Integration tests
|
||||||
|
|
||||||
|
### Phase 4: Ingestion Pipeline (3 days)
|
||||||
|
- `IngestionService` wired to attachment upload hook
|
||||||
|
- OCR → PyThaiNLP → Chunk → Embed → Qdrant
|
||||||
|
- rag_status lifecycle management
|
||||||
|
|
||||||
|
### Phase 5: Frontend UI + Polish (3 days)
|
||||||
|
- RAG search page + result cards with citations
|
||||||
|
- `used_fallback_model` badge
|
||||||
|
- TanStack Query hooks
|
||||||
|
- Rate limiting + audit logging
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Complexity Tracking
|
||||||
|
|
||||||
|
_No constitution violations requiring justification._
|
||||||
@@ -0,0 +1,159 @@
|
|||||||
|
# Quickstart: ADR-022 RAG — Local Development Setup
|
||||||
|
|
||||||
|
**Date**: 2026-04-19 | **For**: developers who want to run the RAG pipeline on a local machine
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Prerequisites
|
||||||
|
|
||||||
|
| Dependency | Version | Notes |
|
||||||
|
|------------|---------|----------|
|
||||||
|
| Docker Desktop | 4.x+ | For Qdrant + Redis |
|
||||||
|
| Node.js | 20.x | NestJS backend |
|
||||||
|
| pnpm | 9.x | Package manager |
|
||||||
|
| Python | 3.11 | PyThaiNLP microservice |
|
||||||
|
| Ollama | latest | Must be installed on the Admin Desktop or locally |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 1: Start Infrastructure
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Add Qdrant + Redis to docker-compose.override.yml (local dev)
|
||||||
|
docker compose -f docker-compose.yml -f docker-compose.override.yml up qdrant redis -d
|
||||||
|
|
||||||
|
# Check Qdrant
|
||||||
|
curl http://localhost:6333/healthz
|
||||||
|
# → healthz check passed
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 2: Setup PyThaiNLP Microservice
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# On the Admin Desktop (or locally for dev)
|
||||||
|
cd backend/services/thai-preprocess # this directory is to be created later
|
||||||
|
|
||||||
|
pip install pythainlp==5.0.* fastapi uvicorn
|
||||||
|
|
||||||
|
# Run the microservice
|
||||||
|
python -m uvicorn main:app --host 0.0.0.0 --port 8765
|
||||||
|
|
||||||
|
# Test
|
||||||
|
curl -X POST http://localhost:8765/preprocess \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{"text": "มาตรา ๑๐ ของสัญญา"}'
|
||||||
|
# → {"normalized": "มาตรา 10 ของสัญญา"}
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 3: Configure Environment Variables
|
||||||
|
|
||||||
|
```env
|
||||||
|
# backend/.env (dev override)
|
||||||
|
|
||||||
|
# Qdrant
|
||||||
|
QDRANT_URL=http://localhost:6333
|
||||||
|
QDRANT_COLLECTION=lcbp3_vectors
|
||||||
|
|
||||||
|
# Redis
|
||||||
|
REDIS_HOST=localhost
|
||||||
|
REDIS_PORT=6379
|
||||||
|
|
||||||
|
# Ollama (Admin Desktop or local)
|
||||||
|
OLLAMA_URL=http://localhost:11434
|
||||||
|
OLLAMA_EMBED_MODEL=nomic-embed-text
|
||||||
|
OLLAMA_RAG_MODEL=<set by ops team> # see research.md R1
|
||||||
|
|
||||||
|
# PyThaiNLP Microservice
|
||||||
|
THAI_PREPROCESS_URL=http://localhost:8765
|
||||||
|
|
||||||
|
# Typhoon API (a sandbox key may be used for dev)
|
||||||
|
TYPHOON_API_KEY=<dev-key>
|
||||||
|
TYPHOON_API_URL=https://api.opentyphoon.ai/v1
|
||||||
|
|
||||||
|
# RAG Config
|
||||||
|
RAG_TOPK=20
|
||||||
|
RAG_FINAL_K=5
|
||||||
|
RAG_TIMEOUT_MS=5000
|
||||||
|
RAG_QUERY_CACHE_TTL=300 # seconds
|
||||||
|
```
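
A minimal sketch of how the backend might read the RAG settings above with defaults and basic validation. The function and interface names are hypothetical. The numeric defaults simply mirror the example values in this file, and `llama3:8b` is the fallback default mentioned in research.md R1.

```typescript
interface RagConfig {
  topK: number;
  finalK: number;
  timeoutMs: number;
  queryCacheTtlSeconds: number;
  ollamaRagModel: string;
}

type Env = Record<string, string | undefined>;

// Hypothetical helper: parses a positive integer or returns the fallback when unset.
function readPositiveInt(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === '') {
    return fallback;
  }
  const value = Number(raw);
  if (!isFinite(value) || Math.floor(value) !== value || value <= 0) {
    throw new Error(key + ' must be a positive integer, got "' + raw + '"');
  }
  return value;
}

function loadRagConfig(env: Env): RagConfig {
  const model = env['OLLAMA_RAG_MODEL'];
  const config: RagConfig = {
    topK: readPositiveInt(env, 'RAG_TOPK', 20),
    finalK: readPositiveInt(env, 'RAG_FINAL_K', 5),
    timeoutMs: readPositiveInt(env, 'RAG_TIMEOUT_MS', 5000),
    queryCacheTtlSeconds: readPositiveInt(env, 'RAG_QUERY_CACHE_TTL', 300),
    ollamaRagModel: model === undefined || model === '' ? 'llama3:8b' : model,
  };
  // The re-ranked result set cannot be larger than the candidate set.
  if (config.finalK > config.topK) {
    throw new Error('RAG_FINAL_K must not exceed RAG_TOPK');
  }
  return config;
}
```

Failing at startup on a malformed value is usually easier to diagnose than a query pipeline that silently runs with `NaN` limits.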
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 4: Pull Ollama Models
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# On the Admin Desktop / local Ollama
|
||||||
|
ollama pull nomic-embed-text # embedding model (768 dims)
|
||||||
|
ollama pull <OLLAMA_RAG_MODEL> # local LLM (ask the ops team)
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 5: Run Schema Delta
|
||||||
|
|
||||||
|
```sql
|
||||||
|
-- Run the SQL deltas per ADR-009 (apply directly; do not use migrations)
|
||||||
|
-- specs/03-Data-and-Storage/deltas/06-add-rag-status-to-attachments.sql
|
||||||
|
-- specs/03-Data-and-Storage/deltas/06b-create-document-chunks.sql
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 6: Start NestJS Backend
|
||||||
|
|
||||||
|
```bash
|
||||||
|
cd backend
|
||||||
|
pnpm install
|
||||||
|
pnpm run start:dev
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 7: Initialize Qdrant Collection
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Call the endpoint to create the collection (if it does not exist yet)
|
||||||
|
curl -X POST http://localhost:3001/api/rag/admin/init-collection \
|
||||||
|
-H "Authorization: Bearer <admin-token>"
|
||||||
|
# → {"message": "Collection lcbp3_vectors initialized"}
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 8: Test RAG Query
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# 1. Upload + commit a PDF file (through the regular DMS upload API)
|
||||||
|
# → rag_status moves through PENDING → PROCESSING → INDEXED
|
||||||
|
|
||||||
|
# 2. Check the status
|
||||||
|
curl http://localhost:3001/api/rag/status/<attachment-uuid> \
|
||||||
|
-H "Authorization: Bearer <token>"
|
||||||
|
# → {"data": {"ragStatus": "INDEXED", "chunkCount": 12}}
|
||||||
|
|
||||||
|
# 3. RAG query
|
||||||
|
curl -X POST http://localhost:3001/api/rag/query \
|
||||||
|
-H "Authorization: Bearer <token>" \
|
||||||
|
-H "Content-Type: application/json" \
-H "Idempotency-Key: <uuid-v4>" \
|
||||||
|
-d '{
|
||||||
|
"question": "เอกสารนี้เกี่ยวกับอะไร?",
|
||||||
|
"projectPublicId": "<project-uuid>"
|
||||||
|
}'
|
||||||
|
# → {"data": {"answer": "...", "citations": [...], "fallbackUsed": false}}
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Troubleshooting
|
||||||
|
|
||||||
|
| Problem | Cause | Fix |
|
||||||
|
|-------|-------|-------|
|
||||||
|
| `rag_status` stuck at PROCESSING | BullMQ worker is not running | Check the Redis connection + worker process |
|
||||||
|
| Qdrant connection refused | Qdrant container is not running | `docker compose up qdrant -d` |
|
||||||
|
| PyThaiNLP timeout | Microservice has not started | Check port 8765 + the Python service |
|
||||||
|
| `fallbackUsed: true` on every query | Wrong Typhoon API key / network block | Check `TYPHOON_API_KEY` + network |
|
||||||
|
| `chunkCount: 0` after INDEXED | Collection has not been initialized | Call `/api/rag/admin/init-collection` |
|
||||||
@@ -0,0 +1,160 @@
|
|||||||
|
# Research: ADR-022 RAG — Technical Unknowns Resolution
|
||||||
|
|
||||||
|
**Date**: 2026-04-19 | **Phase**: 0 — Pre-Design Research
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## R1: Local Ollama LLM Model (Marked "Confidential" in Spec)
|
||||||
|
|
||||||
|
**Decision**: Use the environment variable `OLLAMA_RAG_MODEL` to set the model used in each environment
|
||||||
|
**Rationale**: The model name is marked confidential in the spec, so it is not hardcoded in the codebase; the ops team sets it through the docker-compose env; the fallback default is `llama3:8b` if it is not set
|
||||||
|
**Alternatives considered**:
|
||||||
|
- Hardcode model name: ❌ conflicts with ADR-018 (sensitive config should not live in code)
|
||||||
|
- Config file: ❌ more complex than necessary; an env var is sufficient
|
||||||
|
|
||||||
|
**Implementation**:
|
||||||
|
```env
|
||||||
|
# docker-compose.yml
|
||||||
|
OLLAMA_RAG_MODEL=<model-name> # set by the ops team
|
||||||
|
OLLAMA_URL=http://admin-desktop:11434
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## R2: PyThaiNLP Integration Strategy
|
||||||
|
|
||||||
|
**Decision**: A separate Python microservice that receives text over HTTP POST from the NestJS processor
|
||||||
|
**Rationale**: NestJS is TypeScript and cannot import PyThaiNLP directly; an HTTP microservice is easy to deploy as a separate container on the Admin Desktop
|
||||||
|
**Alternatives considered**:
|
||||||
|
- Python subprocess from NestJS: ❌ fragile, hard to scale
|
||||||
|
- Node.js Thai tokenizer (node-nlp): ❌ lower quality than PyThaiNLP for domain-specific Thai
|
||||||
|
|
||||||
|
**PyThaiNLP version**: 5.0.x (latest stable, support Python 3.11)
|
||||||
|
|
||||||
|
**Processing steps**:
|
||||||
|
1. Word tokenization (newmm engine — best for mixed Thai/English)
|
||||||
|
2. Thai numeral normalization: `๑๐` → `10`
|
||||||
|
3. Abbreviation expansion: `รฟม.` → `การรถไฟฟ้าขนส่งมวลชนแห่งประเทศไทย (รฟม.)`
|
||||||
|
4. Section header strip: `หน้า 1/3`, `ลงชื่อ__________` removed
|
||||||
|
5. Rejoin tokens with space for nomic-embed-text input
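
To make step 2 concrete, here is a small TypeScript illustration of Thai numeral normalization. It is not the PyThaiNLP implementation and is not meant to replace the Python microservice; the function name `normalizeThaiNumerals` is hypothetical, and the sketch only relies on the fact that the Thai digits ๐ to ๙ occupy the contiguous code points U+0E50 to U+0E59.

```typescript
// Code point of the Thai digit zero (๐).
const THAI_ZERO = 0x0e50;

// Replaces every Thai digit with its ASCII counterpart and leaves all other text untouched.
function normalizeThaiNumerals(text: string): string {
  return text.replace(/[\u0E50-\u0E59]/g, (digit) =>
    String(digit.charCodeAt(0) - THAI_ZERO),
  );
}
```

For example, `normalizeThaiNumerals('มาตรา ๑๐ ของสัญญา')` yields `'มาตรา 10 ของสัญญา'`, matching the sample request and response shown in the quickstart.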
|
||||||
|
|
||||||
|
**Microservice endpoint**:
|
||||||
|
```
|
||||||
|
POST http://admin-desktop:8765/preprocess
|
||||||
|
Body: { "text": "..." }
|
||||||
|
Response: { "normalized": "..." }
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## R3: Qdrant Tiered Multitenancy Setup (v1.16+)
|
||||||
|
|
||||||
|
**Decision**: Single collection `lcbp3_vectors` + an `is_tenant: true` payload index on `project_public_id`
|
||||||
|
**Rationale**: Qdrant v1.16 supports a tenant-aware HNSW index that makes queries 3-5× faster when filtering by the tenant field; there is no need to manage N collections
|
||||||
|
**Alternatives considered**:
|
||||||
|
- Separate collection per project: ❌ high ops burden; lifecycle management gets complicated as projects are added/removed
|
||||||
|
- Plain payload filter: ❌ slower than is_tenant=true on a large collection
|
||||||
|
|
||||||
|
**Qdrant collection creation**:
|
||||||
|
```typescript
|
||||||
|
await qdrantClient.createCollection('lcbp3_vectors', {
|
||||||
|
vectors: { size: 768, distance: 'Cosine' },
|
||||||
|
hnsw_config: { payload_m: 16, m: 0 }, // disable the global index
|
||||||
|
});
|
||||||
|
|
||||||
|
await qdrantClient.createPayloadIndex('lcbp3_vectors', {
|
||||||
|
field_name: 'project_public_id',
|
||||||
|
field_schema: { type: 'keyword', is_tenant: true },
|
||||||
|
});
|
||||||
|
|
||||||
|
// Additional indexes for filtering
|
||||||
|
await qdrantClient.createPayloadIndex('lcbp3_vectors', {
|
||||||
|
field_name: 'classification',
|
||||||
|
field_schema: 'keyword',
|
||||||
|
});
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## R4: Hybrid Search Implementation
|
||||||
|
|
||||||
|
**Decision**: BM25 (MariaDB FULLTEXT) + Vector (Qdrant) merged with weighted score (0.7 vector + 0.3 keyword)
|
||||||
|
**Rationale**: RRF (Reciprocal Rank Fusion) is more complex for an initial implementation; a weighted sum is easier to tune and debug; the v1.1.2 spec already specifies 0.7/0.3
|
||||||
|
**Alternatives considered**:
|
||||||
|
- RRF fusion: ✅ better for long-term production but more complex; deferred as a Future Enhancement
|
||||||
|
- Vector-only: ❌ misses keywords such as doc number `REF-2026-001` (raised by the v1.1.1 reviewer)
|
||||||
|
|
||||||
|
**Merge algorithm**:
|
||||||
|
```typescript
|
||||||
|
// Normalize scores 0-1 then merge
|
||||||
|
const mergedScore = (0.7 * vectorScore) + (0.3 * keywordScore);
|
||||||
|
// Top 20 → Re-rank → Top 5
|
||||||
|
```
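
The merge step above could be fleshed out roughly as follows. This is a sketch under stated assumptions: min-max scaling is one possible way to "normalize scores 0-1" (the spec does not say which method is used), the `ScoredChunk` shape and function names are hypothetical, and the separate re-rank stage between top 20 and top 5 is reduced here to a sort on the merged score.

```typescript
interface ScoredChunk {
  chunkId: string;
  score: number;
}

// Min-max scaling to [0, 1]. Assumption: the spec only says "normalize scores 0-1".
function normalizeScores(results: ScoredChunk[]): Record<string, number> {
  const out: Record<string, number> = {};
  if (results.length === 0) {
    return out;
  }
  const scores = results.map((r) => r.score);
  const min = Math.min(...scores);
  const max = Math.max(...scores);
  for (const r of results) {
    out[r.chunkId] = max === min ? 1 : (r.score - min) / (max - min);
  }
  return out;
}

// Weighted-sum fusion; a chunk missing from one result list contributes 0 for that side.
function mergeHybrid(
  vectorHits: ScoredChunk[],
  keywordHits: ScoredChunk[],
  finalK: number = 5,
  vectorWeight: number = 0.7,
  keywordWeight: number = 0.3,
): ScoredChunk[] {
  const v = normalizeScores(vectorHits);
  const k = normalizeScores(keywordHits);
  const ids = Object.keys({ ...v, ...k });
  return ids
    .map((chunkId) => ({
      chunkId,
      score: vectorWeight * (v[chunkId] || 0) + keywordWeight * (k[chunkId] || 0),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, finalK);
}
```

One known trade-off of min-max scaling is that the weakest hit in each list is mapped to 0, so it contributes nothing from that side; this is part of why rank-based fusion such as RRF is listed as a future enhancement.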
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## R5: BullMQ Queue Architecture
|
||||||
|
|
||||||
|
**Decision**: 3 separate queues (ocr, thai-preprocess, embedding) so each stage can scale independently
|
||||||
|
**Rationale**: OCR is CPU-bound; thai-preprocess is network-bound (HTTP to the microservice); embedding is bound by the Ollama GPU, so each stage can be scaled separately
|
||||||
|
**Queue design**:
|
||||||
|
|
||||||
|
| Queue | Worker Location | Concurrency | DLQ |
|
||||||
|
|-------|----------------|-------------|-----|
|
||||||
|
| `rag:ocr` | QNAP NAS | 2 | ✅ max 3 retries |
|
||||||
|
| `rag:thai-preprocess` | QNAP NAS | 4 | ✅ max 3 retries |
|
||||||
|
| `rag:embedding` | QNAP NAS | 3 | ✅ max 3 retries |
|
||||||
|
|
||||||
|
**DLQ (Dead Letter Queue)**: a file that fails more than 3 times → update `rag_status = 'FAILED'` + record the error in `rag_last_error` → alert the Dev team
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## R6: Typhoon API Failover Pattern
|
||||||
|
|
||||||
|
**Decision**: Circuit-breaker pattern in `TyphoonService`: auto-failover to Ollama on timeout > 5s or HTTP 5xx
|
||||||
|
**Rationale**: Simple threshold-based failover is sufficient for the initial version; no additional circuit-breaker library is needed
|
||||||
|
|
||||||
|
**Failover logic**:
|
||||||
|
```typescript
async generateAnswer(context: string, query: string, classification: Classification): Promise<RagAnswer> {
  if (classification === 'CONFIDENTIAL') {
    return this.generateWithOllama(context, query); // ADR-018: local only
  }
  try {
    return await Promise.race([
      this.generateWithTyphoon(context, query),
      this.timeoutAfter(5000), // 5s
    ]);
  } catch {
    this.logger.warn('Typhoon timeout/error — failing over to Ollama');
    return { ...(await this.generateWithOllama(context, query)), used_fallback_model: true };
  }
}
```
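The snippet above relies on a `timeoutAfter` helper that is not shown. One possible shape for the timeout race is sketched below. The `raceWithTimeout` name is hypothetical, and unlike a bare `Promise.race` it clears the timer once the race settles. It does not cancel the underlying request, so an HTTP call that has already started keeps running in the background.

```typescript
// Rejects if `work` has not settled within `ms` milliseconds.
async function raceWithTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    // Avoid leaving a pending timer behind when `work` wins the race.
    clearTimeout(timer);
  }
}
```

Under this sketch, the `try` branch of `generateAnswer` would become `await raceWithTimeout(this.generateWithTyphoon(context, query), 5000)`, with the `catch` branch unchanged.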
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## R7: Security — Prompt Injection Defense
|
||||||
|
|
||||||
|
**Decision**: Structured output enforcement + system prompt boundary markers
|
||||||
|
**Rationale**: The "ignore external instructions" instruction (v1.1.2) is not sufficient; structured JSON output is needed to enforce the format and keep the model from drifting
|
||||||
|
|
||||||
|
**Defense layers**:
|
||||||
|
1. System prompt: `<CONTEXT_START>` / `<CONTEXT_END>` boundary markers
2. Require JSON-only output (structured output mode)
3. Post-generation validation: check that every `doc_number` in the `citations` array actually exists in the retrieved chunks
4. If validation fails → return the fallback answer "ไม่พบข้อมูลที่ระบุ" ("The specified information was not found")
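Layers 1, 3 and 4 can be sketched as two small functions. This is an illustrative sketch under assumptions: the `RagAnswer` and `Citation` shapes and the function names are hypothetical, and only the markers, the `citations` / `doc_number` fields, and the fallback text come from the list above.

```typescript
interface Citation {
  doc_number: string;
}

interface RagAnswer {
  answer: string;
  citations: Citation[];
}

const FALLBACK_ANSWER = 'ไม่พบข้อมูลที่ระบุ';

// Layer 1: fence the retrieved text so the system prompt can refer to it as data.
function wrapContext(context: string): string {
  return `<CONTEXT_START>\n${context}\n<CONTEXT_END>`;
}

// Layers 3 and 4: accept the model output only if it is valid JSON and every
// cited doc_number was actually retrieved; otherwise return the fallback.
function validateModelOutput(raw: string, retrievedDocNumbers: ReadonlySet<string>): RagAnswer {
  const fallback: RagAnswer = { answer: FALLBACK_ANSWER, citations: [] };
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return fallback;
  }
  if (typeof parsed !== 'object' || parsed === null) return fallback;
  const { answer, citations } = parsed as { answer?: unknown; citations?: unknown };
  if (typeof answer !== 'string' || !Array.isArray(citations)) return fallback;

  const docNumbers: string[] = [];
  for (const c of citations) {
    const docNumber = (c as { doc_number?: unknown } | null)?.doc_number;
    if (typeof docNumber !== 'string' || !retrievedDocNumbers.has(docNumber)) {
      return fallback;
    }
    docNumbers.push(docNumber);
  }
  return { answer, citations: docNumbers.map((doc_number) => ({ doc_number })) };
}
```

Note that an answer with an empty `citations` array passes this check; whether uncited answers should be allowed is a policy decision the list above does not settle.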
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Resolution Summary
|
||||||
|
|
||||||
|
| Unknown | Status | Decision |
|---------|--------|---------|
| Ollama model name | ✅ Resolved | `OLLAMA_RAG_MODEL` env var |
| PyThaiNLP integration | ✅ Resolved | Python HTTP microservice on the Admin Desktop |
| Qdrant multitenancy API | ✅ Resolved | `is_tenant: true` payload index (Qdrant v1.16+) |
| Hybrid search merge | ✅ Resolved | Weighted sum 0.7/0.3 (RRF deferred) |
| BullMQ queue structure | ✅ Resolved | 3 queues: ocr, thai-preprocess, embedding |
| Typhoon failover pattern | ✅ Resolved | Promise.race timeout + Ollama fallback |
| Prompt injection defense | ✅ Resolved | Structured JSON output + citation validation |
|
||||||
@@ -0,0 +1,291 @@
|
|||||||
|
# Tasks: ADR-022 — RAG (Retrieval-Augmented Generation)
|
||||||
|
|
||||||
|
**Input**: Design documents from `specs/06-Decision-Records/ADR-022-Retrieval-Augmented-Generation/`
|
||||||
|
**Prerequisites**: plan.md ✅ | spec (v1.1.2 clarified) ✅ | research.md ✅ | data-model.md ✅ | contracts/rag-api.yaml ✅
|
||||||
|
|
||||||
|
**Total Tasks**: 39 | **User Stories**: 5 | **Parallel opportunities**: 22 tasks
|
||||||
|
|
||||||
|
## Format: `[ID] [P?] [Story?] Description with file path`
|
||||||
|
|
||||||
|
- **[P]**: Can run in parallel (different files, no dependencies on incomplete tasks)
|
||||||
|
- **[Story]**: User story this task belongs to (US1–US5)
|
||||||
|
- No story label = Setup or Foundational phase
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## User Stories (derived from spec v1.1.2 + plan.md)
|
||||||
|
|
||||||
|
| ID | Priority | Story | Goal |
|----|----------|-------|------|
| US1 | P1 🎯 MVP | RAG Query API | Users ask questions about project documents and get answers with citations |
| US2 | P2 | Auto Ingestion | Documents committed to the system are indexed automatically (OCR→PyThaiNLP→Embed→Qdrant) |
| US3 | P3 | Status & Re-ingest | Admins check ingestion status and trigger re-indexing for FAILED files |
| US4 | P4 | Vector Cleanup | When a document is deleted, its vectors in Qdrant are deleted automatically |
| US5 | P5 | Frontend UI | Users access RAG through the search page in the DMS |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Phase 1: Setup (Shared Infrastructure)
|
||||||
|
|
||||||
|
**Purpose**: Prepare the infrastructure, environment, and schema deltas before implementation starts
|
||||||
|
|
||||||
|
- [ ] T001 Create RagModule skeleton in `backend/src/modules/rag/rag.module.ts` (empty module, imports BullMQ + TypeORM)
|
||||||
|
- [ ] T002 [P] Add schema delta `06-add-rag-status-to-attachments.sql` to `specs/03-Data-and-Storage/deltas/` (per data-model.md §1.1)
|
||||||
|
- [ ] T003 [P] Add schema delta `06b-create-document-chunks.sql` to `specs/03-Data-and-Storage/deltas/` (per data-model.md §1.2)
|
||||||
|
- [ ] T004 [P] Add Qdrant + Redis services to `backend/docker-compose.yml` (per quickstart.md Step 1)
|
||||||
|
- [ ] T005 [P] Add RAG environment variables to `backend/.env.example` (`QDRANT_URL`, `OLLAMA_EMBED_MODEL`, `OLLAMA_RAG_MODEL`, `THAI_PREPROCESS_URL`, `TYPHOON_API_KEY`, `RAG_TOPK`, `RAG_FINAL_K`, `RAG_TIMEOUT_MS`, `RAG_QUERY_CACHE_TTL`)
|
||||||
|
|
||||||
|
**Checkpoint**: Infrastructure ready — schema deltas applied, docker services running, env vars configured
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Phase 2: Foundational (Blocking Prerequisites)
|
||||||
|
|
||||||
|
**Purpose**: Core services that every User Story depends on. **Must be complete before starting Phase 3+**

**⚠️ CRITICAL**: Do not start any User Story until Phase 2 is complete
|
||||||
|
|
||||||
|
- [ ] T006 Create `DocumentChunk` TypeORM entity in `backend/src/modules/rag/entities/document-chunk.entity.ts` (per data-model.md §3.1 — id, documentId, chunkIndex, content, docType, docNumber, revision, projectCode, projectPublicId, classification, embeddingModel, createdAt)
|
||||||
|
- [ ] T007 [P] Create `RagQueryDto` + `RagResponseDto` + `Citation` interfaces in `backend/src/modules/rag/dto/rag-query.dto.ts` and `backend/src/modules/rag/dto/rag-response.dto.ts` (per data-model.md §4, contracts/rag-api.yaml). **EC-RAG-004**: `RagQueryDto` must not have a `maxClassification` field; classification must be derived server-side from the user role only
|
||||||
|
- [ ] T008 [P] Create `EmbeddingService` (Ollama nomic-embed-text wrapper, returns 768-dim vector) in `backend/src/modules/rag/embedding.service.ts`
|
||||||
|
- [ ] T009 [P] Create `QdrantService` (collection init with is_tenant=true, upsert points, hybrid search merge 0.7/0.3, delete by documentId) in `backend/src/modules/rag/qdrant.service.ts` (per research.md R3, R4). **EC-RAG-003**: implement `OnModuleInit` to auto-create collection `lcbp3_vectors` (HNSW `payload_m:16, m:0`) at startup; if Qdrant does not respond, log ERROR and set `collectionReady = false`; do not throw
|
||||||
|
- [ ] T010 Create `TyphoonService` (Typhoon API call + auto-failover Promise.race to Ollama on timeout/5xx, `used_fallback_model` flag) in `backend/src/modules/rag/typhoon.service.ts` (depends T008, per research.md R6)
|
||||||
|
- [ ] T011 Wire BullMQ queues (`rag:ocr`, `rag:thai-preprocess`, `rag:embedding`) with DLQ config in `backend/src/modules/rag/rag.module.ts` (per research.md R5 — max 3 retries, delay backoff)
|
||||||
|
|
||||||
|
**Checkpoint**: Core services built and unit-testable — EmbeddingService, QdrantService, TyphoonService isolated
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Phase 3: User Story 1 — RAG Query API (P1 🎯 MVP)
|
||||||
|
|
||||||
|
**Goal**: `POST /api/rag/query` retrieves context and generates an AI answer with citations, subject to RBAC and tenant isolation
|
||||||
|
|
||||||
|
**Independent Test**:
|
||||||
|
```bash
curl -X POST http://localhost:3001/api/rag/query \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"question":"เอกสารนี้เกี่ยวกับอะไร?","projectPublicId":"<uuid>"}'
# Expected: {"data":{"answer":"...","citations":[...],"fallbackUsed":false}}
```
|
||||||
|
|
||||||
|
### Implementation for User Story 1
|
||||||
|
|
||||||
|
- [ ] T012 [P] [US1] Create `RagService.query()` pipeline in `backend/src/modules/rag/rag.service.ts`:
  1. Derive `classificationCeiling` from the user role (Admin/Manager→CONF, Member→INT, Guest→PUB) (**EC-RAG-004**)
  2. Check the `collectionReady` flag → return 503 if false (**EC-RAG-003**)
  3. Check Redis cache key `SHA256(question+projectPublicId+classificationCeiling)`; skip the cache if CONFIDENTIAL (**EC-RAG-005**)
  4. Embed question → hybrid search QdrantService → filter ACL → score-based re-rank top 5 → build context ≤3000 tokens
  5. Route LLM: CONFIDENTIAL → local Ollama only; PUBLIC/INTERNAL → TyphoonService (with failover) (**ADR-018**)
  6. Write cache if PUBLIC/INTERNAL, TTL 5min (**EC-RAG-005**)
  7. Return RagResponseDto
|
||||||
|
- [ ] T013 [P] [US1] Create `RagService.buildContext()` helper (format `[DOC_TYPE - DOC_NUMBER - REV]\nsnippet`, limit 3–5 docs) in `backend/src/modules/rag/rag.service.ts`
|
||||||
|
- [ ] T014 [P] [US1] Add CASL permission `manage:rag` to `backend/src/database/seeds/seed-permissions.sql`
|
||||||
|
- [ ] T015 [US1] Create `RagController` with `POST /rag/query` endpoint (CASL guard `manage:rag`, Zod/class-validator, Idempotency-Key header, NestJS Logger, ADR-007 error handling) in `backend/src/modules/rag/rag.controller.ts` (depends T012, T014)
|
||||||
|
- [ ] T016 [US1] Register `RagModule` in `backend/src/app.module.ts` (depends T015)
|
||||||
|
- [ ] T017 [US1] Write unit tests for `RagService.query()` in `backend/src/modules/rag/__tests__/rag.service.spec.ts` covering:
|
||||||
|
- success path (PUBLIC, cache miss → cache write)
|
||||||
|
- cache hit returns cached result without Qdrant call
|
||||||
|
- CONFIDENTIAL → Ollama only, **no cache read/write** (EC-RAG-005)
|
||||||
|
- `collectionReady=false` → 503 RAG_NOT_READY (EC-RAG-003)
|
||||||
|
- cross-project cache isolation: same question different project → different cache key (EC-RAG-005)
|
||||||
|
- classification ceiling derived from role, not from request (EC-RAG-004)
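Steps 1 and 3 of T012, together with the EC-RAG-004/005 test cases above, can be sketched as follows. The role names and the hash inputs come from the task text; the function names, the `Role` / `Classification` types, and the NUL separator between hash inputs are assumptions for illustration. The separator is a small deviation from plain concatenation that avoids ambiguous joins.

```typescript
import { createHash } from 'node:crypto';

type Classification = 'PUBLIC' | 'INTERNAL' | 'CONFIDENTIAL';
type Role = 'Admin' | 'Manager' | 'Member' | 'Guest';

// EC-RAG-004: the ceiling comes from the role, never from the request body.
const CEILING_BY_ROLE: Record<Role, Classification> = {
  Admin: 'CONFIDENTIAL',
  Manager: 'CONFIDENTIAL',
  Member: 'INTERNAL',
  Guest: 'PUBLIC',
};

function deriveClassificationCeiling(role: Role): Classification {
  return CEILING_BY_ROLE[role];
}

// EC-RAG-005: returns null for CONFIDENTIAL so callers skip cache read and write.
function buildCacheKey(
  question: string,
  projectPublicId: string,
  ceiling: Classification,
): string | null {
  if (ceiling === 'CONFIDENTIAL') return null;
  return createHash('sha256')
    .update([question, projectPublicId, ceiling].join('\u0000'))
    .digest('hex');
}
```

Because `projectPublicId` and the ceiling are part of the hash input, the same question asked in two projects, or by users with different ceilings, produces different keys.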
|
||||||
|
|
||||||
|
**Checkpoint**: `POST /api/rag/query` returns answer + citations; CONFIDENTIAL routing verified; tenant isolation tested
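The context format named in T013 (`[DOC_TYPE - DOC_NUMBER - REV]\nsnippet`) can be sketched as below. The field names follow the `DocumentChunk` entity in T006; the `RetrievedChunk` interface and the `maxDocs` default are assumptions, and the ≤3000-token budget from T012 is not enforced here because it depends on the tokenizer in use.

```typescript
// Subset of DocumentChunk fields needed to render one context entry.
interface RetrievedChunk {
  docType: string;
  docNumber: string;
  revision: string;
  content: string;
}

// Renders up to `maxDocs` chunks, each as a header line followed by its snippet.
function buildContext(chunks: RetrievedChunk[], maxDocs = 5): string {
  return chunks
    .slice(0, maxDocs)
    .map((c) => `[${c.docType} - ${c.docNumber} - ${c.revision}]\n${c.content}`)
    .join('\n\n');
}
```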
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Phase 4: User Story 2 — Auto Ingestion Pipeline (P2)
|
||||||
|
|
||||||
|
**Goal**: When a file is committed to permanent storage, `rag_status` moves PENDING→PROCESSING→INDEXED automatically
|
||||||
|
|
||||||
|
**Independent Test**:
|
||||||
|
```bash
# Upload + commit file → check rag_status
curl http://localhost:3001/api/rag/status/<attachment-uuid> -H "Authorization: Bearer <token>"
# Expected: {"data":{"ragStatus":"INDEXED","chunkCount":12}}
```
|
||||||
|
|
||||||
|
### Implementation for User Story 2
|
||||||
|
|
||||||
|
- [ ] T018 [P] [US2] Create `OcrProcessor` BullMQ consumer (read attachment, call Tesseract OCR, enqueue `rag:thai-preprocess`) in `backend/src/modules/rag/processors/ocr.processor.ts`
|
||||||
|
- [ ] T019 [P] [US2] Create `ThaiPreprocessProcessor` BullMQ consumer (HTTP POST to `THAI_PREPROCESS_URL`, normalize text, enqueue `rag:embedding`) in `backend/src/modules/rag/processors/thai-preprocess.processor.ts` (per research.md R2)
|
||||||
|
- [ ] T020 [P] [US2] Create `EmbeddingProcessor` BullMQ consumer (chunk text per Section 6 strategy, call EmbeddingService, upsert QdrantService batch, save DocumentChunk rows, update rag_status=INDEXED) in `backend/src/modules/rag/processors/embedding.processor.ts`
|
||||||
|
- [ ] T021 [US2] Create `IngestionService` in `backend/src/modules/rag/ingestion.service.ts` (depends T018, T019, T020):
  - enqueue the `rag:ocr` job using `attachmentId` as the BullMQ `jobId` (native dedup) (**EC-RAG-001**)
  - if that job is already active/waiting → log `'rag:ocr job already queued'` and return silently
  - manage rag_status lifecycle: PENDING→PROCESSING→INDEXED/FAILED
  - DLQ → set rag_status=FAILED + rag_last_error
|
||||||
|
- [ ] T022 [US2] Hook `IngestionService.enqueue()` into `StorageService.commitFile()` in `backend/src/common/storage/storage.service.ts` (trigger ingestion when file moves to permanent, depends T021)
|
||||||
|
- [ ] T023 [US2] Write unit tests for `IngestionService` in `backend/src/modules/rag/__tests__/ingestion.service.spec.ts` covering:
|
||||||
|
- successful enqueue with attachmentId as jobId
|
||||||
|
- duplicate enqueue → second call is no-op, log only (EC-RAG-001)
|
||||||
|
- OcrProcessor double-check: rag_status=PROCESSING → return MoveToCompleted (EC-RAG-001)
|
||||||
|
- FAILED after 3 retries → rag_last_error set
|
||||||
|
- rag_status transitions: PENDING→PROCESSING→INDEXED/FAILED
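The `rag_status` lifecycle tested in T023 can be sketched as a transition table. The PENDING→PROCESSING→INDEXED/FAILED path comes from T021; the FAILED→PENDING and INDEXED→PENDING edges are inferred from the re-ingest (T025) and vector-cleanup (T027) tasks. The function names are hypothetical.

```typescript
type RagStatus = 'PENDING' | 'PROCESSING' | 'INDEXED' | 'FAILED';

// Allowed next states for each current state.
const ALLOWED_TRANSITIONS: Record<RagStatus, readonly RagStatus[]> = {
  PENDING: ['PROCESSING'],
  PROCESSING: ['INDEXED', 'FAILED'],
  INDEXED: ['PENDING'], // T027: deleteVectors resets the status
  FAILED: ['PENDING'], // T025: re-ingest of a FAILED file
};

function canTransition(from: RagStatus, to: RagStatus): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}

// Throws on an illegal move so that a bug surfaces instead of corrupting state.
function transition(from: RagStatus, to: RagStatus): RagStatus {
  if (!canTransition(from, to)) {
    throw new Error(`Illegal rag_status transition: ${from} -> ${to}`);
  }
  return to;
}
```

A processor that finds the row already in PROCESSING can use `canTransition('PROCESSING', 'PROCESSING')`, which is false, as the double-check described for `OcrProcessor`.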
|
||||||
|
|
||||||
|
**Checkpoint**: Upload a PDF → BullMQ processes → rag_status=INDEXED → chunks in Qdrant; verify with GET /rag/status
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Phase 5: User Story 3 — Status & Re-ingestion Management (P3)
|
||||||
|
|
||||||
|
**Goal**: Admins view the `ragStatus` of an attachment and trigger re-indexing for FAILED files
|
||||||
|
|
||||||
|
**Independent Test**:
|
||||||
|
```bash
# GET status
curl http://localhost:3001/api/rag/status/<uuid> -H "Authorization: Bearer <token>"
# POST re-ingest (requires FAILED state)
curl -X POST http://localhost:3001/api/rag/ingest/<uuid> -H "Authorization: Bearer <token>"
```
|
||||||
|
|
||||||
|
### Implementation for User Story 3
|
||||||
|
|
||||||
|
- [ ] T024 [P] [US3] Add `RagService.getStatus()` (query attachments.rag_status + COUNT document_chunks) in `backend/src/modules/rag/rag.service.ts`
|
||||||
|
- [ ] T025 [P] [US3] Add `RagService.reIngest()` in `backend/src/modules/rag/rag.service.ts`, following the **EC-RAG-002** cleanup order (mandatory):
  1. Validate rag_status = FAILED → throw BusinessException otherwise
  2. DELETE document_chunks WHERE document_id = attachmentId (DB transaction)
  3. DELETE Qdrant points by documentId filter (log ERROR on failure but continue)
  4. SET rag_status = PENDING + clear rag_last_error
  5. Enqueue `rag:ocr` job with attachmentId as jobId (EC-RAG-001)
|
||||||
|
- [ ] T026 [US3] Add `GET /rag/status/:attachmentId` + `POST /rag/ingest/:attachmentId` endpoints to `RagController` in `backend/src/modules/rag/rag.controller.ts` (CASL guard, validate FAILED state before re-ingest, depends T024, T025)
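The EC-RAG-002 cleanup order in T025 can be sketched with injected dependencies, which also makes the order easy to unit test. The `ReIngestDeps` interface and all of its method names are hypothetical stand-ins for the repository, Qdrant client, and queue. A plain `Error` stands in for the project's `BusinessException`.

```typescript
type RagStatus = 'PENDING' | 'PROCESSING' | 'INDEXED' | 'FAILED';

// Hypothetical ports; real code would back these with TypeORM, Qdrant and BullMQ.
interface ReIngestDeps {
  getStatus(attachmentId: string): Promise<RagStatus>;
  deleteChunks(attachmentId: string): Promise<void>;
  deleteQdrantPoints(attachmentId: string): Promise<void>;
  resetToPending(attachmentId: string): Promise<void>;
  enqueueOcr(jobId: string): Promise<void>;
  logError(message: string): void;
}

async function reIngest(attachmentId: string, deps: ReIngestDeps): Promise<void> {
  // 1. Only FAILED files may be re-ingested.
  if ((await deps.getStatus(attachmentId)) !== 'FAILED') {
    throw new Error('Re-ingest requires rag_status = FAILED');
  }
  // 2. Remove relational rows first.
  await deps.deleteChunks(attachmentId);
  // 3. Remove vectors; a Qdrant failure is logged but does not abort the flow.
  try {
    await deps.deleteQdrantPoints(attachmentId);
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    deps.logError(`Qdrant cleanup failed for ${attachmentId}: ${reason}`);
  }
  // 4. Reset status, then 5. enqueue with attachmentId as the job id.
  await deps.resetToPending(attachmentId);
  await deps.enqueueOcr(attachmentId);
}
```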
|
||||||
|
|
||||||
|
**Checkpoint**: GET status returns correct ragStatus + chunkCount; POST re-ingest only works on FAILED files; non-FAILED returns 400
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Phase 6: User Story 4 — Vector Cleanup on Document Delete (P4)
|
||||||
|
|
||||||
|
**Goal**: When an attachment is soft-deleted, its Qdrant vectors and document_chunks rows are deleted as well (data consistency)
|
||||||
|
|
||||||
|
**Independent Test**:
|
||||||
|
```bash
|
||||||
|
# Delete attachment → verify vectors gone
|
||||||
|
curl -X DELETE http://localhost:3001/api/rag/vectors/<uuid> -H "Authorization: Bearer <token>"
|
||||||
|
# Qdrant should return 0 points for deleted documentId
|
||||||
|
```
|
||||||
|
|
||||||
|
### Implementation for User Story 4
|
||||||
|
|
||||||
|
- [ ] T027 [P] [US4] Add `RagService.deleteVectors()` (delete Qdrant points by documentId filter, delete document_chunks rows, reset rag_status=PENDING) in `backend/src/modules/rag/rag.service.ts`
|
||||||
|
- [ ] T028 [P] [US4] Add `DELETE /rag/vectors/:attachmentId` endpoint to `RagController` in `backend/src/modules/rag/rag.controller.ts` (CASL guard `manage:rag`, depends T027)
|
||||||
|
- [ ] T029 [US4] Hook `RagService.deleteVectors()` into `AttachmentService` soft-delete flow in `backend/src/modules/` (find the existing attachment delete service, call deleteVectors, depends T028)
|
||||||
|
|
||||||
|
**Checkpoint**: Delete attachment → GET /rag/status returns PENDING (no chunks); Qdrant confirms 0 points for that documentId
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Phase 7: User Story 5 — Frontend UI (P5)
|
||||||
|
|
||||||
|
**Goal**: Users search and receive AI answers from RAG on the search page, with citation cards and a fallback badge

**Independent Test**: Open `/rag` → type a question → see the answer, citation cards, and the "ใช้ local model" ("using local model") badge (when fallbackUsed=true)
|
||||||
|
|
||||||
|
### Implementation for User Story 5
|
||||||
|
|
||||||
|
- [ ] T030 [P] [US5] Create `useRagQuery` TanStack Query hook (POST /api/rag/query, loading/error/success states) in `frontend/hooks/use-rag.ts`
|
||||||
|
- [ ] T031 [P] [US5] Create `RagSearchBar` component (input + submit, loading spinner, Zod validation question ≤500 chars) in `frontend/components/rag/rag-search-bar.tsx`
|
||||||
|
- [ ] T032 [P] [US5] Create `RagResultCard` component (answer, citation list with docNumber/docType/snippet/score, confidence indicator) in `frontend/components/rag/rag-result-card.tsx`
|
||||||
|
- [ ] T033 [P] [US5] Create `RagFallbackBadge` component (shown when `fallbackUsed=true`, with the text "ใช้ local model คุณภาพอาจลดลง", i.e. "using local model; quality may be reduced") in `frontend/components/rag/rag-fallback-badge.tsx`
|
||||||
|
- [ ] T034 [US5] Create RAG search page with `RagSearchBar` + `RagResultCard` + `RagFallbackBadge` in `frontend/app/(dashboard)/rag/page.tsx` (depends T030, T031, T032, T033)
|
||||||
|
|
||||||
|
**Checkpoint**: Navigate to `/rag` → search → see answer with citations; fallback badge appears when Typhoon is down
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Phase 8: Polish & Cross-Cutting Concerns
|
||||||
|
|
||||||
|
**Purpose**: Security hardening, observability, admin tools
|
||||||
|
|
||||||
|
- [ ] T035 [P] Add `@Throttle()` rate limiting to `RagController` (prevent Q&A abuse) in `backend/src/modules/rag/rag.controller.ts`
|
||||||
|
- [ ] T036 [P] Add structured audit logging for every RAG query (user_id, projectPublicId, question_hash, retrieved doc_ids, llm_provider, latency_ms, confidence) to `RagService.query()` in `backend/src/modules/rag/rag.service.ts` (per v1.1.1 audit_log schema)
|
||||||
|
- [ ] T037 [P] Add prompt injection defense to `TyphoonService` (boundary markers `<CONTEXT_START>/<CONTEXT_END>`, JSON-only output mode, post-gen citation validation) in `backend/src/modules/rag/typhoon.service.ts` (per research.md R7)
|
||||||
|
- [ ] T038 Add admin endpoint `POST /rag/admin/init-collection` (create Qdrant collection `lcbp3_vectors` if not exists, create payload indexes) in `backend/src/modules/rag/rag.controller.ts`
|
||||||
|
- [ ] T039 [P] Write unit tests for `QdrantService` (collection init, upsert batch, hybrid search merge 0.7/0.3, delete by documentId, tenant filter enforcement) in `backend/src/modules/rag/__tests__/qdrant.service.spec.ts`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Dependencies & Execution Order
|
||||||
|
|
||||||
|
### Phase Dependencies
|
||||||
|
|
||||||
|
- **Phase 1 (Setup)**: No dependencies; can start immediately
- **Phase 2 (Foundational)**: Requires Phase 1 to be complete; **blocks every User Story**
- **Phase 3–7 (User Stories)**: Every story waits for Phase 2; stories can run in parallel if the team has capacity
- **Phase 8 (Polish)**: Requires Phase 3 to be complete (`RagController` and `RagService` must exist first)
|
||||||
|
|
||||||
|
### User Story Dependencies
|
||||||
|
|
||||||
|
| Story | Depends On | Notes |
|-------|-----------|----------|
| US1 (P1) | Phase 2 complete | Core RAG query; can be built without waiting for ingestion |
| US2 (P2) | Phase 2 + US1 rag_status | IngestionService uses the same QdrantService as US1 |
| US3 (P3) | Phase 2 + T021 (IngestionService) | Requires IngestionService.enqueue() first |
| US4 (P4) | Phase 2 + T027 (deleteVectors) | Can run in parallel with US2 |
| US5 (P5) | US1 API endpoint ready (T015) | Frontend needs a working POST /rag/query first |
|
||||||
|
|
||||||
|
### Parallel Opportunities
|
||||||
|
|
||||||
|
- T002, T003, T004, T005: run in parallel in Phase 1
- T007, T008, T009: run in parallel in Phase 2
- T012, T013, T014: run in parallel in Phase 3
- T018, T019, T020: run in parallel in Phase 4
- T024, T025: run in parallel in Phase 5
- T027, T028: run in parallel in Phase 6
- T030, T031, T032, T033: run in parallel in Phase 7
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Parallel Example: Phase 2 (Foundational)
|
||||||
|
|
||||||
|
```bash
# Run in parallel (different files):
Task T007: "Create RagQueryDto + RagResponseDto in backend/src/modules/rag/dto/"
Task T008: "Create EmbeddingService in backend/src/modules/rag/embedding.service.ts"
Task T009: "Create QdrantService in backend/src/modules/rag/qdrant.service.ts"

# Wait for T008 to finish first:
Task T010: "Create TyphoonService in backend/src/modules/rag/typhoon.service.ts"
```
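The same scheduling rule (a task is ready once everything it depends on is done) can be written as a small helper. The `DEPENDS_ON` map below encodes only the single Phase 2 dependency shown in the example above; the helper name `readyTasks` is hypothetical.

```typescript
// Dependency edges for the Phase 2 example only: T010 waits for T008.
const DEPENDS_ON: Record<string, string[]> = {
  T010: ['T008'],
};

// Returns the tasks that are not done yet and whose dependencies are all done.
function readyTasks(tasks: string[], done: ReadonlySet<string>): string[] {
  return tasks.filter(
    (id) => !done.has(id) && (DEPENDS_ON[id] ?? []).every((dep) => done.has(dep)),
  );
}
```

With nothing completed, T007, T008 and T009 are ready at once; T010 joins the ready set only after T008 is marked done.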
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Implementation Strategy
|
||||||
|
|
||||||
|
### MVP First (US1 Only, ~5 days)

1. Phase 1: Setup (1 day): T001–T005
2. Phase 2: Foundational (2 days): T006–T011
3. Phase 3: US1 RAG Query API (2 days): T012–T017
4. **STOP and VALIDATE**: test POST /api/rag/query against indexed documents
5. The MVP can be deployed / demoed immediately
|
||||||
|
|
||||||
|
### Incremental Delivery
|
||||||
|
|
||||||
|
1. MVP (Phase 1–3) → RAG query ready to use
2. + US2 (Phase 4) → Auto-ingestion on upload
3. + US3 (Phase 5) → Admin status monitoring
4. + US4 (Phase 6) → Data consistency on delete
5. + US5 (Phase 7) → Frontend search UI
6. + Phase 8 → Security hardening + audit logs
|
||||||
|
|
||||||
|
### Parallel Team Strategy (2+ developers)
|
||||||
|
|
||||||
|
```
Developer A (Backend): Phase 1 → Phase 2 → Phase 3 (US1) → Phase 4 (US2)
Developer B (Backend): Phase 2 (parallel T007/T008/T009) → Phase 5+6 (US3+US4)
Developer C (Frontend): wait for T015 to finish → Phase 7 (US5)
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Notes
|
||||||
|
|
||||||
|
- **[P]** = different files, no incomplete task dependencies; can run in parallel
- **[USn]** = maps to user story for traceability
- ADR-019: use `publicId` (UUIDv7) in every API; never `parseInt`
- ADR-009: direct SQL deltas; no TypeORM migrations
- ADR-018: Ollama runs on the Admin Desktop only; CONFIDENTIAL must never go through Typhoon
- ADR-008: BullMQ for every ingestion job; no inline processing
- Commit after each task or logical group
- Stop at every Checkpoint to validate before moving on
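The ADR-019 note (UUIDv7 `publicId`, never `parseInt`) can be illustrated with a small validator. This is a sketch, not the project's actual validation pipe; the `isUuidV7` name is hypothetical, and the regex follows the standard UUID text layout with version nibble `7` and variant bits `10`.

```typescript
// 8-4-4-4-12 hex groups; third group starts with 7 (version),
// fourth group starts with 8, 9, a or b (RFC variant).
const UUID_V7_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-7[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

function isUuidV7(value: string): boolean {
  return UUID_V7_PATTERN.test(value);
}
```

Validating the identifier as a string keeps numeric database ids out of the API surface, which is what the `parseInt` ban is meant to guarantee.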