690403:2205 Modify AI (Add Gemma4 & PaddleOCR)

2026-04-03 22:05:34 +07:00
parent 9c835ec4ac
commit d775d5ad85
8 changed files with 2254 additions and 5 deletions
@@ -2,15 +2,17 @@

 **Status:** Accepted
 **Date:** 2026-02-26
-**Version:** 1.8.0
-**Decision Makers:** Development Team, DevOps Engineer
+**Version:** 1.8.3 (Aligned with ADR-020)
+**Decision Makers:** Development Team, DevOps Engineer, AI Integration Lead
 **Related Documents:**

+- [ADR-020: AI Intelligence Integration Architecture](./ADR-020-ai-intelligence-integration.md) — Overall AI Architecture & RFA-First Strategy
 - [Legacy Data Migration Plan](../03-Data-and-Storage/03-04-legacy-data-migration.md)
 - [n8n Migration Setup Guide](../03-Data-and-Storage/03-05-n8n-migration-setup-guide.md)
+- [ADR-019: Hybrid Identifier Strategy](./ADR-019-hybrid-identifier-strategy.md): UUID Strategy for DB Lookup
 - [Software Architecture](../02-Architecture/02-02-software-architecture.md)
 - [Data Dictionary](../03-Data-and-Storage/03-01-data-dictionary.md)
-  > **Note:** ADR-017 is clarified and hardened by ADR-018 regarding AI physical isolation. Category Enum system-driven, Idempotency Contract, Duplicate Handling Clarification, Storage Enforcement, Audit Log Enhancement, Review Queue Integration, Revision Drift Protection, Execution Time, Encoding Normalization, Security Hardening, Orchestrator on QNAP, AI Physical Isolation (Desktop Desk-5439).
+  > **Note:** ADR-017 is clarified and hardened by ADR-018 regarding AI physical isolation. Now part of unified ADR-020 architecture with RFA-First approach, Gemma 4 model, and comprehensive Human-in-the-Loop validation.

 ---

@@ -92,10 +94,10 @@
 | Component              | Details                                                                                                      |
 | ---------------------- | ------------------------------------------------------------------------------------------------------------ |
 | Migration Orchestrator | n8n (Docker on QNAP NAS)                                                                                     |
-| AI Model Primary       | Ollama `llama3.2:3b` (Validation, Summarization, Tagging)                                                    |
+| AI Model Primary       | Ollama `gemma4:9b` (9.6 GB, Gemma 4 9B) — Validation, Summarization, Tagging |
 | AI Model Fallback      | Ollama `mistral:7b-instruct-q4_K_M`                                                                          |
 | Hardware               | QNAP NAS (Orchestrator) + Desktop Desk-5439 (AI Processing, RTX 2060 SUPER 8GB)                              |
-| DB Lookup (n8n)        | n8n queries `project_id` and `organization_id` and fetches `Tags` from the DB for the AI                     |
+| DB Lookup (n8n)        | n8n queries `project_uuid` and `organization_uuid` and fetches `Tags` from the DB for the AI (ADR-019)       |
 | Data Ingestion         | 1. Stage into `migration_review_queue` -> 2. Confirm via the Frontend Management UI -> 3. Final Commit via API |
 | Concurrency (n8n)      | Sequential, Batch Size 50-100, to prevent DB Connection Overload                                             |
 | Checkpoint             | MariaDB `migration_progress`, with `ON DUPLICATE KEY UPDATE` in Staging                                      |
@@ -387,3 +389,12 @@ IF excel_revision != current_db_revision + 1
 ---

 _For detailed operating procedures, see `03-04-legacy-data-migration.md` and `03-05-n8n-migration-setup-guide.md`._
+
+---
+
+## Document History
+
+| Version | Date       | Author      | Changes                                                              |
+| ------- | ---------- | ----------- | -------------------------------------------------------------------- |
+| 1.8.0   | 2026-02-26 | DevOps Team | Initial ADR — Ollama + n8n Migration Architecture                    |
+| 1.8.2   | 2026-04-03 | Tech Lead   | **Updated** — Aligned with ADR-019 (UUID Strategy), changed AI Model to `gemma4:9b` (9.6 GB) |
@@ -0,0 +1,317 @@
+# ADR-017B: Smart Legacy Document Digitization (AI-Powered Use Case Extension)
+
+**Status:** Accepted
+**Date:** 2026-03-27
+**Version:** 1.8.2 (Aligned with ADR-020)
+**Decision Makers:** Development Team, AI Integration Lead
+**Related Documents:**
+
+- [ADR-020: AI Intelligence Integration Architecture](./ADR-020-ai-intelligence-integration.md) — Overall AI Architecture & RFA-First Strategy
+- [ADR-017: Ollama Data Migration Architecture](./ADR-017-ollama-data-migration.md)
+- [ADR-018: AI Boundary Policy](./ADR-018-ai-boundary.md) — AI Physical Isolation (No Direct DB/Storage Access)
+- [n8n Migration Setup Guide](../03-Data-and-Storage/03-05-n8n-migration-setup-guide.md)
+- [Legacy Data Migration Plan](../03-Data-and-Storage/03-04-legacy-data-migration.md)
+- [Migration Business Scope](../03-Data-and-Storage/03-06-migration-business-scope.md)
+- [Glossary](../00-Overview/00-02-glossary.md)
+
+> **Note:** ADR-017B is a Use Case Extension of ADR-017. It extends the use of Ollama from Migration Batch processing to Smart Categorization of documents, is governed by ADR-018 (AI Isolation Policy), and forms part of ADR-020 (Unified AI Architecture).
+
+---
+
+## Context and Problem Statement
+
+### Problem to Solve
+
+The LCBP3 project has a large volume of legacy documents (more than 20,000) that must be imported into the new DMS. Categorizing them and extracting Metadata by hand has three main problems:
+
+1. **High Manual Labor:** Reading each document and typing its Metadata (Data Entry) takes a long time.
+2. **Human Error:** Typing mistakes (Typos), especially in contract numbers and proper names.
+3. **Searchability:** PDF files that are only scanned images cannot be searched by content.
+
+### Key Constraints
+
+- **Data Privacy:** Port construction documents are confidential and must not be sent to a Cloud AI Provider.
+- **Cost Constraint:** Cloud AI (~$0.01–0.03 per Record) could cost up to $600 for 20,000 records.
+- **Hardware Limitation:** Must run on the existing machine (i7-9700K / 32GB RAM / RTX 2060 Super 8GB).
+
+---
+
+## Decision Drivers
+
+- **Security First (ADR-018):** AI must run only on the Admin Desktop (Desk-5439), never on the QNAP/Production Server.
+- **Privacy Guaranteed:** Processing happens only inside the organization's network (On-Premise).
+- **Cost Effectiveness:** Zero Cost for AI Inference (no Pay-per-use fees).
+- **Data Integrity:** Data extracted by AI must pass Human Verification before it is committed to the production system.
+- **AI Isolation (ADR-018):** AI must not access the Database/Storage directly; it communicates only through the DMS API.
+- **Recoverability:** Full support for Checkpoint/Resume and Rollback.
+
+---
+
+## Considered Options
+
+### Option 1: Manual Data Entry (No AI)
+
+**Pros:**
+
+- No additional Hardware investment
+- 100% accuracy (if typed correctly)
+
+**Cons:**
+
+- ❌ Very time-consuming (possibly weeks or months)
+- ❌ High Human Error (Typos, Inconsistency)
+- ❌ Cannot make Scanned PDFs searchable
+
+### Option 2: Cloud AI Service (OpenAI, Google Vision, Azure AI)
+
+**Pros:**
+
+- Highly capable, very accurate AI
+- No Infrastructure to maintain
+
+**Cons:**
+
+- ❌ **Violates the Data Privacy policy:** port construction documents are confidential
+- ❌ **High cost** (~$600 for 20,000 records)
+- ❌ Dependency on an External Service
+
+### Option 3: Local AI (Ollama + Gemma 4 9B) + n8n + Human Verification ⭐ (Selected)
+
+**Pros:**
+
+- ✅ **Privacy Guaranteed:** runs inside the organization's network
+- ✅ **Zero Cost:** no Pay-per-use fees
+- ✅ **AI Isolation (ADR-018):** AI runs on a separate Desktop with no direct DB access
+- ✅ **Human-in-the-Loop:** Admin reviews before Commit
+- ✅ **Recoverability:** supports Checkpoint/Resume
+- ✅ **Clean Architecture:** Migration Logic is separated from the Core Application
+
+**Cons:**
+
+- ❌ The Desktop must stay powered on, with GPU Temperature monitored
+- ❌ A small Model may be less accurate than Cloud AI, so a Human Review Queue is required
+- ❌ Migration takes ~16.7 hours (~3–4 nights)
+
+---
+
+## Decision Outcome
+
+**Chosen Option:** Option 3 — Local AI (Ollama + Gemma 4 9B) + n8n + Human Verification
+
+**Rationale:**
+
+This option makes use of the existing Hardware (i7-9700K + RTX 2060 Super) without violating the project's Privacy and Security principles. n8n reduces the Risk of affecting the Core Backend and supports Checkpoint/Resume better than a hand-written Script. It also complies with **ADR-018 (AI Boundary)**, which requires AI to run on a separate Admin Desktop with no direct access to the Database/Storage.
+
+---
+
+## Implementation Architecture
+
+### System Components
+
+| Component | Details | Location |
+|-----------|---------|----------|
+| **Migration Orchestrator** | n8n (Docker on QNAP NAS) | QNAP NAS |
+| **AI Processing Engine** | Ollama + Gemma 4 9B (9.6 GB) | Admin Desktop (Desk-5439) |
+| **OCR Engine** | Tesseract or Google Vision (On-prem) | Same as AI |
+| **Verification UI** | Next.js Frontend — Review Mode | Web Browser |
+| **Database** | MariaDB — Staging + Production | QNAP NAS |
+| **File Storage** | Two-Phase Storage (Temp → Permanent) | NAS Mounts |
+
+### Workflow (Main Success Scenario)
+
+```
+[1. Ingestion]        → Batch Upload (PDF/Scan)
+      ↓
+[2. Pre-processing]   → File Validation (NestJS)
+      ↓
+[3. OCR Layer]        → Raw Text Extraction (Tesseract)
+      ↓
+[4. AI Analysis]      → Gemma 4 9B via Ollama (Desk-5439)
+      ↓
+[5. Staging]          → migration_review_queue (MariaDB)
+      ↓
+[6. Verification]     → Admin Review UI (Next.js)
+      ↓
+[7. Commit]           → POST /api/migration/commit_batch
+      ↓
+[8. Finalization]     → Permanent Storage (QNAP)
+```
+
+### AI Processing Flow (ADR-018 Compliant)
+
+```
+┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
+│   Scanned PDF   │────▶│  OCR Engine     │────▶│   Raw Text      │
+│   (Staging)     │     │  (Tesseract)    │     │   (Text)        │
+└─────────────────┘     └─────────────────┘     └────────┬────────┘
+                                                         │
+                              ┌──────────────────────────┘
+                              ▼
+┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
+│  DMS Backend    │◀────│  Ollama API     │◀────│  AI Prompt      │
+│  (Validation)   │     │  (Desk-5439)    │     │  (Gemma 4 9B)   │
+└────────┬────────┘     └─────────────────┘     └─────────────────┘
+         │
+         ▼
+┌─────────────────┐
+│  JSON Metadata  │────▶ migration_review_queue
+│  (Validated)    │
+└─────────────────┘
+```
+
+> ⚠️ **ADR-018 Enforcement:** AI (Ollama) runs on the separate Desktop Desk-5439 with no Direct DB Access or direct File System Access. All communication goes through the DMS Backend API only.
+
+---
+
+## AI Prompt Specification (Prompt Engineering)
+
+### Document Analysis Prompt
+
+```markdown
+บทบาท: คุณคือผู้ช่วยจัดการเอกสารวิศวกรรมโยธาประจำโครงการท่าเรือแหลมฉบัง เฟส 3
+ภารกิจ: อ่านข้อความดิบจากการสแกน และสรุปข้อมูลออกมาในรูปแบบ JSON เท่านั้น
+
+ข้อความดิบ: [RAW_TEXT_FROM_OCR]
+
+รูปแบบ JSON ที่ต้องการ:
+{
+  "document_type": "รายงาน/สัญญา/จดหมายโต้ตอบ",
+  "contract_number": "เลขที่สัญญา (ถ้ามี)",
+  "contractor_name": "ชื่อบริษัทผู้รับเหมา",
+  "work_phase": "Phase ของงาน (เช่น งานถมทะเล, งานไฟฟ้า)",
+  "summary": "สรุปเนื้อหาสำคัญ 1-2 ประโยค",
+  "key_dates": ["วันที่ในเอกสาร", "วันกำหนดส่งงาน"]
+}
+```
+
+### Extracted Metadata Schema
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `document_type` | enum | Correspondence / RFA / Drawing / Minutes / Contract |
+| `contract_number` | string | Contract number (e.g. LCBP3-C1) |
+| `contractor_name` | string | Contractor company name |
+| `work_phase` | string | Work phase (from Master Data) |
+| `work_zone` | string | Zone F, Basin 3, etc. |
+| `priority` | enum | ด่วนที่สุด (most urgent) / ด่วน (urgent) / ปกติ (normal) |
+| `summary` | string | Content summary, 4-5 sentences |
+| `confidence` | float | 0.0–1.0 (AI confidence) |
+| `suggested_tags` | array | Suggested Tags (is_new: true/false) |
+
+> ⚠️ **Patch Note:** `document_type` must match the System Enum from `GET /api/meta/categories` exactly. Do not hardcode the Category List in the Prompt (see ADR-017).
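
The schema above could be expressed as a TypeScript type, with `document_type` checked against categories loaded at runtime rather than hardcoded. This is only a minimal sketch: the field names come from the table, while `ExtractedMetadata`, `SuggestedTag.name`, and `validateMetadata` are hypothetical names, and the allowed categories are assumed to have been fetched from `GET /api/meta/categories` beforehand.

```typescript
// Hypothetical shape mirroring the Extracted Metadata Schema table.
interface SuggestedTag {
  name: string; // assumed field; the table only specifies is_new
  is_new: boolean;
}

interface ExtractedMetadata {
  document_type: string;
  contract_number?: string;
  contractor_name?: string;
  work_phase?: string;
  work_zone?: string;
  priority?: string;
  summary: string;
  confidence: number; // expected range 0.0 to 1.0
  suggested_tags: SuggestedTag[];
}

// Returns a list of problems; an empty list means the record passed these checks.
// `allowedCategories` is assumed to come from GET /api/meta/categories.
function validateMetadata(
  meta: ExtractedMetadata,
  allowedCategories: ReadonlySet<string>,
): string[] {
  const problems: string[] = [];
  if (!allowedCategories.has(meta.document_type)) {
    problems.push(`unknown document_type: ${meta.document_type}`);
  }
  // Written as a negated range check so that NaN is also rejected.
  if (!(meta.confidence >= 0 && meta.confidence <= 1)) {
    problems.push(`confidence out of range: ${meta.confidence}`);
  }
  return problems;
}
```

Passing the category set in as a parameter keeps the list out of both the prompt and the code, which is the point of the Patch Note.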
+
+---
+
+## Security & Compliance (ADR-018)
+
+### AI Isolation Requirements
+
+| Rule | Implementation |
+|------|----------------|
+| **Physical Isolation** | Ollama runs only on the Admin Desktop (Desk-5439) |
+| **No Direct DB Access** | AI communicates via DMS API → Backend → Database |
+| **No Direct Storage Access** | File operations go through StorageService only |
+| **Validation Layer** | Backend validates AI Output before every Write |
+| **Audit Logging** | Every AI Request/Response is recorded in the Audit Log |
+
+### Two-Phase Storage (Storage Governance)
+
+**Forbidden:**
+
+```bash
+❌ mv /data/dms/staging_ai/TCC-COR-0001.pdf /final/path/...
+```
+
+**Required (ADR-017):**
+
+```
+Phase 1: Temp Upload (by n8n)
+✅ POST /api/storage/upload
+   → returns an attachment_id (e.g. 1024)
+   → the file is marked is_temporary = TRUE
+
+Phase 2: Final Commit (by Admin via Frontend)
+✅ POST /api/migration/commit_batch
+   body: { queue_ids: [1, 2, 3] }
+```
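
The two phases can be modelled in memory to show the intended state change. This is a rough sketch, not the real StorageService: `TwoPhaseStore` and its methods are hypothetical, and it assumes for simplicity that each queue id maps one-to-one to an attachment id.

```typescript
// In-memory model of the two-phase contract; not the real StorageService.
interface Attachment {
  attachmentId: number;
  fileName: string;
  isTemporary: boolean;
}

class TwoPhaseStore {
  private nextId = 1;
  private readonly attachments = new Map<number, Attachment>();

  // Phase 1: models POST /api/storage/upload; the file starts as temporary.
  upload(fileName: string): Attachment {
    const attachment: Attachment = {
      attachmentId: this.nextId++,
      fileName,
      isTemporary: true,
    };
    this.attachments.set(attachment.attachmentId, attachment);
    return attachment;
  }

  // Phase 2: models POST /api/migration/commit_batch.
  // The whole batch is rejected if any id is unknown, so it is never half-committed.
  commitBatch(queueIds: number[]): void {
    const missing = queueIds.filter((id) => !this.attachments.has(id));
    if (missing.length > 0) {
      throw new Error(`unknown ids: ${missing.join(", ")}`);
    }
    for (const id of queueIds) {
      this.attachments.get(id)!.isTemporary = false;
    }
  }

  get(id: number): Attachment | undefined {
    return this.attachments.get(id);
  }
}
```

A file never becomes permanent by being moved on disk; only the commit call flips `isTemporary`, which is what the forbidden `mv` example bypasses.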
+
+---
+
+## Confidence Threshold Policy
+
+Every AI result is sent to `migration_review_queue` and assigned a status based on its Confidence:
+
+| Confidence Level | Status | Action |
+|-----------------|--------|--------|
+| `≥ 0.85` and `is_valid = true` | PENDING | Ready for Admin Batch Import |
+| `0.60–0.84` | PENDING | Highlighted for Admin review first |
+| `< 0.60` or `is_valid = false` | REJECTED | Awaits Manual correction by Admin |
+| AI Parse Error | ERROR_LOG | Trigger Fallback Logic |
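
The first three rows of the policy could be implemented roughly as follows. This is a sketch under assumptions: `classifyRecord` and `QueueDecision` are hypothetical names, values between 0.84 and 0.85 (which the table leaves unspecified) are treated as part of the middle band, and the `ERROR_LOG` row is omitted because a parse error occurs before any confidence value exists.

```typescript
type QueueStatus = "PENDING" | "REJECTED";

interface QueueDecision {
  status: QueueStatus;
  highlightForReview: boolean;
}

// Maps an AI result to a review-queue status following the threshold table.
function classifyRecord(confidence: number, isValid: boolean): QueueDecision {
  // The negated comparison also rejects NaN confidence values.
  if (!isValid || !(confidence >= 0.6)) {
    return { status: "REJECTED", highlightForReview: false };
  }
  if (confidence >= 0.85) {
    return { status: "PENDING", highlightForReview: false };
  }
  // Middle band: still PENDING, but flagged for closer Admin review.
  return { status: "PENDING", highlightForReview: true };
}
```

Note that even the top band only reaches `PENDING`; nothing is committed without the Admin step.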
+
+---
+
+## Performance Estimation
+
+| Parameter | Value |
+|-----------|-------|
+| Delay between Requests | 2 seconds |
+| Inference Time (avg) | ~1 second |
+| Time per Record | ~3 seconds |
+| Number of Records | 20,000 |
+| **Total time** | ~60,000 seconds (~16.7 hours) |
+| **Nights required** | **~3–4 nights** (running 22:00–06:00) |
+
+> **Recommendation:** On the i7-9700K / 32GB RAM / RTX 2060 Super 8GB machine, use a Queue (BullMQ) to process files Sequentially (one file at a time) so that Gemma 4 9B (9.6 GB) runs as stably as possible within 8GB of VRAM.
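
The arithmetic behind the estimate can be reproduced with a small helper. This is an illustrative sketch; `estimateMigration` is a hypothetical name, and the 8-hour window is taken from the 22:00–06:00 schedule in the table.

```typescript
interface MigrationEstimate {
  totalSeconds: number;
  totalHours: number;
  nights: number; // whole nightly windows needed for an uninterrupted run
}

function estimateMigration(
  records: number,
  secondsPerRecord: number,
  windowHours: number,
): MigrationEstimate {
  const totalSeconds = records * secondsPerRecord;
  const totalHours = totalSeconds / 3600;
  // A partially used night still counts as a full night.
  const nights = Math.ceil(totalHours / windowHours);
  return { totalSeconds, totalHours, nights };
}

// 20,000 records at ~3 s each, processed in 8-hour windows.
const estimate = estimateMigration(20_000, 3, 8);
```

With these inputs the run needs 60,000 seconds, about 16.7 hours, which fills a little over two 8-hour windows and therefore takes at least 3 nights. The table's upper figure of 4 nights presumably leaves slack for retries and interruptions.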
+
+---
+
+## Consequences
+
+### Positive Consequences
+
+1. ✅ **Privacy Guaranteed:** no data leaves the organization's network
+2. ✅ **Zero AI Cost:** no Pay-per-use fees
+3. ✅ **Security Compliant (ADR-018):** clear AI Isolation
+4. ✅ **Human-in-the-Loop:** 100% of records are human-verified before Commit
+5. ✅ **Recoverability:** supports Checkpoint/Resume and Rollback
+6. ✅ **Searchable Legacy:** Scanned PDFs become Searchable Metadata
+
+### Negative Consequences
+
+1. ❌ **Operational Overhead:** GPU Temperature and Desktop Uptime must be monitored
+2. ❌ **Accuracy Trade-off:** a small Model may be less accurate than Cloud AI
+3. ❌ **Time Investment:** the Migration takes ~3–4 nights
+4. ❌ **Hardware Dependency:** depends on Desktop Desk-5439
+
+### Mitigation Strategies
+
+- **Human Review Queue:** every record must be reviewed before Commit
+- **Confidence Threshold:** records are tiered by AI confidence
+- **Fallback Model:** Auto-switch to mistral:7b-instruct when Error ≥ Threshold
+- **Monitoring:** GPU Temperature and Progress are monitored continuously
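
The Fallback Model rule could look roughly like the sketch below. The ADR does not say whether errors are counted cumulatively or consecutively, nor what the threshold is; this example assumes a cumulative count with no reset, and `ModelSelector` and the threshold value are hypothetical.

```typescript
// Picks the Ollama model name to use, switching once errors reach a threshold.
class ModelSelector {
  private readonly primary: string;
  private readonly fallback: string;
  private readonly errorThreshold: number;
  private errorCount = 0;

  constructor(primary: string, fallback: string, errorThreshold: number) {
    this.primary = primary;
    this.fallback = fallback;
    this.errorThreshold = errorThreshold;
  }

  // Call once per failed or unparseable AI response.
  recordError(): void {
    this.errorCount += 1;
  }

  // Assumption: cumulative errors, never reset, so the switch is one-way.
  currentModel(): string {
    return this.errorCount >= this.errorThreshold ? this.fallback : this.primary;
  }
}

// Model names are taken from ADR-017; the threshold of 3 is illustrative only.
const selector = new ModelSelector("gemma4:9b", "mistral:7b-instruct-q4_K_M", 3);
```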
+
+---
+
+## Related Documents
+
+- [ADR-017: Ollama Data Migration Architecture](./ADR-017-ollama-data-migration.md): main Architecture for Migration
+- [ADR-018: AI Boundary Policy](./ADR-018-ai-boundary.md): Security Isolation for AI
+- [03-05-n8n-migration-setup-guide.md](../03-Data-and-Storage/03-05-n8n-migration-setup-guide.md): n8n Workflow installation guide
+- [03-04-legacy-data-migration.md](../03-Data-and-Storage/03-04-legacy-data-migration.md): detailed Migration plan
+- [03-06-migration-business-scope.md](../03-Data-and-Storage/03-06-migration-business-scope.md): Go/No-Go Gates and Business Scope
+- [00-02-glossary.md](../00-Overview/00-02-glossary.md): DMS terminology and abbreviations
+
+---
+
+## Document History
+
+| Version | Date       | Author     | Changes |
+|---------|------------|------------|---------|
+| 1.0.0   | 2026-02-26 | Nattanin   | Initial Use Case Specification |
+| 1.8.1   | 2026-03-27 | Tech Lead  | **Refactored to ADR format** — Aligned with ADR-017, ADR-018, and Project Specs |
+
+---
+
+**Last Updated:** 2026-03-27
+**Status:** Accepted
+**Next Review:** 2026-06-01 (Quarterly review)
@@ -0,0 +1,406 @@
+# ADR-018: AI Boundary Policy (AI Isolation)
+
+**Status:** Accepted
+**Date:** 2026-03-27
+**Version:** 1.8.2 (Aligned with ADR-020)
+**Decision Makers:** Security Team, System Architect, AI Integration Lead
+**Related Documents:**
+
+- [ADR-020: AI Intelligence Integration Architecture](./ADR-020-ai-intelligence-integration.md) — Overall AI Architecture & RFA-First Strategy
+- [ADR-017: Ollama Data Migration Architecture](./ADR-017-ollama-data-migration.md)
+- [ADR-017B: Smart Legacy Document Digitization](./ADR-017B-ollama.md)
+- [ADR-016: Security & Authentication](./ADR-016-security-authentication.md)
+- [ADR-019: Hybrid Identifier Strategy](./ADR-019-hybrid-identifier-strategy.md)
+- [n8n Migration Setup Guide](../03-Data-and-Storage/03-05-n8n-migration-setup-guide.md)
+- [RAG Architecture](../03-Data-and-Storage/03-07-OpenRAG.md)
+
+> **Note:** ADR-018 is the primary Security Policy governing all AI Components in the LCBP3-DMS system. Every Use Case that uses AI (Migration, RAG, Smart Categorization) must comply with this Policy, which is part of ADR-020 (Unified AI Architecture).
+
+---
+
+## Context and Problem Statement
+
+### Problem to Solve
+
+Bringing AI (Ollama, OpenRAG, or other LLMs) into a DMS that holds important documents and Confidential data for the Laem Chabang Phase 3 port project carries four main Security risks:
+
+1. **Data Exposure Risk:** If AI can access the Database directly, commercial or construction data may leak.
+2. **Unauthorized Data Modification:** AI may modify data without Human review.
+3. **Privilege Escalation:** If AI is compromised, its Database Access could be used to attack other systems.
+4. **Compliance Violation:** Non-compliance with ISO 27001 and with PDPA for personal data.
+
+### Infrastructure Constraints
+
+- **QNAP NAS:** A Production Server that should not run AI Workloads (Resource contention + Security boundary).
+- **Admin Desktop (Desk-5439):** The Admin machine has a GPU (RTX 2060 Super 8GB) suitable for AI Inference.
+- **Network Segmentation:** The AI Processing Zone (Untrusted) must be separated from the Database Zone (Trusted).
+
+---
+
+## Decision Drivers
+
+- **Zero Trust Architecture:** AI is always treated as an Untrusted Component, whether On-Premise or not.
+- **Defense in Depth:** Multiple layers of control (Physical → Network → API → Data).
+- **Auditability:** Every interaction with AI must be loggable.
+- **Human-in-the-Loop:** AI output must pass Human Validation before it is committed to the Database.
+- **Minimal Privilege:** AI receives the fewest permissions possible (Principle of Least Privilege).
+
+---
+
+## Considered Options
+
+### Option 1: AI Running on QNAP NAS (Same Host as Database)
+
+**Pros:**
+
+- ✅ Easy to install; only one machine to maintain
+- ✅ Low Network Latency (localhost)
+
+**Cons:**
+
+- ❌ **High Security Risk:** a compromised AI would have Direct Access to the Database
+- ❌ **Resource Contention:** AI Inference consumes a lot of RAM/CPU and affects Production Services
+- ❌ **No Isolation:** no Security Boundary between AI and the Core Application
+
+### Option 2: AI on a Cloud AI Provider (OpenAI, Google, Azure)
+
+**Pros:**
+
+- ✅ Highly capable, very accurate AI
+- ✅ No Hardware to maintain
+
+**Cons:**
+
+- ❌ **Violates the Data Privacy policy:** port construction documents are confidential and must not be sent to the Cloud
+- ❌ **High Cost:** Pay-per-use is a poor fit for high-volume processing
+- ❌ **No Control:** Data Retention and Audit cannot be controlled
+
+### Option 3: Physical Isolation + API-only Communication ⭐ (Selected)
+
+**Pros:**
+
+- ✅ **Clear Security Boundary:** AI runs on a separate Desktop with no direct DB access
+- ✅ **Zero Trust:** AI is treated as an Untrusted Component and communicates via API only
+- ✅ **Audit Trail:** every Request/Response passes through the Backend, which keeps a complete Audit Log
+- ✅ **Human-in-the-Loop:** the Backend validates data before Writing to the Database
+- ✅ **Resource Isolation:** AI Workload does not affect Production Services on QNAP
+- ✅ **Compliance:** consistent with ISO 27001 and PDPA
+
+**Cons:**
+
+- ❌ An additional Desktop machine must be maintained (GPU Temperature, Uptime)
+- ❌ Slightly higher Network Latency (LAN traffic)
+- ❌ The API Contract must be designed rigorously
+
+---
+
+## Decision Outcome
+
+**Chosen Option:** Option 3 — Physical Isolation + API-only Communication
+
+**Rationale:**
+
+Running AI on the Admin Desktop (Desk-5439) and forcing it to communicate only through the DMS Backend API gives the best Balance between Security, Privacy, and Operational Feasibility. AI is always treated as an **Untrusted External Component**, even though it runs on the same network.
+
+---
+
+## AI Isolation Architecture
+
+### Infrastructure Layout
+
+| Component | Host | Zone | Network Access | Database Access |
+|-----------|------|------|----------------|-----------------|
+| **Ollama / OpenRAG** | Admin Desktop (Desk-5439) | Untrusted (AI Zone) | LAN only (QNAP NAS mount) | ❌ **None** |
+| **DMS Backend** | QNAP NAS (Docker) | Trusted (App Zone) | LAN + Frontend | ✅ Full Access |
+| **MariaDB** | QNAP NAS | Trusted (DB Zone) | Localhost only | — |
+| **n8n** | QNAP NAS (Docker) | Trusted (Orchestrator) | LAN + DB | ✅ Via API only |
+
+### Communication Flow
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│  Untrusted Zone (AI Zone)                                        │
+│  Admin Desktop (Desk-5439) — RTX 2060 Super 8GB                  │
+│  ┌─────────────────┐    ┌─────────────────┐                     │
+│  │  Ollama (LLM)   │    │  OpenRAG        │                     │
+│  │  Port: 11434    │    │  (Docling)      │                     │
+│  └────────┬────────┘    └────────┬────────┘                     │
+└───────────┼─────────────────────┼───────────────────────────────┘
+            │                     │
+            │ HTTP API            │ Write JSON
+            │                     │
+┌───────────┼─────────────────────┼───────────────────────────────┐
+│           ▼                     ▼                               │
+│  Trusted Zone (App Zone)                                       │
+│  QNAP NAS (Docker)                                             │
+│  ┌─────────────────┐    ┌─────────────────┐    ┌────────────┐  │
+│  │  DMS Backend    │◀───│  n8n            │    │  MariaDB   │  │
+│  │  (NestJS)       │    │  (Poll JSON)    │    │  (Auth DB) │  │
+│  │  Port: 3001     │    └─────────────────┘    └────────────┘  │
+│  └────────┬────────┘                                            │
+│           │                                                    │
+│           │ Validation + Audit Log                             │
+│           ▼                                                    │
+│  ┌─────────────────┐                                            │
+│  │  Database       │                                            │
+│  │  (MariaDB)      │                                            │
+│  └─────────────────┘                                            │
+└─────────────────────────────────────────────────────────────────┘
+```
+
+> ⚠️ **Forbidden:** Ollama/OpenRAG **must not** run on the QNAP NAS and **must not** hold a Database Connection String
+
+---
+
+## Security Rules (Non-Negotiable)
+
+### Rule 1: Physical Isolation
+
+| Requirement | Details |
+|-------------|---------|
+| **AI Host** | Admin Desktop (Desk-5439) only |
+| **Forbidden Hosts** | QNAP NAS, Production Servers, Cloud VM |
+| **Hardware** | i7-9700K / 32GB RAM / RTX 2060 Super 8GB |
+| **Network** | LAN (192.168.x.x) — No Public IP |
+
+### Rule 2: No Direct Database Access
+
+```typescript
+// ❌ FORBIDDEN: AI must never connect to the Database directly
+const connection = await mysql.createConnection({
+  host: '192.168.1.100',
+  user: 'ai_service',  // NEVER!
+  password: '***',
+  database: 'lcbp3_dms'
+});
+
+// ✅ CORRECT: AI communicates only through the DMS Backend API
+const response = await fetch('http://192.168.1.100:3001/api/ai/analyze', {
+  method: 'POST',
+  headers: { 'Authorization': 'Bearer ' + ai_token },
+  body: JSON.stringify({ text: extractedText })
+});
+```
+
+### Rule 3: No Direct Storage Access
+
+```bash
+# ❌ FORBIDDEN: AI must never access the File System directly
+mv /data/dms/uploads/TCC-COR-0001.pdf /final/path/
+cp /staging_ai/*.pdf /processed/
+
+# ✅ CORRECT: use StorageService through the API only
+# POST /api/storage/upload
+# POST /api/migration/commit_batch
+```
+
+### Rule 4: Validation Layer
+
+```typescript
+// Backend validates AI Output before every Write.
+// DTO types and the validateSchema / checkConfidence / validateCategoryEnum
+// helpers are defined elsewhere and omitted here.
+import { Injectable } from '@nestjs/common';
+
+@Injectable()
+export class AiValidationService {
+  // AuditLogService is an assumed injectable provider; the name is illustrative.
+  constructor(private readonly auditLog: AuditLogService) {}
+
+  validateAiOutput(output: AiOutputDto): ValidationResult {
+    // 1. Schema Validation (Zod/class-validator)
+    const schemaCheck = this.validateSchema(output);
+
+    // 2. Confidence Threshold (≥ 0.85 auto-approve, 0.60–0.84 review, < 0.60 reject)
+    const confidenceCheck = this.checkConfidence(output.confidence);
+
+    // 3. Enum Enforcement (Category must be from System Enum)
+    const enumCheck = this.validateCategoryEnum(output.suggested_category);
+
+    // Combine the checks once so the log and the return value cannot diverge
+    const isValid = schemaCheck && confidenceCheck && enumCheck;
+
+    // 4. Audit Log Recording
+    this.auditLog.record({
+      action: 'AI_VALIDATION',
+      source: 'AI_SERVICE',
+      confidence: output.confidence,
+      result: isValid
+    });
+
+    return { isValid };
+  }
+}
+```
+
+### Rule 5: Audit Logging
+
+| Event | Log Level | Fields |
+|-------|-----------|--------|
+| AI Request | INFO | `timestamp`, `source_ip`, `model`, `prompt_hash` |
+| AI Response | INFO | `timestamp`, `confidence`, `processing_time`, `response_hash` |
+| Validation Pass | INFO | `record_id`, `confidence`, `validator` |
+| Validation Fail | WARN | `record_id`, `reason`, `raw_response` |
+| Unauthorized Access | ERROR | `source_ip`, `attempted_action`, `blocked_by` |
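
The `prompt_hash` field suggests logging a digest rather than the raw prompt. The sketch below shows one way to build the AI Request entry; it is an assumption that SHA-256 is the intended algorithm, and `buildRequestLog` and `sha256Hex` are hypothetical helpers, not part of the DMS codebase.

```typescript
import { createHash } from "node:crypto";

// Fields follow the "AI Request" row of the audit table.
interface AiRequestLog {
  timestamp: string;
  source_ip: string;
  model: string;
  prompt_hash: string;
}

// Assumption: SHA-256, hex encoded; the ADR does not name an algorithm.
function sha256Hex(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

function buildRequestLog(
  sourceIp: string,
  model: string,
  prompt: string,
  now: Date = new Date(),
): AiRequestLog {
  return {
    timestamp: now.toISOString(),
    source_ip: sourceIp,
    model,
    // Only the digest is stored, so confidential prompt text stays out of the log.
    prompt_hash: sha256Hex(prompt),
  };
}
```

Hashing keeps the log useful for correlating a request with its response without copying document content into a second store.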
+
+---
+
+## AI Communication Contract
+
+### API Endpoint Design
+
+```typescript
+// AI calls the Backend (via n8n or directly)
+POST /api/ai/analyze-document
+Headers:
+  - Authorization: Bearer {ai_service_token}
+  - Idempotency-Key: {document_hash}
+  - X-AI-Source: ollama | openrag
+Body:
+  {
+    "extracted_text": "Text from OCR...",
+    "document_type_hint": "pdf",
+    "source_file": "TCC-COR-2024-001.pdf"
+  }
+
+Response:
+  {
+    "is_valid": true,
+    "confidence": 0.92,
+    "suggested_category": "Correspondence",
+    "extracted_metadata": { ... },
+    "audit_log_id": "0195..."
+  }
+```
+
+### Authentication for AI Services
+
+| Service | Auth Method | Token Lifetime | Scope |
+|---------|-------------|----------------|-------|
+| **Ollama** | mTLS / IP Whitelist | Session-based | `ai:invoke` |
+| **n8n → Backend** | JWT (Service Account) | 1 hour | `migration:write`, `ai:read` |
+| **OpenRAG** | File-based (Shared NAS) | N/A | Write to `rag-output/` only |
+
+---
+
+## Data Flow Compliance
+
+### Flow 1: Migration (ADR-017)
+
+```
+[Scanned PDF] → [OCR on Desktop] → [Ollama AI] → [JSON Output]
+     │
+     ▼
+[DMS Backend API] → [Validation Layer] → [Audit Log]
+     │
+     ▼
+[Staging Table: migration_review_queue]
+     │
+     ▼
+[Human Review] → [Commit via Frontend] → [Permanent DB + Storage]
+```
+
+### Flow 2: RAG (OpenRAG)
+
+```
+[PDF Folder] → [OpenRAG on Desktop] → [JSON to rag-output/]
+     │
+     ▼
+[n8n Poll JSON] → [DMS Backend API] → [Validation + Audit]
+     │
+     ▼
+[Elasticsearch Index + MariaDB Metadata]
+```
+
+### Flow 3: Smart Categorization (ADR-017B)
+
+```
+[User Upload PDF] → [Temporary Storage]
+     │
+     ▼
+[Queue Job] → [Ollama AI via API]
+     │
+     ▼
+[Validation Layer] → [Suggestion to User]
+     │
+     ▼
+[User Confirm] → [Final Category Assignment]
+```
+
+---
+
+## Compliance Matrix
+
+| Requirement | Implementation | Evidence |
+|-------------|----------------|----------|
+| **ISO 27001 A.9.4.1** | JWT + mTLS for AI Auth | Token logs in `audit_logs` |
+| **ISO 27001 A.12.3.1** | IP Whitelist for AI Host | `192.168.x.x` only |
+| **PDPA Data Minimization** | AI does not retain data long-term | Temporary processing only |
+| **PDPA Security** | Physical Isolation + Encryption | TLS 1.3 for all API calls |
+| **OWASP BOLA** | UUID for all identifiers | ADR-019 Compliance |
+| **Zero Trust** | API-only communication | No direct DB/Storage access |
+
+---
+
+## Consequences
+
+### Positive Consequences
+
+1. ✅ **Security Hardened:** AI treated as untrusted component — all outputs validated
+2. ✅ **Audit Trail Complete:** Every AI interaction logged with hash + timestamp
+3. ✅ **Compliance Ready:** ISO 27001 + PDPA requirements met
+4. ✅ **Operational Safety:** AI failures don't compromise Production Database
+5. ✅ **Scalability:** Can add more AI services without security redesign
+
+### Negative Consequences
+
+1. ❌ **Complexity:** Need to maintain separate AI host + API contracts
+2. ❌ **Latency:** Network round-trip between AI and Backend (LAN only, acceptable)
+3. ❌ **Monitoring Overhead:** Need to monitor both QNAP and Desktop systems
+4. ❌ **Token Management:** Service accounts for AI need rotation policy
+
+### Mitigation Strategies
+
+- **Health Check:** Ollama `/api/tags` + Backend `/health` monitoring every 60 seconds
+- **Auto-Failover:** Switch to fallback model (mistral:7b) if primary model fails
+- **Token Rotation:** Service account JWT rotated every 7 days
+- **Network Redundancy:** Spare hardware for the Admin Desktop (kept on standby)
+
+---
+
+## Security Checklist (Pre-Deployment)
+
+### 🔴 Critical (Must Pass)
+
+| Check | Command/Method | Expected Result |
+|-------|---------------|-----------------|
+| AI Host Isolation | `ping 192.168.1.100` from AI Host | Success (LAN only) |
+| No DB Access from AI | `mysql -h qnap_ip -u root` from Desktop | **Connection Refused** |
+| API Auth Required | `curl http://qnap:3001/api/ai/analyze` | 401 Unauthorized |
+| Valid Token Works | `curl -H "Authorization: Bearer {valid}" ...` | 200 OK |
+| Audit Log Written | `SELECT * FROM audit_logs WHERE source='AI_SERVICE'` | Records found |
+
+### 🟡 Important (Should Pass)
+
+| Check | Method | Expected Result |
+|-------|--------|-----------------|
+| TLS Enabled | `curl -v https://...` | TLS 1.3 handshake |
+| IP Whitelist Active | Try from unauthorized IP | Blocked by Firewall |
+| Token Expiration | Use expired JWT | 401 Token Expired |
+| Idempotency Key | Replay same request | 200 OK (no duplicate write) |
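
The Idempotency Key check expects a replayed request to succeed without a second write. A minimal in-memory sketch of that behaviour follows; `IdempotentWriter` is a hypothetical name, and a real Backend would presumably persist keys in the database rather than in a `Map`.

```typescript
// Caches the result of the first write per Idempotency-Key and replays it afterwards.
class IdempotentWriter<T> {
  private readonly results = new Map<string, T>();
  private writeCount = 0;

  handle(idempotencyKey: string, write: () => T): T {
    if (this.results.has(idempotencyKey)) {
      // Replay: return the stored result without running the write again.
      return this.results.get(idempotencyKey) as T;
    }
    const result = write();
    this.writeCount += 1;
    this.results.set(idempotencyKey, result);
    return result;
  }

  // Number of writes actually executed, exposed for verification.
  get writes(): number {
    return this.writeCount;
  }
}
```

Using the document hash as the key, as the API contract does, means retries from n8n after a timeout cannot create duplicate staging rows.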
+
+---
+
+## Related Documents
+
+- [ADR-017: Ollama Data Migration Architecture](./ADR-017-ollama-data-migration.md) — Migration implementation following ADR-018
+- [ADR-017B: Smart Legacy Document Digitization](./ADR-017B-ollama.md) — Smart categorization use case
+- [ADR-016: Security & Authentication](./ADR-016-security-authentication.md) — General security strategy
+- [ADR-019: Hybrid Identifier Strategy](./ADR-019-hybrid-identifier-strategy.md) — UUID strategy for API security
+- [03-07-OpenRAG.md](../03-Data-and-Storage/03-07-OpenRAG.md) — RAG architecture under ADR-018
+- [03-05-n8n-migration-setup-guide.md](../03-Data-and-Storage/03-05-n8n-migration-setup-guide.md) — n8n setup with AI isolation
+
+---
+
+## Document History
+
+| Version | Date       | Author       | Changes                                                  |
+| ------- | ---------- | ------------ | -------------------------------------------------------- |
+| 1.8.1   | 2026-03-27 | Security Lead| Initial ADR — AI Boundary Policy (Physical Isolation)    |
+| 1.8.2   | 2026-04-03 | Tech Lead    | Updated — Aligned AI Model spec with ADR-017/017B        |
+
+---
+
+**Last Updated:** 2026-04-03
+**Status:** Accepted
+**Next Review:** 2026-06-01 (Quarterly security review)
@@ -0,0 +1,515 @@
+# ADR-020: AI Intelligence Integration Architecture
+
+**Status:** Proposed
+**Date:** 2026-04-03
+**Version:** 1.8.5
+**Decision Makers:** Development Team, AI Integration Lead, System Architect
+**Related Documents:**
+
+- [ADR-017: Ollama Data Migration Architecture](./ADR-017-ollama-data-migration.md)
+- [ADR-017B: Smart Legacy Document Digitization](./ADR-017B-ollama.md)
+- [ADR-018: AI Boundary Policy](./ADR-018-ai-boundary.md) — AI Physical Isolation
+- [ADR-019: Hybrid Identifier Strategy](./ADR-019-hybrid-identifier-strategy.md) — UUID Strategy
+- [n8n Migration Setup Guide](../03-Data-and-Storage/03-05-n8n-migration-setup-guide.md)
+
+> **Note:** ADR-020 defines the end-to-end architecture for integrating AI Intelligence into the NAP-DMS system. It takes an "RFA-First" approach that covers both the import of old documents (Legacy Migration) and the creation of new documents (New Ingestion).
+
+---
+
+## Context and Problem Statement
+
+### Problem to Solve
+
+NAP-DMS v1.8.5 needs to handle engineering documents more efficiently by using AI Intelligence in two main scenarios:
+
+1. **Legacy Document Migration:** A large number of legacy PDF documents must be imported into the system, with the Metadata in Excel checked against the content of each PDF.
+2. **New Document Ingestion:** Users upload new documents and want AI assistance to extract data automatically.
+
+### Constraints and Requirements
+
+- **Security (ADR-018):** AI must run on the Admin Desktop (Desk-5439), isolated from the main system.
+- **Data Privacy:** No data may be sent to a Cloud Provider; all processing must happen inside the organization.
+- **Human-in-the-Loop:** Data extracted by AI must always be reviewed by a human.
+- **Thai Language Support:** Thai-language documents and engineering terminology must be supported.
+
+---
+
+## Decision Drivers
+
+- **RFA-First Approach:** Start with RFA (Request for Approval) documents, which are highly complex.
+- **Unified Architecture:** Share the Pipeline and Components across both modes of operation.
+- **Data Integrity:** Data correctness is the top priority.
+- **User Experience:** Balance Batch Throughput against Real-time UX.
+- **Cost Efficiency:** Use on-premise Ollama to reduce costs.
+- **Maintainability:** Keep AI logic separate from the Core Application.
+
+---
+
+## Considered Options
+
+### Option 1: Separate AI Systems per Use Case
+
+**Pros:**
+- ✅ Specialized for each use case
+- ✅ Separate Failure Domains
+
+**Cons:**
+- ❌ High Code Duplication
+- ❌ Hard to maintain (Multiple systems)
+- ❌ Inconsistent AI Behavior
+
+### Option 2: Unified AI Pipeline with Different Frontends ⭐ (Selected)
+
+**Pros:**
+- ✅ **Single Source of Truth:** One central Pipeline
+- ✅ **Reusable Components:** DocumentReviewForm is shared by both flows
+- ✅ **Consistent Quality:** Same Prompt and Model
+- ✅ **Easier Maintenance:** Fix once, apply everywhere
+- ✅ **Cost Effective:** A single Ollama model (Gemma 4)
+
+**Cons:**
+- ❌ Must be designed to support both Batch and Real-time
+- ❌ Complex Component Design
+
+---
+
+## Decision Outcome
+
+**Chosen Option:** Option 2 — Unified AI Pipeline with Different Frontends
+
+**Rationale:**
+
+Building one central AI Pipeline and sharing Components on the Frontend reduces maintenance complexity and keeps AI quality consistent for both legacy document migration and new document ingestion.
+
+---
+
+## Architecture Overview
+
+### Core Technology Stack
+
+| Component | Technology | Host | Purpose |
+|-----------|------------|------|---------|
+| **AI Engine** | Ollama + Gemma 4 | Admin Desktop (Desk-5439) | LLM Inference |
+| **OCR Engine** | PaddleOCR | Admin Desktop (Desk-5439) | Thai/English Text Extraction |
+| **Orchestrator** | n8n | QNAP NAS (Docker) | Workflow Management |
+| **AI Gateway** | NestJS AiModule | QNAP NAS (Docker) | API Gateway & Validation |
+
+### Data Flow Architecture
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                    AI Processing Flow                           │
+├─────────────────────────────────────────────────────────────────┤
+│                                                                 │
+│  ┌─────────────┐    ┌─────────────┐    ┌─────────────────┐     │
+│  │   Input     │───▶│    n8n      │───▶│  AI Services    │     │
+│  │  (PDF/Excel)│    │  Workflow   │    │ (OCR+LLM)       │     │
+│  └─────────────┘    └─────────────┘    └────────┬────────┘     │
+│                                              │                 │
+│                                              ▼                 │
+│  ┌─────────────────────────────────────────────────────────┐   │
+│  │              DMS Backend API                           │   │
+│  │  ┌─────────────┐    ┌─────────────┐    ┌────────────┐  │   │
+│  │  │AiService    │    │Validation   │    │Audit Log   │  │   │
+│  │  │Gateway      │◀───│Layer        │◀───│Service     │  │   │
+│  │  └─────────────┘    └─────────────┘    └────────────┘  │   │
+│  └─────────────────────────────────────────────────────────┘   │
+│                              │                                 │
+│                              ▼                                 │
+│  ┌─────────────────────────────────────────────────────────┐   │
+│  │                 Frontend Layer                           │   │
+│  │  ┌─────────────────────┐    ┌─────────────────────┐     │   │
+│  │  │  Migration Dashboard │    │  Document Review   │     │   │
+│  │  │      (Admin)        │    │     Form (User)     │     │   │
+│  │  └─────────────────────┘    └─────────────────────┘     │   │
+│  └─────────────────────────────────────────────────────────┘   │
+└─────────────────────────────────────────────────────────────────┘
+```
+
+---
+
+## Implementation Modules
+
+### Backend Components
+
+#### 1. AiModule & AiService
+
+```typescript
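+import { Injectable } from '@nestjs/common';
+
+// AiExtractionResult and ValidationResult are assumed to be DTOs declared elsewhere in AiModule.
+@Injectable()
+export class AiService {
+  // Single entry point for all AI operations
+  async extractMetadata(fileId: string): Promise<AiExtractionResult> {
+    // 1. Send to n8n workflow
+    // 2. Wait for OCR + LLM processing
+    // 3. Validate results
+    // 4. Return structured data
+    throw new Error(`extractMetadata not implemented (fileId=${fileId})`);
+  }
+
+  async validateExtraction(result: AiExtractionResult): Promise<ValidationResult> {
+    // Confidence scoring, enum validation, audit logging
+    throw new Error('validateExtraction not implemented');
+  }
+}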
+```
+
+#### 2. Migration Entity
+
+```sql
+CREATE TABLE migration_logs (
+  id INT AUTO_INCREMENT PRIMARY KEY,
+  publicId BINARY(16) DEFAULT (UUID_TO_BIN(UUID(), 1)), -- Public identifier (ADR-019); note that MySQL UUID() produces UUIDv1 values
+  source_file VARCHAR(255) NOT NULL,
+  source_metadata JSON, -- Data from Excel
+  ai_extracted JSON, -- Data from AI
+  confidence_score DECIMAL(3,2),
+  status ENUM('PENDING_REVIEW', 'APPROVED', 'REJECTED'),
+  reviewed_by INT,
+  reviewed_at TIMESTAMP NULL,
+  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
+  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
+);
+```
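
The `publicId` default uses `UUID_TO_BIN(..., 1)`, which swaps the time-low and time-high groups before storing the value. The sketch below mirrors that byte order in application code so a stored value can be matched against a UUID string. The function names are hypothetical, and hex strings are used instead of binary buffers to keep the example free of runtime-specific types.

```typescript
const GROUP_LENGTHS = [8, 4, 4, 4, 12];

// Mirrors HEX(UUID_TO_BIN(uuid, 1)): group 3 (time-high) and group 1 (time-low)
// trade places, the remaining groups keep their order.
function uuidToSwappedHex(uuid: string): string {
  const groups = uuid.toLowerCase().split("-");
  const wellFormed =
    groups.length === GROUP_LENGTHS.length &&
    groups.every((g, i) => g.length === GROUP_LENGTHS[i] && /^[0-9a-f]+$/.test(g));
  if (!wellFormed) {
    throw new Error(`Not a canonical UUID string: ${uuid}`);
  }
  const [timeLow, timeMid, timeHigh, clockSeq, node] = groups;
  return `${timeHigh}${timeMid}${timeLow}${clockSeq}${node}`.toUpperCase();
}

// Inverse operation, corresponding to BIN_TO_UUID(bin, 1) on the hex form.
function swappedHexToUuid(hex: string): string {
  if (!/^[0-9a-fA-F]{32}$/.test(hex)) {
    throw new Error(`Expected 32 hex characters, got: ${hex}`);
  }
  const h = hex.toLowerCase();
  const timeHigh = h.slice(0, 4);
  const timeMid = h.slice(4, 8);
  const timeLow = h.slice(8, 16);
  const clockSeq = h.slice(16, 20);
  const node = h.slice(20);
  return [timeLow, timeMid, timeHigh, clockSeq, node].join("-");
}

const exampleUuid = "6ccd780c-baba-1026-9564-5b8c656024db";
const storedHex = uuidToSwappedHex(exampleUuid);
console.log(storedHex); // 1026BABA6CCD780C95645B8C656024DB
console.log(swappedHexToUuid(storedHex) === exampleUuid); // true
```

The sample UUID is the one used in the MySQL reference manual for `UUID_TO_BIN`, so the printed hex value can be compared against the manual directly.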
+
+#### 3. API Endpoints
+
+| Endpoint | Purpose | Access |
+|----------|---------|--------|
+| `POST /api/ai/extract` | Real-time extraction | Authenticated Users |
+| `POST /api/migration/batch` | Batch migration | Admin Only |
+| `GET /api/migration/queue` | Review queue | Admin Only |
+| `POST /api/migration/commit` | Commit approved items | Admin Only |
+
+### Frontend Components
+
+#### 1. DocumentReviewForm (Reusable Component)
+
+```typescript
+interface DocumentReviewFormProps {
+  // Source: Migration Table or AI API Response
+  sourceData: MigrationItem | AiExtractionResult;
+  // Mode: 'migration' | 'new'
+  mode: 'migration' | 'new';
+  onSubmit: (validatedData: ValidatedDocument) => void;
+}
+
+// Features:
+// - Highlight AI-suggested fields
+// - Show confidence scores
+// - Allow human correction
+// - Track feedback for AI improvement
+```
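
The "Track feedback for AI improvement" feature could be implemented by diffing the AI suggestion against what the reviewer finally submits. This is a minimal sketch under that assumption; `FieldMap`, `FieldCorrection`, and `collectCorrections` are hypothetical names, and the sample values are made up for illustration.

```typescript
type FieldMap = Record<string, string>;

interface FieldCorrection {
  field: string;
  aiValue: string | undefined; // undefined when the AI suggested nothing
  humanValue: string;
}

// Returns one entry per field where the reviewer's value differs from the AI suggestion.
// Leading and trailing whitespace is ignored so that cosmetic edits are not counted.
function collectCorrections(aiSuggested: FieldMap, humanValidated: FieldMap): FieldCorrection[] {
  const corrections: FieldCorrection[] = [];
  for (const field of Object.keys(humanValidated)) {
    const humanValue = humanValidated[field];
    const aiValue: string | undefined = aiSuggested[field];
    if (aiValue === undefined || aiValue.trim() !== humanValue.trim()) {
      corrections.push({ field, aiValue, humanValue });
    }
  }
  return corrections;
}

// Made-up sample values.
const aiDraft: FieldMap = { subject: "Pile cap rebar detail", discipline: "Civil" };
const reviewed: FieldMap = { subject: "Pile cap rebar details", discipline: "Civil " };
const corrections = collectCorrections(aiDraft, reviewed);
console.log(corrections.map((c) => c.field));
```

Only `subject` is reported here, because the trailing space in `discipline` is treated as a non-change. Whether whitespace-only edits should count is a design choice the ADR does not specify.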
+
+#### 2. Migration Dashboard (Admin)
+
+```typescript
+// Features:
+// - Filter by confidence level
+// - Bulk approve/reject
+// - Compare source vs AI data
+// - Export review reports
+```
+
+---
+
+## Workflow Specifications
+
+### Workflow 1: Legacy Migration (Batch Processing)
+
+```
+Input: Excel Metadata + PDF Files
+  │
+  ▼
+n8n Workflow:
+  1. Read Excel row
+  2. Send PDF to PaddleOCR
+  3. Extract Thai/English text
+  4. Send text + metadata to Gemma 4
+  5. AI validates consistency
+  6. Generate confidence score
+  7. Store in migration_logs (PENDING_REVIEW)
+  │
+  ▼
+Output: Migration Dashboard for Admin Review
+  │
+  ▼
+Action: Admin approves → Commit to permanent storage
+```
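
The seven n8n steps above can be sketched as a plain function to make the data flow explicit. This is an illustration only: in the ADR the orchestration lives in n8n, not in application code. `PipelineDeps`, `runBatch`, and the stub services are hypothetical, and the stubs are synchronous for brevity where real OCR and LLM calls would be asynchronous.

```typescript
type Metadata = Record<string, string>;

interface ExcelRow { sourceFile: string; metadata: Metadata; }
interface LlmVerdict { extracted: Metadata; confidence: number; }
interface MigrationLogEntry {
  sourceFile: string;
  sourceMetadata: Metadata;
  aiExtracted: Metadata;
  confidenceScore: number;
  status: "PENDING_REVIEW";
}
interface PipelineDeps {
  runOcr: (pdfPath: string) => string;                         // steps 2-3
  askLlm: (ocrText: string, metadata: Metadata) => LlmVerdict; // steps 4-6
}
interface BatchOutcome {
  entries: MigrationLogEntry[];
  failures: { sourceFile: string; reason: string }[];
}

function runBatch(rows: ExcelRow[], deps: PipelineDeps): BatchOutcome {
  const outcome: BatchOutcome = { entries: [], failures: [] };
  for (const row of rows) { // step 1: one Excel row at a time
    try {
      const text = deps.runOcr(row.sourceFile);
      const verdict = deps.askLlm(text, row.metadata);
      outcome.entries.push({ // step 7: queue for admin review
        sourceFile: row.sourceFile,
        sourceMetadata: row.metadata,
        aiExtracted: verdict.extracted,
        confidenceScore: verdict.confidence,
        status: "PENDING_REVIEW",
      });
    } catch (err) {
      // A failing document is recorded and skipped so the batch keeps going.
      const reason = err instanceof Error ? err.message : String(err);
      outcome.failures.push({ sourceFile: row.sourceFile, reason });
    }
  }
  return outcome;
}

// Stub services standing in for PaddleOCR and Gemma 4.
const stubDeps: PipelineDeps = {
  runOcr: (pdfPath) => {
    if (pdfPath.endsWith("broken.pdf")) throw new Error("OCR failed");
    return `text of ${pdfPath}`;
  },
  askLlm: (_text, metadata) => ({ extracted: { ...metadata }, confidence: 0.9 }),
};

const batchOutcome = runBatch(
  [
    { sourceFile: "rfa-001.pdf", metadata: { subject: "Sample" } },
    { sourceFile: "broken.pdf", metadata: { subject: "Sample" } },
  ],
  stubDeps,
);
console.log(batchOutcome.entries.length, batchOutcome.failures.length);
```

Every successful row lands in the review queue as `PENDING_REVIEW`, matching step 7, and nothing is committed to permanent storage by the pipeline itself.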
+
+### Workflow 2: New Ingestion (Real-time Processing)
+
+```
+Input: User uploads PDF in RFA creation form
+  │
+  ▼
+n8n Workflow (Real-time):
+  1. OCR extraction (PaddleOCR)
+  2. AI analysis (Gemma 4)
+  3. Return suggestions to frontend
+  │
+  ▼
+Output: Form auto-fill with AI suggestions
+  │
+  ▼
+Action: User reviews/edits → Saves to database
+```
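
For the auto-fill step, one possible precedence rule is that anything the user has already typed wins over the AI suggestion. The sketch below assumes that rule; `prefillForm` and `PrefilledField` are hypothetical names, and the `source` tag is there so the UI can highlight AI-suggested fields.

```typescript
type FormValues = Record<string, string | undefined>;

interface PrefilledField {
  value: string;
  source: "user" | "ai" | "empty";
}

// Builds the initial form state: user input first, then the AI suggestion, else blank.
function prefillForm(
  fields: string[],
  userInput: FormValues,
  aiSuggestions: FormValues,
): Record<string, PrefilledField> {
  const form: Record<string, PrefilledField> = {};
  for (const field of fields) {
    const typed = (userInput[field] ?? "").trim();
    const suggested = (aiSuggestions[field] ?? "").trim();
    if (typed !== "") {
      form[field] = { value: typed, source: "user" };
    } else if (suggested !== "") {
      form[field] = { value: suggested, source: "ai" };
    } else {
      form[field] = { value: "", source: "empty" };
    }
  }
  return form;
}

const rfaForm = prefillForm(
  ["subject", "discipline", "contract_number"],
  { subject: "Revised subject" },
  { subject: "AI subject", discipline: "Civil" },
);
console.log(rfaForm);
```

The user still reviews and saves the form afterwards, so the suggestion never reaches the database without a human decision.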
+
+---
+
+## AI Model Configuration
+
+### Gemma 4 Prompt Strategy
+
+```prompt
+You are an AI assistant for Laem Chabang Phase 3 construction project document analysis.
+
+TASK: Extract and validate metadata from engineering documents.
+
+RULES:
+1. Extract: Subject, Date, Discipline, Drawing Reference, Contract Number
+2. Validate: Check consistency between provided metadata and document content
+3. Confidence: Rate accuracy (0-100%) for each extracted field
+4. Language: Support Thai and English engineering terms
+5. Format: Return structured JSON
+
+OUTPUT FORMAT:
+{
+  "extracted_metadata": {
+    "subject": "...",
+    "date": "YYYY-MM-DD",
+    "discipline": "Civil|Mechanical|Electrical|Architectural",
+    "drawing_reference": "...",
+    "contract_number": "..."
+  },
+  "validation": {
+    "is_consistent": true|false,
+    "discrepancies": ["..."],
+    "confidence_score": 0.95
+  }
+}
+```
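
Because the model returns free text, the gateway has to check that a reply really matches the OUTPUT FORMAT before using it. The following is a minimal hand-written validator under the assumption that the reply is the bare JSON object with no surrounding text; `parseModelOutput` is a hypothetical name, and a schema library could do the same job.

```typescript
const DISCIPLINES = ["Civil", "Mechanical", "Electrical", "Architectural"] as const;
type Discipline = (typeof DISCIPLINES)[number];

interface ModelOutput {
  extracted_metadata: {
    subject: string;
    date: string;
    discipline: Discipline;
    drawing_reference: string;
    contract_number: string;
  };
  validation: { is_consistent: boolean; discrepancies: string[]; confidence_score: number };
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Parses the raw model reply and throws with a descriptive message on any mismatch.
function parseModelOutput(raw: string): ModelOutput {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new Error("Model output is not valid JSON");
  }
  if (!isRecord(parsed)) throw new Error("Model output must be a JSON object");
  const meta = parsed.extracted_metadata;
  const validation = parsed.validation;
  if (!isRecord(meta) || !isRecord(validation)) {
    throw new Error("Missing extracted_metadata or validation object");
  }
  for (const key of ["subject", "date", "drawing_reference", "contract_number"]) {
    if (typeof meta[key] !== "string") throw new Error(`Field ${key} must be a string`);
  }
  if (!/^\d{4}-\d{2}-\d{2}$/.test(String(meta.date))) throw new Error("date must be YYYY-MM-DD");
  const discipline = meta.discipline;
  if (typeof discipline !== "string" || DISCIPLINES.indexOf(discipline as Discipline) === -1) {
    throw new Error("discipline is not one of the allowed values");
  }
  const score = validation.confidence_score;
  if (typeof score !== "number" || score < 0 || score > 1) {
    throw new Error("confidence_score must be a number between 0 and 1");
  }
  if (typeof validation.is_consistent !== "boolean") throw new Error("is_consistent must be boolean");
  const discrepancies = validation.discrepancies;
  if (!Array.isArray(discrepancies) || !discrepancies.every((d) => typeof d === "string")) {
    throw new Error("discrepancies must be an array of strings");
  }
  return parsed as unknown as ModelOutput;
}

// Made-up sample reply.
const sampleReply = JSON.stringify({
  extracted_metadata: {
    subject: "Sample subject", date: "2026-04-03", discipline: "Civil",
    drawing_reference: "DWG-000", contract_number: "C-000",
  },
  validation: { is_consistent: true, discrepancies: [], confidence_score: 0.95 },
});
const parsedReply = parseModelOutput(sampleReply);
console.log(parsedReply.validation.confidence_score);
```

The sketch treats `confidence_score` as a fraction between 0 and 1, following the sample in OUTPUT FORMAT rather than the percentage wording in rule 3.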
+
+### Confidence Scoring Strategy
+
+| Score Range | Action |
+|-------------|--------|
+| **95-100%** | Auto-approve (migration only) |
+| **85-94%** | Low priority review |
+| **60-84%** | High priority review |
+| **< 60%** | Reject / Requires manual entry |
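
The score ranges above translate into a small routing function. This sketch assumes the score arrives as a fraction with two decimals (as in `confidence_score DECIMAL(3,2)`), and that a score of 95% or more in the real-time flow falls back to a low priority review, since the table limits auto-approval to migration. `routeByConfidence` and the action names are hypothetical.

```typescript
type Mode = "migration" | "new";
type ReviewAction = "AUTO_APPROVE" | "LOW_PRIORITY_REVIEW" | "HIGH_PRIORITY_REVIEW" | "REJECT";

function routeByConfidence(score: number, mode: Mode): ReviewAction {
  if (!Number.isFinite(score) || score < 0 || score > 1) {
    throw new RangeError(`confidence score must be within 0..1, got ${score}`);
  }
  // Compare whole percentages to avoid floating-point noise such as 0.95 * 100.
  const percent = Math.round(score * 100);
  if (percent >= 95) {
    return mode === "migration" ? "AUTO_APPROVE" : "LOW_PRIORITY_REVIEW"; // assumption for 'new'
  }
  if (percent >= 85) return "LOW_PRIORITY_REVIEW";
  if (percent >= 60) return "HIGH_PRIORITY_REVIEW";
  return "REJECT";
}

console.log(routeByConfidence(0.95, "migration"));
console.log(routeByConfidence(0.72, "new"));
```

How `AUTO_APPROVE` relates to the Human Validation rule under Security & Compliance is not settled by this sketch; it only maps a score to a queue.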
+
+---
+
+## Security & Compliance
+
+### AI Boundary Enforcement (ADR-018)
+
+| Rule | Implementation |
+|------|----------------|
+| **Physical Isolation** | AI runs on Admin Desktop only |
+| **No Direct DB Access** | All communication via DMS API |
+| **API Authentication** | JWT tokens with `ai:invoke` scope |
+| **Audit Logging** | Every AI interaction logged |
+| **Human Validation** | No auto-commit without review |
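
The API Authentication rule can be sketched as a check on already-decoded token claims. The sketch assumes that signature verification happened upstream, that `scope` is a space-delimited string as is common in OAuth-style tokens, and that `exp` is in seconds since the epoch. `authorizeAiCall` and the claim shape are hypothetical.

```typescript
interface TokenClaims {
  sub: string;
  scope?: string; // space-delimited list of scopes (assumed format)
  exp: number;    // expiry, seconds since the Unix epoch
}

type AuthDecision =
  | { allowed: true }
  | { allowed: false; status: 401 | 403; reason: string };

const REQUIRED_SCOPE = "ai:invoke";

// Decides whether a caller may invoke AI endpoints. Does NOT verify the signature.
function authorizeAiCall(claims: TokenClaims, nowSeconds: number): AuthDecision {
  if (claims.exp <= nowSeconds) {
    return { allowed: false, status: 401, reason: "Token Expired" };
  }
  const scopes = (claims.scope ?? "").split(" ").filter((s) => s.length > 0);
  if (scopes.indexOf(REQUIRED_SCOPE) === -1) {
    return { allowed: false, status: 403, reason: `Missing ${REQUIRED_SCOPE} scope` };
  }
  return { allowed: true };
}

const nowSec = 1_775_000_000; // fixed clock value so the example is deterministic
const granted = authorizeAiCall({ sub: "n8n-service", scope: "ai:invoke audit:write", exp: nowSec + 60 }, nowSec);
const expired = authorizeAiCall({ sub: "n8n-service", scope: "ai:invoke", exp: nowSec - 1 }, nowSec);
const unscoped = authorizeAiCall({ sub: "viewer", scope: "docs:read", exp: nowSec + 60 }, nowSec);
console.log(granted, expired, unscoped);
```

Expiry is checked before scope so that an expired token always yields 401, which lines up with the "401 Token Expired" row in the ADR-018 verification checklist.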
+
+### Data Privacy Measures
+
+- **Local Processing Only:** No data leaves corporate network
+- **Temporary Storage:** AI processes data in memory only
+- **Encryption:** All API calls use TLS 1.3
+- **Data Retention:** AI logs retained for 90 days only
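
The 90-day retention rule implies a periodic cleanup job. A minimal sketch of the selection logic follows, assuming each log record carries a `loggedAt` timestamp; `isPastRetention` and `selectExpired` are hypothetical names, and the actual deletion would happen in the database.

```typescript
const RETENTION_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// True when the record is older than the retention window at time `now`.
function isPastRetention(loggedAt: Date, now: Date): boolean {
  return now.getTime() - loggedAt.getTime() > RETENTION_DAYS * MS_PER_DAY;
}

// Picks the records a cleanup job would purge.
function selectExpired<T extends { loggedAt: Date }>(logs: T[], now: Date): T[] {
  return logs.filter((log) => isPastRetention(log.loggedAt, now));
}

const cleanupTime = new Date("2026-04-03T00:00:00Z");
const sampleLogs = [
  { id: 1, loggedAt: new Date("2026-01-01T00:00:00Z") }, // 92 days old
  { id: 2, loggedAt: new Date("2026-03-01T00:00:00Z") }, // 33 days old
];
const toPurge = selectExpired(sampleLogs, cleanupTime);
console.log(toPurge.map((log) => log.id));
```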
+
+---
+
+## Implementation Roadmap
+
+### Phase 1: Pipeline Infrastructure (Task BE-AI-01)
+
+**Week 1-2: AI Pipeline Foundation**
+1. **Docker Environment Setup** on Admin Desktop (Desk-5439)
+   - n8n service with Basic Authentication
+   - Ollama with Gemma 4 model (GPU optimized)
+   - PaddleOCR service with Thai language support
+2. **n8n Workflow Development**
+   - Webhook trigger for DMS integration
+   - OCR → AI → JSON processing pipeline
+   - Error handling and retry logic
+3. **Prompt Engineering**
+   - Thai engineering document templates
+   - JSON schema validation
+   - Confidence scoring implementation
+4. **Integration Testing**
+   - End-to-end pipeline validation
+   - Security boundary verification
+   - Performance benchmarking
+
+### Phase 2: Backend AI Gateway (Task BE-AI-02)
+
+**Week 3-4: NestJS Integration Layer**
+1. **Database Schema Design** (SQL First)
+   - `migration_logs` table with an INT primary key and a UUID public identifier (per ADR-019)
+   - `ai_audit_logs` for performance tracking
+   - Data dictionary updates
+2. **AI Module Architecture**
+   - `AiService` with n8n webhook integration
+   - `MigrationService` for business logic
+   - Validation layer with confidence thresholds
+3. **API Endpoints & Security**
+   - Admin migration endpoints with CASL guards
+   - Real-time extraction endpoint (`/api/ai/extract`)
+   - Idempotency and rate limiting implementation
+4. **Configuration Management**
+   - Service account authentication
+   - Environment variables for AI endpoints
+   - Monitoring and logging setup
+
+### Phase 3: Frontend Human-in-the-Loop (Task FE-AI-03)
+
+**Week 5-6: User Experience & Validation**
+1. **Reusable AI Components**
+   - `AiSuggestionField` with confidence indicators
+   - `DocumentComparisonView` with PDF sidebar
+   - Client-side validation with Zod + React Hook Form
+2. **Admin Migration Dashboard**
+   - Paginated table with filtering/sorting
+   - Bulk actions for high-confidence items
+   - Error logging and retry mechanisms
+3. **Real-time Ingestion Integration**
+   - RFA creation flow enhancement
+   - AI processing state indicators
+   - Auto-fill with user override capability
+4. **Human-AI Feedback Loop**
+   - User correction tracking
+   - Performance analytics dashboard
+   - Accuracy metrics and reporting
+
+### Phase 4: Testing & Deployment
+
+**Week 7-8: Production Readiness**
+1. **Comprehensive Testing**
+   - Thai/English document validation
+   - Confidence scoring accuracy verification
+   - Load testing and performance optimization
+2. **Security Audit**
+   - ADR-018 boundary verification
+   - Penetration testing of AI endpoints
+   - Data privacy compliance check
+3. **User Training & Documentation**
+   - Admin workflow training
+   - User guide for AI-assisted document creation
+   - Troubleshooting and support procedures
+4. **Production Deployment**
+   - Blue-green deployment strategy
+   - Monitoring and alerting setup
+   - Rollback procedures and contingency plans
+
+---
+
+## Success Metrics
+
+### Technical Performance Metrics
+
+| Metric | Target | Measurement Method |
+|--------|--------|-------------------|
+| **Thai OCR Accuracy** | >90% | Character-by-character comparison with ground truth |
+| **AI JSON Validity** | 100% | Automated validation of all AI responses |
+| **Processing Time** | <15s/document | End-to-end timing from upload to suggestion |
+| **GPU Memory Usage** | <6GB per doc | Resource monitoring on Admin Desktop |
+| **System Uptime** | >99% | Service availability monitoring |
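
The Thai OCR Accuracy target is measured by character-by-character comparison with ground truth. The ADR does not give a formula, so this sketch assumes one common definition: one minus the character error rate, where errors are counted by Levenshtein edit distance. `editDistance` and `characterAccuracy` are hypothetical names.

```typescript
// Levenshtein distance over Unicode code points, using two rolling rows.
function editDistance(a: string, b: string): number {
  const s = Array.from(a);
  const t = Array.from(b);
  let previous: number[] = [];
  for (let j = 0; j <= t.length; j++) previous.push(j);
  for (let i = 1; i <= s.length; i++) {
    const current: number[] = [i];
    for (let j = 1; j <= t.length; j++) {
      const cost = s[i - 1] === t[j - 1] ? 0 : 1;
      current.push(Math.min(
        previous[j] + 1,        // deletion
        current[j - 1] + 1,     // insertion
        previous[j - 1] + cost, // substitution or match
      ));
    }
    previous = current;
  }
  return previous[t.length];
}

// Accuracy relative to the ground-truth length, clamped to the range 0..1.
function characterAccuracy(ocrText: string, groundTruth: string): number {
  const truthLength = Array.from(groundTruth).length;
  if (truthLength === 0) return ocrText.length === 0 ? 1 : 0;
  return Math.max(0, 1 - editDistance(ocrText, groundTruth) / truthLength);
}

console.log(editDistance("kitten", "sitting")); // 3
console.log(characterAccuracy("สวัสดิ", "สวัสดี"));
```

Thai vowel and tone marks are separate code points, so a wrong mark counts as one character error under this definition. Whether that granularity suits the >90% target is a measurement decision left open here.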
+
+### Business Impact Metrics
+
+| Metric | Target | Measurement Method |
+|--------|--------|-------------------|
+| **Data Entry Time Reduction** | 70% | Time comparison manual vs AI-assisted workflows |
+| **AI Accuracy Rate** | 85%+ | Human verification of AI extractions |
+| **Migration Throughput** | 1000 docs/day | Batch processing capacity with admin review |
+| **User Correction Rate** | <15% | Percentage of AI suggestions modified by users |
+| **Admin Productivity** | 3x improvement | Documents processed per admin hour |
+
+### User Experience Metrics
+
+| Metric | Target | Measurement Method |
+|--------|--------|-------------------|
+| **User Satisfaction** | 4.0/5.0 | Post-deployment user survey |
+| **Task Completion Rate** | >95% | Successful document creation rate |
+| **Learning Curve** | <30 min | Time to proficiency for new users |
+| **Error Rate** | <2% | Failed AI extractions requiring manual intervention |
+
+### Security & Compliance Metrics
+
+| Metric | Target | Measurement Method |
+|--------|--------|-------------------|
+| **Security Incidents** | 0 | Audit log monitoring and breach detection |
+| **Data Privacy Compliance** | 100% | Adherence to ADR-018 and PDPA requirements |
+| **Audit Trail Completeness** | 100% | All AI interactions logged and traceable |
+| **API Response Times** | <200ms | DMS API performance under load |
+
+---
+
+## Risk Assessment & Mitigation
+
+### 🔴 High-Risk Items
+
+| Risk | Impact | Probability | Mitigation Strategy |
+|------|--------|-------------|-------------------|
+| **AI Accuracy on Thai Documents** | High | Medium | Custom prompt engineering + Extensive testing with Thai engineering docs |
+| **Admin Desktop Hardware Failure** | High | Low | Backup desktop ready + Cloud AI fallback plan (emergency only) |
+| **Data Privacy Violations** | Critical | Low | Strict ADR-018 enforcement + Regular security audits |
+| **Performance Bottlenecks** | Medium | Medium | Queue system + GPU monitoring + Load balancing |
+
+### 🟡 Medium-Risk Items
+
+| Risk | Impact | Probability | Mitigation Strategy |
+|------|--------|-------------|-------------------|
+| **User Adoption Resistance** | Medium | Medium | Comprehensive training + UI/UX optimization + Early user involvement |
+| **Thai OCR Quality Issues** | Medium | Medium | Multiple OCR engines + Manual correction workflow |
+| **Integration Complexity** | Medium | Low | Phased deployment + Extensive testing + Rollback procedures |
+| **Cost Overruns** | Medium | Low | On-premise AI eliminates per-use costs + Resource monitoring |
+
+### 🟢 Low-Risk Items
+
+| Risk | Impact | Probability | Mitigation Strategy |
+|------|--------|-------------|-------------------|
+| **Technology Stack Changes** | Low | Low | Containerized deployment + Version pinning |
+| **Vendor Dependency** | Low | Low | Open-source stack + Multiple model options |
+| **Regulatory Changes** | Medium | Low | Flexible architecture + Compliance monitoring |
+
+---
+
+## Related Documents & Tasks
+
+### Architecture Decision Records
+- **[ADR-017: Ollama Data Migration](./ADR-017-ollama-data-migration.md)** — Foundation migration architecture
+- **[ADR-017B: Smart Categorization](./ADR-017B-ollama.md)** — AI categorization use cases
+- **[ADR-018: AI Boundary Policy](./ADR-018-ai-boundary.md)** — Security isolation requirements (CRITICAL)
+- **[ADR-019: Hybrid Identifier Strategy](./ADR-019-hybrid-identifier-strategy.md)** — UUID usage patterns (CRITICAL)
+
+### Implementation Tasks
+- **[Task BE-AI-01: Pipeline Infrastructure Setup](../08-Tasks/Task%20BE-AI-01.md)** — n8n + PaddleOCR + Gemma 4 setup
+- **[Task BE-AI-02: Backend AI Gateway Development](../08-Tasks/Task%20BE-AI-02.md)** — NestJS integration layer
+- **[Task FE-AI-03: Frontend Human-in-the-Loop Interface](../08-Tasks/Task-FE-AI-03.md)** — User experience and validation
+
+### Technical Specifications
+- **[03-05-n8n-migration-setup-guide.md](../03-Data-and-Storage/03-05-n8n-migration-setup-guide.md)** — n8n configuration details
+- **[05-02-backend-guidelines.md](../05-Engineering-Guidelines/05-02-backend-guidelines.md)** — NestJS patterns and conventions
+- **[05-03-frontend-guidelines.md](../05-Engineering-Guidelines/05-03-frontend-guidelines.md)** — Next.js patterns and UI standards
+- **[03-01-data-dictionary.md](../03-Data-and-Storage/03-01-data-dictionary.md)** — Field definitions and business rules
+
+### Compliance & Security
+- **[ADR-016: Security & Authentication](./ADR-016-security-authentication.md)** — Overall security framework
+- **[04-08-release-management-policy.md](../04-Infrastructure-OPS/04-08-release-management-policy.md)** — Deployment procedures
+
+---
+
+## Document History
+
+| Version | Date | Author | Changes |
+|---------|------|--------|---------|
+| 1.8.5 | 2026-04-03 | AI Integration Lead | Initial ADR — AI Intelligence Integration Architecture |
+| 1.8.6 | 2026-04-03 | Tech Lead | Updated — Aligned with detailed task specifications and implementation requirements |
+
+---
+
+**Last Updated:** 2026-04-03
+**Status:** Proposed
+**Review Date:** 2026-04-10
+**Implementation Target:** v1.9.0
@@ -75,6 +75,15 @@ Architecture Decision Records (ADRs) เป็นเอกสารที่บ
 | -------------------------------------------------- | -------------------------- | ----------- | ---------- | ---------------------------------------------------- |
 | [ADR-019](./ADR-019-hybrid-identifier-strategy.md) | Hybrid Identifier Strategy | ✅ Accepted | 2026-03-11 | INT PK (internal) + UUIDv7 (public API) on 14 tables |

+### AI & Data Integration
+
+| ADR                                             | Title                              | Status        | Date       | Summary                                                                      |
+| ----------------------------------------------- | ---------------------------------- | ------------- | ---------- | ---------------------------------------------------------------------------- |
+| [ADR-017](./ADR-017-ollama-data-migration.md)  | Ollama Data Migration Architecture | ✅ Accepted   | 2026-02-26 | On-premise AI (Ollama) for migrating 20,000+ PDFs, with a Validation Layer  |
+| [ADR-017B](./ADR-017B-ollama.md)               | Smart Legacy Document Digitization | ✅ Accepted   | 2026-03-27 | AI-powered categorization for legacy documents, per ADR-018 (AI Isolation)   |
+| [ADR-018](./ADR-018-ai-boundary.md)            | AI Boundary Policy                 | ✅ Accepted   | 2026-03-27 | Physical Isolation + API-only communication (Zero Trust for AI)              |
+| [ADR-020](./ADR-020-ai-intelligence-integration.md) | AI Intelligence Integration Architecture | 🔄 Proposed | 2026-04-03 | Unified AI Pipeline for RFA-First (Legacy Migration + New Ingestion)         |
+
 ---

 ## 🔍 ADR Categories
@@ -120,6 +129,13 @@ Architecture Decision Records (ADRs) เป็นเอกสารที่บ

 - **ADR-019:** Hybrid Identifier Strategy - INT PK (internal) + UUIDv7 (public API) on 14 tables

+### 9. AI & Data Integration
+
+- **ADR-017:** Ollama Data Migration - On-premise AI (Ollama) for migrating 20,000+ PDFs
+- **ADR-017B:** Smart Document Digitization - AI-powered categorization for legacy documents
+- **ADR-018:** AI Boundary Policy - Physical Isolation + API-only communication (Zero Trust)
+- **ADR-020:** AI Intelligence Integration - Unified AI Pipeline for the RFA-First approach
+
 ---

 ## 📖 How to Read ADRs