# Task BE-AI-01: Pipeline Infrastructure Setup

**Phase:** Step 1 - AI Pipeline Foundation (n8n + PaddleOCR + Gemma)
**ADR Compliance:** ADR-018 (AI Boundary), ADR-019 (UUID Strategy)
**Priority:** 🔴 Critical - Foundation for all AI features

**Context:** This task is the core foundation of the Document Intelligence system per ADR-020. It must comply with the AI Isolation policy and use the correct identifier strategy.
## 📋 Implementation Tasks

### **AI-1.1: Infrastructure Setup (ADR-018 Compliance)**
- [ ] **Docker Environment on Admin Desktop (Desk-5439):**
  - Install Docker Compose for n8n and PaddleOCR
  - Configure network isolation (LAN only, no public access)
  - Verify hardware: RTX 2060 Super 8GB VRAM availability
- [ ] **n8n Service:**
  - Docker Compose service with Basic Authentication
  - Webhook endpoint: `/webhook/ai-processing`
  - Environment variables: `N8N_BASIC_AUTH_USER`, `N8N_BASIC_AUTH_PASSWORD`
- [ ] **Ollama Service:**
  - Pull model: `gemma:9b` (GPU optimized, higher accuracy): `ollama pull gemma:9b`
  - API endpoint: `http://localhost:11434`
  - Health check: `GET /api/tags`
  - Memory requirement: minimum 8GB VRAM for the 9B model
  - Set Windows system environment variables, e.g. in PowerShell:
    - `[System.Environment]::SetEnvironmentVariable('OLLAMA_HOST', '0.0.0.0', 'User')`
    - `[System.Environment]::SetEnvironmentVariable('OLLAMA_KEEP_ALIVE', '30m', 'User')`
    - `[System.Environment]::SetEnvironmentVariable('OLLAMA_NUM_GPU', '1', 'User')`
    - `[System.Environment]::SetEnvironmentVariable('OLLAMA_NUM_THREAD', '8', 'User')`
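The health check can be verified programmatically against the shape of Ollama's `GET /api/tags` response, which lists pulled models under `models[].name`. A minimal sketch of the check the pipeline could run before wiring up n8n (the sample data here is illustrative):

```typescript
// Shape of the relevant part of Ollama's GET /api/tags response.
interface OllamaTagsResponse {
  models: { name: string }[];
}

// Returns true when a model whose name starts with the given prefix
// (e.g. "gemma") has been pulled and is available for inference.
function hasModel(tags: OllamaTagsResponse, prefix: string): boolean {
  return tags.models.some((m) => m.name.startsWith(prefix));
}
```

In the real setup this would be fed by `fetch("http://localhost:11434/api/tags")`; the pure function keeps the check testable without a running service.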
- [ ] **PaddleOCR Service:**
- Docker image: `paddlepaddle/paddle:latest-gpu`
- Thai language support configuration
- API endpoint design: `POST /ocr/extract`
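The `POST /ocr/extract` contract above can be pinned down as a TypeScript interface plus a usability guard. This is a sketch of the assumed contract, not the service itself; the `minConfidence` threshold of 0.6 is an assumption, not a value from this spec:

```typescript
// Assumed response shape of the PaddleOCR wrapper service's POST /ocr/extract.
interface OcrResponse {
  text: string;
  confidence: number;      // 0.0-1.0, averaged over recognized lines
  language: "th" | "en";
}

// Guard the n8n OCR node could apply before passing text downstream.
// The 0.6 default threshold is a placeholder to be tuned against real scans.
function isUsableOcrResult(r: OcrResponse, minConfidence = 0.6): boolean {
  return r.text.trim().length > 0 && r.confidence >= minConfidence;
}
```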
### **AI-1.2: n8n Workflow Development**
- [ ] **Webhook Trigger Node:**
- Input: `{ publicId: string, fileUrl: string, context: 'migration'|'ingestion' }`
- Validation: Verify `publicId` format (UUIDv7) before processing
- Idempotency check: Prevent duplicate processing
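The UUIDv7 format check and the idempotency guard can be sketched as below. The regex only verifies the version nibble (`7`) and variant nibble; the in-memory `Set` stands in for whatever persistent store (e.g. a DB table keyed by `publicId`) the real check would use:

```typescript
// UUIDv7: 8-4-4-4-12 hex, version nibble '7', variant nibble in [89ab].
const UUID_V7 =
  /^[0-9a-f]{8}-[0-9a-f]{4}-7[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

function isUuidV7(id: string): boolean {
  return UUID_V7.test(id);
}

// Stand-in for a persistent processed-document store.
const processed = new Set<string>();

// Returns false (caller should skip work) when the publicId was already handled.
function claimForProcessing(publicId: string): boolean {
  if (!isUuidV7(publicId)) throw new Error(`invalid publicId: ${publicId}`);
  if (processed.has(publicId)) return false;
  processed.add(publicId);
  return true;
}
```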
- [ ] **OCR Integration Node:**
- HTTP Request to PaddleOCR service
- Input: Binary file data
- Output: `{ text: string, confidence: number, language: 'th'|'en' }`
- Error handling: Retry logic + fallback to CPU OCR
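The retry-then-fallback policy can be sketched with the GPU and CPU OCR calls injected as functions, so the policy itself is testable without a running service. The retry count of 3 is an assumption; a real node would also log and back off between attempts:

```typescript
// Try the GPU OCR service up to maxRetries times; on persistent failure,
// fall back to a CPU OCR path.
function extractWithFallback(
  gpuOcr: () => string,
  cpuOcr: () => string,
  maxRetries = 3,
): { text: string; usedFallback: boolean } {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return { text: gpuOcr(), usedFallback: false };
    } catch {
      // Swallow and retry; a real node would log the error and back off here.
    }
  }
  return { text: cpuOcr(), usedFallback: true };
}
```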
- [ ] **Prompt Engineering Node:**
- Function Node to construct Gemma 4 prompt
- Template includes: Role definition, validation rules, JSON schema
- Thai engineering context keywords
- [ ] **Gemma 4 LLM Node:**
- HTTP Request to Ollama API
- Model: `gemma:9b` (enhanced accuracy for Thai engineering documents)
- Parameters: `temperature: 0.1`, `max_tokens: 2048`
- Output validation: Ensure valid JSON response
- Memory monitoring: Track VRAM usage during inference
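The HTTP Request node's body for Ollama's `/api/generate` can be sketched as a typed payload builder. Field names follow the Ollama API (which expresses the max-token limit as `options.num_predict` and supports `format: "json"` to constrain output); the prompt content itself comes from the prompt node above:

```typescript
// Request body for POST http://localhost:11434/api/generate.
interface OllamaGenerateRequest {
  model: string;
  prompt: string;
  stream: boolean;
  format: "json";
  options: { temperature: number; num_predict: number };
}

function buildGenerateRequest(prompt: string): OllamaGenerateRequest {
  return {
    model: "gemma:9b",    // model pulled in AI-1.1
    prompt,
    stream: false,        // wait for the complete response
    format: "json",       // ask Ollama to constrain output to valid JSON
    options: {
      temperature: 0.1,   // low temperature for deterministic extraction
      num_predict: 2048,  // Ollama's name for the max-tokens parameter
    },
  };
}
```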
- [ ] **Result Processing Node:**
- Parse and validate AI response
- Calculate confidence scores
- Format for DMS Backend API callback
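The parse-and-validate step can be sketched as follows. Stripping a markdown fence guards against models that wrap their JSON in ``` markers; taking the minimum field confidence as the overall score is a conservative assumption (an average would be a reasonable alternative), and the required-field list here is a subset for illustration:

```typescript
// Fields the AI response must contain (illustrative subset of the schema).
const REQUIRED = ["subject", "document_date", "discipline"] as const;

// Strip any markdown code fence, parse the JSON, and enforce required fields.
function parseAiResponse(raw: string): Record<string, unknown> {
  const cleaned = raw
    .replace(/^```(?:json)?\s*/i, "")
    .replace(/\s*```$/, "")
    .trim();
  const data = JSON.parse(cleaned) as Record<string, unknown>;
  for (const field of REQUIRED) {
    if (!(field in data)) throw new Error(`missing field: ${field}`);
  }
  return data;
}

// Conservative aggregation: overall confidence = weakest field confidence.
function overallConfidence(fieldConfidence: Record<string, number>): number {
  return Math.min(...Object.values(fieldConfidence));
}
```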
- [ ] **Callback to DMS:**
- HTTP POST to NestJS webhook endpoint
- Payload: `{ publicId, extractedData, confidence, processingTime }`
- Authentication: Service account JWT
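The callback request can be sketched as a builder over the payload fields listed above. The DMS endpoint URL is a hypothetical placeholder, and the JWT is assumed to be provisioned out of band for the service account:

```typescript
// Payload fields per the callback contract above.
interface CallbackPayload {
  publicId: string;
  extractedData: Record<string, unknown>;
  confidence: number;
  processingTime: number; // milliseconds
}

// Assemble the HTTP POST the n8n workflow would send to the DMS backend.
function buildCallback(
  payload: CallbackPayload,
  jwt: string,
): { url: string; headers: Record<string, string>; body: string } {
  return {
    url: "http://dms-backend/api/ai/callback", // hypothetical endpoint path
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${jwt}`, // service-account JWT
    },
    body: JSON.stringify(payload),
  };
}
```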
### **AI-1.3: Prompt Engineering for Thai Engineering Documents**
- [ ] **System Prompt Template:**
  ```prompt
  You are a Senior Document Controller for the Laem Chabang Port Phase 3 construction project.

  TASK: Extract metadata from engineering documents with high accuracy.

  RULES:
  1. Extract: subject, document_date, discipline, drawing_reference, contract_number
  2. Validate consistency between content and metadata
  3. Return a confidence score (0.0-1.0) for each field
  4. Support Thai and English engineering terms
  5. Output MUST be valid JSON only

  OUTPUT FORMAT:
  {
    "subject": "string",
    "document_date": "YYYY-MM-DD",
    "discipline": "Civil|Mechanical|Electrical|Architectural",
    "drawing_reference": "string",
    "contract_number": "string",
    "confidence": {
      "overall": 0.95,
      "field_confidence": {...}
    }
  }
  ```
- [ ] **Thai Language Optimization:**
  - Engineering terms: "วิศวกรรมโยธา" (civil engineering), "แบบรายละเอียด" (detail drawing), "ขออนุมัติ" (request for approval), "แผนผัง" (layout plan)
  - Date format recognition: Thai Buddhist years (พ.ศ.)
  - Organization names: "ท่าเรือแหลมฉบัง" (Laem Chabang Port), "ก.ท.ม.", "การท่าเรือฯ" (the Port Authority)
- [ ] **JSON Schema Validation:**
  - Zod schema for response validation
  - Required fields enforcement
  - Type checking and sanitization
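The Buddhist-era date handling mentioned above follows a fixed offset: a BE year equals the CE year plus 543, so a document dated 15/01/2567 corresponds to 2024-01-15. A minimal normalization sketch, handling only the `d/m/yyyy` pattern (real documents would need more formats):

```typescript
// Convert a Thai Buddhist-era date string (d/m/yyyy) to an ISO CE date.
// BE year = CE year + 543. Returns null for unrecognized formats.
function buddhistToIso(d: string): string | null {
  const m = /^(\d{1,2})\/(\d{1,2})\/(\d{4})$/.exec(d.trim());
  if (!m) return null;
  const [, day, month, beYear] = m;
  const ceYear = parseInt(beYear, 10) - 543;
  return `${ceYear}-${month.padStart(2, "0")}-${day.padStart(2, "0")}`;
}
```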
### **AI-1.4: Integration Testing & Validation**
- [ ] **Test Case 1: Legacy Migration Flow:**
  - Input: Scanned RFA PDF + Excel metadata
  - Expected: Thai text extraction >90% accuracy
  - Validation: AI output matches Excel data
- [ ] **Test Case 2: Real-time Ingestion Flow:**
  - Input: New PDF upload from user
  - Expected: Response time <15 seconds
  - Validation: Structured JSON response
- [ ] **Performance Benchmarking:**
  - Target: <15 seconds per document
  - Memory usage monitoring on Admin Desktop
  - GPU utilization tracking
- [ ] **Security Validation:**
  - Verify no external network calls
  - Confirm AI services run in isolation
  - Test authentication between n8n and DMS
## ✅ Acceptance Criteria

**Pipeline Functionality:**
- [ ] n8n successfully processes the PDF → OCR → AI → JSON flow
- [ ] Thai text extraction accuracy >90%
- [ ] Gemma returns valid JSON 100% of the time

**Security Compliance (ADR-018):**
- [ ] All services run on the Admin Desktop only
- [ ] No external network connections
- [ ] Proper authentication between services

**Data Integrity:**
- [ ] Extracted metadata matches document content >85%
- [ ] Confidence scoring implemented and accurate
- [ ] Idempotency prevents duplicate processing

**Performance:**
- [ ] Processing time <20 seconds per document (gemma:9b)
- [ ] GPU memory usage <8GB per document
- [ ] System remains stable under load
## 🚨 Critical Rules (Non-Negotiable)

- **ADR-018 Compliance:** AI services MUST run on the Admin Desktop ONLY
- **No Direct DB Access:** The pipeline communicates via the DMS API only
- **UUID Strategy:** All document references use `publicId` (UUIDv7)
- **Thai Language Support:** Must handle Thai engineering documents
- **Error Handling:** All failures must log to the DMS audit system
## 📁 Related Specifications

- **ADR-018: AI Boundary Policy** - Physical isolation requirements
- **ADR-019: Hybrid Identifier Strategy** - UUID usage patterns
- **ADR-020: AI Intelligence Integration** - Overall architecture
- **03-05-n8n-migration-setup-guide.md** - n8n configuration details
## 📝 Implementation Notes

### Docker Compose Structure
```yaml
services:
  n8n:
    image: n8nio/n8n:latest
    ports: ["5678:5678"]
    environment:
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=admin
      - N8N_BASIC_AUTH_PASSWORD=${N8N_PASSWORD}

  paddleocr:
    image: paddlepaddle/paddle:latest-gpu
    ports: ["8866:8866"]
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [gpu]
              device_ids: ["0"] # RTX 2060 Super; exclusive with `count`
    environment:
      - CUDA_VISIBLE_DEVICES=0
    shm_size: 2gb

  ollama:
    image: ollama/ollama:latest
    ports: ["11434:11434"]
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [gpu]
              device_ids: ["0"] # RTX 2060 Super; exclusive with `count`
    environment:
      - CUDA_VISIBLE_DEVICES=0
      - OLLAMA_MAX_LOADED_MODELS=1
      - OLLAMA_NUM_PARALLEL=2
    volumes:
      - ollama_data:/root/.ollama
    pull_policy: always

volumes:
  ollama_data:
```
### Hardware Requirements

- **GPU:** RTX 2060 Super 8GB VRAM (minimum for gemma:9b)
- **RAM:** 32GB system memory recommended
- **Storage:** 100GB SSD for models and temporary files
- **Network:** Gigabit LAN for file transfers

### Model Specifications

- **gemma:9b** - 9 billion parameters, optimized for Thai
- **VRAM usage:** ~7-8GB for inference
- **Performance:** ~15-20 seconds per document
- **Accuracy:** expected 90%+ for Thai engineering documents