lcbp3/specs/08-Tasks/Task BE-AI-01.md
admin c95e0f537e
690404:1139 Modify ADR
2026-04-04 11:39:56 +07:00

Task BE-AI-01: Pipeline Infrastructure Setup

Phase: Step 1 - AI Pipeline Foundation (n8n + PaddleOCR + Gemma 4)
ADR Compliance: ADR-018 (AI Boundary), ADR-019 (UUID Strategy)
Priority: 🔴 Critical - Foundation for all AI features

Context: This task is the foundation of the Document Intelligence system defined in ADR-020. It must comply with the AI Isolation policy and use the correct identifiers.


📋 Implementation Tasks

AI-1.1: Infrastructure Setup (ADR-018 Compliance)

  • Docker Environment on Admin Desktop (Desk-5439):
    • Install Docker Compose for n8n and PaddleOCR
    • Configure network isolation (LAN only, no public access)
    • Verify hardware: RTX 2060 Super with 8GB VRAM available
  • n8n Service:
    • Docker Compose service พร้อม Basic Authentication
    • Webhook endpoint: /webhook/ai-processing
    • Environment variables: N8N_BASIC_AUTH_USER, N8N_BASIC_AUTH_PASSWORD
  • Ollama Service:
    • Pull model: gemma4:e4b (GPU optimized, higher accuracy)
    • API endpoint: http://localhost:11434
    • Health check: GET /api/tags
    • Memory requirement: Minimum 8GB VRAM for 9B model
    • Run the model: ollama run gemma4:e4b
    • Set these as Windows system environment variables, or from PowerShell:
      • [System.Environment]::SetEnvironmentVariable('OLLAMA_HOST', '0.0.0.0', 'User')
      • [System.Environment]::SetEnvironmentVariable('OLLAMA_KEEP_ALIVE', '30m', 'User')
      • [System.Environment]::SetEnvironmentVariable("OLLAMA_NUM_GPU", "1", "User")
      • [System.Environment]::SetEnvironmentVariable("OLLAMA_NUM_THREAD", "8", "User")

- [ ] **PaddleOCR Service:**
  - Docker image: `paddlepaddle/paddle:latest-gpu`
  - Thai language support configuration
  - API endpoint design: `POST /ocr/extract`
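
The Ollama health check listed above (`GET /api/tags`) can be sketched in Python as follows. This is a minimal illustration, assuming the response is a JSON object with a `models` array whose entries carry a `name` field. The function names `model_available` and `check_ollama` are hypothetical and not part of this task.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # endpoint from the Ollama Service item
MODEL_TAG = "gemma4:e4b"               # tag named in the pull step


def model_available(tags_payload: dict, model_tag: str) -> bool:
    """Return True if model_tag appears in a parsed /api/tags response."""
    models = tags_payload.get("models") or []
    return any(entry.get("name") == model_tag for entry in models)


def check_ollama(base_url: str = OLLAMA_URL, model_tag: str = MODEL_TAG,
                 timeout: float = 5.0) -> bool:
    """Call GET /api/tags; return False on any network or decoding failure."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError):
        # URLError and timeouts are OSError; bad JSON is ValueError.
        return False
    return model_available(payload, model_tag)
```

Calling `check_ollama()` before starting a workflow run would tell the pipeline whether the service is reachable and the model has been pulled, without loading the model into VRAM.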

### **AI-1.2: n8n Workflow Development**
- [ ] **Webhook Trigger Node:**
  - Input: `{ publicId: string, fileUrl: string, context: 'migration'|'ingestion' }`
  - Validation: Verify `publicId` format (UUIDv7) before processing
  - Idempotency check: Prevent duplicate processing
- [ ] **OCR Integration Node:**
  - HTTP Request to PaddleOCR service
  - Input: Binary file data
  - Output: `{ text: string, confidence: number, language: 'th'|'en' }`
  - Error handling: Retry logic + fallback to CPU OCR
- [ ] **Prompt Engineering Node:**
  - Function Node to construct Gemma 4 prompt
  - Template includes: Role definition, validation rules, JSON schema
  - Thai engineering context keywords
- [ ] **Gemma 4 LLM Node:**
  - HTTP Request to Ollama API
  - Model: `gemma4:e4b`, the tag pulled in AI-1.1 (enhanced accuracy for Thai engineering documents)
  - Parameters: `temperature: 0.1`, `max_tokens: 2048` (sent as `num_predict` in the `options` object of Ollama's native API)
  - Output validation: Ensure valid JSON response
  - Memory monitoring: Track VRAM usage during inference
- [ ] **Result Processing Node:**
  - Parse and validate AI response
  - Calculate confidence scores
  - Format for DMS Backend API callback
- [ ] **Callback to DMS:**
  - HTTP POST to NestJS webhook endpoint
  - Payload: `{ publicId, extractedData, confidence, processingTime }`
  - Authentication: Service account JWT
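
The LLM request, output validation, and callback steps above can be sketched in Python to show the payload shapes involved. In n8n these would be HTTP Request and Function nodes, so this is only an illustration under stated assumptions: the request targets Ollama's `POST /api/generate`, the token limit is passed as `num_predict`, and `processingTime` is measured in seconds. All function names are hypothetical.

```python
import json
from typing import Optional


def build_generate_request(prompt: str, model: str) -> dict:
    """Request body for POST /api/generate on the Ollama API."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,          # one JSON object instead of a token stream
        "format": "json",         # ask Ollama to constrain the output to JSON
        "options": {
            "temperature": 0.1,
            "num_predict": 2048,  # Ollama's name for the max-token limit
        },
    }


def parse_model_json(ollama_reply: dict) -> Optional[dict]:
    """Return the model output as a dict, or None if it is not a JSON object."""
    try:
        parsed = json.loads(ollama_reply.get("response", ""))
    except (TypeError, ValueError):
        return None
    return parsed if isinstance(parsed, dict) else None


def build_callback_payload(public_id: str, extracted: dict,
                           processing_time_s: float) -> dict:
    """Payload for the DMS callback; field names follow the list above."""
    return {
        "publicId": public_id,
        "extractedData": {k: v for k, v in extracted.items() if k != "confidence"},
        "confidence": extracted.get("confidence", {}),
        "processingTime": round(processing_time_s, 3),  # seconds (assumed unit)
    }
```

A reply that fails `parse_model_json` would be the natural trigger for the retry or error path, since the callback should never carry unparsed model text.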

### **AI-1.3: Prompt Engineering for Thai Engineering Documents**
- [ ] **System Prompt Template:**
  ```prompt
  You are a Senior Document Controller for Laem Chabang Port Phase 3 construction project.

  TASK: Extract metadata from engineering documents with high accuracy.

  RULES:
  1. Extract: subject, document_date, discipline, drawing_reference, contract_number
  2. Validate consistency between content and metadata
  3. Return a confidence score between 0 and 1 for each field
  4. Support Thai and English engineering terms
  5. Output MUST be valid JSON only

  OUTPUT FORMAT:
  {
    "subject": "string",
    "document_date": "YYYY-MM-DD",
    "discipline": "Civil|Mechanical|Electrical|Architectural",
    "drawing_reference": "string",
    "contract_number": "string",
    "confidence": {
      "overall": 0.95,
      "field_confidence": {...}
    }
  }
  ```
  • Thai Language Optimization:
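    • Engineering terms: "วิศวกรรมโยธา" (civil engineering), "แบบรายละเอียด" (detail drawing), "ขออนุมัติ" (request for approval), "แผนผัง" (layout plan)
    • Date format recognition: Thai Buddhist Era years (พ.ศ., where B.E. = C.E. + 543)
    • Organization names: "ท่าเรือแหลมฉบัง" (Laem Chabang Port), "ก.ท.ม." (Bangkok Metropolitan Administration), "การท่าเรือฯ" (Port Authority of Thailand, abbreviated)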
  • JSON Schema Validation:
    • Zod schema for response validation
    • Required fields enforcement
    • Type checking and sanitization
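
The checks described above can be illustrated with a short Python sketch. The task calls for a Zod schema on the backend, so this is not that implementation, only the same rules written in plain Python. It assumes confidence values are on a 0 to 1 scale (as in the sample output), and that any four-digit year of 2400 or later is a Buddhist Era year. That cutoff and the function names are hypothetical.

```python
import re
from datetime import date

REQUIRED_FIELDS = ("subject", "document_date", "discipline",
                   "drawing_reference", "contract_number", "confidence")
DISCIPLINES = {"Civil", "Mechanical", "Electrical", "Architectural"}
BE_OFFSET = 543  # Buddhist Era year minus 543 gives the Gregorian year


def normalize_date(value: str) -> str:
    """Parse YYYY-MM-DD; years >= 2400 are treated as Buddhist Era (assumed)."""
    match = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", value.strip())
    if not match:
        raise ValueError(f"not a YYYY-MM-DD date: {value!r}")
    year, month, day = (int(part) for part in match.groups())
    if year >= 2400:
        year -= BE_OFFSET
    return date(year, month, day).isoformat()  # ValueError if the date is impossible


def validate_extraction(data: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload passed."""
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in data]
    if errors:
        return errors
    if data["discipline"] not in DISCIPLINES:
        errors.append(f"unknown discipline: {data['discipline']!r}")
    try:
        normalize_date(str(data["document_date"]))
    except ValueError as exc:
        errors.append(str(exc))
    confidence = data["confidence"]
    overall = confidence.get("overall") if isinstance(confidence, dict) else None
    if not isinstance(overall, (int, float)) or not 0 <= overall <= 1:
        errors.append("confidence.overall must be a number in [0, 1]")
    return errors
```

For example, `normalize_date("2569-04-04")` yields `"2026-04-04"`, so a document dated in พ.ศ. 2569 is stored with its Gregorian year.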

AI-1.4: Integration Testing & Validation

  • Test Case 1: Legacy Migration Flow:
    • Input: Scanned RFA PDF + Excel metadata
    • Expected: Thai text extraction >90% accuracy
    • Validation: AI output matches Excel data
  • Test Case 2: Real-time Ingestion Flow:
    • Input: New PDF upload from user
    • Expected: Response time <15 seconds
    • Validation: Structured JSON response
  • Performance Benchmarking:
    • Target: <15 seconds per document
    • Memory usage monitoring on Admin Desktop
    • GPU utilization tracking
  • Security Validation:
    • Verify no external network calls
    • Confirm AI services run in isolation
    • Test authentication between n8n and DMS
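
The accuracy and response-time targets in the test cases above need a concrete measure. The sketch below is one possible reading, assuming "accuracy" means character-level similarity between a ground-truth transcript and the OCR output, computed with `difflib.SequenceMatcher`. The task does not define the metric, so that choice and the helper names are assumptions.

```python
import time
from difflib import SequenceMatcher


def text_accuracy(expected: str, actual: str) -> float:
    """Similarity ratio in [0, 1] between ground truth and OCR output."""
    return SequenceMatcher(None, expected, actual).ratio()


def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def meets_targets(accuracy: float, elapsed_s: float,
                  min_accuracy: float = 0.90, max_seconds: float = 15.0) -> bool:
    """Check one document against the >90% accuracy and <15 s targets."""
    return accuracy > min_accuracy and elapsed_s < max_seconds
```

Wrapping the full pipeline call in `timed` and feeding the result to `meets_targets` gives a pass or fail per document, which can then be aggregated across the test set.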

Acceptance Criteria

  1. Pipeline Functionality:

    • n8n successfully processes PDF → OCR → AI → JSON flow
    • Thai text extraction accuracy >90%
    • Gemma 4 returns valid JSON 100% of the time
  2. Security Compliance (ADR-018):

    • All services run on Admin Desktop only
    • No external network connections
    • Proper authentication between services
  3. Data Integrity:

    • Extracted metadata matches document content >85%
    • Confidence scoring implemented and accurate
    • Idempotency prevents duplicate processing
  4. Performance:

    • Processing time <20 seconds per document (gemma:9b)
    • GPU memory usage <8GB per document
    • System remains stable under load
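
The idempotency criterion above can be illustrated with a small Python sketch. It assumes a request is a duplicate when `publicId`, `fileUrl`, and `context` all match, and it keeps seen keys in memory only. A real deployment would need storage that survives restarts, and where that storage lives is not specified here. The class name `IdempotencyGuard` is hypothetical.

```python
import hashlib


class IdempotencyGuard:
    """Tracks which requests have already been accepted (in memory only)."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    @staticmethod
    def key_for(public_id: str, file_url: str, context: str) -> str:
        """Stable key for one request; the unit separator avoids field collisions."""
        raw = "\x1f".join((public_id.lower(), file_url, context))
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def should_process(self, public_id: str, file_url: str, context: str) -> bool:
        """Return True the first time a request is seen, False on repeats."""
        key = self.key_for(public_id, file_url, context)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

The webhook handler would call `should_process` before queuing OCR, so a retried delivery of the same request is acknowledged without being processed twice.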

Critical Rules (Non-Negotiable)

  1. ADR-018 Compliance: AI services MUST run on Admin Desktop ONLY
  2. No Direct DB Access: Pipeline communicates via DMS API only
  3. UUID Strategy: All document references use publicId (UUIDv7)
  4. Thai Language Support: Must handle Thai engineering documents
  5. Error Handling: All failures must log to DMS audit system
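
Rule 3 implies that every incoming `publicId` should be checked before any processing starts. A minimal Python sketch, assuming the canonical hyphenated text form and the RFC 9562 layout in which the top 48 bits of a UUIDv7 hold a Unix timestamp in milliseconds. The function names are hypothetical.

```python
import uuid
from typing import Optional


def parse_uuid7(value: str) -> Optional[uuid.UUID]:
    """Return the UUID if value is a canonical UUIDv7 string, else None."""
    try:
        parsed = uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return None
    if str(parsed) != value.lower():
        return None  # reject braces, urn: prefixes, and unhyphenated forms
    if parsed.variant != uuid.RFC_4122 or parsed.version != 7:
        return None
    return parsed


def uuid7_timestamp_ms(value: str) -> Optional[int]:
    """Unix time in milliseconds embedded in a UUIDv7, or None if invalid."""
    parsed = parse_uuid7(value)
    return None if parsed is None else parsed.int >> 80
```

With the example value from RFC 9562, `uuid7_timestamp_ms("017f22e2-79b0-7cc3-98c4-dc0c0c07398f")` returns `1645557742000`, while a UUIDv4 such as `550e8400-e29b-41d4-a716-446655440000` is rejected.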

Related Documents

  • ADR-018: AI Boundary Policy - Physical isolation requirements
  • ADR-019: Hybrid Identifier Strategy - UUID usage patterns
  • ADR-020: AI Intelligence Integration - Overall architecture
  • 03-05-n8n-migration-setup-guide.md: n8n configuration details

📝 Implementation Notes

Docker Compose Structure

services:
  n8n:
    image: n8nio/n8n:latest
    ports: ["5678:5678"]
    environment:
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=admin
      - N8N_BASIC_AUTH_PASSWORD=${N8N_PASSWORD}

  paddleocr:
    image: paddlepaddle/paddle:latest-gpu
    ports: ["8866:8866"]
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [gpu]
              device_ids: ["0"] # RTX 2060 Super; use device_ids or count, not both
    environment:
      - CUDA_VISIBLE_DEVICES=0
    shm_size: 2gb

  ollama:
    image: ollama/ollama:latest
    ports: ["11434:11434"]
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [gpu]
              device_ids: ["0"] # RTX 2060 Super; use device_ids or count, not both
    environment:
      - CUDA_VISIBLE_DEVICES=0
      - OLLAMA_MAX_LOADED_MODELS=1
      - OLLAMA_NUM_PARALLEL=2
    volumes:
      - ollama_data:/root/.ollama
    pull_policy: always

volumes:
  ollama_data:

Hardware Requirements

  • GPU: RTX 2060 Super 8GB VRAM (minimum for gemma:9b)
  • RAM: 32GB system memory recommended
  • Storage: 100GB SSD for models and temporary files
  • Network: Gigabit LAN for file transfers

Model Specifications

  • gemma:9b - 9 billion parameters, optimized for Thai
  • VRAM Usage: ~7-8GB for inference
  • Performance: ~15-20 seconds per document
  • Accuracy: Expected 90%+ for Thai engineering documents
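
The VRAM figures above can be checked during inference by polling `nvidia-smi`. The sketch below is one way to do that in Python, assuming `nvidia-smi` is on the PATH of the machine running the check and that the 8GB budget is read as 8192 MiB, since `nvidia-smi` reports memory in MiB. The helper names are hypothetical.

```python
import subprocess

QUERY = ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"]


def parse_memory_line(line: str) -> tuple[int, int]:
    """Parse one CSV line such as '7421, 8192' into (used_mib, total_mib)."""
    used, total = (int(part.strip()) for part in line.split(","))
    return used, total


def vram_usage_mib(gpu_index: int = 0) -> tuple[int, int]:
    """Query nvidia-smi and return (used_mib, total_mib) for one GPU."""
    output = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    lines = [line for line in output.splitlines() if line.strip()]
    return parse_memory_line(lines[gpu_index])


def within_budget(used_mib: int, budget_mib: int = 8192) -> bool:
    """True while usage stays under the budget (8GB read as 8192 MiB)."""
    return used_mib < budget_mib
```

Sampling `vram_usage_mib()` before and during a run gives the peak usage per document, which is the number the performance criteria refer to.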