690403:2205 Modify AI (Add Gemma4 & PaddleOCR
@@ -0,0 +1,247 @@
|
||||
# Task BE-AI-01: Pipeline Infrastructure Setup
|
||||
|
||||
**Phase:** Step 1 - AI Pipeline Foundation (n8n + PaddleOCR + Gemma 4)
|
||||
**ADR Compliance:** ADR-018 (AI Boundary), ADR-019 (UUID Strategy)
|
||||
**Priority:** 🔴 Critical - Foundation for all AI features
|
||||
|
||||
> **Context:** This task is the foundation of the Document Intelligence system defined in ADR-020. It must comply with the AI Isolation policy and use the correct identifiers.
|
||||
|
||||
---
|
||||
|
||||
## 📋 Implementation Tasks
|
||||
|
||||
### **AI-1.1: Infrastructure Setup (ADR-018 Compliance)**
|
||||
- [ ] **Docker Environment on Admin Desktop (Desk-5439):**
|
||||
- Install Docker Compose for n8n and PaddleOCR
|
||||
- ตั้งค่า Network Isolation (LAN only, no public access)
|
||||
- ตรวจสอบ Hardware: RTX 2060 Super 8GB VRAM availability
|
||||
- [ ] **n8n Service:**
|
||||
- Docker Compose service with Basic Authentication
|
||||
- Webhook endpoint: `/webhook/ai-processing`
|
||||
- Environment variables: `N8N_BASIC_AUTH_USER`, `N8N_BASIC_AUTH_PASSWORD`
|
||||
- [ ] **Ollama Service:**
|
||||
- Pull model: `gemma:9b` (GPU optimized, higher accuracy)
|
||||
- API endpoint: `http://localhost:11434`
|
||||
- Health check: `GET /api/tags`
|
||||
- Memory requirement: Minimum 8GB VRAM for 9B model
|
||||
- Run a quantized variant: `ollama run gemma4:9b-q5_K_M` or `ollama run gemma4:9b-q4_K_M`
|
||||
- Create the file `%USERPROFILE%\.ollama\config`
|
||||
```config
# Use the GPU as the primary device
gpu: true
num_gpu: 1

# Enable the KV cache for faster responses
kv_cache: true

# Limit the batch size to fit 8GB of VRAM
gpu_batch_size: 512

# Set num_thread for a 6–8 core CPU
num_thread: 6

# Enable mmap for faster model loading
mmap: true

# Set max_seq_len to suit DMS workloads
max_seq_len: 4096

# Keep the temperature low for stable output
temperature: 0.2
```
|
||||
|
||||
- [ ] **PaddleOCR Service:**
|
||||
- Docker image: `paddlepaddle/paddle:latest-gpu`
|
||||
- Thai language support configuration
|
||||
- API endpoint design: `POST /ocr/extract`
|
||||
|
||||
### **AI-1.2: n8n Workflow Development**
|
||||
- [ ] **Webhook Trigger Node:**
|
||||
- Input: `{ publicId: string, fileUrl: string, context: 'migration'|'ingestion' }`
|
||||
- Validation: Verify `publicId` format (UUIDv7) before processing
|
||||
- Idempotency check: Prevent duplicate processing
|
||||
- [ ] **OCR Integration Node:**
|
||||
- HTTP Request to PaddleOCR service
|
||||
- Input: Binary file data
|
||||
- Output: `{ text: string, confidence: number, language: 'th'|'en' }`
|
||||
- Error handling: Retry logic + fallback to CPU OCR
|
||||
- [ ] **Prompt Engineering Node:**
|
||||
- Function Node to construct Gemma 4 prompt
|
||||
- Template includes: Role definition, validation rules, JSON schema
|
||||
- Thai engineering context keywords
|
||||
- [ ] **Gemma 4 LLM Node:**
|
||||
- HTTP Request to Ollama API
|
||||
- Model: `gemma:9b` (enhanced accuracy for Thai engineering documents)
|
||||
- Parameters: `temperature: 0.1`, `max_tokens: 2048`
|
||||
- Output validation: Ensure valid JSON response
|
||||
- Memory monitoring: Track VRAM usage during inference
|
||||
- [ ] **Result Processing Node:**
|
||||
- Parse and validate AI response
|
||||
- Calculate confidence scores
|
||||
- Format for DMS Backend API callback
|
||||
- [ ] **Callback to DMS:**
|
||||
- HTTP POST to NestJS webhook endpoint
|
||||
- Payload: `{ publicId, extractedData, confidence, processingTime }`
|
||||
- Authentication: Service account JWT
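
The input validation described for the Webhook Trigger Node could be sketched as below. This is a minimal illustration, not the project's actual node code: `isUuidV7` and `validateWebhookInput` are hypothetical names, and the regular expression only checks the canonical text form (version nibble `7`, variant `8`-`b`). An n8n Function/Code node runs JavaScript, so the type annotations would be dropped there.

```typescript
type ProcessingContext = "migration" | "ingestion";

interface WebhookInput {
  publicId: string;
  fileUrl: string;
  context: ProcessingContext;
}

// Canonical UUID text form with version nibble 7 and variant bits 10xx.
const UUID_V7_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-7[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

function isUuidV7(value: string): boolean {
  return UUID_V7_PATTERN.test(value);
}

// Hypothetical validator: throws on the first invalid field so the workflow
// can stop before any OCR or LLM work is started.
function validateWebhookInput(body: unknown): WebhookInput {
  if (typeof body !== "object" || body === null) {
    throw new Error("payload must be an object");
  }
  const { publicId, fileUrl, context } = body as Record<string, unknown>;
  if (typeof publicId !== "string" || !isUuidV7(publicId)) {
    throw new Error("publicId must be a UUIDv7 string");
  }
  if (typeof fileUrl !== "string" || fileUrl.length === 0) {
    throw new Error("fileUrl is required");
  }
  if (context !== "migration" && context !== "ingestion") {
    throw new Error("context must be 'migration' or 'ingestion'");
  }
  return { publicId, fileUrl, context };
}
```

The identifier is treated as an opaque string throughout, which keeps the check consistent with the rule that UUID values are never converted to numbers.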
|
||||
|
||||
### **AI-1.3: Prompt Engineering for Thai Engineering Documents**
|
||||
- [ ] **System Prompt Template:**
|
||||
```prompt
|
||||
You are a Senior Document Controller for Laem Chabang Port Phase 3 construction project.
|
||||
|
||||
TASK: Extract metadata from engineering documents with high accuracy.
|
||||
|
||||
RULES:
|
||||
1. Extract: subject, document_date, discipline, drawing_reference, contract_number
|
||||
2. Validate consistency between content and metadata
|
||||
3. Return confidence score (0.00-1.00) for each field
|
||||
4. Support Thai and English engineering terms
|
||||
5. Output MUST be valid JSON only
|
||||
|
||||
OUTPUT FORMAT:
|
||||
{
|
||||
"subject": "string",
|
||||
"document_date": "YYYY-MM-DD",
|
||||
"discipline": "Civil|Mechanical|Electrical|Architectural",
|
||||
"drawing_reference": "string",
|
||||
"contract_number": "string",
|
||||
"confidence": {
|
||||
"overall": 0.95,
|
||||
"field_confidence": {...}
|
||||
}
|
||||
}
|
||||
```
|
||||
- [ ] **Thai Language Optimization:**
|
||||
- Engineering terms: "วิศวกรรมโยธา", "แบบรายละเอียด", "ขออนุมัติ", "แผนผัง"
|
||||
- Date format recognition: Thai Buddhist years (พ.ศ.)
|
||||
- Organization names: "ท่าเรือแหลมฉบัง", "ก.ท.ม.", "การท่าเรือฯ"
|
||||
- [ ] **JSON Schema Validation:**
|
||||
- Zod schema for response validation
|
||||
- Required fields enforcement
|
||||
- Type checking and sanitization
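
Recognizing Thai Buddhist years (พ.ศ.) comes down to subtracting 543 from the year before producing the `YYYY-MM-DD` value the output format expects. The sketch below illustrates that step under stated assumptions: `normalizeThaiDate` is a hypothetical helper, it only handles numeric `DD/MM/YYYY` input, and it treats any year of 2400 or later as Buddhist Era, a heuristic that is not taken from the specification.

```typescript
// Buddhist Era (พ.ศ.) to Gregorian: CE = BE - 543.
function buddhistToGregorianYear(beYear: number): number {
  return beYear - 543;
}

// Hypothetical normalizer for "DD/MM/YYYY" strings.
// Assumption: a four-digit year >= 2400 is a Buddhist Era year.
function normalizeThaiDate(input: string): string | null {
  const match = /^(\d{1,2})\/(\d{1,2})\/(\d{4})$/.exec(input.trim());
  if (!match) return null;

  const day = Number(match[1]);
  const month = Number(match[2]);
  let year = Number(match[3]);
  if (year >= 2400) year = buddhistToGregorianYear(year);

  // Build the date in UTC and reject overflow such as 31/02.
  const date = new Date(Date.UTC(year, month - 1, day));
  const valid =
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day;
  return valid ? date.toISOString().slice(0, 10) : null;
}
```

Dates written with Thai month names or abbreviated years would need extra parsing rules, which this sketch leaves out.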
|
||||
|
||||
### **AI-1.4: Integration Testing & Validation**
|
||||
- [ ] **Test Case 1: Legacy Migration Flow:**
|
||||
- Input: Scanned RFA PDF + Excel metadata
|
||||
- Expected: Thai text extraction >90% accuracy
|
||||
- Validation: AI output matches Excel data
|
||||
- [ ] **Test Case 2: Real-time Ingestion Flow:**
|
||||
- Input: New PDF upload from user
|
||||
- Expected: Response time <15 seconds
|
||||
- Validation: Structured JSON response
|
||||
- [ ] **Performance Benchmarking:**
|
||||
- Target: <15 seconds per document
|
||||
- Memory usage monitoring on Admin Desktop
|
||||
- GPU utilization tracking
|
||||
- [ ] **Security Validation:**
|
||||
- Verify no external network calls
|
||||
- Confirm AI services run in isolation
|
||||
- Test authentication between n8n and DMS
|
||||
|
||||
---
|
||||
|
||||
## ✅ Acceptance Criteria
|
||||
|
||||
1. **Pipeline Functionality:**
|
||||
- n8n successfully processes PDF → OCR → AI → JSON flow
|
||||
- Thai text extraction accuracy >90%
|
||||
- Gemma 4 returns valid JSON 100% of the time
|
||||
|
||||
2. **Security Compliance (ADR-018):**
|
||||
- All services run on Admin Desktop only
|
||||
- No external network connections
|
||||
- Proper authentication between services
|
||||
|
||||
3. **Data Integrity:**
|
||||
- Extracted metadata matches document content >85%
|
||||
- Confidence scoring implemented and accurate
|
||||
- Idempotency prevents duplicate processing
|
||||
|
||||
4. **Performance:**
|
||||
- Processing time <20 seconds per document (gemma:9b)
|
||||
- GPU memory usage <8GB per document
|
||||
- System remains stable under load
|
||||
|
||||
---
|
||||
|
||||
## 🔴 Critical Rules (Non-Negotiable)
|
||||
|
||||
1. **ADR-018 Compliance:** AI services MUST run on Admin Desktop ONLY
|
||||
2. **No Direct DB Access:** Pipeline communicates via DMS API only
|
||||
3. **UUID Strategy:** All document references use `publicId` (UUIDv7)
|
||||
4. **Thai Language Support:** Must handle Thai engineering documents
|
||||
5. **Error Handling:** All failures must log to DMS audit system
|
||||
|
||||
---
|
||||
|
||||
## 📁 Related Specifications
|
||||
|
||||
- **ADR-018:** AI Boundary Policy - Physical isolation requirements
|
||||
- **ADR-019:** Hybrid Identifier Strategy - UUID usage patterns
|
||||
- **ADR-020:** AI Intelligence Integration - Overall architecture
|
||||
- **03-05-n8n-migration-setup-guide.md:** n8n configuration details
|
||||
|
||||
---
|
||||
|
||||
## 📝 Implementation Notes
|
||||
|
||||
### Docker Compose Structure
|
||||
```yaml
services:
  n8n:
    image: n8nio/n8n:latest
    ports: ["5678:5678"]
    environment:
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=admin
      - N8N_BASIC_AUTH_PASSWORD=${N8N_PASSWORD}

  paddleocr:
    image: paddlepaddle/paddle:latest-gpu
    ports: ["8866:8866"]
    deploy:
      resources:
        reservations:
          devices:
            # count and device_ids are mutually exclusive in Compose,
            # so the GPU is pinned by ID only.
            - driver: nvidia
              capabilities: [gpu]
              device_ids: ["0"] # RTX 2060 Super
    environment:
      - CUDA_VISIBLE_DEVICES=0
    shm_size: 2gb

  ollama:
    image: ollama/ollama:latest
    ports: ["11434:11434"]
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [gpu]
              device_ids: ["0"] # RTX 2060 Super
    environment:
      - CUDA_VISIBLE_DEVICES=0
      - OLLAMA_MAX_LOADED_MODELS=1
      - OLLAMA_NUM_PARALLEL=2
    volumes:
      - ollama_data:/root/.ollama
    pull_policy: always

volumes:
  ollama_data:
```
|
||||
|
||||
### Hardware Requirements
|
||||
- **GPU:** RTX 2060 Super 8GB VRAM (minimum for gemma:9b)
|
||||
- **RAM:** 32GB system memory recommended
|
||||
- **Storage:** 100GB SSD for models and temporary files
|
||||
- **Network:** Gigabit LAN for file transfers
|
||||
|
||||
### Model Specifications
|
||||
- **gemma:9b** - 9 billion parameters, optimized for Thai
|
||||
- **VRAM Usage:** ~7-8GB for inference
|
||||
- **Performance:** ~15-20 seconds per document
|
||||
- **Accuracy:** Expected 90%+ for Thai engineering documents
|
||||
@@ -0,0 +1,295 @@
|
||||
# Task BE-AI-02: Backend AI Gateway Development
|
||||
|
||||
**Phase:** Step 2 - AI Integration Layer (NestJS)
|
||||
**ADR Compliance:** ADR-018 (AI Boundary), ADR-019 (UUID Strategy)
|
||||
**Priority:** 🔴 Critical - Bridge between DMS and AI Pipeline
|
||||
|
||||
> **Context:** This layer connects the DMS and the AI Pipeline as defined in ADR-020. It must preserve security and use the correct identifiers.
|
||||
|
||||
---
|
||||
|
||||
## 🛠️ Implementation Tasks
|
||||
|
||||
### **AI-2.1: Database Schema Design (SQL First Approach)**
|
||||
- [ ] **Create `migration_logs` Table:**
|
||||
```sql
CREATE TABLE migration_logs (
  id INT AUTO_INCREMENT PRIMARY KEY,
  -- UUID() generates a version-1 UUID. If ADR-019 requires UUIDv7,
  -- supply publicId from the application instead of relying on this default.
  publicId BINARY(16) NOT NULL DEFAULT (UUID_TO_BIN(UUID(), 1)),
  source_file VARCHAR(255) NOT NULL,
  source_metadata JSON,
  ai_extracted_metadata JSON,
  confidence_score DECIMAL(3,2),
  status ENUM('PENDING_REVIEW', 'VERIFIED', 'IMPORTED', 'FAILED') DEFAULT 'PENDING_REVIEW',
  admin_feedback TEXT,
  reviewed_by INT NULL,
  reviewed_at TIMESTAMP NULL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  INDEX idx_status (status),
  INDEX idx_confidence (confidence_score),
  -- Unique so that other tables can reference publicId with a foreign key.
  UNIQUE INDEX idx_publicId (publicId)
);
```
|
||||
- [ ] **Create `ai_audit_logs` Table:**
|
||||
```sql
CREATE TABLE ai_audit_logs (
  id INT AUTO_INCREMENT PRIMARY KEY,
  -- UUID() generates a version-1 UUID. If ADR-019 requires UUIDv7,
  -- supply publicId from the application instead of relying on this default.
  publicId BINARY(16) DEFAULT (UUID_TO_BIN(UUID(), 1)),
  document_publicId BINARY(16),
  ai_model VARCHAR(50) NOT NULL,
  processing_time_ms INT,
  confidence_score DECIMAL(3,2),
  input_hash VARCHAR(64),
  output_hash VARCHAR(64),
  status ENUM('SUCCESS', 'FAILED', 'TIMEOUT') NOT NULL,
  error_message TEXT,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  INDEX idx_document (document_publicId),
  INDEX idx_model (ai_model),
  INDEX idx_status (status),
  FOREIGN KEY (document_publicId) REFERENCES migration_logs(publicId)
);
```
|
||||
- [ ] **Update Data Dictionary:**
|
||||
- Add field descriptions to `specs/03-Data-and-Storage/03-01-data-dictionary.md`
|
||||
- Include business rules for confidence thresholds
|
||||
- Document status transitions and workflows
|
||||
|
||||
### **AI-2.2: AI Gateway Module Architecture**
|
||||
- [ ] **Module Structure:**
|
||||
```typescript
|
||||
// src/modules/ai/ai.module.ts
|
||||
@Module({
|
||||
imports: [TypeOrmModule.forFeature([MigrationLog, AiAuditLog])],
|
||||
controllers: [AiController],
|
||||
providers: [AiService, AiValidationService],
|
||||
exports: [AiService],
|
||||
})
|
||||
export class AiModule {}
|
||||
```
|
||||
- [ ] **AiService Implementation:**
|
||||
```typescript
@Injectable()
export class AiService {
  // Returns a processing token (assumed here to be a string) for tracking the job.
  async triggerProcessing(filePublicId: string, context: ProcessingContext): Promise<string> {
    // 1. Validate publicId format (ADR-019)
    // 2. Send HTTP request to n8n webhook
    // 3. Log request to ai_audit_logs
    // 4. Return processing token
    throw new Error('Not implemented');
  }

  async handleWebhookCallback(payload: AiCallbackDto): Promise<void> {
    // 1. Validate JWT token from n8n
    // 2. Update migration_logs with AI results
    // 3. Calculate confidence scores
    // 4. Trigger notifications if needed
    throw new Error('Not implemented');
  }

  async extractRealtime(filePublicId: string): Promise<ExtractionResult> {
    // 1. Send to n8n for immediate processing
    // 2. Wait for response (timeout: 30s)
    // 3. Return structured suggestions
    throw new Error('Not implemented');
  }
}
```
|
||||
- [ ] **Configuration Management:**
|
||||
```env
|
||||
# .env
|
||||
AI_N8N_WEBHOOK_URL=http://192.168.1.100:5678/webhook/ai-processing
|
||||
AI_N8N_AUTH_TOKEN=service-account-jwt-token
|
||||
AI_OLLAMA_URL=http://192.168.1.100:11434
|
||||
AI_TIMEOUT_MS=30000
|
||||
AI_MAX_RETRIES=3
|
||||
```
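
One way to fail fast on a misconfigured gateway is to validate these settings once at startup. The sketch below is an illustration only: `parseAiConfig` and `AiGatewayConfig` are hypothetical names, and the fallback values (30000 ms, 3 retries) simply mirror the sample `.env` above rather than any documented default.

```typescript
interface AiGatewayConfig {
  webhookUrl: string;
  authToken: string;
  ollamaUrl: string;
  timeoutMs: number;
  maxRetries: number;
}

// Hypothetical startup check: reads the AI_* settings from an env-like map
// and throws a descriptive error for the first invalid value.
function parseAiConfig(env: Record<string, string | undefined>): AiGatewayConfig {
  const required = (key: string): string => {
    const value = env[key];
    if (value === undefined || value.trim() === "") {
      throw new Error(`Missing required setting: ${key}`);
    }
    return value.trim();
  };

  const httpUrl = (key: string): string => {
    const value = required(key);
    if (!/^https?:\/\//.test(value)) {
      throw new Error(`${key} must be an http(s) URL`);
    }
    return value;
  };

  const positiveInt = (key: string, fallback: number): number => {
    const raw = env[key];
    if (raw === undefined || raw.trim() === "") return fallback;
    const parsed = Number(raw);
    if (!Number.isInteger(parsed) || parsed <= 0) {
      throw new Error(`${key} must be a positive integer`);
    }
    return parsed;
  };

  return {
    webhookUrl: httpUrl("AI_N8N_WEBHOOK_URL"),
    authToken: required("AI_N8N_AUTH_TOKEN"),
    ollamaUrl: httpUrl("AI_OLLAMA_URL"),
    timeoutMs: positiveInt("AI_TIMEOUT_MS", 30000),
    maxRetries: positiveInt("AI_MAX_RETRIES", 3),
  };
}
```

In a NestJS application this would typically be called with `process.env` during bootstrap, so a missing webhook URL stops the service before it accepts requests.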
|
||||
|
||||
### **AI-2.3: Migration Engine & Business Logic**
|
||||
- [ ] **MigrationService Implementation:**
|
||||
```typescript
|
||||
@Injectable()
|
||||
export class MigrationService {
|
||||
async stageLegacyData(excelData: ExcelImportDto[]): Promise<MigrationLog[]> {
|
||||
// 1. Validate Excel data format
|
||||
// 2. Move PDF files to staging area (via StorageService)
|
||||
// 3. Create migration_logs entries
|
||||
// 4. Trigger AI processing for each file
|
||||
}
|
||||
|
||||
async compareData(excelMetadata: any, aiMetadata: any): Promise<ComparisonResult> {
|
||||
// 1. Field-by-field comparison
|
||||
// 2. Calculate confidence deltas
|
||||
// 3. Flag discrepancies for human review
|
||||
// 4. Generate comparison report
|
||||
}
|
||||
|
||||
async approveMigration(migrationPublicId: string, adminId: number): Promise<void> {
|
||||
// 1. Validate admin permissions (CASL)
|
||||
// 2. Move file from staging to permanent storage
|
||||
// 3. Create actual document records (RFA, Correspondence, etc.)
|
||||
// 4. Update migration_logs status
|
||||
}
|
||||
}
|
||||
```
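
The field-by-field comparison step in `compareData` could look roughly like the sketch below. It is a simplified illustration: `compareMetadata`, the flat string-valued `Metadata` type, and the normalization rule (trim, collapse whitespace, lowercase) are assumptions, and the confidence deltas and report generation from the outline are left out.

```typescript
type Metadata = Record<string, string | null | undefined>;

interface FieldComparison {
  field: string;
  excelValue: string | null;
  aiValue: string | null;
  matches: boolean;
}

interface ComparisonSummary {
  fields: FieldComparison[];
  matchRate: number; // 0..1 share of fields that agree
  needsReview: boolean; // true when at least one field differs
}

// Assumed normalization: ignore case and whitespace differences.
function normalizeValue(value: string | null | undefined): string | null {
  const text = (value ?? "").trim().replace(/\s+/g, " ").toLowerCase();
  return text === "" ? null : text;
}

function compareMetadata(excel: Metadata, ai: Metadata): ComparisonSummary {
  const keys = Array.from(
    new Set([...Object.keys(excel), ...Object.keys(ai)])
  ).sort();

  const fields = keys.map((field): FieldComparison => {
    const excelValue = normalizeValue(excel[field]);
    const aiValue = normalizeValue(ai[field]);
    return { field, excelValue, aiValue, matches: excelValue === aiValue };
  });

  const matched = fields.filter((f) => f.matches).length;
  const matchRate = fields.length === 0 ? 1 : matched / fields.length;
  return { fields, matchRate, needsReview: matched < fields.length };
}
```

Any field flagged with `matches: false` is a candidate for the human review queue.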
|
||||
- [ ] **Status Management Workflow:**
|
||||
```typescript
|
||||
enum MigrationStatus {
|
||||
PENDING_REVIEW = 'PENDING_REVIEW',
|
||||
VERIFIED = 'VERIFIED',
|
||||
IMPORTED = 'IMPORTED',
|
||||
FAILED = 'FAILED'
|
||||
}
|
||||
|
||||
// State transition rules
|
||||
const statusTransitions = {
|
||||
[MigrationStatus.PENDING_REVIEW]: [MigrationStatus.VERIFIED, MigrationStatus.FAILED],
|
||||
[MigrationStatus.VERIFIED]: [MigrationStatus.IMPORTED, MigrationStatus.PENDING_REVIEW],
|
||||
[MigrationStatus.IMPORTED]: [], // Terminal state
|
||||
[MigrationStatus.FAILED]: [MigrationStatus.PENDING_REVIEW] // Can retry
|
||||
};
|
||||
```
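
A transition table like the one above is usually paired with a guard that rejects illegal moves. The sketch below restates the same rules with a string union so that it stands alone; `canTransition` and `assertTransition` are hypothetical helper names, not part of the specification.

```typescript
type Status = "PENDING_REVIEW" | "VERIFIED" | "IMPORTED" | "FAILED";

// Same rules as the statusTransitions map, restated for a self-contained example.
const allowedTransitions: Record<Status, readonly Status[]> = {
  PENDING_REVIEW: ["VERIFIED", "FAILED"],
  VERIFIED: ["IMPORTED", "PENDING_REVIEW"],
  IMPORTED: [], // terminal state
  FAILED: ["PENDING_REVIEW"], // can retry
};

function canTransition(from: Status, to: Status): boolean {
  return allowedTransitions[from].includes(to);
}

// Throws so that a service method can abort before touching the database.
function assertTransition(from: Status, to: Status): void {
  if (!canTransition(from, to)) {
    throw new Error(`Illegal status transition: ${from} -> ${to}`);
  }
}
```

Calling `assertTransition` at the start of an update keeps a record in `IMPORTED` from being moved back by mistake.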
|
||||
|
||||
### **AI-2.4: API Endpoints & Security Implementation**
|
||||
- [ ] **Admin Migration Endpoints:**
|
||||
```typescript
|
||||
@Controller('admin/migration')
|
||||
@UseGuards(JwtAuthGuard, CaslGuard)
|
||||
export class AdminMigrationController {
|
||||
@Get()
|
||||
@Permissions(PERMISSIONS.MIGRATION_READ)
|
||||
async getMigrationList(@Query() query: MigrationQueryDto): Promise<PaginatedResult<MigrationLog>> {
|
||||
// 1. Validate query parameters
|
||||
// 2. Apply filters (status, confidence, date range)
|
||||
// 3. Return paginated results
|
||||
}
|
||||
|
||||
@Patch(':publicId')
|
||||
@Permissions(PERMISSIONS.MIGRATION_APPROVE)
|
||||
async updateMigration(
|
||||
@Param('publicId') publicId: string,
|
||||
@Body() updateDto: MigrationUpdateDto,
|
||||
@CurrentUser() user: User
|
||||
): Promise<MigrationLog> {
|
||||
// 1. Validate publicId (no parseInt!)
|
||||
// 2. Check admin permissions
|
||||
// 3. Update with audit trail
|
||||
}
|
||||
}
|
||||
```
|
||||
- [ ] **Real-time AI Extraction Endpoint:**
|
||||
```typescript
|
||||
@Controller('ai')
|
||||
export class AiController {
|
||||
@Post('extract')
|
||||
@UseGuards(JwtAuthGuard)
|
||||
@Throttle(5, 60) // 5 requests per minute
|
||||
async extractDocument(@Body() dto: ExtractDocumentDto): Promise<ExtractionResult> {
|
||||
// 1. Validate file access permissions
|
||||
// 2. Send to AI pipeline
|
||||
// 3. Return structured suggestions
|
||||
}
|
||||
}
|
||||
```
|
||||
- [ ] **Security Measures:**
|
||||
- CASL permissions for all endpoints
|
||||
- Idempotency-Key header validation
|
||||
- Rate limiting on AI endpoints
|
||||
- JWT authentication for service accounts
|
||||
- Request/response logging for audit
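
Idempotency-Key validation generally means that a repeated request with the same key returns the stored result instead of running the handler again. The sketch below shows the idea with an in-memory map. `IdempotencyStore` is a hypothetical class, and a real deployment would more likely keep the keys in a shared store, which is an assumption not stated in this task.

```typescript
interface StoredEntry<T> {
  result: T;
  expiresAt: number;
}

// Hypothetical in-memory store keyed by the Idempotency-Key header value.
class IdempotencyStore<T> {
  private readonly entries = new Map<string, StoredEntry<T>>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Runs the handler once per key within the TTL; replays the stored result otherwise.
  run(key: string, handler: () => T): T {
    const current = this.now();
    const cached = this.entries.get(key);
    if (cached !== undefined && cached.expiresAt > current) {
      return cached.result;
    }
    const result = handler();
    this.entries.set(key, { result, expiresAt: current + this.ttlMs });
    return result;
  }
}
```

The injectable `now` function exists only to make expiry testable without waiting for real time to pass.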
|
||||
|
||||
---
|
||||
|
||||
## 🔴 Critical Rules (Non-Negotiable)
|
||||
|
||||
1. **ADR-019 UUID Strategy:**
|
||||
- Use `publicId` (UUIDv7) for all document references
|
||||
- NEVER use `parseInt()` or `Number()` on UUID values
|
||||
- All API parameters use string type for UUIDs
|
||||
|
||||
2. **ADR-018 AI Boundary:**
|
||||
- No direct database access from AI services
|
||||
- All communication via DMS API only
|
||||
- AI services run on Admin Desktop (isolated)
|
||||
|
||||
3. **Security Requirements:**
|
||||
- All `POST/PATCH` endpoints must validate `Idempotency-Key`
|
||||
- CASL permissions enforced on all endpoints
|
||||
- Rate limiting on AI endpoints (5 req/min)
|
||||
|
||||
4. **Data Integrity:**
|
||||
- SQL-first approach (no TypeORM migrations)
|
||||
- All file operations via StorageService
|
||||
- Audit logging for all AI interactions
|
||||
|
||||
---
|
||||
|
||||
## 📋 Implementation Sequence
|
||||
|
||||
1. **Phase 1 (AI-2.1):** Database schema and data dictionary updates
|
||||
2. **Phase 2 (AI-2.2):** AI Gateway module and basic service structure
|
||||
3. **Phase 3 (AI-2.3 & AI-2.4):** Business logic and API endpoints (parallel development)
|
||||
4. **Phase 4:** Integration testing with n8n pipeline
|
||||
|
||||
---
|
||||
|
||||
## 📁 Related Specifications
|
||||
|
||||
- **ADR-018:** AI Boundary Policy - Security requirements
|
||||
- **ADR-019:** Hybrid Identifier Strategy - UUID patterns
|
||||
- **ADR-020:** AI Intelligence Integration - Architecture overview
|
||||
- **05-02-backend-guidelines.md:** NestJS patterns and conventions
|
||||
- **03-01-data-dictionary.md:** Field definitions and business rules
|
||||
|
||||
---
|
||||
|
||||
## 📝 Code Templates
|
||||
|
||||
### DTO Examples
|
||||
```typescript
import { IsIn, IsOptional, IsString, IsUUID, MaxLength } from 'class-validator';

// extract-document.dto.ts
export class ExtractDocumentDto {
  @IsUUID()
  publicId: string;

  // @IsIn checks against a literal list; @IsEnum expects an enum object.
  @IsIn(['migration', 'ingestion'])
  context: 'migration' | 'ingestion';
}

// migration-update.dto.ts
export class MigrationUpdateDto {
  // Admins may only move a record to VERIFIED or FAILED through this DTO.
  @IsOptional()
  @IsIn([MigrationStatus.VERIFIED, MigrationStatus.FAILED])
  status?: MigrationStatus;

  @IsOptional()
  @IsString()
  @MaxLength(1000)
  adminFeedback?: string;
}
```
|
||||
|
||||
### Entity Example
|
||||
```typescript
|
||||
// migration-log.entity.ts
|
||||
@Entity('migration_logs')
|
||||
export class MigrationLog extends UuidBaseEntity {
|
||||
@Column({ type: 'varchar', length: 255 })
|
||||
sourceFile: string;
|
||||
|
||||
@Column({ type: 'json' })
|
||||
sourceMetadata: any;
|
||||
|
||||
@Column({ type: 'json' })
|
||||
aiExtractedMetadata: any;
|
||||
|
||||
@Column({ type: 'decimal', precision: 3, scale: 2 })
|
||||
confidenceScore: number;
|
||||
|
||||
@Column({
|
||||
type: 'enum',
|
||||
enum: MigrationStatus,
|
||||
default: MigrationStatus.PENDING_REVIEW
|
||||
})
|
||||
status: MigrationStatus;
|
||||
}
|
||||
```
|
||||
|
||||
@@ -0,0 +1,442 @@
|
||||
# Task FE-AI-03: Frontend Human-in-the-Loop Interface
|
||||
|
||||
**Phase:** Step 3 - AI Verification & User Experience (Next.js)
|
||||
**ADR Compliance:** ADR-018 (AI Boundary), ADR-019 (UUID Strategy)
|
||||
**Priority:** 🔴 Critical - Human validation layer for AI outputs
|
||||
|
||||
> **Context:** This is the most important step in turning AI-extracted data into quality data (Verified Data) under ADR-018. The focus is an easy-to-use UI for both Admins (legacy documents) and Users (new documents).
|
||||
|
||||
---
|
||||
|
||||
## 🖥️ Implementation Tasks
|
||||
|
||||
### **AI-3.1: Reusable AI Review Components**
|
||||
- [ ] **AiSuggestionField Component:**
|
||||
```typescript
|
||||
// components/ai/ai-suggestion-field.tsx
|
||||
interface AiSuggestionFieldProps {
|
||||
value: string;
|
||||
suggestion?: string;
|
||||
confidence?: number;
|
||||
onAccept: () => void;
|
||||
onReject: () => void;
|
||||
onEdit: (newValue: string) => void;
|
||||
}
|
||||
```
|
||||
Features:
|
||||
- AI icon with confidence badge (✨ 95%)
|
||||
- Yellow highlight for AI-suggested values
|
||||
- Accept/Reject/Edit actions
|
||||
- Tooltip showing raw AI extraction
|
||||
|
||||
- [ ] **DocumentComparisonView Component:**
|
||||
```typescript
|
||||
// components/ai/document-comparison-view.tsx
|
||||
interface DocumentComparisonViewProps {
|
||||
fileUrl: string;
|
||||
extractedData: ExtractionResult;
|
||||
formData: FormData;
|
||||
onFieldUpdate: (field: string, value: string) => void;
|
||||
}
|
||||
```
|
||||
Features:
|
||||
- PDF viewer sidebar (react-pdf)
|
||||
- Form fields with AI suggestions
|
||||
- Side-by-side comparison layout
|
||||
- Real-time validation feedback
|
||||
|
||||
- [ ] **Client-side Validation Integration:**
|
||||
```typescript
|
||||
// Validation schema with confidence thresholds
|
||||
const documentSchema = z.object({
|
||||
subject: z.string().min(1, "จำเป็นต้องระบุชื่อเรื่อง"),
|
||||
documentDate: z.string().refine(validateThaiDate),
|
||||
discipline: z.enum(['Civil', 'Mechanical', 'Electrical', 'Architectural'])
|
||||
});
|
||||
|
||||
// React Hook Form integration
|
||||
const form = useForm({
|
||||
resolver: zodResolver(documentSchema),
|
||||
mode: 'onChange',
|
||||
defaultValues: aiSuggestions
|
||||
});
|
||||
```
|
||||
|
||||
### **AI-3.2: Legacy Migration Dashboard (Admin Interface)**
|
||||
- [ ] **Migration List Page:**
|
||||
```typescript
|
||||
// app/(admin)/admin/migration/page.tsx
|
||||
interface MigrationListProps {
|
||||
status?: MigrationStatus;
|
||||
confidenceRange?: [number, number];
|
||||
dateRange?: [Date, Date];
|
||||
}
|
||||
```
|
||||
Features:
|
||||
- Paginated table with sorting/filtering
|
||||
- Status badges (Pending/Verified/Failed)
|
||||
- Confidence score heat map (red/yellow/green)
|
||||
- Bulk selection for actions
|
||||
|
||||
- [ ] **Filter System:**
|
||||
```typescript
|
||||
// Filter components
|
||||
const StatusFilter = () => (
|
||||
<Select value={selectedStatus} onValueChange={setSelectedStatus}>
|
||||
<SelectItem value="PENDING_REVIEW">รอตรวจสอบ</SelectItem>
|
||||
<SelectItem value="VERIFIED">ผ่านการตรวจสอบ</SelectItem>
|
||||
<SelectItem value="FAILED">ล้มเหลว</SelectItem>
|
||||
</Select>
|
||||
);
|
||||
|
||||
const ConfidenceFilter = () => (
|
||||
<Slider
|
||||
min={0}
|
||||
max={100}
|
||||
value={confidenceRange}
|
||||
onValueChange={setConfidenceRange}
|
||||
marks={[{value: 60, label: 'ต่ำ'}, {value: 85, label: 'ปานกลาง'}, {value: 95, label: 'สูง'}]}
|
||||
/>
|
||||
);
|
||||
```
|
||||
|
||||
- [ ] **Bulk Actions Implementation:**
|
||||
```typescript
|
||||
// Bulk verification for high-confidence items
|
||||
const handleBulkVerify = async (selectedIds: string[]) => {
|
||||
const confirmed = await confirm({
|
||||
title: "ยืนยันการนำเข้าข้อมูล",
|
||||
description: `จะยืนยันนำเข้าเอกสาร ${selectedIds.length} รายการที่มีความมั่นใจ >95% หรือไม่?`
|
||||
});
|
||||
|
||||
if (confirmed) {
|
||||
await Promise.all(
|
||||
selectedIds.map(publicId =>
|
||||
api.migration.update(publicId, { status: 'VERIFIED' })
|
||||
)
|
||||
);
|
||||
}
|
||||
};
|
||||
```
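
Because the confirmation dialog promises that only items above 95% confidence are verified, the selection can be filtered before any update call is made. The sketch below is one possible guard: `selectBulkVerifiable` and the `MigrationRow` shape are hypothetical, and confidence is assumed to be stored as a 0..1 fraction, matching the `DECIMAL(3,2)` column.

```typescript
interface MigrationRow {
  publicId: string;
  status: string;
  confidenceScore: number; // assumed 0..1 fraction
}

interface BulkSelection {
  eligible: string[];
  skipped: string[];
}

// Keeps only selected rows that are still pending review and strictly above the threshold.
function selectBulkVerifiable(
  rows: MigrationRow[],
  selectedIds: string[],
  threshold: number = 0.95
): BulkSelection {
  const byId = new Map<string, MigrationRow>();
  for (const row of rows) byId.set(row.publicId, row);

  const eligible: string[] = [];
  const skipped: string[] = [];
  for (const id of selectedIds) {
    const row = byId.get(id);
    if (
      row !== undefined &&
      row.status === "PENDING_REVIEW" &&
      row.confidenceScore > threshold
    ) {
      eligible.push(id);
    } else {
      skipped.push(id);
    }
  }
  return { eligible, skipped };
}
```

The `skipped` list can be shown to the admin so that low-confidence items are reviewed individually rather than silently dropped.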
|
||||
|
||||
- [ ] **Error Logging UI:**
|
||||
- Error details modal for failed extractions
|
||||
- OCR error screenshots
|
||||
- AI response raw text viewer
|
||||
- Retry mechanism with different parameters
|
||||
|
||||
### **AI-3.3: Real-time Ingestion Integration (User Interface)**
|
||||
- [ ] **RFA Creation Flow Enhancement:**
|
||||
```typescript
|
||||
// app/(dashboard)/rfas/create/page.tsx
|
||||
const [isProcessing, setIsProcessing] = useState(false);
|
||||
const [aiSuggestions, setAiSuggestions] = useState<ExtractionResult | null>(null);
|
||||
|
||||
const handleFileUpload = async (file: File) => {
|
||||
setIsProcessing(true);
|
||||
try {
|
||||
// 1. Upload file to temporary storage
|
||||
const uploadResult = await api.storage.uploadTemp(file);
|
||||
|
||||
// 2. Trigger AI extraction
|
||||
const extraction = await api.ai.extract({
|
||||
publicId: uploadResult.publicId,
|
||||
context: 'ingestion'
|
||||
});
|
||||
|
||||
// 3. Apply suggestions to form
|
||||
setAiSuggestions(extraction);
|
||||
form.reset(extraction.suggestions);
|
||||
} finally {
|
||||
setIsProcessing(false);
|
||||
}
|
||||
};
|
||||
```
|
||||
|
||||
- [ ] **Processing State UI:**
|
||||
```typescript
|
||||
// Loading component during AI processing
|
||||
const AiProcessingIndicator = () => (
|
||||
<Card className="border-yellow-200 bg-yellow-50">
|
||||
<CardContent className="flex items-center space-x-3 p-4">
|
||||
<Loader2 className="h-5 w-5 animate-spin text-yellow-600" />
|
||||
<div>
|
||||
<p className="font-medium text-yellow-800">AI กำลังวิเคราะห์เอกสาร...</p>
|
||||
<p className="text-sm text-yellow-600">กรุณารอสักครู่ (ประมาณ 15-30 วินาที)</p>
|
||||
</div>
|
||||
</CardContent>
|
||||
</Card>
|
||||
);
|
||||
```
|
||||
|
||||
- [ ] **Auto-fill with User Override:**
|
||||
```typescript
|
||||
// Form field with AI suggestion
|
||||
const FormFieldWithAi = ({ name, label }: { name: string; label: string }) => {
|
||||
const { control, watch } = useFormContext();
|
||||
const value = watch(name);
|
||||
const suggestion = aiSuggestions?.suggestions[name];
|
||||
const confidence = aiSuggestions?.confidence[name];
|
||||
|
||||
return (
|
||||
<FormField
|
||||
control={control}
|
||||
name={name}
|
||||
render={({ field }) => (
|
||||
<FormItem>
|
||||
<FormLabel className="flex items-center gap-2">
|
||||
{label}
|
||||
{suggestion && confidence !== undefined && (
|
||||
<Badge variant="secondary" className="text-xs">
|
||||
✨ AI {Math.round(confidence * 100)}%
|
||||
</Badge>
|
||||
)}
|
||||
</FormLabel>
|
||||
<FormControl>
|
||||
<Input
|
||||
{...field}
|
||||
className={suggestion && value === suggestion ? 'bg-yellow-50' : ''}
|
||||
/>
|
||||
</FormControl>
|
||||
</FormItem>
|
||||
)}
|
||||
/>
|
||||
);
|
||||
};
|
||||
```
|
||||
|
||||
- [ ] **Raw Text Comparison Toggle:**
|
||||
```typescript
|
||||
// Collapsible panel showing OCR text
|
||||
const OcrTextViewer = ({ extractedText }: { extractedText: string }) => (
|
||||
<Collapsible>
|
||||
<CollapsibleTrigger asChild>
|
||||
<Button variant="ghost" size="sm" className="text-blue-600">
|
||||
<Eye className="h-4 w-4 mr-2" />
|
||||
ดูข้อความดิบจาก AI
|
||||
</Button>
|
||||
</CollapsibleTrigger>
|
||||
<CollapsibleContent>
|
||||
<Card className="mt-2">
|
||||
<CardContent className="p-4">
|
||||
<pre className="text-sm bg-gray-50 p-3 rounded overflow-auto max-h-48">
|
||||
{extractedText}
|
||||
</pre>
|
||||
</CardContent>
|
||||
</Card>
|
||||
</CollapsibleContent>
|
||||
</Collapsible>
|
||||
);
|
||||
```
|
||||
|
||||
### **AI-3.4: Human-AI Feedback Loop Implementation**
|
||||
- [ ] **Feedback Collection System:**
|
||||
```typescript
|
||||
// Track user corrections for AI improvement
|
||||
const trackUserCorrection = async (
|
||||
field: string,
|
||||
aiSuggestion: string,
|
||||
userCorrection: string,
|
||||
documentPublicId: string
|
||||
) => {
|
||||
await api.ai.feedback.create({
|
||||
documentPublicId,
|
||||
field,
|
||||
aiSuggestion,
|
||||
userCorrection,
|
||||
timestamp: new Date().toISOString(),
|
||||
userAgent: navigator.userAgent
|
||||
});
|
||||
};
|
||||
```
|
||||
|
||||
- [ ] **Accuracy Analytics Dashboard:**
|
||||
```typescript
|
||||
// Admin dashboard for AI performance
|
||||
const AiPerformanceDashboard = () => {
|
||||
const [metrics, setMetrics] = useState<PerformanceMetrics>();
|
||||
|
||||
useEffect(() => {
|
||||
const loadMetrics = async () => {
|
||||
const data = await api.ai.analytics.getPerformance();
|
||||
setMetrics(data);
|
||||
};
|
||||
loadMetrics();
|
||||
}, []);
|
||||
|
||||
return (
|
||||
<div className="grid grid-cols-1 md:grid-cols-3 gap-6">
|
||||
<Card>
|
||||
<CardHeader>
|
||||
<CardTitle className="text-sm">ความแม่นยำโดยรวม</CardTitle>
|
||||
</CardHeader>
|
||||
<CardContent>
|
||||
<div className="text-2xl font-bold text-green-600">
|
||||
{metrics?.overallAccuracy}%
|
||||
</div>
|
||||
</CardContent>
|
||||
</Card>
|
||||
|
||||
<Card>
|
||||
<CardHeader>
|
||||
<CardTitle className="text-sm">อัตราการแก้ไขโดยผู้ใช้</CardTitle>
|
||||
</CardHeader>
|
||||
<CardContent>
|
||||
<div className="text-2xl font-bold text-blue-600">
|
||||
{metrics?.userCorrectionRate}%
|
||||
</div>
|
||||
</CardContent>
|
||||
</Card>
|
||||
|
||||
<Card>
|
||||
<CardHeader>
|
||||
<CardTitle className="text-sm">เวลาประมวลผลเฉลี่ย</CardTitle>
|
||||
</CardHeader>
|
||||
<CardContent>
|
||||
<div className="text-2xl font-bold text-purple-600">
|
||||
{metrics?.avgProcessingTime}s
|
||||
</div>
|
||||
</CardContent>
|
||||
</Card>
|
||||
</div>
|
||||
);
|
||||
};
|
||||
```
|
||||
|
||||
- [ ] **Feedback Data Structure:**
|
||||
```typescript
|
||||
// Types for feedback collection
|
||||
interface AiFeedbackDto {
|
||||
documentPublicId: string;
|
||||
field: string;
|
||||
aiSuggestion: string;
|
||||
userCorrection: string;
|
||||
confidence: number;
|
||||
timestamp: string;
|
||||
userAgent: string;
|
||||
}
|
||||
|
||||
interface PerformanceMetrics {
|
||||
overallAccuracy: number;
|
||||
userCorrectionRate: number;
|
||||
avgProcessingTime: number;
|
||||
fieldAccuracy: Record<string, number>;
|
||||
modelPerformance: Record<string, number>;
|
||||
}
|
||||
```
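
The `userCorrectionRate` and `fieldAccuracy` values could be derived from the collected feedback along the lines of the sketch below. It rests on an assumption the task does not state: that accepted suggestions are logged too, with `userCorrection` equal to `aiSuggestion`, so a record counts as a correction only when the two differ. `summarizeFeedback` is a hypothetical name.

```typescript
interface FeedbackRecord {
  field: string;
  aiSuggestion: string;
  userCorrection: string;
}

interface FeedbackSummary {
  userCorrectionRate: number; // percent of records the user changed
  fieldAccuracy: Record<string, number>; // percent left unchanged, per field
}

function summarizeFeedback(records: FeedbackRecord[]): FeedbackSummary {
  const perField: Record<string, { total: number; corrected: number }> = {};
  let correctedTotal = 0;

  for (const record of records) {
    if (perField[record.field] === undefined) {
      perField[record.field] = { total: 0, corrected: 0 };
    }
    const bucket = perField[record.field];
    bucket.total += 1;
    if (record.userCorrection !== record.aiSuggestion) {
      bucket.corrected += 1;
      correctedTotal += 1;
    }
  }

  const fieldAccuracy: Record<string, number> = {};
  for (const field of Object.keys(perField)) {
    const { total, corrected } = perField[field];
    fieldAccuracy[field] = ((total - corrected) / total) * 100;
  }

  const userCorrectionRate =
    records.length === 0 ? 0 : (correctedTotal / records.length) * 100;
  return { userCorrectionRate, fieldAccuracy };
}
```

Values are returned as percentages because the dashboard renders them with a `%` suffix.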
|
||||
|
||||
---
|
||||
|
||||
## 🎨 UX/UI Design Guidelines
|
||||
|
||||
### Design Principles
|
||||
- **Trust through Transparency:** Always show AI confidence and sources
|
||||
- **Human Control First:** User can override any AI suggestion
|
||||
- **Progressive Disclosure:** Hide complexity, show details on demand
|
||||
- **Thai Language First:** All UI text in Thai, engineering terms in context
|
||||
|
||||
### Visual Indicators
|
||||
```typescript
|
||||
// Confidence score color coding
|
||||
const getConfidenceColor = (confidence: number) => {
|
||||
if (confidence >= 0.95) return 'text-green-600 bg-green-50';
|
||||
if (confidence >= 0.85) return 'text-yellow-600 bg-yellow-50';
|
||||
return 'text-red-600 bg-red-50';
|
||||
};
|
||||
|
||||
// AI suggestion highlighting
|
||||
const aiSuggestionStyles = {
|
||||
  backgroundColor: '#fefce8', // yellow-50
  borderLeft: '3px solid #eab308', // yellow-500
|
||||
padding: '0.5rem'
|
||||
};
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 🔴 Critical Rules (Non-Negotiable)
|
||||
|
||||
1. **ADR-019 UUID Strategy:**
|
||||
- All API calls use `publicId` (string) only
|
||||
- NEVER use integer IDs or fallback patterns
|
||||
- Type safety: `publicId?: string` in interfaces
|
||||
|
||||
2. **ADR-018 AI Boundary:**
|
||||
- Frontend communicates with DMS API only
|
||||
- NO direct calls to n8n, Ollama, or PaddleOCR
|
||||
- AI processing via `/api/ai/extract` endpoint only
|
||||
|
||||
3. **Thai Language Standards:**
|
||||
- All UI text in Thai (i18n keys)
|
||||
- Code comments in Thai
|
||||
- Engineering terms preserved in original language
|
||||
|
||||
4. **Security Requirements:**
|
||||
- File uploads through StorageService only
|
||||
- Proper error handling without exposing system details
|
||||
- Rate limiting on AI endpoints
|
||||
|
||||
5. **Data Integrity:**
|
||||
- All AI suggestions require explicit user confirmation
|
||||
- Audit trail for all user corrections
|
||||
- Validation before form submission
|
||||
|
||||
---
|
||||
|
||||
## 📁 Related Specifications
|
||||
|
||||
- **ADR-018:** AI Boundary Policy - Security requirements
|
||||
- **ADR-019:** Hybrid Identifier Strategy - UUID usage patterns
|
||||
- **ADR-020:** AI Intelligence Integration - Architecture overview
|
||||
- **05-03-frontend-guidelines.md:** Next.js patterns and conventions
|
||||
- **05-08-i18n-guidelines.md:** Thai language implementation
|
||||
|
||||
---
|
||||
|
||||
## 📝 Component Library Usage
|
||||
|
||||
### Shadcn/UI Components
|
||||
```typescript
|
||||
// Required components for AI features
|
||||
import {
|
||||
Card,
|
||||
CardContent,
|
||||
CardHeader,
|
||||
CardTitle,
|
||||
Badge,
|
||||
Button,
|
||||
Input,
|
||||
Select,
|
||||
Slider,
|
||||
Collapsible,
|
||||
Dialog,
|
||||
Table,
|
||||
Pagination
|
||||
} from '@/components/ui';
|
||||
|
||||
// Custom AI components
|
||||
import { AiSuggestionField } from '@/components/ai/ai-suggestion-field';
|
||||
import { DocumentComparisonView } from '@/components/ai/document-comparison-view';
|
||||
import { AiProcessingIndicator } from '@/components/ai/processing-indicator';
|
||||
```
|
||||
|
||||
### Tailwind CSS Classes
|
||||
```css
|
||||
/* AI-specific utility classes */
|
||||
.ai-suggestion {
|
||||
@apply bg-yellow-50 border-l-4 border-yellow-500 p-3 rounded;
|
||||
}
|
||||
|
||||
.ai-high-confidence {
|
||||
@apply text-green-600 bg-green-50 border-green-500;
|
||||
}
|
||||
|
||||
.ai-medium-confidence {
|
||||
@apply text-yellow-600 bg-yellow-50 border-yellow-500;
|
||||
}
|
||||
|
||||
.ai-low-confidence {
|
||||
@apply text-red-600 bg-red-50 border-red-500;
|
||||
}
|
||||
```
|
||||
|
||||