feat(ai): ADR-032 Typhoon OCR integration - models, processors, cache, VRAM monitor, sandbox UI
@@ -0,0 +1,147 @@
# Data Model: Typhoon OCR Integration

**Feature**: 232-typhoon-ocr-integration
**Date**: 2026-05-30
**Phase**: Phase 1 - Design & Contracts

## Entities

### OCR Engine Configuration

**Purpose**: Represents available OCR engines with their parameters and resource requirements

**Fields**:
- `engineId`: string (UUIDv7) - Unique identifier for OCR engine configuration
- `engineName`: string - Engine name (e.g., "Tesseract", "Typhoon OCR-3B")
- `engineType`: enum - Engine type (tesseract, typhoon_ocr)
- `isActive`: boolean - Whether engine is currently available
- `vramRequirementMB`: number - VRAM requirement in MB (for AI-based engines)
- `processingTimeLimitSeconds`: number - Maximum processing time per page
- `concurrentLimit`: number - Maximum concurrent requests (1 for Typhoon)
- `fallbackEngineId`: string (UUIDv7, nullable) - Fallback engine when unavailable
- `createdAt`: datetime - Configuration creation timestamp
- `updatedAt`: datetime - Configuration last update timestamp

**Relationships**:
- One-to-many: OCR Engine Configuration → OCR Processing Logs
- Many-to-one: OCR Engine Configuration → OCR Engine Configuration (fallback)

**Validation Rules**:
- `engineName` must be unique
- `vramRequirementMB` required for AI-based engines
- `concurrentLimit` must be >= 1
- `fallbackEngineId` must reference valid engine or be null

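
The validation rules and the fallback relationship above can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: the function names `validate_engines` and `resolve_engine`, the choice of which engine types count as AI-based, the placeholder IDs (plain strings instead of UUIDv7), and the sample VRAM and concurrency values are all assumptions made for the example.

```python
from typing import Any, Optional

# Assumption: only typhoon_ocr is treated as an "AI-based" engine type here.
AI_ENGINE_TYPES = {"typhoon_ocr"}


def validate_engines(engines: list[dict[str, Any]]) -> list[str]:
    """Return rule violations for a set of OCR engine configurations."""
    errors: list[str] = []
    known_ids = {e["engineId"] for e in engines}
    seen_names: set[str] = set()
    for e in engines:
        name = e["engineName"]
        if name in seen_names:
            errors.append(f"{name}: engineName must be unique")
        seen_names.add(name)
        if e["engineType"] in AI_ENGINE_TYPES and e.get("vramRequirementMB") is None:
            errors.append(f"{name}: vramRequirementMB is required for AI-based engines")
        if e["concurrentLimit"] < 1:
            errors.append(f"{name}: concurrentLimit must be >= 1")
        fallback = e.get("fallbackEngineId")
        if fallback is not None and fallback not in known_ids:
            errors.append(f"{name}: fallbackEngineId must reference a valid engine or be null")
    return errors


def resolve_engine(
    engine_id: str, engines_by_id: dict[str, dict[str, Any]]
) -> Optional[dict[str, Any]]:
    """Follow fallbackEngineId links until an active engine is found."""
    visited: set[str] = set()
    current: Optional[str] = engine_id
    while current is not None and current not in visited:
        visited.add(current)  # guards against fallback cycles
        engine = engines_by_id.get(current)
        if engine is None:
            return None
        if engine["isActive"]:
            return engine
        current = engine.get("fallbackEngineId")
    return None


# Illustrative records; IDs and numeric values are placeholders.
tesseract = {
    "engineId": "engine-tesseract", "engineName": "Tesseract",
    "engineType": "tesseract", "isActive": True,
    "vramRequirementMB": None, "concurrentLimit": 4, "fallbackEngineId": None,
}
typhoon = {
    "engineId": "engine-typhoon", "engineName": "Typhoon OCR-3B",
    "engineType": "typhoon_ocr", "isActive": False,
    "vramRequirementMB": 4096, "concurrentLimit": 1,
    "fallbackEngineId": "engine-tesseract",
}
```

With Typhoon marked inactive, `resolve_engine("engine-typhoon", ...)` walks the fallback link and returns the Tesseract record. The `visited` set is there because the self-referencing relationship would otherwise allow two engines to name each other as fallback.
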
### AI Model Configuration

**Purpose**: Represents available AI models with their VRAM requirements and use cases

**Fields**:
- `modelId`: string (UUIDv7) - Unique identifier for AI model configuration
- `modelName`: string - Model name (e.g., "gemma4:e4b", "typhoon2.1-gemma3-4b")
- `modelType`: enum - Model type (llm, embedding, ocr)
- `ollamaModelName`: string - Ollama model identifier
- `vramRequirementMB`: number - VRAM requirement in MB
- `isActive`: boolean - Whether model is currently available
- `useCases`: string[] - Supported use cases (e.g., ["document_analysis", "ocr_extraction"])
- `quantization`: string (nullable) - Quantization type (e.g., "Q3_K_M")
- `createdAt`: datetime - Configuration creation timestamp
- `updatedAt`: datetime - Configuration last update timestamp

**Relationships**:
- One-to-many: AI Model Configuration → AI Audit Logs

**Validation Rules**:
- `modelName` must be unique
- `vramRequirementMB` required
- `ollamaModelName` must match Ollama registry
- `useCases` must include at least one valid use case

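
As a rough sketch of how these rules might be checked for a single model record, the function below takes the already-registered names and the set of model names known to Ollama as plain arguments. The function name `validate_model`, the contents of `VALID_USE_CASES` (taken only from the examples in the field list), and the way the Ollama registry is represented (a caller-supplied set) are assumptions for illustration.

```python
from typing import Any

MODEL_TYPES = {"llm", "embedding", "ocr"}
# Assumption: the full list of valid use cases is not given in this document;
# these two come from the example in the field description.
VALID_USE_CASES = {"document_analysis", "ocr_extraction"}


def validate_model(
    model: dict[str, Any],
    existing_names: set[str],
    registry_names: set[str],
) -> list[str]:
    """Return rule violations for one AI model configuration."""
    errors: list[str] = []
    if model["modelName"] in existing_names:
        errors.append("modelName must be unique")
    if model["modelType"] not in MODEL_TYPES:
        errors.append("modelType must be one of llm, embedding, ocr")
    if model.get("vramRequirementMB") is None:
        errors.append("vramRequirementMB is required")
    if model["ollamaModelName"] not in registry_names:
        errors.append("ollamaModelName must match the Ollama registry")
    if not any(u in VALID_USE_CASES for u in model.get("useCases", [])):
        errors.append("useCases must include at least one valid use case")
    return errors


# Illustrative record; the VRAM figure is a placeholder, and reusing the
# model name as the Ollama identifier is an assumption.
gemma = {
    "modelName": "gemma4:e4b",
    "modelType": "llm",
    "ollamaModelName": "gemma4:e4b",
    "vramRequirementMB": 4096,
    "useCases": ["document_analysis"],
}
```

Passing the registry in as a set keeps the check free of network calls; in a running system that set would presumably be refreshed from Ollama before validation.
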
### VRAM Monitor State

**Purpose**: Tracks GPU VRAM usage across all loaded AI models

**Fields**:
- `monitorId`: string (UUIDv7) - Unique identifier for monitor state
- `totalVRAMMB`: number - Total GPU VRAM in MB
- `usedVRAMMB`: number - Currently used VRAM in MB
- `loadedModels`: string[] - List of loaded model IDs
- `lastUpdated`: datetime - Last update timestamp
- `thresholdPercent`: number - VRAM usage threshold (default: 90)

**Validation Rules**:
- `usedVRAMMB` must be <= `totalVRAMMB`
- `thresholdPercent` must be between 0 and 100
- `loadedModels` must reference valid AI Model Configurations

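
One plausible way to use this state is an admission check before loading a model: project the VRAM usage after the load and compare it with the threshold. The sketch below assumes that interpretation of `thresholdPercent`; the function names and the sample figures (an 8192 MB GPU with 4096 MB in use) are made up for the example.

```python
from typing import Any


def validate_monitor_state(state: dict[str, Any], known_model_ids: set[str]) -> list[str]:
    """Return rule violations for a VRAM monitor state record."""
    errors: list[str] = []
    if state["usedVRAMMB"] > state["totalVRAMMB"]:
        errors.append("usedVRAMMB must be <= totalVRAMMB")
    if not 0 <= state["thresholdPercent"] <= 100:
        errors.append("thresholdPercent must be between 0 and 100")
    unknown = [m for m in state["loadedModels"] if m not in known_model_ids]
    if unknown:
        errors.append(f"loadedModels references unknown models: {unknown}")
    return errors


def can_load(state: dict[str, Any], required_mb: int) -> bool:
    """True if loading a model needing required_mb keeps usage within the threshold.

    Assumption: the threshold applies to projected usage after the load.
    Cross-multiplying avoids floating-point division.
    """
    projected = state["usedVRAMMB"] + required_mb
    return projected * 100 <= state["totalVRAMMB"] * state["thresholdPercent"]


# Illustrative state; sizes and IDs are placeholders.
state = {
    "totalVRAMMB": 8192,
    "usedVRAMMB": 4096,
    "loadedModels": ["model-gemma"],
    "thresholdPercent": 90,
}
```

With these numbers the usable ceiling is 90% of 8192 MB, so a 3000 MB model fits (7096 MB projected) while a 3500 MB model does not (7596 MB projected).
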
### OCR Processing Log

**Purpose**: Logs all OCR processing attempts for audit and debugging

**Fields**:
- `logId`: string (UUIDv7) - Unique identifier for log entry
- `documentPublicId`: string - Document being processed
- `engineId`: string (UUIDv7) - OCR engine used
- `processingTimeSeconds`: number - Actual processing time
- `success`: boolean - Whether processing succeeded
- `errorMessage`: string (nullable) - Error message if failed
- `fallbackUsed`: boolean - Whether fallback engine was used
- `cacheHit`: boolean - Whether result was from cache
- `timestamp`: datetime - Processing timestamp

**Relationships**:
- Many-to-one: OCR Processing Log → OCR Engine Configuration

**Validation Rules**:
- `documentPublicId` required
- `engineId` must reference valid engine
- `processingTimeSeconds` must be >= 0

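
The log entry could be modeled as an immutable record that enforces its own field rules on construction. The sketch below is one hypothetical shape: the class name `OcrProcessingLog`, the snake_case attribute names (Python-style mirrors of the camelCase fields above), the helper `references_known_engine`, and the plain-string IDs are assumptions, and UUIDv7 generation is deliberately left to the caller.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class OcrProcessingLog:
    log_id: str                      # logId (UUIDv7 supplied by the caller)
    document_public_id: str          # documentPublicId
    engine_id: str                   # engineId
    processing_time_seconds: float   # processingTimeSeconds
    success: bool
    error_message: Optional[str] = None   # errorMessage
    fallback_used: bool = False           # fallbackUsed
    cache_hit: bool = False               # cacheHit
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self) -> None:
        # Rules that depend only on the entry itself.
        if not self.document_public_id:
            raise ValueError("documentPublicId is required")
        if self.processing_time_seconds < 0:
            raise ValueError("processingTimeSeconds must be >= 0")


def references_known_engine(entry: OcrProcessingLog, known_engine_ids: set[str]) -> bool:
    """Check the engineId rule, which needs the set of configured engines."""
    return entry.engine_id in known_engine_ids
```

The engine reference is checked outside the constructor because it depends on external state, whereas the other two rules can be verified from the entry alone.
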
### AI Audit Log (Existing - Extended)

**Purpose**: Logs all AI interactions per ADR-023/023A

**Extensions for Typhoon Integration**:
- Add `modelType` field to distinguish between LLM, OCR, and embedding models
- Add `vramUsageMB` field to track VRAM consumption per interaction
- Add `cacheHit` field to track cache utilization

## State Transitions

### OCR Engine Configuration

```
Created → Active → Inactive → Deleted
```

- **Created**: Initial state when engine configuration is added
- **Active**: Engine is available for use
- **Inactive**: Engine is temporarily unavailable (e.g., Ollama down)
- **Deleted**: Engine configuration is removed

### AI Model Configuration

```
Created → Active → Inactive → Deleted
```

- **Created**: Initial state when model configuration is added
- **Active**: Model is available for use
- **Inactive**: Model is temporarily unavailable (e.g., VRAM constraints)
- **Deleted**: Model configuration is removed

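
Both configuration entities share the same lifecycle, so a single transition table can cover them. The sketch below encodes the diagram as written and adds one labeled assumption: because Inactive is described as temporary, it also permits Inactive → Active, which the linear diagram does not show. The names `ConfigState`, `ALLOWED_TRANSITIONS`, and `transition` are hypothetical.

```python
from enum import Enum


class ConfigState(Enum):
    CREATED = "created"
    ACTIVE = "active"
    INACTIVE = "inactive"
    DELETED = "deleted"


ALLOWED_TRANSITIONS: dict[ConfigState, set[ConfigState]] = {
    ConfigState.CREATED: {ConfigState.ACTIVE},
    ConfigState.ACTIVE: {ConfigState.INACTIVE},
    # Assumption: INACTIVE -> ACTIVE is allowed because "Inactive" is
    # described as temporary; the diagram itself only shows INACTIVE -> DELETED.
    ConfigState.INACTIVE: {ConfigState.ACTIVE, ConfigState.DELETED},
    ConfigState.DELETED: set(),
}


def transition(current: ConfigState, target: ConfigState) -> ConfigState:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Keeping the allowed moves in a table means a recovery path (for example, Ollama coming back up) is a one-line change rather than a new code branch.
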
## Schema Changes

No new database tables are required. Existing tables will be extended:

- `ai_prompts`: Add Typhoon OCR prompt templates
- `ai_audit_logs`: Add `modelType`, `vramUsageMB`, and `cacheHit` fields

New configuration structures (OCR Engine Configuration, AI Model Configuration) may be added in Redis for performance.

## Data Dictionary Updates

Add entries for:
- OCR Engine Configuration
- AI Model Configuration
- VRAM Monitor State
- OCR Processing Log