# Data Model: Typhoon OCR Integration
**Feature**: 232-typhoon-ocr-integration
**Date**: 2026-05-30
**Phase**: Phase 1 - Design & Contracts
## Entities
### OCR Engine Configuration
**Purpose**: Represents available OCR engines with their parameters and resource requirements
**Fields**:
- `engineId`: string (UUIDv7) - Unique identifier for OCR engine configuration
- `engineName`: string - Engine name (e.g., "Tesseract", "Typhoon OCR-3B")
- `engineType`: enum - Engine type (tesseract, typhoon_ocr)
- `isActive`: boolean - Whether engine is currently available
- `vramRequirementMB`: number - VRAM requirement in MB (for AI-based engines)
- `processingTimeLimitSeconds`: number - Maximum processing time per page
- `concurrentLimit`: number - Maximum concurrent requests (1 for Typhoon)
- `fallbackEngineId`: string (UUIDv7, nullable) - Fallback engine when unavailable
- `createdAt`: datetime - Configuration creation timestamp
- `updatedAt`: datetime - Configuration last update timestamp
**Relationships**:
- One-to-many: OCR Engine Configuration → OCR Processing Logs
- Many-to-one: OCR Engine Configuration → OCR Engine Configuration (fallback)
**Validation Rules**:
- `engineName` must be unique
- `vramRequirementMB` required for AI-based engines
- `concurrentLimit` must be >= 1
- `fallbackEngineId` must reference valid engine or be null
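The validation rules and the self-referencing fallback relationship above can be sketched in TypeScript as follows. This is a minimal illustration, not the project's implementation: the names `OcrEngineConfig`, `validateEngineConfig`, and `resolveEngine` are hypothetical, and the sketch assumes that `typhoon_ocr` is the only AI-based engine type and that an engine may not name itself as its own fallback.

```typescript
// Hypothetical shape mirroring the field list above; not taken from the codebase.
type EngineType = "tesseract" | "typhoon_ocr";

interface OcrEngineConfig {
  engineId: string;
  engineName: string;
  engineType: EngineType;
  isActive: boolean;
  vramRequirementMB: number | null;
  processingTimeLimitSeconds: number;
  concurrentLimit: number;
  fallbackEngineId: string | null;
  createdAt: Date;
  updatedAt: Date;
}

// Assumption: only typhoon_ocr counts as an "AI-based" engine.
const AI_ENGINE_TYPES: ReadonlySet<EngineType> = new Set<EngineType>(["typhoon_ocr"]);

// Returns a list of rule violations; an empty list means the candidate is valid.
function validateEngineConfig(candidate: OcrEngineConfig, existing: OcrEngineConfig[]): string[] {
  const errors: string[] = [];
  // Exclude the candidate itself so that updates do not collide with their own row.
  const others = existing.filter((e) => e.engineId !== candidate.engineId);

  if (others.some((e) => e.engineName === candidate.engineName)) {
    errors.push("engineName must be unique");
  }
  if (AI_ENGINE_TYPES.has(candidate.engineType) && candidate.vramRequirementMB == null) {
    errors.push("vramRequirementMB is required for AI-based engines");
  }
  if (!Number.isInteger(candidate.concurrentLimit) || candidate.concurrentLimit < 1) {
    errors.push("concurrentLimit must be >= 1");
  }
  if (
    candidate.fallbackEngineId !== null &&
    !others.some((e) => e.engineId === candidate.fallbackEngineId)
  ) {
    errors.push("fallbackEngineId must reference a valid engine or be null");
  }
  return errors;
}

// Follows the fallback chain until an active engine is found.
// The visited set guards against cycles (A falls back to B, B falls back to A).
function resolveEngine(startId: string, engines: Map<string, OcrEngineConfig>): OcrEngineConfig | null {
  const visited = new Set<string>();
  let currentId: string | null = startId;
  while (currentId !== null && !visited.has(currentId)) {
    visited.add(currentId);
    const engine = engines.get(currentId);
    if (!engine) return null;
    if (engine.isActive) return engine;
    currentId = engine.fallbackEngineId;
  }
  return null;
}
```

Returning a list of violations rather than throwing on the first one lets a caller report every problem with a configuration at once.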
### AI Model Configuration
**Purpose**: Represents available AI models with their VRAM requirements and use cases
**Fields**:
- `modelId`: string (UUIDv7) - Unique identifier for AI model configuration
- `modelName`: string - Model name (e.g., "gemma4:e4b", "typhoon2.1-gemma3-4b")
- `modelType`: enum - Model type (llm, embedding, ocr)
- `ollamaModelName`: string - Ollama model identifier
- `vramRequirementMB`: number - VRAM requirement in MB
- `isActive`: boolean - Whether model is currently available
- `useCases`: string[] - Supported use cases (e.g., ["document_analysis", "ocr_extraction"])
- `quantization`: string (nullable) - Quantization type (e.g., "Q3_K_M")
- `createdAt`: datetime - Configuration creation timestamp
- `updatedAt`: datetime - Configuration last update timestamp
**Relationships**:
- One-to-many: AI Model Configuration → AI Audit Logs
**Validation Rules**:
- `modelName` must be unique
- `vramRequirementMB` required
- `ollamaModelName` must match Ollama registry
- `useCases` must include at least one valid use case
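A possible TypeScript sketch of these rules is shown below. Several details are assumptions rather than part of this specification: the type and function names are hypothetical, the timestamps are omitted for brevity, "required" is read as "a positive finite number", the set of valid use cases contains only the two examples given above, and "must match Ollama registry" is read as "must appear in a caller-supplied set of model names" (for example, names returned by Ollama's list-local-models endpoint).

```typescript
type ModelType = "llm" | "embedding" | "ocr";

// Hypothetical shape mirroring the field list above (timestamps omitted).
interface AiModelConfig {
  modelId: string;
  modelName: string;
  modelType: ModelType;
  ollamaModelName: string;
  vramRequirementMB: number;
  isActive: boolean;
  useCases: string[];
  quantization: string | null;
}

// Assumption: the full list of valid use cases is not enumerated in this document;
// these are only the examples it mentions.
const KNOWN_USE_CASES: ReadonlySet<string> = new Set(["document_analysis", "ocr_extraction"]);

// Returns a list of rule violations; an empty list means the candidate is valid.
function validateModelConfig(
  candidate: AiModelConfig,
  existing: AiModelConfig[],
  availableOllamaModels: ReadonlySet<string>,
): string[] {
  const errors: string[] = [];
  const others = existing.filter((m) => m.modelId !== candidate.modelId);

  if (others.some((m) => m.modelName === candidate.modelName)) {
    errors.push("modelName must be unique");
  }
  // Assumption: "required" means a positive, finite number of megabytes.
  if (!Number.isFinite(candidate.vramRequirementMB) || candidate.vramRequirementMB <= 0) {
    errors.push("vramRequirementMB is required");
  }
  if (!availableOllamaModels.has(candidate.ollamaModelName)) {
    errors.push("ollamaModelName must match Ollama registry");
  }
  if (!candidate.useCases.some((u) => KNOWN_USE_CASES.has(u))) {
    errors.push("useCases must include at least one valid use case");
  }
  return errors;
}
```

Passing the available model names in as a parameter keeps the validator free of network calls, so it can be exercised without a running Ollama instance.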
### VRAM Monitor State
**Purpose**: Tracks GPU VRAM usage across all loaded AI models
**Fields**:
- `monitorId`: string (UUIDv7) - Unique identifier for monitor state
- `totalVRAMMB`: number - Total GPU VRAM in MB
- `usedVRAMMB`: number - Currently used VRAM in MB
- `loadedModels`: string[] - List of loaded model IDs
- `lastUpdated`: datetime - Last update timestamp
- `thresholdPercent`: number - VRAM usage threshold (default: 90)
**Validation Rules**:
- `usedVRAMMB` must be <= `totalVRAMMB`
- `thresholdPercent` must be between 0 and 100
- `loadedModels` must reference valid AI Model Configurations
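To illustrate how the monitor state might gate model loading, here is a minimal TypeScript sketch. The function names are hypothetical, and the interpretation of `thresholdPercent` is an assumption: a model is admitted only if projected usage after loading stays at or below the threshold.

```typescript
// Hypothetical shape mirroring the field list above.
interface VramMonitorState {
  monitorId: string;
  totalVRAMMB: number;
  usedVRAMMB: number;
  loadedModels: string[];
  lastUpdated: Date;
  thresholdPercent: number;
}

// Checks the three validation rules; returns a list of violations.
function validateMonitorState(state: VramMonitorState, knownModelIds: ReadonlySet<string>): string[] {
  const errors: string[] = [];
  if (state.usedVRAMMB > state.totalVRAMMB) {
    errors.push("usedVRAMMB must be <= totalVRAMMB");
  }
  if (state.thresholdPercent < 0 || state.thresholdPercent > 100) {
    errors.push("thresholdPercent must be between 0 and 100");
  }
  if (state.loadedModels.some((id) => !knownModelIds.has(id))) {
    errors.push("loadedModels must reference valid AI Model Configurations");
  }
  return errors;
}

// Current usage as a percentage of total VRAM (0 when the total is unknown or zero).
function usagePercent(state: VramMonitorState): number {
  if (state.totalVRAMMB <= 0) return 0;
  return (state.usedVRAMMB / state.totalVRAMMB) * 100;
}

// Assumption: a model may load only if projected usage stays within thresholdPercent.
function canLoadModel(state: VramMonitorState, requiredMB: number): boolean {
  if (state.totalVRAMMB <= 0 || requiredMB < 0) return false;
  const projected = ((state.usedVRAMMB + requiredMB) / state.totalVRAMMB) * 100;
  return projected <= state.thresholdPercent;
}
```

Checking the projected figure rather than the current one avoids admitting a model that would itself push usage over the threshold.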
### OCR Processing Log
**Purpose**: Logs all OCR processing attempts for audit and debugging
**Fields**:
- `logId`: string (UUIDv7) - Unique identifier for log entry
- `documentPublicId`: string - Document being processed
- `engineId`: string (UUIDv7) - OCR engine used
- `processingTimeSeconds`: number - Actual processing time
- `success`: boolean - Whether processing succeeded
- `errorMessage`: string (nullable) - Error message if failed
- `fallbackUsed`: boolean - Whether fallback engine was used
- `cacheHit`: boolean - Whether result was from cache
- `timestamp`: datetime - Processing timestamp
**Relationships**:
- Many-to-one: OCR Processing Log → OCR Engine Configuration
**Validation Rules**:
- `documentPublicId` required
- `engineId` must reference valid engine
- `processingTimeSeconds` must be >= 0
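As an illustration, a log entry could be assembled and checked against these rules as in the TypeScript sketch below. The names `createProcessingLog` and `NewLogInput` are hypothetical. Because the standard library has no UUIDv7 generator, the sketch assumes the caller supplies one through the `generateId` parameter.

```typescript
// Hypothetical shape mirroring the field list above.
interface OcrProcessingLog {
  logId: string;
  documentPublicId: string;
  engineId: string;
  processingTimeSeconds: number;
  success: boolean;
  errorMessage: string | null;
  fallbackUsed: boolean;
  cacheHit: boolean;
  timestamp: Date;
}

// Everything the caller provides; logId and timestamp are filled in here.
type NewLogInput = Omit<OcrProcessingLog, "logId" | "timestamp">;

function createProcessingLog(
  input: NewLogInput,
  knownEngineIds: ReadonlySet<string>,
  generateId: () => string, // assumed to return a UUIDv7 string
  now: Date = new Date(),
): OcrProcessingLog {
  if (input.documentPublicId.trim() === "") {
    throw new Error("documentPublicId is required");
  }
  if (!knownEngineIds.has(input.engineId)) {
    throw new Error("engineId must reference a valid engine");
  }
  if (!Number.isFinite(input.processingTimeSeconds) || input.processingTimeSeconds < 0) {
    throw new RangeError("processingTimeSeconds must be >= 0");
  }
  return { ...input, logId: generateId(), timestamp: now };
}
```

Injecting the ID generator and the clock keeps the function deterministic under test.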
### AI Audit Log (Existing - Extended)
**Purpose**: Logs all AI interactions per ADR-023/023A
**Extensions for Typhoon Integration**:
- Add `modelType` field to distinguish between LLM, OCR, and embedding models
- Add `vramUsageMB` field to track VRAM consumption per interaction
- Add `cacheHit` field to track cache utilization
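The three extension fields could be layered onto the existing audit log type as sketched below. The existing ADR-023 record shape is not reproduced in this document, so the sketch stays generic over it; `TyphoonAuditFields`, `ExtendedAuditLog`, and `withTyphoonFields` are hypothetical names, and the non-negative check on `vramUsageMB` is an assumption.

```typescript
type ModelType = "llm" | "embedding" | "ocr";

// The three fields added for the Typhoon integration.
interface TyphoonAuditFields {
  modelType: ModelType;
  vramUsageMB: number;
  cacheHit: boolean;
}

// TBase stands in for the existing audit log record, whose shape is not shown here.
type ExtendedAuditLog<TBase extends object> = TBase & TyphoonAuditFields;

function withTyphoonFields<TBase extends object>(
  base: TBase,
  fields: TyphoonAuditFields,
): ExtendedAuditLog<TBase> {
  // Assumption: VRAM usage is reported as a non-negative number of megabytes.
  if (!Number.isFinite(fields.vramUsageMB) || fields.vramUsageMB < 0) {
    throw new RangeError("vramUsageMB must be a non-negative number");
  }
  return { ...base, ...fields };
}
```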
## State Transitions
### OCR Engine Configuration
```
Created → Active → Inactive → Deleted
```
- **Created**: Initial state when engine configuration is added
- **Active**: Engine is available for use
- **Inactive**: Engine is temporarily unavailable (e.g., Ollama down)
- **Deleted**: Engine configuration is removed
### AI Model Configuration
```
Created → Active → Inactive → Deleted
```
- **Created**: Initial state when model configuration is added
- **Active**: Model is available for use
- **Inactive**: Model is temporarily unavailable (e.g., VRAM constraints)
- **Deleted**: Model configuration is removed
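Both entities share the same lifecycle, which can be expressed as a small transition table. The TypeScript sketch below encodes only the arrows drawn in the diagrams above, and the names `ConfigState`, `canTransition`, and `transition` are hypothetical. The diagrams show no arrow from Inactive back to Active, so the sketch does not allow it, even though Inactive is described as temporary.

```typescript
type ConfigState = "Created" | "Active" | "Inactive" | "Deleted";

// Allowed next states, taken directly from the diagrams above.
// To let an Inactive configuration recover, "Active" would have to be added
// to the Inactive entry; the diagrams do not show that arrow.
const TRANSITIONS: Record<ConfigState, readonly ConfigState[]> = {
  Created: ["Active"],
  Active: ["Inactive"],
  Inactive: ["Deleted"],
  Deleted: [],
};

function canTransition(from: ConfigState, to: ConfigState): boolean {
  return TRANSITIONS[from].includes(to);
}

// Returns the new state, or throws if the move is not in the table.
function transition(from: ConfigState, to: ConfigState): ConfigState {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid transition: ${from} -> ${to}`);
  }
  return to;
}
```

Keeping the rules in a single table means one edit covers both OCR Engine Configuration and AI Model Configuration.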
## Schema Changes
No new database tables are required. Planned changes:
- `ai_prompts`: Add Typhoon OCR prompt templates
- `ai_audit_logs`: Add `modelType`, `vramUsageMB`, and `cacheHit` fields
- Redis: Configuration entries for OCR Engine Configuration and AI Model Configuration may be stored here for performance; these are not database tables
## Data Dictionary Updates
Add entries for:
- OCR Engine Configuration
- AI Model Configuration
- VRAM Monitor State
- OCR Processing Log