lcbp3/specs/200-fullstacks/232-typhoon-ocr-integration/research.md

# Research: Typhoon OCR Integration

**Feature**: 232-typhoon-ocr-integration
**Date**: 2026-05-30
**Phase**: Phase 0 - Outline & Research

## Research Findings

### Typhoon OCR Ollama Integration

**Decision**: Use Ollama HTTP API for Typhoon OCR integration via Admin Desktop (Desk-5439)

**Rationale**:
- Typhoon OCR models are available in Ollama registry (scb10x/typhoon-ocr-3b, scb10x/typhoon-ocr-7b)
- Ollama provides consistent HTTP API for model inference
- Aligns with ADR-023/023A on-premises AI requirement
- Existing Ollama infrastructure on Admin Desktop can be reused

**Alternatives Considered**:
- OpenTyphoon Cloud API: Rejected due to ADR-023 on-premises requirement
- Direct model loading in Python: Rejected due to complexity and lack of integration with existing AI infrastructure

**Implementation Details**:
- Model: scb10x/typhoon-ocr-3b (~3-4GB VRAM)
- API endpoint: `POST /api/generate` with model parameter
- Input: Image data (base64 or file upload)
- Output: Extracted text with confidence scores
- Fallback: Tesseract OCR when Ollama unavailable

### Typhoon LLM Model Integration

**Decision**: Add typhoon2.1-gemma3-4b to AI Model Management as alternative to gemma4

**Rationale**:
- Typhoon models are optimized for Thai language
- Q3_K_M quantization reduces VRAM requirements (~8-10GB vs 16GB+)
- Provides model selection flexibility for administrators
- Compatible with existing Ollama infrastructure

**Alternatives Considered**:
- Full precision typhoon2.1-gemma3-12b: Rejected due to VRAM constraints
- Other Typhoon variants: Rejected due to limited availability in Ollama

**Implementation Details**:
- Model: typhoon2.1-gemma3-4b (~4-5GB VRAM)
- Integration via existing AI service with BullMQ queues
- Requires system.manage_all permission for model selection
- VRAM monitoring to prevent concurrent model loading

### Redis Caching for OCR Results

**Decision**: Use Redis with 24-hour TTL for OCR result caching

**Rationale**:
- Avoid reprocessing same document within short timeframe
- Redis already in use for other caching needs
- 24-hour TTL balances performance with storage efficiency
- Aligns with ADR-023A RAG embedding gap coverage pattern

**Alternatives Considered**:
- Permanent database storage: Rejected due to storage growth concerns
- No caching: Rejected due to performance impact
- Longer TTL (e.g., 7 days): Rejected due to storage efficiency

**Implementation Details**:
- Cache key: `ocr:cache:{documentPublicId}:{engine}:{hash}`
- TTL: 86400 seconds (24 hours)
- Cache invalidation: Manual or on document update
- Fallback to Tesseract bypasses cache

### VRAM Monitoring

**Decision**: Implement VRAM monitoring via Ollama API and Redis state tracking

**Rationale**:
- Prevent VRAM exhaustion when loading multiple models
- Sequential processing constraint (1 concurrent request)
- 90% VRAM usage limit per success criteria
- Ollama provides model status API

**Alternatives Considered**:
- GPU monitoring tools (nvidia-smi): Rejected due to complexity and OS dependency
- No monitoring: Rejected due to risk of VRAM exhaustion

**Implementation Details**:
- Monitor via Ollama `/api/tags` endpoint for loaded models
- Track VRAM usage in Redis: `ai:vram:usage`
- Block model loading if usage > 90%
- Sequential processing enforced via BullMQ queue

### ADR Updates

**Decision**: Create ADR-032 for Typhoon OCR integration and update ADR-023/023A

**Rationale**:
- Document Typhoon models as supported on-premises AI options
- Resolve conflicts between existing ADRs and new integration
- Provide clear guidance for future development
- Maintain ADR consistency per FR-009

**Alternatives Considered**:
- Only update existing ADRs: Rejected due to scope and clarity benefits of dedicated ADR
- No ADR updates: Rejected due to documentation requirements

**Implementation Details**:
- ADR-032: Typhoon OCR integration architecture
- ADR-023: Add Typhoon models to supported AI options
- ADR-023A: Add Typhoon models as alternatives to gemma4/nomic-embed-text
- Review for conflicts with existing ADRs

## Unknowns Resolved

No NEEDS CLARIFICATION markers remained in Technical Context. All technical decisions documented above.

## Dependencies Verified

- ✅ Ollama service operational on Admin Desktop (per ADR-023/023A)
- ✅ Typhoon OCR-3B available in Ollama registry
- ✅ Typhoon2.1-gemma3-4b available in Ollama registry
- ✅ Redis infrastructure available for caching
- ✅ BullMQ infrastructure available for job queues
- ✅ CASL infrastructure available for permission checks

## Next Steps

Proceed to Phase 1: Design & Contracts
- Generate data-model.md
- Generate API contracts in contracts/
- Generate quickstart.md
- Update agent context