# Quickstart: Typhoon OCR Integration
- **Feature**: 232-typhoon-ocr-integration
- **Date**: 2026-05-30
- **Phase**: Implementation
## Current Scope
This feature is being implemented against the live LCBP3 repo structure, not the older generated paths in `plan.md` / `tasks.md`.
Current verified baseline:
- AI Model Management already exists via `ai_available_models` and `system_settings`
- OCR Sandbox already exists as a 2-step flow in `frontend/components/admin/ai/OcrSandboxPromptManager.tsx`
- OCR sidecar currently runs **Tesseract** as the production baseline
- The Typhoon LLM option can be seeded into `ai_available_models` by applying a SQL delta
- Typhoon OCR runtime path is still pending full backend/sidecar integration
## Prerequisites
- Admin Desktop (Desk-5439) with Ollama service reachable from DMS backend
- Redis service running
- MariaDB database with `ai_available_models`, `ai_prompts`, and `ai_audit_logs`
- BullMQ queues configured (`ai-realtime`, `ai-batch`)
- `system.manage_all` permission for AI admin features
## Installation Steps
### 1. Pull Typhoon models on Admin Desktop
```powershell
ollama pull scb10x/typhoon2.1-gemma3-4b
ollama pull scb10x/typhoon-ocr-3b
ollama list
```
The output of `ollama list` should include:
- `scb10x/typhoon2.1-gemma3-4b`
- `scb10x/typhoon-ocr-3b`
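If you want to script this check instead of reading the list by eye, the following is a minimal sketch. It assumes `ollama list` prints a header row followed by one row per model, with the model name (optionally suffixed by a `:tag`) in the first column. The helper name `missingModels` and the sample ID and size values are hypothetical and exist only for illustration.

```typescript
// Hypothetical helper: report which required models are absent from `ollama list` output.
// Assumption: the first line is a header and the first column of each row is the model name.
function missingModels(listOutput: string, required: string[]): string[] {
  const installed = new Set<string>();
  const rows = listOutput.trim().split(/\r?\n/).slice(1); // drop the header row
  for (const row of rows) {
    const name = row.trim().split(/\s+/)[0];
    if (!name) continue;
    installed.add(name);
    // Also record the name without its tag, e.g. "foo/bar:latest" -> "foo/bar".
    const colon = name.lastIndexOf(":");
    if (colon > 0) installed.add(name.slice(0, colon));
  }
  return required.filter((model) => !installed.has(model));
}

// Placeholder output: the ID, size, and timestamp values are made up.
const sampleListOutput = [
  "NAME                                  ID            SIZE    MODIFIED",
  "scb10x/typhoon2.1-gemma3-4b:latest    aaaaaaaaaaaa  2.6 GB  2 minutes ago",
].join("\n");

console.log(
  missingModels(sampleListOutput, [
    "scb10x/typhoon2.1-gemma3-4b",
    "scb10x/typhoon-ocr-3b",
  ]),
); // -> ["scb10x/typhoon-ocr-3b"]
```

An empty result means both models are present. Any returned name still needs an `ollama pull`.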
### 2. Apply the Typhoon model seed delta
Apply:
- `specs/03-Data-and-Storage/deltas/2026-05-30-seed-typhoon-ai-models.sql`
This delta adds `typhoon2.1-gemma3-4b` into `ai_available_models` if it does not already exist.
### 3. Verify AI admin model data
Verified code path:
- Backend: `backend/src/modules/ai/ai-settings.service.ts`
- API: `GET /api/ai/admin/models`
- Frontend: `frontend/app/(admin)/admin/ai/page.tsx`
Expected behavior:
- `gemma4:e4b` remains the default fallback active model when `AI_ACTIVE_MODEL` is unset
- `typhoon2.1-gemma3-4b` appears as an additional selectable model after the delta is applied
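The fallback rule above can be sketched as a small pure function. This is an illustration only, not the code in `ai-settings.service.ts`: the `SystemSettings` shape and the `resolveActiveModel` name are hypothetical, and treating an empty or whitespace-only value as unset is an assumption.

```typescript
// Fallback model name taken from this quickstart.
const FALLBACK_MODEL = "gemma4:e4b";

// Hypothetical shape: `system_settings` rows flattened to key/value pairs.
type SystemSettings = Record<string, string | undefined>;

// Return the configured active model, or the fallback when AI_ACTIVE_MODEL is unset.
// Assumption: an empty or whitespace-only value counts as unset.
function resolveActiveModel(settings: SystemSettings): string {
  const configured = settings["AI_ACTIVE_MODEL"]?.trim();
  return configured ? configured : FALLBACK_MODEL;
}

console.log(resolveActiveModel({})); // -> "gemma4:e4b"
console.log(resolveActiveModel({ AI_ACTIVE_MODEL: "typhoon2.1-gemma3-4b" })); // -> "typhoon2.1-gemma3-4b"
```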
## Usage
### AI Model Management
1. Open the AI admin page.
2. Confirm `typhoon2.1-gemma3-4b` appears in the model list.
3. Activate it from the existing AI Model Management card.
### OCR Sandbox
Current verified baseline:
- OCR Sandbox uses the existing 2-step flow:
  - Step 1: OCR only
  - Step 2: AI extraction from cached OCR text
- OCR sidecar health card now reflects the current engine baseline as `OCR Sidecar (Tesseract)`
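To make the 2-step flow concrete, here is a minimal in-memory sketch. It is not the sandbox implementation: `OcrSandboxFlow`, `OcrEngine`, and `Extractor` are hypothetical names, the `Map` stands in for whatever cache the backend actually uses, and the calls are synchronous only to keep the example short.

```typescript
// Hypothetical stand-ins for the OCR sidecar call and the AI extraction call.
type OcrEngine = (fileName: string) => string;
type Extractor = (ocrText: string) => Record<string, string>;

class OcrSandboxFlow {
  // Assumption: OCR text is cached per file; a Map stands in for the real cache.
  private readonly ocrCache = new Map<string, string>();

  // Step 1: OCR only. The recognized text is cached under the file name.
  runOcr(fileName: string, engine: OcrEngine): string {
    const text = engine(fileName);
    this.ocrCache.set(fileName, text);
    return text;
  }

  // Step 2: AI extraction reads the cached text and never calls the OCR engine.
  runExtraction(fileName: string, extractor: Extractor): Record<string, string> {
    const cached = this.ocrCache.get(fileName);
    if (cached === undefined) {
      throw new Error(`No cached OCR text for ${fileName}; run step 1 first`);
    }
    return extractor(cached);
  }
}

// Usage with stub functions in place of the real services.
const flow = new OcrSandboxFlow();
flow.runOcr("sample.pdf", () => "Invoice No: 42");
const fields = flow.runExtraction("sample.pdf", (text) => ({
  invoiceNo: text.replace("Invoice No: ", ""),
}));
console.log(fields.invoiceNo); // -> "42"
```

Keeping the steps separate means prompt changes can be retried in step 2 without paying for OCR again.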
Typhoon OCR engine selection is still pending implementation and should not be treated as complete until backend, queue, and sidecar integration are added.
## Verification
### Verify the model seed
1. Apply the SQL delta.
2. Open `/admin/ai`.
3. Confirm `typhoon2.1-gemma3-4b` appears in the model list.
### Verify the fallback active model
1. In a test environment, ensure `AI_ACTIVE_MODEL` is absent from `system_settings`.
2. Call `GET /api/ai/admin/models/active`.
3. Confirm the fallback response resolves to `gemma4:e4b`.
### Verify OCR baseline label
1. Open `/admin/ai`.
2. Go to `Overview & Health`.
3. Confirm the OCR card label reads `OCR Sidecar (Tesseract)`.
## Troubleshooting
### Ollama unavailable
Symptoms:
- AI health endpoint reports Ollama as down
- model activation cannot proceed
Checks (run on the Admin Desktop):
```powershell
ollama list
```
### Typhoon model missing from UI
Checks:
- verify `2026-05-30-seed-typhoon-ai-models.sql` was applied
- verify `GET /api/ai/admin/models` returns the seeded row
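The second check can be automated once the response body has been parsed. The sketch below assumes the endpoint returns a JSON array of row objects. Because the real field names are not documented here, it matches the model name against any field value instead of a specific column. `hasSeededModel` and the `model_name` field in the sample are hypothetical.

```typescript
// Hypothetical row shape: field names are unknown, so rows are treated as opaque objects.
type ModelRow = Record<string, unknown>;

// True when any row carries the model name as the value of any field.
function hasSeededModel(rows: ModelRow[], modelName: string): boolean {
  return rows.some((row) =>
    Object.keys(row).some((key) => row[key] === modelName),
  );
}

// Placeholder response body; `model_name` is an illustrative field name only.
const sampleRows: ModelRow[] = [
  { id: 1, model_name: "gemma4:e4b" },
  { id: 2, model_name: "typhoon2.1-gemma3-4b" },
];

console.log(hasSeededModel(sampleRows, "typhoon2.1-gemma3-4b")); // -> true
```

If this returns `false` while the SQL delta has been applied, the problem is more likely in the API layer than in the frontend.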
### OCR Sandbox still uses Tesseract only
This is expected until Typhoon OCR runtime integration is implemented in:
- `backend/src/modules/ai/services/ocr.service.ts`
- `backend/src/modules/ai/processors/ai-batch.processor.ts`
- `specs/04-Infrastructure-OPS/04-00-docker-compose/Desk-5439/ocr-sidecar/app.py`
## Security Notes
- All AI admin endpoints require `system.manage_all`
- AI models remain on-premises only per ADR-023 / ADR-023A
- OCR results must stay behind the DMS backend boundary
- Do not treat Typhoon OCR as production-ready until fallback, queueing, and audit coverage are implemented end-to-end