// File: specs/200-fullstacks/236-unified-ocr-architecture/research.md
// Change Log:
// - 2026-06-13: Research decisions from ADR-036

Research: Unified AI Model Architecture — Sandbox-Production Parity

Overview

This document consolidates technical decisions from ADR-036 for the Unified AI Model Architecture feature. All decisions are already ratified in ADR-036; this document serves as a quick reference for implementation.

Decisions

D1: Calibration on Existing Profile/Prompt Stores

Decision: Reuse the existing ai_execution_profiles table as the production parameter store and create a new ai_sandbox_profiles table as the draft store. Do not create a new parameter store in system_settings.

Rationale:

  • Existing ai_execution_profiles already has the right structure (profile_name, temperature, top_p, etc.)
  • Adding canonical_model column distinguishes np-dms-ai vs np-dms-ocr
  • Avoids schema bloat and migration complexity
  • Leverages existing Redis cache in AiPolicyService

Alternatives Considered:

  • Create new ai_model_parameters table → Rejected: unnecessary duplication
  • Use system_settings JSON → Rejected: loses type safety and queryability
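
The split between the draft store and the production store can be pictured with a minimal sketch. The row shapes below are assumptions for illustration only: `draftOwner`, `updatedAt`, and `promoteDraft` are hypothetical names, and the real tables may carry more columns.

```typescript
type CanonicalModel = "np-dms-ai" | "np-dms-ocr";

// Parameters shared by both stores (subset of the columns named in D1).
interface ProfileParams {
  profileName: string;
  canonicalModel: CanonicalModel;
  temperature: number;
  topP: number;
}

// Hypothetical row in ai_sandbox_profiles (draft store).
interface SandboxProfileRow extends ProfileParams {
  draftOwner: string; // assumed column, not taken from ADR-036
}

// Hypothetical row in ai_execution_profiles (production store).
interface ExecutionProfileRow extends ProfileParams {
  updatedAt: Date; // assumed column, not taken from ADR-036
}

// Copy only the tunable parameters from a draft into a production row.
function promoteDraft(draft: SandboxProfileRow, now: Date): ExecutionProfileRow {
  return {
    profileName: draft.profileName,
    canonicalModel: draft.canonicalModel,
    temperature: draft.temperature,
    topP: draft.topP,
    updatedAt: now,
  };
}
```

Keeping the two row types structurally close is what makes promotion a plain field copy rather than a transformation.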

D2: Dual-Model Parameter Management

Decision: Store OCR parameters in a dedicated row, ocr-extract, with canonical_model='np-dms-ocr'. Make numCtx and maxTokens nullable, because the OCR model does not use them.

Rationale:

  • OCR has different parameter requirements than LLM (no context window, no max tokens)
  • Single table with canonical_model column simplifies queries
  • Nullable columns allow row-level variation without schema fragmentation

Alternatives Considered:

  • Separate ai_ocr_profiles table → Rejected: adds join complexity
  • JSON blob for model-specific params → Rejected: loses queryability
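
As a rough illustration of how nullable columns let one table serve both models, the sketch below builds a model options object and omits the fields that are null. The option keys `num_ctx` and `num_predict` follow Ollama's naming; mapping maxTokens to `num_predict` is an assumption, as are the function name `toModelOptions` and the example profile name `llm-example`.

```typescript
type CanonicalModel = "np-dms-ai" | "np-dms-ocr";

interface ProfileRow {
  profileName: string;
  canonicalModel: CanonicalModel;
  temperature: number;
  topP: number;
  numCtx: number | null; // null on the OCR row
  maxTokens: number | null; // null on the OCR row
}

// Build the options sent to the model, skipping columns that are null.
function toModelOptions(row: ProfileRow): Record<string, number> {
  const options: Record<string, number> = {
    temperature: row.temperature,
    top_p: row.topP,
  };
  if (row.numCtx !== null) options["num_ctx"] = row.numCtx;
  if (row.maxTokens !== null) options["num_predict"] = row.maxTokens; // assumed mapping
  return options;
}

// Usage: the OCR row yields only the parameters the OCR model accepts.
const ocrOptions = toModelOptions({
  profileName: "ocr-extract",
  canonicalModel: "np-dms-ocr",
  temperature: 0.1,
  topP: 0.6,
  numCtx: null,
  maxTokens: null,
});
console.log(Object.keys(ocrOptions)); // temperature and top_p only
```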

D3: Snapshot Semantics

Decision: Parameters are frozen at job dispatch time (snapshot), not lazy-loaded during processing. keep_alive is excluded from snapshot (lazy-loaded per ADR-033).

Rationale:

  • Ensures job consistency regardless of subsequent parameter changes
  • Allows safe parameter tuning without affecting running jobs
  • keep_alive is a resource parameter, not a model parameter (ADR-033)

Alternatives Considered:

  • Lazy-load parameters during processing → Rejected: race condition risk
  • Include keep_alive in snapshot → Rejected: violates ADR-033 residency logic
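
The snapshot rule can be sketched as a copy taken at dispatch time that leaves keep_alive out. This is a minimal sketch under assumed names (`ProfileParams`, `snapshotAtDispatch`, and the `keepAlive` field are hypothetical); the actual dispatch code is not shown in this document.

```typescript
interface ProfileParams {
  temperature: number;
  topP: number;
  keepAlive: string; // resource setting, resolved at run time per ADR-033
}

// The snapshot type has no keepAlive field at all.
type SnapshotParams = Readonly<Omit<ProfileParams, "keepAlive">>;

// Copy the model parameters at dispatch time and freeze the copy.
function snapshotAtDispatch(current: ProfileParams): SnapshotParams {
  return Object.freeze({
    temperature: current.temperature,
    topP: current.topP,
  });
}

// Usage: later tuning of the live profile does not reach the queued job.
const liveProfile: ProfileParams = { temperature: 0.2, topP: 0.9, keepAlive: "5m" };
const jobSnapshot = snapshotAtDispatch(liveProfile);
liveProfile.temperature = 0.8;
console.log(jobSnapshot.temperature); // still the value captured at dispatch
```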

D4: Dual-Model Snapshot for OCR+LLM Jobs

Decision: Support ocrSnapshotParams (OCR) and snapshotParams (LLM) in AiJobPayload for jobs that use both models.

Rationale:

  • Migration jobs use both OCR and LLM
  • Each model needs its own parameter set
  • Separation allows independent tuning

Alternatives Considered:

  • Single snapshot with union of params → Rejected: unclear which params apply to which model
  • Job-level model selection → Rejected: adds complexity to processor logic
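
A possible shape for the payload is sketched below. Only `AiJobPayload`, `snapshotParams`, and `ocrSnapshotParams` come from the decision; the field lists, the `jobType` discriminator, and the two accessor functions are assumptions for illustration.

```typescript
interface LlmSnapshot {
  temperature: number;
  topP: number;
  numCtx: number;
  maxTokens: number;
}

interface OcrSnapshot {
  temperature: number;
  topP: number;
  repeatPenalty: number;
}

interface AiJobPayload {
  jobType: string; // hypothetical discriminator
  snapshotParams?: LlmSnapshot; // LLM parameters, if the job calls the LLM
  ocrSnapshotParams?: OcrSnapshot; // OCR parameters, if the job calls OCR
}

// Each processing step reads only the snapshot that belongs to its model.
function requireOcrParams(payload: AiJobPayload): OcrSnapshot {
  if (!payload.ocrSnapshotParams) {
    throw new Error(`job "${payload.jobType}" has no OCR snapshot`);
  }
  return payload.ocrSnapshotParams;
}

function requireLlmParams(payload: AiJobPayload): LlmSnapshot {
  if (!payload.snapshotParams) {
    throw new Error(`job "${payload.jobType}" has no LLM snapshot`);
  }
  return payload.snapshotParams;
}
```

Keeping the two snapshots in separate fields means an OCR temperature can never be read by the LLM step by accident, which is the ambiguity the rejected union approach would introduce.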

D5: Master Data Context Parity in Sandbox

Decision: Require project selection in sandbox tests (no 'default' project). Use selected project/contract context for master data lookup.

Rationale:

  • Eliminates parity gap where sandbox used 'default' while production used real project
  • Ensures sandbox tests accurately reflect production behavior
  • {{master_data_context}} in prompts will match production

Alternatives Considered:

  • Keep 'default' project for sandbox → Rejected: inaccurate test results
  • Auto-select first project → Rejected: hides context selection UI
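
One way to enforce the project requirement is a guard at the start of the sandbox test handler. The sketch below is hypothetical: the request shape, field names, and error text are assumptions, and the real service may validate through a DTO instead.

```typescript
interface SandboxTestRequest {
  projectId?: string; // assumed field name
  contractId?: string; // assumed field name
  prompt: string;
}

interface SandboxContext {
  projectId: string;
  contractId: string | null;
}

// Reject requests that omit the project or fall back to 'default'.
function resolveSandboxContext(req: SandboxTestRequest): SandboxContext {
  const projectId = req.projectId?.trim();
  if (!projectId || projectId === "default") {
    throw new Error("Sandbox tests require an explicit project selection");
  }
  return { projectId, contractId: req.contractId ?? null };
}
```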

D6: System Prompt Integration

Decision: System prompts are managed via ADR-029 (ai_prompts table) and are not duplicated in the parameter store. The parameter interface links to the Prompt Version UI.

Rationale:

  • ADR-029 already has versioning, approval workflow, and audit trail
  • Avoids duplication and maintenance burden
  • Clear separation of concerns (prompts vs runtime parameters)

Alternatives Considered:

  • Store system prompt in ai_execution_profiles → Rejected: duplicates ADR-029
  • Inline system prompt in sandbox draft → Rejected: loses versioning

D7: Model Name Alignment

Decision: Rename the models across the codebase: typhoon2.5-np-dms becomes np-dms-ai and typhoon-np-dms-ocr becomes np-dms-ocr.

Rationale:

  • Canonical names are shorter and more semantic
  • Aligns with ADR-034 decision
  • Simplifies documentation and communication

Alternatives Considered:

  • Keep typhoon names → Rejected: inconsistent with ADR-034
  • Use generic names (main/ocr) → Rejected: loses semantic meaning
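
During the rename, a small lookup can normalize any legacy name still present in stored rows or configuration. This is a sketch only: the helper name is hypothetical, and the pairing of legacy to canonical names is inferred from the order in which the decision lists them.

```typescript
// Legacy name -> canonical name (pairing inferred from the decision text).
const LEGACY_TO_CANONICAL: Readonly<Record<string, string>> = {
  "typhoon2.5-np-dms": "np-dms-ai",
  "typhoon-np-dms-ocr": "np-dms-ocr",
};

// Return the canonical name; names that are already canonical pass through.
function toCanonicalModelName(name: string): string {
  return LEGACY_TO_CANONICAL[name] ?? name;
}
```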

D8: Security Guardrails

Decision: The apply endpoint requires Idempotency-Key validation, the CASL permission system.manage_ai, and parameter range validation (temperature and topP must each lie between 0 and 1).

Rationale:

  • Idempotency-Key prevents duplicate applies
  • CASL enforces RBAC
  • Range validation prevents invalid parameters
  • Audit logging tracks all changes

Alternatives Considered:

  • Skip Idempotency-Key → Rejected: risk of duplicate applies
  • Use weaker permission → Rejected: security risk
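
Two of these guardrails, range validation and the Idempotency-Key check, can be sketched without any framework code. The sketch assumes the 5-minute window listed under Security Considerations and uses an in-memory map as a stand-in for whatever shared store the service actually uses; `validateRanges` and `IdempotencyWindow` are hypothetical names, and the CASL check is not shown.

```typescript
const IDEMPOTENCY_WINDOW_MS = 5 * 60 * 1000; // 5-minute window

// Return a list of validation errors; an empty list means the input is valid.
function validateRanges(params: { temperature: number; topP: number }): string[] {
  const errors: string[] = [];
  const inUnitRange = (v: number): boolean => Number.isFinite(v) && v >= 0 && v <= 1;
  if (!inUnitRange(params.temperature)) errors.push("temperature must be between 0 and 1");
  if (!inUnitRange(params.topP)) errors.push("topP must be between 0 and 1");
  return errors;
}

// In-memory stand-in for an idempotency store.
class IdempotencyWindow {
  private readonly seen = new Map<string, number>();
  private readonly windowMs: number;

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  // Returns true if the key is new (or its window expired), false on a repeat.
  register(key: string, nowMs: number): boolean {
    const firstSeen = this.seen.get(key);
    if (firstSeen !== undefined && nowMs - firstSeen < this.windowMs) {
      return false;
    }
    this.seen.set(key, nowMs);
    return true;
  }
}
```

Passing the clock in as `nowMs` keeps the window logic deterministic in tests; a handler would pass `Date.now()`.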

D9: Cache Invalidation

Decision: Invalidate Redis cache after applying parameters to production.

Rationale:

  • Ensures new jobs use updated parameters
  • Prevents stale cache issues
  • Simple DEL operation on cache key

Alternatives Considered:

  • Wait for cache TTL → Rejected: delayed effect
  • No cache invalidation → Rejected: stale parameters
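
The write-then-invalidate order can be sketched with an in-memory cache standing in for Redis. Everything named here is an assumption for illustration: the `ai:profile:` key prefix, the helper names, and the synchronous calls (a real Redis client call would be awaited).

```typescript
interface ProfileParams {
  temperature: number;
  topP: number;
}

// In-memory stand-in for the Redis cache; del mirrors Redis DEL's count result.
class InMemoryCache {
  private readonly entries = new Map<string, string>();
  get(key: string): string | null {
    return this.entries.get(key) ?? null;
  }
  set(key: string, value: string): void {
    this.entries.set(key, value);
  }
  del(key: string): number {
    return this.entries.delete(key) ? 1 : 0;
  }
}

// Hypothetical cache key format.
const cacheKey = (profileName: string): string => `ai:profile:${profileName}`;

// Read-through lookup: serve from cache, otherwise load from the store and cache it.
function readProfile(
  cache: InMemoryCache,
  store: Map<string, ProfileParams>,
  profileName: string,
): ProfileParams | null {
  const cached = cache.get(cacheKey(profileName));
  if (cached !== null) return JSON.parse(cached) as ProfileParams;
  const row = store.get(profileName) ?? null;
  if (row !== null) cache.set(cacheKey(profileName), JSON.stringify(row));
  return row;
}

// Apply: persist the new parameters first, then delete the cached copy.
function applyToProduction(
  cache: InMemoryCache,
  store: Map<string, ProfileParams>,
  profileName: string,
  params: ProfileParams,
): void {
  store.set(profileName, params);
  cache.del(cacheKey(profileName));
}
```

Deleting after the write, rather than before it, avoids a reader repopulating the cache with the old row between the two steps.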

D10: OCR Parameter Wiring to Sidecar

Decision: Add typhoonOptions to OcrDetectionInput and append temperature, topP, and repeatPenalty to the form data sent to the sidecar.

Rationale:

  • Sidecar already accepts overrides via form data
  • Allows OCR model tuning without sidecar changes
  • Maintains existing contract

Alternatives Considered:

  • Modify sidecar API → Rejected: unnecessary infrastructure change
  • Hardcode params in sidecar → Rejected: loses tunability
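
The wiring might look like the sketch below, which appends only the options that are set. The form field names are assumed to match the property names in the decision; the sidecar's actual field names are not given here. `FormFieldSink` is a hypothetical minimal interface that the standard `FormData` object should satisfy, which keeps the sketch free of runtime dependencies.

```typescript
interface TyphoonOptions {
  temperature?: number;
  topP?: number;
  repeatPenalty?: number;
}

// Minimal subset of FormData used here (hypothetical helper type).
interface FormFieldSink {
  append(name: string, value: string): void;
}

// Append only the overrides that are present, so the sidecar keeps its
// own defaults for anything left unset.
function appendTyphoonOptions(form: FormFieldSink, options?: TyphoonOptions): void {
  if (!options) return;
  if (options.temperature !== undefined) form.append("temperature", String(options.temperature));
  if (options.topP !== undefined) form.append("topP", String(options.topP));
  if (options.repeatPenalty !== undefined) form.append("repeatPenalty", String(options.repeatPenalty));
}
```

Because unset options add no fields, a request without typhoonOptions is identical to what the sidecar receives today, which is how the existing contract is preserved.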

Technology Stack

  • Backend: NestJS 11, TypeORM, Redis, BullMQ
  • Frontend: Next.js 16, TanStack Query, React Hook Form, Zod
  • Database: MariaDB 11.8
  • Testing: Jest (backend), Vitest + Playwright (frontend)

Performance Targets

  • Apply operation: <2s (including cache invalidation)
  • Sandbox test cycle: <5s (test → apply → verify)
  • Cache invalidation: <100ms

Security Considerations

  • CASL guard on apply endpoint
  • Idempotency-Key validation (5-minute window)
  • Parameter range validation (temperature/topP 0-1)
  • Audit logging for all apply operations
  • No direct DB/storage access from AI (ADR-023/023A)

Dependencies

  • ADR-029: Dynamic Prompt Management (system prompt integration)
  • ADR-033: Adaptive OCR Residency (keep_alive lazy-loading)
  • ADR-034: AI Model Change (canonical model names)
  • Existing AiPolicyService with Redis cache
  • Existing ai_audit_logs table