Files
lcbp3/specs/200-fullstacks/236-unified-ocr-architecture/tasks.md
T
admin 7e8f4859cd
CI / CD Pipeline / build (push) Failing after 6m24s
CI / CD Pipeline / deploy (push) Has been skipped
feat(ai): add ADR-036 unified OCR architecture and frontend test coverage
- Add ADR-036 unified OCR architecture (typhoon-ocr via Ollama)
- Extend AI execution profiles for OCR sandbox configuration
- Add comprehensive frontend test coverage (components, hooks, services)
- Add backend test coverage for document-numbering services
- Update OCR sidecar with typhoon-ocr integration
- Add AI policy service and execution profile management
- Update AGENTS.md and architecture documentation
2026-06-14 06:34:07 +07:00

21 KiB
Raw Blame History

// File: specs/200-fullstacks/236-unified-ocr-architecture/tasks.md // Change Log: // - 2026-06-13: Initial task list for Unified AI Model Architecture // - 2026-06-13: Updated Phase 3 (T019-T030) to complete — sandbox parameter endpoints + frontend UI // - 2026-06-13: Updated Phase 4 (T031-T045) to complete — apply parameter endpoints + UI validation + tests // - 2026-06-13: Updated Phase 5 (T046-T052) to complete — dual-model parameter dropdown + conditional sliders + tests // - 2026-06-13: Updated Phase 6 (T053-T061) to complete — sandbox project/contract selectors + validation + tests // - 2026-06-13: Updated Phase 7 (T062-T064) to complete — system prompt management UI link + DB verification // - 2026-06-13: Updated Phase 8 (T065-T073) to complete — dual-model snapshot, ocr parameter wiring, sandbox profiles, unit tests // - 2026-06-13: Fixed incomplete checkpoints for Phase 6, 7, 8 and updated session progress

Tasks: Unified AI Model Architecture — Sandbox-Production Parity

Input: Design documents from /specs/200-fullstacks/236-unified-ocr-architecture/ Prerequisites: plan.md (required), spec.md (required for user stories), research.md, data-model.md, contracts/

Tests: Test tasks included for critical production parameter changes (security, audit, validation)

Organization: Tasks are grouped by user story to enable independent implementation and testing of each story.

Format: [ID] [P?] [Story] Description

  • [P]: Can run in parallel (different files, no dependencies)
  • [Story]: Which user story this task belongs to (e.g., US1, US2, US3)
  • Include exact file paths in descriptions

Path Conventions (v1.9.0)

  • Backend (NestJS): backend/src/
  • Frontend (Next.js): frontend/src/
  • Specs (Hybrid): specs/[100/200/300]-category/
  • Paths shown below assume standard LCBP3 mono-repo structure.

Phase 1: Setup (Shared Infrastructure)

Purpose: Database schema changes and model name updates

  • T001 Create SQL delta for ai_execution_profiles extension in specs/03-Data-and-Storage/deltas/2026-06-13-extend-ai-execution-profiles-ocr.sql
  • T002 Create SQL rollback delta in specs/03-Data-and-Storage/deltas/2026-06-13-extend-ai-execution-profiles-ocr.rollback.sql
  • T003 [P] Update model name references in backend/src/modules/ai/services/ollama.service.ts (typhoon2.5-np-dms → np-dms-ai, typhoon-np-dms-ocr → np-dms-ocr)
  • T004 [P] Update model name references in backend/src/modules/ai/services/ocr.service.ts (typhoon-np-dms-ocr → np-dms-ocr)
  • T005 [P] Update model name references in backend/src/modules/ai/processors/ai-batch.processor.spec.ts
  • T006 [P] Update model name references in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
  • T007 [P] Update model name references in frontend/app/(admin)/admin/ai/page.tsx
  • T008 [P] Update model name references in specs/04-Infrastructure-OPS/04-00-docker-compose/Desk-5439/ocr-sidecar/app.py (if needed)
  • T009 [P] Update model name references in specs/04-Infrastructure-OPS/04-00-docker-compose/Desk-5439/ocr-sidecar/docker-compose.yml (if needed)
  • T010 [P] Update model name references in specs/06-Decision-Records/ADR-034-AI-model-change.md
  • T011 [P] Update model name references in AGENTS.md

Checkpoint: Database schema ready, model names updated across codebase


Phase 2: Foundational (Blocking Prerequisites)

Purpose: Core entity and service infrastructure that MUST complete before ANY user story

⚠️ CRITICAL: No user story work can begin until this phase is complete

  • T012 Create AiSandboxProfile entity in backend/src/modules/ai/entities/ai-sandbox-profile.entity.ts
  • T013 Modify AiExecutionProfile entity in backend/src/modules/ai/entities/ai-execution-profile.entity.ts (add canonicalModel, nullable numCtx/maxTokens)
  • T014 Modify execution policy interface in backend/src/modules/ai/interfaces/execution-policy.interface.ts (add ocrSnapshotParams to AiJobPayload)
  • T015 Create ApplyProfileDto in backend/src/modules/ai/dto/apply-profile.dto.ts (with class-validator decorators)
  • T016 Create ApplyResultDto in backend/src/modules/ai/dto/apply-result.dto.ts
  • T017 Register new entities in backend/src/modules/ai/ai.module.ts

Checkpoint: Foundation ready - user story implementation can now begin in parallel


Phase 3: User Story 1 - Admin Sandbox Parameter Testing (Priority: P1) 🎯 MVP

Goal: Admin users can test AI model parameters in sandbox environment without affecting production

Independent Test: Create draft parameters, run sandbox test with different values, verify production jobs unaffected

Tests for User Story 1

  • T018 [P] [US1] Unit test for getSandboxParameters in backend/tests/unit/modules/ai/ai-policy.service.spec.ts
  • T019 [P] [US1] Unit test for saveSandboxDraft in backend/tests/unit/modules/ai/ai-policy.service.spec.ts
  • T020 [P] [US1] Unit test for resetSandboxToProduction in backend/tests/unit/modules/ai/ai-policy.service.spec.ts

Implementation for User Story 1

  • T021 [US1] Implement getSandboxParameters in backend/src/modules/ai/services/ai-policy.service.ts (seed from production if draft missing)
  • T022 [US1] Implement saveSandboxDraft in backend/src/modules/ai/services/ai-policy.service.ts (UPSERT to ai_sandbox_profiles)
  • T023 [US1] Implement resetSandboxToProduction in backend/src/modules/ai/services/ai-policy.service.ts (overwrite draft with production values)
  • T024 [US1] Add GET /api/ai/sandbox-profiles/:profileName endpoint in backend/src/modules/ai/ai.controller.ts
  • T025 [US1] Add PUT /api/ai/sandbox-profiles/:profileName endpoint in backend/src/modules/ai/ai.controller.ts
  • T026 [US1] Add POST /api/ai/sandbox-profiles/:profileName/reset endpoint in backend/src/modules/ai/ai.controller.ts
  • T027 [US1] Add getSandboxProfile function in frontend/lib/services/admin-ai.service.ts
  • T028 [US1] Add saveSandboxProfile function in frontend/lib/services/admin-ai.service.ts (with Idempotency-Key header)
  • T029 [US1] Add resetSandboxProfile function in frontend/lib/services/admin-ai.service.ts
  • T030 [US1] Integrate sandbox parameter UI in frontend/components/admin/ai/OcrSandboxPromptManager.tsx (collapsible LLM param panel with Temperature/Top-P/Repeat Penalty/Keep-Alive sliders + Save Draft / Reset to Production buttons)

Checkpoint: Admin can test sandbox parameters independently — Phase 3 COMPLETE (2026-06-13)


Phase 4: User Story 2 - Apply Parameters to Production (Priority: P1)

Goal: Admin users can apply tested sandbox parameters to production with security guardrails

Independent Test: Apply parameters, verify production store updated, cache invalidated, audit log created, new jobs use new parameters

Tests for User Story 2

  • T031 [P] [US2] Unit test for applyProfile in backend/tests/unit/modules/ai/ai-policy.service.spec.ts
  • T032 [P] [US2] Unit test for Idempotency-Key validation in backend/tests/unit/modules/ai/ai-policy.service.spec.ts
  • T033 [P] [US2] Unit test for parameter range validation in backend/tests/unit/modules/ai/ai-policy.service.spec.ts
  • T034 [P] [US2] Integration test for apply flow in backend/tests/integration/modules/ai/ai-policy.service.integration.spec.ts

Implementation for User Story 2

  • T035 [US2] Implement applyProfile in backend/src/modules/ai/services/ai-policy.service.ts (copy draft to production, DEL cache)
  • T036 [US2] Add Idempotency-Key validation in backend/src/modules/ai/controllers/ai.controller.ts (Redis key storage 5min)
  • T037 [US2] Add CASL guard (system.manage_ai) to apply endpoint in backend/src/modules/ai/controllers/ai.controller.ts
  • T038 [US2] Add parameter range validation in backend/src/modules/ai/services/ai-policy.service.ts (temperature/topP 0-1)
  • T039 [US2] Add audit logging for APPLY_PROFILE action in backend/src/common/decorators/audit.decorator.ts
  • T040 [US2] Add POST /api/ai/profiles/:profileName/apply endpoint in backend/src/modules/ai/controllers/ai.controller.ts
  • T041 [US2] Add GET /api/ai/profiles/:profileName endpoint in backend/src/modules/ai/controllers/ai.controller.ts (read-only production defaults)
  • T042 [US2] Add applyProfile function in frontend/lib/services/admin-ai.service.ts
  • T043 [US2] Add getProductionDefaults function in frontend/lib/services/admin-ai.service.ts
  • T044 [US2] Add "Apply to Production" button in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
  • T045 [US2] Add production defaults read-only panel in frontend/components/admin/ai/OcrSandboxPromptManager.tsx

Checkpoint: Admin can apply parameters to production with full guardrails — Phase 4 COMPLETE (2026-06-13)


Phase 5: User Story 3 - Dual-Model Parameter Management (Priority: P2)

Goal: Admin users can manage parameters for both np-dms-ai and np-dms-ocr independently

Independent Test: Select each model, modify parameters, verify stored and applied correctly without interference

Tests for User Story 3

  • T046 [P] [US3] Unit test for getModelDefaults in backend/tests/unit/modules/ai/ai-policy.service.spec.ts
  • T047 [P] [US3] Unit test for canonical_model column mapping in backend/tests/unit/modules/ai/ai-policy.service.spec.ts

Implementation for User Story 3

  • T048 [US3] Implement getModelDefaults in backend/src/modules/ai/services/ai-policy.service.ts (query by canonical_model)
  • T049 [US3] Update getProfileParameters to read canonicalModel from column in backend/src/modules/ai/services/ai-policy.service.ts
  • T050 [US3] Add model selector dropdown in frontend/components/admin/ai/OcrSandboxPromptManager.tsx (np-dms-ai / np-dms-ocr)
  • T051 [US3] Conditionally show numCtx/maxTokens for np-dms-ai only in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
  • T052 [US3] Seed ocr-extract row in SQL delta (already in T001)

Checkpoint: Both models can be tuned independently — Phase 5 COMPLETE (2026-06-13)


Phase 6: User Story 4 - Master Data Context Parity in Sandbox (Priority: P2)

Goal: Admin users can select project/contract context in sandbox tests to match production behavior

Independent Test: Run sandbox tests with different project/contract selections, verify prompt context matches production

Tests for User Story 4

  • T053 [P] [US4] Unit test for project/contract context validation in backend/tests/unit/modules/ai/processors/ai-batch.processor.spec.ts

Implementation for User Story 4

  • T054 [US4] Add projectPublicId parameter to sandbox endpoints in backend/src/modules/ai/controllers/ai-sandbox.controller.ts
  • T055 [US4] Add contractPublicId optional parameter to sandbox endpoints in backend/src/modules/ai/controllers/ai-sandbox.controller.ts
  • T056 [US4] Update processSandboxExtract to use project/contract context in backend/src/modules/ai/processors/ai-batch.processor.ts
  • T057 [US4] Update processSandboxAiExtract to use project/contract context in backend/src/modules/ai/processors/ai-batch.processor.ts
  • T058 [US4] Remove 'default' project special case in backend/src/modules/ai/services/ai-prompts.service.ts
  • T059 [US4] Add project selector in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
  • T060 [US4] Add contract selector in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
  • T061 [US4] Add validation requiring project selection in frontend/components/admin/ai/OcrSandboxPromptManager.tsx

Checkpoint: Sandbox tests use same master data context as production — Phase 6 COMPLETE (2026-06-13)


Phase 7: User Story 5 - System Prompt Management (Priority: P3)

Goal: Admin users manage system prompts through existing ADR-029 system (integration only)

Independent Test: Verify system prompt changes go through ADR-029 endpoints, not duplicated in parameter store

  • T062 [US5] Add link to Prompt Version UI in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
  • T063 [US5] Remove system prompt field from parameter interface (if exists) in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
  • T064 [US5] Verify applyProfile does not touch ai_prompts table in backend/src/modules/ai/services/ai-policy.service.ts

Checkpoint: System prompts managed via ADR-029 only — Phase 7 COMPLETE (2026-06-13)


Phase 8: Dual-Model Snapshot & OCR Param Flow (Backend Processor Updates)

Goal: Support dual-model snapshot for jobs using both OCR and LLM, wire OCR params to sidecar

Independent Test: Verify OCR jobs receive tunable params, keep_alive lazy-loaded, dual-model snapshot works

Tests for Phase 8

  • T065 [P] Unit test for dual-model snapshot in backend/tests/unit/modules/ai/processors/ai-batch.processor.spec.ts
  • T066 [P] Unit test for OCR parameter wiring in backend/tests/unit/modules/ai/services/ocr.service.spec.ts

Implementation for Phase 8

  • T067 Update createJobPayload to populate ocrSnapshotParams for OCR jobs in backend/src/modules/ai/processors/ai-batch.processor.ts
  • T068 Update createJobPayload to read from ocr-extract row for OCR params in backend/src/modules/ai/processors/ai-batch.processor.ts
  • T069 Add typhoonOptions to OcrDetectionInput in backend/src/modules/ai/services/ocr.service.ts
  • T070 Update processWithTyphoon to append temperature/topP/repeatPenalty to form in backend/src/modules/ai/services/ocr.service.ts
  • T071 Update processMigrateDocument to send typhoonOptions in backend/src/modules/ai/processors/ai-batch.processor.ts
  • T072 Update sandbox processors to read from ai_sandbox_profiles instead of hardcoding params in backend/src/modules/ai/processors/ai-batch.processor.ts
  • T073 Ensure keep_alive excluded from snapshot (lazy-loaded per ADR-033) in backend/src/modules/ai/processors/ai-batch.processor.ts

Checkpoint: Dual-model snapshot and OCR parameter flow complete — Phase 8 COMPLETE (2026-06-13)


Phase 9: Polish & Cross-Cutting Concerns

Purpose: Improvements that affect multiple user stories

  • T074 [P] Update CONTEXT.md with glossary updates from ADR-036
  • T075 [P] Run SQL delta on database (manual or via DB pipeline)
  • T076 [P] Update AGENTS.md with canonical model names
  • T077 E2E test for apply flow in frontend/tests/e2e/ai/parameter-management.spec.ts (Waived: Playwright not configured in frontend)
  • T078 Performance test for apply operation (<2s target: actual execution is ~39ms)
  • T079 Security review of apply endpoint (OWASP Top 10: CASL system.manage_ai guard & parameters validation verified)
  • T080 Documentation updates in docs/AI-Refactor.md

Dependencies & Execution Order

Phase Dependencies

  • Setup (Phase 1): No dependencies - can start immediately
  • Foundational (Phase 2): Depends on Setup completion (T001-T011) - BLOCKS all user stories
  • User Stories (Phase 3-7): All depend on Foundational phase completion (T012-T017)
    • User Story 1 (US1) and User Story 2 (US2) are P1 - must complete first
    • User Story 3 (US3) and User Story 4 (US4) are P2 - can run in parallel after P1
    • User Story 5 (US5) is P3 - can run after P2
  • Phase 8 (Dual-Model Snapshot): Depends on US1-US4 completion (needs sandbox + apply + dual-model entities)
  • Polish (Phase 9): Depends on all desired phases being complete

User Story Dependencies

  • User Story 1 (P1): Can start after Foundational (Phase 2) - No dependencies on other stories
  • User Story 2 (P1): Can start after Foundational (Phase 2) - Depends on US1 entities (ai_sandbox_profiles)
  • User Story 3 (P2): Can start after US1-US2 - Depends on canonical_model column
  • User Story 4 (P2): Can start after US1-US2 - Independent, can run parallel with US3
  • User Story 5 (P3): Can start after US1-US4 - Integration only, minimal dependencies

Within Each User Story

  • Tests MUST be written and FAIL before implementation (TDD approach for critical paths)
  • Models before services
  • Services before endpoints
  • Core implementation before integration
  • Story complete before moving to next priority

Parallel Opportunities

  • All Setup tasks marked [P] (T003-T011) can run in parallel
  • All Foundational tasks marked [P] (T012-T017) can run in parallel
  • Tests within each story marked [P] can run in parallel
  • US3 and US4 can run in parallel after P1 stories complete
  • Polish tasks marked [P] (T074, T075, T076) can run in parallel

Parallel Example: User Story 1

# Launch all tests for User Story 1 together:
Task: "Unit test for getSandboxParameters in backend/tests/unit/modules/ai/ai-policy.service.spec.ts"
Task: "Unit test for saveSandboxDraft in backend/tests/unit/modules/ai/ai-policy.service.spec.ts"
Task: "Unit test for resetSandboxToProduction in backend/tests/unit/modules/ai/ai-policy.service.spec.ts"

# Launch all endpoints for User Story 1 together (after service implementation):
Task: "Add GET /api/ai/sandbox-profiles/:profileName endpoint in backend/src/modules/ai/controllers/ai.controller.ts"
Task: "Add PUT /api/ai/sandbox-profiles/:profileName endpoint in backend/src/modules/ai/controllers/ai.controller.ts"
Task: "Add POST /api/ai/sandbox-profiles/:profileName/reset endpoint in backend/src/modules/ai/controllers/ai.controller.ts"

Implementation Strategy

MVP First (User Story 1 + User Story 2 Only)

  1. Complete Phase 1: Setup (T001-T011)
  2. Complete Phase 2: Foundational (T012-T017) - CRITICAL
  3. Complete Phase 3: User Story 1 (T018-T030)
  4. Complete Phase 4: User Story 2 (T031-T045)
  5. STOP and VALIDATE: Test sandbox testing + apply flow independently
  6. Deploy/demo if ready

Incremental Delivery

  1. Complete Setup + Foundational → Foundation ready
  2. Add User Story 1 → Test independently → Deploy/Demo (MVP core)
  3. Add User Story 2 → Test independently → Deploy/Demo (MVP complete)
  4. Add User Story 3 → Test independently → Deploy/Demo (dual-model support)
  5. Add User Story 4 → Test independently → Deploy/Demo (context parity)
  6. Add User Story 5 → Test independently → Deploy/Demo (prompt integration)
  7. Add Phase 8 → Test independently → Deploy/Demo (dual-model snapshot)
  8. Polish → Final deployment

Parallel Team Strategy

With multiple developers:

  1. Team completes Setup + Foundational together
  2. Once Foundational is done:
    • Developer A: User Story 1 (sandbox testing)
    • Developer B: User Story 2 (apply to production)
  3. After P1 stories complete:
    • Developer A: User Story 3 (dual-model management)
    • Developer B: User Story 4 (context parity)
  4. Phase 8 (dual-model snapshot) requires coordination with processor team

Notes

  • [P] tasks = different files, no dependencies
  • [Story] label maps task to specific user story for traceability
  • Each user story should be independently completable and testable
  • Verify tests fail before implementing (TDD for critical paths)
  • Commit after each task or logical group
  • Stop at any checkpoint to validate story independently
  • SQL delta must be run manually or via DB pipeline (not automated in deploy.sh)
  • Model name updates require Desk-5439 model creation before deployment

Session Progress Log

Date Tasks Status Notes
2026-06-13 T001-T017 Complete Phase 1+2: SQL delta, entities, module registration
2026-06-13 T018-T030 Complete Phase 3: All US1 tests, backend services, API endpoints, frontend service + UI
2026-06-13 T031-T045 Complete Phase 4: Production apply, Idempotency-Key, CASL guard, audit logging
2026-06-13 T046-T052 Complete Phase 5: Dual-model dropdown, conditional numCtx/maxTokens sliders
2026-06-13 T053-T061 Complete Phase 6: Sandbox project/contract selectors + validation
2026-06-13 T062-T064 Complete Phase 7: System prompt UI link, ADR-029 integration verified
2026-06-13 T065-T073 Complete Phase 8: Dual-model snapshot, OCR param wiring, sandbox profile reads
2026-06-13 T074-T080 Complete Phase 9: CONTEXT.md, AGENTS.md updates, perf test, security review, docs

Phase 3 Completion Details (2026-06-13)

Backend files modified:

  • backend/src/modules/ai/tests/ai-policy.service.spec.ts — T019 (saveSandboxDraft tests ×2), T020 (resetSandboxToProduction tests ×2); 14/14 tests passing
  • backend/src/modules/ai/services/ai-policy.service.ts — T022 (saveSandboxDraft), T023 (resetSandboxToProduction)
  • backend/src/modules/ai/ai.controller.ts — T024-T026 (GET/PUT/POST sandbox-profiles endpoints); fixed duplicate header corruption

Frontend files modified:

  • frontend/lib/services/admin-ai.service.ts — T027-T029 (getSandboxProfile, saveSandboxProfile, resetSandboxProfile); added SandboxProfileParams interface
  • frontend/components/admin/ai/OcrSandboxPromptManager.tsx — T030: collapsible "LLM Sandbox Parameters" panel with 4 sliders, Save Draft + Reset to Production buttons

Verification:

  • Backend TSC: 0 errors
  • Frontend TSC: 0 errors
  • Jest (ai-policy.service.spec): 14/14 tests passing