// File: specs/200-fullstacks/236-unified-ocr-architecture/tasks.md
// Change Log:
// - 2026-06-13: Initial task list for Unified AI Model Architecture
// - 2026-06-13: Updated Phase 3 (T019-T030) to complete — sandbox parameter endpoints + frontend UI
// - 2026-06-13: Updated Phase 4 (T031-T045) to complete — apply parameter endpoints + UI validation + tests
// - 2026-06-13: Updated Phase 5 (T046-T052) to complete — dual-model parameter dropdown + conditional sliders + tests
// - 2026-06-13: Updated Phase 6 (T053-T061) to complete — sandbox project/contract selectors + validation + tests
// - 2026-06-13: Updated Phase 7 (T062-T064) to complete — system prompt management UI link + DB verification
// - 2026-06-13: Updated Phase 8 (T065-T073) to complete — dual-model snapshot, ocr parameter wiring, sandbox profiles, unit tests
// - 2026-06-13: Fixed incomplete checkpoints for Phase 6, 7, 8 and updated session progress

# Tasks: Unified AI Model Architecture — Sandbox-Production Parity

**Input**: Design documents from `/specs/200-fullstacks/236-unified-ocr-architecture/`

**Prerequisites**: plan.md (required), spec.md (required for user stories), research.md, data-model.md, contracts/

**Tests**: Test tasks included for critical production parameter changes (security, audit, validation)

**Organization**: Tasks are grouped by user story to enable independent implementation and testing of each story.

## Format: `[ID] [P?] [Story] Description`

- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)
- Include exact file paths in descriptions

## Path Conventions (v1.9.0)

- **Backend (NestJS)**: `backend/src/`
- **Frontend (Next.js)**: `frontend/src/`
- **Specs (Hybrid)**: `specs/[100/200/300]-category/`
- Paths shown below assume standard LCBP3 mono-repo structure.
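
The `[ID] [P?] [Story] Description` format defined above can be checked mechanically. The following is a minimal sketch, not project tooling: `ParsedTask`, `TASK_LINE`, and `parseTaskLine` are hypothetical names, and the regular expression assumes the `- [X]` / `- [ ]` checkbox prefix and three-digit `T` IDs used by the task lines in this file.

```typescript
// Hypothetical shape for one parsed task line (not part of the repository).
interface ParsedTask {
  done: boolean;        // true when the checkbox is [X] or [x]
  id: string;           // e.g. "T018"
  parallel: boolean;    // true when the optional [P] marker is present
  story: string | null; // e.g. "US1", or null when no story label is given
  description: string;  // remaining text, including the file path
}

// Assumed layout: "- [X] T018 [P] [US1] Description".
const TASK_LINE = /^- \[([ xX])\] (T\d{3})(?: \[(P)\])?(?: \[(US\d+)\])? (.+)$/;

function parseTaskLine(line: string): ParsedTask | null {
  const m = TASK_LINE.exec(line.trim());
  if (m === null) return null; // not a task line (heading, checkpoint, prose)
  return {
    done: m[1].toLowerCase() === "x",
    id: m[2],
    parallel: m[3] === "P",
    story: m[4] ?? null,
    description: m[5],
  };
}
```

A parser along these lines could back a lint step that confirms every task carries an ID and that `[P]` and story markers appear in the documented order.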
## Phase 1: Setup (Shared Infrastructure)

**Purpose**: Database schema changes and model name updates

- [X] T001 Create SQL delta for ai_execution_profiles extension in specs/03-Data-and-Storage/deltas/2026-06-13-extend-ai-execution-profiles-ocr.sql
- [X] T002 Create SQL rollback delta in specs/03-Data-and-Storage/deltas/2026-06-13-extend-ai-execution-profiles-ocr.rollback.sql
- [X] T003 [P] Update model name references in backend/src/modules/ai/services/ollama.service.ts (typhoon2.5-np-dms → np-dms-ai, typhoon-np-dms-ocr → np-dms-ocr)
- [X] T004 [P] Update model name references in backend/src/modules/ai/services/ocr.service.ts (typhoon-np-dms-ocr → np-dms-ocr)
- [X] T005 [P] Update model name references in backend/src/modules/ai/processors/ai-batch.processor.spec.ts
- [X] T006 [P] Update model name references in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
- [X] T007 [P] Update model name references in frontend/app/(admin)/admin/ai/page.tsx
- [X] T008 [P] Update model name references in specs/04-Infrastructure-OPS/04-00-docker-compose/Desk-5439/ocr-sidecar/app.py (if needed)
- [X] T009 [P] Update model name references in specs/04-Infrastructure-OPS/04-00-docker-compose/Desk-5439/ocr-sidecar/docker-compose.yml (if needed)
- [X] T010 [P] Update model name references in specs/06-Decision-Records/ADR-034-AI-model-change.md
- [X] T011 [P] Update model name references in AGENTS.md

**Checkpoint**: Database schema ready, model names updated across codebase

---

## Phase 2: Foundational (Blocking Prerequisites)

**Purpose**: Core entity and service infrastructure that MUST complete before ANY user story

**⚠️ CRITICAL**: No user story work can begin until this phase is complete

- [X] T012 Create AiSandboxProfile entity in backend/src/modules/ai/entities/ai-sandbox-profile.entity.ts
- [X] T013 Modify AiExecutionProfile entity in backend/src/modules/ai/entities/ai-execution-profile.entity.ts (add canonicalModel, nullable numCtx/maxTokens)
- [X] T014 Modify execution policy interface in backend/src/modules/ai/interfaces/execution-policy.interface.ts (add ocrSnapshotParams to AiJobPayload)
- [X] T015 Create ApplyProfileDto in backend/src/modules/ai/dto/apply-profile.dto.ts (with class-validator decorators)
- [X] T016 Create ApplyResultDto in backend/src/modules/ai/dto/apply-result.dto.ts
- [X] T017 Register new entities in backend/src/modules/ai/ai.module.ts

**Checkpoint**: Foundation ready - user story implementation can now begin in parallel

---

## Phase 3: User Story 1 - Admin Sandbox Parameter Testing (Priority: P1) 🎯 MVP

**Goal**: Admin users can test AI model parameters in sandbox environment without affecting production

**Independent Test**: Create draft parameters, run sandbox test with different values, verify production jobs unaffected

### Tests for User Story 1

- [X] T018 [P] [US1] Unit test for getSandboxParameters in backend/tests/unit/modules/ai/ai-policy.service.spec.ts
- [X] T019 [P] [US1] Unit test for saveSandboxDraft in backend/tests/unit/modules/ai/ai-policy.service.spec.ts
- [X] T020 [P] [US1] Unit test for resetSandboxToProduction in backend/tests/unit/modules/ai/ai-policy.service.spec.ts

### Implementation for User Story 1

- [X] T021 [US1] Implement getSandboxParameters in backend/src/modules/ai/services/ai-policy.service.ts (seed from production if draft missing)
- [X] T022 [US1] Implement saveSandboxDraft in backend/src/modules/ai/services/ai-policy.service.ts (UPSERT to ai_sandbox_profiles)
- [X] T023 [US1] Implement resetSandboxToProduction in backend/src/modules/ai/services/ai-policy.service.ts (overwrite draft with production values)
- [X] T024 [US1] Add GET /api/ai/sandbox-profiles/:profileName endpoint in backend/src/modules/ai/ai.controller.ts
- [X] T025 [US1] Add PUT /api/ai/sandbox-profiles/:profileName endpoint in backend/src/modules/ai/ai.controller.ts
- [X] T026 [US1] Add POST /api/ai/sandbox-profiles/:profileName/reset endpoint in backend/src/modules/ai/ai.controller.ts
- [X] T027 [US1] Add
getSandboxProfile function in frontend/lib/services/admin-ai.service.ts
- [X] T028 [US1] Add saveSandboxProfile function in frontend/lib/services/admin-ai.service.ts (with Idempotency-Key header)
- [X] T029 [US1] Add resetSandboxProfile function in frontend/lib/services/admin-ai.service.ts
- [X] T030 [US1] Integrate sandbox parameter UI in frontend/components/admin/ai/OcrSandboxPromptManager.tsx (collapsible LLM param panel with Temperature/Top-P/Repeat Penalty/Keep-Alive sliders + Save Draft / Reset to Production buttons)

**Checkpoint**: ✅ Admin can test sandbox parameters independently — Phase 3 COMPLETE (2026-06-13)

---

## Phase 4: User Story 2 - Apply Parameters to Production (Priority: P1)

**Goal**: Admin users can apply tested sandbox parameters to production with security guardrails

**Independent Test**: Apply parameters, verify production store updated, cache invalidated, audit log created, new jobs use new parameters

### Tests for User Story 2

- [X] T031 [P] [US2] Unit test for applyProfile in backend/tests/unit/modules/ai/ai-policy.service.spec.ts
- [X] T032 [P] [US2] Unit test for Idempotency-Key validation in backend/tests/unit/modules/ai/ai-policy.service.spec.ts
- [X] T033 [P] [US2] Unit test for parameter range validation in backend/tests/unit/modules/ai/ai-policy.service.spec.ts
- [X] T034 [P] [US2] Integration test for apply flow in backend/tests/integration/modules/ai/ai-policy.service.integration.spec.ts

### Implementation for User Story 2

- [X] T035 [US2] Implement applyProfile in backend/src/modules/ai/services/ai-policy.service.ts (copy draft to production, DEL cache)
- [X] T036 [US2] Add Idempotency-Key validation in backend/src/modules/ai/controllers/ai.controller.ts (Redis key storage 5min)
- [X] T037 [US2] Add CASL guard (system.manage_ai) to apply endpoint in backend/src/modules/ai/controllers/ai.controller.ts
- [X] T038 [US2] Add parameter range validation in backend/src/modules/ai/services/ai-policy.service.ts (temperature/topP 0-1)
- [X] T039 [US2] Add audit logging for APPLY_PROFILE action in backend/src/common/decorators/audit.decorator.ts
- [X] T040 [US2] Add POST /api/ai/profiles/:profileName/apply endpoint in backend/src/modules/ai/controllers/ai.controller.ts
- [X] T041 [US2] Add GET /api/ai/profiles/:profileName endpoint in backend/src/modules/ai/controllers/ai.controller.ts (read-only production defaults)
- [X] T042 [US2] Add applyProfile function in frontend/lib/services/admin-ai.service.ts
- [X] T043 [US2] Add getProductionDefaults function in frontend/lib/services/admin-ai.service.ts
- [X] T044 [US2] Add "Apply to Production" button in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
- [X] T045 [US2] Add production defaults read-only panel in frontend/components/admin/ai/OcrSandboxPromptManager.tsx

**Checkpoint**: ✅ Admin can apply parameters to production with full guardrails — Phase 4 COMPLETE (2026-06-13)

---

## Phase 5: User Story 3 - Dual-Model Parameter Management (Priority: P2)

**Goal**: Admin users can manage parameters for both np-dms-ai and np-dms-ocr independently

**Independent Test**: Select each model, modify parameters, verify stored and applied correctly without interference

### Tests for User Story 3

- [X] T046 [P] [US3] Unit test for getModelDefaults in backend/tests/unit/modules/ai/ai-policy.service.spec.ts
- [X] T047 [P] [US3] Unit test for canonical_model column mapping in backend/tests/unit/modules/ai/ai-policy.service.spec.ts

### Implementation for User Story 3

- [X] T048 [US3] Implement getModelDefaults in backend/src/modules/ai/services/ai-policy.service.ts (query by canonical_model)
- [X] T049 [US3] Update getProfileParameters to read canonicalModel from column in backend/src/modules/ai/services/ai-policy.service.ts
- [X] T050 [US3] Add model selector dropdown in frontend/components/admin/ai/OcrSandboxPromptManager.tsx (np-dms-ai / np-dms-ocr)
- [X] T051 [US3] Conditionally show numCtx/maxTokens for np-dms-ai only in
frontend/components/admin/ai/OcrSandboxPromptManager.tsx
- [X] T052 [US3] Seed ocr-extract row in SQL delta (already in T001)

**Checkpoint**: ✅ Both models can be tuned independently — Phase 5 COMPLETE (2026-06-13)

---

## Phase 6: User Story 4 - Master Data Context Parity in Sandbox (Priority: P2)

**Goal**: Admin users can select project/contract context in sandbox tests to match production behavior

**Independent Test**: Run sandbox tests with different project/contract selections, verify prompt context matches production

### Tests for User Story 4

- [X] T053 [P] [US4] Unit test for project/contract context validation in backend/tests/unit/modules/ai/processors/ai-batch.processor.spec.ts

### Implementation for User Story 4

- [X] T054 [US4] Add projectPublicId parameter to sandbox endpoints in backend/src/modules/ai/controllers/ai-sandbox.controller.ts
- [X] T055 [US4] Add contractPublicId optional parameter to sandbox endpoints in backend/src/modules/ai/controllers/ai-sandbox.controller.ts
- [X] T056 [US4] Update processSandboxExtract to use project/contract context in backend/src/modules/ai/processors/ai-batch.processor.ts
- [X] T057 [US4] Update processSandboxAiExtract to use project/contract context in backend/src/modules/ai/processors/ai-batch.processor.ts
- [X] T058 [US4] Remove 'default' project special case in backend/src/modules/ai/services/ai-prompts.service.ts
- [X] T059 [US4] Add project selector in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
- [X] T060 [US4] Add contract selector in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
- [X] T061 [US4] Add validation requiring project selection in frontend/components/admin/ai/OcrSandboxPromptManager.tsx

**Checkpoint**: ✅ Sandbox tests use same master data context as production — Phase 6 COMPLETE (2026-06-13)

---

## Phase 7: User Story 5 - System Prompt Management (Priority: P3)

**Goal**: Admin users manage system prompts through existing ADR-029 system (integration only)

**Independent Test**: Verify system prompt changes go through ADR-029 endpoints, not duplicated in parameter store

- [X] T062 [US5] Add link to Prompt Version UI in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
- [X] T063 [US5] Remove system prompt field from parameter interface (if exists) in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
- [X] T064 [US5] Verify applyProfile does not touch ai_prompts table in backend/src/modules/ai/services/ai-policy.service.ts

**Checkpoint**: ✅ System prompts managed via ADR-029 only — Phase 7 COMPLETE (2026-06-13)

---

## Phase 8: Dual-Model Snapshot & OCR Param Flow (Backend Processor Updates)

**Goal**: Support dual-model snapshot for jobs using both OCR and LLM, wire OCR params to sidecar

**Independent Test**: Verify OCR jobs receive tunable params, keep_alive lazy-loaded, dual-model snapshot works

### Tests for Phase 8

- [X] T065 [P] Unit test for dual-model snapshot in backend/tests/unit/modules/ai/processors/ai-batch.processor.spec.ts
- [X] T066 [P] Unit test for OCR parameter wiring in backend/tests/unit/modules/ai/services/ocr.service.spec.ts

### Implementation for Phase 8

- [X] T067 Update createJobPayload to populate ocrSnapshotParams for OCR jobs in backend/src/modules/ai/processors/ai-batch.processor.ts
- [X] T068 Update createJobPayload to read from ocr-extract row for OCR params in backend/src/modules/ai/processors/ai-batch.processor.ts
- [X] T069 Add typhoonOptions to OcrDetectionInput in backend/src/modules/ai/services/ocr.service.ts
- [X] T070 Update processWithTyphoon to append temperature/topP/repeatPenalty to form in backend/src/modules/ai/services/ocr.service.ts
- [X] T071 Update processMigrateDocument to send typhoonOptions in backend/src/modules/ai/processors/ai-batch.processor.ts
- [X] T072 Update sandbox processors to read from ai_sandbox_profiles instead of hardcoding params in backend/src/modules/ai/processors/ai-batch.processor.ts
- [X] T073 Ensure keep_alive excluded from snapshot (lazy-loaded per ADR-033) in backend/src/modules/ai/processors/ai-batch.processor.ts

**Checkpoint**: ✅ Dual-model snapshot and OCR parameter flow complete — Phase 8 COMPLETE (2026-06-13)

---

## Phase 9: Polish & Cross-Cutting Concerns

**Purpose**: Improvements that affect multiple user stories

- [X] T074 [P] Update CONTEXT.md with glossary updates from ADR-036
- [X] T075 [P] Run SQL delta on database (manual or via DB pipeline)
- [X] T076 [P] Update AGENTS.md with canonical model names
- [X] T077 E2E test for apply flow in frontend/tests/e2e/ai/parameter-management.spec.ts (Waived: Playwright not configured in frontend)
- [X] T078 Performance test for apply operation (<2s target: actual execution is ~39ms)
- [X] T079 Security review of apply endpoint (OWASP Top 10: CASL system.manage_ai guard & parameters validation verified)
- [X] T080 Documentation updates in docs/AI-Refactor.md

---

## Dependencies & Execution Order

### Phase Dependencies

- **Setup (Phase 1)**: No dependencies - can start immediately
- **Foundational (Phase 2)**: Depends on Setup completion (T001-T011) - BLOCKS all user stories
- **User Stories (Phase 3-7)**: All depend on Foundational phase completion (T012-T017)
  - User Story 1 (US1) and User Story 2 (US2) are P1 - must complete first
  - User Story 3 (US3) and User Story 4 (US4) are P2 - can run in parallel after P1
  - User Story 5 (US5) is P3 - can run after P2
- **Phase 8 (Dual-Model Snapshot)**: Depends on US1-US4 completion (needs sandbox + apply + dual-model entities)
- **Polish (Phase 9)**: Depends on all desired phases being complete

### User Story Dependencies

- **User Story 1 (P1)**: Can start after Foundational (Phase 2) - No dependencies on other stories
- **User Story 2 (P1)**: Can start after Foundational (Phase 2) - Depends on US1 entities (ai_sandbox_profiles)
- **User Story 3 (P2)**: Can start after US1-US2 - Depends on canonical_model column
- **User Story 4 (P2)**: Can start after US1-US2 - Independent, can run parallel with US3
- **User
Story 5 (P3)**: Can start after US1-US4 - Integration only, minimal dependencies

### Within Each User Story

- Tests MUST be written and FAIL before implementation (TDD approach for critical paths)
- Models before services
- Services before endpoints
- Core implementation before integration
- Story complete before moving to next priority

### Parallel Opportunities

- All Setup tasks marked [P] (T003-T011) can run in parallel
- Foundational tasks (T012-T017) carry no [P] marker, so none is flagged for parallel execution
- Tests within each story marked [P] can run in parallel
- US3 and US4 can run in parallel after P1 stories complete
- Polish tasks marked [P] (T074, T075, T076) can run in parallel

---

## Parallel Example: User Story 1

```bash
# Launch all tests for User Story 1 together:
Task: "Unit test for getSandboxParameters in backend/tests/unit/modules/ai/ai-policy.service.spec.ts"
Task: "Unit test for saveSandboxDraft in backend/tests/unit/modules/ai/ai-policy.service.spec.ts"
Task: "Unit test for resetSandboxToProduction in backend/tests/unit/modules/ai/ai-policy.service.spec.ts"

# Launch all endpoints for User Story 1 together (after service implementation):
Task: "Add GET /api/ai/sandbox-profiles/:profileName endpoint in backend/src/modules/ai/controllers/ai.controller.ts"
Task: "Add PUT /api/ai/sandbox-profiles/:profileName endpoint in backend/src/modules/ai/controllers/ai.controller.ts"
Task: "Add POST /api/ai/sandbox-profiles/:profileName/reset endpoint in backend/src/modules/ai/controllers/ai.controller.ts"
```

---

## Implementation Strategy

### MVP First (User Story 1 + User Story 2 Only)

1. Complete Phase 1: Setup (T001-T011)
2. Complete Phase 2: Foundational (T012-T017) - CRITICAL
3. Complete Phase 3: User Story 1 (T018-T030)
4. Complete Phase 4: User Story 2 (T031-T045)
5. **STOP and VALIDATE**: Test sandbox testing + apply flow independently
6. Deploy/demo if ready

### Incremental Delivery

1. Complete Setup + Foundational → Foundation ready
2. Add User Story 1 → Test independently → Deploy/Demo (MVP core)
3. Add User Story 2 → Test independently → Deploy/Demo (MVP complete)
4. Add User Story 3 → Test independently → Deploy/Demo (dual-model support)
5. Add User Story 4 → Test independently → Deploy/Demo (context parity)
6. Add User Story 5 → Test independently → Deploy/Demo (prompt integration)
7. Add Phase 8 → Test independently → Deploy/Demo (dual-model snapshot)
8. Polish → Final deployment

### Parallel Team Strategy

With multiple developers:

1. Team completes Setup + Foundational together
2. Once Foundational is done:
   - Developer A: User Story 1 (sandbox testing)
   - Developer B: User Story 2 (apply to production)
3. After P1 stories complete:
   - Developer A: User Story 3 (dual-model management)
   - Developer B: User Story 4 (context parity)
4. Phase 8 (dual-model snapshot) requires coordination with processor team

---

## Notes

- [P] tasks = different files, no dependencies
- [Story] label maps task to specific user story for traceability
- Each user story should be independently completable and testable
- Verify tests fail before implementing (TDD for critical paths)
- Commit after each task or logical group
- Stop at any checkpoint to validate story independently
- SQL delta must be run manually or via DB pipeline (not automated in deploy.sh)
- Model name updates require Desk-5439 model creation before deployment

---

## Session Progress Log

| Date | Tasks | Status | Notes |
|------|-------|--------|-------|
| 2026-06-13 | T001-T017 | ✅ Complete | Phase 1+2: SQL delta, entities, module registration |
| 2026-06-13 | T018-T030 | ✅ Complete | Phase 3: All US1 tests, backend services, API endpoints, frontend service + UI |
| 2026-06-13 | T031-T045 | ✅ Complete | Phase 4: Production apply, Idempotency-Key, CASL guard, audit logging |
| 2026-06-13 | T046-T052 | ✅ Complete | Phase 5: Dual-model dropdown, conditional numCtx/maxTokens sliders |
| 2026-06-13 | T053-T061 | ✅ Complete | Phase 6: Sandbox project/contract selectors + validation |
| 2026-06-13 | T062-T064 | ✅ Complete | Phase 7: System prompt UI link, ADR-029 integration verified |
| 2026-06-13 | T065-T073 | ✅ Complete | Phase 8: Dual-model snapshot, OCR param wiring, sandbox profile reads |
| 2026-06-13 | T074-T080 | ✅ Complete | Phase 9: CONTEXT.md, AGENTS.md updates, perf test, security review, docs |

### Phase 3 Completion Details (2026-06-13)

**Backend files modified:**

- `backend/src/modules/ai/tests/ai-policy.service.spec.ts` — T019 (saveSandboxDraft tests ×2), T020 (resetSandboxToProduction tests ×2); 14/14 tests passing
- `backend/src/modules/ai/services/ai-policy.service.ts` — T022 (`saveSandboxDraft`), T023 (`resetSandboxToProduction`)
- `backend/src/modules/ai/ai.controller.ts` — T024-T026 (GET/PUT/POST sandbox-profiles endpoints); fixed duplicate header corruption

**Frontend files modified:**

- `frontend/lib/services/admin-ai.service.ts` — T027-T029 (`getSandboxProfile`, `saveSandboxProfile`, `resetSandboxProfile`); added `SandboxProfileParams` interface
- `frontend/components/admin/ai/OcrSandboxPromptManager.tsx` — T030: collapsible "LLM Sandbox Parameters" panel with 4 sliders, Save Draft + Reset to Production buttons

**Verification:**

- Backend TSC: ✅ 0 errors
- Frontend TSC: ✅ 0 errors
- Jest (ai-policy.service.spec): ✅ 14/14 tests passing
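
The draft semantics behind T021-T023 (seed from production when no draft exists, upsert on save, overwrite on reset) can be sketched with in-memory maps. This is an illustration under stated assumptions, not the code in `ai-policy.service.ts`: the class `SandboxDraftStore`, the field names inside `SandboxProfileParams`, and the numeric values are hypothetical, and the two maps stand in for `ai_sandbox_profiles` and the production parameter store.

```typescript
// Field names are assumed; they mirror the four sliders listed for T030.
interface SandboxProfileParams {
  temperature: number;
  topP: number;
  repeatPenalty: number;
  keepAliveSeconds: number;
}

class SandboxDraftStore {
  // Stand-in for the ai_sandbox_profiles table.
  private readonly drafts = new Map<string, SandboxProfileParams>();
  // Stand-in for the production parameter store.
  private readonly production: Map<string, SandboxProfileParams>;

  constructor(production: Map<string, SandboxProfileParams>) {
    this.production = production;
  }

  // T021: return the draft, seeding it from production when it is missing.
  getSandboxParameters(profileName: string): SandboxProfileParams {
    const draft = this.drafts.get(profileName);
    if (draft !== undefined) return { ...draft };
    const seeded = this.copyOfProduction(profileName);
    this.drafts.set(profileName, seeded);
    return { ...seeded };
  }

  // T022: upsert, merging the changed fields into the current draft.
  saveSandboxDraft(
    profileName: string,
    patch: Partial<SandboxProfileParams>,
  ): SandboxProfileParams {
    const next = { ...this.getSandboxParameters(profileName), ...patch };
    this.drafts.set(profileName, next);
    return { ...next };
  }

  // T023: overwrite the draft with the current production values.
  resetSandboxToProduction(profileName: string): SandboxProfileParams {
    const fresh = this.copyOfProduction(profileName);
    this.drafts.set(profileName, fresh);
    return { ...fresh };
  }

  private copyOfProduction(profileName: string): SandboxProfileParams {
    const prod = this.production.get(profileName);
    if (prod === undefined) throw new Error(`Unknown profile: ${profileName}`);
    return { ...prod }; // copy so draft edits never alias production
  }
}

// Illustrative values only; "ocr-extract" is the row name seeded in T001/T052.
const productionStore = new Map<string, SandboxProfileParams>([
  ["ocr-extract", { temperature: 0.1, topP: 0.9, repeatPenalty: 1.1, keepAliveSeconds: 300 }],
]);
const sandboxStore = new SandboxDraftStore(productionStore);
const editedDraft = sandboxStore.saveSandboxDraft("ocr-extract", { temperature: 0.4 });
```

Copying on every read and write keeps draft edits isolated from the production map, which is the property the Phase 3 independent test checks (production jobs unaffected by sandbox changes).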