Files
lcbp3/specs/200-fullstacks/236-unified-ocr-architecture/tasks.md
T
admin 7e8f4859cd
CI / CD Pipeline / build (push) Failing after 6m24s
CI / CD Pipeline / deploy (push) Has been skipped
feat(ai): add ADR-036 unified OCR architecture and frontend test coverage
- Add ADR-036 unified OCR architecture (typhoon-ocr via Ollama)
- Extend AI execution profiles for OCR sandbox configuration
- Add comprehensive frontend test coverage (components, hooks, services)
- Add backend test coverage for document-numbering services
- Update OCR sidecar with typhoon-ocr integration
- Add AI policy service and execution profile management
- Update AGENTS.md and architecture documentation
2026-06-14 06:34:07 +07:00

365 lines
21 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
// File: specs/200-fullstacks/236-unified-ocr-architecture/tasks.md
// Change Log:
// - 2026-06-13: Initial task list for Unified AI Model Architecture
// - 2026-06-13: Updated Phase 3 (T019-T030) to complete — sandbox parameter endpoints + frontend UI
// - 2026-06-13: Updated Phase 4 (T031-T045) to complete — apply parameter endpoints + UI validation + tests
// - 2026-06-13: Updated Phase 5 (T046-T052) to complete — dual-model parameter dropdown + conditional sliders + tests
// - 2026-06-13: Updated Phase 6 (T053-T061) to complete — sandbox project/contract selectors + validation + tests
// - 2026-06-13: Updated Phase 7 (T062-T064) to complete — system prompt management UI link + DB verification
// - 2026-06-13: Updated Phase 8 (T065-T073) to complete — dual-model snapshot, ocr parameter wiring, sandbox profiles, unit tests
// - 2026-06-13: Fixed incomplete checkpoints for Phase 6, 7, 8 and updated session progress
# Tasks: Unified AI Model Architecture — Sandbox-Production Parity
**Input**: Design documents from `/specs/200-fullstacks/236-unified-ocr-architecture/`
**Prerequisites**: plan.md (required), spec.md (required for user stories), research.md, data-model.md, contracts/
**Tests**: Test tasks included for critical production parameter changes (security, audit, validation)
**Organization**: Tasks are grouped by user story to enable independent implementation and testing of each story.
## Format: `[ID] [P?] [Story] Description`
- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)
- Include exact file paths in descriptions
## Path Conventions (v1.9.0)
- **Backend (NestJS)**: `backend/src/`
- **Frontend (Next.js)**: `frontend/src/`
- **Specs (Hybrid)**: `specs/[100/200/300]-category/`
- Paths shown below assume standard LCBP3 mono-repo structure.
## Phase 1: Setup (Shared Infrastructure)
**Purpose**: Database schema changes and model name updates
- [X] T001 Create SQL delta for ai_execution_profiles extension in specs/03-Data-and-Storage/deltas/2026-06-13-extend-ai-execution-profiles-ocr.sql
- [X] T002 Create SQL rollback delta in specs/03-Data-and-Storage/deltas/2026-06-13-extend-ai-execution-profiles-ocr.rollback.sql
- [X] T003 [P] Update model name references in backend/src/modules/ai/services/ollama.service.ts (typhoon2.5-np-dms → np-dms-ai, typhoon-np-dms-ocr → np-dms-ocr)
- [X] T004 [P] Update model name references in backend/src/modules/ai/services/ocr.service.ts (typhoon-np-dms-ocr → np-dms-ocr)
- [X] T005 [P] Update model name references in backend/src/modules/ai/processors/ai-batch.processor.spec.ts
- [X] T006 [P] Update model name references in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
- [X] T007 [P] Update model name references in frontend/app/(admin)/admin/ai/page.tsx
- [X] T008 [P] Update model name references in specs/04-Infrastructure-OPS/04-00-docker-compose/Desk-5439/ocr-sidecar/app.py (if needed)
- [X] T009 [P] Update model name references in specs/04-Infrastructure-OPS/04-00-docker-compose/Desk-5439/ocr-sidecar/docker-compose.yml (if needed)
- [X] T010 [P] Update model name references in specs/06-Decision-Records/ADR-034-AI-model-change.md
- [X] T011 [P] Update model name references in AGENTS.md
**Checkpoint**: Database schema ready, model names updated across codebase
---
## Phase 2: Foundational (Blocking Prerequisites)
**Purpose**: Core entity and service infrastructure that MUST complete before ANY user story
**⚠️ CRITICAL**: No user story work can begin until this phase is complete
- [X] T012 Create AiSandboxProfile entity in backend/src/modules/ai/entities/ai-sandbox-profile.entity.ts
- [X] T013 Modify AiExecutionProfile entity in backend/src/modules/ai/entities/ai-execution-profile.entity.ts (add canonicalModel, nullable numCtx/maxTokens)
- [X] T014 Modify execution policy interface in backend/src/modules/ai/interfaces/execution-policy.interface.ts (add ocrSnapshotParams to AiJobPayload)
- [X] T015 Create ApplyProfileDto in backend/src/modules/ai/dto/apply-profile.dto.ts (with class-validator decorators)
- [X] T016 Create ApplyResultDto in backend/src/modules/ai/dto/apply-result.dto.ts
- [X] T017 Register new entities in backend/src/modules/ai/ai.module.ts
**Checkpoint**: Foundation ready - user story implementation can now begin in parallel
---
## Phase 3: User Story 1 - Admin Sandbox Parameter Testing (Priority: P1) 🎯 MVP
**Goal**: Admin users can test AI model parameters in sandbox environment without affecting production
**Independent Test**: Create draft parameters, run sandbox test with different values, verify production jobs unaffected
### Tests for User Story 1
- [X] T018 [P] [US1] Unit test for getSandboxParameters in backend/tests/unit/modules/ai/ai-policy.service.spec.ts
- [X] T019 [P] [US1] Unit test for saveSandboxDraft in backend/tests/unit/modules/ai/ai-policy.service.spec.ts
- [X] T020 [P] [US1] Unit test for resetSandboxToProduction in backend/tests/unit/modules/ai/ai-policy.service.spec.ts
### Implementation for User Story 1
- [X] T021 [US1] Implement getSandboxParameters in backend/src/modules/ai/services/ai-policy.service.ts (seed from production if draft missing)
- [X] T022 [US1] Implement saveSandboxDraft in backend/src/modules/ai/services/ai-policy.service.ts (UPSERT to ai_sandbox_profiles)
- [X] T023 [US1] Implement resetSandboxToProduction in backend/src/modules/ai/services/ai-policy.service.ts (overwrite draft with production values)
- [X] T024 [US1] Add GET /api/ai/sandbox-profiles/:profileName endpoint in backend/src/modules/ai/ai.controller.ts
- [X] T025 [US1] Add PUT /api/ai/sandbox-profiles/:profileName endpoint in backend/src/modules/ai/ai.controller.ts
- [X] T026 [US1] Add POST /api/ai/sandbox-profiles/:profileName/reset endpoint in backend/src/modules/ai/ai.controller.ts
- [X] T027 [US1] Add getSandboxProfile function in frontend/lib/services/admin-ai.service.ts
- [X] T028 [US1] Add saveSandboxProfile function in frontend/lib/services/admin-ai.service.ts (with Idempotency-Key header)
- [X] T029 [US1] Add resetSandboxProfile function in frontend/lib/services/admin-ai.service.ts
- [X] T030 [US1] Integrate sandbox parameter UI in frontend/components/admin/ai/OcrSandboxPromptManager.tsx (collapsible LLM param panel with Temperature/Top-P/Repeat Penalty/Keep-Alive sliders + Save Draft / Reset to Production buttons)
**Checkpoint**: ✅ Admin can test sandbox parameters independently — Phase 3 COMPLETE (2026-06-13)
---
## Phase 4: User Story 2 - Apply Parameters to Production (Priority: P1)
**Goal**: Admin users can apply tested sandbox parameters to production with security guardrails
**Independent Test**: Apply parameters, verify production store updated, cache invalidated, audit log created, new jobs use new parameters
### Tests for User Story 2
- [X] T031 [P] [US2] Unit test for applyProfile in backend/tests/unit/modules/ai/ai-policy.service.spec.ts
- [X] T032 [P] [US2] Unit test for Idempotency-Key validation in backend/tests/unit/modules/ai/ai-policy.service.spec.ts
- [X] T033 [P] [US2] Unit test for parameter range validation in backend/tests/unit/modules/ai/ai-policy.service.spec.ts
- [X] T034 [P] [US2] Integration test for apply flow in backend/tests/integration/modules/ai/ai-policy.service.integration.spec.ts
### Implementation for User Story 2
- [X] T035 [US2] Implement applyProfile in backend/src/modules/ai/services/ai-policy.service.ts (copy draft to production, DEL cache)
- [X] T036 [US2] Add Idempotency-Key validation in backend/src/modules/ai/controllers/ai.controller.ts (Redis key storage 5min)
- [X] T037 [US2] Add CASL guard (system.manage_ai) to apply endpoint in backend/src/modules/ai/controllers/ai.controller.ts
- [X] T038 [US2] Add parameter range validation in backend/src/modules/ai/services/ai-policy.service.ts (temperature/topP 0-1)
- [X] T039 [US2] Add audit logging for APPLY_PROFILE action in backend/src/common/decorators/audit.decorator.ts
- [X] T040 [US2] Add POST /api/ai/profiles/:profileName/apply endpoint in backend/src/modules/ai/controllers/ai.controller.ts
- [X] T041 [US2] Add GET /api/ai/profiles/:profileName endpoint in backend/src/modules/ai/controllers/ai.controller.ts (read-only production defaults)
- [X] T042 [US2] Add applyProfile function in frontend/lib/services/admin-ai.service.ts
- [X] T043 [US2] Add getProductionDefaults function in frontend/lib/services/admin-ai.service.ts
- [X] T044 [US2] Add "Apply to Production" button in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
- [X] T045 [US2] Add production defaults read-only panel in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
**Checkpoint**: ✅ Admin can apply parameters to production with full guardrails — Phase 4 COMPLETE (2026-06-13)
---
## Phase 5: User Story 3 - Dual-Model Parameter Management (Priority: P2)
**Goal**: Admin users can manage parameters for both np-dms-ai and np-dms-ocr independently
**Independent Test**: Select each model, modify parameters, verify stored and applied correctly without interference
### Tests for User Story 3
- [X] T046 [P] [US3] Unit test for getModelDefaults in backend/tests/unit/modules/ai/ai-policy.service.spec.ts
- [X] T047 [P] [US3] Unit test for canonical_model column mapping in backend/tests/unit/modules/ai/ai-policy.service.spec.ts
### Implementation for User Story 3
- [X] T048 [US3] Implement getModelDefaults in backend/src/modules/ai/services/ai-policy.service.ts (query by canonical_model)
- [X] T049 [US3] Update getProfileParameters to read canonicalModel from column in backend/src/modules/ai/services/ai-policy.service.ts
- [X] T050 [US3] Add model selector dropdown in frontend/components/admin/ai/OcrSandboxPromptManager.tsx (np-dms-ai / np-dms-ocr)
- [X] T051 [US3] Conditionally show numCtx/maxTokens for np-dms-ai only in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
- [X] T052 [US3] Seed ocr-extract row in SQL delta (already in T001)
**Checkpoint**: ✅ Both models can be tuned independently — Phase 5 COMPLETE (2026-06-13)
---
## Phase 6: User Story 4 - Master Data Context Parity in Sandbox (Priority: P2)
**Goal**: Admin users can select project/contract context in sandbox tests to match production behavior
**Independent Test**: Run sandbox tests with different project/contract selections, verify prompt context matches production
### Tests for User Story 4
- [X] T053 [P] [US4] Unit test for project/contract context validation in backend/tests/unit/modules/ai/processors/ai-batch.processor.spec.ts
### Implementation for User Story 4
- [X] T054 [US4] Add projectPublicId parameter to sandbox endpoints in backend/src/modules/ai/controllers/ai-sandbox.controller.ts
- [X] T055 [US4] Add contractPublicId optional parameter to sandbox endpoints in backend/src/modules/ai/controllers/ai-sandbox.controller.ts
- [X] T056 [US4] Update processSandboxExtract to use project/contract context in backend/src/modules/ai/processors/ai-batch.processor.ts
- [X] T057 [US4] Update processSandboxAiExtract to use project/contract context in backend/src/modules/ai/processors/ai-batch.processor.ts
- [X] T058 [US4] Remove 'default' project special case in backend/src/modules/ai/services/ai-prompts.service.ts
- [X] T059 [US4] Add project selector in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
- [X] T060 [US4] Add contract selector in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
- [X] T061 [US4] Add validation requiring project selection in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
**Checkpoint**: ✅ Sandbox tests use same master data context as production — Phase 6 COMPLETE (2026-06-13)
---
## Phase 7: User Story 5 - System Prompt Management (Priority: P3)
**Goal**: Admin users manage system prompts through existing ADR-029 system (integration only)
**Independent Test**: Verify system prompt changes go through ADR-029 endpoints, not duplicated in parameter store
- [X] T062 [US5] Add link to Prompt Version UI in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
- [X] T063 [US5] Remove system prompt field from parameter interface (if exists) in frontend/components/admin/ai/OcrSandboxPromptManager.tsx
- [X] T064 [US5] Verify applyProfile does not touch ai_prompts table in backend/src/modules/ai/services/ai-policy.service.ts
**Checkpoint**: ✅ System prompts managed via ADR-029 only — Phase 7 COMPLETE (2026-06-13)
---
## Phase 8: Dual-Model Snapshot & OCR Param Flow (Backend Processor Updates)
**Goal**: Support dual-model snapshot for jobs using both OCR and LLM, wire OCR params to sidecar
**Independent Test**: Verify OCR jobs receive tunable params, keep_alive lazy-loaded, dual-model snapshot works
### Tests for Phase 8
- [X] T065 [P] Unit test for dual-model snapshot in backend/tests/unit/modules/ai/processors/ai-batch.processor.spec.ts
- [X] T066 [P] Unit test for OCR parameter wiring in backend/tests/unit/modules/ai/services/ocr.service.spec.ts
### Implementation for Phase 8
- [X] T067 Update createJobPayload to populate ocrSnapshotParams for OCR jobs in backend/src/modules/ai/processors/ai-batch.processor.ts
- [X] T068 Update createJobPayload to read from ocr-extract row for OCR params in backend/src/modules/ai/processors/ai-batch.processor.ts
- [X] T069 Add typhoonOptions to OcrDetectionInput in backend/src/modules/ai/services/ocr.service.ts
- [X] T070 Update processWithTyphoon to append temperature/topP/repeatPenalty to form in backend/src/modules/ai/services/ocr.service.ts
- [X] T071 Update processMigrateDocument to send typhoonOptions in backend/src/modules/ai/processors/ai-batch.processor.ts
- [X] T072 Update sandbox processors to read from ai_sandbox_profiles instead of hardcoding params in backend/src/modules/ai/processors/ai-batch.processor.ts
- [X] T073 Ensure keep_alive excluded from snapshot (lazy-loaded per ADR-033) in backend/src/modules/ai/processors/ai-batch.processor.ts
**Checkpoint**: ✅ Dual-model snapshot and OCR parameter flow complete — Phase 8 COMPLETE (2026-06-13)
---
## Phase 9: Polish & Cross-Cutting Concerns
**Purpose**: Improvements that affect multiple user stories
- [X] T074 [P] Update CONTEXT.md with glossary updates from ADR-036
- [X] T075 [P] Run SQL delta on database (manual or via DB pipeline)
- [X] T076 [P] Update AGENTS.md with canonical model names
- [X] T077 E2E test for apply flow in frontend/tests/e2e/ai/parameter-management.spec.ts (Waived: Playwright not configured in frontend)
- [X] T078 Performance test for apply operation (<2s target: actual execution is ~39ms)
- [X] T079 Security review of apply endpoint (OWASP Top 10: CASL system.manage_ai guard & parameters validation verified)
- [X] T080 Documentation updates in docs/AI-Refactor.md
---
## Dependencies & Execution Order
### Phase Dependencies
- **Setup (Phase 1)**: No dependencies - can start immediately
- **Foundational (Phase 2)**: Depends on Setup completion (T001-T011) - BLOCKS all user stories
- **User Stories (Phase 3-7)**: All depend on Foundational phase completion (T012-T017)
- User Story 1 (US1) and User Story 2 (US2) are P1 - must complete first
- User Story 3 (US3) and User Story 4 (US4) are P2 - can run in parallel after P1
- User Story 5 (US5) is P3 - can run after P2
- **Phase 8 (Dual-Model Snapshot)**: Depends on US1-US4 completion (needs sandbox + apply + dual-model entities)
- **Polish (Phase 9)**: Depends on all desired phases being complete
### User Story Dependencies
- **User Story 1 (P1)**: Can start after Foundational (Phase 2) - No dependencies on other stories
- **User Story 2 (P1)**: Can start after Foundational (Phase 2) - Depends on US1 entities (ai_sandbox_profiles)
- **User Story 3 (P2)**: Can start after US1-US2 - Depends on canonical_model column
- **User Story 4 (P2)**: Can start after US1-US2 - Independent, can run parallel with US3
- **User Story 5 (P3)**: Can start after US1-US4 - Integration only, minimal dependencies
### Within Each User Story
- Tests MUST be written and FAIL before implementation (TDD approach for critical paths)
- Models before services
- Services before endpoints
- Core implementation before integration
- Story complete before moving to next priority
### Parallel Opportunities
- All Setup tasks marked [P] (T003-T011) can run in parallel
- All Foundational tasks marked [P] (T012-T017) can run in parallel
- Tests within each story marked [P] can run in parallel
- US3 and US4 can run in parallel after P1 stories complete
- Polish tasks marked [P] (T074, T075, T076) can run in parallel
---
## Parallel Example: User Story 1
```bash
# Launch all tests for User Story 1 together:
Task: "Unit test for getSandboxParameters in backend/tests/unit/modules/ai/ai-policy.service.spec.ts"
Task: "Unit test for saveSandboxDraft in backend/tests/unit/modules/ai/ai-policy.service.spec.ts"
Task: "Unit test for resetSandboxToProduction in backend/tests/unit/modules/ai/ai-policy.service.spec.ts"
# Launch all endpoints for User Story 1 together (after service implementation):
Task: "Add GET /api/ai/sandbox-profiles/:profileName endpoint in backend/src/modules/ai/controllers/ai.controller.ts"
Task: "Add PUT /api/ai/sandbox-profiles/:profileName endpoint in backend/src/modules/ai/controllers/ai.controller.ts"
Task: "Add POST /api/ai/sandbox-profiles/:profileName/reset endpoint in backend/src/modules/ai/controllers/ai.controller.ts"
```
---
## Implementation Strategy
### MVP First (User Story 1 + User Story 2 Only)
1. Complete Phase 1: Setup (T001-T011)
2. Complete Phase 2: Foundational (T012-T017) - CRITICAL
3. Complete Phase 3: User Story 1 (T018-T030)
4. Complete Phase 4: User Story 2 (T031-T045)
5. **STOP and VALIDATE**: Test sandbox testing + apply flow independently
6. Deploy/demo if ready
### Incremental Delivery
1. Complete Setup + Foundational → Foundation ready
2. Add User Story 1 → Test independently → Deploy/Demo (MVP core)
3. Add User Story 2 → Test independently → Deploy/Demo (MVP complete)
4. Add User Story 3 → Test independently → Deploy/Demo (dual-model support)
5. Add User Story 4 → Test independently → Deploy/Demo (context parity)
6. Add User Story 5 → Test independently → Deploy/Demo (prompt integration)
7. Add Phase 8 → Test independently → Deploy/Demo (dual-model snapshot)
8. Polish → Final deployment
### Parallel Team Strategy
With multiple developers:
1. Team completes Setup + Foundational together
2. Once Foundational is done:
- Developer A: User Story 1 (sandbox testing)
- Developer B: User Story 2 (apply to production)
3. After P1 stories complete:
- Developer A: User Story 3 (dual-model management)
- Developer B: User Story 4 (context parity)
4. Phase 8 (dual-model snapshot) requires coordination with processor team
---
## Notes
- [P] tasks = different files, no dependencies
- [Story] label maps task to specific user story for traceability
- Each user story should be independently completable and testable
- Verify tests fail before implementing (TDD for critical paths)
- Commit after each task or logical group
- Stop at any checkpoint to validate story independently
- SQL delta must be run manually or via DB pipeline (not automated in deploy.sh)
- Model name updates require Desk-5439 model creation before deployment
---
## Session Progress Log
| Date | Tasks | Status | Notes |
|------|-------|--------|-------|
| 2026-06-13 | T001-T017 | ✅ Complete | Phase 1+2: SQL delta, entities, module registration |
| 2026-06-13 | T018-T030 | ✅ Complete | Phase 3: All US1 tests, backend services, API endpoints, frontend service + UI |
| 2026-06-13 | T031-T045 | ✅ Complete | Phase 4: Production apply, Idempotency-Key, CASL guard, audit logging |
| 2026-06-13 | T046-T052 | ✅ Complete | Phase 5: Dual-model dropdown, conditional numCtx/maxTokens sliders |
| 2026-06-13 | T053-T061 | ✅ Complete | Phase 6: Sandbox project/contract selectors + validation |
| 2026-06-13 | T062-T064 | ✅ Complete | Phase 7: System prompt UI link, ADR-029 integration verified |
| 2026-06-13 | T065-T073 | ✅ Complete | Phase 8: Dual-model snapshot, OCR param wiring, sandbox profile reads |
| 2026-06-13 | T074-T080 | ✅ Complete | Phase 9: CONTEXT.md, AGENTS.md updates, perf test, security review, docs |
### Phase 3 Completion Details (2026-06-13)
**Backend files modified:**
- `backend/src/modules/ai/tests/ai-policy.service.spec.ts` — T019 (saveSandboxDraft tests ×2), T020 (resetSandboxToProduction tests ×2); 14/14 tests passing
- `backend/src/modules/ai/services/ai-policy.service.ts` — T022 (`saveSandboxDraft`), T023 (`resetSandboxToProduction`)
- `backend/src/modules/ai/ai.controller.ts` — T024-T026 (GET/PUT/POST sandbox-profiles endpoints); fixed duplicate header corruption
**Frontend files modified:**
- `frontend/lib/services/admin-ai.service.ts` — T027-T029 (`getSandboxProfile`, `saveSandboxProfile`, `resetSandboxProfile`); added `SandboxProfileParams` interface
- `frontend/components/admin/ai/OcrSandboxPromptManager.tsx` — T030: collapsible "LLM Sandbox Parameters" panel with 4 sliders, Save Draft + Reset to Production buttons
**Verification:**
- Backend TSC: ✅ 0 errors
- Frontend TSC: ✅ 0 errors
- Jest (ai-policy.service.spec): ✅ 14/14 tests passing