lcbp3/specs/300-others/301-unified-ai-arch/spec.md

# Feature Specification: Unified AI Architecture

**Feature Branch**: `301-unified-ai-arch`
**Created**: 2026-05-14
**Status**: Draft
**Input**: User description: "ADR-023: Unified AI Architecture"

## Clarifications

### Session 2026-05-14
- Q: Legacy Migration Input Mechanism (How are legacy PDFs fed into the AI pipeline?) → A: Both a secure Admin API endpoint and a watched network folder.
- Q: Failure Handling for Corrupted PDFs (How does the system handle documents that fail AI processing?) → A: Mark as "Rejected" in the staging queue with an error reason.
- Q: AI Audit Log Retention Policy (How long are AI audit logs retained?) → A: Infinite retention until manually deleted by a System Admin.
- Q: RAG Concurrency Handling (How are concurrent RAG queries managed to prevent LLM timeouts?) → A: Request Queuing via BullMQ for sequential processing.

## User Scenarios & Testing _(mandatory)_

### User Story 1 - Legacy Document Migration and Review (Priority: P1)

As a System Administrator, I want to process legacy PDF documents through an AI pipeline and review the extracted metadata in a staging queue so that I can ensure data integrity before importing them into the main database.

**Why this priority**: Legacy document migration is the most critical immediate need for the LCBP3 project, as 20,000+ documents need to be imported securely and accurately.

**Independent Test**: Can be fully tested by uploading a batch of PDF documents, verifying they appear in the staging queue with AI-extracted metadata, then approving or rejecting them and confirming that only approved documents are committed to permanent storage.

**Acceptance Scenarios**:

1. **Given** a batch of legacy PDF documents, **When** processed by the AI pipeline, **Then** they appear in the staging queue with extracted metadata and confidence scores.
2. **Given** a document in the staging queue with a high confidence score, **When** reviewed by an Admin, **Then** it is marked as ready for import without warnings.
3. **Given** a document in the staging queue, **When** the Admin approves it, **Then** the file is moved to permanent storage and committed to the main database.
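
The review flow in these scenarios can be pictured as a small state machine over the staging record. The following is a minimal sketch, not the planned implementation: the type and function names (`MigrationReviewRecord`, `approve`, `reject`), the field names, and the 0 to 1 confidence range are assumptions made for illustration. Only the three statuses (Pending, Imported, Rejected) come from this spec.

```typescript
// Hypothetical names throughout; the spec only fixes the three statuses.
type ReviewStatus = "Pending" | "Imported" | "Rejected";

interface MigrationReviewRecord {
  id: string;
  status: ReviewStatus;
  confidence: number; // assumed to lie in the range 0..1
  metadata: Record<string, string>;
  errorReason?: string; // set only when status is "Rejected"
}

// Approve a pending record. In a real system the file move and the database
// commit would happen here; this sketch only changes the status.
function approve(record: MigrationReviewRecord): MigrationReviewRecord {
  if (record.status !== "Pending") {
    throw new Error(`Cannot approve record ${record.id} in status ${record.status}`);
  }
  return { ...record, status: "Imported" };
}

// Reject a pending record. A non-empty reason is required so the record
// stays meaningful when an Admin reviews it later.
function reject(record: MigrationReviewRecord, reason: string): MigrationReviewRecord {
  if (record.status !== "Pending") {
    throw new Error(`Cannot reject record ${record.id} in status ${record.status}`);
  }
  if (reason.trim() === "") {
    throw new Error("A rejection requires an error reason");
  }
  return { ...record, status: "Rejected", errorReason: reason };
}
```

Returning a new object instead of mutating the input keeps each transition easy to test in isolation; whether the real service does this is an open design choice.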

---

### User Story 2 - RAG Conversational Q&A (Priority: P2)

As a Document Controller or Org Admin, I want to ask natural language questions about project documents (RFA/Correspondence) and get context-aware answers so that I can quickly find information without manually reading full texts.

**Why this priority**: Enhances the user experience significantly by allowing conversational search over complex engineering documents.

**Independent Test**: Can be tested by asking a specific project-related question and receiving an answer generated by the AI, accompanied by citations to the relevant documents.

**Acceptance Scenarios**:

1. **Given** a user has access to a specific project, **When** they ask a question via the RAG interface, **Then** the system returns an answer based ONLY on documents within that project's scope.
2. **Given** the RAG system is queried, **When** generating an answer, **Then** the response is returned in under 10 seconds (measured at p95, per SC-002).
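
One way to satisfy the first scenario is to attach a mandatory project filter to every vector search. The sketch below builds a request body in the shape Qdrant's point search accepts (`vector`, `limit`, and a `filter` with `must` conditions). The payload field name `project_id`, the default limit of 5, and the helper names are assumptions; the spec itself only requires filtering by project identifier.

```typescript
// Subset of the Qdrant filter shape that this sketch needs.
interface MatchCondition {
  key: string;
  match: { value: string };
}

interface ProjectFilter {
  must: MatchCondition[];
}

interface SearchRequest {
  vector: number[];
  limit: number;
  filter: ProjectFilter;
}

// "project_id" is an assumed payload field name, not one defined by the spec.
function buildProjectFilter(projectId: string): ProjectFilter {
  if (projectId.trim() === "") {
    // Refuse to build an unscoped search rather than silently searching all projects.
    throw new Error("projectId is required for every vector search");
  }
  return { must: [{ key: "project_id", match: { value: projectId } }] };
}

// The filter is not optional: callers cannot construct a request without a project.
function buildSearchRequest(queryVector: number[], projectId: string, limit: number = 5): SearchRequest {
  return { vector: queryVector, limit, filter: buildProjectFilter(projectId) };
}
```

Making the project identifier a required argument, with no code path that omits the filter, is the point of the sketch: the tenant boundary is enforced by construction and does not depend on caller discipline.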

---

### User Story 3 - AI Audit Log Management (Priority: P3)

As a System Admin, I want to view and manage AI audit logs so that I can provide feedback to the development team to improve the AI models and clean up test data.

**Why this priority**: Essential for continuous improvement of the AI models and keeping the database clean, but not a blocker for the core user workflows.

**Independent Test**: Can be tested by generating AI suggestions, verifying they are logged in the audit table, and then having a System Admin perform a hard delete on those logs.

**Acceptance Scenarios**:

1. **Given** an AI suggestion is presented to a user, **When** the user confirms or overrides it, **Then** the action is recorded in the AI audit log with confidence scores and user feedback.
2. **Given** a System Admin is viewing the AI audit logs, **When** they select records to delete, **Then** the records are hard-deleted and this deletion is recorded in the main compliance audit log.
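
The second scenario couples two writes: removing AI audit records and recording that removal in the compliance log. The in-memory sketch below illustrates that coupling only. The class name, entry fields, and the `AI_AUDIT_LOG_HARD_DELETE` action label are hypothetical, and a real implementation would presumably perform both writes in a single database transaction.

```typescript
interface AiAuditLogEntry {
  id: string;
  suggestion: string;
  finalValue: string;
  confidence: number;
  action: "confirmed" | "overridden";
}

interface ComplianceLogEntry {
  action: string;
  actorId: string;
  targetIds: string[];
  at: Date;
}

// In-memory stand-in for the audit tables; all names here are illustrative.
class InMemoryAuditStore {
  private aiLogs = new Map<string, AiAuditLogEntry>();
  readonly complianceLog: ComplianceLogEntry[] = [];

  record(entry: AiAuditLogEntry): void {
    this.aiLogs.set(entry.id, entry);
  }

  count(): number {
    return this.aiLogs.size;
  }

  // Hard-delete the given records and log the deletion. Returns how many
  // records were actually removed; unknown ids are ignored.
  hardDelete(ids: string[], actorId: string, isSystemAdmin: boolean): number {
    if (!isSystemAdmin) {
      throw new Error("Only a System Admin may hard-delete AI audit logs");
    }
    const deleted = ids.filter((id) => this.aiLogs.delete(id));
    if (deleted.length > 0) {
      this.complianceLog.push({
        action: "AI_AUDIT_LOG_HARD_DELETE", // hypothetical action label
        actorId,
        targetIds: deleted,
        at: new Date(),
      });
    }
    return deleted.length;
  }
}
```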

### Edge Cases

- What happens when the isolated AI host is offline?
  - The system must fail gracefully, hiding AI suggestions and disabling RAG features, while allowing normal manual document operations to continue.
- How does the system handle concurrent processing of a massive number of documents?
  - The system must queue tasks and process them sequentially to prevent memory/resource overload on the isolated machine.
- What happens if the AI suggests an invalid category or enum value?
  - The validation layer must reject invalid values before they are presented to the user or saved to the database.
- What happens if a legacy document fails AI processing (e.g., corruption or timeout)?
  - The document must be marked as "Rejected" in the staging queue with a specific error reason, keeping it visible for Admin review.
- How does the system handle concurrent RAG queries from multiple users?
  - RAG queries must be placed in a BullMQ queue and processed sequentially, or under a strict concurrency limit, to prevent LLM timeouts and crashes. The UI may show a "Waiting in queue..." state while a query is pending.
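
For the invalid category or enum case above, the validation layer can be sketched as a check of each AI-suggested field against an allow-list taken from master data. This is an illustrative sketch under assumptions: the function name, the result shape, and the field names used in the usage example (`discipline`, `docType`) are hypothetical. Fields with no master data entry are rejected as well, which is one possible policy and not something the spec prescribes.

```typescript
// Allowed values per constrained field, as loaded from master data.
type MasterData = Record<string, ReadonlySet<string>>;

interface RejectedValue {
  field: string;
  value: string;
}

interface ValidationResult {
  valid: Record<string, string>;
  rejected: RejectedValue[];
}

// Split AI suggestions into values that exist in master data and values that
// do not. Only the "valid" part should reach the user or the database.
function validateSuggestions(
  suggestions: Record<string, string>,
  masterData: MasterData,
): ValidationResult {
  const valid: Record<string, string> = {};
  const rejected: RejectedValue[] = [];
  for (const [field, value] of Object.entries(suggestions)) {
    const allowed = masterData[field];
    if (allowed !== undefined && allowed.has(value)) {
      valid[field] = value;
    } else {
      rejected.push({ field, value });
    }
  }
  return { valid, rejected };
}

// Usage with hypothetical field names:
const exampleMasterData: MasterData = {
  discipline: new Set(["Civil", "Electrical"]),
  docType: new Set(["RFA", "Correspondence"]),
};
const exampleResult = validateSuggestions(
  { discipline: "Civil", docType: "Memo" },
  exampleMasterData,
);
// exampleResult.valid keeps "discipline"; "docType" lands in exampleResult.rejected.
```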

## Requirements _(mandatory)_

### Functional Requirements

- **FR-001**: System MUST process AI workloads exclusively on a physically isolated machine without direct database or storage access.
- **FR-001b**: System MUST support legacy document ingestion via BOTH a secure Admin API upload endpoint and a watched network folder.
- **FR-002**: System MUST enforce a multi-tenant boundary by filtering vector search queries by project identifier.
- **FR-003**: System MUST provide a staging queue for legacy document migration with human-in-the-loop review.
- **FR-004**: System MUST record AI suggestions and human overrides in a dedicated AI audit log.
- **FR-005**: System MUST allow System Admins to hard-delete AI audit logs, logging the deletion action in the main compliance audit log.
- **FR-005b**: System MUST retain AI audit logs indefinitely until a System Admin explicitly performs a manual hard-delete.
- **FR-006**: System MUST gracefully disable AI features when the AI host is offline, allowing manual data entry to continue uninterrupted.
- **FR-007**: System MUST validate the final payload (whether AI-suggested or human-overridden) against defined master data before database commitment.
- **FR-008**: System MUST utilize BullMQ for asynchronous vector deletion in Qdrant when documents are deleted, ensuring eventual consistency via retries to prevent orphaned vectors.
- **FR-009**: System MUST limit users to 1 active concurrent RAG query, blocking new queries until the active one completes.
- **FR-010**: System MUST enforce a rate limit of maximum 5 RAG queries per minute per user.
- **FR-011**: System SHOULD attempt to abort the underlying LLM generation if a client disconnects during a streaming response; the user must then resubmit the query after reconnecting.
- **FR-012**: For all constrained master data fields, the UI MUST enforce selection via predefined lists (Dropdowns) and prohibit arbitrary free-text input during the human review process.
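
FR-009 and FR-010 together describe an admission check that runs before a RAG query is queued. The sketch below models it in memory with a sliding 60-second window. The class and method names are hypothetical, and a multi-instance deployment would need shared state (for example in Redis, which BullMQ already depends on) in place of local maps. In this sketch a blocked attempt does not count toward the per-minute limit; the spec leaves that choice open.

```typescript
type Admission =
  | { allowed: true }
  | { allowed: false; reason: "ACTIVE_QUERY" | "RATE_LIMITED" };

// In-memory admission guard: 1 active query per user (FR-009) and at most
// 5 admitted queries per user per rolling minute (FR-010).
class RagAdmissionGuard {
  private readonly maxPerWindow: number;
  private readonly windowMs: number;
  private readonly active = new Set<string>();
  private readonly recent = new Map<string, number[]>();

  constructor(maxPerWindow: number = 5, windowMs: number = 60000) {
    this.maxPerWindow = maxPerWindow;
    this.windowMs = windowMs;
  }

  // Try to admit a query for the user at time "now" (milliseconds).
  tryStart(userId: string, now: number = Date.now()): Admission {
    if (this.active.has(userId)) {
      return { allowed: false, reason: "ACTIVE_QUERY" };
    }
    const cutoff = now - this.windowMs;
    const timestamps = (this.recent.get(userId) ?? []).filter((t) => t > cutoff);
    if (timestamps.length >= this.maxPerWindow) {
      this.recent.set(userId, timestamps);
      return { allowed: false, reason: "RATE_LIMITED" };
    }
    timestamps.push(now);
    this.recent.set(userId, timestamps);
    this.active.add(userId);
    return { allowed: true };
  }

  // Must be called when the query completes, fails, or is aborted (FR-011),
  // otherwise the user stays blocked.
  finish(userId: string): void {
    this.active.delete(userId);
  }
}
```

Passing `now` explicitly keeps the guard deterministic in tests; production callers can rely on the `Date.now()` default.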

### Key Entities

- **Migration Review Record**: Represents a document in the staging queue, containing extracted metadata, confidence score, and review status (Pending, Imported, Rejected).
- **AI Audit Log**: Represents a record of AI suggestions vs human decisions for model feedback and training purposes.

## Success Criteria _(mandatory)_

### Measurable Outcomes

- **SC-001**: Legacy document migration pipeline can process 1,000 pages per hour without system degradation.
- **SC-002**: The RAG Q&A system answers user queries with a response time (p95) of under 10 seconds.
- **SC-003**: 100% of AI-processed documents committed to the database have passed validation against master data schemas.
- **SC-004**: Zero cross-project data leakage incidents during RAG queries.
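
SC-002 is stated as a p95 target, so verifying it requires a percentile over measured response times, not an average. The sketch below uses the nearest-rank method, which is one common definition of a percentile; the spec does not say which definition applies, and the function names are illustrative.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error("At least one sample is required");
  }
  if (p <= 0 || p > 100) {
    throw new Error("p must be in the range (0, 100]");
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// True when the p95 latency is strictly under the target (10 seconds by default).
function meetsRagLatencyTarget(latenciesMs: number[], targetMs: number = 10000): boolean {
  return percentile(latenciesMs, 95) < targetMs;
}
```

With 20 samples the p95 rank is 19, so a single slow outlier among 20 queries does not fail the target, but two do.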