Feature Specification: Unified AI Architecture
Feature Branch: 301-unified-ai-arch
Created: 2026-05-14
Status: Draft
Input: User description: "ADR-023: Unified AI Architecture"
Clarifications
Session 2026-05-14
- Q: Legacy Migration Input Mechanism (How are legacy PDFs fed into the AI pipeline?) → A: Both a secure Admin API endpoint and a watched network folder.
- Q: Failure Handling for Corrupted PDFs (How does the system handle documents that fail AI processing?) → A: Mark as "Rejected" in the staging queue with an error reason.
- Q: AI Audit Log Retention Policy (How long are AI audit logs retained?) → A: Indefinite retention until manually deleted by a System Admin.
- Q: RAG Concurrency Handling (How are concurrent RAG queries managed to prevent LLM timeouts?) → A: Request Queuing via BullMQ for sequential processing.
User Scenarios & Testing (mandatory)
User Story 1 - Legacy Document Migration and Review (Priority: P1)
As a System Administrator, I want to process legacy PDF documents through an AI pipeline and review the extracted metadata in a staging queue so that I can ensure data integrity before importing them into the main database.
Why this priority: Legacy document migration is the most critical immediate need for the LCBP3 project, as 20,000+ documents need to be imported securely and accurately.
Independent Test: Can be fully tested by uploading a batch of PDF documents, verifying they appear in the staging queue with AI-extracted metadata, and then approving or rejecting them. Approved documents are committed to permanent storage; rejected documents stay in the staging queue.
Acceptance Scenarios:
- Given a batch of legacy PDF documents, When processed by the AI pipeline, Then they appear in the staging queue with extracted metadata and confidence scores.
- Given a document in the staging queue with a high confidence score, When reviewed by an Admin, Then it is marked as ready for import without warnings.
- Given a document in the staging queue, When the Admin approves it, Then the file is moved to permanent storage and committed to the main database.
User Story 2 - RAG Conversational Q&A (Priority: P2)
As a Document Controller or Org Admin, I want to ask natural language questions about project documents (RFA/Correspondence) and get context-aware answers so that I can quickly find information without manually reading full texts.
Why this priority: Enhances the user experience significantly by allowing conversational search over complex engineering documents.
Independent Test: Can be tested by asking a specific project-related question and receiving an answer generated by the AI, accompanied by citations to the relevant documents.
Acceptance Scenarios:
- Given a user has access to a specific project, When they ask a question via the RAG interface, Then the system returns an answer based ONLY on documents within that project's scope.
- Given the RAG system is queried, When generating an answer, Then the response is returned in under 10 seconds (measured at p95, per SC-002).
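The project-scope scenario above can be sketched as a small filter builder that every RAG search must go through. This is a minimal sketch, not the project's implementation: the payload field name `project_id`, the helper `buildProjectFilter`, and the `doc_type` key in the usage note are assumptions for illustration. The `must` / `key` / `match` shape follows Qdrant's payload filter format.

```typescript
// Hypothetical payload field name; the spec only says "project identifier".
const PROJECT_FIELD = "project_id";

interface FieldCondition {
  key: string;
  match: { value: string | number };
}

interface SearchFilter {
  must: FieldCondition[];
}

// Build a filter that restricts a vector search to a single project.
// Caller-supplied conditions on the project field are dropped so the
// project boundary cannot be overridden.
function buildProjectFilter(
  projectId: string,
  extra: FieldCondition[] = [],
): SearchFilter {
  if (projectId.trim() === "") {
    throw new Error("projectId is required for every RAG search");
  }
  const safeExtra = extra.filter((c) => c.key !== PROJECT_FIELD);
  return {
    must: [{ key: PROJECT_FIELD, match: { value: projectId } }, ...safeExtra],
  };
}
```

For example, `buildProjectFilter("p-1", [{ key: "doc_type", match: { value: "RFA" } }])` yields a filter with two `must` conditions, the first of which is always the project condition.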
User Story 3 - AI Audit Log Management (Priority: P3)
As a System Admin, I want to view and manage AI audit logs so that I can provide feedback to the development team to improve the AI models and clean up test data.
Why this priority: Essential for continuous improvement of the AI models and keeping the database clean, but not a blocker for the core user workflows.
Independent Test: Can be tested by generating AI suggestions, verifying they are logged in the audit table, and then having a System Admin perform a hard delete on those logs.
Acceptance Scenarios:
- Given an AI suggestion is presented to a user, When the user confirms or overrides it, Then the action is recorded in the AI audit log with confidence scores and user feedback.
- Given a System Admin is viewing the AI audit logs, When they select records to delete, Then the records are hard-deleted and this deletion is recorded in the main compliance audit log.
Edge Cases
- What happens when the isolated AI host is offline?
  - The system must fail gracefully, hiding AI suggestions and disabling RAG features, while allowing normal manual document operations to continue.
- How does the system handle concurrent processing of a massive number of documents?
  - The system must queue tasks sequentially to prevent memory/resource overload on the isolated machine.
- What happens if the AI suggests an invalid category or enum value?
  - The validation layer must reject invalid values before they are presented to the user or saved to the database.
- What happens if a legacy document fails AI processing (e.g., corruption or timeout)?
  - The document must be marked as "Rejected" in the staging queue with a specific error reason, keeping it visible for Admin review.
- How does the system handle concurrent RAG queries from multiple users?
  - RAG queries must be placed in a BullMQ queue and processed sequentially (or with strict concurrency limits) to prevent LLM crashes, potentially showing a "Waiting in queue..." state to the user.
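The invalid-enum edge case above can be illustrated with a small validation function that checks an AI-suggested value against master data before it is shown or saved. This is a sketch under stated assumptions: the `masterData` table, the `documentType` field name, and the `validateSuggestion` helper are hypothetical, and the real master data would come from the main database rather than a constant.

```typescript
type ValidationResult =
  | { ok: true; value: string }
  | { ok: false; reason: string };

// Hypothetical master data: field name -> allowed values.
const masterData: Record<string, ReadonlySet<string>> = {
  documentType: new Set(["RFA", "Correspondence"]),
};

// Reject any suggestion whose field or value is not in master data.
function validateSuggestion(
  field: string,
  suggested: string,
  master: Record<string, ReadonlySet<string>> = masterData,
): ValidationResult {
  const allowed = master[field];
  if (allowed === undefined) {
    return { ok: false, reason: `unknown field: ${field}` };
  }
  if (!allowed.has(suggested)) {
    return { ok: false, reason: `invalid value "${suggested}" for ${field}` };
  }
  return { ok: true, value: suggested };
}
```

Returning a result object instead of throwing lets the caller decide whether to hide the suggestion from the UI or to record the reason alongside the staging record.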
Requirements (mandatory)
Functional Requirements
- FR-001: System MUST process AI workloads exclusively on a physically isolated machine without direct database or storage access.
- FR-001b: System MUST support legacy document ingestion via BOTH a secure Admin API upload endpoint and a watched network folder.
- FR-002: System MUST enforce a multi-tenant boundary by filtering vector search queries by project identifier.
- FR-003: System MUST provide a staging queue for legacy document migration with human-in-the-loop review.
- FR-004: System MUST record AI suggestions and human overrides in a dedicated AI audit log.
- FR-005: System MUST allow System Admins to hard-delete AI audit logs, logging the deletion action in the main compliance audit log.
- FR-005b: System MUST retain AI audit logs indefinitely until a System Admin explicitly performs a manual hard-delete.
- FR-006: System MUST gracefully disable AI features when the AI host is offline, allowing manual data entry to continue uninterrupted.
- FR-007: System MUST validate the final payload (whether AI-suggested or human-overridden) against defined master data before database commitment.
- FR-008: System MUST utilize BullMQ for asynchronous vector deletion in Qdrant when documents are deleted, ensuring eventual consistency via retries to prevent orphaned vectors.
- FR-009: System MUST limit users to 1 active concurrent RAG query, blocking new queries until the active one completes.
- FR-010: System MUST enforce a rate limit of maximum 5 RAG queries per minute per user.
- FR-011: System SHOULD attempt to abort the underlying LLM generation if a client disconnects during a streaming response; the user must then resubmit the query after reconnecting.
- FR-012: For all constrained master data fields, the UI MUST enforce selection via predefined lists (Dropdowns) and prohibit arbitrary free-text input during the human review process.
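FR-009 and FR-010 can be sketched together as a per-user gate that is checked before a RAG query is enqueued. This is an illustrative, in-memory sketch: the class name `RagQueryGate`, the sliding-window interpretation of "5 per minute", and the single-process state are all assumptions. A multi-instance deployment would need shared state, which the spec does not describe.

```typescript
type Decision = "allowed" | "busy" | "rate_limited";

class RagQueryGate {
  private readonly maxPerWindow: number;
  private readonly windowMs: number;
  private readonly active = new Set<string>();
  private readonly starts = new Map<string, number[]>();

  constructor(maxPerWindow = 5, windowMs = 60000) {
    this.maxPerWindow = maxPerWindow;
    this.windowMs = windowMs;
  }

  // Decide whether a user may start a query at time nowMs (milliseconds).
  tryStart(userId: string, nowMs: number): Decision {
    // FR-009: at most one active query per user.
    if (this.active.has(userId)) return "busy";

    // FR-010: count only query starts inside the sliding window.
    const recent = (this.starts.get(userId) ?? []).filter(
      (t) => nowMs - t < this.windowMs,
    );
    if (recent.length >= this.maxPerWindow) {
      this.starts.set(userId, recent);
      return "rate_limited";
    }

    recent.push(nowMs);
    this.starts.set(userId, recent);
    this.active.add(userId);
    return "allowed";
  }

  // Call when the query completes, fails, or is aborted.
  finish(userId: string): void {
    this.active.delete(userId);
  }
}
```

Passing the clock in as `nowMs` keeps the gate deterministic in tests; production code would typically pass `Date.now()`. A blocked attempt ("busy") is not counted toward the per-minute limit in this sketch, which is a design choice rather than something the requirements specify.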
Key Entities
- Migration Review Record: Represents a document in the staging queue, containing extracted metadata, confidence score, and review status (Pending, Imported, Rejected).
- AI Audit Log: Represents a record of AI suggestions vs human decisions for model feedback and training purposes.
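The two entities above might be modelled along the following lines. Every field name here is an assumption made for illustration; the spec only fixes the three review statuses and the presence of extracted metadata, a confidence score, and an error reason for rejected documents. The rule that only `Pending` records can change status is also an assumption.

```typescript
type ReviewStatus = "Pending" | "Imported" | "Rejected";

// Hypothetical shape of a staging-queue record.
interface MigrationReviewRecord {
  id: string;
  extractedMetadata: Record<string, string>;
  confidenceScore: number; // assumed to be in the range 0..1
  status: ReviewStatus;
  errorReason?: string; // set when status is "Rejected"
}

// Hypothetical shape of an AI audit log entry.
interface AiAuditLogEntry {
  field: string;
  aiSuggestion: string;
  finalValue: string;
  confidenceScore: number;
  userAction: "confirmed" | "overridden";
}

// Assumed rule: only Pending records may change status.
function assertPending(record: MigrationReviewRecord): void {
  if (record.status !== "Pending") {
    throw new Error(`record ${record.id} is already ${record.status}`);
  }
}

// Mark a record as Rejected with an error reason, without mutating the input.
function markRejected(
  record: MigrationReviewRecord,
  reason: string,
): MigrationReviewRecord {
  assertPending(record);
  return { ...record, status: "Rejected", errorReason: reason };
}

// Mark a record as Imported after Admin approval, without mutating the input.
function markImported(record: MigrationReviewRecord): MigrationReviewRecord {
  assertPending(record);
  return { ...record, status: "Imported" };
}
```

Returning a new object rather than mutating the record keeps the previous state available, for example when writing the before and after values to an audit trail.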
Success Criteria (mandatory)
Measurable Outcomes
- SC-001: Legacy document migration pipeline can process 1,000 pages per hour without system degradation.
- SC-002: The RAG Q&A system answers user queries with a response time (p95) of under 10 seconds.
- SC-003: 100% of AI-processed documents pass validation against master data schemas before being committed to the database.
- SC-004: Zero cross-project data leakage incidents during RAG queries.