feat(ai): unify AI architecture, implement RAG and legacy migration
@@ -1,21 +1,181 @@
# Quickstart: Unified AI Architecture
# Quickstart: Unified AI Architecture (ADR-023)

> **Target Machine:** Desk-5439 (AI Host), IP: `<desk-5439-ip>`
> **Stack:** Ollama + Qdrant + n8n + Redis + NestJS BullMQ

---

## 1. Setup the AI Host (Desk-5439)
1. Install Ollama and pull `gemma4:9b` and `nomic-embed-text`.
2. Start the Qdrant container with persistent storage.
3. Start n8n and configure the API key to connect to the DMS backend.

## 2. Environment Variables (Backend)
Add the following to your `.env`:
### 1.1 Ollama

```bash
AI_HOST_URL=http://<desk-5439-ip>
AI_QDRANT_URL=http://<desk-5439-ip>:6333
AI_N8N_WEBHOOK_URL=http://<desk-5439-ip>:5678
AI_N8N_SERVICE_TOKEN=your-secure-token
# Install Ollama (Linux)
curl -fsSL https://ollama.com/install.sh | sh

# Pull required models
ollama pull gemma2:9b          # RAG generation (OLLAMA_RAG_MODEL)
ollama pull nomic-embed-text   # Embedding (OLLAMA_EMBED_MODEL)

# Verify
ollama list
# gemma2:9b          ...
# nomic-embed-text   ...

# Start Ollama server (default port: 11434)
ollama serve
```

## 3. Usage Flow (RAG)
1. User submits a query via the Next.js `RagChatWidget`.
2. Backend validates JWT and creates a BullMQ job on `rag-query-queue`.
3. Worker retrieves the job, injects the `projectPublicId` filter into Qdrant.
4. Worker fetches context, queries Ollama, and streams/returns the response.
### 1.2 Qdrant (Vector Database)

```bash
# Start Qdrant with persistent storage via Docker
docker run -d \
  --name qdrant \
  -p 6333:6333 \
  -p 6334:6334 \
  -v /opt/qdrant/data:/qdrant/storage \
  qdrant/qdrant:latest

# Verify (Qdrant serves /healthz, /livez and /readyz; it has no /health route)
curl http://localhost:6333/healthz
# healthz check passed
```

Collection `lcbp3_vectors` is created automatically on first vector ingest.
Vector size: **768** (nomic-embed-text output dimension).
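
If the collection ever needs to be created by hand, the request would look roughly like the sketch below. It only builds the REST call for Qdrant's `PUT /collections/{name}` endpoint; the helper names are hypothetical, and the `Cosine` distance is an assumption, since the quickstart fixes only the vector size.

```typescript
// Sketch only: builds (does not send) the request that creates the collection.
// "Cosine" is an assumed distance metric; the quickstart only states size 768.
type Distance = "Cosine" | "Dot" | "Euclid";

interface CreateCollectionRequest {
  method: "PUT";
  url: string;
  body: { vectors: { size: number; distance: Distance } };
}

function buildCreateCollectionRequest(
  qdrantUrl: string,
  collection: string = "lcbp3_vectors",
  size: number = 768,
  distance: Distance = "Cosine",
): CreateCollectionRequest {
  const base = qdrantUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    method: "PUT",
    url: `${base}/collections/${encodeURIComponent(collection)}`,
    body: { vectors: { size, distance } },
  };
}

// Reject embeddings whose length does not match the collection's vector size.
function assertEmbeddingSize(embedding: number[], expected: number = 768): void {
  if (embedding.length !== expected) {
    throw new Error(`expected ${expected} dimensions, got ${embedding.length}`);
  }
}
```

Checking the embedding length before an upsert surfaces a wrong `OLLAMA_EMBED_MODEL` early instead of as a Qdrant error.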

### 1.3 n8n (Workflow Orchestrator)

```bash
docker run -d \
  --name n8n \
  -p 5678:5678 \
  -e N8N_BASIC_AUTH_ACTIVE=true \
  -e N8N_BASIC_AUTH_USER=admin \
  -e N8N_BASIC_AUTH_PASSWORD=<secure-password> \
  -v /opt/n8n/data:/home/node/.n8n \
  n8nio/n8n:latest
```

Configure the DMS backend webhook URL as `http://<backend-ip>:3001/api/ai/callback`.

### 1.4 Redis (BullMQ + Cache)

Redis should already be running as part of the core LCBP3 stack.
BullMQ queues registered in the AI module:

| Queue | Purpose | Concurrency |
|---|---|---|
| `ai-ingest-queue` | Legacy PDF batch ingestion | 2 |
| `ai-rag-query` | RAG Q&A LLM generation | **1** (VRAM guard) |
| `ai-vector-deletion` | Async Qdrant cleanup | 3 |
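
The table can be mirrored as a typed constant so queue names and concurrency values live in one place. This is a sketch, not the module's actual code: `AI_QUEUES` and `worstCaseWaitMs` are hypothetical names, and the wait estimate assumes every job ahead in the queue runs for the full timeout.

```typescript
// Hypothetical mirror of the queue table; names and numbers come from the table.
type AiQueueName = "ai-ingest-queue" | "ai-rag-query" | "ai-vector-deletion";

interface AiQueueConfig {
  purpose: string;
  concurrency: number;
}

const AI_QUEUES: Record<AiQueueName, AiQueueConfig> = {
  "ai-ingest-queue": { purpose: "Legacy PDF batch ingestion", concurrency: 2 },
  "ai-rag-query": { purpose: "RAG Q&A LLM generation", concurrency: 1 }, // VRAM guard
  "ai-vector-deletion": { purpose: "Async Qdrant cleanup", concurrency: 3 },
};

// Upper bound on how long the job at 0-based `position` waits before it starts,
// assuming every job ahead of it runs for the full `jobTimeoutMs`.
function worstCaseWaitMs(
  queue: AiQueueName,
  position: number,
  jobTimeoutMs: number,
): number {
  const { concurrency } = AI_QUEUES[queue];
  return Math.floor(position / concurrency) * jobTimeoutMs;
}
```

With concurrency 1 and a 30 s timeout, the tenth RAG request in line could wait up to `worstCaseWaitMs("ai-rag-query", 9, 30000)`, that is 270 000 ms, which is why the per-user rate limit matters.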

---

## 2. Environment Variables (Backend `.env`)

```bash
# ─── Core AI Host ───────────────────────────────────────────
AI_HOST_URL=http://<desk-5439-ip>
AI_QDRANT_URL=http://<desk-5439-ip>:6333
AI_N8N_WEBHOOK_URL=http://<desk-5439-ip>:5678/webhook/lcbp3
AI_N8N_SERVICE_TOKEN=<generate-with: openssl rand -hex 32>

# ─── Ollama Models ──────────────────────────────────────────
OLLAMA_URL=http://<desk-5439-ip>:11434
OLLAMA_RAG_MODEL=gemma2:9b
OLLAMA_EMBED_MODEL=nomic-embed-text

# ─── RAG Tuning ─────────────────────────────────────────────
RAG_TIMEOUT_MS=30000           # 30 second LLM timeout

# ─── AI Timeout ─────────────────────────────────────────────
AI_TIMEOUT_MS=30000            # n8n extraction timeout
```
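
A backend could validate these variables once at startup rather than failing on first use. The loader below is a minimal sketch under assumptions: `AiEnv`, `loadAiEnv`, and the 30 000 ms fallback for unset timeouts are illustrative and not taken from the repository.

```typescript
// Hypothetical startup check for the variables listed above.
interface AiEnv {
  aiHostUrl: string;
  qdrantUrl: string;
  n8nWebhookUrl: string;
  n8nServiceToken: string;
  ollamaUrl: string;
  ragModel: string;
  embedModel: string;
  ragTimeoutMs: number;
  aiTimeoutMs: number;
}

type RawEnv = Record<string, string | undefined>;

function requireVar(env: RawEnv, name: string): string {
  const value = (env[name] || "").trim();
  if (value === "") throw new Error(`Missing required environment variable: ${name}`);
  return value;
}

// Parses a millisecond value; an unset variable falls back to `fallbackMs` (assumed default).
function timeoutVar(env: RawEnv, name: string, fallbackMs: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallbackMs;
  const parsed = parseInt(raw, 10);
  if (isNaN(parsed) || parsed <= 0) throw new Error(`${name} must be a positive integer (ms)`);
  return parsed;
}

function loadAiEnv(env: RawEnv): AiEnv {
  return {
    aiHostUrl: requireVar(env, "AI_HOST_URL"),
    qdrantUrl: requireVar(env, "AI_QDRANT_URL"),
    n8nWebhookUrl: requireVar(env, "AI_N8N_WEBHOOK_URL"),
    n8nServiceToken: requireVar(env, "AI_N8N_SERVICE_TOKEN"),
    ollamaUrl: requireVar(env, "OLLAMA_URL"),
    ragModel: requireVar(env, "OLLAMA_RAG_MODEL"),
    embedModel: requireVar(env, "OLLAMA_EMBED_MODEL"),
    ragTimeoutMs: timeoutVar(env, "RAG_TIMEOUT_MS", 30000),
    aiTimeoutMs: timeoutVar(env, "AI_TIMEOUT_MS", 30000),
  };
}
```

In a Node process this would typically be called as `loadAiEnv(process.env)` during bootstrap.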

---

## 3. Usage Flows

### 3.1 RAG Conversational Q&A

```
User → RagChatWidget (Next.js)
→ POST /api/ai/rag/query { question, projectPublicId }
→ BullMQ: ai-rag-query (concurrency=1)
→ AiRagProcessor
→ AiQdrantService.searchByProject (project isolation enforced)
→ Ollama /api/embeddings (nomic-embed-text)
→ Ollama /api/generate (gemma2:9b)
→ Redis result stored (TTL: 5min)
→ GET /api/ai/rag/jobs/:requestPublicId (polling every 2s)
→ Response: { answer, citations, confidence }
```
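
Project isolation in the search step comes down to always attaching a payload filter to the Qdrant query. The sketch below builds a body for Qdrant's `POST /collections/{name}/points/search` endpoint; the payload key `projectPublicId`, the default `limit` of 5, and the function name are assumptions for illustration.

```typescript
// Sketch of a project-scoped search body; the payload key name is assumed.
interface QdrantSearchBody {
  vector: number[];
  limit: number;
  with_payload: boolean;
  filter: { must: Array<{ key: string; match: { value: string } }> };
}

function buildProjectSearchBody(
  queryVector: number[],
  projectPublicId: string,
  limit: number = 5,
): QdrantSearchBody {
  // Refuse to search without a project id so isolation cannot be skipped by accident.
  if (projectPublicId.trim() === "") {
    throw new Error("projectPublicId is required for RAG search");
  }
  return {
    vector: queryVector,
    limit,
    with_payload: true, // payload carries the text used for citations
    filter: {
      must: [{ key: "projectPublicId", match: { value: projectPublicId } }],
    },
  };
}
```

Building the filter in one helper, rather than at each call site, keeps the isolation rule in a single place that is easy to audit.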

**Rate limit:** 5 requests/minute per user.
**FR-009:** Only 1 active job per user at a time (Redis-enforced).
**FR-011:** Cancel via `DELETE /api/ai/rag/jobs/:requestPublicId`.
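
The two admission rules (5 requests per minute, one active job per user) can be illustrated with an in-memory sketch. The real system enforces them through Redis; the `RagAdmission` class below is hypothetical and only shows the logic, with the clock passed in explicitly so it is easy to test.

```typescript
// In-memory illustration only; production enforcement is Redis-backed.
type Admission = "ok" | "rate_limited" | "job_active";

class RagAdmission {
  private readonly hits = new Map<string, number[]>();
  private readonly active = new Set<string>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number = 5, windowMs: number = 60000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Decide whether `userId` may enqueue a RAG job at time `now` (ms).
  tryStart(userId: string, now: number): Admission {
    // Keep only the accepted requests that are still inside the sliding window.
    const recent = (this.hits.get(userId) || []).filter((t) => now - t < this.windowMs);
    this.hits.set(userId, recent);
    if (this.active.has(userId)) return "job_active";
    if (recent.length >= this.limit) return "rate_limited";
    recent.push(now);
    this.active.add(userId);
    return "ok";
  }

  // Call when the job completes, fails, or is cancelled.
  finish(userId: string): void {
    this.active.delete(userId);
  }
}
```

In this sketch rejected requests do not count toward the window; whether the backend counts them is not stated in the source.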

### 3.2 Real-time Document Extraction

```
User uploads document →
POST /api/ai/extract { attachmentPublicId, projectPublicId }
→ AiService.extractRealtime
→ n8n webhook (OCR + Gemma4 extraction)
→ POST /api/ai/callback (n8n callback with Bearer token)
→ AiAuditLog saved with AI suggestion JSON
```

**Permission required:** `ai.extract` (standard DMS user role).
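
The callback step depends on checking the Bearer token that n8n sends against `AI_N8N_SERVICE_TOKEN`. A minimal sketch of that check follows; the function names are hypothetical, and in a Node backend `crypto.timingSafeEqual` would be the more usual choice for the comparison.

```typescript
// Compare two strings without exiting early on the first mismatching character.
function constantTimeEqual(a: string, b: string): boolean {
  const len = Math.max(a.length, b.length);
  let diff = a.length ^ b.length;
  for (let i = 0; i < len; i++) {
    // charCodeAt returns NaN past the end of a string; treat that as 0.
    diff |= (a.charCodeAt(i) || 0) ^ (b.charCodeAt(i) || 0);
  }
  return diff === 0;
}

// Returns true only for an `Authorization: Bearer <token>` header matching the expected token.
function isAuthorizedCallback(
  authorizationHeader: string | undefined,
  expectedToken: string,
): boolean {
  if (!authorizationHeader || expectedToken.length === 0) return false;
  const match = /^Bearer\s+(\S+)$/.exec(authorizationHeader.trim());
  const token = match ? match[1] : undefined;
  return token !== undefined && constantTimeEqual(token, expectedToken);
}
```

A request that fails this check corresponds to the "Callback returns 401" case: the token in n8n and the one in `.env` have drifted apart.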

### 3.3 Legacy Migration Batch Ingest

```
n8n POST /api/ai/legacy-migration/ingest (ServiceAccountGuard)
→ AiIngestService.ingest (PDFs → MigrationReviewRecord)
→ BullMQ: ai-ingest-queue
→ Admin reviews via GET /api/ai/legacy-migration/queue
→ POST /api/ai/legacy-migration/queue/:publicId/approve
→ MigrationService.importCorrespondence
→ AiAuditLog saved with { aiSuggestionJson, humanOverrideJson }
```

**Permission required:** `ai.migration_manage`.
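
Storing both `aiSuggestionJson` and `humanOverrideJson` makes it possible to see which fields the reviewer changed. The helper below is one way to compute that difference; `diffOverrides` and the field names in the usage note are hypothetical, and nested values are compared by their JSON text, so key order inside nested objects matters.

```typescript
// Sketch: list the fields where the human reviewer overrode the AI suggestion.
type FieldMap = Record<string, unknown>;

interface FieldOverride {
  field: string;
  aiValue: unknown;
  humanValue: unknown;
}

function diffOverrides(ai: FieldMap, human: FieldMap): FieldOverride[] {
  // Union of keys from both objects, de-duplicated and sorted for stable output.
  const fields = Object.keys(ai)
    .concat(Object.keys(human))
    .filter((key, index, all) => all.indexOf(key) === index)
    .sort();

  const changes: FieldOverride[] = [];
  for (const field of fields) {
    // JSON text comparison handles primitives, arrays, and plain objects.
    if (JSON.stringify(ai[field]) !== JSON.stringify(human[field])) {
      changes.push({ field, aiValue: ai[field], humanValue: human[field] });
    }
  }
  return changes;
}
```

For example, `diffOverrides({ subject: "Draft", pages: 3 }, { subject: "Final", pages: 3 })` reports a single override on `subject`.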

### 3.4 Vector Cleanup (Async)

When an attachment is deleted:
```
RagService.deleteVectors(attachmentPublicId)
→ DocumentChunk deleted (synchronous, DB)
→ BullMQ: ai-vector-deletion (async, 3 retries exponential)
→ AiVectorDeletionProcessor
→ AiQdrantService.deleteByDocumentPublicId
```
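
To make "3 retries exponential" concrete, the sketch below computes a doubling retry schedule and builds a delete-by-filter body for Qdrant's `POST /collections/{name}/points/delete` endpoint. The 1 000 ms base delay, the payload key `documentPublicId`, and both function names are assumptions; the source does not state them.

```typescript
// Delay before retry number `attempt` (1-based), doubling each time.
function backoffDelayMs(attempt: number, baseDelayMs: number = 1000): number {
  return Math.round(Math.pow(2, attempt - 1) * baseDelayMs);
}

// Full schedule for a job that is retried `retries` times.
function retrySchedule(retries: number = 3, baseDelayMs: number = 1000): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt <= retries; attempt++) {
    delays.push(backoffDelayMs(attempt, baseDelayMs));
  }
  return delays;
}

// Body for deleting every point that belongs to one document (payload key assumed).
function buildDeleteByDocumentBody(documentPublicId: string): {
  filter: { must: Array<{ key: string; match: { value: string } }> };
} {
  if (documentPublicId.trim() === "") {
    throw new Error("documentPublicId is required");
  }
  return {
    filter: {
      must: [{ key: "documentPublicId", match: { value: documentPublicId } }],
    },
  };
}
```

Because the database rows are removed synchronously and the vectors asynchronously, a short window exists in which Qdrant still holds points for a deleted attachment; the retries bound how long that window can last when Qdrant is briefly unavailable.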

---

## 4. Audit Logs

AI audit logs are stored in the `ai_audit_logs` table.

**Hard delete (SYSTEM_ADMIN only):**
```http
DELETE /api/ai/audit-logs?olderThanDays=90
DELETE /api/ai/audit-logs?documentPublicId=<uuid>
```
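
The two query forms translate into two deletion criteria. The parser below is a sketch of how a handler might interpret them; it assumes exactly one of the two parameters is supplied per request, which the source does not specify, and all names are hypothetical.

```typescript
// Hypothetical interpretation of the DELETE /api/ai/audit-logs query string.
interface AuditDeleteQuery {
  olderThanDays?: string;
  documentPublicId?: string;
}

type AuditDeleteCriteria =
  | { kind: "olderThan"; cutoff: Date }
  | { kind: "document"; documentPublicId: string };

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function parseAuditDeleteQuery(query: AuditDeleteQuery, now: Date): AuditDeleteCriteria {
  const hasDays = query.olderThanDays !== undefined;
  const hasDoc = query.documentPublicId !== undefined;
  // Assumption: exactly one criterion per request, so a bare DELETE never wipes the table.
  if (hasDays === hasDoc) {
    throw new Error("Provide exactly one of olderThanDays or documentPublicId");
  }
  if (hasDays) {
    const raw = query.olderThanDays as string;
    if (!/^\d+$/.test(raw) || parseInt(raw, 10) <= 0) {
      throw new Error("olderThanDays must be a positive integer");
    }
    // Rows created before this instant are eligible for deletion.
    return { kind: "olderThan", cutoff: new Date(now.getTime() - parseInt(raw, 10) * MS_PER_DAY) };
  }
  const id = (query.documentPublicId as string).trim();
  if (id === "") throw new Error("documentPublicId must not be empty");
  return { kind: "document", documentPublicId: id };
}
```

Rejecting a request with no criterion is a deliberate safety choice in this sketch, since the endpoint performs hard deletes.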

---

## 5. Troubleshooting

| Symptom | Cause | Fix |
|---|---|---|
| RAG returns `RAG_NOT_READY` | Qdrant not reachable | Check `AI_QDRANT_URL`, restart the Qdrant container |
| RAG returns `ไม่พบข้อมูลในเอกสารที่ระบุ` ("No information found in the specified documents") | No vectors for project | Trigger document re-ingest via RAG module |
| Callback returns 401 | Wrong `AI_N8N_SERVICE_TOKEN` | Regenerate token, update n8n + `.env` |
| Jobs stuck in `pending` | Redis/BullMQ not running | Run `docker ps` and check that the Redis container is up |
| Ollama timeout | Model too large for VRAM | Use `gemma2:2b` for low-resource machines |
| Qdrant 5xx on vector insert | Collection not initialized | Restart backend (auto-creates collection on `onModuleInit`) |

@@ -8,9 +8,9 @@

**Purpose**: Project initialization and basic structure

- [ ] T001 Initialize `AiModule` inside `backend/src/ai/ai.module.ts`
- [ ] T002 [P] Install `qdrant-js` client dependency in the backend workspace
- [ ] T003 Add `AI_HOST_URL`, `AI_QDRANT_URL`, `AI_N8N_SERVICE_TOKEN` to backend `.env` configuration
- [X] T001 Initialize `AiModule` inside `backend/src/ai/ai.module.ts`
- [X] T002 [P] Install `qdrant-js` client dependency in the backend workspace
- [X] T003 Add `AI_HOST_URL`, `AI_QDRANT_URL`, `AI_N8N_SERVICE_TOKEN` to backend `.env` configuration

---

@@ -19,11 +19,11 @@
**Purpose**: Core infrastructure that MUST be complete before ANY user story can be implemented
**⚠️ CRITICAL**: No user story work can begin until this phase is complete

- [ ] T004 Setup `QdrantService` in `backend/src/ai/qdrant.service.ts` to manage vector DB connections
- [ ] T005 [P] Setup BullMQ infrastructure in `AiModule` (configure `AiQueueService`)
- [ ] T006 [P] Implement `ServiceAccountGuard` to validate n8n service tokens for internal API routes
- [ ] T007 Implement SQL Schema Deltas for `migration_review_queue` and `ai_audit_logs` in MariaDB
- [ ] T008 Implement TypeORM base entities mapping to the created SQL tables
- [X] T004 Setup `QdrantService` in `backend/src/ai/qdrant.service.ts` to manage vector DB connections
- [X] T005 [P] Setup BullMQ infrastructure in `AiModule` (configure `AiQueueService`)
- [X] T006 [P] Implement `ServiceAccountGuard` to validate n8n service tokens for internal API routes
- [X] T007 Implement SQL Schema Deltas for `migration_review_queue` and `ai_audit_logs` in MariaDB
- [X] T008 Implement TypeORM base entities mapping to the created SQL tables

**Checkpoint**: Foundation ready - user story implementation can now begin

@@ -36,16 +36,16 @@

### Implementation for User Story 1

- [ ] T009 [P] [US1] Create `MigrationReviewRecord` TypeORM Entity in `backend/src/ai/entities/migration-review.entity.ts`
- [ ] T010 [US1] Implement `AiIngestService` to handle batch ingestion and queue creation
- [ ] T011 [US1] Implement `POST /api/ai/legacy-migration/ingest` in `AiController` using `ServiceAccountGuard`
- [ ] T011b [P] [US1] Export n8n workflow definition to `backend/src/ai/workflows/folder-watcher.json` to monitor the network directory and POST to the ingest API (FR-001b)
- [ ] T012 [US1] Implement `GET /api/ai/legacy-migration/queue` in `AiController`
- [ ] T013 [US1] Implement `POST /api/ai/legacy-migration/queue/{publicId}/approve` with Zod/class-validator payload checking (FR-007)
- [ ] T014 [P] [US1] Create Frontend API hooks for staging queue in `frontend/src/lib/api/ai.ts`
- [ ] T015 [US1] Build Frontend Staging Queue Table UI in `frontend/src/app/(dashboard)/ai-staging/page.tsx`
- [ ] T016 [US1] Implement UI Form dropdown constraints for master data fields in the approval modal (FR-012)
- [ ] T017 [US1] Build `AiStatusBanner.tsx` component in `frontend/src/components/ai/AiStatusBanner.tsx` to handle offline graceful degradation
- [X] T009 [P] [US1] Create `MigrationReviewRecord` TypeORM Entity in `backend/src/ai/entities/migration-review.entity.ts`
- [X] T010 [US1] Implement `AiIngestService` to handle batch ingestion and queue creation
- [X] T011 [US1] Implement `POST /api/ai/legacy-migration/ingest` in `AiController` using `ServiceAccountGuard`
- [X] T011b [P] [US1] Export n8n workflow definition to `backend/src/ai/workflows/folder-watcher.json` to monitor the network directory and POST to the ingest API (FR-001b)
- [X] T012 [US1] Implement `GET /api/ai/legacy-migration/queue` in `AiController`
- [X] T013 [US1] Implement `POST /api/ai/legacy-migration/queue/{publicId}/approve` with Zod/class-validator payload checking (FR-007)
- [X] T014 [P] [US1] Create Frontend API hooks for staging queue in `frontend/src/lib/api/ai.ts`
- [X] T015 [US1] Build Frontend Staging Queue Table UI in `frontend/src/app/(dashboard)/ai-staging/page.tsx`
- [X] T016 [US1] Implement UI Form dropdown constraints for master data fields in the approval modal (FR-012)
- [X] T017 [US1] Build `AiStatusBanner.tsx` component in `frontend/src/components/ai/AiStatusBanner.tsx` to handle offline graceful degradation

**Checkpoint**: At this point, User Story 1 should be fully functional.

@@ -58,12 +58,12 @@

### Implementation for User Story 2

- [ ] T018 [P] [US2] Create BullMQ Processor `rag.processor.ts` with strict concurrency limit = 1 (FR-009)
- [ ] T019 [US2] Implement `AiRagService` containing Ollama LLM integration logic
- [ ] T020 [US2] Enforce `projectPublicId` filtering natively in Qdrant search payload inside `AiRagService`
- [ ] T021 [US2] Implement `POST /api/ai/rag/query` to push jobs to BullMQ and apply rate limiting (5 per min) (FR-010)
- [ ] T022 [US2] Add AbortController logic to backend processor to cancel LLM generation on client disconnect (FR-011)
- [ ] T023 [P] [US2] Build `RagChatWidget.tsx` component with streaming/polling UI for queue wait status
- [X] T018 [P] [US2] Create BullMQ Processor `rag.processor.ts` with strict concurrency limit = 1 (FR-009)
- [X] T019 [US2] Implement `AiRagService` containing Ollama LLM integration logic
- [X] T020 [US2] Enforce `projectPublicId` filtering natively in Qdrant search payload inside `AiRagService`
- [X] T021 [US2] Implement `POST /api/ai/rag/query` to push jobs to BullMQ and apply rate limiting (5 per min) (FR-010)
- [X] T022 [US2] Add AbortController logic to backend processor to cancel LLM generation on client disconnect (FR-011)
- [X] T023 [P] [US2] Build `RagChatWidget.tsx` component with streaming/polling UI for queue wait status

**Checkpoint**: RAG capability is fully implemented and throttled safely.

@@ -76,11 +76,11 @@

### Implementation for User Story 3

- [ ] T024 [P] [US3] Create `AiAuditLog` TypeORM Entity in `backend/src/ai/entities/ai-audit-log.entity.ts`
- [ ] T025 [US3] Inject Audit Log creation logic into the `/approve` endpoint (capture Human vs AI differences)
- [ ] T026 [US3] Implement `DELETE /api/ai/audit-logs` endpoint with `@UseGuards(CaslAbilityGuard)` checking for `SYSTEM_ADMIN`
- [ ] T027 [US3] Create BullMQ Processor `vector-deletion.processor.ts` to handle asynchronous vector cleanup (FR-008)
- [ ] T028 [US3] Integrate `vector-deletion-queue` dispatch into the main Document Deletion service
- [X] T024 [P] [US3] Create `AiAuditLog` TypeORM Entity in `backend/src/ai/entities/ai-audit-log.entity.ts`
- [X] T025 [US3] Inject Audit Log creation logic into the `/approve` endpoint (capture Human vs AI differences)
- [X] T026 [US3] Implement `DELETE /api/ai/audit-logs` endpoint with `@UseGuards(CaslAbilityGuard)` checking for `SYSTEM_ADMIN`
- [X] T027 [US3] Create BullMQ Processor `vector-deletion.processor.ts` to handle asynchronous vector cleanup (FR-008)
- [X] T028 [US3] Integrate `vector-deletion-queue` dispatch into the main Document Deletion service

**Checkpoint**: AI Audit and safe vector cleanup are complete.

@@ -90,9 +90,9 @@

**Purpose**: Improvements that affect multiple user stories

- [ ] T029 Code cleanup and CASL RBAC matrix review for all AI endpoints
- [ ] T030 E2E Validation of the BullMQ concurrency limit (stress test 10 concurrent requests)
- [ ] T031 Finalize `README.md` and `quickstart.md` documentation for Desk-5439 setup
- [X] T029 Code cleanup and CASL RBAC matrix review for all AI endpoints
- [X] T030 E2E Validation of the BullMQ concurrency limit (stress test 10 concurrent requests)
- [X] T031 Finalize `README.md` and `quickstart.md` documentation for Desk-5439 setup

---