refactor(ai): OCR sidecar canonical naming cleanup — typhoon→np-dms, remove hardcoded keys, asyncio.to_thread, ADR-040/041
CI / CD Pipeline / build (push) Successful in 7m37s
CI / CD Pipeline / deploy (push) Failing after 20m15s

2026-06-20 16:37:04 +07:00
parent d418d791a4
commit a80ebef285
70 changed files with 5762 additions and 452 deletions
@@ -0,0 +1,36 @@
# Specification Quality Checklist: Single-Host Server Consolidation
**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2026-06-20
**Feature**: [spec.md](../spec.md)
## Content Quality
- [x] No implementation details (languages, frameworks, APIs) — spec focuses on operational outcomes
- [x] Focused on user value and business needs — admin/ops workflows clearly defined
- [x] Written for non-technical stakeholders — user stories describe journeys, not code
- [x] All mandatory sections completed — User Scenarios, Requirements, Success Criteria all filled
## Requirement Completeness
- [x] No [NEEDS CLARIFICATION] markers remain — all requirements have clear definitions
- [x] Requirements are testable and unambiguous — each FR has measurable acceptance criteria
- [x] Success criteria are measurable — SC-001 through SC-010 have specific metrics
- [x] Success criteria are technology-agnostic — focus on outcomes (parity, latency, uptime) not tools
- [x] All acceptance scenarios are defined — 5 user stories with Given/When/Then scenarios
- [x] Edge cases are identified — 7 edge cases covering GPU OOM, RAM, CIFS, SPOF, network, migration failures
- [x] Scope is clearly bounded — includes provisioning, migration, cutover, security, decommission
- [x] Dependencies and assumptions identified — 7 assumptions documented
## Feature Readiness
- [x] All functional requirements have clear acceptance criteria — FR-001 through FR-015 mapped to user stories
- [x] User scenarios cover primary flows — P1 (provision) → P2 (migrate) → P3 (cutover) → P4 (security) → P5 (decommission)
- [x] Feature meets measurable outcomes defined in Success Criteria — 10 measurable outcomes
- [x] No implementation details leak into specification — Docker/tech names are inherent to infra spec but kept at architecture level
## Notes
- This is an infrastructure specification based on ADR-041; some technical terms (Docker, CIFS, VRAM) are inherent to the domain
- ADR-040 (OCR Sidecar Refactor) is a hard dependency for FR-008 (remove X-API-Key) and FR-009 (GPU VRAM management)
- Spec is ready for `/speckit-clarify` or `/speckit-plan`
@@ -0,0 +1,69 @@
# Docker Compose Contract: New Host
**Branch**: `141-server-consolidation` | **Date**: 2026-06-20
This contract defines the service topology for the consolidated single-host deployment.
The actual `docker-compose.new-host.yml` will be created at:
`specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/docker-compose.new-host.yml`
## Service Topology
| Service | Image | Networks | LAN Ports | Internal Port | Memory Limit | Depends On |
|---------|-------|----------|-----------|---------------|--------------|------------|
| ollama | ollama/ollama:latest | dms-internal | none | 11434 | 2G (host) | — |
| ocr-sidecar | build (local) | dms-internal | none | 8765 | 1G | ollama |
| backend | lcbp3-backend:latest | dms-internal, dms-frontend | 3001→3000 | 3000 | 2G | ollama, ocr-sidecar, redis, mariadb, elasticsearch, qdrant, clamav |
| frontend | lcbp3-frontend:latest | dms-frontend | 3000 | 3000 | 1G | backend |
| redis | redis:7-alpine | dms-internal | none | 6379 | 1G | — |
| mariadb | mariadb:11.8 | dms-internal | none | 3306 | 8G | — |
| elasticsearch | elasticsearch:8.11.1 | dms-internal | none | 9200 | 4G | — |
| qdrant | qdrant/qdrant:v1.16.1 | dms-internal | none | 6333 | 1G | — |
| clamav | clamav/clamav:1.4.4 | dms-internal | none | 3310 | 2G | — |
| ollama-metrics | ghcr.io/norskhelsenett/ollama-metrics:latest | dms-internal, dms-frontend | 9924 | 9924 | 256M | ollama |
## Network Topology
```
dms-internal (bridge, no LAN access)
├── ollama:11434
├── ocr-sidecar:8765
├── backend:3000 (also on dms-frontend)
├── redis:6379
├── mariadb:3306
├── elasticsearch:9200
├── qdrant:6333
├── clamav:3310
└── ollama-metrics:9924
dms-frontend (bridge, LAN published)
├── frontend:3000 → LAN:3000
├── backend:3000 → LAN:3001 (NPM routes backend.np-dms.work → :3001)
└── ollama-metrics:9924 → LAN:9924 (Prometheus scrape target)
```
## Environment Variables (New)
| Variable | Default | Description |
|----------|---------|-------------|
| ASUSTOR_USER | (required) | CIFS share username |
| ASUSTOR_PASS | (required) | CIFS share password |
| NEW_HOST_IP | (required) | New host LAN IP for CI/CD deploy target |
## Environment Variables (Changed from QNAP)
| Variable | Old Value (QNAP) | New Value (New Host) |
|----------|------------------|---------------------|
| DB_HOST | mariadb | mariadb (unchanged — Docker DNS) |
| REDIS_HOST | cache | redis (service name change) |
| ELASTICSEARCH_HOST | search | elasticsearch (service name change) |
| QDRANT_HOST | qdrant | qdrant (unchanged) |
| OCR_API_URL | http://192.168.10.100:8765 | http://ocr-sidecar:8765 |
| OLLAMA_API_URL | http://192.168.10.100:11434 | http://ollama:11434 |
| CLAMAV_HOST | clamav | clamav (unchanged) |
## Removed Environment Variables
| Variable | Reason |
|----------|--------|
| OCR_SIDECAR_API_KEY | ADR-040 D5 — network-only auth, no API key needed |
| OCR_SIDECAR_UPLOAD_BASE | Not removed: still required by the sidecar, and the value stays `/mnt/uploads` |
@@ -0,0 +1,230 @@
// File: specs/100-Infrastructures/141-server-consolidation/data-model.md
// Change Log:
// - 2026-06-20: Initial data model for Single-Host Server Consolidation
# Data Model: Single-Host Server Consolidation
**Branch**: `141-server-consolidation` | **Date**: 2026-06-20
## Infrastructure Entities
### 1. Docker Network: dms-internal
| Attribute | Type | Description |
|-----------|------|-------------|
| name | string | `dms-internal` |
| driver | string | `bridge` |
| scope | string | local (single host) |
| published_ports | none | No ports published to LAN |
**Members**: ollama, ocr-sidecar, backend, redis, mariadb, elasticsearch, qdrant, clamav, ollama-metrics
### 2. Docker Network: dms-frontend
| Attribute | Type | Description |
|-----------|------|-------------|
| name | string | `dms-frontend` |
| driver | string | `bridge` |
| scope | string | local (single host) |
| published_ports | 3000 (frontend), 3001→3000 (backend), 9924 (ollama-metrics) | Only ports published to LAN |
**Members**: frontend, backend, ollama-metrics
### 3. Docker Volume: asustor_uploads
| Attribute | Type | Description |
|-----------|------|-------------|
| driver | string | `local` |
| type | string | `cifs` |
| device | string | `//192.168.10.9/np-dms-as/data/uploads` |
| mount_options | string | `username=${ASUSTOR_USER},password=${ASUSTOR_PASS},vers=3.0,uid=0,gid=0` |
| mount_point (sidecar) | string | `/mnt/uploads` (read-only) |
| mount_point (backend) | string | `/app/uploads` (read-write) |
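For manual testing outside Compose, the same volume could be created with `docker volume create`. This is a minimal sketch, assuming the attributes in the table above and `ASUSTOR_USER` / `ASUSTOR_PASS` exported from `.env`. The helper names `cifs_mount_opts` and `create_uploads_volume` are hypothetical and not part of the repository.

```bash
# Sketch only: the real volume is expected to be declared in docker-compose.new-host.yml.

# Build the CIFS mount option string from the data model (hypothetical helper).
cifs_mount_opts() {
  local user="$1" pass="$2"
  printf 'username=%s,password=%s,vers=3.0,uid=0,gid=0' "$user" "$pass"
}

# Create the named volume with the local driver and type=cifs.
# Not called here: it needs a running Docker daemon and a reachable ASUSTOR share.
create_uploads_volume() {
  docker volume create \
    --driver local \
    --opt type=cifs \
    --opt device=//192.168.10.9/np-dms-as/data/uploads \
    --opt o="$(cifs_mount_opts "$ASUSTOR_USER" "$ASUSTOR_PASS")" \
    asustor_uploads
}
```

Calling `create_uploads_volume` once on the new host, then `docker volume inspect asustor_uploads`, would show whether the options were recorded as intended; the share itself is only mounted when a container first uses the volume.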
### 4. Docker Volume: ollama_models
| Attribute | Type | Description |
|-----------|------|-------------|
| driver | string | `local` (named volume) |
| mount_point | string | `/root/.ollama` |
| content | string | Ollama model files (np-dms-ai, np-dms-ocr, nomic-embed-text) |
### 5. Docker Volume: mariadb_data
| Attribute | Type | Description |
|-----------|------|-------------|
| driver | string | `local` (named volume) |
| mount_point | string | `/var/lib/mysql` |
| content | string | MariaDB data files (migrated from QNAP) |
### 6. Docker Volume: es_data
| Attribute | Type | Description |
|-----------|------|-------------|
| driver | string | `local` (named volume) |
| mount_point | string | `/usr/share/elasticsearch/data` |
| content | string | Elasticsearch indices (migrated from QNAP) |
### 7. Docker Volume: redis_data
| Attribute | Type | Description |
|-----------|------|-------------|
| driver | string | `local` (named volume) |
| mount_point | string | `/data` |
| content | string | Redis AOF persistence + BullMQ queue data |
### 8. Docker Volume: qdrant_data
| Attribute | Type | Description |
|-----------|------|-------------|
| driver | string | `local` (named volume) |
| mount_point | string | `/qdrant/storage` |
| content | string | Qdrant vector collections |
## Service Definitions
### ollama
| Attribute | Value |
|-----------|-------|
| image | `ollama/ollama:latest` |
| GPU | NVIDIA RTX 5060 Ti 16GB (passthrough) |
| network | dms-internal only |
| ports | none (expose 11434 internal only) |
| volumes | ollama_models → /root/.ollama |
| depends_on | none |
| healthcheck | `ollama list` (verify API responsive) |
### ocr-sidecar
| Attribute | Value |
|-----------|-------|
| build | `./specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/ocr-sidecar` |
| network | dms-internal only |
| ports | none (expose 8765 internal only) |
| volumes | asustor_uploads → /mnt/uploads (read-only) |
| depends_on | ollama |
| env | OLLAMA_API_URL=http://ollama:11434, OCR_SIDECAR_UPLOAD_BASE=/mnt/uploads |
| healthcheck | `curl -f http://localhost:8765/health` |
### backend
| Attribute | Value |
|-----------|-------|
| image | `lcbp3-backend:${BACKEND_IMAGE_TAG:-latest}` |
| networks | dms-internal + dms-frontend |
| ports | 3001:3000 (published to LAN — NPM routes `backend.np-dms.work` → :3001) |
| volumes | asustor_uploads → /app/uploads (read-write) |
| depends_on | ollama, ocr-sidecar, redis, mariadb, elasticsearch, qdrant, clamav |
| env | OCR_API_URL=http://ocr-sidecar:8765, OLLAMA_API_URL=http://ollama:11434, DB_HOST=mariadb, REDIS_HOST=redis, ELASTICSEARCH_HOST=elasticsearch, QDRANT_HOST=qdrant |
| healthcheck | `curl -f http://localhost:3000/health` |
| memory_limit | 2G |
### frontend
| Attribute | Value |
|-----------|-------|
| image | `lcbp3-frontend:${FRONTEND_IMAGE_TAG:-latest}` |
| networks | dms-frontend only |
| ports | 3000:3000 (published to LAN) |
| depends_on | backend |
| env | INTERNAL_API_URL=http://backend:3000/api |
| healthcheck | `curl -f http://localhost:3000/` |
| memory_limit | 1G |
### redis
| Attribute | Value |
|-----------|-------|
| image | `redis:7-alpine` |
| network | dms-internal only |
| ports | none (expose 6379 internal only) |
| volumes | redis_data → /data |
| command | `redis-server --requirepass ${REDIS_PASSWORD} --appendonly yes --maxmemory-policy noeviction` |
| healthcheck | `redis-cli -a ${REDIS_PASSWORD} --no-auth-warning ping` |
| memory_limit | 1G |
### mariadb
| Attribute | Value |
|-----------|-------|
| image | `mariadb:11.8` |
| network | dms-internal only |
| ports | none (expose 3306 internal only) |
| volumes | mariadb_data → /var/lib/mysql |
| env | MARIADB_ROOT_PASSWORD, MARIADB_DATABASE=lcbp3, MARIADB_USER=center |
| command | `--character-set-server=utf8mb4 --collation-server=utf8mb4_general_ci` |
| healthcheck | `healthcheck.sh --connect --innodb_initialized` |
| memory_limit | 8G |
### elasticsearch
| Attribute | Value |
|-----------|-------|
| image | `elasticsearch:8.11.1` |
| network | dms-internal only |
| ports | none (expose 9200 internal only) |
| volumes | es_data → /usr/share/elasticsearch/data |
| env | discovery.type=single-node, xpack.security.enabled=false, ES_JAVA_OPTS=-Xms2g -Xmx2g |
| healthcheck | `curl -s http://localhost:9200/_cluster/health` |
| memory_limit | 4G |
### qdrant
| Attribute | Value |
|-----------|-------|
| image | `qdrant/qdrant:v1.16.1` |
| network | dms-internal only |
| ports | none (expose 6333 internal only) |
| volumes | qdrant_data → /qdrant/storage |
| healthcheck | TCP check on port 6333 |
| memory_limit | 1G |
### clamav
| Attribute | Value |
|-----------|-------|
| image | `clamav/clamav:1.4.4` |
| network | dms-internal only |
| ports | none (expose 3310 internal only) |
| healthcheck | `clamdcheck.sh` |
| memory_limit | 2G |
### ollama-metrics
| Attribute | Value |
|-----------|-------|
| image | `ghcr.io/norskhelsenett/ollama-metrics:latest` |
| networks | dms-internal + dms-frontend |
| ports | 9924:9924 (published to LAN — Prometheus on ASUSTOR scrapes `http://<new-host-ip>:9924/metrics`) |
| env | OLLAMA_HOST=http://ollama:11434 |
| memory_limit | 256M |
## Service Communication Map
```
LAN (VLAN 10)
├── :3000 (Frontend) ──→ http://backend:3000/api (dms-frontend)
├── :3001 (Backend) ──→ http://backend:3000/api (dms-frontend)
└── :9924 (ollama-metrics) ──→ Prometheus scrape target

backend (outbound connections)
├──→ mariadb:3306 (dms-internal)
├──→ redis:6379 (dms-internal)
├──→ elasticsearch:9200 (dms-internal)
├──→ qdrant:6333 (dms-internal)
├──→ clamav:3310 (dms-internal)
├──→ ocr-sidecar:8765 (dms-internal)
└──→ ollama:11434 (dms-internal)
```
## Path Mapping
| Service | Container Path | Source |
|---------|---------------|--------|
| Backend | `/app/uploads/temp` | ASUSTOR CIFS `/data/uploads/temp` |
| Backend | `/app/uploads/permanent` | ASUSTOR CIFS `/data/uploads/permanent` |
| Sidecar | `/mnt/uploads/temp` (read-only) | ASUSTOR CIFS `/data/uploads/temp` |
| Sidecar | `/mnt/uploads/permanent` (read-only) | ASUSTOR CIFS `/data/uploads/permanent` |
**Note**: Backend uses `/app/uploads` (read-write), Sidecar uses `/mnt/uploads` (read-only). Both map to the same ASUSTOR CIFS share. Path remapping in `ocr.service.ts` (`remapPath()`) continues to work — strip `/app/uploads` and replace with `/mnt/uploads`.
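The prefix swap described in the note can be illustrated in shell. This is a sketch of the idea only: the real logic lives in `remapPath()` in `ocr.service.ts`, which is not shown here. The function name `remap_path` and the choice to leave paths outside `/app/uploads` unchanged are assumptions.

```bash
# Hypothetical shell equivalent of remapPath(): backend path -> sidecar path.
remap_path() {
  local backend_path="$1"
  local backend_base="/app/uploads"
  local sidecar_base="/mnt/uploads"
  case "$backend_path" in
    "$backend_base"/*)
      # Strip the backend prefix and prepend the sidecar mount point.
      printf '%s\n' "${sidecar_base}${backend_path#"$backend_base"}"
      ;;
    *)
      # Assumption: paths outside the uploads root are passed through untouched.
      printf '%s\n' "$backend_path"
      ;;
  esac
}
```

For example, `remap_path /app/uploads/temp/a.pdf` prints `/mnt/uploads/temp/a.pdf`, the path the sidecar sees for the same file on the ASUSTOR share.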
@@ -0,0 +1,124 @@
// File: specs/100-Infrastructures/141-server-consolidation/plan.md
// Change Log:
// - 2026-06-20: Initial implementation plan for Single-Host Server Consolidation
# Implementation Plan: Single-Host Server Consolidation
**Branch**: `141-server-consolidation` | **Date**: 2026-06-20 | **Spec**: [spec.md](./spec.md)
**Input**: Feature specification from `/specs/100-Infrastructures/141-server-consolidation/spec.md`
**Related ADRs**: [ADR-041](../../06-Decision-Records/ADR-041-server-consolidation.md), [ADR-040](../../06-Decision-Records/ADR-040-ocr-sidecar-refactor.md)
## Summary
Consolidate all LCBP3-DMS services from a 2-host architecture (QNAP NAS + Desk-5439) onto a single Docker host (Ryzen 5 5600 / 32GB / RTX 5060 Ti 16GB). ASUSTOR becomes primary NAS for file storage via CIFS. Docker internal bridge network isolates Ollama and OCR Sidecar from LAN, enabling removal of X-API-Key auth (ADR-040 D5). QNAP becomes backup server; Desk-5439 is retired.
## Technical Context
**Language/Version**: Docker Compose v2 (YAML), Bash scripts, PowerShell provisioning
**Primary Dependencies**: Docker Engine 24+, Docker Compose v2, NVIDIA Container Toolkit, CIFS Utils
**Storage**: MariaDB 11.8 (Docker volume), Elasticsearch 8.11 (Docker volume), Redis 7 (Docker volume), Qdrant v1.16 (Docker volume), ASUSTOR CIFS for file uploads
**Testing**: Smoke tests (manual + scripted), health check endpoints, data parity verification scripts
**Target Platform**: Linux (Ubuntu 22.04 LTS or Debian 12) on Ryzen 5 5600 / 32GB / RTX 5060 Ti 16GB
**Project Type**: Infrastructure (Docker Compose stack + provisioning scripts)
**Performance Goals**: Backend-to-Ollama latency <50ms (same-host Docker bridge instead of the previous ~2ms LAN hop), all containers healthy within 5 min
**Constraints**: 32GB RAM total (target <28GB usage), 16GB VRAM (target <15GB usage), CIFS mount reliability
**Scale/Scope**: 8 containers (Ollama, OCR Sidecar, Backend, Frontend, Redis, MariaDB, ES, Qdrant) + ClamAV + ollama-metrics
## Constitution Check
_GATE: Must pass before Phase 0 research. Re-check after Phase 1 design._
| Principle | Status | Notes |
|-----------|--------|-------|
| ADR-016 Security | ✅ Pass | Network isolation replaces API key; no ports published for internal services |
| ADR-019 UUID | ✅ Pass | No UUID changes — infrastructure only |
| ADR-009 Schema | ✅ Pass | No schema changes — data migration via dump/restore |
| ADR-023/023A AI Boundary | ✅ Pass | Ollama isolated on Docker internal network; no direct DB/storage access |
| ADR-040 D5 Network Auth | ✅ Pass | Docker bridge isolation enables X-API-Key removal |
| ADR-008 BullMQ | ✅ Pass | Redis co-located on same host; queue behavior unchanged |
| ADR-002 Document Numbering | ✅ Pass | Redis Redlock unchanged; co-located reduces lock latency |
| SPOF Risk | ⚠️ Acknowledged | Single host = SPOF; mitigated by QNAP backup + DR plan |
**Gate Result**: PASS — no violations. SPOF risk is acknowledged in ADR-041 with mitigation plan.
## Project Structure
### Documentation (this feature)
```text
specs/100-Infrastructures/141-server-consolidation/
├── spec.md # Feature specification
├── plan.md # This file
├── research.md # Phase 0 output — research findings
├── data-model.md # Phase 1 output — infrastructure data model
├── quickstart.md # Phase 1 output — deployment guide
├── contracts/ # Phase 1 output — docker-compose contracts
│ └── docker-compose.new-host.yml
├── checklists/
│ └── requirements.md # Spec quality checklist
└── tasks.md # Phase 2 output (/speckit.tasks command)
```
### Source Code (repository root)
```text
specs/04-Infrastructure-OPS/04-00-docker-compose/
├── New-Host/ # NEW — consolidated host
│ ├── docker-compose.new-host.yml # Unified compose for all 8+ services
│ ├── .env.template # Environment template for new host
│ ├── ocr-sidecar/ # Sidecar (copied from Desk-5439, adapted)
│ │ ├── Dockerfile
│ │ ├── app.py
│ │ └── requirements.txt
│ ├── scripts/
│ │ ├── provision-host.sh # OS prep + Docker + NVIDIA toolkit
│ │ ├── migrate-mariadb.sh # Dump from QNAP → restore to new host
│ │ ├── migrate-elasticsearch.sh # Snapshot from QNAP → restore to new host
│ │ ├── smoke-test.sh # Post-cutover verification
│ │ └── rollback.sh # Emergency rollback to QNAP + Desk-5439
│ └── README.md # Deployment guide for new host
├── QNAP/ # EXISTING — becomes backup
├── Desk-5439/ # EXISTING — retired after cutover
└── ASUSTOR/ # EXISTING — Gitea runner stays
```
**Structure Decision**: New `New-Host/` directory under existing `04-00-docker-compose/` follows the established per-host directory pattern (QNAP/, Desk-5439/, ASUSTOR/). The unified compose file replaces the split QNAP/app + QNAP/service + QNAP/mariadb + Desk-5439/ocr-sidecar pattern with a single stack.
## Complexity Tracking
> No constitution check violations — table not needed.
## Implementation Phases
### Phase 1: Provision New Host (T001-T002)
- Install Ubuntu 22.04 LTS / Debian 12
- Install Docker Engine + Docker Compose v2
- Install NVIDIA drivers + nvidia-container-toolkit
- Mount ASUSTOR CIFS share to `/mnt/uploads`
- Create directory structure for Docker volumes
### Phase 2: Create Unified Docker Compose (T003-T005)
- Write `docker-compose.new-host.yml` with all services
- Configure `dms-internal` bridge network (no LAN publish for Ollama/sidecar)
- Configure `dms-frontend` bridge network (Frontend + Backend published)
- Copy OCR sidecar code from Desk-5439, adapt for Docker-internal Ollama URL
- Configure per-container memory limits per ADR-041 D5
### Phase 3: Migrate Data (T006-T007)
- Dump MariaDB from QNAP → restore to new host container
- Snapshot Elasticsearch from QNAP → restore to new host container
- Verify row count + document count parity
- Verify CIFS file access from backend container
### Phase 4: Cutover (T008-T010)
- Update Gitea CI/CD deploy target to new host
- Deploy services on new host
- Run smoke tests (login, document CRUD, OCR, AI, search)
- Remove X-API-Key from sidecar + backend (ADR-040 D5)
- Update DNS/NPM to point to new host
### Phase 5: Decommission (T011-T012)
- Stop services on QNAP (retain data for backup)
- Retire Desk-5439 (power off or repurpose)
- Monitor RAM/VRAM for 24-48 hours
- Document rollback procedure
@@ -0,0 +1,154 @@
// File: specs/100-Infrastructures/141-server-consolidation/quickstart.md
// Change Log:
// - 2026-06-20: Initial quickstart guide for Single-Host Server Consolidation
# Quickstart: Single-Host Server Consolidation
**Branch**: `141-server-consolidation` | **Date**: 2026-06-20
## Prerequisites
- New host with Ubuntu 22.04 LTS or Debian 12 installed
- Ryzen 5 5600 / 32GB RAM / RTX 5060 Ti 16GB
- Network access to VLAN 10 (192.168.10.x)
- ASUSTOR NAS accessible at 192.168.10.9 with CIFS share `np-dms-as`
- SSH access to QNAP (192.168.10.8) for data migration
- Gitea CI/CD access for deploy target update
## Step 1: Provision Host
```bash
# Run on new host (as root or sudo user)
cd /opt/lcbp3
bash specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/scripts/provision-host.sh
```
This script:
1. Installs Docker Engine + Docker Compose v2
2. Installs NVIDIA drivers + nvidia-container-toolkit
3. Creates CIFS mount for ASUSTOR at `/mnt/uploads`
4. Creates Docker volume directories
5. Verifies GPU access with `nvidia-smi`
## Step 2: Prepare .env
```bash
cd /opt/lcbp3/specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host
cp .env.template .env
# Edit .env with real values:
# - ASUSTOR_USER, ASUSTOR_PASS (CIFS credentials)
# - DB_PASSWORD, DB_ROOT_PASSWORD (from QNAP .env)
# - REDIS_PASSWORD (from QNAP .env)
# - JWT_SECRET, JWT_REFRESH_SECRET (from QNAP .env)
# - AUTH_SECRET (from QNAP .env)
# - ELASTICSEARCH_PASSWORD (from QNAP .env)
```
## Step 3: Migrate Data
```bash
# Migrate MariaDB (from QNAP to new host)
bash scripts/migrate-mariadb.sh
# Migrate Elasticsearch (from QNAP to new host)
bash scripts/migrate-elasticsearch.sh
# Verify parity
bash scripts/verify-data-parity.sh
```
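The contents of `verify-data-parity.sh` are not shown in this commit. The following is a minimal sketch of the comparison such a script might perform, assuming each side has been exported to a `name<TAB>count` listing, for instance per-table `SELECT COUNT(*)` results for MariaDB and `GET _cat/indices?h=index,docs.count` output for Elasticsearch. The function name `compare_counts` and the file layout are hypothetical.

```bash
# Compare two "name<TAB>count" listings, one from QNAP and one from the new host.
# Line order does not matter; any missing or differing entry is a mismatch.
compare_counts() {
  local src_sorted dst_sorted
  src_sorted="$(sort "$1")"
  dst_sorted="$(sort "$2")"
  if [ "$src_sorted" = "$dst_sorted" ]; then
    echo "PARITY OK"
  else
    echo "PARITY MISMATCH" >&2
    return 1
  fi
}
```

A call such as `compare_counts qnap-counts.tsv newhost-counts.tsv` returns non-zero on any difference, so it can gate the cutover step in a wrapper script.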
## Step 4: Deploy Services
```bash
# Pull latest images from Gitea registry
docker compose --env-file .env -f docker-compose.new-host.yml pull
# Start all services
docker compose --env-file .env -f docker-compose.new-host.yml up -d
# Check health
docker compose -f docker-compose.new-host.yml ps
docker compose -f docker-compose.new-host.yml logs --tail=50
```
## Step 5: Smoke Test
```bash
# Run smoke tests
bash scripts/smoke-test.sh
```
Smoke tests verify:
- Backend health check (`GET http://localhost:3001/health`)
- Frontend accessible (`GET http://localhost:3000/`)
- Login flow (POST /api/auth/login)
- Document list (GET /api/correspondences)
- OCR endpoint (POST /api/ai/sandbox/ocr)
- AI inference (POST /api/ai/sandbox/extract)
- Full-text search (GET /api/search)
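`smoke-test.sh` itself is not included here. As a rough sketch of how the unauthenticated checks in the list above could be scripted, assuming the backend is published on port 3001 as in the contract; the helper names `http_status` and `check_endpoint` and the expected status codes are hypothetical.

```bash
# Base URLs as published on the new host (LAN ports from the compose contract).
BACKEND_URL="${BACKEND_URL:-http://localhost:3001}"
FRONTEND_URL="${FRONTEND_URL:-http://localhost:3000}"

# Print the HTTP status code of a GET request; curl prints 000 if it cannot connect.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Compare one endpoint's status code with the expected value and report PASS/FAIL.
check_endpoint() {
  local name="$1" url="$2" expected="$3" actual
  actual="$(http_status "$url")"
  if [ "$actual" = "$expected" ]; then
    echo "PASS $name"
  else
    echo "FAIL $name (expected $expected, got $actual)"
    return 1
  fi
}

# Example calls (not executed here):
#   check_endpoint backend-health "$BACKEND_URL/health" 200
#   check_endpoint frontend-root  "$FRONTEND_URL/"      200
```

The login, OCR, AI, and search checks need a request body or an auth token, so they would require additional helpers beyond this sketch.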
## Step 6: Update CI/CD
Update Gitea secrets:
- `HOST` → new host IP (e.g., `192.168.10.50`)
- `COMPOSE_FILE` → `specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/docker-compose.new-host.yml`
## Step 7: Cutover DNS
Update NPM (Nginx Proxy Manager) on QNAP:
- `lcbp3.np-dms.work` → new host IP
- `backend.np-dms.work` → new host IP
## Step 8: Remove X-API-Key (ADR-040 D5)
After verifying Docker-internal network isolation:
1. Remove `OCR_SIDECAR_API_KEY` from sidecar environment
2. Remove API key validation from `app.py`
3. Remove `X-API-Key` header from backend `ocr.service.ts`
4. Rebuild and redeploy sidecar + backend
## Step 9: Monitor (24-48 hours)
```bash
# Monitor RAM usage
docker stats --no-stream
# Monitor VRAM usage
nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 60
# Monitor container health
watch -n 30 'docker compose -f docker-compose.new-host.yml ps'
```
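To turn the VRAM readings into a pass/fail signal against the plan's target (<15GB of 16GB), a small filter could be placed after `nvidia-smi`. This is a sketch under the assumption that 15GB is treated as 15 × 1024 MiB; the function name `check_vram` is hypothetical.

```bash
# VRAM budget from the plan: stay under 15GB of the 16GB card (assumed 15 * 1024 MiB).
VRAM_LIMIT_MIB=$((15 * 1024))

# Read "used, total" MiB pairs from stdin (nvidia-smi csv,noheader,nounits format)
# and flag every sample whose used memory exceeds the budget.
check_vram() {
  local used total status=0
  while IFS=', ' read -r used total; do
    [ -n "$used" ] || continue
    if [ "$used" -gt "$VRAM_LIMIT_MIB" ]; then
      echo "OVER $used/$total MiB"
      status=1
    else
      echo "OK $used/$total MiB"
    fi
  done
  return "$status"
}

# Example call (not executed here):
#   nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits | check_vram
```

A non-zero exit code from the pipeline indicates that at least one sample exceeded the budget, which could be wired into a cron job or alert during the 24-48 hour observation window.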
## Step 10: Decommission Old Hosts
After 24-48 hours of stable operation:
```bash
# Stop QNAP services (retain data for backup)
ssh admin@192.168.10.8 'cd /share/np-dms/app && docker compose down'
ssh admin@192.168.10.8 'cd /share/np-dms/services && docker compose down'
# Power off Desk-5439
ssh user@192.168.10.100 'sudo shutdown -h now'
```
## Rollback (Emergency)
```bash
# Stop new host services
docker compose -f docker-compose.new-host.yml down
# Restore QNAP services
ssh admin@192.168.10.8 'cd /share/np-dms/app && docker compose up -d'
ssh admin@192.168.10.8 'cd /share/np-dms/services && docker compose up -d'
# Restore Desk-5439 services
ssh user@192.168.10.100 'cd /opt/ocr-sidecar && docker compose up -d'
# Revert DNS
# Update NPM to point back to QNAP (192.168.10.8)
# Revert CI/CD
# Update Gitea secrets HOST back to 192.168.10.8
```
@@ -0,0 +1,139 @@
// File: specs/100-Infrastructures/141-server-consolidation/research.md
// Change Log:
// - 2026-06-20: Initial research for Single-Host Server Consolidation
# Research: Single-Host Server Consolidation
**Branch**: `141-server-consolidation` | **Date**: 2026-06-20
## R1: Docker Network Isolation Strategy
**Decision**: Use two Docker bridge networks — `dms-internal` (all services) and `dms-frontend` (Frontend + Backend only, for LAN publish).
**Rationale**: Docker bridge networks provide L2 isolation. Services on `dms-internal` without a `ports` mapping are unreachable from the LAN. Only Frontend (host port 3000) and Backend (host port 3001, container port 3000) need LAN access, plus the ollama-metrics scrape port (9924). This replaces reliance on VLAN/firewall ACLs with Docker-native isolation.
**Alternatives Considered**:
- Single bridge network + iptables rules — more complex, error-prone
- Docker Swarm overlay network — overkill for single host
- Host network mode — no isolation, security risk
## R2: CIFS Mount Strategy for ASUSTOR
**Decision**: Use a Docker named volume (`local` driver, `type=cifs`) to mount the ASUSTOR share `//192.168.10.9/np-dms-as/data/uploads` as the `asustor_uploads` volume, mounted at `/mnt/uploads` in the sidecar and `/app/uploads` in the backend.
**Rationale**: Docker's `local` volume driver handles the CIFS mount lifecycle with container start/stop. Credentials live in `.env` (gitignored). Both backend and sidecar see the same files because both mount the same CIFS share.
**Alternatives Considered**:
- Host-level `mount -t cifs` then bind mount — requires host OS config, not portable
- SSHFS — slower than CIFS for file operations
- Sync files to local SSD — adds complexity, storage duplication
**Key Consideration**: The previous Desk-5439 setup had issues with Docker Desktop WSL2 + CIFS. On a Linux host, the CIFS volume mount works natively without the WSL2 layer.
## R3: MariaDB Migration Strategy
**Decision**: Use `mariadb-dump` (logical dump) on the QNAP MariaDB 11.8 instance, then restore the dump file into the new host's MariaDB 11.8 container.
**Rationale**: Same MariaDB version (11.8) on both hosts → logical dump is safest. Database is small enough (<10GB estimated) that dump/restore completes within maintenance window.
**Alternatives Considered**:
- `mariabackup` (physical backup) — faster but requires same filesystem layout
- Replication (binlog) — overkill for one-time migration
- Copy raw data files — risky, requires same version + config
**Migration Command**:
```bash
# From QNAP (source) — dump all databases
mariadb-dump --single-transaction --routines --triggers \
-h 127.0.0.1 -u root -p"$DB_ROOT_PASSWORD" \
--all-databases > qnap-full-dump.sql
# On new host — restore
docker exec -i lcbp3-mariadb mariadb -u root -p"$DB_ROOT_PASSWORD" < qnap-full-dump.sql
```
## R4: Elasticsearch Migration Strategy
**Decision**: Use ES snapshot/restore API — create snapshot on QNAP ES, transfer to new host, restore.
**Rationale**: ES snapshot API is the official migration path. Handles index mappings, settings, and data. Works across same ES version (8.11.x).
**Alternatives Considered**:
- Copy raw data directory — risky, requires identical ES config
- Re-index from MariaDB — slow, loses search index tuning
- Logstash pipeline — overkill for one-time migration
**Migration Steps**:
1. Register shared filesystem repo on QNAP ES
2. Create snapshot of all indices
3. Copy snapshot files to the new host, into a directory that the new ES container lists in `path.repo`
4. Register repo on new host ES
5. Restore snapshot
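The steps above map onto the Elasticsearch snapshot REST API roughly as follows. This is a sketch, not the contents of `migrate-elasticsearch.sh`: the repository name, snapshot name, and function names are hypothetical. Because port 9200 is not published to the LAN, `ES_URL` is assumed to be reachable from wherever the functions run (for example, inside the container or from another container on `dms-internal`).

```bash
ES_URL="${ES_URL:-http://localhost:9200}"
REPO="migration_repo"      # hypothetical repository name
SNAPSHOT="qnap_cutover"    # hypothetical snapshot name

# JSON body for a shared filesystem repository.
# The location must be a directory listed in the node's path.repo setting.
fs_repo_body() {
  printf '{"type":"fs","settings":{"location":"%s"}}' "$1"
}

# Steps 1 and 4: register the repository on the source or target cluster.
register_repo() {
  curl -s -X PUT "$ES_URL/_snapshot/$REPO" \
    -H 'Content-Type: application/json' \
    -d "$(fs_repo_body "$1")"
}

# Step 2: snapshot all indices and wait until the snapshot finishes.
create_snapshot() {
  curl -s -X PUT "$ES_URL/_snapshot/$REPO/$SNAPSHOT?wait_for_completion=true"
}

# Step 5: restore the snapshot on the new host.
restore_snapshot() {
  curl -s -X POST "$ES_URL/_snapshot/$REPO/$SNAPSHOT/_restore?wait_for_completion=true"
}
```

Step 3 (copying the snapshot files between hosts) is left out on purpose, since the transfer method is not specified in this document.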
## R5: GPU VRAM Management on Single Host
**Decision**: Rely on ADR-040 D3 (Adaptive OCR Residency via `calculate_ocr_residency()`) and ADR-040 D4 (CPU Fallback Retrieval), following the LLM-First GPU Ownership policy from CONTEXT.md.
**Rationale**: RTX 5060 Ti 16GB must serve:
- np-dms-ai (Typhoon-2.5 ~7-8B): ~6-8GB VRAM
- np-dms-ocr (Typhoon OCR): ~5GB VRAM
- nomic-embed-text: ~0.5GB VRAM
- CUDA overhead: ~1.5GB
- Total: ~13-15GB → tight but feasible with adaptive residency
**Key Policy**: When LLM (np-dms-ai) needs to load, OCR model is unloaded first (`keep_alive=0` for OCR). BGE-M3 + Reranker use CPU fallback when GPU is occupied.
**Alternatives Considered**:
- Force GPU-resident for all models: OOM risk (~15.5GB of 16GB once overhead is included, leaving almost no headroom)
- CPU-only for all AI — too slow for production
- Second GPU — not available on new host
## R6: RAM Budget Allocation
**Decision**: Per-container memory limits in Docker Compose:
| Service | Memory Limit | Notes |
|---------|-------------|-------|
| MariaDB | 8G | Largest consumer, tune innodb_buffer_pool |
| Elasticsearch | 4G | ES_JAVA_OPTS=-Xms2g -Xmx2g |
| Backend (NestJS) | 2G | Node.js + BullMQ workers |
| Frontend (Next.js) | 1G | Standalone mode |
| Redis | 1G | In-memory + AOF |
| Qdrant | 1G | Vector DB |
| OCR Sidecar | 1G | Python + PyMuPDF |
| Ollama | 2G | Model loading + inference |
| ClamAV | 2G | Virus definitions |
| ollama-metrics | 256M | Lightweight proxy |
| **Total** | **~22.3G** | Leaves ~9.7G for OS + swap |
**Rationale**: 32GB total - 22.3GB containers = ~9.7GB for OS kernel + page cache + swap. Comfortable margin.
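The totals in the table can be re-derived with a few lines of shell arithmetic. This sketch converts each limit to MiB (assuming 1G = 1024 MiB, as Docker interprets memory limits) so that integer arithmetic is sufficient.

```bash
# Memory limits from the table above, in MiB, in table order:
# MariaDB, Elasticsearch, Backend, Frontend, Redis, Qdrant, OCR Sidecar, Ollama, ClamAV, ollama-metrics
HOST_RAM_MIB=$((32 * 1024))

total=0
for limit in 8192 4096 2048 1024 1024 1024 1024 2048 2048 256; do
  total=$((total + limit))
done
headroom=$((HOST_RAM_MIB - total))

echo "containers: ${total} MiB, headroom: ${headroom} MiB"
```

The sum is 22784 MiB (22.25G, rounded to ~22.3G in the table), which leaves 9984 MiB (~9.7G) for the host.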
**Alternatives Considered**:
- No limits — risk of OOM killer affecting critical services
- Tighter limits — may cause ES/MariaDB instability
## R7: CI/CD Pipeline Update
**Decision**: Update Gitea Actions `ci-deploy.yml` to SSH-deploy to new host IP instead of QNAP IP. ASUSTOR Gitea runner stays unchanged.
**Rationale**: Gitea runner on ASUSTOR (192.168.10.9) can reach new host via VLAN 10. Only the deploy target IP changes. `deploy.sh` path to compose file updates to `New-Host/docker-compose.new-host.yml`.
**Alternatives Considered**:
- Move Gitea runner to new host — unnecessary, runner works remotely
- Manual deployment — not sustainable for ongoing releases
## R8: Rollback Strategy
**Decision**: Multi-step rollback plan documented in `rollback.sh`:
1. Stop services on new host (`docker compose down`)
2. Restore services on QNAP (start existing containers with old data)
3. Restore services on Desk-5439 (start Ollama + sidecar)
4. Revert DNS/NPM to point to QNAP
5. Revert Gitea CI/CD deploy target to QNAP
6. Re-enable X-API-Key in sidecar + backend
**Rationale**: QNAP retains all data (MariaDB, ES, Redis, files) until verified stable. Rollback is fast (<2 hours) because old infrastructure is intact.
**Alternatives Considered**:
- No rollback (accept SPOF) — too risky for production DMS
- Hot failover with replication — overkill for current scale
@@ -0,0 +1,160 @@
// File: specs/100-Infrastructures/141-server-consolidation/spec.md
// Change Log:
// - 2026-06-20: Initial specification for Single-Host Server Consolidation (ADR-041)
# Feature Specification: Single-Host Server Consolidation
**Feature Branch**: `141-server-consolidation`
**Created**: 2026-06-20
**Status**: Draft
**Category**: 100-Infrastructures
**Input**: ADR-041 — Consolidate all LCBP3-DMS services onto a single Docker host with ASUSTOR as primary NAS.
**Related ADRs**: [ADR-041](../../06-Decision-Records/ADR-041-server-consolidation.md), [ADR-040](../../06-Decision-Records/ADR-040-ocr-sidecar-refactor.md), [ADR-016](../../06-Decision-Records/ADR-016-security-authentication.md), [ADR-023A](../../06-Decision-Records/ADR-023A-unified-ai-architecture.md), [ADR-034](../../06-Decision-Records/ADR-034-AI-model-change.md)
## User Scenarios & Testing _(mandatory)_
### User Story 1 - Provision and Deploy on New Host (Priority: P1)
System administrator provisions the new single host (Ryzen 5 5600 / 32GB / RTX 5060 Ti 16GB), installs Docker, mounts CIFS share from ASUSTOR, and deploys all services (Ollama, OCR Sidecar, Backend, Frontend, Redis, MariaDB, Elasticsearch) using a single Docker Compose stack with internal bridge network isolation.
**Why this priority**: Without a running host, no other work can proceed. This is the foundation for all subsequent stories.
**Independent Test**: Can be fully tested by running `docker compose up` on the new host and verifying all containers are healthy via `docker ps` and health check endpoints.
**Acceptance Scenarios**:
1. **Given** a fresh OS installation on the new host, **When** the administrator runs the provisioning script, **Then** Docker Engine and Docker Compose are installed and verified with `docker --version`
2. **Given** Docker is installed, **When** the administrator mounts the ASUSTOR CIFS share, **Then** `/mnt/uploads/temp` and `/mnt/uploads/permanent` are accessible and writable by containers
3. **Given** CIFS mounts are ready, **When** the administrator runs `docker compose up -d`, **Then** all 7 service containers start and report healthy within 5 minutes
4. **Given** all containers are running, **When** the administrator checks network isolation, **Then** Ollama and OCR Sidecar ports are NOT accessible from LAN (only the Frontend on host port 3000 and the Backend on host port 3001, mapped to container port 3000, are published)
---
### User Story 2 - Migrate Data from QNAP to New Host (Priority: P2)
Database administrator migrates MariaDB data and Elasticsearch indices from QNAP to the new host, ensuring zero data loss and minimal downtime.
**Why this priority**: Data migration is the critical path for cutover. Without migrated data, the new host cannot serve production traffic.
**Independent Test**: Can be tested by comparing row counts and index document counts between source (QNAP) and destination (new host) after migration.
**Acceptance Scenarios**:
1. **Given** the new host is running with empty MariaDB, **When** the administrator performs a database dump-and-restore from QNAP, **Then** all tables and row counts match the source exactly
2. **Given** the new host is running with empty Elasticsearch, **When** the administrator migrates indices from QNAP, **Then** all index document counts match the source exactly
3. **Given** data migration is complete, **When** the administrator runs a data integrity check script, **Then** all critical tables pass checksum verification with zero discrepancies
4. **Given** file storage is on ASUSTOR CIFS mount, **When** the administrator verifies file access from the backend container, **Then** all existing uploaded files are accessible at the expected paths
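The parity checks in scenarios 1 and 2 reduce to comparing two name-to-count mappings. The sketch below assumes the counts have already been collected from both sides (for example with `SELECT COUNT(*)` per table and a per-index document count). The function name `count_discrepancies` and the table names in the usage note are hypothetical.

```python
def count_discrepancies(source: dict, dest: dict) -> dict:
    """Map each table or index whose counts differ to (source_count, dest_count).

    A name that exists on only one side is reported with None for the other.
    """
    diffs = {}
    for name in sorted(set(source) | set(dest)):
        s, d = source.get(name), dest.get(name)
        if s != d:
            diffs[name] = (s, d)
    return diffs
```

For instance, `count_discrepancies({"users": 10, "documents": 500}, {"users": 10, "documents": 498})` reports only `documents`. An empty result means counts match, which is necessary for the checksum verification in scenario 3 but does not replace it.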
---
### User Story 3 - Cutover and Smoke Test (Priority: P3)
Operations team performs the cutover from the old 2-host architecture (QNAP + Desk-5439) to the new single host, updates DNS/network routing, and runs smoke tests to verify all system functions work end-to-end.
**Why this priority**: Cutover is the final step that makes the new host production-active. It depends on P1 and P2 being complete.
**Independent Test**: Can be tested by accessing the application via the new host's IP/hostname and performing core DMS operations (login, document upload, search, AI inference).
**Acceptance Scenarios**:
1. **Given** data migration is verified, **When** the administrator updates DNS to point to the new host, **Then** users accessing the application URL reach the new host within the DNS TTL period
2. **Given** DNS is updated, **When** a user logs in and creates a new Correspondence, **Then** the document is saved successfully and visible in the list
3. **Given** the system is live on the new host, **When** a user uploads a PDF and triggers OCR, **Then** OCR text extraction completes successfully via the internal Docker network (sidecar → Ollama)
4. **Given** the system is live, **When** a user performs a full-text search, **Then** Elasticsearch returns results with the same accuracy as before migration
5. **Given** the system is live, **When** a user triggers AI metadata extraction, **Then** the AI inference completes successfully via the internal Docker network (backend → Ollama)
---
### User Story 4 - Remove X-API-Key and Verify Network-Only Auth (Priority: P4)
Security administrator removes the `X-API-Key` header authentication from the OCR Sidecar and Backend, relying solely on Docker-internal network isolation as per ADR-040 D5.
**Why this priority**: This is a key security improvement enabled by the consolidation. It simplifies the architecture but must be validated carefully.
**Independent Test**: Can be tested by attempting to access sidecar endpoints from outside the Docker network (should fail) and from within the Docker network (should succeed without API key).
**Acceptance Scenarios**:
1. **Given** all services are on the Docker internal bridge, **When** the backend calls the sidecar without `X-API-Key`, **Then** the sidecar processes the request successfully
2. **Given** the sidecar is not publishing ports to LAN, **When** an external client attempts to reach the sidecar directly, **Then** the connection is refused
3. **Given** the `X-API-Key` code is removed, **When** the administrator reviews the sidecar and backend configuration, **Then** no hardcoded API keys remain in the codebase
---
### User Story 5 - Decommission Old Hosts (Priority: P5)
Operations team stops services on QNAP (which becomes backup server) and retires Desk-5439, completing the consolidation.
**Why this priority**: Cleanup is the final step after the new host is verified stable. It frees up old hardware and reduces management complexity.
**Independent Test**: Can be tested by verifying that QNAP services are stopped (except backup-related) and Desk-5439 is powered off or repurposed.
**Acceptance Scenarios**:
1. **Given** the new host has been stable for 24-48 hours, **When** the administrator stops backend/frontend/Redis/DB/ES services on QNAP, **Then** QNAP remains available as a backup server with data intact
2. **Given** QNAP services are stopped, **When** the administrator powers off Desk-5439, **Then** no LCBP3-DMS services are affected on the new host
3. **Given** old hosts are decommissioned, **When** the administrator verifies monitoring dashboards, **Then** only the new host is tracked as the active production host
---
### Edge Cases
- **GPU OOM during concurrent AI + OCR load**: What happens when np-dms-ai and np-dms-ocr are loaded simultaneously and combined VRAM demand exceeds 16GB? ADR-040 D3 (Adaptive OCR Residency) must unload the OCR model to make room for the LLM.
- **RAM exhaustion under heavy load**: What happens when MariaDB + Elasticsearch + CPU-fallback tensors consume more than 32GB? System must have swap space configured and memory limits per container.
- **CIFS mount failure**: What happens when ASUSTOR NAS is unreachable? File upload/download will fail; system must degrade gracefully with clear error messages.
- **Single host hardware failure**: What happens when the new host crashes? SPOF mitigation requires backup data on QNAP and a disaster recovery plan.
- **Network misconfiguration**: What happens if Docker bridge network is accidentally exposed? Sidecar and Ollama would be accessible from LAN, breaking the security model.
- **Database migration partial failure**: What happens if MariaDB migration fails midway? Rollback plan must restore QNAP as the active database host.
- **Elasticsearch index corruption during migration**: What happens if ES indices are corrupted during transfer? Re-indexing from MariaDB data must be available as a fallback.
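The GPU OOM case above comes down to a fit test on VRAM. The sketch below illustrates only that arithmetic. It is not the sidecar's `calculate_ocr_residency()` and makes no claim about how ADR-040 D3 is implemented. The function name `ocr_must_unload`, the model sizes in the usage note, and the 1 GB reserve (chosen to mirror the 15GB-of-16GB ceiling in SC-008) are all assumptions.

```python
def ocr_must_unload(total_vram_gb: float, llm_gb: float, ocr_gb: float,
                    reserve_gb: float = 1.0) -> bool:
    """Return True when the LLM, the OCR model, and the reserve cannot share VRAM.

    All sizes are in GB. The reserve is hypothetical headroom for buffers
    and fragmentation, not a value taken from ADR-040.
    """
    return llm_gb + ocr_gb + reserve_gb > total_vram_gb
```

With a 16 GB card, a hypothetical 12 GB LLM and 5 GB OCR model do not fit together, so `ocr_must_unload(16, 12, 5)` is `True`, while an 8 GB LLM next to the same OCR model fits.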
## Requirements _(mandatory)_
### Functional Requirements
- **FR-001**: System MUST co-locate all 7 services (Ollama, OCR Sidecar, Backend, Frontend, Redis, MariaDB, Elasticsearch) on a single Docker host with a unified `docker-compose.yml`
- **FR-002**: System MUST use ASUSTOR (192.168.10.9) as the primary NAS for file storage via CIFS mount at `/mnt/uploads`
- **FR-003**: System MUST isolate Ollama and OCR Sidecar on a Docker internal bridge network (`dms-internal`) with no ports published to LAN
- **FR-004**: System MUST publish only Frontend (host port 3000) and Backend (host port 3001, mapped to container port 3000) to the LAN
- **FR-005**: System MUST enable backend-to-sidecar and backend-to-Ollama communication via Docker service names (`http://ocr-sidecar:8765`, `http://ollama:11434`)
- **FR-006**: System MUST migrate MariaDB data from QNAP to the new host with zero data loss
- **FR-007**: System MUST migrate Elasticsearch indices from QNAP to the new host with zero data loss
- **FR-008**: System MUST remove `X-API-Key` authentication from sidecar and backend after confirming Docker-internal network isolation (ADR-040 D5)
- **FR-009**: System MUST enforce GPU VRAM management via Adaptive OCR Residency (ADR-040 D3) and CPU Fallback Retrieval (ADR-040 D4)
- **FR-010**: System MUST configure per-container memory limits to prevent any single service from exhausting 32GB RAM
- **FR-011**: System MUST retain QNAP as a backup server with database and file storage data intact after cutover
- **FR-012**: System MUST retire Desk-5439 after cutover is verified stable for 24-48 hours
- **FR-013**: System MUST provide a rollback plan to restore services on QNAP and Desk-5439 if the new host fails
- **FR-014**: System MUST verify all core DMS functions (login, document CRUD, OCR, AI inference, search) work end-to-end on the new host before decommissioning old hosts
- **FR-015**: System MUST monitor RAM and VRAM usage for 24-48 hours post-cutover to detect resource pressure
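FR-003 and FR-004 can be checked statically before anything is deployed. The sketch below assumes the compose file has already been parsed into a plain dictionary (for example with a YAML loader, which is not shown). It flags any service that declares `ports` and is not on an allow-list. The function names and the service keys `frontend` and `backend` are assumptions; `ollama` and `ocr-sidecar` follow the service names in FR-005.

```python
def services_publishing_ports(compose: dict) -> dict:
    """Return {service_name: ports} for every service with a non-empty `ports` key."""
    services = compose.get("services", {})
    return {name: svc["ports"] for name, svc in services.items() if svc.get("ports")}


def isolation_violations(compose: dict, allowed: set) -> list:
    """Return the sorted names of services that publish ports but are not allowed to."""
    return sorted(set(services_publishing_ports(compose)) - set(allowed))
```

Services that use only `expose` are not flagged, since `expose` does not publish to the host. The allow-list is a parameter so that any additional published service can be added explicitly rather than silently accepted.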
### Key Entities _(include if feature involves data)_
- **Docker Compose Stack**: Single `docker-compose.yml` defining all 7 services, 2 networks (`dms-internal`, `dms-frontend`), and volumes (CIFS, named volumes for data)
- **CIFS Volume Mount**: ASUSTOR network share mounted as Docker volume for file storage (`/mnt/uploads/temp`, `/mnt/uploads/permanent`)
- **Docker Internal Network**: Bridge network (`dms-internal`) isolating Ollama, Sidecar, Backend, Redis, MariaDB, and Elasticsearch from LAN access
- **GPU Resource Allocation**: NVIDIA GPU passthrough to Ollama container with VRAM management via adaptive residency policies
## Success Criteria _(mandatory)_
### Measurable Outcomes
- **SC-001**: All 7 service containers start and report healthy within 5 minutes of `docker compose up -d` on the new host
- **SC-002**: Database migration completes with 100% row count parity between QNAP and new host for all critical tables
- **SC-003**: Elasticsearch migration completes with 100% document count parity between QNAP and new host for all indices
- **SC-004**: Core DMS operations (login, document upload, search, OCR, AI inference) complete successfully on the new host with zero functional regressions
- **SC-005**: Ollama and OCR Sidecar are unreachable from LAN (port scan returns closed/refused for ports 11434 and 8765)
- **SC-006**: Backend-to-Ollama latency is reduced by at least 50% compared to cross-host LAN communication (measured via AI inference response time)
- **SC-007**: RAM usage remains below 28GB (87.5% of 32GB) under normal operational load for 24 hours post-cutover
- **SC-008**: VRAM usage remains below 15GB (93.75% of 16GB) during concurrent AI inference and OCR workloads
- **SC-009**: Rollback plan can be executed within 2 hours to restore services on QNAP and Desk-5439 if needed
- **SC-010**: QNAP backup server retains a valid database snapshot within 24 hours of cutover
### Assumptions
- The new host hardware (Ryzen 5 5600 / 32GB / RTX 5060 Ti 16GB) is physically available and OS-installed before provisioning begins
- ASUSTOR NAS (192.168.10.9) has sufficient storage capacity for all file uploads (temp + permanent)
- Network connectivity between the new host and ASUSTOR is via VLAN 10 with CIFS/SMB 3.0 support
- NVIDIA drivers and Docker GPU runtime (nvidia-container-toolkit) are compatible with the RTX 5060 Ti
- QNAP data (MariaDB, Elasticsearch) is in a consistent state suitable for dump-and-restore migration
- ADR-040 (OCR Sidecar Refactor) is implemented concurrently or prior to cutover for network-only auth and adaptive residency
- Gitea CI/CD pipeline can be updated to target the new host for deployment
@@ -0,0 +1,221 @@
// File: specs/100-Infrastructures/141-server-consolidation/tasks.md
// Change Log:
// - 2026-06-20: Initial task list for Single-Host Server Consolidation
// - 2026-06-20: Fix C1-C5 from analysis: backend env var update, port conflict, GPU residency, ollama-metrics port, n8n endpoints
# Tasks: Single-Host Server Consolidation
**Input**: Design documents from `/specs/100-Infrastructures/141-server-consolidation/`
**Prerequisites**: plan.md, spec.md, research.md, data-model.md, contracts/
**Related ADRs**: ADR-041, ADR-040, ADR-016, ADR-023A, ADR-034
**Organization**: Tasks are grouped by user story to enable independent implementation and testing of each story.
## Format: `[ID] [P?] [Story] Description`
- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)
- Include exact file paths in descriptions
## Phase 1: Setup (Shared Infrastructure)
**Purpose**: Create directory structure and initial files for the new host deployment
- [ ] T001 Create `specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/` directory structure with subdirectories: `ocr-sidecar/`, `scripts/`
- [ ] T002 [P] Create `.env.template` at `specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/.env.template` with all required env vars from contracts
- [ ] T003 [P] Create `README.md` at `specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/README.md` with deployment overview
---
## Phase 2: Foundational (Blocking Prerequisites)
**Purpose**: Provision the new host OS and create the unified Docker Compose stack — MUST be complete before any user story can proceed
**⚠️ CRITICAL**: No user story work can begin until this phase is complete
- [ ] T004 Create `specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/scripts/provision-host.sh` — installs Docker Engine, Docker Compose v2, NVIDIA drivers, nvidia-container-toolkit, CIFS utils, creates directory structure
- [ ] T005 Create `specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/docker-compose.new-host.yml` — unified compose with all 10 services, 2 networks (dms-internal, dms-frontend), CIFS volume, named volumes, memory limits per data-model.md. Backend publishes `3001:3000` to LAN (NPM routes `backend.np-dms.work` → :3001); Frontend publishes `3000:3000`; ollama-metrics publishes `9924:9924` to LAN for Prometheus scraping from ASUSTOR
- [ ] T006 [P] Copy OCR sidecar code from `specs/04-Infrastructure-OPS/04-00-docker-compose/Desk-5439/ocr-sidecar/` to `specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/ocr-sidecar/` — adapt `OLLAMA_API_URL` to `http://ollama:11434` (Docker DNS), remove `ports` mapping, use `expose` only
- [ ] T007 [P] Update `specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/ocr-sidecar/Dockerfile` — verify GPU access via nvidia-container-toolkit, ensure poppler-utils installed
- [ ] T008 [P] Update `specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/ocr-sidecar/requirements.txt` — verify typhoon-ocr, PyMuPDF, httpx, fastapi versions match Desk-5439
- [ ] T008b Update backend environment variables for renamed service names: `REDIS_HOST=redis` (was `cache`), `ELASTICSEARCH_HOST=elasticsearch` (was `search`) in `specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/.env.template` and `specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/docker-compose.new-host.yml` backend environment section — these service names changed from QNAP compose where Redis was `cache` and ES was `search`
**Checkpoint**: New host directory structure and unified compose file ready — user story implementation can now begin
---
## Phase 3: User Story 1 - Provision and Deploy on New Host (Priority: P1) 🎯 MVP
**Goal**: Administrator provisions the new host, mounts ASUSTOR CIFS, and deploys all services with Docker internal network isolation
**Independent Test**: Run `docker compose up -d` on the new host and verify all containers are healthy via `docker ps` and health check endpoints
### Implementation for User Story 1
- [ ] T009 [US1] Run `provision-host.sh` on new host — verify Docker, NVIDIA, CIFS mount at `/mnt/uploads`
- [ ] T010 [US1] Pull Ollama models on new host: `ollama pull np-dms-ai:latest`, `ollama pull np-dms-ocr:latest`, `ollama pull nomic-embed-text:latest` — verify with `ollama list`
- [ ] T011 [US1] Copy `.env.template` to `.env`, fill in all secrets from QNAP `.env` (DB passwords, JWT secrets, Redis password, ASUSTOR CIFS credentials)
- [ ] T012 [US1] Run `docker compose --env-file .env -f docker-compose.new-host.yml up -d` and verify all 10 containers start
- [ ] T013 [US1] Verify network isolation: `nmap -p 11434 <new-host-ip>` from another VLAN 10 machine should show closed/refused; `nmap -p 8765` should show closed/refused; `nmap -p 3000` (frontend) and `nmap -p 3001` (backend) should show open; `nmap -p 9924` (ollama-metrics) should show open for Prometheus
- [ ] T014 [US1] Verify health checks: `curl http://localhost:3001/health` (backend on published port 3001), `curl http://localhost:3000/` (frontend), `curl http://ocr-sidecar:8765/health` (from inside backend container via Docker DNS)
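The container health part of T012 and T014 can be read from `docker ps --format '{{json .}}'`, which prints one JSON object per container. The sketch below parses that output and lists containers whose `Status` does not contain `(healthy)`. The function name `not_healthy` is hypothetical, and a container with no `HEALTHCHECK` defined is reported too, because Docker adds no health suffix for it.

```python
import json


def not_healthy(docker_ps_json_lines: str) -> dict:
    """Return {container_name: status} for containers not reporting (healthy)."""
    result = {}
    for line in docker_ps_json_lines.splitlines():
        if not line.strip():
            continue  # skip blank lines in the captured output
        row = json.loads(line)
        status = row.get("Status", "")
        if "(healthy)" not in status:
            result[row.get("Names", "?")] = status
    return result
```

An empty result from this function is one possible definition of "all containers healthy" for the checkpoint. Statuses such as `(health: starting)` and `(unhealthy)` are both reported.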
**Checkpoint**: All services running on new host with correct network isolation — MVP achieved
---
## Phase 4: User Story 2 - Migrate Data from QNAP to New Host (Priority: P2)
**Goal**: Migrate MariaDB and Elasticsearch data from QNAP to the new host with zero data loss
**Independent Test**: Compare row counts and index document counts between QNAP (source) and new host (destination) after migration
### Implementation for User Story 2
- [ ] T015 [P] [US2] Create `specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/scripts/migrate-mariadb.sh` — dump from QNAP MariaDB 11.8 via `mariadb-dump --single-transaction --routines --triggers`, pipe to new host container
- [ ] T016 [P] [US2] Create `specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/scripts/migrate-elasticsearch.sh` — create snapshot on QNAP ES, transfer files, register repo on new host, restore
- [ ] T017 [US2] Run `migrate-mariadb.sh` — verify all table row counts match between QNAP and new host
- [ ] T018 [US2] Run `migrate-elasticsearch.sh` — verify all index document counts match between QNAP and new host
- [ ] T019 [US2] Create and run `specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/scripts/verify-data-parity.sh` — automated row count + document count comparison script
- [ ] T020 [US2] Verify CIFS file access: list files in `/app/uploads/temp` and `/app/uploads/permanent` from backend container, compare with ASUSTOR share
**Checkpoint**: All data migrated and verified — new host has complete production data
---
## Phase 5: User Story 3 - Cutover and Smoke Test (Priority: P3)
**Goal**: Perform production cutover from old 2-host architecture to new single host, verify all DMS functions work end-to-end
**Independent Test**: Access application via new host IP, perform core DMS operations (login, document upload, search, AI inference)
### Implementation for User Story 3
- [ ] T021 [P] [US3] Create `specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/scripts/smoke-test.sh` — automated tests for: backend health, frontend accessible, login flow, document list, OCR endpoint, AI inference, full-text search
- [ ] T022 [US3] Update Gitea secrets: `HOST` → new host IP, `COMPOSE_FILE` → `specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/docker-compose.new-host.yml`
- [ ] T023 [US3] Update `scripts/deploy.sh` — change `COMPOSE_FILE` path to New-Host directory
- [ ] T024 [US3] Update NPM (Nginx Proxy Manager) on QNAP: `lcbp3.np-dms.work` → new host IP:3000 (frontend), `backend.np-dms.work` → new host IP:3001 (backend)
- [ ] T024b [US3] Update n8n workflow endpoints on QNAP: change all backend API URLs from `http://192.168.10.8:3000/api` (QNAP) to `http://<new-host-ip>:3001/api` (new host) — n8n stays on QNAP but must reach backend on new host via LAN port 3001
- [ ] T025 [US3] Run `smoke-test.sh` on new host — verify all 7 smoke tests pass
- [ ] T026 [US3] Verify from external machine on VLAN 10: access `https://lcbp3.np-dms.work`, login, create a test Correspondence, upload a PDF, trigger OCR, perform search
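The smoke tests in T021 and T025 share one shape: a named check that either passes, fails, or raises. The harness below is a sketch of that shape only. The name `run_smoke_tests` is hypothetical, and the real checks (login flow, OCR endpoint, search, and so on) would be supplied as callables. It is not the content of `smoke-test.sh`.

```python
from typing import Callable, Dict


def run_smoke_tests(checks: Dict[str, Callable[[], bool]]) -> Dict[str, str]:
    """Run each named check and record 'pass', 'fail', or 'error: <message>'."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "pass" if check() else "fail"
        except Exception as exc:
            # A crashing check must not stop the remaining checks from running.
            results[name] = f"error: {exc}"
    return results


def all_passed(results: Dict[str, str]) -> bool:
    """True only when every recorded result is exactly 'pass'."""
    return all(outcome == "pass" for outcome in results.values())
```

Recording errors separately from failures makes it easier to tell a broken check (for example, a DNS problem on the test machine) from a real regression on the new host.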
**Checkpoint**: New host is production-active — all DMS functions verified end-to-end
---
## Phase 6: User Story 4 - Remove X-API-Key and Verify Network-Only Auth (Priority: P4)
**Goal**: Remove `X-API-Key` authentication from sidecar and backend, relying solely on Docker-internal network isolation per ADR-040 D5
**Independent Test**: Attempt to access sidecar from outside Docker network (should fail); verify backend calls sidecar without API key (should succeed)
### Implementation for User Story 4
- [ ] T027 [P] [US4] Remove `OCR_SIDECAR_API_KEY` from `specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/docker-compose.new-host.yml` ocr-sidecar environment
- [ ] T028 [P] [US4] Remove API key validation from `specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/ocr-sidecar/app.py` — remove `X-API-Key` header check middleware
- [ ] T029 [US4] Remove `X-API-Key` header from `backend/src/modules/ai/services/ocr.service.ts` — remove API key from HTTP client headers
- [ ] T030 [US4] Remove `OCR_SIDECAR_API_KEY` from `backend/.env.example` and any backend config that sets it
- [ ] T031 [US4] Rebuild and redeploy sidecar + backend containers — verify backend can call sidecar without API key
- [ ] T032 [US4] Verify external access blocked: `curl http://<new-host-ip>:8765/health` from VLAN 10 machine should fail (connection refused)
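After T027 through T030, a repository scan can confirm that no reference to the key survives. The sketch below walks a directory and reports the file and line number of each match. The function name `find_references` and the default suffix list are assumptions and may need adjusting for this repository.

```python
from pathlib import Path


def find_references(root, needles,
                    suffixes=(".py", ".ts", ".yml", ".yaml", ".example", ".template")):
    """Return (relative_path, line_number) for every line containing any needle."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if any(needle in line for needle in needles):
                hits.append((str(path.relative_to(root)), lineno))
    return hits
```

Calling `find_references(".", ["X-API-Key", "OCR_SIDECAR_API_KEY"])` from the repository root should return an empty list once the removal is complete. Matches in documentation such as this task list are excluded here only because `.md` is not in the suffix list.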
**Checkpoint**: Network-only auth verified — no API key needed, Docker isolation sufficient
---
## Phase 7: User Story 5 - Decommission Old Hosts (Priority: P5)
**Goal**: Stop services on QNAP (becomes backup) and retire Desk-5439, completing the consolidation
**Independent Test**: Verify QNAP services stopped (except backup), Desk-5439 powered off, new host unaffected
### Implementation for User Story 5
- [ ] T033 [P] [US5] Create `specs/04-Infrastructure-OPS/04-00-docker-compose/New-Host/scripts/rollback.sh` — emergency rollback: stop new host, restore QNAP + Desk-5439 services, revert DNS, revert CI/CD
- [ ] T034 [US5] Monitor new host for 24-48 hours: RAM usage (`docker stats`), VRAM usage (`nvidia-smi`), container health, application logs
- [ ] T034b [US5] Verify Adaptive OCR Residency (ADR-040 D3) on new RTX 5060 Ti: load `np-dms-ai` and `np-dms-ocr` concurrently, confirm `calculate_ocr_residency()` unloads OCR model when LLM needs VRAM; verify CPU Fallback Retrieval (ADR-040 D4) activates for BGE-M3/Reranker when GPU is occupied by LLM
- [ ] T035 [US5] Stop QNAP app services: `ssh admin@192.168.10.8 'cd /share/np-dms/app && docker compose down'`
- [ ] T036 [US5] Stop QNAP service stack: `ssh admin@192.168.10.8 'cd /share/np-dms/services && docker compose down'`
- [ ] T037 [US5] Retire Desk-5439: `ssh user@192.168.10.100 'sudo shutdown -h now'` (or repurpose)
- [ ] T038 [US5] Verify new host still fully operational after old hosts decommissioned — re-run `smoke-test.sh`
- [ ] T039 [US5] Take QNAP backup snapshot: `mariadb-dump` on QNAP MariaDB (if still running) or verify existing backup is current
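For the VRAM part of T034, `nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits` prints one `used, total` pair per GPU in MiB. The sketch below parses that output and flags readings at or above a limit. The function names are hypothetical, and reading the spec's "15GB" as 15 GiB is an assumption.

```python
def parse_vram_mib(csv_output: str) -> list:
    """Parse 'used, total' lines (MiB) into a list of (used, total) integer tuples."""
    rows = []
    for line in csv_output.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def vram_over_limit(csv_output: str, limit_gib: float = 15.0) -> list:
    """Return the used-MiB readings that meet or exceed limit_gib."""
    return [used for used, _ in parse_vram_mib(csv_output) if used / 1024 >= limit_gib]
```

Sampling this periodically during the 24-48 hour window and logging any non-empty result gives a simple record of VRAM pressure alongside `docker stats`.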
**Checkpoint**: Consolidation complete — single host is sole production, old hosts decommissioned
---
## Phase 8: Polish & Cross-Cutting Concerns
**Purpose**: Documentation, monitoring, and final verification
- [ ] T040 [P] Update `specs/04-Infrastructure-OPS/04-00-docker-compose/README.md` — add New-Host section, mark QNAP as backup, mark Desk-5439 as retired
- [ ] T041 [P] Update `CONTEXT.md` — update infrastructure topology to reflect single-host architecture
- [ ] T042 [P] Update `AGENTS.md` — update infrastructure references (Desk-5439 → New Host, QNAP → backup)
- [ ] T043 Update `specs/04-Infrastructure-OPS/04-00-docker-compose/.env.template` — add ASUSTOR_USER, ASUSTOR_PASS, NEW_HOST_IP variables
- [ ] T044 [P] Update Prometheus/Grafana scrape config on ASUSTOR — update ollama-metrics target from `192.168.10.100:9924` to new host internal or host-published port
- [ ] T045 Run `quickstart.md` validation — follow all steps end-to-end on a fresh provision
- [ ] T046 [P] Document disaster recovery procedure — backup schedule, restore from QNAP backup, estimated RTO/RPO
---
## Dependencies & Execution Order
### Phase Dependencies
- **Setup (Phase 1)**: No dependencies — can start immediately
- **Foundational (Phase 2)**: Depends on Setup — BLOCKS all user stories
- **US1 (Phase 3)**: Depends on Foundational — requires physical access to new host
- **US2 (Phase 4)**: Depends on US1 (services must be running to receive migrated data)
- **US3 (Phase 5)**: Depends on US1 + US2 (services running + data migrated for cutover)
- **US4 (Phase 6)**: Depends on US3 (cutover complete, network isolation verified)
- **US5 (Phase 7)**: Depends on US3 + US4 (stable production before decommissioning)
- **Polish (Phase 8)**: Can start after US3; some tasks depend on US5
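The phase dependencies above form a small directed acyclic graph, so a valid execution order can be derived mechanically. The sketch below encodes the list as a mapping from each phase to its prerequisites and uses the standard-library `graphlib` (Python 3.9+). The names `PHASE_DEPS` and `execution_order` are hypothetical, and Polish is modeled as depending on US3 only, ignoring the note that some of its tasks wait for US5.

```python
from graphlib import TopologicalSorter

# Each phase maps to the set of phases that must finish first.
PHASE_DEPS = {
    "Setup": set(),
    "Foundational": {"Setup"},
    "US1": {"Foundational"},
    "US2": {"US1"},
    "US3": {"US1", "US2"},
    "US4": {"US3"},
    "US5": {"US3", "US4"},
    "Polish": {"US3"},
}


def execution_order(deps: dict) -> list:
    """Return one valid order in which every phase follows its prerequisites."""
    return list(TopologicalSorter(deps).static_order())
```

`TopologicalSorter` raises `CycleError` if the mapping contains a cycle, which makes it a cheap consistency check whenever the dependency list is edited.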
### User Story Dependencies
- **US1 (P1)**: Foundational → US1 — no dependencies on other stories
- **US2 (P2)**: US1 → US2 — needs running services to receive data
- **US3 (P3)**: US1 + US2 → US3 — needs running services + migrated data
- **US4 (P4)**: US3 → US4 — needs cutover complete to verify network isolation in production
- **US5 (P5)**: US3 + US4 → US5 — needs stable production before decommissioning
### Parallel Opportunities
- T002, T003 can run in parallel (different files)
- T006, T007, T008 can run in parallel (sidecar files, no dependencies)
- T015, T016 can run in parallel (different migration scripts)
- T027, T028, T030 can run in parallel (different files: compose, app.py, .env.example)
- T040, T041, T042, T044 can run in parallel (different doc files)
---
## Implementation Strategy
### MVP First (User Story 1 Only)
1. Complete Phase 1: Setup (create directory structure)
2. Complete Phase 2: Foundational (provision host + create compose)
3. Complete Phase 3: User Story 1 (deploy services)
4. **STOP and VALIDATE**: All containers healthy, network isolation verified
5. Demo to stakeholders if ready
### Incremental Delivery
1. Setup + Foundational → Infrastructure ready
2. Add US1 → Services deployed → Validate (MVP!)
3. Add US2 → Data migrated → Validate parity
4. Add US3 → Cutover complete → Validate end-to-end
5. Add US4 → Security hardened → Validate network-only auth
6. Add US5 → Old hosts retired → Validate stability
7. Polish → Documentation updated → Final validation
---
## Notes
- This is an infrastructure task — most work is shell scripts, Docker Compose YAML, and manual operations
- Physical access to the new host is required for US1
- Data migration (US2) requires SSH access to QNAP
- Cutover (US3) requires DNS/NPM access and coordination with users
- Decommission (US5) should only proceed after 24-48 hours of stable monitoring
- Rollback plan must be tested before cutover
- All env secrets must come from `.env` (gitignored) — never commit real secrets