// File: specs/100-Infrastructures/141-server-consolidation/research.md
// Change Log:
// - 2026-06-20: Initial research for Single-Host Server Consolidation
Research: Single-Host Server Consolidation
Branch: 141-server-consolidation | Date: 2026-06-20
R1: Docker Network Isolation Strategy
Decision: Use two Docker bridge networks — dms-internal (all services) and dms-frontend (Frontend + Backend only, for LAN publish).
Rationale: Docker bridge networks provide L2 isolation. Services on dms-internal that publish no ports are unreachable from the LAN. Only Frontend (3000) and Backend (3000) need LAN access. This replaces reliance on VLAN/firewall ACLs with Docker-native isolation.
Alternatives Considered:
- Single bridge network + iptables rules — more complex, error-prone
- Docker Swarm overlay network — overkill for single host
- Host network mode — no isolation, security risk
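As a rough illustration of the decision above, the two networks could be created by hand as follows. This is a dry-run sketch: the `run` helper and the `DRY_RUN` switch are hypothetical, and in the real deployment the networks would more likely be declared in the Compose file.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): prints the command unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create the two bridge networks named in the decision.
create_dms_networks() {
  run docker network create --driver bridge dms-internal
  run docker network create --driver bridge dms-frontend
}

create_dms_networks
```

Only containers that publish ports are reachable from the LAN, so the isolation depends on keeping port mappings off every service except Frontend and Backend.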
R2: CIFS Mount Strategy for ASUSTOR
Decision: Use Docker named volume with CIFS driver to mount ASUSTOR share //192.168.10.9/np-dms-as/data/uploads as asustor_uploads volume, mounted at /mnt/uploads in sidecar and /app/uploads in backend.
Rationale: Docker CIFS volume driver handles mount lifecycle with container start/stop. Credentials in .env (gitignored). Both backend and sidecar see the same files via the same CIFS mount point.
Alternatives Considered:
- Host-level mount -t cifs then bind mount — requires host OS config, not portable
- SSHFS — slower than CIFS for file operations
- Sync files to local SSD — adds complexity, storage duplication
Key Consideration: Previous Desk-5439 setup had issues with Docker Desktop WSL2 + CIFS (see memory). On Linux host, CIFS volume driver works natively without WSL2 layer.
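A minimal sketch of how the asustor_uploads volume might be created, assuming the "CIFS driver" refers to Docker's built-in `local` volume driver with `type=cifs`. The variable names `CIFS_USER` and `CIFS_PASS` and the `run` dry-run helper are hypothetical; the share path and volume name come from the decision above.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): prints the command unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# CIFS_USER and CIFS_PASS are assumed names for the credentials kept in .env.
# Note: in dry-run mode the printed command includes the password.
create_uploads_volume() {
  nas_host="192.168.10.9"
  run docker volume create --driver local \
    --opt type=cifs \
    --opt device="//${nas_host}/np-dms-as/data/uploads" \
    --opt o="addr=${nas_host},username=${CIFS_USER},password=${CIFS_PASS}" \
    asustor_uploads
}
```

Calling `create_uploads_volume` with `DRY_RUN=0` would run the real command. The same volume can then be mounted at /mnt/uploads in the sidecar and /app/uploads in the backend.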
R3: MariaDB Migration Strategy
Decision: Use mariadb-dump (logical dump) from QNAP MariaDB 11.8 and restore the dump file into the MariaDB 11.8 container on the new host.
Rationale: Same MariaDB version (11.8) on both hosts → logical dump is safest. Database is small enough (<10GB estimated) that dump/restore completes within maintenance window.
Alternatives Considered:
- mariabackup (physical backup) — faster but requires same filesystem layout
- Replication (binlog) — overkill for one-time migration
- Copy raw data files — risky, requires same version + config
Migration Command:
# From QNAP (source) — dump all databases
mariadb-dump --single-transaction --routines --triggers \
-h 127.0.0.1 -u root -p"$DB_ROOT_PASSWORD" \
--all-databases > qnap-full-dump.sql
# On new host — restore
docker exec -i lcbp3-mariadb mariadb -u root -p"$DB_ROOT_PASSWORD" < qnap-full-dump.sql
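Because the dump is written to a file before it is restored, one possible safeguard is to confirm the file is complete first. The sketch below relies on the trailer comment that mariadb-dump appends on success, and assumes comments were not disabled (for example with --skip-comments or --compact). The function name `dump_is_complete` is hypothetical.

```shell
#!/bin/sh
# Return 0 if the dump ends with the "-- Dump completed" trailer that
# mariadb-dump writes when it finishes. Checks the last few lines only.
dump_is_complete() {
  tail -n 3 "$1" | grep -q '^-- Dump completed'
}
```

For example, `dump_is_complete qnap-full-dump.sql || echo "dump looks truncated"` could gate the restore step.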
R4: Elasticsearch Migration Strategy
Decision: Use ES snapshot/restore API — create snapshot on QNAP ES, transfer to new host, restore.
Rationale: The ES snapshot API is the official migration path. It handles index mappings, settings, and data, and works between clusters running the same ES version (8.11.x).
Alternatives Considered:
- Copy raw data directory — risky, requires identical ES config
- Re-index from MariaDB — slow, loses search index tuning
- Logstash pipeline — overkill for one-time migration
Migration Steps:
- Register shared filesystem repo on QNAP ES
- Create snapshot of all indices
- Copy snapshot files to new host ES data volume
- Register repo on new host ES
- Restore snapshot
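The steps above map onto the snapshot REST endpoints roughly as sketched below. The repository name, snapshot name, repository path, and the `run` dry-run helper are all hypothetical. The repository path must also be listed under `path.repo` in elasticsearch.yml on each node, and authentication flags are omitted here.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): prints the command unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical names; adjust to the real environment.
REPO="dms_migration"
SNAPSHOT="pre_cutover"
REPO_PATH="/usr/share/elasticsearch/snapshots"

# Steps 1 and 4: register a shared-filesystem repository. $1 is the ES base URL.
register_repo() {
  run curl -sS -X PUT "$1/_snapshot/$REPO" \
    -H 'Content-Type: application/json' \
    -d "{\"type\":\"fs\",\"settings\":{\"location\":\"$REPO_PATH\"}}"
}

# Step 2: snapshot all indices and block until the snapshot finishes.
create_snapshot() {
  run curl -sS -X PUT "$1/_snapshot/$REPO/$SNAPSHOT?wait_for_completion=true"
}

# Step 5: restore the snapshot on the new host.
restore_snapshot() {
  run curl -sS -X POST "$1/_snapshot/$REPO/$SNAPSHOT/_restore"
}
```

Step 3 (copying the snapshot files) happens between `create_snapshot` on the QNAP and `register_repo` on the new host, using whatever file transfer is available.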
R5: GPU VRAM Management on Single Host
Decision: Rely on ADR-040 D3 (Adaptive OCR Residency via calculate_ocr_residency()) and ADR-040 D4 (CPU Fallback Retrieval), and apply the LLM-First GPU Ownership policy from CONTEXT.md.
Rationale: RTX 5060 Ti 16GB must serve:
- np-dms-ai (Typhoon-2.5 ~7-8B): ~6-8GB VRAM
- np-dms-ocr (Typhoon OCR): ~5GB VRAM
- nomic-embed-text: ~0.5GB VRAM
- CUDA overhead: ~1.5GB
- Total: ~13-15GB → tight but feasible with adaptive residency
Key Policy: When LLM (np-dms-ai) needs to load, OCR model is unloaded first (keep_alive=0 for OCR). BGE-M3 + Reranker use CPU fallback when GPU is occupied.
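As an illustration of the unload step, Ollama releases a model immediately when it receives a generate request with `keep_alive` set to 0. The sketch assumes the OCR model is served by Ollama under the name np-dms-ocr and that the API listens on the default port; `OLLAMA_URL` and the `run` dry-run helper are hypothetical.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): prints the command unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Assumed endpoint; the document does not state where Ollama listens.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Send an empty generate request with keep_alive 0 so Ollama unloads the model.
unload_model() {
  run curl -sS "$OLLAMA_URL/api/generate" \
    -d "{\"model\":\"$1\",\"keep_alive\":0}"
}

unload_model np-dms-ocr
```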
Alternatives Considered:
- Force GPU-resident for all models — OOM risk (up to ~15GB of 16GB before BGE-M3 and Reranker, leaving no headroom)
- CPU-only for all AI — too slow for production
- Second GPU — not available on new host
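The VRAM budget in the rationale can be re-added with a few lines of shell arithmetic. The figures are the approximate ones listed above, converted to MiB on the assumption that "GB" means GiB; the variable names are illustrative.

```shell
#!/bin/sh
# VRAM figures from the rationale, in MiB (assuming 1 GB = 1024 MiB).
LLM_LOW=6144      # np-dms-ai, lower estimate (~6GB)
LLM_HIGH=8192     # np-dms-ai, upper estimate (~8GB)
OCR=5120          # np-dms-ocr (~5GB)
EMBED=512         # nomic-embed-text (~0.5GB)
CUDA=1536         # CUDA overhead (~1.5GB)
GPU_TOTAL=16384   # RTX 5060 Ti 16GB

vram_low=$((LLM_LOW + OCR + EMBED + CUDA))
vram_high=$((LLM_HIGH + OCR + EMBED + CUDA))
headroom=$((GPU_TOTAL - vram_high))

echo "all-resident: ${vram_low}-${vram_high} MiB, headroom at peak: ${headroom} MiB"
```

The upper estimate leaves about 1 GiB free before BGE-M3 and the Reranker are counted, which is why those two fall back to CPU.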
R6: RAM Budget Allocation
Decision: Per-container memory limits in Docker Compose:
| Service | Memory Limit | Notes |
|---|---|---|
| MariaDB | 8G | Largest consumer, tune innodb_buffer_pool |
| Elasticsearch | 4G | ES_JAVA_OPTS=-Xms2g -Xmx2g |
| Backend (NestJS) | 2G | Node.js + BullMQ workers |
| Frontend (Next.js) | 1G | Standalone mode |
| Redis | 1G | In-memory + AOF |
| Qdrant | 1G | Vector DB |
| OCR Sidecar | 1G | Python + PyMuPDF |
| Ollama | 2G | Model loading + inference |
| ClamAV | 2G | Virus definitions |
| ollama-metrics | 256M | Lightweight proxy |
| Total | ~22.3G | Leaves ~9.7G for OS + swap |
Rationale: 32GB total - 22.3GB containers = ~9.7GB for OS kernel + page cache + swap. Comfortable margin.
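The total can be checked by summing the limits from the table in MiB. This is only an arithmetic sketch; it treats the 32GB host as 32 GiB, which is an assumption.

```shell
#!/bin/sh
# Memory limits from the table, in MiB, in table order:
# MariaDB, Elasticsearch, Backend, Frontend, Redis, Qdrant,
# OCR Sidecar, Ollama, ClamAV, ollama-metrics.
limits="8192 4096 2048 1024 1024 1024 1024 2048 2048 256"

total=0
for mib in $limits; do
  total=$((total + mib))
done

host_mib=$((32 * 1024))
remaining=$((host_mib - total))

echo "containers: ${total} MiB, remaining: ${remaining} MiB"
```

The sum is 22784 MiB (22.25 GiB), leaving 9984 MiB (9.75 GiB), which matches the rounded figures in the table.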
Alternatives Considered:
- No limits — risk of OOM killer affecting critical services
- Tighter limits — may cause ES/MariaDB instability
R7: CI/CD Pipeline Update
Decision: Update Gitea Actions ci-deploy.yml to SSH-deploy to new host IP instead of QNAP IP. ASUSTOR Gitea runner stays unchanged.
Rationale: The Gitea runner on ASUSTOR (192.168.10.9) can reach the new host via VLAN 10, so only the deploy target IP changes. The compose file path in deploy.sh changes to New-Host/docker-compose.new-host.yml.
Alternatives Considered:
- Move Gitea runner to new host — unnecessary, runner works remotely
- Manual deployment — not sustainable for ongoing releases
R8: Rollback Strategy
Decision: Multi-step rollback plan documented in rollback.sh:
- Stop services on new host (docker compose down)
- Restore services on QNAP (start existing containers with old data)
- Restore services on Desk-5439 (start Ollama + sidecar)
- Revert DNS/NPM to point to QNAP
- Revert Gitea CI/CD deploy target to QNAP
- Re-enable X-API-Key in sidecar + backend
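A skeleton for how rollback.sh might order these steps is sketched below. Only the first step uses a command stated in this document (with the compose file path from R7). The remaining steps depend on hosts and tooling not described here, so they appear as placeholders. The `run` and `todo` helpers are hypothetical.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): prints the command unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Placeholder for steps whose commands are not specified in this document.
todo() {
  echo "TODO: $*"
}

rollback() {
  # Step 1: stop services on the new host; abort if this fails.
  run docker compose -f New-Host/docker-compose.new-host.yml down || return 1
  todo "restore services on QNAP"
  todo "restore Ollama and sidecar on Desk-5439"
  todo "revert DNS/NPM to QNAP"
  todo "revert Gitea CI/CD deploy target to QNAP"
  todo "re-enable X-API-Key in sidecar and backend"
}

rollback
```

Stopping the new host first avoids two stacks writing to the same upload share while the old services come back up.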
Rationale: QNAP retains all data (MariaDB, ES, Redis, files) until verified stable. Rollback is fast (<2 hours) because old infrastructure is intact.
Alternatives Considered:
- No rollback (accept SPOF) — too risky for production DMS
- Hot failover with replication — overkill for current scale