Quickstart: OCR Sidecar Refactor

Date: 2026-06-20
Purpose: Deployment and testing guide for the OCR sidecar refactor

Prerequisites

  • Access to Desk-5439 (192.168.10.100) with Docker
  • Access to backend services (QNAP 192.168.10.8)
  • Python 3.11+ for local testing (optional)
  • pytest for testing (optional)

Phase 1: Deployment (Before ADR-041 Consolidation)

Step 1: Update Sidecar Code

  1. Navigate to sidecar directory:
cd specs/04-Infrastructure-OPS/04-00-docker-compose/Desk-5439/ocr-sidecar
  2. Update app.py with the following changes:

    • Remove the hardcoded default API key
    • Fail fast at startup if the OCR_SIDECAR_API_KEY environment variable is missing
    • Implement async I/O with a shared httpx.AsyncClient created in the lifespan handler
    • Replace @app.on_event("startup") with a lifespan context manager
    • Wire calculate_ocr_residency() into process_ocr
    • Implement path canonicalization and a base-path whitelist on /ocr
    • Remove hardcoded runtime parameters
    • Receive systemPrompt and DMS tags from the backend
    • Remove the /normalize endpoint
    • Fix the mutable default argument options_override={} (default to None instead)
    • Load models via asyncio.to_thread during lifespan startup
  3. Update requirements.txt:

PyMuPDF==1.24.0
fastapi==0.111.0
uvicorn[standard]==0.30.1
python-multipart==0.0.9
httpx==0.27.0
FlagEmbedding>=1.2.0
typhoon-ocr>=0.4.1
  4. Update .env:
# Phase 1 (before ADR-041)
OCR_SIDECAR_API_KEY=your-secure-api-key-here

# Common variables
OCR_SIDECAR_UPLOAD_BASE=/mnt/uploads
OLLAMA_API_URL=http://localhost:11434
OCR_MODEL=np-dms-ocr:latest
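
The lifespan and asyncio.to_thread items in the app.py change list can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the sidecar's actual code: load_models, SidecarState, and FakeApp are hypothetical stand-ins. A real app.py would presumably pass lifespan to FastAPI(lifespan=...) and create and close the shared httpx.AsyncClient in the same place.

```python
import asyncio
import threading
from contextlib import asynccontextmanager
from dataclasses import dataclass, field


@dataclass
class SidecarState:
    """Hypothetical container for resources shared across requests."""
    models: dict = field(default_factory=dict)
    loader_thread: str = ""


def load_models() -> dict:
    """Stand-in for a blocking model load (e.g. embedder or reranker weights)."""
    return {
        "embedder": "loaded",
        "reranker": "loaded",
        "thread": threading.current_thread().name,
    }


@asynccontextmanager
async def lifespan(app):
    # Startup: run the blocking load in a worker thread so the event loop
    # is not blocked while the load is in progress.
    state = SidecarState()
    loaded = await asyncio.to_thread(load_models)
    state.loader_thread = loaded.pop("thread")
    state.models = loaded
    app.state = state
    try:
        yield
    finally:
        # Shutdown: release resources (a real sidecar would also close
        # its shared HTTP client here).
        state.models.clear()


class FakeApp:
    """Placeholder for the FastAPI instance; only used to run the sketch."""
    state = None


async def demo() -> tuple:
    app = FakeApp()
    async with lifespan(app):
        during = (dict(app.state.models), app.state.loader_thread)
    return during, dict(app.state.models)
```

Calling asyncio.run(demo()) returns the models visible while the app is "running" and the emptied dict after shutdown, which makes the startup/shutdown ordering easy to check in a unit test.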

Step 2: Update Backend Services

  1. Update backend/src/modules/ai/services/ocr.service.ts:

    • Add parameter resolution from ai_execution_profiles (row ocr-extract)
    • Add Active Prompt resolution from ai_prompts (type ocr_extraction)
    • Extract systemPrompt and DMS tags from Active Prompt
    • Send resolved parameters to sidecar in OCR requests
    • Keep sending the X-API-Key header to the sidecar (Phase 1)
  2. Update backend/src/modules/ai/services/sandbox-ocr-engine.service.ts:

    • Same parameter resolution pattern as OcrService
    • Keep sending the X-API-Key header to the sidecar (Phase 1)
  3. Update backend .env:

# Phase 1 (before ADR-041)
OCR_API_URL=http://192.168.10.100:8765
OCR_API_KEY=your-secure-api-key-here

# Common variables
OCR_SIDECAR_UPLOAD_BASE=/app/uploads

Step 3: Rebuild and Deploy Sidecar

  1. Build Docker image on Desk-5439:
cd specs/04-Infrastructure-OPS/04-00-docker-compose/Desk-5439/ocr-sidecar
docker-compose build
  2. Stop the existing container:
docker-compose down
  3. Start the new container:
docker-compose up -d
  4. Verify health:
curl http://192.168.10.100:8765/health

Expected response:

{
  "status": "healthy",
  "timestamp": "2026-06-20T10:30:00Z",
  "version": "1.0.0"
}
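
For scripted deployments, the health check can be automated. The sketch below is one possible approach using only the standard library; it assumes the response shape shown above and treats any network error as unhealthy. The function names is_healthy and check_health are illustrative, not part of the sidecar.

```python
import json
import urllib.request

HEALTH_URL = "http://192.168.10.100:8765/health"  # from the curl example


def is_healthy(body: str) -> bool:
    """Return True when the body is a JSON object with status == "healthy"."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


def check_health(url: str = HEALTH_URL, timeout: float = 5.0) -> bool:
    """Fetch the health endpoint; connection errors count as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and is_healthy(resp.read().decode("utf-8"))
    except OSError:
        # URLError, HTTPError and socket timeouts all derive from OSError.
        return False
```

A deploy script could call check_health() in a short retry loop after docker-compose up -d and abort if it never returns True.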

Step 4: Deploy Backend Changes

  1. Build backend:
cd backend
pnpm run build
  2. Deploy backend containers (via the existing deploy script or manually):
# From repo root
./scripts/deploy.sh
  3. Verify backend health:
curl http://localhost:3001/api/ai/health

Phase 2: Deployment (After ADR-041 Consolidation)

Note: This phase can only be executed after ADR-041 server consolidation completes (single Docker host).

Step 1: Remove X-API-Key from Sidecar

  1. Update app.py on sidecar:

    • Remove X-API-Key validation from all endpoints
    • Remove OCR_SIDECAR_API_KEY environment variable check
  2. Update .env on sidecar:

# Remove OCR_SIDECAR_API_KEY line
# Keep common variables
OCR_SIDECAR_UPLOAD_BASE=/mnt/uploads
OLLAMA_API_URL=http://localhost:11434
OCR_MODEL=np-dms-ocr:latest
  3. Rebuild and redeploy sidecar:
docker-compose down
docker-compose build
docker-compose up -d

Step 2: Remove X-API-Key from Backend

  1. Update backend/src/modules/ai/services/ocr.service.ts:

    • Remove X-API-Key header from sidecar requests
    • Remove OCR_API_KEY environment variable usage
  2. Update backend/src/modules/ai/services/sandbox-ocr-engine.service.ts:

    • Remove X-API-Key header from sidecar requests
    • Remove OCR_API_KEY environment variable usage
  3. Update backend .env:

# Remove OCR_API_KEY line
# Keep common variables
# Docker-internal URL; the hostname must match the sidecar's Compose service name
OCR_API_URL=http://sidecar:8765
OCR_SIDECAR_UPLOAD_BASE=/app/uploads
  4. Rebuild and redeploy backend:
cd backend
pnpm run build
# From repo root
cd ..
./scripts/deploy.sh

Testing

Unit Tests (Sidecar)

  1. Navigate to sidecar tests directory:
cd specs/04-Infrastructure-OPS/04-00-docker-compose/Desk-5439/ocr-sidecar/tests
  2. Run path traversal tests:
pytest test_path_traversal.py -v

Expected output: All tests pass, path traversal attempts return 403

  3. Run residency wiring tests:
pytest test_residency_wiring.py -v

Expected output: All tests pass, calculate_ocr_residency() is called correctly

Integration Tests (Backend)

  1. Run backend AI service tests:
cd backend
pnpm test ai/ocr.service.spec.ts
pnpm test ai/sandbox-ocr-engine.service.spec.ts
  2. Verify parameter resolution from database:
    • Check that ai_execution_profiles row ocr-extract exists
    • Check that ai_prompts has active row for ocr_extraction type
    • Verify parameters are correctly resolved and sent to sidecar

Manual Testing

  1. Test path traversal protection:
curl -X POST http://192.168.10.100:8765/ocr \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-api-key" \
  -d '{
    "pdf_path": "/mnt/uploads/temp/../../etc/passwd",
    "runtime_params": {
      "temperature": 0.7,
      "top_p": 0.9,
      "repeat_penalty": 1.1,
      "max_tokens": 4096
    }
  }'

Expected: 403 Forbidden

  2. Test valid OCR request:
curl -X POST http://192.168.10.100:8765/ocr \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-api-key" \
  -d '{
    "pdf_path": "/mnt/uploads/temp/test.pdf",
    "runtime_params": {
      "temperature": 0.7,
      "top_p": 0.9,
      "repeat_penalty": 1.1,
      "max_tokens": 4096
    }
  }'

Expected: 200 OK with extracted text

  3. Test parameter governance:

    • Modify ai_execution_profiles row ocr-extract parameters
    • Run OCR request
    • Verify new parameters are used (check sidecar logs)
  4. Test Active Prompt integration:

    • Modify active prompt in ai_prompts for ocr_extraction
    • Run OCR request
    • Verify new system prompt is used
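
The manual checks above can also be scripted. The following sketch builds the same POST /ocr request as the curl examples, assuming the request body shape shown there (pdf_path and runtime_params) and Phase 1 authentication via X-API-Key. The helper names build_ocr_request and send are hypothetical.

```python
import json
import urllib.error
import urllib.request


def build_ocr_request(base_url: str, api_key: str, pdf_path: str,
                      runtime_params: dict) -> urllib.request.Request:
    """Build a POST /ocr request mirroring the curl examples."""
    body = json.dumps(
        {"pdf_path": pdf_path, "runtime_params": runtime_params}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/ocr",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
    )


def send(req: urllib.request.Request, timeout: float = 120.0) -> tuple:
    """Return (status, body); HTTP errors such as 403 are returned, not raised."""
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8")
```

With this in place, the traversal check reduces to asserting that send(...) returns status 403 for "/mnt/uploads/temp/../../etc/passwd" and 200 for a valid file under the upload base.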

Performance Testing

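  1. Benchmark async vs sync I/O:
# Use Apache Bench or a similar tool; the X-API-Key header is required in Phase 1
ab -n 1000 -c 10 -p ocr_request.json -T application/json \
  -H "X-API-Key: your-api-key" \
  http://192.168.10.100:8765/ocr

Expected: at least 20% higher throughput than the synchronous (pre-refactor) sidecar under the same load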

  2. Monitor VRAM usage:
# On Desk-5439, monitor GPU usage during OCR operations
nvidia-smi -l 1

Expected: VRAM usage stays within limits, no exhaustion
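
To turn the VRAM check into something a script can assert on, nvidia-smi can emit machine-readable CSV. The sketch below parses that output. The 90% threshold is a hypothetical default, since this guide does not state the actual limit, and the function names are illustrative.

```python
import subprocess

# Prints one "used, total" row per GPU, in MiB, without header or units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_vram(csv_text: str) -> list:
    """Parse 'used, total' rows into (used_mib, total_mib, fraction) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        used_str, total_str = (part.strip() for part in line.split(","))
        used, total = int(used_str), int(total_str)
        rows.append((used, total, used / total))
    return rows


def vram_over_limit(csv_text: str, limit: float = 0.9) -> bool:
    """True if any GPU uses more than `limit` of its memory (assumed threshold)."""
    return any(fraction > limit for _, _, fraction in parse_vram(csv_text))


def read_vram() -> list:
    """Query the local GPU(s); requires nvidia-smi on PATH."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_vram(result.stdout)
```

Polling read_vram() during the benchmark run gives a record of peak usage that can be compared across the old and new sidecar builds.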

Monitoring

Health Checks

  • Sidecar health: GET http://192.168.10.100:8765/health
  • Backend AI health: GET http://localhost:3001/api/ai/health

Logs

  • Sidecar logs: docker-compose logs -f ocr-sidecar
  • Backend logs: Check backend application logs

Metrics

  • Monitor OCR request latency
  • Monitor VRAM usage on Desk-5439
  • Monitor error rates (403 for path traversal, 500 for internal errors)

Rollback

If issues arise during deployment:

Rollback Sidecar

  1. Revert app.py to previous version
  2. Restore previous .env file
  3. Rebuild and redeploy:
docker-compose down
docker-compose build
docker-compose up -d

Rollback Backend

  1. Revert service changes in ocr.service.ts and sandbox-ocr-engine.service.ts
  2. Restore previous .env file
  3. Rebuild and redeploy:
cd backend
pnpm run build
# From repo root
cd ..
./scripts/deploy.sh

Emergency Rollback

If immediate rollback is needed:

  1. Revert keep_alive to fixed value 0 in process_ocr
  2. Restore hardcoded runtime parameters
  3. Restore X-API-Key validation
  4. Rebuild and redeploy

Troubleshooting

Sidecar fails to start

  1. Check environment variables are set correctly
  2. Check OCR_SIDECAR_API_KEY is provided (Phase 1)
  3. Check Docker logs: docker-compose logs ocr-sidecar
  4. Verify Ollama is running on Desk-5439
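
A small preflight check can make the first two items quicker to diagnose. The sketch below illustrates the fail-fast idea. Treating all four Phase 1 variables as required is an assumption (the guide only mandates fail-fast for OCR_SIDECAR_API_KEY), and missing_env and require_env are hypothetical helper names.

```python
import os

# Variable names from the Phase 1 sidecar .env; requiring all four is an assumption.
REQUIRED_PHASE1 = (
    "OCR_SIDECAR_API_KEY",
    "OCR_SIDECAR_UPLOAD_BASE",
    "OLLAMA_API_URL",
    "OCR_MODEL",
)


def missing_env(env, required=REQUIRED_PHASE1) -> list:
    """Return the names of required variables that are unset or blank."""
    return [name for name in required if not (env.get(name) or "").strip()]


def require_env(env=None) -> None:
    """Fail fast with a clear message instead of starting half-configured."""
    env = os.environ if env is None else env
    missing = missing_env(env)
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
```

Calling require_env() before the app object is created means a missing key shows up as a single clear line in docker-compose logs rather than as a failure on the first request.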

Path traversal returns 200 instead of 403

  1. Verify OCR_SIDECAR_UPLOAD_BASE is set correctly
  2. Check path canonicalization logic in app.py
  3. Test with absolute paths to verify whitelist check
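
When reviewing the canonicalization logic, it can help to compare it against a reference check. The sketch below is one common way to implement a base-path whitelist in Python. It is an assumption about the approach rather than a copy of app.py, and is_path_allowed is a hypothetical name. The endpoint would respond with 403 whenever it returns False.

```python
import os


def is_path_allowed(requested: str, base: str) -> bool:
    """Return True only if `requested` resolves to a location inside `base`."""
    # realpath collapses ".." segments and follows symlinks, so the check
    # runs against the path that would actually be opened.
    real_base = os.path.realpath(base)
    real_target = os.path.realpath(requested)
    try:
        # commonpath compares whole path components, so a sibling such as
        # "/mnt/uploads-evil" does not pass as a prefix of "/mnt/uploads".
        return os.path.commonpath([real_base, real_target]) == real_base
    except ValueError:
        # Raised when the paths cannot be compared (e.g. different drives).
        return False
```

A plain string startswith check is a frequent cause of a 200 where a 403 is expected, because it accepts sibling directories that merely share the base path as a prefix.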

Parameters not being used

  1. Check ai_execution_profiles row ocr-extract exists
  2. Check backend service parameter resolution logic
  3. Check sidecar receives parameters in request body
  4. Check sidecar passes parameters to Ollama
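
For the last item, one thing worth checking is the name translation between the request body and Ollama. Ollama's options object uses num_predict for the generation limit, so a max_tokens value passed through unchanged would likely be ignored. The sketch below shows one possible mapping. The assumption that the sidecar maps max_tokens to num_predict is mine, and build_ollama_options is a hypothetical helper. It also shows the None default that replaces the mutable options_override={}.

```python
from typing import Optional

# Request-body names (from the curl examples) mapped to Ollama option names.
# Mapping max_tokens to num_predict is an assumption about the sidecar.
PARAM_TO_OLLAMA = {
    "temperature": "temperature",
    "top_p": "top_p",
    "repeat_penalty": "repeat_penalty",
    "max_tokens": "num_predict",
}


def build_ollama_options(runtime_params: dict,
                         options_override: Optional[dict] = None) -> dict:
    """Translate runtime_params into an Ollama "options" object."""
    # Unknown keys are dropped rather than forwarded.
    options = {
        PARAM_TO_OLLAMA[key]: value
        for key, value in runtime_params.items()
        if key in PARAM_TO_OLLAMA
    }
    # A None default avoids sharing one dict object across calls,
    # which is the bug a literal {} default can cause.
    options.update(options_override or {})
    return options
```

Logging the returned dict just before the Ollama call gives a direct way to confirm that profile changes in ai_execution_profiles reach the model.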

VRAM exhaustion

  1. Check calculate_ocr_residency() is being called
  2. Check vram_monitor.py and residency_policy.py are present
  3. Verify CPU fallback is working for /embed and /rerank
  4. Monitor GPU usage with nvidia-smi

References

  • ADR-040: OCR Sidecar Refactor
  • ADR-036: Profile-Only Parameter Governance
  • ADR-029: Dynamic Prompt Management
  • ADR-037: Active Prompt System
  • ADR-041: Server Consolidation (dependency for Phase 2)
  • Sidecar API Contract