Quickstart: OCR Sidecar Refactor
Date: 2026-06-20
Purpose: Deployment and testing guide for OCR sidecar refactor
Prerequisites
- Access to Desk-5439 (192.168.10.100) with Docker
- Access to backend services (QNAP 192.168.10.8)
- Python 3.11+ for local testing (optional)
- pytest for testing (optional)
Phase 1: Deployment (Before ADR-041 Consolidation)
Step 1: Update Sidecar Code
- Navigate to sidecar directory:
cd specs/04-Infrastructure-OPS/04-00-docker-compose/Desk-5439/ocr-sidecar
- Update app.py with the following changes:
  - Remove hardcoded default API key
  - Fail-fast if OCR_SIDECAR_API_KEY env missing
  - Implement async I/O with httpx.AsyncClient via lifespan
  - Replace @app.on_event("startup") with lifespan context manager
  - Wire calculate_ocr_residency() into process_ocr
  - Implement path canonicalization + base-path whitelist on /ocr
  - Remove hardcoded runtime parameters
  - Receive systemPrompt and DMS tags from backend
  - Remove /normalize endpoint
  - Fix mutable default argument options_override={}
  - Load models via asyncio.to_thread during lifespan
- Update requirements.txt:
PyMuPDF==1.24.0
fastapi==0.111.0
uvicorn[standard]==0.30.1
python-multipart==0.0.9
httpx==0.27.0
FlagEmbedding>=1.2.0
typhoon-ocr>=0.4.1
- Update .env:
# Phase 1 (before ADR-041)
OCR_SIDECAR_API_KEY=your-secure-api-key-here
# Common variables
OCR_SIDECAR_UPLOAD_BASE=/mnt/uploads
OLLAMA_API_URL=http://localhost:11434
OCR_MODEL=np-dms-ocr:latest
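The lifespan-related items in the app.py change list (replacing the startup event, loading models via asyncio.to_thread) can be sketched as follows. This is a minimal, stdlib-only illustration rather than the sidecar's actual code: load_models and FakeApp are hypothetical stand-ins, and the state is a plain dict instead of FastAPI's app.state object. In FastAPI, an async context manager of this shape is passed as FastAPI(lifespan=lifespan).

```python
import asyncio
from contextlib import asynccontextmanager


def load_models() -> dict:
    """Hypothetical stand-in for the blocking model load."""
    return {"embedder": "loaded", "reranker": "loaded"}


class FakeApp:
    """Hypothetical stand-in for the FastAPI app; only `state` is used."""

    def __init__(self) -> None:
        self.state: dict = {}


@asynccontextmanager
async def lifespan(app):
    # Startup: run the blocking load in a worker thread so the
    # event loop is not blocked while models are read from disk.
    app.state["models"] = await asyncio.to_thread(load_models)
    # The real sidecar would also create its shared httpx.AsyncClient here.
    try:
        yield
    finally:
        # Shutdown: release what startup created
        # (the real sidecar would close its HTTP client here as well).
        app.state.clear()


async def demo() -> dict:
    app = FakeApp()
    async with lifespan(app):
        # Inside the context, handlers can rely on the models being loaded.
        return dict(app.state["models"])


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Keeping startup and shutdown in one function makes it harder to forget the cleanup half, which is the main practical gain over separate startup and shutdown event handlers.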
Step 2: Update Backend Services
- Update backend/src/modules/ai/services/ocr.service.ts:
  - Add parameter resolution from ai_execution_profiles (row ocr-extract)
  - Add Active Prompt resolution from ai_prompts (type ocr_extraction)
  - Extract systemPrompt and DMS tags from Active Prompt
  - Send resolved parameters to sidecar in OCR requests
  - Keep X-API-Key send-side (Phase 1)
- Update backend/src/modules/ai/services/sandbox-ocr-engine.service.ts:
  - Same parameter resolution pattern as OcrService
  - Keep X-API-Key send-side (Phase 1)
- Update backend .env:
# Phase 1 (before ADR-041)
OCR_API_URL=http://192.168.10.100:8765
OCR_API_KEY=your-secure-api-key-here
# Common variables
OCR_SIDECAR_UPLOAD_BASE=/app/uploads
Step 3: Rebuild and Deploy Sidecar
- Build Docker image on Desk-5439:
cd specs/04-Infrastructure-OPS/04-00-docker-compose/Desk-5439/ocr-sidecar
docker-compose build
- Stop existing container:
docker-compose down
- Start new container:
docker-compose up -d
- Verify health:
curl http://192.168.10.100:8765/health
Expected response:
{
"status": "healthy",
"timestamp": "2026-06-20T10:30:00Z",
"version": "1.0.0"
}
Step 4: Deploy Backend Changes
- Build backend:
cd backend
pnpm run build
- Deploy backend containers (via the existing deploy script or manually):
# From repo root
./scripts/deploy.sh
- Verify backend health:
curl http://localhost:3001/api/ai/health
Phase 2: Deployment (After ADR-041 Consolidation)
Note: This phase can only be executed after ADR-041 server consolidation completes (single Docker host).
Step 1: Remove X-API-Key from Sidecar
- Update app.py on sidecar:
  - Remove X-API-Key validation from all endpoints
  - Remove OCR_SIDECAR_API_KEY environment variable check
- Update .env on sidecar:
# Remove OCR_SIDECAR_API_KEY line
# Keep common variables
OCR_SIDECAR_UPLOAD_BASE=/mnt/uploads
OLLAMA_API_URL=http://localhost:11434
TYPHOON_OCR_MODEL=typhoon-np-dms-ocr:latest
- Rebuild and redeploy sidecar:
docker-compose down
docker-compose build
docker-compose up -d
Step 2: Remove X-API-Key from Backend
- Update backend/src/modules/ai/services/ocr.service.ts:
  - Remove X-API-Key header from sidecar requests
  - Remove OCR_API_KEY environment variable usage
- Update backend/src/modules/ai/services/sandbox-ocr-engine.service.ts:
  - Remove X-API-Key header from sidecar requests
  - Remove OCR_API_KEY environment variable usage
- Update backend .env:
# Remove OCR_API_KEY line
# Keep common variables
OCR_API_URL=http://sidecar:8765 # Docker-internal URL
OCR_SIDECAR_UPLOAD_BASE=/app/uploads
- Rebuild and redeploy backend:
cd backend
pnpm run build
# deploy.sh is run from the repo root
cd ..
./scripts/deploy.sh
Testing
Unit Tests (Sidecar)
- Navigate to sidecar tests directory:
cd specs/04-Infrastructure-OPS/04-00-docker-compose/Desk-5439/ocr-sidecar/tests
- Run path traversal tests:
pytest test_path_traversal.py -v
Expected output: All tests pass, path traversal attempts return 403
- Run residency wiring tests:
pytest test_residency_wiring.py -v
Expected output: All tests pass, calculate_ocr_residency() is called correctly
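A residency wiring test of the kind described above could look roughly like this. It is a sketch under stated assumptions: the real signature and return value of calculate_ocr_residency() are not given in this guide, so both functions below are hypothetical stand-ins, and the idea that the result feeds keep_alive is inferred from the Emergency Rollback section.

```python
from unittest import mock


def calculate_ocr_residency() -> int:
    """Hypothetical stand-in; the real signature is not shown in this guide."""
    return 0


def process_ocr(pdf_path: str) -> dict:
    """Hypothetical stand-in handler that asks the residency policy for keep_alive."""
    keep_alive = calculate_ocr_residency()
    return {"pdf_path": pdf_path, "keep_alive": keep_alive}


def test_process_ocr_calls_residency() -> None:
    spy = mock.Mock(return_value=300)
    # Swap the module-level name so the call made inside process_ocr is recorded.
    # The original function is restored when the with-block exits.
    with mock.patch.dict(globals(), {"calculate_ocr_residency": spy}):
        result = process_ocr("/mnt/uploads/temp/test.pdf")
    # The wiring is correct if the policy was consulted exactly once
    # and its answer ended up in the handler's result.
    spy.assert_called_once_with()
    assert result["keep_alive"] == 300
```

Against the real module, mock.patch("app.calculate_ocr_residency") would be the more usual form, assuming process_ocr looks the function up by that module-level name.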
Integration Tests (Backend)
- Run backend AI service tests:
cd backend
pnpm test ai/ocr.service.spec.ts
pnpm test ai/sandbox-ocr-engine.service.spec.ts
- Verify parameter resolution from database:
  - Check that ai_execution_profiles row ocr-extract exists
  - Check that ai_prompts has active row for ocr_extraction type
  - Verify parameters are correctly resolved and sent to sidecar
Manual Testing
- Test path traversal protection:
curl -X POST http://192.168.10.100:8765/ocr \
-H "Content-Type: application/json" \
-H "X-API-Key: your-api-key" \
-d '{
"pdf_path": "/mnt/uploads/temp/../../etc/passwd",
"runtime_params": {
"temperature": 0.7,
"top_p": 0.9,
"repeat_penalty": 1.1,
"max_tokens": 4096
}
}'
Expected: 403 Forbidden
- Test valid OCR request:
curl -X POST http://192.168.10.100:8765/ocr \
-H "Content-Type: application/json" \
-H "X-API-Key: your-api-key" \
-d '{
"pdf_path": "/mnt/uploads/temp/test.pdf",
"runtime_params": {
"temperature": 0.7,
"top_p": 0.9,
"repeat_penalty": 1.1,
"max_tokens": 4096
}
}'
Expected: 200 OK with extracted text
- Test parameter governance:
  - Modify ai_execution_profiles row ocr-extract parameters
  - Run OCR request
  - Verify new parameters are used (check sidecar logs)
- Test Active Prompt integration:
  - Modify active prompt in ai_prompts for ocr_extraction
  - Run OCR request
  - Verify new system prompt is used
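For repeated manual runs, the curl requests above can be reproduced from Python. The sketch below mirrors the request body and headers used in the curl examples; the function names build_ocr_request and send are illustrative, and the response format is not assumed beyond the status code and raw body.

```python
import json
import urllib.request

SIDECAR_URL = "http://192.168.10.100:8765/ocr"


def build_ocr_request(pdf_path: str, api_key: str) -> urllib.request.Request:
    """Build the same POST request as the curl examples, without sending it."""
    body = {
        "pdf_path": pdf_path,
        "runtime_params": {
            "temperature": 0.7,
            "top_p": 0.9,
            "repeat_penalty": 1.1,
            "max_tokens": 4096,
        },
    }
    return urllib.request.Request(
        SIDECAR_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def send(request: urllib.request.Request) -> tuple[int, str]:
    """Send the request and return (status, body).

    urllib raises urllib.error.HTTPError for 4xx/5xx responses,
    so a 403 from the traversal test surfaces as an exception.
    """
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.status, response.read().decode("utf-8")
```

Calling send(build_ocr_request("/mnt/uploads/temp/test.pdf", "your-api-key")) corresponds to the valid OCR request; passing the traversal path instead should raise an HTTPError with code 403 if the whitelist check is working.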
Performance Testing
- Benchmark async vs sync I/O:
# Use Apache Bench or a similar tool (Phase 1: pass the API key with -H)
ab -n 1000 -c 10 -p ocr_request.json -T application/json \
-H "X-API-Key: your-api-key" \
http://192.168.10.100:8765/ocr
Expected: 20%+ throughput improvement with async I/O
- Monitor VRAM usage:
# On Desk-5439, monitor GPU usage during OCR operations
nvidia-smi -l 1
Expected: VRAM usage stays within limits, no exhaustion
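The 20% throughput target from the benchmark step can be checked with simple arithmetic on the "Requests per second" figure that ab reports for the old and new builds. A small sketch, with made-up numbers used purely for illustration (they are not measurements from this system):

```python
def throughput_gain_percent(sync_rps: float, async_rps: float) -> float:
    """Relative throughput change of the async build over the sync build, in percent."""
    if sync_rps <= 0:
        raise ValueError("sync_rps must be positive")
    return (async_rps - sync_rps) / sync_rps * 100.0


def meets_target(sync_rps: float, async_rps: float, target_percent: float = 20.0) -> bool:
    """True when the gain reaches the target (20% by default, as stated above)."""
    return throughput_gain_percent(sync_rps, async_rps) >= target_percent


if __name__ == "__main__":
    # Illustrative values only: 10.0 req/s before, 12.5 req/s after.
    print(throughput_gain_percent(10.0, 12.5))  # 25.0
    print(meets_target(10.0, 12.5))             # True
```

Both runs should use the same ocr_request.json, concurrency, and request count, otherwise the two figures are not comparable.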
Monitoring
Health Checks
- Sidecar health: GET http://192.168.10.100:8765/health
- Backend AI health: GET http://localhost:3001/api/ai/health
Logs
- Sidecar logs: docker-compose logs -f ocr-sidecar
- Backend logs: Check backend application logs
Metrics
- Monitor OCR request latency
- Monitor VRAM usage on Desk-5439
- Monitor error rates (403 for path traversal, 500 for internal errors)
Rollback
If issues arise during deployment:
Rollback Sidecar
- Revert app.py to previous version
- Restore previous .env file
- Rebuild and redeploy:
docker-compose down
docker-compose build
docker-compose up -d
Rollback Backend
- Revert service changes in ocr.service.ts and sandbox-ocr-engine.service.ts
- Restore previous .env file
- Rebuild and redeploy:
cd backend
pnpm run build
# deploy.sh is run from the repo root
cd ..
./scripts/deploy.sh
Emergency Rollback
If immediate rollback is needed:
- Revert keep_alive to fixed value 0 in process_ocr
- Restore hardcoded runtime parameters
- Restore X-API-Key validation
- Rebuild and redeploy
Troubleshooting
Sidecar fails to start
- Check environment variables are set correctly
- Check OCR_SIDECAR_API_KEY is provided (Phase 1)
- Check Docker logs: docker-compose logs ocr-sidecar
- Verify Ollama is running on Desk-5439
Path traversal returns 200 instead of 403
- Verify OCR_SIDECAR_UPLOAD_BASE is set correctly
- Check path canonicalization logic in app.py
- Test with absolute paths to verify whitelist check
Parameters not being used
- Check ai_execution_profiles row ocr-extract exists
- Check backend service parameter resolution logic
- Check sidecar receives parameters in request body
- Check sidecar passes parameters to Ollama
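For the last check, the sketch below shows one way the received runtime_params could be turned into an Ollama options object. It assumes the request body shape used in the manual tests and assumes max_tokens is mapped to Ollama's num_predict option; the function name build_ollama_options is hypothetical. It also illustrates the mutable-default fix from Phase 1, using None rather than {} as the default for options_override.

```python
from typing import Optional


def build_ollama_options(runtime_params: dict,
                         options_override: Optional[dict] = None) -> dict:
    """Map backend-supplied runtime parameters to Ollama option names."""
    # A missing key raises KeyError on purpose: parameters come from the
    # backend profile, so there are no hardcoded fallbacks here.
    options = {
        "temperature": runtime_params["temperature"],
        "top_p": runtime_params["top_p"],
        "repeat_penalty": runtime_params["repeat_penalty"],
        # Assumption: max_tokens corresponds to Ollama's num_predict.
        "num_predict": runtime_params["max_tokens"],
    }
    # None as the default avoids sharing one dict object across calls,
    # which is what a `options_override={}` default would do.
    if options_override:
        options.update(options_override)
    return options


if __name__ == "__main__":
    params = {"temperature": 0.7, "top_p": 0.9,
              "repeat_penalty": 1.1, "max_tokens": 4096}
    print(build_ollama_options(params))
```

Logging the resulting dict just before the Ollama call is a quick way to confirm that profile changes made in ai_execution_profiles actually reach the model.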
VRAM exhaustion
- Check calculate_ocr_residency() is being called
- Check vram_monitor.py and residency_policy.py are present
- Verify CPU fallback is working for /embed and /rerank
- Monitor GPU usage with nvidia-smi
References
- ADR-040: OCR Sidecar Refactor
- ADR-036: Profile-Only Parameter Governance
- ADR-029: Dynamic Prompt Management
- ADR-037: Active Prompt System
- ADR-041: Server Consolidation (dependency for Phase 2)
- Sidecar API Contract