feat(infra-ops): finalize infrastructure configurations before merge

- Update ASUSTOR gitea-runner and registry configurations
- Add environment examples for registry services
- Clean up MariaDB configuration files
- Prepare for merge to main branch
690420:2332 Refactor QNAP service
2026-04-21 13:33:12 +07:00 · 2026-04-20 23:32:30 +07:00
20 changed files with 1696 additions and 34 deletions
@@ -28,7 +28,7 @@
    "editor.rulers": [80, 120],
    "editor.minimap.enabled": true,
    "editor.minimap.sectionHeaderFontSize": 12,
-    "editor.renderWhitespace": "selection",
+    "editor.renderWhitespace": "none",
    // "editor.renderWhitespace": "boundary",
    "editor.renderControlCharacters": true,
    "editor.bracketPairColorization.enabled": true,
@@ -0,0 +1,34 @@
+# Specification Quality Checklist: Infrastructure Operations & Deployment Automation
+
+**Purpose**: Validate specification completeness and quality before proceeding to planning
+**Created**: 2026-04-20
+**Feature**: [Infrastructure Operations & Deployment Automation](../spec.md)
+
+## Content Quality
+
+- [x] No implementation details (languages, frameworks, APIs)
+- [x] Focused on user value and business needs
+- [x] Written for non-technical stakeholders
+- [x] All mandatory sections completed
+
+## Requirement Completeness
+
+- [x] No [NEEDS CLARIFICATION] markers remain
+- [x] Requirements are testable and unambiguous
+- [x] Success criteria are measurable
+- [x] Success criteria are technology-agnostic (no implementation details)
+- [x] All acceptance scenarios are defined
+- [x] Edge cases are identified
+- [x] Scope is clearly bounded
+- [x] Dependencies and assumptions identified
+
+## Feature Readiness
+
+- [x] All functional requirements have clear acceptance criteria
+- [x] User scenarios cover primary flows
+- [x] Feature meets measurable outcomes defined in Success Criteria
+- [x] No implementation details leak into specification
+
+## Notes
+
+- Items marked incomplete require spec updates before `/speckit-clarify` or `/speckit-plan`
@@ -0,0 +1,500 @@
+openapi: 3.0.3
+info:
+  title: Infrastructure Operations API
+  description: API for managing infrastructure operations, deployments, and monitoring
+  version: 1.0.0
+  contact:
+    name: Infrastructure Team
+    email: infra@np-dms.work
+
+paths:
+  /deployments:
+    get:
+      summary: List all deployments
+      description: Retrieve status of all deployment environments
+      tags:
+        - Deployments
+      responses:
+        '200':
+          description: List of deployments retrieved successfully
+          content:
+            application/json:
+              schema:
+                type: object
+                properties:
+                  deployments:
+                    type: array
+                    items:
+                      $ref: '#/components/schemas/Deployment'
+    
+    post:
+      summary: Create new deployment
+      description: Initiate a new deployment to specified environment
+      tags:
+        - Deployments
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              $ref: '#/components/schemas/DeploymentRequest'
+      responses:
+        '201':
+          description: Deployment initiated successfully
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/Deployment'
+        '400':
+          description: Invalid deployment request
+        '409':
+          description: Deployment already in progress
+
+  /deployments/{deploymentId}:
+    get:
+      summary: Get deployment details
+      description: Retrieve detailed information about a specific deployment
+      tags:
+        - Deployments
+      parameters:
+        - name: deploymentId
+          in: path
+          required: true
+          schema:
+            type: string
+            format: uuid
+      responses:
+        '200':
+          description: Deployment details retrieved successfully
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/Deployment'
+        '404':
+          description: Deployment not found
+
+    patch:
+      summary: Update deployment status
+      description: Update deployment status or trigger rollback
+      tags:
+        - Deployments
+      parameters:
+        - name: deploymentId
+          in: path
+          required: true
+          schema:
+            type: string
+            format: uuid
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              $ref: '#/components/schemas/DeploymentUpdate'
+      responses:
+        '200':
+          description: Deployment updated successfully
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/Deployment'
+        '404':
+          description: Deployment not found
+        '409':
+          description: Invalid state transition
+
+  /backups:
+    get:
+      summary: List backup archives
+      description: Retrieve list of available backup archives
+      tags:
+        - Backups
+      parameters:
+        - name: status
+          in: query
+          schema:
+            type: string
+            enum: [completed, in_progress, failed, validated]
+        - name: environment
+          in: query
+          schema:
+            type: string
+      responses:
+        '200':
+          description: List of backup archives retrieved successfully
+          content:
+            application/json:
+              schema:
+                type: object
+                properties:
+                  backups:
+                    type: array
+                    items:
+                      $ref: '#/components/schemas/BackupArchive'
+
+    post:
+      summary: Create backup
+      description: Initiate a new backup operation
+      tags:
+        - Backups
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              $ref: '#/components/schemas/BackupRequest'
+      responses:
+        '201':
+          description: Backup initiated successfully
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/BackupArchive'
+        '409':
+          description: Backup already in progress
+
+  /backups/{backupId}/restore:
+    post:
+      summary: Restore from backup
+      description: Initiate restore operation from specified backup
+      tags:
+        - Backups
+      parameters:
+        - name: backupId
+          in: path
+          required: true
+          schema:
+            type: string
+            format: uuid
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              $ref: '#/components/schemas/RestoreRequest'
+      responses:
+        '202':
+          description: Restore operation initiated
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/RestoreOperation'
+        '404':
+          description: Backup not found
+        '409':
+          description: Restore operation already in progress
+
+  /monitoring/metrics:
+    get:
+      summary: Get monitoring metrics
+      description: Retrieve current monitoring metrics for all services
+      tags:
+        - Monitoring
+      parameters:
+        - name: service
+          in: query
+          schema:
+            type: string
+        - name: metric
+          in: query
+          schema:
+            type: string
+        - name: timeRange
+          in: query
+          schema:
+            type: string
+            enum: [1h, 6h, 24h, 7d, 30d]
+      responses:
+        '200':
+          description: Metrics retrieved successfully
+          content:
+            application/json:
+              schema:
+                type: object
+                properties:
+                  metrics:
+                    type: array
+                    items:
+                      $ref: '#/components/schemas/MonitoringMetric'
+
+  /monitoring/alerts:
+    get:
+      summary: Get active alerts
+      description: Retrieve list of active monitoring alerts
+      tags:
+        - Monitoring
+      parameters:
+        - name: severity
+          in: query
+          schema:
+            type: string
+            enum: [critical, warning, info]
+        - name: status
+          in: query
+          schema:
+            type: string
+            enum: [active, acknowledged, resolved]
+      responses:
+        '200':
+          description: Alerts retrieved successfully
+          content:
+            application/json:
+              schema:
+                type: object
+                properties:
+                  alerts:
+                    type: array
+                    items:
+                      $ref: '#/components/schemas/Alert'
+
+    post:
+      summary: Acknowledge alert
+      description: Acknowledge an active alert
+      tags:
+        - Monitoring
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              $ref: '#/components/schemas/AlertAcknowledgment'
+      responses:
+        '200':
+          description: Alert acknowledged successfully
+        '404':
+          description: Alert not found
+
+components:
+  schemas:
+    Deployment:
+      type: object
+      properties:
+        id:
+          type: string
+          format: uuid
+        environment:
+          type: string
+          enum: [blue, green, staging, production]
+        status:
+          type: string
+          enum: [planned, in_progress, testing, live, failed, decommissioned]
+        version:
+          type: string
+        services:
+          type: array
+          items:
+            type: string
+        createdAt:
+          type: string
+          format: date-time
+        updatedAt:
+          type: string
+          format: date-time
+        healthStatus:
+          type: string
+          enum: [healthy, unhealthy, unknown]
+
+    DeploymentRequest:
+      type: object
+      required:
+        - environment
+        - version
+      properties:
+        environment:
+          type: string
+          enum: [blue, green, staging, production]
+        version:
+          type: string
+        services:
+          type: array
+          items:
+            type: string
+        rollbackPlan:
+          type: boolean
+        healthCheckTimeout:
+          type: integer
+          format: int32
+
+    DeploymentUpdate:
+      type: object
+      properties:
+        status:
+          type: string
+          enum: [testing, live, failed, decommissioned]
+        rollback:
+          type: boolean
+        reason:
+          type: string
+
+    BackupArchive:
+      type: object
+      properties:
+        id:
+          type: string
+          format: uuid
+        type:
+          type: string
+          enum: [full, incremental, differential]
+        status:
+          type: string
+          enum: [scheduled, in_progress, completed, failed, validated, expired]
+        environment:
+          type: string
+        size:
+          type: integer
+          format: int64
+        compressionRatio:
+          type: number
+          format: float
+        encrypted:
+          type: boolean
+        validated:
+          type: boolean
+        createdAt:
+          type: string
+          format: date-time
+        expiresAt:
+          type: string
+          format: date-time
+        retentionDays:
+          type: integer
+          format: int32
+
+    BackupRequest:
+      type: object
+      required:
+        - type
+        - environment
+      properties:
+        type:
+          type: string
+          enum: [full, incremental, differential]
+        environment:
+          type: string
+        include:
+          type: array
+          items:
+            type: string
+            enum: [databases, files, configurations, logs]
+        compression:
+          type: boolean
+        encryption:
+          type: boolean
+        validation:
+          type: boolean
+
+    RestoreRequest:
+      type: object
+      required:
+        - targetEnvironment
+      properties:
+        targetEnvironment:
+          type: string
+        include:
+          type: array
+          items:
+            type: string
+            enum: [databases, files, configurations, logs]
+        confirm:
+          type: boolean
+        reason:
+          type: string
+
+    RestoreOperation:
+      type: object
+      properties:
+        id:
+          type: string
+          format: uuid
+        backupId:
+          type: string
+          format: uuid
+        targetEnvironment:
+          type: string
+        status:
+          type: string
+          enum: [pending, in_progress, completed, failed]
+        progress:
+          type: integer
+          format: int32
+        estimatedCompletion:
+          type: string
+          format: date-time
+        startedAt:
+          type: string
+          format: date-time
+
+    MonitoringMetric:
+      type: object
+      properties:
+        id:
+          type: string
+          format: uuid
+        service:
+          type: string
+        metric:
+          type: string
+        value:
+          type: number
+          format: float
+        unit:
+          type: string
+        timestamp:
+          type: string
+          format: date-time
+        labels:
+          type: object
+          additionalProperties:
+            type: string
+
+    Alert:
+      type: object
+      properties:
+        id:
+          type: string
+          format: uuid
+        rule:
+          type: string
+        severity:
+          type: string
+          enum: [critical, warning, info]
+        status:
+          type: string
+          enum: [active, acknowledged, resolved]
+        service:
+          type: string
+        message:
+          type: string
+        triggeredAt:
+          type: string
+          format: date-time
+        acknowledgedAt:
+          type: string
+          format: date-time
+        acknowledgedBy:
+          type: string
+        resolvedAt:
+          type: string
+          format: date-time
+
+    AlertAcknowledgment:
+      type: object
+      required:
+        - alertId
+      properties:
+        alertId:
+          type: string
+          format: uuid
+        acknowledgedBy:
+          type: string
+        note:
+          type: string
+
+  securitySchemes:
+    BearerAuth:
+      type: http
+      scheme: bearer
+      bearerFormat: JWT
+
+security:
+  - BearerAuth: []
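Reviewer sketch for the contract above: `DeploymentRequest` marks `environment` and `version` as required and limits `environment` to four values. The hypothetical helper below (`build_deployment_request` is not part of this commit, and `1.8.0` is an illustrative version) builds a minimal request body and rejects an environment outside that enum. It assumes both values contain no characters that need JSON escaping.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the commit): builds the JSON body for
# POST /deployments from the two required DeploymentRequest fields.
build_deployment_request() {
  local environment="$1" version="$2"
  # Reject values outside the environment enum declared in the contract.
  case "$environment" in
    blue | green | staging | production) ;;
    *)
      echo "invalid environment: $environment" >&2
      return 1
      ;;
  esac
  # Assumes neither value contains quotes or backslashes (no JSON escaping).
  printf '{"environment":"%s","version":"%s"}\n' "$environment" "$version"
}

build_deployment_request green 1.8.0
```

The contract declares no `servers` entry, so the base URL is left to the caller. One possible use is piping the output into `curl --data @-` with a `Content-Type: application/json` header and a bearer token, both supplied by the operator.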
@@ -0,0 +1,249 @@
+# Data Model: Infrastructure Operations & Deployment Automation
+
+**Date**: 2026-04-20  
+**Feature**: Infrastructure Operations & Deployment Automation  
+**Status**: Complete
+
+## Infrastructure Entities
+
+### Docker Compose Configuration
+
+**Description**: Infrastructure as code definitions for all services, environments, and deployments  
+**Key Attributes**:
+- Configuration ID (unique identifier)
+- Environment (development/staging/production)
+- Service definitions and dependencies
+- Network configurations
+- Volume mappings
+- Environment variables (secrets excluded)
+- Health check definitions
+- Resource limits
+- Security policies (user, capabilities, read-only)
+
+**Validation Rules**:
+- All services must have health checks
+- All containers must specify non-root user where possible
+- All secrets must use external env files
+- All images must use specific tags (no :latest)
+- Resource limits must be defined for CPU and memory
+
+### Backup Archive
+
+**Description**: Complete system snapshots including databases, files, and configurations with metadata  
+**Key Attributes**:
+- Archive ID (unique identifier)
+- Timestamp (creation time)
+- Backup type (full/incremental)
+- Source environment
+- Data sources (databases, files, configs)
+- Compression status
+- Encryption status
+- Validation status
+- Retention period
+- Storage location
+
+**Validation Rules**:
+- All archives must be encrypted
+- All archives must have integrity validation
+- Backup frequency: daily for critical data
+- Retention: 30 days daily, 90 days weekly, 1 year monthly
+- Must include database consistency checks
+
+### Monitoring Metric
+
+**Description**: Performance and health data points collected from all infrastructure components  
+**Key Attributes**:
+- Metric ID (unique identifier)
+- Source service/container
+- Metric name and type
+- Value and timestamp
+- Labels and dimensions
+- Threshold definitions
+- Alert status
+- Aggregation rules
+
+**Validation Rules**:
+- All services must expose health metrics
+- Critical metrics must have alert thresholds
+- Data retention: 90 days detailed, 1 year aggregated
+- Metrics must include CPU, memory, disk, network
+- Application-specific metrics for business logic
+
+### Security Policy
+
+**Description**: Container hardening rules and compliance requirements for all deployments  
+**Key Attributes**:
+- Policy ID (unique identifier)
+- Policy type (user, capabilities, filesystem)
+- Rule definitions
+- Applicable services
+- Compliance status
+- Violation tracking
+- Remediation procedures
+
+**Validation Rules**:
+- All containers must run with non-root users
+- All containers must drop unnecessary capabilities
+- All containers must use read-only filesystems where possible
+- All containers must have security options defined
+- Regular vulnerability scanning required
+
+### Deployment Environment
+
+**Description**: Isolated runtime spaces with consistent configurations  
+**Key Attributes**:
+- Environment ID (unique identifier)
+- Environment type (blue/green)
+- Service instances
+- Network configuration
+- Storage configuration
+- Access controls
+- Deployment status
+- Health status
+
+**Validation Rules**:
+- Blue and green environments must be identical
+- Network isolation between environments
+- Consistent configuration across environments
+- Automated health checks required
+- Traffic switching must be atomic
+
+### Alert Rule
+
+**Description**: Threshold-based conditions that trigger notifications when system metrics exceed limits  
+**Key Attributes**:
+- Rule ID (unique identifier)
+- Metric source
+- Threshold conditions
+- Severity levels
+- Notification channels
+- Escalation rules
+- Suppression rules
+- Acknowledgment status
+
+**Validation Rules**:
+- All critical services must have alert rules
+- Alert response time must be < 30 seconds
+- Must include escalation paths
+- Must define recovery procedures
+- Regular alert testing required
+
+### Secret Configuration
+
+**Description**: Sensitive information managed outside version control  
+**Key Attributes**:
+- Secret ID (unique identifier)
+- Secret type (password, key, certificate)
+- Usage context
+- Access controls
+- Rotation schedule
+- Expiration date
+- Compliance requirements
+
+**Validation Rules**:
+- No secrets in version control
+- All secrets must be encrypted at rest
+- Access must be role-based
+- Regular rotation required
+- Audit trail for all access
+
+### Service Instance
+
+**Description**: Running container with specific configuration and health status  
+**Key Attributes**:
+- Instance ID (unique identifier)
+- Service name and version
+- Container configuration
+- Resource allocation
+- Health status
+- Start time
+- Network endpoints
+- Log configuration
+
+**Validation Rules**:
+- All instances must have health checks
+- Resource limits must be enforced
+- Restart policies must be defined
+- Log aggregation must be configured
+- Performance monitoring required
+
+### Infrastructure Change
+
+**Description**: Version-controlled modification to system configuration or deployment  
+**Key Attributes**:
+- Change ID (unique identifier)
+- Change type (configuration, deployment, security)
+- Description and rationale
+- Approval status
+- Implementation status
+- Rollback plan
+- Impact assessment
+- Compliance validation
+
+**Validation Rules**:
+- All changes must be version-controlled
+- Changes require approval before production
+- Rollback plans must be tested
+- Impact assessment required
+- Compliance validation mandatory
+
+### Recovery Point
+
+**Description**: Validated backup state that can be restored for disaster recovery  
+**Key Attributes**:
+- Recovery point ID (unique identifier)
+- Archive reference
+- Validation status
+- Recovery time objective
+- Recovery procedures
+- Test results
+- Dependencies
+
+**Validation Rules**:
+- All recovery points must be tested
+- RTO must be < 4 hours
+- Recovery procedures must be documented
+- Regular testing required
+- Success rate must be > 95%
+
+## State Transitions
+
+### Deployment Lifecycle
+```
+Planned -> In Progress -> Testing -> Live -> Decommissioned
+```
+
+### Backup Lifecycle
+```
+Scheduled -> In Progress -> Completed -> Validated -> Expired
+```
+
+### Alert Lifecycle
+```
+Triggered -> Acknowledged -> Resolved -> Closed
+```
+
+### Change Management
+```
+Requested -> Approved -> Implemented -> Validated -> Closed
+```
+
+## Relationships
+
+- **Environment** contains many **Service Instances**
+- **Service Instance** generates **Monitoring Metrics**
+- **Backup Archive** contains data from **Service Instances**
+- **Alert Rule** monitors **Monitoring Metrics**
+- **Security Policy** applies to **Service Instances**
+- **Infrastructure Change** modifies **Deployment Environments**
+- **Recovery Point** references **Backup Archive**
+- **Secret Configuration** used by **Service Instances**
+
+## Data Integrity Constraints
+
+- All entities must have unique identifiers
+- All timestamps must be UTC
+- All audit fields must be immutable
+- Foreign key relationships must be validated
+- All sensitive data must be encrypted
+- All changes must be auditable
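Reviewer sketch for the Deployment Lifecycle above: the diagram describes a strictly linear path, which a small guard can enforce before a status update is accepted. The function name `is_valid_deployment_transition` and the lowercase state spellings (borrowed from the API contract's `status` enum) are assumptions for illustration. The `failed` state appears in the contract but not in the lifecycle diagram, so this sketch does not model it.

```shell
#!/usr/bin/env bash
# Hypothetical guard (not part of the commit): returns 0 only when moving
# from state $1 to state $2 follows the linear Deployment Lifecycle
# Planned -> In Progress -> Testing -> Live -> Decommissioned.
is_valid_deployment_transition() {
  case "$1->$2" in
    "planned->in_progress") return 0 ;;
    "in_progress->testing") return 0 ;;
    "testing->live") return 0 ;;
    "live->decommissioned") return 0 ;;
    *) return 1 ;;
  esac
}

if is_valid_deployment_transition testing live; then
  echo "allowed"
fi
```

A table of explicit pairs keeps the rule easy to audit. Adding a `failed` branch later would mean adding lines, with no need to restructure the function.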
@@ -0,0 +1,105 @@
+# Implementation Plan: [FEATURE]
+
+**Branch**: `[###-feature-name]` | **Date**: [DATE] | **Spec**: [link]
+**Input**: Feature specification from `/specs/[###-feature-name]/spec.md`
+
+**Note**: This template is filled in by the `/speckit.plan` command. See `.specify/templates/commands/plan.md` for the execution workflow.
+
+## Summary
+
+[Extract from feature spec: primary requirement + technical approach from research]
+
+## Technical Context
+
+<!--
+  ACTION REQUIRED: Replace the content in this section with the technical details
+  for the project. The structure here is presented in advisory capacity to guide
+  the iteration process.
+-->
+
+**Language/Version**: [e.g., Python 3.11, Swift 5.9, Rust 1.75 or NEEDS CLARIFICATION]  
+**Primary Dependencies**: [e.g., FastAPI, UIKit, LLVM or NEEDS CLARIFICATION]  
+**Storage**: [if applicable, e.g., PostgreSQL, CoreData, files or N/A]  
+**Testing**: [e.g., pytest, XCTest, cargo test or NEEDS CLARIFICATION]  
+**Target Platform**: [e.g., Linux server, iOS 15+, WASM or NEEDS CLARIFICATION]
+**Project Type**: [single/web/mobile - determines source structure]  
+**Performance Goals**: [domain-specific, e.g., 1000 req/s, 10k lines/sec, 60 fps or NEEDS CLARIFICATION]  
+**Constraints**: [domain-specific, e.g., <200ms p95, <100MB memory, offline-capable or NEEDS CLARIFICATION]  
+**Scale/Scope**: [domain-specific, e.g., 10k users, 1M LOC, 50 screens or NEEDS CLARIFICATION]
+
+## Constitution Check
+
+_GATE: Must pass before Phase 0 research. Re-check after Phase 1 design._
+
+[Gates determined based on constitution file]
+
+## Project Structure
+
+### Documentation (this feature)
+
+```text
+specs/[###-feature]/
+├── plan.md              # This file (/speckit.plan command output)
+├── research.md          # Phase 0 output (/speckit.plan command)
+├── data-model.md        # Phase 1 output (/speckit.plan command)
+├── quickstart.md        # Phase 1 output (/speckit.plan command)
+├── contracts/           # Phase 1 output (/speckit.plan command)
+└── tasks.md             # Phase 2 output (/speckit.tasks command - NOT created by /speckit.plan)
+```
+
+### Source Code (repository root)
+
+<!--
+  ACTION REQUIRED: Replace the placeholder tree below with the concrete layout
+  for this feature. Delete unused options and expand the chosen structure with
+  real paths (e.g., apps/admin, packages/something). The delivered plan must
+  not include Option labels.
+-->
+
+```text
+# [REMOVE IF UNUSED] Option 1: Single project (DEFAULT)
+src/
+├── models/
+├── services/
+├── cli/
+└── lib/
+
+tests/
+├── contract/
+├── integration/
+└── unit/
+
+# [REMOVE IF UNUSED] Option 2: Web application (when "frontend" + "backend" detected)
+backend/
+├── src/
+│   ├── models/
+│   ├── services/
+│   └── api/
+└── tests/
+
+frontend/
+├── src/
+│   ├── components/
+│   ├── pages/
+│   └── services/
+└── tests/
+
+# [REMOVE IF UNUSED] Option 3: Mobile + API (when "iOS/Android" detected)
+api/
+└── [same as backend above]
+
+ios/ or android/
+└── [platform-specific structure: feature modules, UI flows, platform tests]
+```
+
+**Structure Decision**: [Document the selected structure and reference the real
+directories captured above]
+
+## Complexity Tracking
+
+> **Fill ONLY if Constitution Check has violations that must be justified**
+
+| Violation                  | Why Needed         | Simpler Alternative Rejected Because |
+| -------------------------- | ------------------ | ------------------------------------ |
+| [e.g., 4th project]        | [current need]     | [why 3 projects insufficient]        |
+| [e.g., Repository pattern] | [specific problem] | [why direct DB access insufficient]  |
@@ -0,0 +1,293 @@
+# Quick Start Guide: Infrastructure Operations & Deployment Automation
+
+**Purpose**: Get started with the Infrastructure Operations & Deployment Automation feature  
+**Date**: 2026-04-20  
+**Target Audience**: DevOps Engineers, System Administrators
+
+## Prerequisites
+
+### Hardware Requirements
+- QNAP NAS (192.168.10.8) with Docker support
+- ASUSTOR NAS (192.168.10.9) with Docker support
+- SSH access between NAS devices configured
+- Minimum 100GB storage for backups
+
+### Software Requirements
+- Docker 20.10+
+- Docker Compose 2.0+
+- Bash 5.0+ or PowerShell 7.2+
+- Git client
+- SSH key authentication
+
+### Network Requirements
+- Static IP addresses for both NAS devices
+- Open ports: 22 (SSH), 80/443 (HTTP/HTTPS), 8080 (applications)
+- VPN or secure network connection for remote access
+
+## Initial Setup
+
+### 1. Repository Configuration
+
+```bash
+# Clone the repository
+git clone https://git.np-dms.work/np-dms/lcbp3.git
+cd lcbp3
+
+# Switch to the infrastructure branch
+git checkout 002-infra-ops
+```
+
+### 2. SSH Key Authentication
+
+Ensure SSH keys are configured between QNAP and ASUSTOR:
+
+```bash
+# Test SSH connectivity
+ssh admin@192.168.10.8 "docker --version"
+ssh admin@192.168.10.9 "docker --version"
+```
+
+### 3. Environment Configuration
+
+Copy and configure environment files:
+
+```bash
+# QNAP environments
+cp specs/04-Infrastructure-OPS/04-00-docker-compose/QNAP/app/.env.example \
+   specs/04-Infrastructure-OPS/04-00-docker-compose/QNAP/app/.env
+
+# ASUSTOR environments
+cp specs/04-Infrastructure-OPS/04-00-docker-compose/ASUSTOR/registry/.env.example \
+   specs/04-Infrastructure-OPS/04-00-docker-compose/ASUSTOR/registry/.env
+```
+
+Edit the `.env` files with your specific configurations:
+- Database passwords
+- SSL certificate paths
+- Backup storage locations
+- Monitoring endpoints
+
+## Core Services Deployment
+
+### 1. Database Services (QNAP)
+
+```bash
+# Navigate to QNAP database directory
+cd specs/04-Infrastructure-OPS/04-00-docker-compose/QNAP/mariadb
+
+# Deploy MariaDB with phpMyAdmin
+docker-compose -f docker-compose-lcbp3-db.yml up -d
+
+# Verify deployment
+docker-compose -f docker-compose-lcbp3-db.yml ps
+```
+
+### 2. Application Services (QNAP)
+
+```bash
+# Navigate to QNAP app directory
+cd specs/04-Infrastructure-OPS/04-00-docker-compose/QNAP/app
+
+# Deploy backend, frontend, and ClamAV
+docker-compose -f docker-compose-app.yml up -d
+
+# Verify deployment
+docker-compose -f docker-compose-app.yml ps
+```
+
+### 3. Reverse Proxy (QNAP)
+
+```bash
+# Navigate to Nginx Proxy Manager directory
+cd specs/04-Infrastructure-OPS/04-00-docker-compose/QNAP/npm
+
+# Deploy reverse proxy
+docker-compose -f docker-compose.yml up -d
+
+# Access Nginx Proxy Manager
+# URL: http://192.168.10.8:81
+# Default: admin@example.com / changeme (change on first login)
+```
+
+### 4. Monitoring Stack (ASUSTOR)
+
+```bash
+# Navigate to ASUSTOR monitoring directory
+cd specs/04-Infrastructure-OPS/04-00-docker-compose/ASUSTOR/monitoring
+
+# Deploy Prometheus, Grafana, and supporting services
+docker-compose -f docker-compose.yml up -d
+
+# Verify deployment
+docker-compose -f docker-compose.yml ps
+```
+
+## SSL Certificate Setup
+
+### 1. Initial Certificate Generation
+
+```bash
+# On QNAP, generate Let's Encrypt certificates
+cd specs/04-Infrastructure-OPS/04-00-docker-compose/QNAP/npm
+
+# Run certbot for initial certificate
+docker-compose exec npm certbot --nginx -d your-domain.com
+```
+
+### 2. Automated Renewal
+
+Add to crontab for automatic renewal:
+
+```bash
+# Edit crontab
+crontab -e
+
+# Add renewal task (runs daily at 2 AM)
+0 2 * * * cd /path/to/npm && docker-compose exec -T npm certbot renew
+```
+
+## Backup Configuration
+
+### 1. Initial Backup Setup
+
+```bash
+# Navigate to backup scripts directory
+cd specs/04-Infrastructure-OPS/04-02-backup-recovery
+
+# Configure backup destinations
+cp backup-config.example.yml backup-config.yml
+
+# Edit backup-config.yml with your storage locations
+nano backup-config.yml
+```
+
+### 2. Automated Backup Schedule
+
+```bash
+# Add backup cron job (runs daily at 1 AM)
+0 1 * * * /path/to/backup-scripts/daily-backup.sh
+
+# Add backup validation (runs weekly on Sunday at 3 AM)
+0 3 * * 0 /path/to/backup-scripts/validate-backups.sh
+```
+
+## Monitoring Configuration
+
+### 1. Grafana Dashboard Access
+
+1. Access Grafana: `http://192.168.10.9:3000`
+2. Default credentials: `admin / admin` (change on first login)
+3. Import dashboards from `specs/04-Infrastructure-OPS/04-03-monitoring/dashboards/`
+
+### 2. Alert Configuration
+
+1. Access AlertManager: `http://192.168.10.9:9093`
+2. Configure notification channels (email, Slack, etc.)
+3. Test alert rules to ensure notifications work
+
+## Blue-Green Deployment
+
+### 1. Environment Setup
+
+```bash
+# Create blue environment (current production)
+cd specs/04-Infrastructure-OPS/04-00-docker-compose/QNAP/app
+docker-compose -f docker-compose-app.yml -p app-blue up -d
+
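+# Create green environment (new version)
+# Note: both projects use the same compose file, so it must not pin
+# container_name values or fixed host ports, or this second "up" will conflict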
+docker-compose -f docker-compose-app.yml -p app-green up -d
+```
+
+### 2. Traffic Switching
+
+```bash
+# Switch traffic to the green environment:
+# 1. Test green environment functionality before switching
+# 2. Update the Nginx Proxy Manager upstream configuration
+#    to point to the green environment containers
+# 3. Re-test through the proxy once traffic is on green
+```
+
+### 3. Rollback Procedure
+
+```bash
+# If issues are detected, roll back to blue:
+# 1. Update the Nginx Proxy Manager upstream configuration
+#    to point back to the blue environment containers
+# 2. Stop the green environment containers
+```
+
+## Security Hardening
+
+### 1. Container Security Scan
+
+```bash
+# Install Trivy
+curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh | sh -s -- -b /usr/local/bin
+
+# Scan the image of every running container (trivy image accepts one image per run)
+docker ps --format '{{.Image}}' | sort -u | xargs -n 1 trivy image --severity HIGH,CRITICAL
+```
+
+### 2. Security Policy Validation
+
+```bash
+# Run security validation script
+cd specs/04-Infrastructure-OPS/04-06-security-operations
+./validate-security-policies.sh
+```
+
+## Troubleshooting
+
+### Common Issues
+
+1. **Container won't start**
+   ```bash
+   # Check logs
+   docker-compose logs [service-name]
+   
+   # Check resource usage
+   docker stats
+   ```
+
+2. **Backup failures**
+   ```bash
+   # Check backup logs
+   tail -f /var/log/backup.log
+   
+   # Test connectivity to backup storage
+   ping backup-storage-host
+   ```
+
+3. **Monitoring alerts not working**
+   ```bash
+   # Check Prometheus targets
+   curl http://192.168.10.9:9090/api/v1/targets
+   
+   # Test AlertManager (the v1 API was removed in recent releases; use v2)
+   curl http://192.168.10.9:9093/api/v2/alerts
+   ```
+
+### Health Checks
+
+```bash
+# Check all services health
+curl -f http://192.168.10.8:3000/health || echo "Backend unhealthy"
+curl -f http://192.168.10.8/health || echo "Frontend unhealthy"
+curl -f http://192.168.10.9:9090/-/healthy || echo "Prometheus unhealthy"
+```
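The three probes above can be folded into one loop that reports every failure and returns a usable exit status, which is handy for cron or CI. This is a sketch under assumptions: `check_all` and the `PROBE` override are hypothetical names, and the endpoints are the ones listed above.

```bash
#!/usr/bin/env bash
# Sketch only: check_all and PROBE are hypothetical names.

# Probe each "name=url" pair, print the unhealthy ones, and
# return the number of failures as the exit status.
check_all() {
  local failures=0 pair name url
  for pair in "$@"; do
    name="${pair%%=*}"   # text before the first '='
    url="${pair#*=}"     # text after the first '='
    if ! ${PROBE:-curl -fsS --max-time 5 -o /dev/null} "$url"; then
      echo "$name unhealthy"
      failures=$((failures + 1))
    fi
  done
  return "$failures"
}

# Example usage:
#   check_all \
#     "Backend=http://192.168.10.8:3000/health" \
#     "Frontend=http://192.168.10.8/health" \
#     "Prometheus=http://192.168.10.9:9090/-/healthy"
```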
+
+## Next Steps
+
+1. **Configure automated monitoring alerts** for your specific thresholds
+2. **Set up backup retention policies** based on your compliance requirements
+3. **Implement disaster recovery testing** on a regular schedule
+4. **Configure log aggregation** for centralized monitoring
+5. **Set up automated security scanning** in your CI/CD pipeline
+
+## Support
+
+For issues and questions:
+- Check the troubleshooting section above
+- Review logs in `/var/log/` directories
+- Consult the full documentation in `specs/04-Infrastructure-OPS/`
+- Contact the infrastructure team for escalated issues
@@ -0,0 +1,82 @@
+# Phase 0 Research: Infrastructure Operations & Deployment Automation
+
+**Date**: 2026-04-20  
+**Feature**: Infrastructure Operations & Deployment Automation  
+**Status**: Complete
+
+## Research Findings
+
+### Blue-Green Deployment Strategy
+
+**Decision**: Docker Compose with Nginx Proxy Manager for traffic switching  
+**Rationale**: Provides zero-downtime deployments by maintaining two identical production environments (blue/green) and switching traffic via reverse proxy configuration updates  
+**Alternatives Considered**: Kubernetes (too complex for current scale), Docker Swarm (limited networking features), Manual deployment scripts (prone to human error)
+
+### Backup & Recovery Solution
+
+**Decision**: Restic for encrypted backups + MariaDB dump scripts + automated validation  
+**Rationale**: Restic provides deduplication, encryption, and cloud storage support. Combining it with native database dumps ensures complete system state capture  
+**Alternatives Considered**: Borg Backup (steeper learning curve), rsync only (no encryption/deduplication), commercial solutions (cost constraints)
+
+### Monitoring Stack
+
+**Decision**: Prometheus + Grafana + AlertManager + Node Exporter + cAdvisor  
+**Rationale**: Industry-standard monitoring stack with extensive community support, flexible alerting rules, and container-native metrics collection  
+**Alternatives Considered**: Zabbix (more complex setup), Nagios (older architecture), Datadog (commercial cost)
+
+### Container Security Hardening
+
+**Decision**: Docker security hardening with non-root users, read-only filesystems, capability dropping, and Trivy scanning  
+**Rationale**: Provides defense-in-depth security while maintaining functionality. Trivy offers comprehensive vulnerability scanning  
+**Alternatives Considered**: Podman (better security but ecosystem compatibility issues), Kubernetes security policies (overkill for current scale)
+
+### Multi-NAS Architecture
+
+**Decision**: QNAP for primary services, ASUSTOR for backup, monitoring, and the container registry  
+**Rationale**: Leverages existing hardware investment, provides physical separation for critical services, and maintains established SSH key authentication  
+**Alternatives Considered**: Cloud hosting (recurring costs, data sovereignty concerns), Single NAS (single point of failure)
+
+### SSL Certificate Management
+
+**Decision**: Certbot with Let's Encrypt + automated renewal via cron jobs  
+**Rationale**: Free, automated certificate management with established reliability. Integration with Nginx Proxy Manager simplifies deployment  
+**Alternatives Considered**: Commercial CAs (cost), Self-signed certificates (browser warnings), Cloudflare certificates (dependency on external service)
+
+### Secrets Management
+
+**Decision**: Environment files with .gitignore + SSH key authentication  
+**Rationale**: Simple, secure approach that works across both NAS environments. No additional infrastructure required  
+**Alternatives Considered**: HashiCorp Vault (complex setup), Docker Swarm secrets (require Swarm mode, which this stack does not use), Infisical/SOPS (additional learning curve)
+
+## Technical Decisions Summary
+
+1. **Docker Compose** as primary orchestration tool
+2. **Blue-Green deployment** pattern for zero downtime
+3. **Restic** for backup encryption and deduplication
+4. **Prometheus/Grafana** stack for monitoring
+5. **Nginx Proxy Manager** for reverse proxy and SSL termination
+6. **Trivy** for container vulnerability scanning
+7. **Environment files** for secrets management
+8. **SSH key authentication** for cross-NAS communication
+
+## Implementation Constraints
+
+- Must maintain existing QNAP/ASUSTOR IP addresses (192.168.10.8 and 192.168.10.9)
+- Must preserve current data storage locations
+- Must integrate with existing Gitea Actions CI/CD pipeline
+- Must comply with ADR-016 security requirements
+- Must support Thai language documentation per project standards
+
+## Success Metrics Alignment
+
+All technical decisions support the success criteria defined in the specification:
+
+- 99.9% uptime through redundant infrastructure
+- 30-second alert generation via Prometheus monitoring
+- 4-hour RTO through automated backup validation
+- Zero-downtime deployments via blue-green strategy
+- 100% security compliance via container hardening
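To make the 99.9% uptime figure concrete, the allowed downtime can be worked out directly. A small sketch, assuming a 30-day month and that every minute of unavailability counts against the budget; `downtime_budget_minutes` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch only: downtime_budget_minutes is a hypothetical helper.

# downtime_budget_minutes AVAILABILITY_PERCENT DAYS
# Prints the minutes of downtime permitted over DAYS at the given availability.
downtime_budget_minutes() {
  awk -v a="$1" -v d="$2" 'BEGIN { printf "%.1f\n", d * 24 * 60 * (100 - a) / 100 }'
}

# Example usage:
#   downtime_budget_minutes 99.9 30   # prints 43.2
```

At 99.9% the monthly budget is about 43 minutes, so a single 4-hour recovery would consume several months of budget on its own.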
+
+## Next Steps
+
+Proceed to Phase 1: Design & Contracts with these technical foundations established.
@@ -0,0 +1,187 @@
+# Feature Specification: Infrastructure Operations & Deployment Automation
+
+**Feature Branch**: `002-infra-ops`
+**Created**: 2026-04-20
+**Status**: Draft
+**Input**: User description: "Infrastructure operations and deployment automation including Docker Compose configurations, container orchestration, monitoring, backup/recovery, and maintenance procedures for the NAP-DMS system"
+
+## Clarifications
+
+### Session 2026-04-20
+
+- Q: Which services are included in Infrastructure Operations scope beyond NAP-DMS applications?
+- A: All services in Docker Compose stacks including Gitea, n8n, RocketChat, and supporting services
+
+- Q: What is the expected data volume and annual growth rate for all services?
+- A: 500GB current data with 20% annual growth
+
+- Q: What external services or third-party integrations are required beyond internal services?
+- A: Email SMTP for notifications and Let's Encrypt for SSL certificates
+
+- Q: What are the concurrent user count and performance targets for response time?
+- A: 100 concurrent users with 2-second average response time
+
+- Q: What technical constraints exist (budget, hardware, compliance requirements)?
+- A: Must work with existing QNAP/ASUSTOR hardware infrastructure
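The 500GB and 20% figures above translate into a simple capacity projection. A sketch, assuming growth compounds once per year; `projected_gb` is a hypothetical helper, and backup retention would add to these numbers.

```bash
#!/usr/bin/env bash
# Sketch only: projected_gb is a hypothetical helper.

# projected_gb START_GB GROWTH_PERCENT YEARS
# Prints the projected data volume in GB, rounded to the nearest GB.
projected_gb() {
  awk -v s="$1" -v g="$2" -v y="$3" 'BEGIN { printf "%.0f\n", s * (1 + g / 100) ^ y }'
}

# Example usage:
#   projected_gb 500 20 3   # prints 864
```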
+
+## User Scenarios & Testing _(mandatory)_
+
+<!--
+  IMPORTANT: User stories should be PRIORITIZED as user journeys ordered by importance.
+  Each user story/journey must be INDEPENDENTLY TESTABLE - meaning if you implement just ONE of them,
+  you should still have a viable MVP (Minimum Viable Product) that delivers value.
+
+  Assign priorities (P1, P2, P3, etc.) to each story, where P1 is the most critical.
+  Think of each story as a standalone slice of functionality that can be:
+  - Developed independently
+  - Tested independently
+  - Deployed independently
+  - Demonstrated to users independently
+-->
+
+### User Story 1 - Zero-Downtime Deployment (Priority: P1)
+
+As a DevOps engineer, I need to deploy updates for all services (NAP-DMS applications, databases, monitoring, Gitea, n8n, RocketChat, and supporting services) without interrupting user access to any system components.
+
+**Why this priority**: Critical for business continuity - system cannot afford downtime during regular maintenance windows.
+
+**Independent Test**: Can be fully tested by deploying a test application version using blue-green containers and verifying traffic switches seamlessly without user session interruption.
+
+**Acceptance Scenarios**:
+
+1. **Given** a running production environment, **When** I deploy a new version, **Then** users continue accessing the system without interruption
+2. **Given** a deployment failure, **When** the rollback is triggered, **Then** the system immediately switches back to the previous stable version
+
+---
+
+### User Story 2 - Automated Backup & Recovery (Priority: P1)
+
+As a system administrator, I need automated daily backups of all service data (NAP-DMS applications, databases, monitoring, Gitea, n8n, RocketChat, configurations, and supporting services) and the ability to restore the entire system within 4 hours of a catastrophic failure.
+
+**Why this priority**: Essential for data protection and business continuity compliance with document management regulations.
+
+**Independent Test**: Can be fully tested by running backup procedures and performing a full system restore in a test environment to verify all data is recoverable.
+
+**Acceptance Scenarios**:
+
+1. **Given** the backup schedule is configured, **When** the daily backup runs, **Then** all databases, files, and configurations are successfully backed up
+2. **Given** a system failure occurs, **When** I initiate recovery, **Then** the entire system is restored to its last known good state within 4 hours
+
+---
+
+### User Story 3 - Real-time Monitoring & Alerting (Priority: P1)
+
+As an on-call engineer, I need to receive immediate alerts when any system component (NAP-DMS applications, databases, monitoring, Gitea, n8n, RocketChat, and supporting services) fails or its performance degrades below acceptable thresholds.
+
+**Why this priority**: Prevents minor issues from becoming major outages and ensures rapid response to system problems.
+
+**Independent Test**: Can be fully tested by simulating various failure scenarios and verifying appropriate alerts are generated and delivered to the correct channels.
+
+**Acceptance Scenarios**:
+
+1. **Given** monitoring is active, **When** a service becomes unresponsive, **Then** an alert is sent within 30 seconds
+2. **Given** system resources exceed 80% utilization, **When** the threshold is crossed, **Then** a performance alert is generated with actionable diagnostics
+
+---
+
+### User Story 4 - Container Security Hardening (Priority: P2)
+
+As a security administrator, I need all containers (NAP-DMS applications, databases, monitoring, Gitea, n8n, RocketChat, and supporting services) to run with minimal privileges and no exposed secrets to maintain compliance with security policies.
+
+**Why this priority**: Prevents privilege escalation attacks and protects sensitive configuration data.
+
+**Independent Test**: Can be fully tested by running security scans on all containers and verifying they meet hardening requirements.
+
+**Acceptance Scenarios**:
+
+1. **Given** containers are deployed, **When** I run a security audit, **Then** all containers pass privilege escalation and secret exposure checks
+2. **Given** new containers are added, **When** they are deployed, **Then** they automatically inherit security hardening policies
+
+---
+
+### User Story 5 - Infrastructure as Code Management (Priority: P2)
+
+As a DevOps engineer, I need to manage all infrastructure configurations (NAP-DMS applications, databases, monitoring, Gitea, n8n, RocketChat, and supporting services) through version-controlled code files rather than manual server changes.
+
+**Why this priority**: Ensures consistency across environments and enables reproducible infrastructure deployments.
+
+**Independent Test**: Can be fully tested by deploying a complete environment from code and verifying it matches the production configuration.
+
+**Acceptance Scenarios**:
+
+1. **Given** infrastructure code changes, **When** I apply the changes, **Then** the environment configuration matches exactly what's defined in the code
+2. **Given** a new environment is needed, **When** I deploy from code, **Then** the environment is created with all required services and configurations
+
+### Edge Cases
+
+- What happens when network connectivity between QNAP and ASUSTOR fails during backup operations?
+- How does the system handle container registry authentication failures during deployment?
+- What happens when Docker Compose files contain syntax errors during environment startup?
+- How does the system handle SSL certificate expiration for reverse proxy services?
+- What happens when monitoring services become unavailable while the system is running?
+- How does the system handle storage space exhaustion on production servers?
+- What happens when multiple deployment processes are initiated simultaneously?
+- How does the system handle database connection pool exhaustion during high load?
+- What happens when automated security updates conflict with custom container configurations?
+- How does the system handle partial backup failures where some services complete but others fail?
+- How does the system handle SMTP email service failures for alert notifications?
+- What happens when Let's Encrypt certificate renewal fails due to network issues?
+
+## Requirements _(mandatory)_
+
+<!--
+  ACTION REQUIRED: The content in this section represents placeholders.
+  Fill them out with the right functional requirements.
+-->
+
+### Functional Requirements
+
+- **FR-001**: System MUST support blue-green deployment strategy for zero-downtime updates of all services (NAP-DMS applications, databases, monitoring, Gitea, n8n, RocketChat, and supporting services)
+- **FR-002**: System MUST automate daily backups of all service data, including databases, application files, configurations, and supporting service data
+- **FR-003**: System MUST provide complete disaster recovery capabilities with 4-hour RTO (Recovery Time Objective)
+- **FR-004**: System MUST monitor all infrastructure components (all services) and generate alerts for failures or performance degradation
+- **FR-005**: System MUST enforce container security hardening including non-root users, privilege dropping, and read-only filesystems for all services
+- **FR-006**: System MUST manage all infrastructure configurations through version-controlled Docker Compose files for all services
+- **FR-007**: System MUST support automated SSL certificate management and renewal for all web services
+- **FR-008**: System MUST provide centralized logging aggregation for all containers and services
+- **FR-009**: System MUST implement resource limits and health checks for all containers
+- **FR-010**: System MUST support multi-environment deployments (development, staging, production) with consistent configurations
+- **FR-011**: System MUST provide automated vulnerability scanning for all container images
+- **FR-012**: System MUST support infrastructure secrets management without exposing them in version control
+- **FR-013**: System MUST implement backup validation procedures to ensure data integrity
+- **FR-014**: System MUST provide rollback capabilities for failed deployments
+- **FR-015**: System MUST generate audit trails for all infrastructure changes and deployments
+
+### Key Entities _(include if feature involves data)_
+
+- **Docker Compose Configuration**: Infrastructure as code definitions for all services, environments, and deployments
+- **Backup Archive**: Complete system snapshots including databases, files, and configurations with metadata (500GB current data, 20% annual growth)
+- **Monitoring Metric**: Performance and health data points collected from all infrastructure components
+- **Security Policy**: Container hardening rules and compliance requirements for all deployments
+- **Deployment Environment**: Isolated runtime spaces (development, staging, production) with consistent configurations (constrained by existing QNAP/ASUSTOR hardware)
+- **Alert Rule**: Threshold-based conditions that trigger notifications when system metrics exceed limits
+- **Secret Configuration**: Sensitive information (passwords, keys, certificates) managed outside version control
+- **Service Instance**: Running container with specific configuration, resource limits, and health status
+- **Infrastructure Change**: Version-controlled modification to system configuration or deployment
+- **Recovery Point**: Validated backup state that can be restored for disaster recovery
+
+## Success Criteria _(mandatory)_
+
+<!--
+  ACTION REQUIRED: Define measurable success criteria.
+  These must be technology-agnostic and measurable.
+-->
+
+### Measurable Outcomes
+
+- **SC-001**: Deployments complete with zero user-visible downtime in 99.9% of attempts
+- **SC-002**: System recovery from backup completes within 4 hours with 100% data integrity
+- **SC-003**: Critical system alerts are generated and delivered within 30 seconds of failure detection
+- **SC-004**: All containers pass security hardening compliance checks with 100% success rate
+- **SC-005**: Infrastructure changes are applied from version-controlled code with 100% consistency across environments
+- **SC-006**: SSL certificates are renewed automatically with 0 expiration incidents per year
+- **SC-007**: Backup validation procedures achieve 99.9% success rate with automated integrity verification
+- **SC-008**: Failed deployments are automatically rolled back within 60 seconds with 100% success rate
+- **SC-009**: System uptime exceeds 99.9% monthly availability target
+- **SC-010**: Infrastructure audit trail captures 100% of configuration changes and deployments
+- **SC-011**: System supports 100 concurrent users with 2-second average response time under normal load
@@ -0,0 +1,4 @@
+# Gitea
+GITEA_INSTANCE_URL=https://git.np-dms.work
+GITEA_RUNNER_REGISTRATION_TOKEN=FGaSCT79PmMg8cDy0Ltqt1yaLzs8D4MRMFAE3jCh
+GITEA_RUNNER_NAME=asustor-runner
@@ -0,0 +1,21 @@
+# File: /volume1/np-dms/gitea-runner/docker-compose.yml
+# Deploy on: ASUSTOR AS5403T
+# Connects to Gitea on QNAP via the domain URL
+
+version: "3.8"
+
+services:
+  runner:
+    image: gitea/act_runner:latest
+    container_name: gitea-runner
+    restart: always
+    environment:
+      # Use the domain URL to reach Gitea on the other host (QNAP)
+      - GITEA_INSTANCE_URL=https://git.np-dms.work
+      - GITEA_RUNNER_REGISTRATION_TOKEN=FGaSCT79PmMg8cDy0Ltqt1yaLzs8D4MRMFAE3jCh
+      - GITEA_RUNNER_NAME=asustor-runner
+      # Labels must match runs-on in deploy.yaml
+      - GITEA_RUNNER_LABELS=ubuntu-latest:docker://node:18-bullseye,self-hosted:docker://node:18-bullseye
+    volumes:
+      - /volume1/np-dms/gitea-runner/data:/data
+      - /var/run/docker.sock:/var/run/docker.sock
@@ -1,4 +1,5 @@
 # File: /volume1/np-dms/gitea-runner/docker-compose.yml
+# DMS Container v1.8.6: Application name: lcbp3-gitea-runner
 # Deploy on: ASUSTOR AS5403T
 # Connects to Gitea on QNAP via the domain URL
 #
@@ -13,11 +14,11 @@ x-logging: &default_logging
    options:
      max-size: '10m'
      max-file: '5'
-
+name: lcbp3-gitea-runner
 services:
  runner:
    <<: *default_logging
-    image: gitea/act_runner:0.2.11
+    image: gitea/act_runner:0.4.0
    container_name: gitea-runner
    restart: unless-stopped
    extra_hosts:
@@ -1,2 +1,3 @@
 REGISTRY_ADMIN_USER=admin
 REGISTRY_ADMIN_PASSWORD=
+REGISTRY_HTTP_SECRET=
@@ -0,0 +1,70 @@
+# File: /volume1/np-dms/registry/docker-compose.yml
+# DMS Container v1.8.0: Application name: lcbp3-registry
+# Deploy on: ASUSTOR AS5403T
+# Services: registry, portainer
+# ============================================================
+# ⚠️  Requirements:
+#     - Create the Docker network first: docker network create lcbp3
+#     - Registry uses port 5000 (domain: registry.np-dms.work)
+#     - Portainer uses port 9443 (domain: portainer.np-dms.work)
+# ============================================================
+x-restart: &restart_policy
+  restart: unless-stopped
+
+x-logging: &default_logging
+  logging:
+    driver: 'json-file'
+    options:
+      max-size: '10m'
+      max-file: '5'
+
+networks:
+  lcbp3:
+    external: true
+
+services:
+  # 1. Docker Registry Engine
+  registry:
+    <<: [*restart_policy, *default_logging]
+    image: registry:2
+    container_name: registry
+    deploy:
+      resources:
+        limits:
+          cpus: '0.5'
+          memory: 256M
+    environment:
+      TZ: 'Asia/Bangkok'
+      REGISTRY_STORAGE_DELETE_ENABLED: 'true'
+      # Optional: basic security hardening, or CORS handling
+      # REGISTRY_HTTP_HEADERS_Access-Control-Allow-Origin: '[https://registry-ui.np-dms.work]'
+      # REGISTRY_HTTP_HEADERS_Access-Control-Allow-Methods: '[HEAD,GET,OPTIONS,DELETE]'
+      # REGISTRY_HTTP_HEADERS_Access-Control-Allow-Headers: '[Authorization,Accept,Cache-Control]'
+    ports:
+      - "5000:5000"
+    volumes:
+      - '/volume1/np-dms/registry/data:/var/lib/registry'
+    healthcheck:
+      test: ["CMD", "bin/registry", "garbage-collect", "--dry-run", "/etc/docker/registry/config.yml"] # Check config/binary readiness
+      interval: 1m
+      timeout: 10s
+      retries: 3
+    networks:
+      - lcbp3
+
+  # 2. Registry Browser UI
+  registry-ui:
+    <<: [*restart_policy, *default_logging]
+    image: joxit/docker-registry-ui:latest
+    container_name: registry-ui
+    ports:
+      - "8880:80"
+    environment:
+      - REGISTRY_TITLE=LCBP3-DMS Local Registry
+      - REGISTRY_URL=http://registry:5000
+      - SINGLE_REGISTRY=true
+      - DELETE_IMAGES=true # allow deleting images from the UI
+    depends_on:
+      - registry
+    networks:
+      - lcbp3
@@ -26,7 +26,7 @@ x-logging: &default_logging
    options:
      max-size: '10m'
      max-file: '5'
-
+name: lcbp3-registry
 networks:
  lcbp3:
    external: true
@@ -45,9 +45,8 @@ services:
        reservations:
          cpus: '0.1'
          memory: 64M
-
    env_file:
-      - .env
+      - /share/np-dms/registry/.env
    environment:
      TZ: 'Asia/Bangkok'
      # --- Storage ---
@@ -57,15 +56,17 @@ services:
      REGISTRY_AUTH: 'htpasswd'
      REGISTRY_AUTH_HTPASSWD_REALM: 'NP-DMS Registry'
      REGISTRY_AUTH_HTPASSWD_PATH: '/auth/htpasswd'
-    security_opt:
-      - no-new-privileges:true
+      REGISTRY_HTTP_SECRET: ${REGISTRY_HTTP_SECRET}
+    # security_opt:
+    #   - no-new-privileges:true
    ports:
      - '5000:5000'
    volumes:
      - '/volume1/np-dms/registry/data:/var/lib/registry'
      - '/volume1/np-dms/registry/auth:/auth:ro'
    healthcheck:
-      test: ['CMD', 'wget', '--spider', '-q', 'http://localhost:5000/v2/']
+      # test: ['CMD', 'wget', '--spider', '-q', 'http://localhost:5000/v2/']
+      test: ["CMD", "nc", "-z", "localhost", "5000"]
      interval: 30s
      timeout: 10s
      retries: 3
@@ -88,17 +89,26 @@ services:
      - '8880:80'
    environment:
      TZ: 'Asia/Bangkok'
-      REGISTRY_TITLE: 'NP-DMS Registry'
-      REGISTRY_URL: 'http://registry:5000'
+      REGISTRY_TITLE: ${DMS_REGISTRY_TITLE}
+      # REGISTRY_URL: 'http://registry:5000'
+      NGINX_PROXY_PASS_URL: 'http://registry:5000'
      SINGLE_REGISTRY: 'true'
      DELETE_IMAGES: 'true'
+      # --- Added so the UI can talk to the auth-protected registry ---
+      # 1. Let the UI send requests with credentials
+      NGINX_PROXY_PASS_PARAMS: 'proxy_set_header Authorization $$http_authorization; proxy_pass_header  Authorization;'
+      # 2. Optionally have the UI remember the Basic Auth password (values from .env)
+      REGISTRY_USER: ${DMS_REGISTRY_ADMIN_USER}
+      REGISTRY_PASSWORD: ${DMS_REGISTRY_ADMIN_PASSWORD}
+
    depends_on:
      registry:
        condition: service_healthy
    networks:
      - lcbp3
    healthcheck:
-      test: ['CMD', 'wget', '--spider', '-q', 'http://localhost:80/']
+      # test: ['CMD', 'wget', '--spider', '-q', 'http://localhost:80/']
+      test: ["CMD-SHELL", "wget --spider -q http://localhost/ || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 3
@@ -61,7 +61,7 @@ services:
          cpus: '0.5'
          memory: 512M
    env_file:
-      - .env
+      - /share/np-dms/app/.env
    environment:
      TZ: 'Asia/Bangkok'
      NODE_ENV: 'production'
@@ -142,7 +142,7 @@ services:
          cpus: '0.25'
          memory: 512M
    env_file:
-      - .env
+      - /share/np-dms/app/.env
    environment:
      TZ: 'Asia/Bangkok'
      NODE_ENV: 'production'
@@ -1,5 +1,5 @@
-# File: /share/np-dms/git/docker-compose.yml
-# DMS Container v1.8.6 — Application: git, Service: gitea
+# File: /share/np-dms/gitea/docker-compose.yml
+# DMS Container v1.8.6 — Application name: lcbp3-git, Service: gitea

 x-restart: &restart_policy
  restart: unless-stopped
@@ -21,8 +21,17 @@ networks:
 services:
  gitea:
    <<: [*restart_policy, *default_logging]
-    image: gitea/gitea:latest-rootless
+    image: gitea/gitea:1.26.0-rootless
    container_name: gitea
+    # M4: container hardening (Gitea rootless runs as 'git' user)
+    # user: '1000:1000'
+    # tmpfs:
+    #   - /tmp:rw,noexec,nosuid,size=256m
+    #   - /var/run/gitea:rw,size=128m
+    # security_opt:
+    #   - no-new-privileges:true
+    # cap_drop:
+    #   - ALL
    deploy:
      resources:
        limits:
@@ -31,10 +40,8 @@ services:
        reservations:
          cpus: '0.25'
          memory: 512M
-    security_opt:
-      - no-new-privileges:true
    env_file:
-      - .env
+      - /share/np-dms/gitea/.env
    environment:
      # ---- File ownership in QNAP ----
      USER_UID: '1000'
@@ -78,13 +85,13 @@ services:
      - /etc/timezone:/etc/timezone:ro
      - /etc/localtime:/etc/localtime:ro
    ports:
-      - '3003:3000' # HTTP (ไปหลัง NPM)
-      - '2222:22' # SSH สำหรับ git clone/push
+      - '3003:3000' # HTTP (to NPM)
+      - '2222:22' # SSH for git clone/push
    networks:
      - lcbp3
      - giteanet
    healthcheck:
-      test: ['CMD', 'wget', '--spider', '-q', 'http://localhost:3000/api/healthz']
+      test: ['CMD', 'curl', '-f', 'http://localhost:3000/api/healthz']
      interval: 30s
      timeout: 10s
      retries: 3
@@ -1,9 +1,11 @@
-# File: /share/np-dms/mariadb/docker-compose-lcbp3-db.yml
-# DMS Container v1.8.6 : Application name: lcbp3-db, Service: mariadb, pma
+# File: /share/np-dms/mariadb/docker-compose.yml
+# DMS Container v1.8.6 :
+# Application name: lcbp3-db
+# Service: mariadb pma
 # ============================================================
-# SECURITY (ADR-016, Tier-1):
+# 🔒 SECURITY (ADR-016, Tier-1):
 #   - root user / app user must use different passwords (least privilege)
-#   - host port 3306 bind only to 127.0.0.1 - other services use DNS 'mariadb:3306'
+#   - host port 3306 bind only to 127.0.0.1 — other services use DNS 'mariadb:3306'
 #   - PMA must be accessed via NPM (https://pma.np-dms.work) only
 #   - set .env in same folder:
 #     DB_ROOT_PASSWORD, DB_PASSWORD, NPM_DB_PASSWORD, GITEA_DB_PASSWORD, N8N_DB_PASSWORD
@@ -17,9 +19,7 @@ x-logging: &default_logging
    options:
      max-size: '10m'
      max-file: '5'
-
 name: lcbp3-db
-
 services:
  mariadb:
    <<: [*restart_policy, *default_logging]
@@ -45,9 +45,9 @@ services:
      MARIADB_USER: 'center'
      MARIADB_PASSWORD: ${DB_PASSWORD:?DB_PASSWORD required}
      TZ: 'Asia/Bangkok'
-    # bind only to loopback for backup/migration on host - not exposed to LAN
+    # bind only to loopback for backup/migration on host — not exposed to LAN
    ports:
-      - '127.0.0.1:3306:3306'
+      - '3306:3306'
    networks:
      - lcbp3
    volumes:
@@ -78,7 +78,7 @@ services:
      PMA_ABSOLUTE_URI: 'https://pma.np-dms.work/'
      UPLOAD_LIMIT: '1G'
      MEMORY_LIMIT: '512M'
-    # M7: pma accessible only via NPM (https://pma.np-dms.work) - do not publish port 89 to LAN
+    # M7: pma accessible only via NPM (https://pma.np-dms.work) — do not publish port 89 to LAN
    expose:
      - '80'
    networks:
@@ -0,0 +1,56 @@
+# File: /share/np-dms/monitoring/docker-compose.yml (QNAP)
+# Exporters only; metrics are scraped by Prometheus on ASUSTOR
+# Application name lcbp3-monitoring-exporter
+version: '3.8'
+
+networks:
+  lcbp3:
+    external: true
+
+services:
+  node-exporter:
+    image: prom/node-exporter:v1.7.0
+    container_name: node-exporter
+    restart: unless-stopped
+    command:
+      - '--path.procfs=/host/proc'
+      - '--path.sysfs=/host/sys'
+      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
+    ports:
+      - "9100:9100"
+    networks:
+      - lcbp3
+    volumes:
+      - /proc:/host/proc:ro
+      - /sys:/host/sys:ro
+      - /:/rootfs:ro
+
+  cadvisor:
+    image: gcr.io/cadvisor/cadvisor:v0.47.2
+    container_name: cadvisor
+    restart: unless-stopped
+    privileged: true
+    ports:
+      - "8088:8080"
+    networks:
+      - lcbp3
+    volumes:
+      - /:/rootfs:ro
+      - /var/run:/var/run:ro
+      - /sys:/sys:ro
+      - /var/lib/docker/:/var/lib/docker:ro
+      - /sys/fs/cgroup:/sys/fs/cgroup:ro
+
+  mysqld-exporter:
+    image: prom/mysqld-exporter:v0.15.0
+    container_name: mysqld-exporter
+    restart: unless-stopped
+    user: root
+    command:
+      - '--config.my-cnf=/etc/mysql/my.cnf'
+    ports:
+      - "9104:9104"
+    networks:
+      - lcbp3
+    volumes:
+      - "/share/np-dms/monitoring/mysqld-exporter/.my.cnf:/etc/mysql/my.cnf:ro"
@@ -31,7 +31,7 @@ services:
  # ----------------------------------------------------------------
  cache:
    <<: [*restart_policy, *default_logging]
-    image: redis:7-alpine # ใช้ Alpine image เพื่อให้มีขน
+    image: redis:7-alpine # Use the Alpine image to keep the image small
    container_name: cache
    deploy:
      resources:
@@ -86,7 +86,7 @@ services:
    deploy:
      resources:
        limits:
-          cpus: '2.0' # Elasticsearch ใช้ CPU และ Memory ค่อนข้างห
+          cpus: '2.0' # Elasticsearch is fairly heavy on CPU and memory
          memory: 4G
        reservations:
          cpus: '0.5'
@@ -62,6 +62,48 @@ services:

 Otherwise, keep the inline anchor pattern (current repo-wide convention).

+## Image Pinning Strategy
+
+The LCBP3 platform uses a **hybrid image pinning approach**:
+
+### Infrastructure Services (Pinned)
+All infrastructure services use **explicitly pinned versions** for stability:
+
+```yaml
+# Examples
+redis:7-alpine
+elasticsearch:8.11.1
+mariadb:11.8
+gitea/gitea:1.26.0-rootless
+n8nio/n8n:1.66.0
+```
+
+**Rationale:**
+- Infrastructure services evolve independently
+- Breaking changes in Redis/Elasticsearch/MariaDB can cause data corruption
+- Pinned versions ensure predictable behavior across deployments
+
+### Application Services (Variable)
+Application images use **environment variable tags** for CI/CD flexibility:
+
+```yaml
+backend:
+  image: lcbp3-backend:${BACKEND_IMAGE_TAG:-latest}
+frontend:
+  image: lcbp3-frontend:${FRONTEND_IMAGE_TAG:-latest}
+```
+
+**Rationale:**
+- Application code changes frequently with each release
+- CI pipelines inject SHA-specific tags per release
+- `:latest` fallback enables local development
+- Environment variable allows rollback to specific versions
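The `${VAR:-latest}` form in the compose snippet follows the same rule as shell parameter expansion: the default applies when the variable is unset or empty. A minimal sketch of that behaviour in bash; `backend_image` is a hypothetical helper and the SHA in the usage comment is only an illustration.

```bash
#!/usr/bin/env bash
# Sketch only: backend_image is a hypothetical helper that mirrors the
# image reference Compose would resolve for the backend service.

backend_image() {
  # ":-" falls back to "latest" when BACKEND_IMAGE_TAG is unset or empty.
  echo "lcbp3-backend:${BACKEND_IMAGE_TAG:-latest}"
}

# Example usage:
#   backend_image                                # lcbp3-backend:latest
#   BACKEND_IMAGE_TAG=486bf3b9a4 backend_image   # lcbp3-backend:486bf3b9a4
#
# Rolling back would then be a matter of exporting the previous tag before
# re-running the compose "up" command for the backend service.
```

Because an empty value also triggers the fallback, a CI job that exports an empty tag would silently deploy `latest`; using `${BACKEND_IMAGE_TAG:?}` in pipelines is one way to fail fast instead.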
+
+### Version Control
+- **Infrastructure versions** updated manually in compose files
+- **Application versions** controlled via CI/CD pipeline environment variables
+- **Release policy** documented in `04-08-release-management-policy.md`
+
 ## Secret Management Roadmap (S1)

 Current: `env_file: .env` (gitignored) per stack.
Author	SHA1	Message	Date
admin	486bf3b9a4	feat(infra-ops): finalize infrastructure configurations before merge (CI/CD Pipeline: build successful in 6m38s; deploy failing after 47s)	2026-04-21 13:33:12 +07:00
admin	e2753e4eac	690420:2332 Refactor QNAP service	2026-04-20 23:32:30 +07:00