260223:1415 20260223 Next.js & NestJS best practices

admin
2026-02-23 14:15:06 +07:00
parent c90a664f53
commit ef16817f38
164 changed files with 24815 additions and 311 deletions

---
title: 'System Architecture'
version: 1.8.0
status: first-draft
owner: Nattanin Peancharoen
last_updated: 2026-01-26
related:
  - specs/01-objectives.md
---
# 🛠️ Section 2: System Architecture
The specification calls for a modern Headless/API-First architecture, running entirely on the QNAP server via Container Station for ease of management and maintenance.
## **2.1 Infrastructure & Environment:**
- Domain: `np-dms.work`, `www.np-dms.work`
- IP: 159.192.126.103
- Server: QNAP TS-473A, RAM: 32GB, CPU: AMD Ryzen V1500B, HDD: 4x 4TB in RAID 5, SSD: 1TB used for caching, 2x 2.5Gbps ports
- Server: ASUSTOR AS5304T, RAM: 16GB, CPU: Intel Celeron @ 2.00GHz, HDD: 3x 6TB in RAID 5, SSD: 1TB used for caching, 2x 2.5Gbps ports
- Router: TP-LINK ER7206, port 1 WAN/LAN SFP, port 2 WAN, ports 3-6 WAN/LAN 10/100/1000
- Core Switch: TP-LINK TL-SG2428P, LAN port 1-24 10/100/1000, SFP port 25-28 1Gbps
- Server Switch: AMPCOM, LAN port 1-8 10/100/1000/2500, SFP+ port 9 10Gbps
- Admin Switch: TP-LINK ES205G, LAN port 1-5 10/100/1000
- CCTV Switch: TP-LINK TL-SL1226P, ports 1-24 PoE+ 100Mbps, SFP ports 25-26 1Gbps
- IP Phone Switch: TP-LINK TL-SG1210P port 1-8 PoE+ 100Mbps , Uplink1 10/100/1000, Uplink2 SFP 1Gbps
- Controller: TP-LINK OC200
- Wireless Access Points: 16x TP-LINK EAP610
- CCTV: HikVision NVR (DS-7732NXI-K4) + 6 cameras
- IP Phones: 8x Yealink
- Admin Desktop: Windows 11, LAN port 10/100/1000/2500
- Printer: Kyocera CS 3554ci, LAN port 10/100/1000
- Containerization: Container Station (Docker & Docker Compose); configuration and docker commands are driven primarily through the Container Station UI
- Development Environment: VS Code/Cursor on Windows 11
- Data Storage: /share/dms-data on the QNAP
- Constraint: external variables cannot be supplied via .env files; they must be defined in docker-compose.yml
## **2.2 Network Configuration**
**VLAN Networks**
| VLAN ID | Name | Purpose | Gateway/Subnet | DHCP | IP Range | DNS | Lease Time | ARP Detection | IGMP Snooping | MLD Snooping | Notes |
| ------- | ------ | --------- | --------------- | ---- | ------------------ | ------- | ---------- | ------------- | ------------- | ------------ | --------------- |
| 10 | SERVER | Interface | 192.168.10.1/24 | No | - | Custom | - | - | - | - | Static servers |
| 20 | MGMT | Interface | 192.168.20.1/24 | No | - | Custom | - | Enable | Enable | - | Management only |
| 30 | USER | Interface | 192.168.30.1/24 | Yes | 192.168.30.10-254 | Auto | 7 Days | - | Enable | - | User devices |
| 40 | CCTV | Interface | 192.168.40.1/24 | Yes | 192.168.40.100-150 | Auto | 7 Days | - | Enable | - | CCTV & NVR |
| 50 | VOICE | Interface | 192.168.50.1/24 | Yes | 192.168.50.201-250 | Auto | 7 Days | - | - | - | IP Phones |
| 60 | DMZ | Interface | 192.168.60.1/24 | No | - | 1.1.1.1 | - | - | - | - | Public services |
| 70 | GUEST | Interface | 192.168.70.1/24 | Yes | 192.168.70.200-250 | Auto | 1 Day | - | - | - | Guest |
**Switch Profiles**
| Profile Name | Native Network | Tagged Networks | Untagged Networks | Voice Network | Loopback Control | Usage |
| ---------------- | -------------- | --------------------- | ----------------- | ------------- | ---------------- | ----------------------- |
| 01_CORE_TRUNK | MGMT (20) | 10,30,40,50,60,70 | MGMT (20) | - | Spanning Tree | Router & switch uplinks |
| 02_MGMT_ONLY | MGMT (20) | MGMT (20) | - | - | Spanning Tree | Management only |
| 03_SERVER_ACCESS | SERVER (10) | MGMT (20) | SERVER (10) | - | Spanning Tree | QNAP / ASUSTOR |
| 04_CCTV_ACCESS | CCTV (40) | - | CCTV (40) | - | Spanning Tree | CCTV cameras |
| 05_USER_ACCESS | USER (30) | - | USER (30) | - | Spanning Tree | PC / Printer |
| 06_AP_TRUNK | MGMT (20) | USER (30), GUEST (70) | MGMT (20) | - | Spanning Tree | EAP610 Access Points |
| 07_VOICE_ACCESS | USER (30) | VOICE (50) | USER (30) | VOICE (50) | Spanning Tree | IP Phones |
**ER7206 Port Mapping**
| Port | Connected Device | Port | Description |
| ---- | ---------------- | ------------- | ----------- |
| 1 | - | - | - |
| 2 | WAN | - | Internet |
| 3 | SG2428P | PVID MGMT(20) | Core Switch |
| 4 | - | - | - |
| 5 | - | - | - |
| 6 | - | - | - |
**AMPCOM Port Aggregate Setting**
| Aggregate Group ID | Type | Member port | Aggregated Port |
| ------------------ | ---- | ----------- | --------------- |
| Trunk1 | LACP | 3,4 | 3,4 |
| Trunk2 | LACP | 5,6 | 5,6 |
**AMPCOM Port VLAN Mapping**
| Port | Connected Device | Port vlan type | Access VLAN | Native VLAN | Trunk vlan |
| ------ | ---------------- | -------------- | ----------- | ----------- | -------------------- |
| 1 | SG2428P | Trunk | - | 20 | 10,20,30,40,50,60,70 |
| 2 | - | Trunk | - | 20 | 10,20,30,40,50,60,70 |
| 7 | - | Access | 20 | - | - |
| 8 | Admin Desktop | Access | 20 | - | - |
| Trunk1 | QNAP | Trunk | - | 10 | 10,20,30,40,50,60,70 |
| Trunk2 | ASUSTOR | Trunk | - | 10 | 10,20,30,40,50,60,70 |
**NAS NIC Bonding Configuration**
| Device | Bonding Mode | Member Ports | VLAN Mode | Tagged VLAN | IP Address | Gateway | Notes |
| ------- | ------------------- | ------------ | --------- | ----------- | --------------- | ------------ | ---------------------- |
| QNAP | IEEE 802.3ad (LACP) | Adapter 1, 2 | Untagged | 10 (SERVER) | 192.168.10.8/24 | 192.168.10.1 | Primary NAS for DMS |
| ASUSTOR | IEEE 802.3ad (LACP) | Port 1, 2 | Untagged | 10 (SERVER) | 192.168.10.9/24 | 192.168.10.1 | Backup / Secondary NAS |
> **Note**: Both NAS units use LACP bonding for added bandwidth and redundancy; the configuration must match the AMPCOM switch trunks (Trunk1/Trunk2).
**SG2428P Port Mapping**
| Port | Connected Device | Switch Profile | Description |
| ---- | ------------------------- | -------------------- | ------------- |
| 1 | ER7206 | 01_CORE_TRUNK | Internet |
| 2 | OC200 | 01_CORE_TRUNK | Controller |
| 3 | Ampcom 2.5G Switch Port 1 | LAG1 (01_CORE_TRUNK) | Uplink |
| 4 | - | LAG1 (01_CORE_TRUNK) | Reserved |
| 5 | EAP610-01 | 06_AP_TRUNK | Access Point |
| 6 | EAP610-02 | 06_AP_TRUNK | Access Point |
| 7 | EAP610-03 | 06_AP_TRUNK | Access Point |
| 8 | EAP610-04 | 06_AP_TRUNK | Access Point |
| 9 | EAP610-05 | 06_AP_TRUNK | Access Point |
| 10 | EAP610-06 | 06_AP_TRUNK | Access Point |
| 11 | EAP610-07 | 06_AP_TRUNK | Access Point |
| 12 | EAP610-08 | 06_AP_TRUNK | Access Point |
| 13 | EAP610-09 | 06_AP_TRUNK | Access Point |
| 14 | EAP610-10 | 06_AP_TRUNK | Access Point |
| 15 | EAP610-11 | 06_AP_TRUNK | Access Point |
| 16 | EAP610-12 | 06_AP_TRUNK | Access Point |
| 17 | EAP610-13 | 06_AP_TRUNK | Access Point |
| 18 | EAP610-14 | 06_AP_TRUNK | Access Point |
| 19 | EAP610-15 | 06_AP_TRUNK | Access Point |
| 20 | EAP610-16 | 06_AP_TRUNK | Access Point |
| 21 | Reserved | 01_CORE_TRUNK | |
| 22 | Reserved | 01_CORE_TRUNK | |
| 23 | Printer | 05_USER_ACCESS | Printer |
| 24 | ES205G | 01_CORE_TRUNK | Management PC |
| 25 | TL-SL1226P | 01_CORE_TRUNK | Uplink |
| 26 | SG1210P | 01_CORE_TRUNK | Uplink |
| 27 | Reserved | 01_CORE_TRUNK | |
| 28 | Reserved | 01_CORE_TRUNK | |
**ES205G Port Mapping (Admin Switch)**
| Port | Connected Device | VLAN | Description |
| ---- | ---------------- | ----------- | ----------- |
| 1 | SG2428P Port 24 | Trunk (All) | Uplink |
| 2 | Admin Desktop | MGMT (20) | Admin PC |
| 3 | Reserved | MGMT (20) | |
| 4 | Reserved | MGMT (20) | |
| 5 | Reserved | MGMT (20) | |
> **Note**: The ES205G is an unmanaged switch and does not support VLAN tagging, so every port sits in the uplink's native VLAN (20).
**TL-SL1226P Port Mapping (CCTV Switch)**
| Port | Connected Device | PoE | VLAN | Description |
| ---- | ---------------- | ---- | --------- | ----------- |
| 1 | Camera-01 | PoE+ | CCTV (40) | CCTV Camera |
| 2 | Camera-02 | PoE+ | CCTV (40) | CCTV Camera |
| 3 | Camera-03 | PoE+ | CCTV (40) | CCTV Camera |
| 4 | Camera-04 | PoE+ | CCTV (40) | CCTV Camera |
| 5 | Camera-05 | PoE+ | CCTV (40) | CCTV Camera |
| 6 | Camera-06 | PoE+ | CCTV (40) | CCTV Camera |
| 7-23 | Reserved | PoE+ | CCTV (40) | |
| 24 | HikVision NVR | - | CCTV (40) | NVR |
| 25 | SG2428P Port 25 | - | Trunk | SFP Uplink |
| 26 | Reserved | - | Trunk | SFP |
**SG1210P Port Mapping (IP Phone Switch)**
| Port | Connected Device | PoE | Data VLAN | Voice VLAN | Description |
| ------- | ---------------- | ---- | --------- | ---------- | ----------- |
| 1 | IP Phone-01 | PoE+ | USER (30) | VOICE (50) | IP Phone |
| 2 | IP Phone-02 | PoE+ | USER (30) | VOICE (50) | IP Phone |
| 3 | IP Phone-03 | PoE+ | USER (30) | VOICE (50) | IP Phone |
| 4 | IP Phone-04 | PoE+ | USER (30) | VOICE (50) | IP Phone |
| 5 | IP Phone-05 | PoE+ | USER (30) | VOICE (50) | IP Phone |
| 6 | IP Phone-06 | PoE+ | USER (30) | VOICE (50) | IP Phone |
| 7 | IP Phone-07 | PoE+ | USER (30) | VOICE (50) | IP Phone |
| 8 | IP Phone-08 | PoE+ | USER (30) | VOICE (50) | IP Phone |
| Uplink1 | Reserved | - | Trunk | - | RJ45 Uplink |
| Uplink2 | SG2428P Port 26 | - | Trunk | - | SFP Uplink |
> **Note**: The SG1210P supports Voice VLAN: IP phones use VLAN 50 for voice traffic and pass VLAN 30 through to any PC daisy-chained behind the phone.
**Static IP Allocation**
| VLAN | Device | IP Address | MAC Address | Notes |
| ---------- | --------------- | ------------------ | ----------- | ---------------- |
| SERVER(10) | QNAP | 192.168.10.8 | - | Primary NAS |
| SERVER(10) | ASUSTOR | 192.168.10.9 | - | Backup NAS |
| SERVER(10) | Docker Host | 192.168.10.10 | - | Containers |
| MGMT(20) | ER7206 | 192.168.20.1 | - | Gateway/Router |
| MGMT(20) | SG2428P | 192.168.20.2 | - | Core Switch |
| MGMT(20) | AMPCOM | 192.168.20.3 | - | Server Switch |
| MGMT(20) | TL-SL1226P | 192.168.20.4 | - | CCTV Switch |
| MGMT(20) | SG1210P | 192.168.20.5 | - | Phone Switch |
| MGMT(20) | OC200 | 192.168.20.250 | - | Omada Controller |
| MGMT(20) | Admin Desktop | 192.168.20.100 | - | Admin PC |
| USER(30) | Printer | 192.168.30.222 | - | Kyocera CS3554ci |
| CCTV(40) | NVR | 192.168.40.100 | - | HikVision NVR |
| CCTV(40) | Camera-01 to 06 | 192.168.40.101-106 | - | CCTV Cameras |
| USER(30) | Admin Desktop | 192.168.30.100 | - | Admin PC (USER) |
**DHCP Reservation (MAC Mapping)**
**CCTV MAC Address Mapping (VLAN 40)**
| Device Name | IP Address | MAC Address | Port (Switch) | Notes |
| ------------- | -------------- | ----------- | ------------- | ---------- |
| HikVision NVR | 192.168.40.100 | | Port 24 | Master NVR |
| Camera-01 | 192.168.40.101 | | Port 1 | |
| Camera-02 | 192.168.40.102 | | Port 2 | |
| Camera-03 | 192.168.40.103 | | Port 3 | |
| Camera-04 | 192.168.40.104 | | Port 4 | |
| Camera-05 | 192.168.40.105 | | Port 5 | |
| Camera-06 | 192.168.40.106 | | Port 6 | |
**IP Phone MAC Address Mapping (VLAN 50)**
| Device Name | IP Address | MAC Address | Port (Switch) | Notes |
| ----------- | -------------- | ----------- | ------------- | ------- |
| IP Phone-01 | 192.168.50.201 | | Port 1 | Yealink |
| IP Phone-02 | 192.168.50.202 | | Port 2 | Yealink |
| IP Phone-03 | 192.168.50.203 | | Port 3 | Yealink |
| IP Phone-04 | 192.168.50.204 | | Port 4 | Yealink |
| IP Phone-05 | 192.168.50.205 | | Port 5 | Yealink |
| IP Phone-06 | 192.168.50.206 | | Port 6 | Yealink |
| IP Phone-07 | 192.168.50.207 | | Port 7 | Yealink |
| IP Phone-08 | 192.168.50.208 | | Port 8 | Yealink |
**Wireless SSID Mapping (OC200 Controller)**
| SSID Name | Band | VLAN | Security | Portal Auth | Notes |
| --------- | ------- | ---------- | --------- | ----------- | ----------------------- |
| PSLCBP3 | 2.4G/5G | USER (30) | WPA2/WPA3 | No | Staff WiFi |
| GUEST | 2.4G/5G | GUEST (70) | WPA2 | Yes | Guest WiFi with Captive |
> **Note**: Both SSIDs are broadcast from all 16 EAP610s using the 06_AP_TRUNK profile, which tags VLANs 30 and 70.
**Gateway ACL (ER7206 Firewall Rules)**
*Inter-VLAN Routing Policy*
| # | Name | Source | Destination | Service | Action | Log | Notes |
| --- | ----------------- | --------------- | ---------------- | -------------- | ------ | --- | --------------------------- |
| 1 | MGMT-to-ALL | VLAN20 (MGMT) | Any | Any | Allow | No | Admin full access |
| 2 | SERVER-to-ALL | VLAN10 (SERVER) | Any | Any | Allow | No | Servers outbound access |
| 3 | USER-to-SERVER | VLAN30 (USER) | VLAN10 (SERVER) | HTTP/HTTPS/SSH | Allow | No | Users access web apps |
| 4 | USER-to-DMZ | VLAN30 (USER) | VLAN60 (DMZ) | HTTP/HTTPS | Allow | No | Users access DMZ services |
| 5 | USER-to-MGMT | VLAN30 (USER) | VLAN20 (MGMT) | Any | Deny | Yes | Block users from management |
| 6 | USER-to-CCTV | VLAN30 (USER) | VLAN40 (CCTV) | Any | Deny | Yes | Isolate CCTV |
| 7 | USER-to-VOICE | VLAN30 (USER) | VLAN50 (VOICE) | Any | Deny | No | Isolate Voice |
| 8 | USER-to-GUEST | VLAN30 (USER) | VLAN70 (GUEST) | Any | Deny | No | Isolate Guest |
| 9 | CCTV-to-INTERNET | VLAN40 (CCTV) | WAN | HTTPS (443) | Allow | No | NVR cloud backup (optional) |
| 10 | CCTV-to-ALL | VLAN40 (CCTV) | Any (except WAN) | Any | Deny | Yes | CCTV isolated |
| 11 | VOICE-to-SIP | VLAN50 (VOICE) | SIP Server IP | SIP/RTP | Allow | No | Voice to SIP trunk |
| 12 | VOICE-to-ALL | VLAN50 (VOICE) | Any | Any | Deny | No | Voice isolated |
| 13 | DMZ-to-ALL | VLAN60 (DMZ) | Any (internal) | Any | Deny | Yes | DMZ cannot reach internal |
| 14 | GUEST-to-INTERNET | VLAN70 (GUEST) | WAN | HTTP/HTTPS/DNS | Allow | No | Guest internet only |
| 15 | GUEST-to-ALL | VLAN70 (GUEST) | Any (internal) | Any | Deny | Yes | Guest isolated |
| 99 | DEFAULT-DENY | Any | Any | Any | Deny | Yes | Catch-all deny |
*WAN Inbound Rules (Port Forwarding)*
| # | Name | WAN Port | Internal IP | Internal Port | Protocol | Notes |
| --- | --------- | -------- | ------------ | ------------- | -------- | ------------------- |
| 1 | HTTPS-NPM | 443 | 192.168.10.8 | 443 | TCP | Nginx Proxy Manager |
| 2 | HTTP-NPM | 80 | 192.168.10.8 | 80 | TCP | HTTP redirect |
> **Note**: The ER7206 applies a default-deny policy; rules are evaluated top to bottom.
**Switch ACL (SG2428P Layer 2 Rules)**
*Port-Based Access Control*
| # | Name | Source Port | Source MAC/VLAN | Destination | Action | Notes |
| --- | --------------- | --------------- | --------------- | ------------------- | ------ | ------------------------ |
| 1 | CCTV-Isolation | Port 25 (CCTV) | VLAN 40 | VLAN 10,20,30 | Deny | CCTV cannot reach others |
| 2 | Guest-Isolation | Port 5-20 (APs) | VLAN 70 | VLAN 10,20,30,40,50 | Deny | Guest isolation |
| 3 | Voice-QoS | Port 26 (Phone) | VLAN 50 | Any | Allow | QoS priority DSCP EF |
*Storm Control (per port)*
| Port Range | Broadcast | Multicast | Unknown Unicast | Notes |
| ---------- | --------- | --------- | --------------- | ----------------------- |
| 1-28 | 10% | 10% | 10% | Prevent broadcast storm |
*Spanning Tree Configuration*
| Setting | Value | Notes |
| -------------------- | --------- | ------------------------------ |
| STP Mode | RSTP | Rapid Spanning Tree |
| Root Bridge Priority | 4096 | SG2428P as root |
| Port Fast | Port 5-24 | Edge ports (APs, endpoints) |
| BPDU Guard | Port 5-24 | Protect against rogue switches |
> **Note**: The SG2428P is an L2+ switch with limited ACL capability; treat the ER7206 as the primary firewall.
**EAP ACL (Omada Controller - Wireless Rules)**
*SSID: PSLCBP3 (Staff WiFi)*
| # | Name | Source | Destination | Service | Action | Schedule | Notes |
| --- | ------------------- | ---------- | ---------------- | -------- | ------ | -------- | ----------------- |
| 1 | Allow-DNS | Any Client | 8.8.8.8, 1.1.1.1 | DNS (53) | Allow | Always | DNS resolution |
| 2 | Allow-Server | Any Client | 192.168.10.0/24 | Any | Allow | Always | Access to servers |
| 3 | Allow-Printer | Any Client | 192.168.30.222 | 9100,631 | Allow | Always | Print services |
| 4 | Allow-Internet | Any Client | WAN | Any | Allow | Always | Internet access |
| 5 | Block-MGMT | Any Client | 192.168.20.0/24 | Any | Deny | Always | No management |
| 6 | Block-CCTV | Any Client | 192.168.40.0/24 | Any | Deny | Always | No CCTV access |
| 7 | Block-Voice | Any Client | 192.168.50.0/24 | Any | Deny | Always | No Voice access |
| 8 | Block-Client2Client | Any Client | Any Client | Any | Deny | Always | Client isolation |
*SSID: GUEST (Guest WiFi)*
| # | Name | Source | Destination | Service | Action | Schedule | Notes |
| --- | ------------------- | ---------- | ---------------- | ---------- | ------ | -------- | ------------------ |
| 1 | Allow-DNS | Any Client | 8.8.8.8, 1.1.1.1 | DNS (53) | Allow | Always | DNS resolution |
| 2 | Allow-HTTP | Any Client | WAN | HTTP/HTTPS | Allow | Always | Web browsing |
| 3 | Block-RFC1918 | Any Client | 10.0.0.0/8 | Any | Deny | Always | No private IPs |
| 4 | Block-RFC1918-2 | Any Client | 172.16.0.0/12 | Any | Deny | Always | No private IPs |
| 5 | Block-RFC1918-3 | Any Client | 192.168.0.0/16 | Any | Deny | Always | No internal access |
| 6 | Block-Client2Client | Any Client | Any Client | Any | Deny | Always | Client isolation |
*Rate Limiting*
| SSID | Download Limit | Upload Limit | Notes |
| ------- | -------------- | ------------ | ----------------------- |
| PSLCBP3 | Unlimited | Unlimited | Staff full speed |
| GUEST | 10 Mbps | 5 Mbps | Guest bandwidth control |
*Captive Portal (GUEST SSID)*
| Setting | Value | Notes |
| ---------------- | --------------- | ---------------------- |
| Portal Type | Simple Password | Single shared password |
| Session Timeout | 8 Hours | Re-auth after 8 hours |
| Idle Timeout | 30 Minutes | Disconnect if idle |
| Terms of Service | Enabled | User must accept ToS |
> **Note**: EAP ACLs are enforced at Layer 3 by the Omada Controller, which offloads work from the ER7206.
**Network Topology Diagram**
```mermaid
graph TB
subgraph Internet
WAN[("🌐 Internet<br/>WAN")]
end
subgraph Router["ER7206 Router"]
R[("🔲 ER7206<br/>192.168.20.1")]
end
subgraph CoreSwitch["SG2428P Core Switch"]
CS[("🔲 SG2428P<br/>192.168.20.2")]
end
subgraph ServerSwitch["AMPCOM 2.5G Switch"]
SS[("🔲 AMPCOM<br/>192.168.20.3")]
end
subgraph Servers["VLAN 10 - Servers"]
QNAP[("💾 QNAP<br/>192.168.10.8")]
ASUSTOR[("💾 ASUSTOR<br/>192.168.10.9")]
end
subgraph AccessPoints["EAP610 x16"]
AP[("📶 WiFi APs")]
end
subgraph OtherSwitches["Distribution"]
CCTV_SW[("🔲 TL-SL1226P<br/>CCTV")]
PHONE_SW[("🔲 SG1210P<br/>IP Phone")]
ADMIN_SW[("🔲 ES205G<br/>Admin")]
end
WAN --> R
R -->|Port 3| CS
CS -->|LAG Port 3-4| SS
SS -->|Port 3-4 LACP| QNAP
SS -->|Port 5-6 LACP| ASUSTOR
SS -->|Port 7| ADMIN_SW
CS -->|Port 5-20| AP
CS -->|SFP 25| CCTV_SW
CS -->|SFP 26| PHONE_SW
CS -->|Port 24| ADMIN_SW
```
**OC200 Omada Controller Configuration**
| Setting | Value | Notes |
| --------------- | -------------------------- | ------------------------------ |
| Controller IP | 192.168.20.10 | Static IP in MGMT VLAN |
| Controller Port | 8043 (HTTPS) | Management Web UI |
| Adoption URL | https://192.168.20.10:8043 | URL for AP adoption |
| Site Name | LCBP3 | Single site configuration |
| Managed Devices | 16x EAP610 | All APs managed centrally |
| Firmware Update | Manual | Test before production rollout |
| Backup Schedule | Weekly (Sunday 2AM) | Auto backup to QNAP |
## **2.3 Configuration Management (Revised):**
- Use docker-compose.yml for environment variables, per the QNAP constraint
- Secrets Management:
  - Never put sensitive secrets (passwords, keys) in the main docker-compose.yml
  - Use a docker-compose.override.yml file (gitignored) to inject secret environment variables for each environment (Dev/Prod)
  - Keep dummy or empty values in the main docker-compose.yml
  - There must still be a mechanism for handling sensitive secrets safely, using one of:
    - Docker secrets (where supported)
    - External secret management (HashiCorp Vault), or
    - Encrypted environment variables
- The development environment may still use .env files, but they must never be committed to version control
- Configuration must be validated during application startup
- Configuration must be separated per environment (development, staging, production)
- Docker Network: every service joins a shared network named lcbp3 so services can reach one another
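The override pattern above can be sketched as follows. This is a minimal illustration, assuming the service names from section 2.4; the values shown are placeholders, not real credentials:

```yaml
# docker-compose.override.yml — gitignored; real secrets live only here.
# The main docker-compose.yml keeps dummy/empty values for the same keys.
services:
  backend:
    environment:
      - DB_PASSWORD=changeme   # placeholder
      - JWT_SECRET=changeme    # placeholder
```

The shared `lcbp3` network is created once (for example with `docker network create lcbp3`) before any stack that declares it as `external: true` is started.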
## **2.4 Core Services:**
- Code Hosting: Gitea (Self-hosted on QNAP)
- Application name: git
- Service name: gitea
- Domain: git.np-dms.work
- Role: central repository for storing and version-controlling all source code
- Backend / Data Platform: NestJS
- Application name: lcbp3-backend
- Service name: backend
- Domain: backend.np-dms.work
- Framework: NestJS (Node.js, TypeScript, ESM)
- Role: defines the data models, exposes the API, manages user roles & permissions, and implements all system workflows
- Database: MariaDB 11.8
- Application name: lcbp3-db
- Service name: mariadb
- Domain: db.np-dms.work
- Role: primary database holding all system data
- Tooling: DBeaver (Community Edition) and phpMyAdmin for database design and management
- Database Management: phpMyAdmin
- Application name: lcbp3-db
- Service: phpmyadmin:5-apache
- Service name: pma
- Domain: pma.np-dms.work
- Role: web UI for managing the MariaDB database
- Frontend: Next.js
- Application name: lcbp3-frontend
- Service name: frontend
- Domain: lcbp3.np-dms.work
- Framework: Next.js (App Router, React, TypeScript, ESM)
- Styling: Tailwind CSS + PostCSS
- Component Library: shadcn/ui
- Role: serves the web application where users view dashboards, manage documents, and track work; it communicates with the backend via the API
- Workflow Automation: n8n
- Application name: lcbp3-n8n
- Service: n8nio/n8n:latest
- Service name: n8n
- Domain: n8n.np-dms.work
- Role: orchestrates workflows between the backend and LINE
- Reverse Proxy: Nginx Proxy Manager
- Application name: lcbp3-npm
- Service: Nginx Proxy Manager (nginx-proxy-manager:latest)
- Service name: npm
- Domain: npm.np-dms.work
- Role: front door for all traffic; manages every domain, proxies requests to the correct service, and handles SSL certificates (HTTPS) automatically
- Search Engine: Elasticsearch
- Cache: Redis
## **2.5 Business Logic & Consistency (Revised):**
- 2.5.1 Unified Workflow Engine (core):
  - All document routing (Correspondence, RFA, Circulation) must run on the same central engine, with logic defined through a Workflow DSL (JSON configuration) rather than hard-coded into tables
  - Workflow Versioning (new): workflow definitions must be versioned; documents already in progress keep their original workflow version until they complete, or until an Admin explicitly migrates them, to prevent state conflicts
- 2.5.2 Separation of Concerns:
  - Modules (Correspondence, RFA, Circulation) store only document data; state and state transitions are handled by the Workflow Engine
- 2.5.3 Idempotency & Locking:
  - Reuse the existing mechanism to prevent duplicate transactions
- 2.5.4 Optimistic Locking:
  - Use a version column in the database together with a Redis lock for document-number generation, as the final safety net
- 2.5.5 No SQL Triggers:
  - Avoids hidden logic and hard-to-debug behavior
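As a sketch of what such a JSON workflow definition might look like — all field names, states, and roles here are illustrative assumptions, not the project's actual DSL:

```json
{
  "workflowId": "rfa-approval",
  "version": 3,
  "initialState": "draft",
  "states": ["draft", "submitted", "under_review", "approved", "rejected"],
  "transitions": [
    { "from": "draft", "to": "submitted", "action": "submit", "roles": ["editor"] },
    { "from": "submitted", "to": "under_review", "action": "accept", "roles": ["document_control"] },
    { "from": "under_review", "to": "approved", "action": "approve", "roles": ["approver"] },
    { "from": "under_review", "to": "rejected", "action": "reject", "roles": ["approver"] }
  ]
}
```

Under the versioning rule, an RFA instance started against `version: 3` keeps evaluating its transitions against this definition even after a newer version is published.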
## **2.6 Data Migration & Schema Versioning:**
- Every schema change must ship with a database migration script, implemented with TypeORM migrations
- Every migration must support rollback
- A data-seeding strategy is required for each environment (development, staging, production)
- Adjacent schema versions must remain compatible
- Migration scripts must be tested in staging before reaching production
- The production database must be backed up before any migration runs
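The rollback requirement means every migration ships an `up()` and a working `down()`. The sketch below shows that shape; the interfaces are stubbed locally so the example is self-contained (in the real project they come from the `typeorm` package), and the table and column names are illustrative:

```typescript
// Minimal stand-ins for TypeORM's MigrationInterface / QueryRunner.
interface QueryRunner {
  query(sql: string): Promise<unknown>;
}

interface MigrationInterface {
  up(queryRunner: QueryRunner): Promise<void>;
  down(queryRunner: QueryRunner): Promise<void>;
}

// A reversible schema change: the down() step undoes exactly what up() did.
class AddDocumentRevision1700000000000 implements MigrationInterface {
  async up(queryRunner: QueryRunner): Promise<void> {
    await queryRunner.query(
      'ALTER TABLE documents ADD COLUMN revision INT NOT NULL DEFAULT 0',
    );
  }

  async down(queryRunner: QueryRunner): Promise<void> {
    await queryRunner.query('ALTER TABLE documents DROP COLUMN revision');
  }
}
```

Running `down()` in staging as part of the test pass verifies the rollback path before the migration ever reaches production.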
## **2.7 Resilience & Error Handling Strategy**
- 2.7.1 Circuit Breaker Pattern: applied to external service calls (Email, LINE, Elasticsearch)
- 2.7.2 Retry Mechanism: exponential backoff for transient failures
- 2.7.3 Fallback Strategies: graceful degradation when external services fail
- 2.7.4 Error Handling: error messages must never expose sensitive information
- 2.7.5 Monitoring: centralized error monitoring and alerting
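The retry mechanism in 2.7.2 can be sketched as a small helper. The attempt count and base delay below are illustrative defaults, not project settings:

```typescript
// Retry a flaky async call with exponential backoff: 100ms, 200ms, 400ms, ...
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        // Double the wait on each failure so transient outages get breathing room.
        const delayMs = baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError; // all attempts exhausted
}
```

A circuit breaker would wrap the same call sites, short-circuiting once consecutive failures pass a threshold so the external service (Email, LINE, Elasticsearch) is not hammered while down.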

---
title: 'Access Control'
version: 1.5.0
status: first-draft
owner: Nattanin Peancharoen
last_updated: 2025-11-30
related:
  - specs/02-architecture/data-model.md#correspondence
  - specs/03-implementation/backend-guidelines.md#correspondencemodule
---
# 🔐 Section 4: Access Control
## 4.1. Overview:
- Users and organizations can view and edit documents according to the permissions they hold. Permissions are governed by Role-Based Access Control (RBAC).
## 4.2. Permission Hierarchy:
- Global: The highest level of permissions in the system
- Organization: Permissions within an organization, which is the basic permission for users
- Project: Permissions specific to a project, which will be considered when the user is in that project
- Contract: Permissions specific to a contract, which will be considered when the user is in that contract
## 4.3. Permission Enforcement:
- When checking permissions, the system considers the permissions a user holds at every level and applies the most permissive one
- Example: User A is a Viewer in the organization but is assigned as an Editor in Project X; while working in Project X, User A may edit
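The "most permissive wins" rule reduces to taking the maximum over the roles collected from each applicable scope. A minimal sketch, with an illustrative role ranking (the real system has more roles than shown here):

```typescript
type Role = 'viewer' | 'editor' | 'admin';

// Higher rank = more permissive. The ranking itself is an assumption for illustration.
const RANK: Record<Role, number> = { viewer: 0, editor: 1, admin: 2 };

// Given the roles a user holds across organization / project / contract scopes,
// keep the strongest one; null means no access at all.
function effectiveRole(scopedRoles: Role[]): Role | null {
  if (scopedRoles.length === 0) return null;
  return scopedRoles.reduce((best, r) => (RANK[r] > RANK[best] ? r : best));
}

// User A from the example: Viewer in the organization, Editor in Project X.
const roleInProjectX = effectiveRole(['viewer', 'editor']); // resolves to the Editor role
```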
## 4.4. Role and Scope:
| Role | Scope | Description | Key Permissions |
| :------------------- | :----------- | :------------------------- | :-------------------------------------------------------------------------------------------------------------------- |
| **Superadmin** | Global | System administrator | Do everything in the system, manage organizations, manage global data |
| **Org Admin** | Organization | Organization administrator | Manage users in the organization, manage roles/permissions within the organization, view organization reports |
| **Document Control** | Organization | Document controller | Add/edit/delete documents, set document permissions within the organization |
| **Editor** | Organization | Document editor | Edit documents that have been assigned to them |
| **Viewer** | Organization | Document viewer | View documents that have access permissions |
| **Project Manager** | Project | Project manager | Manage members in the project (add/delete/assign roles), create/manage contracts in the project, view project reports |
| **Contract Admin** | Contract | Contract administrator | Manage users in the contract, manage roles/permissions within the contract, view contract reports |
## 4.5. Token Management (Revised)
- **Payload Optimization:** the JWT access token carries only the current `userId` and `scope`
- **Permission Caching:** the detailed permissions list is stored in **Redis** and fetched on each request, keeping tokens small and checks fast
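This is a cache-aside lookup: check Redis first, fall back to the database on a miss, and write the result back with a TTL. In the sketch below the cache interface is stubbed so the example stands alone (in production it would be an ioredis client); key format, permission strings, and the 5-minute TTL are illustrative assumptions:

```typescript
interface Cache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

async function getPermissions(
  cache: Cache,
  userId: number,
  scope: string,
  loadFromDb: () => Promise<string[]>,
): Promise<string[]> {
  const key = `perms:${userId}:${scope}`;
  const cached = await cache.get(key);
  if (cached !== null) return JSON.parse(cached); // fast path: Redis hit
  const perms = await loadFromDb();               // miss: one database round trip
  await cache.set(key, JSON.stringify(perms), 300); // cache with a short TTL
  return perms;
}
```

A short TTL (or explicit invalidation on role changes) bounds how long a revoked permission can linger in the cache.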
## 4.6. Onboarding Workflow
- 4.6.1. Create Organization
- **Superadmin** creates a new organization (e.g. Company A)
- **Superadmin** appoints at least 1 user as **Org Admin** or **Document Control** of Company A
- 4.6.2. Add Users to Organization
- **Org Admin** of Company A adds other users (Editor, Viewer) to the organization
- 4.6.3. Assign Users to Project
- **Project Manager** of Project X (which may come from Company A or another company) invites or assigns users from different organizations to join Project X
- In this step, **Project Manager** will assign **Project Role** (e.g. Project Member, or may use organization-level permissions)
- 4.6.4. Assign Users to Contract
- **Contract Admin** of Contract Y (which is part of Project X) selects users from Project X and assigns them to Contract Y
- In this step, **Contract Admin** will assign **Contract Role** (e.g. Contract Member) and specific permissions
- 4.6.5 Security Onboarding:
- Force users to change password for the first time
- Security awareness training for users with high permissions
- Safe password reset process
- Audit log recording every permission change
### **4.7. Master Data Management**
| Master Data | Manager | Scope |
| :-------------------------------------- | :------------------------------ | :------------------------------ |
| Document Type (Correspondence, RFA) | **Superadmin** | Global |
| Document Status (Draft, Approved, etc.) | **Superadmin** | Global |
| Shop Drawing Category | **Project Manager** | Project (new categories can be created per project) |
| Tags | **Org Admin / Project Manager** | Organization / Project |
| Custom Roles | **Superadmin / Org Admin** | Global / Organization |
| Document Numbering Formats | **Superadmin / Admin** | Global / Organization |

# Infrastructure Setup
> 📍 **Document Version:** v1.8.0
> 🖥️ **Primary Server:** QNAP TS-473A (Application & Database)
> 💾 **Backup Server:** ASUSTOR AS5403T (Infrastructure & Backup)
---
## Overview
> 📖 **See server roles and service distribution in:** [README.md](README.md#-hardware-infrastructure)
>
> This document focuses on the technical configuration of each service.
---
## 1. Redis Configuration (Standalone + Persistence)
### 1.1 Docker Compose Setup
```yaml
# docker-compose-redis.yml
version: '3.8'

services:
  redis:
    image: 'redis:7.2-alpine'
    container_name: lcbp3-redis
    restart: unless-stopped
    # AOF enabled for durability; maxmemory capped to prevent OOM
    command: redis-server --appendonly yes --requirepass ${REDIS_PASSWORD} --maxmemory 1gb --maxmemory-policy noeviction
    volumes:
      - ./redis/data:/data
    ports:
      - '6379:6379'
    networks:
      - lcbp3
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 1.5G

networks:
  lcbp3:
    external: true
```
## 2. Database Configuration
### 2.1 MariaDB Optimization for Numbering
```ini
# /etc/mysql/mariadb.conf.d/50-numbering.cnf
[mysqld]
# Connection pool
max_connections = 200
thread_cache_size = 50
# Query cache (disabled for InnoDB)
query_cache_type = 0
query_cache_size = 0
# InnoDB settings
innodb_buffer_pool_size = 4G
innodb_log_file_size = 512M
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 50
# Performance Schema
performance_schema = ON
performance_schema_instrument = 'wait/lock/%=ON'
# Binary logging
log_bin = /var/log/mysql/mysql-bin.log
expire_logs_days = 7
max_binlog_size = 100M
# Slow query log
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow-query.log
long_query_time = 1
```
### 2.2 Monitoring Locks
```sql
-- Check for lock contention
SELECT
r.trx_id waiting_trx_id,
r.trx_mysql_thread_id waiting_thread,
r.trx_query waiting_query,
b.trx_id blocking_trx_id,
b.trx_mysql_thread_id blocking_thread,
b.trx_query blocking_query
FROM information_schema.innodb_lock_waits w
INNER JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id
INNER JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id;
-- Check active transactions
SELECT * FROM information_schema.innodb_trx;
-- Kill long-running transaction (if needed)
KILL <thread_id>;
```
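The settings above back the document-numbering flow: a short-lived Redis lock (`SET key token NX PX ttl`) serializes number generation, and the version column acts as the final optimistic-locking safety net. The sketch below stubs both Redis and the database so it stands alone; the lock key, 5-second TTL, and table shape are illustrative assumptions:

```typescript
interface Lock {
  acquire(key: string, ttlMs: number): Promise<boolean>; // e.g. SET key token NX PX ttl
  release(key: string): Promise<void>;                   // DEL key
}

interface SequenceRow {
  currentValue: number;
  version: number; // optimistic-locking column
}

async function nextDocumentNumber(
  lock: Lock,
  seq: SequenceRow,
  // e.g. UPDATE ... SET current_value = ?, version = version + 1 WHERE version = ?
  persist: (next: number, expectedVersion: number) => Promise<boolean>,
): Promise<number> {
  const key = 'lock:numbering:correspondence';
  if (!(await lock.acquire(key, 5000))) {
    throw new Error('numbering lock busy'); // caller retries with backoff
  }
  try {
    const next = seq.currentValue + 1;
    // Safety net: fails if another writer bumped the version concurrently.
    if (!(await persist(next, seq.version))) {
      throw new Error('sequence version conflict');
    }
    seq.currentValue = next;
    seq.version += 1;
    return next;
  } finally {
    await lock.release(key); // always release, even on conflict
  }
}
```

If the lock holder crashes, the PX expiry releases the lock automatically, which is why the TTL should comfortably exceed the worst-case transaction time.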
---
## 3. Backend Service Configuration
### 3.1 Backend Service Deployment
#### Docker Compose
```yaml
# docker-compose-backend.yml
version: '3.8'

services:
  backend-1:
    image: lcbp3-backend:latest
    container_name: lcbp3-backend-1
    environment:
      - NODE_ENV=production
      - DB_HOST=mariadb
      - REDIS_HOST=cache
      - REDIS_PORT=6379
      - NUMBERING_LOCK_TIMEOUT=5000
      - NUMBERING_RESERVATION_TTL=300
    ports:
      - "3001:3000"
    depends_on:
      - mariadb
      - cache
    networks:
      - lcbp3
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  backend-2:
    image: lcbp3-backend:latest
    container_name: lcbp3-backend-2
    environment:
      - NODE_ENV=production
      - DB_HOST=mariadb
      - REDIS_HOST=cache
      - REDIS_PORT=6379
    ports:
      - "3002:3000"
    depends_on:
      - mariadb
      - cache
    networks:
      - lcbp3
    restart: unless-stopped

networks:
  lcbp3:
    external: true
```
#### Health Check Endpoint
```typescript
// health/numbering.health.ts
import { Injectable } from '@nestjs/common';
import { HealthIndicator, HealthIndicatorResult } from '@nestjs/terminus';
import { Redis } from 'ioredis';
import { DataSource } from 'typeorm';

@Injectable()
export class NumberingHealthIndicator extends HealthIndicator {
  constructor(
    private redis: Redis,
    private dataSource: DataSource,
  ) {
    super();
  }

  async isHealthy(key: string): Promise<HealthIndicatorResult> {
    const checks = await Promise.all([
      this.checkRedis(),
      this.checkDatabase(),
      this.checkSequenceIntegrity(),
    ]);
    const isHealthy = checks.every((check) => check.status === 'up');
    return this.getStatus(key, isHealthy, { checks });
  }

  private async checkRedis(): Promise<any> {
    try {
      await this.redis.ping();
      return { name: 'redis', status: 'up' };
    } catch (error) {
      return { name: 'redis', status: 'down', error: error.message };
    }
  }

  private async checkDatabase(): Promise<any> {
    try {
      await this.dataSource.query('SELECT 1');
      return { name: 'database', status: 'up' };
    } catch (error) {
      return { name: 'database', status: 'down', error: error.message };
    }
  }

  private async checkSequenceIntegrity(): Promise<any> {
    try {
      const result = await this.dataSource.query(`
        SELECT COUNT(*) as count
        FROM document_numbering_sequences
        WHERE current_value > (
          SELECT max_value FROM document_numbering_configs
          WHERE id = config_id
        )
      `);
      const hasIssue = result[0].count > 0;
      return {
        name: 'sequence_integrity',
        status: hasIssue ? 'degraded' : 'up',
        exceeded_sequences: result[0].count,
      };
    } catch (error) {
      return { name: 'sequence_integrity', status: 'down', error: error.message };
    }
  }
}
```
---
## 4. Monitoring & Alerting
### 4.1 Prometheus Configuration
```yaml
# prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093

rule_files:
  - "/etc/prometheus/alerts/numbering.yml"

scrape_configs:
  - job_name: 'backend'
    static_configs:
      - targets:
          - 'backend-1:3000'
          - 'backend-2:3000'
    metrics_path: '/metrics'

  - job_name: 'redis-numbering'
    static_configs:
      - targets:
          - 'redis-1:6379'
          - 'redis-2:6379'
          - 'redis-3:6379'
    metrics_path: '/metrics'

  - job_name: 'mariadb'
    static_configs:
      - targets:
          - 'mariadb-exporter:9104'
```
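The rule file referenced above (`/etc/prometheus/alerts/numbering.yml`) is not included in this spec. A minimal sketch could look like the following — the alert names match the runbooks in Section 8, but the metric names and thresholds are assumptions about what the backend exports:

```yaml
# /etc/prometheus/alerts/numbering.yml — illustrative sketch only
groups:
  - name: numbering
    rules:
      - alert: SequenceCritical
        expr: numbering_sequence_utilization_ratio > 0.9   # assumed metric name
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Sequence for {{ $labels.document_type }} above 90% utilization"
      - alert: HighLockWaitTime
        expr: histogram_quantile(0.95, rate(numbering_lock_wait_seconds_bucket[5m])) > 1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "p95 lock wait time above 1s"
      - alert: RedisUnavailable
        expr: up{job="redis-numbering"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Redis instance {{ $labels.instance }} is down"
```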
### 4.2 Alert Manager Configuration
```yaml
# alertmanager.yml
global:
  resolve_timeout: 5m

route:
  receiver: 'default'
  group_by: ['alertname', 'severity']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 12h
  routes:
    - match:
        severity: critical
      receiver: 'critical'
      continue: true
    - match:
        severity: warning
      receiver: 'warning'

receivers:
  - name: 'default'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK'
        channel: '#lcbp3-alerts'
        title: '{{ .GroupLabels.alertname }}'
        text: '{{ range .Alerts }}{{ .Annotations.summary }}\n{{ end }}'

  - name: 'critical'
    email_configs:
      - to: 'devops@lcbp3.com'
        from: 'alerts@lcbp3.com'
        smarthost: 'smtp.gmail.com:587'
        auth_username: 'alerts@lcbp3.com'
        auth_password: 'your-password'
        headers:
          Subject: '🚨 CRITICAL: {{ .GroupLabels.alertname }}'
    pagerduty_configs:
      - service_key: 'YOUR_PAGERDUTY_KEY'

  - name: 'warning'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK'
        channel: '#lcbp3-warnings'
```
### 4.3 Grafana Dashboards
#### Import Dashboard JSON
```bash
# Download dashboard template
curl -o numbering-dashboard.json \
https://raw.githubusercontent.com/lcbp3/grafana-dashboards/main/numbering.json
# Import to Grafana
curl -X POST http://admin:admin@localhost:3000/api/dashboards/db \
-H "Content-Type: application/json" \
-d @numbering-dashboard.json
```
#### Key Panels to Monitor
1. **Numbers Generated per Minute** - Rate of number creation
2. **Sequence Utilization** - Current usage vs max (alert >90%)
3. **Lock Wait Time (p95)** - Performance indicator
4. **Lock Failures** - System health indicator
5. **Redis Health (Single instance)** - Node status
6. **Database Connection Pool** - Resource usage
---
## 5. Backup & Recovery
### 5.1 Database Backup Strategy
#### Automated Backup Script
```bash
#!/bin/bash
# scripts/backup-numbering-db.sh
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/backups/numbering"
DB_NAME="lcbp3_production"
echo "🔄 Starting backup at $DATE"
# Create backup directory
mkdir -p $BACKUP_DIR
# Backup numbering tables only
docker exec lcbp3-mariadb mysqldump \
--single-transaction \
--routines \
--triggers \
$DB_NAME \
document_numbering_configs \
document_numbering_sequences \
document_numbering_audit_logs \
> $BACKUP_DIR/numbering_$DATE.sql
# Compress backup
gzip $BACKUP_DIR/numbering_$DATE.sql
# Keep only last 30 days
find $BACKUP_DIR -name "numbering_*.sql.gz" -mtime +30 -delete
echo "✅ Backup complete: numbering_$DATE.sql.gz"
```
#### Cron Schedule
```cron
# Run backup daily at 2 AM
0 2 * * * /opt/lcbp3/scripts/backup-numbering-db.sh >> /var/log/numbering-backup.log 2>&1
# Run integrity check weekly on Sunday at 3 AM
0 3 * * 0 /opt/lcbp3/scripts/check-sequence-integrity.sh >> /var/log/numbering-integrity.log 2>&1
```
### 5.2 Redis Backup
#### Enable RDB Persistence
```conf
# redis.conf
save 900 1      # Snapshot after 900s if at least 1 key changed
save 300 10     # Snapshot after 300s if at least 10 keys changed
save 60 10000   # Snapshot after 60s if at least 10000 keys changed
dbfilename dump.rdb
dir /data
# Enable AOF for durability
appendonly yes
appendfilename "appendonly.aof"
appendfsync everysec
```
#### Backup Script
```bash
#!/bin/bash
# scripts/backup-redis.sh
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/backups/redis"
mkdir -p $BACKUP_DIR
echo "Backing up Redis..."
# Trigger BGSAVE
docker exec cache redis-cli BGSAVE
# Wait for save to complete
sleep 10
# Copy RDB file
docker cp cache:/data/dump.rdb \
$BACKUP_DIR/redis_${DATE}.rdb
# Copy AOF file
docker cp cache:/data/appendonly.aof \
$BACKUP_DIR/redis_${DATE}.aof
# Compress
tar -czf $BACKUP_DIR/redis_${DATE}.tar.gz \
$BACKUP_DIR/redis_${DATE}.rdb \
$BACKUP_DIR/redis_${DATE}.aof
# Cleanup
rm $BACKUP_DIR/redis_${DATE}.rdb $BACKUP_DIR/redis_${DATE}.aof
echo "✅ Redis backup complete: redis_${DATE}.tar.gz"
```
### 5.3 Recovery Procedures
#### Scenario 1: Restore from Database Backup
```bash
#!/bin/bash
# scripts/restore-numbering-db.sh
BACKUP_FILE=$1
if [ -z "$BACKUP_FILE" ]; then
echo "Usage: ./restore-numbering-db.sh <backup_file>"
exit 1
fi
echo "⚠️ WARNING: This will overwrite current numbering data!"
read -p "Continue? (yes/no): " confirm
if [ "$confirm" != "yes" ]; then
echo "Aborted"
exit 0
fi
# Decompress if needed
if [[ $BACKUP_FILE == *.gz ]]; then
gunzip -c $BACKUP_FILE > /tmp/restore.sql
RESTORE_FILE="/tmp/restore.sql"
else
RESTORE_FILE=$BACKUP_FILE
fi
# Restore
docker exec -i lcbp3-mariadb mysql lcbp3_production < $RESTORE_FILE
echo "✅ Restore complete"
echo "🔄 Please verify sequence integrity"
```
#### Scenario 2: Redis Failure
```bash
# Check Redis status
docker exec cache redis-cli ping
# If Redis is down, restart container
docker restart cache
# Verify Redis is running
docker exec cache redis-cli ping
# If restart fails, restore from backup
./scripts/restore-redis.sh /backups/redis/latest.tar.gz
```
---
## 6. Maintenance Procedures
### 6.1 Sequence Adjustment
#### Increase Max Value
```sql
-- Check current utilization
SELECT
dc.document_type,
ds.current_value,
dc.max_value,
ROUND((ds.current_value * 100.0 / dc.max_value), 2) as utilization
FROM document_numbering_sequences ds
JOIN document_numbering_configs dc ON ds.config_id = dc.id
WHERE ds.current_value > dc.max_value * 0.8;
-- Increase max_value for type approaching limit
UPDATE document_numbering_configs
SET max_value = max_value * 10,
updated_at = CURRENT_TIMESTAMP
WHERE document_type = 'COR'
AND max_value < 9999999;
-- Audit log
INSERT INTO document_numbering_audit_logs (
operation, document_type, old_value, new_value,
user_id, metadata
) VALUES (
'ADJUST_MAX_VALUE', 'COR', '999999', '9999999',
1, '{"reason": "Approaching limit", "automated": false}'
);
```
#### Reset Yearly Sequence
```sql
-- For document types with yearly reset
-- Run on January 1st
START TRANSACTION;
-- Create new sequence for new year
INSERT INTO document_numbering_sequences (
config_id,
scope_value,
current_value,
last_used_at
)
SELECT
id as config_id,
YEAR(CURDATE()) as scope_value,
0 as current_value,
NULL as last_used_at
FROM document_numbering_configs
WHERE scope = 'YEARLY';
-- Verify
SELECT * FROM document_numbering_sequences
WHERE scope_value = YEAR(CURDATE());
COMMIT;
```
### 6.2 Cleanup Old Audit Logs
```sql
-- Archive logs older than 2 years
-- Run monthly
START TRANSACTION;
-- Create archive table (if not exists)
CREATE TABLE IF NOT EXISTS document_numbering_audit_logs_archive
LIKE document_numbering_audit_logs;
-- Move old logs to archive
INSERT INTO document_numbering_audit_logs_archive
SELECT * FROM document_numbering_audit_logs
WHERE timestamp < DATE_SUB(CURDATE(), INTERVAL 2 YEAR);
-- Delete from main table
DELETE FROM document_numbering_audit_logs
WHERE timestamp < DATE_SUB(CURDATE(), INTERVAL 2 YEAR);
-- Optimize table
OPTIMIZE TABLE document_numbering_audit_logs;
COMMIT;
-- Export archive to file (optional)
SELECT * FROM document_numbering_audit_logs_archive
INTO OUTFILE '/tmp/audit_archive_2023.csv'
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n';
```
### 6.3 Redis Maintenance
#### Flush Expired Reservations
```bash
#!/bin/bash
# scripts/cleanup-expired-reservations.sh
# Redis removes keys with an expired TTL on its own; this job cleans up
# reservation keys that have LOST their TTL (TTL == -1) and would otherwise
# linger forever. SCAN is used instead of KEYS to avoid blocking Redis.
echo "🧹 Cleaning up stale reservations..."
COUNT=0
for KEY in $(docker exec lcbp3-redis-1 redis-cli --scan --pattern "reservation:*"); do
  TTL=$(docker exec lcbp3-redis-1 redis-cli TTL "$KEY")
  if [ "$TTL" -eq -1 ]; then
    # Key has no expiry set — delete it
    docker exec lcbp3-redis-1 redis-cli DEL "$KEY"
    ((COUNT++))
  fi
done
echo "✅ Cleaned up $COUNT stale reservations"
```
---
## 7. Disaster Recovery
### 7.1 Total System Failure
#### Recovery Steps
```bash
#!/bin/bash
# scripts/disaster-recovery.sh
echo "🚨 Starting disaster recovery..."
# 1. Start Redis cluster
echo "1⃣ Starting Redis cluster..."
docker-compose -f docker-compose-redis.yml up -d
sleep 30
# 2. Restore Redis backups
echo "2⃣ Restoring Redis backups..."
./scripts/restore-redis.sh /backups/redis/latest.tar.gz
# 3. Start database
echo "3⃣ Starting MariaDB..."
docker-compose -f docker-compose-db.yml up -d
sleep 30
# 4. Restore database
echo "4⃣ Restoring database..."
./scripts/restore-numbering-db.sh /backups/db/latest.sql.gz
# 5. Verify sequence integrity
echo "5⃣ Verifying sequence integrity..."
./scripts/check-sequence-integrity.sh
# 6. Start backend services
echo "6⃣ Starting backend services..."
docker-compose -f docker-compose-backend.yml up -d
# 7. Run health checks
echo "7⃣ Running health checks..."
sleep 60
for PORT in 3001 3002; do
  curl -f http://localhost:$PORT/health || echo "Backend on port $PORT not healthy"
done
echo "✅ Disaster recovery complete"
echo "⚠️ Please verify system functionality manually"
```
### 7.2 RTO/RPO Targets
| Scenario | RTO | RPO | Priority |
| ---------------------------- | ------- | ------ | -------- |
| Single backend node failure | 0 min | 0 | P0 |
| Single Redis node failure | 0 min | 0 | P0 |
| Database primary failure | 5 min | 0 | P0 |
| Complete data center failure | 1 hour | 15 min | P1 |
| Data corruption | 4 hours | 1 day | P2 |
---
## 8. Runbooks
### 8.1 High Sequence Utilization (>90%)
**Alert**: `SequenceWarning` or `SequenceCritical`
**Steps**:
1. Check current utilization
```sql
SELECT document_type, current_value, max_value,
ROUND((current_value * 100.0 / max_value), 2) as pct
FROM document_numbering_sequences s
JOIN document_numbering_configs c ON s.config_id = c.id
WHERE current_value > max_value * 0.9;
```
2. Assess impact
- How many numbers left?
- Daily usage rate?
- Days until exhaustion?
3. Take action
```sql
-- Option A: Increase max_value
UPDATE document_numbering_configs
SET max_value = max_value * 10
WHERE document_type = 'COR';
-- Option B: Reset sequence (yearly types only)
-- Schedule for next year/month
```
4. Notify stakeholders
5. Update monitoring thresholds if needed
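
For step 2, the exhaustion estimate is simple arithmetic; a small helper (the function name is ours) makes the check repeatable:

```typescript
// Estimate how many days remain before a sequence hits its max_value,
// given the current value and an observed daily usage rate.
function daysUntilExhaustion(currentValue: number, maxValue: number, dailyRate: number): number {
  const remaining = Math.max(0, maxValue - currentValue);
  if (dailyRate <= 0) return Infinity; // no usage → never exhausts
  return Math.floor(remaining / dailyRate);
}
```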
---
### 8.2 High Lock Wait Time
**Alert**: `HighLockWaitTime`
**Steps**:
1. Check Redis cluster health
```bash
docker exec lcbp3-redis-1 redis-cli cluster info
docker exec lcbp3-redis-1 redis-cli cluster nodes
```
2. Check database locks
```sql
SELECT * FROM information_schema.innodb_lock_waits;
SELECT * FROM information_schema.innodb_trx
WHERE trx_started < NOW() - INTERVAL 30 SECOND;
```
3. Identify bottleneck
- Redis slow?
- Database slow?
- High concurrent load?
4. Take action based on cause:
- **Redis**: Add more nodes, check network latency
- **Database**: Optimize queries, increase connection pool
- **High load**: Scale horizontally (add backend nodes)
5. Monitor improvements
---
### 8.3 Redis Down
**Alert**: `RedisUnavailable`
**Steps**:
1. Verify Redis is down
```bash
docker exec cache redis-cli ping || echo "Redis DOWN"
```
2. Check system falls back to DB-only mode
```bash
curl http://localhost:3001/health/numbering
# Should show: fallback_mode: true
```
3. Restart Redis container
```bash
docker restart cache
sleep 10
docker exec cache redis-cli ping
```
4. If restart fails, restore from backup
```bash
./scripts/restore-redis.sh /backups/redis/latest.tar.gz
```
5. Verify numbering system back to normal
```bash
curl http://localhost:3001/health/numbering
# Should show: fallback_mode: false
```
6. Review logs for root cause
```bash
docker logs cache --tail 100
```
---
## 9. Performance Tuning
### 9.1 Slow Number Generation
**Diagnosis**:
```sql
-- Check slow queries
SELECT * FROM mysql.slow_log
WHERE sql_text LIKE '%document_numbering%'
ORDER BY query_time DESC
LIMIT 10;
-- Check index usage
EXPLAIN SELECT * FROM document_numbering_sequences
WHERE config_id = 1 AND scope_value = '2025'
FOR UPDATE;
```
**Optimizations**:
```sql
-- Add missing indexes
CREATE INDEX idx_sequence_lookup
ON document_numbering_sequences(config_id, scope_value);
-- Optimize table
OPTIMIZE TABLE document_numbering_sequences;
-- Update statistics
ANALYZE TABLE document_numbering_sequences;
```
### 9.2 Redis Memory Optimization
```bash
# Check memory usage
docker exec cache redis-cli INFO memory
# If memory high, check keys
docker exec cache redis-cli --bigkeys
# Set maxmemory policy
docker exec cache redis-cli CONFIG SET maxmemory 2gb
docker exec cache redis-cli CONFIG SET maxmemory-policy allkeys-lru
```
---
## 10. Security Hardening
### 10.1 Redis Security
```conf
# redis.conf
requirepass your-strong-redis-password
# Reachable only on the internal Docker network; never publish this port externally
bind 0.0.0.0
protected-mode yes
rename-command FLUSHDB ""
rename-command FLUSHALL ""
rename-command CONFIG "CONFIG_abc123"
```
### 10.2 Database Security
```sql
-- Create dedicated numbering user
CREATE USER 'numbering'@'%' IDENTIFIED BY 'strong-password';
-- Grant minimal permissions (MariaDB does not support wildcards in
-- table-level grants, so each table must be listed explicitly)
GRANT SELECT, INSERT, UPDATE ON lcbp3_production.document_numbering_configs TO 'numbering'@'%';
GRANT SELECT, INSERT, UPDATE ON lcbp3_production.document_numbering_sequences TO 'numbering'@'%';
GRANT SELECT, INSERT ON lcbp3_production.document_numbering_audit_logs TO 'numbering'@'%';
GRANT SELECT ON lcbp3_production.users TO 'numbering'@'%';
FLUSH PRIVILEGES;
```
### 10.3 Network Security
```yaml
# docker-compose-network.yml
networks:
  lcbp3:
    driver: bridge
    ipam:
      config:
        - subnet: 172.20.0.0/16
    driver_opts:
      com.docker.network.bridge.name: lcbp3-br
      com.docker.network.bridge.enable_icc: "true"
      com.docker.network.bridge.enable_ip_masquerade: "true"
```
---
## 11. Compliance & Audit
### 11.1 Audit Log Retention
```sql
-- Export audit logs for compliance
SELECT *
FROM document_numbering

```
# 🏗️ System Architecture Specification
---
**title:** 'System Architecture'
**version:** 1.7.0
**status:** first-draft
**owner:** Nattanin Peancharoen
**last_updated:** 2025-12-18
**related:**
- specs/01-requirements/01-02-architecture.md
- specs/01-requirements/01-06-non-functional.md
- specs/03-implementation/03-01-fullftack-js-v1.7.0.md
---
## 📋 Overview
This document describes the architecture of LCBP3-DMS (Laem Chabang Port Phase 3 - Document Management System), which follows a **Headless/API-First Architecture** and is deployed on a QNAP server via Container Station.
## 1. 🎯 Architecture Principles
### 1.1 Component Overview
```
┌──────────────────────────────────────────────────────┐
│ Load Balancer │
│ (Nginx Proxy Manager) │
└────────────┬─────────────────────────────────────────┘
┌────────┴────────┬──────────────┬──────────────┐
│ │ │ │
┌───▼────┐ ┌──────▼──────┐ ┌──▼───┐ ┌─────▼─────┐
│Backend │ │Backend │ │Backend│ │ Backend │
│Node 1 │ │Node 2 │ │Node 3 │ │ Node 4 │
└───┬────┘ └──────┬──────┘ └──┬────┘ └─────┬─────┘
│ │ │ │
└────────────────┴──────────────┴───────────────┘
┌───────────┼───────────┬──────────────┐
│ │ │ │
┌────▼────┐ ┌──▼───┐ ┌───▼────┐ ┌────▼─────┐
│ MariaDB │ │Redis │ │ Redis │ │ Redis │
│ Primary │ │Node 1│ │ Node 2 │ │ Node 3 │
└────┬────┘ └──────┘ └────────┘ └──────────┘
┌────▼────┐
│ MariaDB │
│Replicas │
└─────────┘
```
### 1.2 Component Responsibilities
| Component | Purpose | Critical? |
| --------------- | --------------------------------- | --------- |
| Backend Nodes | API processing, number generation | YES |
| MariaDB Primary | Persistent sequence storage | YES |
| Redis Cluster | Distributed locking, reservations | YES |
| Load Balancer | Traffic distribution | YES |
| Prometheus | Metrics collection | NO |
| Grafana | Monitoring dashboard | NO |
---
### 1.3 Core Principles
1. **Data Integrity First:** Data correctness comes before everything else
2. **Security by Design:** Security enforced at every layer
3. **Scalability:** Support future growth
4. **Resilience:** Tolerate failures and recover quickly
5. **Observability:** Easy to monitor and analyze system state
### 1.4 Architecture Style
- **Headless CMS Architecture:** Frontend and Backend are fully decoupled
- **API-First:** The Backend is an API server that the Frontend or third parties can consume
- **Microservices-Ready:** Designed as a modular architecture, ready to be split into microservices in the future
## 2. 🏢 Infrastructure & Deployment
### 2.1 Server Infrastructure
- **Server:** QNAP TS-473A
- CPU: AMD Ryzen V1500B
- RAM: 32GB
- Storage: /share/dms-data
- **IP Address:** 159.192.126.103
- **Domain:** np-dms.work, www.np-dms.work
- **Containerization:** Docker & Docker Compose via Container Station
- **Development Environment:** VS Code/Cursor on Windows 11
### 2.2 Network Architecture
```mermaid
graph TB
Internet[Internet] --> NPM[Nginx Proxy Manager<br/>npm.np-dms.work]
NPM --> Frontend[Next.js Frontend<br/>lcbp3.np-dms.work]
NPM --> Backend[NestJS Backend<br/>backend.np-dms.work]
NPM --> PMA[phpMyAdmin<br/>pma.np-dms.work]
NPM --> N8N[n8n Workflow<br/>n8n.np-dms.work]
NPM --> Gitea[Gitea Git<br/>git.np-dms.work]
Backend --> MariaDB[(MariaDB 11.8<br/>db.np-dms.work)]
Backend --> Redis[(Redis Cache)]
Backend --> Elastic[Elasticsearch]
Backend --> Storage[File Storage<br/>/share/dms-data]
N8N --> Line[LINE Notify]
Backend --> Email[Email Server]
```
**Docker Network:**
- Network Name: `lcbp3`
- All services communicate over the internal Docker network for security
### 2.3 Configuration Management
> [!WARNING]
> **Key constraint:** QNAP Container Station does not support `.env` files for defining Environment Variables
**Configuration Strategy:**
1. **Production/Staging:**
   - Use `docker-compose.yml` to define Environment Variables
   - Never put sensitive secrets (passwords, keys) in the main `docker-compose.yml`
   - Use `docker-compose.override.yml` (gitignored) for secrets
   - Consider Docker Secrets or HashiCorp Vault
2. **Development:**
   - Use `docker-compose.override.yml` for local secrets
   - Keep dummy/placeholder values in the main file
3. **Validation:**
   - Validate Environment Variables with Joi/Zod at app start
   - Throw an error immediately if a required variable is missing
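
The fail-fast validation step can be sketched without any library (the spec suggests Joi/Zod; this dependency-free version shows the same idea — the function name is ours):

```typescript
// Fail-fast environment validation: collect every missing variable and
// throw once with the full list, so a misconfigured container dies at startup.
function requireEnv(
  env: Record<string, string | undefined>,
  keys: string[],
): Record<string, string> {
  const missing = keys.filter((k) => !env[k]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  return Object.fromEntries(keys.map((k) => [k, env[k] as string]));
}
```

At bootstrap this would be called as `requireEnv(process.env, ['DB_HOST', 'REDIS_HOST', ...])` before the Nest app is created.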
## 3. 🔧 Core Services
### 3.1 Service Overview
| Service | Application Name | Domain | Technology | Purpose |
| :---------------- | :--------------- | :------------------ | :----------------------- | :-------------------------- |
| **Frontend** | lcbp3-frontend | lcbp3.np-dms.work | Next.js 14+ (App Router) | Web Application UI |
| **Backend** | lcbp3-backend | backend.np-dms.work | NestJS (TypeScript) | API Server & Business Logic |
| **Database** | lcbp3-db | db.np-dms.work | MariaDB 11.8 | Primary Database |
| **DB Management** | lcbp3-db | pma.np-dms.work | phpMyAdmin | Database Admin UI |
| **Reverse Proxy** | lcbp3-npm | npm.np-dms.work | Nginx Proxy Manager | Reverse Proxy & SSL |
| **Workflow** | lcbp3-n8n | n8n.np-dms.work | n8n | Workflow Automation |
| **Git** | git | git.np-dms.work | Gitea | Self-hosted Git |
| **Cache** | - | - | Redis | Caching & Locking |
| **Search** | - | - | Elasticsearch | Full-text Search |
### 3.2 Frontend (Next.js)
**Stack:**
- **Framework:** Next.js 14+ with App Router
- **Language:** TypeScript (ESM)
- **Styling:** Tailwind CSS + PostCSS
- **Components:** shadcn/ui
- **State Management:**
- Server State: TanStack Query (React Query)
- Form State: React Hook Form + Zod
- UI State: useState/useReducer
**Responsibilities:**
- Render the web UI for users
- Handle user interactions
- Call the Backend API
- Client-side validation
- Responsive design (Desktop + Mobile)
### 3.3 Backend (NestJS)
**Stack:**
- **Framework:** NestJS (Node.js + TypeScript)
- **ORM:** TypeORM
- **Authentication:** JWT + Passport
- **Authorization:** CASL (RBAC)
- **Validation:** class-validator + class-transformer
- **Documentation:** Swagger/OpenAPI
**Responsibilities:**
- Serve the RESTful API
- Business Logic Processing
- Authentication & Authorization
- Data Validation
- Database Operations
- File Upload Handling (Two-Phase Storage)
- Workflow Engine
- Background Jobs (Notifications, Cleanup)
### 3.4 Database (MariaDB 11.8)
**Features:**
- **JSON Support:** store `details` fields (dynamic schema)
- **Virtual Columns:** index JSON fields for performance
- **Partitioning:** for `audit_logs` and `notifications`
- **Optimistic Locking:** use `@VersionColumn()` to prevent race conditions
**Key Tables:**
- Users & Permissions: `users`, `roles`, `permissions`, `user_roles`
- Projects: `projects`, `organizations`, `contracts`, `project_parties`
- Documents: `correspondences`, `rfas`, `shop_drawings`, `contract_drawings`
- Workflow: `workflow_definitions`, `workflow_instances`, `workflow_history`
- Files: `attachments`, `correspondence_attachments`, etc.
- Audit: `audit_logs`
### 3.5 Redis
**Use Cases:**
1. **Distributed Locking:** Document Numbering, Critical Operations
2. **Session Caching:** User Permissions, Profile Data
3. **Master Data Cache:** Roles, Permissions, Organizations (TTL: 1 hour)
4. **Queue Management:** BullMQ for Background Jobs
5. **Rate Limiting:** Track API Request Counts
### 3.6 Elasticsearch
**Use Cases:**
- **Full-text Search:** Search across Correspondence, RFA, Drawings
- **Advanced Filtering:** Multi-criteria Search
- **Aggregations:** statistics and dashboard data
**Indexing Strategy:**
- Index automatically on document create/update
- Async indexing via queue (does not block the main request)
## 4. 🧱 Backend Module Architecture
### 4.1 Modular Design
```mermaid
graph TB
subgraph "Core Modules"
Common[CommonModule<br/>Shared Services]
Auth[AuthModule<br/>JWT & Guards]
User[UserModule<br/>User Management]
end
subgraph "Business Modules"
Project[ProjectModule<br/>Projects & Contracts]
Corr[CorrespondenceModule<br/>Correspondences]
RFA[RfaModule<br/>RFA Management]
Drawing[DrawingModule<br/>Shop & Contract Drawings]
Circ[CirculationModule<br/>Circulation Sheets]
Trans[TransmittalModule<br/>Transmittals]
end
subgraph "Supporting Modules"
Workflow[WorkflowEngineModule<br/>Unified Workflow]
Numbering[DocumentNumberingModule<br/>Auto Numbering]
Search[SearchModule<br/>Elasticsearch]
Master[MasterModule<br/>Master Data]
JSON[JsonSchemaModule<br/>JSON Validation]
end
Corr --> Workflow
RFA --> Workflow
Circ --> Workflow
Corr --> Numbering
RFA --> Numbering
Search --> Corr
Search --> RFA
Search --> Drawing
```
### 4.2 Module Descriptions
#### 4.2.1 CommonModule
**Responsibilities:**
- Database Configuration
- FileStorageService (Two-Phase Upload)
- AuditLogService
- NotificationService
- Shared DTOs, Guards, Interceptors
#### 4.2.2 AuthModule
**Responsibilities:**
- JWT Token Management
- Authentication Guards
- 4-Level Permission Checking:
- Global (Superadmin)
- Organization
- Project
- Contract
- Token Refresh & Revocation
#### 4.2.3 UserModule
**Responsibilities:**
- User CRUD Operations
- Role Assignment
- Permission Management
- User Profile Management
#### 4.2.4 ProjectModule
**Responsibilities:**
- Project Management
- Contract Management
- Organization Management
- Project Parties & Contract Parties
#### 4.2.5 CorrespondenceModule
**Responsibilities:**
- Correspondence CRUD
- Revision Management
- Attachment Handling
- Workflow Integration (Routing)
#### 4.2.6 RfaModule
**Responsibilities:**
- RFA CRUD
- RFA Item Management
- Workflow Integration (Approval Process)
- Respond/Approve Actions
#### 4.2.7 DrawingModule
**Responsibilities:**
- Shop Drawing Management
- Contract Drawing Management
- Drawing Categories
- Revision Tracking
- Drawing References
#### 4.2.8 CirculationModule
**Responsibilities:**
- Circulation Sheet Management
- Circulation Templates
- Assignees Management
- Workflow Integration (Internal Circulation)
#### 4.2.9 WorkflowEngineModule (Core)
> [!IMPORTANT]
> **Unified Workflow Engine** - the central system that manages all workflows
**Features:**
- DSL-Based Workflow Definitions (JSON Configuration)
- State Machine Management
- Workflow Instance Tracking
- History & Audit Trail
- Workflow Versioning
**Entities:**
- `WorkflowDefinition`: defines a workflow template
- `WorkflowInstance`: a running workflow instance
- `WorkflowHistory`: history of state transitions
**Integration:**
- CorrespondenceModule → Routing Workflow
- RfaModule → Approval Workflow
- CirculationModule → Internal Circulation Workflow
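
A minimal sketch of DSL-based transition validation — the JSON shape here is an assumption for illustration, not the actual `WorkflowDefinition` schema:

```typescript
// A workflow DSL as a flat transition table: the engine only allows an action
// if a transition exists from the current state.
interface WorkflowDsl {
  initial: string;
  transitions: { from: string; action: string; to: string }[];
}

// Returns the next state, or throws on an invalid transition
function executeTransition(dsl: WorkflowDsl, currentState: string, action: string): string {
  const t = dsl.transitions.find((t) => t.from === currentState && t.action === action);
  if (!t) throw new Error(`Invalid transition: ${action} from ${currentState}`);
  return t.to;
}
```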
#### 4.2.10 DocumentNumberingModule (Internal)
**Responsibilities:**
- Auto-generate Document Numbers
- Token-Based Generator: `{CONTRACT}-{TYPE}-{DISCIPLINE}-{SEQ:4}`
- **Double-Lock Mechanism:**
- Layer 1: Redis Distributed Lock
- Layer 2: Optimistic Database Lock (`@VersionColumn()`)
**Algorithm:**
1. Parse Template → Identify Required Tokens
2. Acquire Redis Lock (Key: `project_id:type_id:discipline_id:year`)
3. Query `document_number_counters` Table
4. Increment Counter (Check Version)
5. Generate Final Number
6. Release Lock
7. Retry on Conflict (Exponential Backoff)
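
Steps 1 and 5 of the algorithm can be sketched as a small token formatter; the `{TOKEN}` and `{SEQ:4}` syntax follows the template example above, while the function name and context shape are assumptions:

```typescript
// Fill a numbering template such as {CONTRACT}-{TYPE}-{DISCIPLINE}-{SEQ:4}
// from a context map; {SEQ:4} means "zero-pad the value to 4 digits".
type NumberingContext = Record<string, string | number>;

function formatDocumentNumber(template: string, ctx: NumberingContext): string {
  return template.replace(/\{([A-Z_]+)(?::(\d+))?\}/g, (_m, token: string, pad?: string) => {
    const value = ctx[token];
    if (value === undefined) {
      throw new Error(`Missing token: ${token}`); // template requires a value we don't have
    }
    return pad ? String(value).padStart(Number(pad), '0') : String(value);
  });
}
```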
#### 4.2.11 SearchModule
**Responsibilities:**
- Elasticsearch Integration
- Full-text Search across Documents
- Advanced Filtering
- Search Result Aggregation
#### 4.2.12 JsonSchemaModule (Internal)
**Responsibilities:**
- JSON Schema Validation (AJV)
- Schema Versioning & Migration
- Dynamic Schema Generation
- Data Transformation
## 5. 📊 Data Flow Architecture
### 5.1 Main Request Flow
```mermaid
sequenceDiagram
participant Client as Client (Browser)
participant NPM as Nginx Proxy
participant BE as Backend (NestJS)
participant Redis as Redis Cache
participant DB as MariaDB
participant ES as Elasticsearch
Client->>NPM: HTTP Request + JWT
NPM->>BE: Forward Request
BE->>BE: Rate Limit Check
BE->>BE: Validate Input (DTO)
BE->>BE: JWT Auth Guard
BE->>Redis: Get User Permissions
Redis-->>BE: Permission Data
BE->>BE: RBAC Guard (Check Permission)
BE->>DB: Query Data
DB-->>BE: Return Data
BE->>BE: Business Logic Processing
BE->>DB: Save Changes (Transaction)
BE->>ES: Index for Search
BE->>Redis: Invalidate Cache
BE->>DB: Audit Log
BE-->>Client: JSON Response
```
### 5.2 File Upload Flow (Two-Phase Storage)
> [!IMPORTANT]
> **Two-Phase Storage** prevents orphan files and preserves data integrity
```mermaid
sequenceDiagram
participant Client
participant Backend
participant ClamAV as Virus Scanner
participant TempStorage as Temp Storage
participant PermStorage as Permanent Storage
participant DB as Database
Client->>Backend: Upload File
Backend->>ClamAV: Scan Virus
ClamAV-->>Backend: Scan Result (CLEAN/INFECTED)
alt File is CLEAN
Backend->>TempStorage: Save to temp/
Backend-->>Client: Return temp_id
Client->>Backend: POST Create Document (include temp_id)
Backend->>DB: BEGIN Transaction
Backend->>DB: Create Document Record
Backend->>PermStorage: Move temp/ → permanent/{YYYY}/{MM}/
Backend->>DB: Create Attachment Record
Backend->>DB: COMMIT Transaction
Backend-->>Client: Success Response
else File is INFECTED
Backend-->>Client: Error: Virus Detected
end
Note over Backend,TempStorage: Cron Job: Delete temp files > 24h
```
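
The `permanent/{YYYY}/{MM}/` layout used in the move step can be sketched as a small path builder (the function name is ours):

```typescript
// Build the permanent storage path for a verified upload, bucketed by
// year and zero-padded month as in the flow above.
function permanentPath(filename: string, date: Date): string {
  const yyyy = date.getFullYear();
  const mm = String(date.getMonth() + 1).padStart(2, '0');
  return `permanent/${yyyy}/${mm}/${filename}`;
}
```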
### 5.3 Document Numbering Flow
```mermaid
sequenceDiagram
participant Service as Correspondence Service
participant Numbering as Numbering Service
participant Redis
participant DB as MariaDB
Service->>Numbering: generateNextNumber(context)
Numbering->>Numbering: Parse Template
Numbering->>Redis: ACQUIRE Lock (project:type:year)
alt Lock Acquired
Redis-->>Numbering: Lock Success
Numbering->>DB: SELECT counter (with version)
DB-->>Numbering: current_number, version
Numbering->>DB: UPDATE counter SET last_number = X, version = version + 1 WHERE version = old_version
alt Update Success
DB-->>Numbering: Success
Numbering->>Numbering: Generate Final Number
Numbering->>Redis: RELEASE Lock
Numbering-->>Service: Document Number
else Version Conflict (Race Condition)
DB-->>Numbering: Update Failed
Numbering->>Redis: RELEASE Lock
Numbering->>Numbering: Retry (Exponential Backoff)
end
else Lock Failed
Redis-->>Numbering: Lock Timeout
Numbering->>Numbering: Retry or Fail
end
```
### 5.4 Workflow Execution Flow
```mermaid
sequenceDiagram
participant User
participant Module as Correspondence Module
participant Engine as Workflow Engine
participant DB
participant Notify as Notification Service
User->>Module: Create Correspondence
Module->>Engine: createWorkflowInstance(definition_id, entity_id)
Engine->>DB: Create workflow_instance
Engine->>DB: Set initial state
Engine-->>Module: Instance Created
Module-->>User: Success
User->>Module: Execute Action (e.g., "Send")
Module->>Engine: executeTransition(instance_id, action)
Engine->>DB: Check current state
Engine->>Engine: Validate transition (DSL)
alt Transition Valid
Engine->>DB: Update state
Engine->>DB: Create workflow_history
Engine->>Notify: Trigger Notification
Notify->>Notify: Queue Email/Line
Engine-->>Module: Transition Success
Module-->>User: Action Completed
else Invalid Transition
Engine-->>Module: Error: Invalid State Transition
Module-->>User: Error Response
end
```
## 6. 🛡️ Security Architecture
### 6.1 Security Layers
```mermaid
graph TB
subgraph "Layer 1: Network Security"
SSL[SSL/TLS<br/>Nginx Proxy Manager]
Firewall[Firewall Rules<br/>QNAP]
end
subgraph "Layer 2: Application Security"
RateLimit[Rate Limiting]
CSRF[CSRF Protection]
XSS[XSS Prevention]
Input[Input Validation]
end
subgraph "Layer 3: Authentication"
JWT[JWT Tokens]
Refresh[Token Refresh]
Revoke[Token Revocation]
end
subgraph "Layer 4: Authorization"
RBAC[4-Level RBAC]
Guards[Permission Guards]
CASL[CASL Rules]
end
subgraph "Layer 5: Data Security"
Encrypt[Data Encryption]
Audit[Audit Logs]
Backup[Backups]
end
subgraph "Layer 6: File Security"
Virus[Virus Scanning]
FileType[Type Validation]
FileAccess[Access Control]
end
```
### 6.2 Authentication & Authorization Details
**JWT Token Structure:**
```json
{
"sub": "user_id",
"scope": "organization_id|project_id|contract_id",
"iat": 1638360000,
"exp": 1638388800
}
```
**Permission Checking Logic:**
1. Extract JWT from `Authorization: Bearer <token>`
2. Validate Token (Signature, Expiration)
3. Get User Permissions from Redis Cache (Key: `user:{user_id}:permissions`)
4. Check Permission based on Context:
- Global Permission (Superadmin)
- Organization Permission
- Project Permission (if in project context)
- Contract Permission (if in contract context)
5. Allow if **any level** grants permission (Most Permissive)
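
Steps 4-5 ("most permissive wins") can be sketched as pure logic; the shape of the cached permission object below is an assumption for illustration, not the actual CASL structure:

```typescript
// Cached permissions grouped by level, as they might come out of Redis.
interface CachedPermissions {
  global: string[];                        // Superadmin-level permissions
  organization: Record<string, string[]>;  // keyed by organization_id
  project: Record<string, string[]>;       // keyed by project_id
  contract: Record<string, string[]>;      // keyed by contract_id
}

interface PermissionContext {
  organizationId?: string;
  projectId?: string;
  contractId?: string;
}

// Allow if ANY applicable level grants the action (most permissive wins)
function hasPermission(perms: CachedPermissions, action: string, ctx: PermissionContext): boolean {
  if (perms.global.includes(action)) return true;
  if (ctx.organizationId && (perms.organization[ctx.organizationId] ?? []).includes(action)) return true;
  if (ctx.projectId && (perms.project[ctx.projectId] ?? []).includes(action)) return true;
  if (ctx.contractId && (perms.contract[ctx.contractId] ?? []).includes(action)) return true;
  return false;
}
```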
### 6.3 Rate Limiting
| Endpoint Category | Limit | Tracking |
| :---------------- | :------------ | :--------- |
| Anonymous | 100 req/hour | IP Address |
| Authentication | 10 req/min | IP Address |
| File Upload | 50 req/hour | User ID |
| Search | 500 req/hour | User ID |
| Viewer | 500 req/hour | User ID |
| Editor | 1000 req/hour | User ID |
| Document Control | 2000 req/hour | User ID |
| Admin | 5000 req/hour | User ID |
**Implementation:** `rate-limiter-flexible` library with Redis backend
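
Production uses `rate-limiter-flexible` backed by Redis; for illustration, a self-contained fixed-window counter captures the same per-key limit idea:

```typescript
// Fixed-window rate limiter: at most `limit` requests per key per window.
// A stand-in sketch for the Redis-backed limiter; names are ours.
class FixedWindowLimiter {
  private hits = new Map<string, { count: number; windowStart: number }>();

  constructor(private limit: number, private windowMs: number) {}

  /** Returns true if the request is allowed for this key (IP or user ID). */
  consume(key: string, now: number = Date.now()): boolean {
    const entry = this.hits.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.hits.set(key, { count: 1, windowStart: now }); // new window
      return true;
    }
    if (entry.count >= this.limit) return false; // limit reached → 429
    entry.count++;
    return true;
  }
}
```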
### 6.4 Input Validation
**Frontend (Client-Side):**
- React Hook Form + Zod Schema Validation
- Sanitize User Inputs before Display
**Backend (Server-Side):**
- class-validator DTOs
- Whitelist Validation (`@ValidateIf`, `@IsEnum`, etc.)
- Transform Pipes
**File Upload Validation:**
1. **File Type Validation:**
- White-list: PDF, DWG, DOCX, XLSX, ZIP
- Magic Number Verification (not just the file extension)
2. **File Size Validation:**
- Maximum: 50MB per file
3. **Virus Scanning:**
- ClamAV Integration
- Scan before saving to temp storage
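A minimal magic-number sniff for the whitelisted types might look like the sketch below (assumptions: DOCX/XLSX/ZIP all share the ZIP signature `PK..`, and DWG files begin with the ASCII prefix `AC10`):

```typescript
// Known leading byte signatures for the whitelist. Files whose first bytes
// match no signature are rejected regardless of their extension.
const SIGNATURES: Array<{ ext: string; bytes: number[] }> = [
  { ext: "pdf", bytes: [0x25, 0x50, 0x44, 0x46] }, // "%PDF"
  { ext: "zip", bytes: [0x50, 0x4b, 0x03, 0x04] }, // "PK\x03\x04" (zip/docx/xlsx)
  { ext: "dwg", bytes: [0x41, 0x43, 0x31, 0x30] }, // "AC10" version prefix
];

// Returns the detected type, or null when the signature is unknown.
function sniffFileType(buf: Uint8Array): string | null {
  for (const sig of SIGNATURES) {
    if (sig.bytes.every((b, i) => buf[i] === b)) return sig.ext;
  }
  return null;
}
```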
### 6.5 OWASP Top 10 Protection
| Vulnerability | Protection Measure |
| :-------------------------------- | :----------------------------------- |
| SQL Injection | Parameterized Queries (TypeORM) |
| XSS | Input Sanitization + Output Encoding |
| CSRF | CSRF Tokens (State-changing ops) |
| Broken Auth | JWT + Secure Token Management |
| Security Misconfiguration | Security Headers (Helmet.js) |
| Sensitive Data Exposure | Encryption + Secure Storage |
| Insufficient Logging | Comprehensive Audit Logs |
| Insecure Deserialization | Input Validation |
| Using Known Vulnerable Components | Regular Dependency Updates |
## 7. 📈 Performance & Scalability
### 7.1 Caching Strategy
| Data Type | Cache Location | TTL | Invalidation |
| :--------------- | :------------- | :------ | :------------------------ |
| User Permissions | Redis | 30 min | On role/permission change |
| Master Data | Redis | 1 hour | On update |
| Search Results | Redis | 15 min | Time-based |
| File Metadata | Redis | 1 hour | On file update |
| Session Data | Redis | 8 hours | On logout |
### 7.2 Database Optimization
**Indexes:**
- Foreign Keys (Auto-indexed)
- Search Columns (`idx_cor_project`, `idx_rfa_status`, etc.)
- JSON Virtual Columns (for frequently queried JSON fields)
**Partitioning:**
- `audit_logs`: Partitioned by Year
- `notifications`: Partitioned by Month
- Automated Partition Creation (Cron Job)
**Query Optimization:**
- Use Views for Complex Queries (`v_current_correspondences`, `v_user_tasks`)
- Pagination for Large Datasets
- Eager/Lazy Loading Strategy
### 7.3 Performance Targets
| Metric | Target | Measurement |
| :---------------------------------- | :------ | :------------- |
| API Response Time (90th percentile) | < 200ms | Simple CRUD |
| Search Query Performance | < 500ms | Complex Search |
| File Upload Processing | < 30s | 50MB file |
| Concurrent Users | 100+ | Simultaneous |
| Cache Hit Ratio | > 80% | Master Data |
| Application Startup | < 30s | Cold Start |
## 8. 🔄 Resilience & Error Handling
### 8.1 Resilience Patterns
**Circuit Breaker:**
- Applied to: Elasticsearch, Email Service, LINE Notify
- Threshold: 5 failures in 1 minute
- Timeout: 30 seconds
- Recovery: Half-open after 1 minute
**Retry Mechanism:**
- Strategy: Exponential Backoff
- Max Retries: 3
- Applied to: External API Calls, Document Numbering
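The backoff schedule and a generic retry wrapper can be sketched as follows; the base delay and cap are assumptions, not values from this spec:

```typescript
// Exponential backoff: attempt 0, 1, 2 → 500ms, 1000ms, 2000ms, capped.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 10_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Re-attempts the operation up to maxRetries times after the first failure,
// sleeping the backoff delay between attempts.
async function withRetry<T>(
  op: () => Promise<T>,
  maxRetries = 3,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms)),
): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await op();
    } catch (err) {
      lastErr = err;
      if (attempt < maxRetries) await sleep(backoffDelayMs(attempt));
    }
  }
  throw lastErr;
}
```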
**Graceful Degradation:**
- Search Service Down → Return cached results or basic search
- Email Service Down → Queue for later retry
- LINE Notify Down → Log error, continue operation
### 8.2 Error Handling
**Backend:**
- Global Exception Filter
- Structured Error Response Format
- Error Logging with Context (Winston)
- Don't Expose Internal Details in Error Messages
**Frontend:**
- Error Boundaries (React)
- Toast Notifications
- Fallback UI Components
- Retry Mechanisms for Failed Requests
## 9. 📊 Monitoring & Observability
### 9.1 Health Checks
**Endpoints:**
```
GET /health # Overall health
GET /health/ready # Readiness probe
GET /health/live # Liveness probe
```
**Checks:**
- Database Connection
- Redis Connection
- Elasticsearch Connection
- Disk Space
- Memory Usage
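The overall `/health` verdict can be reduced to a small aggregation over the individual checks; a sketch, assuming the endpoint fails closed (any failed check makes the whole response unhealthy):

```typescript
type CheckResult = "ok" | "error";

// "ok" only when every dependency check (database, redis, elasticsearch,
// disk, memory) passes; otherwise "error".
function overallHealth(checks: Record<string, CheckResult>): CheckResult {
  return Object.values(checks).every((c) => c === "ok") ? "ok" : "error";
}
```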
### 9.2 Metrics Collection
**Application Metrics:**
- Request Rate (req/sec)
- Response Time (p50, p90, p99)
- Error Rate
- Active Connections
**Business Metrics:**
- Documents Created per Day
- Workflow Completion Rate
- User Activity
- Search Query Performance
**Infrastructure Metrics:**
- CPU Usage
- Memory Usage
- Disk I/O
- Network Throughput
### 9.3 Logging Strategy
> [!WARNING]
> **QNAP Storage Constraints:** log volume must be strictly limited.
**Log Levels:**
- **Production:** WARN and ERROR only
- **Staging:** INFO for critical business flows
- **Development:** DEBUG allowed
**Structured Logging:**
```json
{
"timestamp": "2025-11-30T13:48:20Z",
"level": "INFO",
"service": "backend",
"module": "CorrespondenceModule",
"action": "create",
"user_id": 1,
"ip_address": "192.168.1.100",
"duration_ms": 45,
"message": "Correspondence created successfully"
}
```
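A minimal builder for entries in this shape (illustrative; in practice this would be a Winston format function rather than a standalone helper):

```typescript
// Field names follow the structured-logging example above.
interface LogEntry {
  timestamp: string;
  level: "DEBUG" | "INFO" | "WARN" | "ERROR";
  service: string;
  module: string;
  action: string;
  user_id?: number;
  duration_ms?: number;
  message: string;
}

// Stamps the entry with the current time; everything else is caller-supplied.
function makeLogEntry(partial: Omit<LogEntry, "timestamp">): LogEntry {
  return { timestamp: new Date().toISOString(), ...partial };
}
```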
**Log Rotation:**
- Rotate Daily
- Keep 7 days
- Compress Old Logs
### 9.4 Audit Logging
**Scope:**
- All CRUD Operations on Critical Data
- Permission Changes
- Login Attempts (Success/Failure)
- File Downloads
- Workflow State Changes
**Audit Log Fields:**
- `user_id`
- `action` (e.g., `correspondence.create`)
- `entity_type`, `entity_id`
- `old_values`, `new_values` (for updates)
- `ip_address`, `user_agent`
- `timestamp`
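For updates, `old_values`/`new_values` can be reduced to just the fields that actually changed; a shallow-comparison sketch (the helper name is illustrative, not an existing service API):

```typescript
// Shallow diff for update audit records: keep only fields whose value changed.
function auditDiff(
  before: Record<string, unknown>,
  after: Record<string, unknown>,
): { old_values: Record<string, unknown>; new_values: Record<string, unknown> } {
  const old_values: Record<string, unknown> = {};
  const new_values: Record<string, unknown> = {};
  for (const key of Object.keys({ ...before, ...after })) {
    if (before[key] !== after[key]) {
      old_values[key] = before[key];
      new_values[key] = after[key];
    }
  }
  return { old_values, new_values };
}
```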
## 10. 💾 Backup & Disaster Recovery
### 10.1 Backup Strategy
**Database Backup:**
- **Frequency:** Daily (Automated)
- **Method:** Full Backup + Transaction Logs
- **Retention:** 30 days
- **Tool:** QNAP HBS 3 or mysqldump
**File Storage Backup:**
- **Frequency:** Daily
- **Path:** `/share/dms-data`
- **Method:** Incremental Backup
- **Retention:** 30 days
- **Tool:** QNAP Snapshot or rsync
### 10.2 Disaster Recovery
**Recovery Objectives:**
- **RTO (Recovery Time Objective):** < 4 hours
- **RPO (Recovery Point Objective):** < 1 hour
**Recovery Procedures:**
1. **Database Restoration:**
- Restore latest full backup
- Apply transaction logs to point-in-time
- Verify data integrity
2. **File Storage Restoration:**
- Restore from QNAP snapshot
- Verify file permissions
3. **Application Redeployment:**
- Deploy from known-good Docker images
- Verify health checks
4. **Data Integrity Verification:**
- Run consistency checks
- Verify critical business data
## 11. 🏗️ Deployment Architecture
### 11.1 Container Deployment
**Docker Compose Services:**
```yaml
services:
  frontend:
    image: lcbp3-frontend:latest
    networks: [lcbp3]
    depends_on: [backend]
  backend:
    image: lcbp3-backend:latest
    networks: [lcbp3]
    depends_on: [mariadb, redis, elasticsearch]
  mariadb:
    image: mariadb:10.11
    networks: [lcbp3]
    volumes: ['/share/dms-data/mysql:/var/lib/mysql']
  redis:
    image: redis:7-alpine
    networks: [lcbp3]
  elasticsearch:
    image: elasticsearch:8.x
    networks: [lcbp3]
  nginx-proxy-manager:
    image: jc21/nginx-proxy-manager:latest
    networks: [lcbp3]
    ports: ['80:80', '443:443']
```
### 11.2 CI/CD Pipeline (Future)
```mermaid
graph LR
Git[Gitea Repository] --> Build[Build & Test]
Build --> StagingDeploy[Deploy to Staging]
StagingDeploy --> Test[Run E2E Tests]
Test --> Manual[Manual Approval]
Manual --> ProdDeploy[Deploy to Production]
ProdDeploy --> Monitor[Monitor & Alert]
```
## 12. 🎯 Future Enhancements
### 12.1 Scalability Improvements
- [ ] Separate into Microservices (when needed)
- [ ] Add Load Balancer (HAProxy/Nginx)
- [ ] Database Replication (Master-Slave)
- [ ] Message Queue (RabbitMQ/Kafka) for async processing
### 12.2 Advanced Features
- [ ] AI-Powered Document Classification
- [ ] Advanced Analytics & Reporting
- [ ] Mobile Native Apps
- [ ] Blockchain Integration for Document Integrity
### 12.3 Infrastructure Enhancements
- [ ] Multi-Region Deployment
- [ ] CDN for Static Assets
- [ ] Automated Failover
- [ ] Blue-Green Deployment
---
**Document Control:**
- **Version:** 1.6.2
- **Status:** Active
- **Last Updated:** 2025-12-17
- **Owner:** Nattanin Peancharoen
# 🌐 API Design Specification
---
title: 'API Design'
version: 1.7.0
status: active
owner: Nattanin Peancharoen
last_updated: 2025-12-18
related:
- specs/01-requirements/01-02-architecture.md
- specs/02-architecture/02-01-system-architecture.md
- specs/03-implementation/03-01-fullftack-js-v1.7.0.md
---
## 📋 Overview
This document defines the API design standards for the LCBP3-DMS system, following an **API-First Design** approach that emphasizes clarity, consistency, and security.
## 🎯 API Design Principles
### 1.1 API-First Approach
- **Design the API Before Implementing:** define API endpoints and data contracts clearly before writing any code
- **Documentation-Driven:** OpenAPI/Swagger is the primary reference documentation
- **Contract Testing:** the API is tested against the defined contract
### 1.2 RESTful Principles
- Use HTTP methods correctly: `GET`, `POST`, `PUT`, `PATCH`, `DELETE`
- Use appropriate HTTP status codes
- Resource-Based URL Design
- Stateless Communication
### 1.3 Consistency & Predictability
- **Naming Conventions:** use `kebab-case` for URL paths
- **Property Naming:** use `camelCase` for JSON properties and query parameters (consistent with TypeScript/JavaScript conventions)
- **Database Columns:** the database uses `snake_case` (mapped via TypeORM decorators)
- **Versioning:** the API is versioned via the URL path (`/api/v1/...`)
## 🔐 Authentication & Authorization
### 2.1 Authentication
- **JWT-Based Authentication:** JSON Web Tokens are used for identity verification
- **Token Management:**
  - Access Token Expiration: 8 hours
  - Refresh Token Expiration: 7 days
  - Token Rotation: refresh tokens are rotated on use
  - Token Revocation: revoked tokens are recorded until they expire
**Endpoints:**
```typescript
POST /api/v1/auth/login
POST /api/v1/auth/logout
POST /api/v1/auth/refresh
POST /api/v1/auth/change-password
```
### 2.2 Authorization (RBAC)
- **4-Level Permission Hierarchy:**
1. **Global Level:** System-wide permissions (Superadmin)
2. **Organization Level:** Organization-specific permissions
3. **Project Level:** Project-specific permissions
4. **Contract Level:** Contract-specific permissions
- **Permission Checking:** use the `@RequirePermission('resource.action')` decorator
**Example:**
```typescript
@RequirePermission('correspondence.create')
@Post('correspondences')
async createCorrespondence(@Body() dto: CreateCorrespondenceDto) {
// Implementation
}
```
### 2.3 Token Payload Optimization
- The JWT payload stores only the `userId` and the current `scope`
- **Permissions Caching:** the permission list is stored in Redis and fetched for checking on each request
## 📡 API Conventions
### 3.1 Base URL Structure
```
https://backend.np-dms.work/api/v1/{resource}
```
### 3.2 HTTP Methods & Usage
| Method | Usage | Idempotent | Example |
| :------- | :--------------------------- | :--------- | :----------------------------------- |
| `GET` | Read data | ✅ Yes | `GET /api/v1/correspondences` |
| `POST` | Create a resource | ❌ No\* | `POST /api/v1/correspondences` |
| `PUT` | Full update | ✅ Yes | `PUT /api/v1/correspondences/:id` |
| `PATCH` | Partial update | ✅ Yes | `PATCH /api/v1/correspondences/:id` |
| `DELETE` | Delete (soft delete) | ✅ Yes | `DELETE /api/v1/correspondences/:id` |
**Note:** `POST` can be made idempotent by sending an `Idempotency-Key` header
### 3.3 HTTP Status Codes
| Status Code | Usage |
| :-------------------------- | :----------------------------- |
| `200 OK` | Request succeeded (GET, PUT, PATCH) |
| `201 Created` | Resource created (POST) |
| `204 No Content` | Deletion succeeded (DELETE) |
| `400 Bad Request` | Malformed request payload |
| `401 Unauthorized` | Missing or expired token |
| `403 Forbidden` | Insufficient permissions |
| `404 Not Found` | Resource not found |
| `409 Conflict` | Duplicate data or state conflict |
| `422 Unprocessable Entity` | Validation Error |
| `429 Too Many Requests` | Rate Limit Exceeded |
| `500 Internal Server Error` | Server Error |
| `503 Service Unavailable` | Maintenance Mode |
### 3.4 Request & Response Format
**Request Headers:**
```http
Content-Type: application/json
Authorization: Bearer <access_token>
Idempotency-Key: <uuid> # POST/PUT/DELETE
```
**Success Response Format:**
```typescript
{
"success": true,
"data": {
// Resource data
},
"message": "Operation completed successfully"
}
```
**Error Response Format:**
```typescript
{
"success": false,
"error": {
"code": "VALIDATION_ERROR",
"message": "Validation failed",
"details": [
{
"field": "email",
"message": "Invalid email format"
}
]
},
"timestamp": "2025-11-30T13:48:20Z",
"path": "/api/v1/users"
}
```
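A hedged builder for this error envelope (field names follow the example above; the helper itself is illustrative):

```typescript
interface FieldError {
  field: string;
  message: string;
}

// Assembles the standard error response body; timestamp is stamped here.
function errorResponse(
  code: string,
  message: string,
  path: string,
  details: FieldError[] = [],
) {
  return {
    success: false as const,
    error: { code, message, details },
    timestamp: new Date().toISOString(),
    path,
  };
}
```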
## 🔄 Idempotency
### 4.1 Implementation
- **Every critical operation** (create, update, delete) must support idempotency
- The client sends the header `Idempotency-Key: <uuid>`
- The server checks whether this key has already been processed successfully
- If it has, the original result is returned (the operation is not repeated)
**Example:**
```http
POST /api/v1/correspondences
Idempotency-Key: 550e8400-e29b-41d4-a716-446655440000
Content-Type: application/json
{
"title": "New Correspondence",
"type_id": 1
}
```
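The server-side check can be sketched with an in-memory store; production would use Redis keyed on the UUID with a TTL, and the names here are illustrative:

```typescript
// Maps Idempotency-Key → previously returned result.
const processed = new Map<string, unknown>();

// Runs the operation once per key; repeats replay the stored result.
function handleIdempotent<T>(key: string, run: () => T): T {
  if (processed.has(key)) return processed.get(key) as T;
  const result = run();
  processed.set(key, result);
  return result;
}
```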
## 📊 Pagination, Filtering & Sorting
### 5.1 Pagination (Server-Side)
**Query Parameters:**
```
GET /api/v1/correspondences?page=1&page_size=20
```
**Response:**
```typescript
{
"success": true,
"data": [...],
"meta": {
"current_page": 1,
"page_size": 20,
"total_items": 150,
"total_pages": 8,
"has_next_page": true,
"has_previous_page": false
}
}
```
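The `meta` block can be derived from three inputs; a sketch matching the field names above:

```typescript
// Computes the pagination meta block for a server-side paginated response.
function paginationMeta(totalItems: number, page: number, pageSize: number) {
  const totalPages = Math.max(1, Math.ceil(totalItems / pageSize));
  return {
    current_page: page,
    page_size: pageSize,
    total_items: totalItems,
    total_pages: totalPages,
    has_next_page: page < totalPages,
    has_previous_page: page > 1,
  };
}
```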
### 5.2 Filtering
```
GET /api/v1/correspondences?project_id=1&status=PENDING
```
### 5.3 Sorting
```
GET /api/v1/correspondences?sort=createdAt&order=desc
```
### 5.4 Combined Example
```
GET /api/v1/correspondences?project_id=1&status=PENDING&page=1&page_size=20&sort=createdAt&order=desc
```
## 🛡️ Security Features
### 6.1 Rate Limiting
| Endpoint Type | Limit | Scope |
| :------------------ | :----------------- | :---- |
| Anonymous Endpoints | 100 requests/hour | IP |
| Viewer | 500 requests/hour | User |
| Editor | 1000 requests/hour | User |
| Document Control | 2000 requests/hour | User |
| Admin/Superadmin | 5000 requests/hour | User |
| File Upload | 50 requests/hour | User |
| Search | 500 requests/hour | User |
| Authentication | 10 requests/minute | IP |
**Rate Limit Headers:**
```http
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1638360000
```
### 6.2 Input Validation
- **DTOs with class-validator:** every request must pass validation
- **XSS Protection:** input sanitization
- **SQL Injection Prevention:** parameterized queries via the ORM (TypeORM)
- **CSRF Protection:** CSRF tokens for state-changing operations
### 6.3 File Upload Security
**Endpoint:**
```
POST /api/v1/files/upload
```
**Security Measures:**
- **Virus Scanning:** every uploaded file is scanned with ClamAV
- **File Type Validation:** White-list (PDF, DWG, DOCX, XLSX, ZIP)
- **File Size Limit:** 50MB per file
- **Two-Phase Storage:**
1. Upload to `temp/` folder
2. Commit to `permanent/{YYYY}/{MM}/` when operation succeeds
**Response:**
```typescript
{
"success": true,
"data": {
"temp_id": "uuid",
"filename": "document.pdf",
"size": 1024000,
"mime_type": "application/pdf",
"scan_status": "CLEAN"
}
}
```
## 📦 Core Module APIs
### 7.1 Correspondence Module
**Base Path:** `/api/v1/correspondences`
| Method | Endpoint | Permission | Description |
| :----- | :--------------------------------- | :---------------------- | :-------------------- |
| GET | `/correspondences` | `correspondence.view` | List correspondences |
| GET | `/correspondences/:id` | `correspondence.view` | Get details |
| POST | `/correspondences` | `correspondence.create` | Create |
| PUT | `/correspondences/:id` | `correspondence.update` | Full update |
| PATCH | `/correspondences/:id` | `correspondence.update` | Partial update |
| DELETE | `/correspondences/:id` | `correspondence.delete` | Soft delete |
| POST | `/correspondences/:id/revisions` | `correspondence.update` | Create a new revision |
| GET | `/correspondences/:id/revisions` | `correspondence.view` | List all revisions |
| POST | `/correspondences/:id/attachments` | `correspondence.update` | Add attachments |
### 7.2 RFA Module
**Base Path:** `/api/v1/rfas`
| Method | Endpoint | Permission | Description |
| :----- | :-------------------- | :------------- | :---------------- |
| GET | `/rfas` | `rfas.view` | List RFAs |
| GET | `/rfas/:id` | `rfas.view` | Get details |
| POST | `/rfas` | `rfas.create` | Create |
| PUT | `/rfas/:id` | `rfas.update` | Update |
| DELETE | `/rfas/:id` | `rfas.delete` | Delete |
| POST | `/rfas/:id/respond` | `rfas.respond` | Respond to an RFA |
| POST | `/rfas/:id/approve` | `rfas.approve` | Approve an RFA |
| POST | `/rfas/:id/revisions` | `rfas.update` | Create a revision |
| GET | `/rfas/:id/workflow` | `rfas.view` | View workflow status |
### 7.3 Drawing Module
**Base Path:** `/api/v1/drawings`
**Shop Drawings:**
| Method | Endpoint | Permission | Description |
| :----- | :----------------------------- | :---------------- | :------------------ |
| GET | `/shop-drawings` | `drawings.view` | List shop drawings |
| POST | `/shop-drawings` | `drawings.upload` | Upload a new drawing |
| GET | `/shop-drawings/:id/revisions` | `drawings.view` | List revisions |
**Contract Drawings:**
| Method | Endpoint | Permission | Description |
| :----- | :------------------- | :---------------- | :---------------------- |
| GET | `/contract-drawings` | `drawings.view` | List contract drawings |
| POST | `/contract-drawings` | `drawings.upload` | Upload a new drawing |
### 7.4 Project Module
**Base Path:** `/api/v1/projects`
| Method | Endpoint | Permission | Description |
| :----- | :------------------------ | :----------------------- | :---------------- |
| GET | `/projects` | `projects.view` | List projects |
| GET | `/projects/:id` | `projects.view` | Get details |
| POST | `/projects` | `projects.create` | Create a project |
| PUT | `/projects/:id` | `projects.update` | Update |
| POST | `/projects/:id/contracts` | `contracts.create` | Create a contract |
| GET | `/projects/:id/parties` | `projects.view` | List project parties |
| POST | `/projects/:id/parties` | `project_parties.manage` | Add a party |
### 7.5 User & Auth Module
**Base Path:** `/api/v1/users`, `/api/v1/auth`
**Authentication:**
```typescript
POST /api/v1/auth/login
POST /api/v1/auth/logout
POST /api/v1/auth/refresh
POST /api/v1/auth/change-password
POST /api/v1/auth/reset-password
```
**User Management:**
```typescript
GET /api/v1/users # List users
GET /api/v1/users/:id # User details
POST /api/v1/users # Create user
PUT /api/v1/users/:id # Update user
DELETE /api/v1/users/:id # Delete user
POST /api/v1/users/:id/roles # Assign roles
GET /api/v1/users/me # Current user info
GET /api/v1/users/me/permissions # Current user permissions
```
### 7.6 Search Module
**Base Path:** `/api/v1/search`
```typescript
GET /api/v1/search?q=<query>&type=<correspondence|rfa|drawing>&project_id=<id>
```
**Response:**
```typescript
{
"success": true,
"data": {
"results": [...],
"aggregations": {
"by_type": { "correspondence": 10, "rfa": 5 },
"by_status": { "PENDING": 8, "APPROVED": 7 }
}
},
"meta": {
"total": 15,
"took_ms": 45
}
}
```
## 🔔 Notification API
**Base Path:** `/api/v1/notifications`
```typescript
GET /api/v1/notifications # List notifications
GET /api/v1/notifications/:id # Notification details
PATCH /api/v1/notifications/:id/read # Mark as read
DELETE /api/v1/notifications/:id # Delete notification
```
## 📈 Reporting & Export APIs
**Base Path:** `/api/v1/reports`
```typescript
GET /api/v1/reports/correspondences?format=csv&project_id=1&from=2025-01-01&to=2025-12-31
GET /api/v1/reports/rfas?format=excel&project_id=1
GET /api/v1/reports/dashboard # Dashboard KPIs
```
**Supported Formats:**
- `csv` - CSV file
- `excel` - Excel file (.xlsx)
- `pdf` - PDF file
## 🔧 Workflow Engine API
**Base Path:** `/api/v1/workflows`
```typescript
GET /api/v1/workflows/definitions # List workflow definitions
GET /api/v1/workflows/definitions/:id # Definition details
POST /api/v1/workflows/instances # Create workflow instance
GET /api/v1/workflows/instances/:id # Instance details
POST /api/v1/workflows/instances/:id/transition # Execute transition
GET /api/v1/workflows/instances/:id/history # View history
```
## ⚡ Performance Optimization
### 11.1 Caching Strategy
**Cache Headers:**
```http
Cache-Control: max-age=3600, private
ETag: "33a64df551425fcc55e4d42a148795d9f25f89d4"
```
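One way to produce a stable `ETag` is a digest of the serialized response body; the SHA-1 choice here is an assumption (any stable hash works):

```typescript
import { createHash } from "node:crypto";

// Deterministic ETag: identical bodies always yield the identical tag.
function etagFor(body: unknown): string {
  return createHash("sha1").update(JSON.stringify(body)).digest("hex");
}
```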
**Cache TTL:**
- Master Data: 1 hour
- User Sessions: 30 minutes
- Search Results: 15 minutes
- File Metadata: 1 hour
### 11.2 Response Compression
```http
Accept-Encoding: gzip, deflate, br
Content-Encoding: gzip
```
## 🧪 Testing & Documentation
### 12.1 API Documentation
- **Swagger/OpenAPI:** auto-generated from NestJS decorators
- **URL:** `https://backend.np-dms.work/api/docs`
### 12.2 Testing Strategy
- **Unit Tests:** Test individual controllers & services
- **Integration Tests:** Test API endpoints with database
- **E2E Tests:** Test complete user flows
- **Contract Tests:** Verify API contracts
## 🚦 Health Check & Monitoring
```typescript
GET /health # Health check endpoint
GET /health/ready # Readiness probe
GET /health/live # Liveness probe
GET /metrics # Prometheus metrics
```
**Response:**
```typescript
{
"status": "ok",
"uptime": 86400,
"checks": {
"database": "ok",
"redis": "ok",
"elasticsearch": "ok"
}
}
```
## 📝 API Versioning Strategy
### 14.1 Versioning Approach
- **URL-Based Versioning:** `/api/v1/...`, `/api/v2/...`
- **Backward Compatibility:** support at least one previous API version
- **Deprecation Headers:**
```http
X-API-Deprecation-Warning: This endpoint will be deprecated on 2026-01-01
X-API-Deprecation-Info: https://docs.np-dms.work/migration/v2
```
## 🎯 Best Practices Summary
1. **Use DTOs to validate every request**
2. **Send an `Idempotency-Key` for critical operations**
3. **Use proper HTTP status codes**
4. **Respect rate limits on the client side**
5. **Handle errors gracefully**
6. **Cache frequently-accessed data**
7. **Use pagination for large datasets**
8. **Document every endpoint with Swagger**
---
**Document Control:**
- **Version:** 1.7.0
- **Status:** Active
- **Last Updated:** 2025-12-18
- **Owner:** Nattanin Peancharoen
# 02.2 Network Infrastructure & Security
---
title: 'Network Infrastructure'
version: 1.8.0
status: first-draft
owner: Nattanin Peancharoen
last_updated: 2026-02-23
related:
- specs/02-Architecture/00-01-system-context.md
---
## 1. 🌐 Network Configuration Details
### 1.1 VLAN Networks
| VLAN ID | Name | Purpose | Gateway/Subnet | DHCP | IP Range | DNS | Lease Time | Notes |
| ------- | ------ | --------- | --------------- | ---- | ------------------ | ------- | ---------- | --------------- |
| 10 | SERVER | Interface | 192.168.10.1/24 | No | - | Custom | - | Static servers |
| 20 | MGMT | Interface | 192.168.20.1/24 | No | - | Custom | - | Management only |
| 30 | USER | Interface | 192.168.30.1/24 | Yes | 192.168.30.10-254 | Auto | 7 Days | User devices |
| 40 | CCTV | Interface | 192.168.40.1/24 | Yes | 192.168.40.100-150 | Auto | 7 Days | CCTV & NVR |
| 50 | VOICE | Interface | 192.168.50.1/24 | Yes | 192.168.50.201-250 | Auto | 7 Days | IP Phones |
| 60 | DMZ | Interface | 192.168.60.1/24 | No | - | 1.1.1.1 | - | Public services |
| 70 | GUEST | Interface | 192.168.70.1/24 | Yes | 192.168.70.200-250 | Auto | 1 Day | Guest |
### 1.2 Switch Profiles
| Profile Name | Native Network | Tagged Networks | Untagged Networks | Usage |
| ---------------- | -------------- | --------------------- | ----------------- | ----------------------- |
| 01_CORE_TRUNK | MGMT (20) | 10,30,40,50,60,70 | MGMT (20) | Router & switch uplinks |
| 02_MGMT_ONLY | MGMT (20) | MGMT (20) | - | Management only |
| 03_SERVER_ACCESS | SERVER (10) | MGMT (20) | SERVER (10) | QNAP / ASUSTOR |
| 04_CCTV_ACCESS | CCTV (40) | - | CCTV (40) | CCTV cameras |
| 05_USER_ACCESS | USER (30) | - | USER (30) | PC / Printer |
| 06_AP_TRUNK | MGMT (20) | USER (30), GUEST (70) | MGMT (20) | EAP610 Access Points |
| 07_VOICE_ACCESS | USER (30) | VOICE (50) | USER (30) | IP Phones |
### 1.3 NAS NIC Bonding Configuration
| Device | Bonding Mode | Member Ports | VLAN Mode | Tagged VLAN | IP Address | Gateway | Notes |
| ------- | ------------------- | ------------ | --------- | ----------- | --------------- | ------------ | ---------------------- |
| QNAP | IEEE 802.3ad (LACP) | Adapter 1, 2 | Untagged | 10 (SERVER) | 192.168.10.8/24 | 192.168.10.1 | Primary NAS for DMS |
| ASUSTOR | IEEE 802.3ad (LACP) | Port 1, 2 | Untagged | 10 (SERVER) | 192.168.10.9/24 | 192.168.10.1 | Backup / Secondary NAS |
> **Note:** Both NAS devices use LACP bonding for extra bandwidth and redundancy; the bond must be configured to match the switch side.
## 2. 🛡️ Network ACLs & Firewall Rules
### 2.1 Gateway ACL (ER7206 Firewall Rules)
*Inter-VLAN Routing Policy*
| # | Name | Source | Destination | Service | Action | Log | Notes |
| --- | ----------------- | --------------- | ---------------- | -------------- | ------ | --- | --------------------------- |
| 1 | MGMT-to-ALL | VLAN20 (MGMT) | Any | Any | Allow | No | Admin full access |
| 2 | SERVER-to-ALL | VLAN10 (SERVER) | Any | Any | Allow | No | Servers outbound access |
| 3 | USER-to-SERVER | VLAN30 (USER) | VLAN10 (SERVER) | HTTP/HTTPS/SSH | Allow | No | Users access web apps |
| 4 | USER-to-DMZ | VLAN30 (USER) | VLAN60 (DMZ) | HTTP/HTTPS | Allow | No | Users access DMZ services |
| 5 | USER-to-MGMT | VLAN30 (USER) | VLAN20 (MGMT) | Any | Deny | Yes | Block users from management |
| 6 | USER-to-CCTV | VLAN30 (USER) | VLAN40 (CCTV) | Any | Deny | Yes | Isolate CCTV |
| 7 | USER-to-VOICE | VLAN30 (USER) | VLAN50 (VOICE) | Any | Deny | No | Isolate Voice |
| 8 | USER-to-GUEST | VLAN30 (USER) | VLAN70 (GUEST) | Any | Deny | No | Isolate Guest |
| 9 | CCTV-to-INTERNET | VLAN40 (CCTV) | WAN | HTTPS (443) | Allow | No | NVR cloud backup (optional) |
| 10 | CCTV-to-ALL | VLAN40 (CCTV) | Any (except WAN) | Any | Deny | Yes | CCTV isolated |
| 11 | DMZ-to-ALL | VLAN60 (DMZ) | Any (internal) | Any | Deny | Yes | DMZ cannot reach internal |
| 12 | GUEST-to-INTERNET | VLAN70 (GUEST) | WAN | HTTP/HTTPS/DNS | Allow | No | Guest internet only |
| 13 | GUEST-to-ALL | VLAN70 (GUEST) | Any (internal) | Any | Deny | Yes | Guest isolated |
| 99 | DEFAULT-DENY | Any | Any | Any | Deny | Yes | Catch-all deny |
*WAN Inbound Rules (Port Forwarding)*
| # | Name | WAN Port | Internal IP | Internal Port | Protocol | Notes |
| --- | --------- | -------- | ------------ | ------------- | -------- | ------------------- |
| 1 | HTTPS-NPM | 443 | 192.168.10.8 | 443 | TCP | Nginx Proxy Manager |
| 2 | HTTP-NPM | 80 | 192.168.10.8 | 80 | TCP | HTTP redirect |
### 2.2 Switch ACL (Layer 2 Rules)
*Port-Based Access Control*
| # | Name | Source Port | Source MAC/VLAN | Destination | Action | Notes |
| --- | --------------- | --------------- | --------------- | ------------------- | ------ | ------------------------ |
| 1 | CCTV-Isolation | Port 25 (CCTV) | VLAN 40 | VLAN 10,20,30 | Deny | CCTV cannot reach others |
| 2 | Guest-Isolation | Port 5-20 (APs) | VLAN 70 | VLAN 10,20,30,40,50 | Deny | Guest isolation |
### 2.3 EAP ACL (Wireless Rules)
*SSID: PSLCBP3 (Staff WiFi)*
| # | Name | Source | Destination | Service | Action | Notes |
| --- | ------------------- | ---------- | ---------------- | -------- | ------ | ----------------- |
| 1 | Allow-DNS | Any Client | 8.8.8.8, 1.1.1.1 | DNS (53) | Allow | DNS resolution |
| 2 | Allow-Server | Any Client | 192.168.10.0/24 | Any | Allow | Access to servers |
| 3 | Allow-Printer | Any Client | 192.168.30.222 | 9100,631 | Allow | Print services |
| 4 | Allow-Internet | Any Client | WAN | Any | Allow | Internet access |
| 5 | Block-MGMT | Any Client | 192.168.20.0/24 | Any | Deny | No management |
| 6 | Block-CCTV | Any Client | 192.168.40.0/24 | Any | Deny | No CCTV access |
| 8 | Block-Client2Client | Any Client | Any Client | Any | Deny | Client isolation |
*SSID: GUEST (Guest WiFi)*
| # | Name | Source | Destination | Service | Action | Notes |
| --- | ------------------- | ---------- | ---------------- | ---------- | ------ | ------------------ |
| 1 | Allow-DNS | Any Client | 8.8.8.8, 1.1.1.1 | DNS (53) | Allow | DNS resolution |
| 2 | Allow-HTTP | Any Client | WAN | HTTP/HTTPS | Allow | Web browsing |
| 3 | Block-RFC1918 | Any Client | 10.0.0.0/8 | Any | Deny | No private IPs |
| 4 | Block-RFC1918-2 | Any Client | 172.16.0.0/12 | Any | Deny | No private IPs |
| 5 | Block-RFC1918-3 | Any Client | 192.168.0.0/16 | Any | Deny | No internal access |
| 6 | Block-Client2Client | Any Client | Any Client | Any | Deny | Client isolation |
## 3. 📈 Network Topology Diagram
```mermaid
graph TB
subgraph Internet
WAN[("🌐 Internet<br/>WAN")]
end
subgraph Router["ER7206 Router"]
R[("🔲 ER7206<br/>192.168.20.1")]
end
subgraph CoreSwitch["SG2428P Core Switch"]
CS[("🔲 SG2428P<br/>192.168.20.2")]
end
subgraph ServerSwitch["AMPCOM 2.5G Switch"]
SS[("🔲 AMPCOM<br/>192.168.20.3")]
end
subgraph Servers["VLAN 10 - Servers"]
QNAP[("💾 QNAP<br/>192.168.10.8")]
ASUSTOR[("💾 ASUSTOR<br/>192.168.10.9")]
end
subgraph AccessPoints["EAP610 x16"]
AP[("📶 WiFi APs")]
end
WAN --> R
R -->|Port 3| CS
CS -->|LAG Port 3-4| SS
SS -->|Port 3-4 LACP| QNAP
SS -->|Port 5-6 LACP| ASUSTOR
CS -->|Port 5-20| AP
```
# Data Model Architecture
---
title: 'Data Model Architecture'
version: 1.5.0
status: first-draft
owner: Nattanin Peancharoen
last_updated: 2025-11-30
related:
- specs/01-requirements/02-architecture.md
- specs/01-requirements/03-functional-requirements.md
- docs/4_Data_Dictionary_V1_4_5.md
- docs/8_lcbp3_v1_4_5.sql
---
## 📋 Overview
This document describes the data model architecture for the LCBP3-DMS system, covering the database structure, table relationships, and key design principles.
## 🎯 Design Principles
### 1. Separation of Concerns
- **Master-Revision Pattern**: separate immutable data (Master) from data that changes across revisions (Revisions)
- `correspondences` (Master) ↔ `correspondence_revisions` (Revisions)
- `rfas` (Master) ↔ `rfa_revisions` (Revisions)
- `shop_drawings` (Master) ↔ `shop_drawing_revisions` (Revisions)
### 2. Data Integrity
- **Foreign Key Constraints**: FKs on every relationship preserve referential integrity
- **Soft Delete**: use `deleted_at` instead of physically deleting rows, preserving history
- **Optimistic Locking**: a `version` column on `document_number_counters` prevents race conditions
### 3. Flexibility & Extensibility
- **JSON Details Field**: type-specific data lives in `correspondence_revisions.details`
- **Virtual Columns**: indexes are built from JSON fields via generated columns for performance
- **Master Data Tables**: master data (types, statuses, codes) is kept in separate tables for flexibility
### 4. Security & Audit
- **RBAC (Role-Based Access Control)**: permission system with hierarchical scopes
- **Audit Trail**: creator/updater and timestamps recorded in every table
- **Two-Phase File Upload**: temporary staging storage prevents orphaned files
## 🗂️ Database Schema Overview
### Entity Relationship Diagram
```mermaid
erDiagram
%% Core Entities
organizations ||--o{ users : "employs"
projects ||--o{ contracts : "contains"
projects ||--o{ correspondences : "manages"
%% RBAC
users ||--o{ user_assignments : "has"
roles ||--o{ user_assignments : "assigned_to"
roles ||--o{ role_permissions : "has"
permissions ||--o{ role_permissions : "granted_by"
%% Correspondences
correspondences ||--o{ correspondence_revisions : "has_revisions"
correspondence_types ||--o{ correspondences : "categorizes"
correspondence_status ||--o{ correspondence_revisions : "defines_state"
disciplines ||--o{ correspondences : "classifies"
%% RFAs
rfas ||--o{ rfa_revisions : "has_revisions"
rfa_types ||--o{ rfas : "categorizes"
rfa_status_codes ||--o{ rfa_revisions : "defines_state"
rfa_approve_codes ||--o{ rfa_revisions : "defines_result"
disciplines ||--o{ rfas : "classifies"
%% Drawings
shop_drawings ||--o{ shop_drawing_revisions : "has_revisions"
shop_drawing_main_categories ||--o{ shop_drawings : "categorizes"
shop_drawing_sub_categories ||--o{ shop_drawings : "sub_categorizes"
%% Attachments
attachments ||--o{ correspondence_attachments : "attached_to"
correspondences ||--o{ correspondence_attachments : "has"
```
## 📊 Data Model Categories
### 1. 🏢 Core & Master Data
#### 1.1 Organizations & Projects
**Tables:**
- `organization_roles` - organization roles (OWNER, DESIGNER, CONSULTANT, CONTRACTOR)
- `organizations` - all organizations in the system
- `projects` - projects
- `contracts` - contracts under a project
- `project_organizations` - M:N between projects and organizations
- `contract_organizations` - M:N between contracts and organizations, with a role
**Key Relationships:**
```
projects (1) ──→ (N) contracts
projects (N) ←→ (N) organizations [via project_organizations]
contracts (N) ←→ (N) organizations [via contract_organizations]
```
**Business Rules:**
- Organization codes must be unique system-wide
- A Contract must always belong to a Project (ON DELETE CASCADE)
- Soft delete uses the `is_active` flag
---
### 2. 👥 Users & RBAC
#### 2.1 User Management
**Tables:**
- `users` - system users
- `roles` - roles with a scope (Global, Organization, Project, Contract)
- `permissions` - permissions (49 in total)
- `role_permissions` - M:N mapping
- `user_assignments` - role assignments with a scope context
**Scope Hierarchy:**
```
Global (system-wide)
└─ Organization
   └─ Project
      └─ Contract
```
**Key Features:**
- **Hierarchical Scope**: a user can hold multiple roles across multiple scopes
- **Scope Inheritance**: permissions granted at a higher level cover the levels below
- **Account Security**: failed-login tracking, account locking, password hashing (bcrypt)
**Example User Assignment:**
```sql
-- User A is an Editor in organization TEAM
INSERT INTO user_assignments (user_id, role_id, organization_id)
VALUES (1, 4, 3);
-- User B is a Project Manager on project LCBP3
INSERT INTO user_assignments (user_id, role_id, project_id)
VALUES (2, 6, 1);
```
---
### 3. ✉️ Correspondences
#### 3.1 Master-Revision Pattern
**Master Table: `correspondences`**
Stores data that does not change:
- `correspondence_number` - document number (unique per project)
- `correspondence_type_id` - document type (RFA, RFI, TRANSMITTAL, etc.)
- `discipline_id` - discipline (GEN, STR, ARC, etc.) [NEW v1.4.5]
- `project_id`, `originator_id` - project and originating organization
**Revision Table: `correspondence_revisions`**
Stores data that can change:
- `revision_number` - revision number (0, 1, 2...)
- `is_current` - flag marking the current revision (UNIQUE constraint)
- `title`, `description` - document content
- `correspondence_status_id` - status (DRAFT, SUBOWN, REPCSC, etc.)
- `details` - JSON field for type-specific data
- Virtual Columns: `v_ref_project_id`, `v_ref_type`, `v_doc_subtype` (indexed)
**Supporting Tables:**
- `correspondence_types` - master document types (10 types)
- `correspondence_status` - master statuses (23 status codes)
- `correspondence_sub_types` - sub-types used for document numbering [NEW v1.4.5]
- `disciplines` - disciplines (GEN, STR, ARC, etc.) [NEW v1.4.5]
- `correspondence_recipients` - M:N recipients (TO/CC)
- `correspondence_tags` - M:N tags
- `correspondence_references` - M:N cross-references
**Example Query - Get Current Revision:**
```sql
SELECT c.correspondence_number, cr.title, cr.revision_label, cs.status_name
FROM correspondences c
JOIN correspondence_revisions cr ON c.id = cr.correspondence_id
JOIN correspondence_status cs ON cr.correspondence_status_id = cs.id
WHERE cr.is_current = TRUE
AND c.deleted_at IS NULL;
```
---
### 4. 📐 RFAs (Request for Approval)
#### 4.1 RFA Structure
**Master Table: `rfas`**
- `rfa_type_id` - RFA type (DWG, DOC, MAT, SPC, etc.)
- `discipline_id` - discipline [NEW v1.4.5]
**Revision Table: `rfa_revisions`**
- `correspondence_id` - link to the Correspondence (an RFA is a type of correspondence)
- `rfa_status_code_id` - status (DFT, FAP, FRE, FCO, ASB, OBS, CC)
- `rfa_approve_code_id` - approval result (1A, 1C, 1N, 1R, 3C, 3R, 4X, 5N)
- `approved_date` - approval date
**Supporting Tables:**
- `rfa_types` - 11 types (Shop Drawing, Document, Material, etc.)
- `rfa_status_codes` - 7 statuses
- `rfa_approve_codes` - 8 approval result codes
- `rfa_items` - M:N linking RFAs (type DWG) to Shop Drawing Revisions
**RFA Workflow States:**
```
DFT (Draft)
  ↓
FAP (For Approve) / FRE (For Review)
  ↓
[Approval Process]
  ↓
FCO (For Construction) / ASB (As-Built) / 3R (Revise) / 4X (Reject)
```
---
### 5. 📐 Drawings
#### 5.1 Contract Drawings
**Tables:**
- `contract_drawing_volumes` - drawing volumes
- `contract_drawing_cats` - main categories
- `contract_drawing_sub_cats` - sub-categories
- `contract_drawing_subcat_cat_maps` - M:N mapping
- `contract_drawings` - contract drawings
**Hierarchy:**
```
Volume - has volume_page
└─ Category (main category)
   └─ Sub-Category -- linked via map table
      └─ Drawing
```
#### 5.2 Shop Drawings
**Tables:**
- `shop_drawing_main_categories` - main categories (project-specific)
- `shop_drawing_sub_categories` - sub-categories (project-specific)
- `shop_drawings` - master shop drawings (number only, no title)
- `shop_drawing_revisions` - revisions (hold the title and legacy number)
- `shop_drawing_revision_contract_refs` - M:N references to contract drawings
**Revision Tracking:**
```sql
-- Get latest revision of a shop drawing
SELECT sd.shop_drawing_number, sdr.revision_label, sdr.revision_date
FROM shop_drawings sd
JOIN shop_drawing_revisions sdr ON sd.id = sdr.shop_drawing_id
WHERE sd.shop_drawing_number = 'SD-STR-001'
ORDER BY sdr.revision_number DESC
LIMIT 1;
```
#### 5.3 As Built Drawings [NEW v1.7.0]
**Tables:**
- `asbuilt_drawings` - master as-built drawings
- `asbuilt_drawing_revisions` - revision history
- `asbuilt_revision_shop_revisions_refs` - links to the source Shop Drawing Revisions
- `asbuilt_drawing_revision_attachments` - attachments (PDF/DWG)
**Business Rules:**
- One as-built drawing may derive from several shop drawings (many-to-many via the refs table)
- Maintains a counter distinct from Shop Drawings and Contract Drawings
- Supports multiple attachment types (PDF, DWG, SOURCE)
---
### 6. 🔄 Circulations & Transmittals
#### 6.1 Circulations
**Tables:**
- `circulation_status_codes` - statuses (OPEN, IN_REVIEW, COMPLETED, CANCELLED)
- `circulations` - circulation records (1:1 with a Correspondence)
**Workflow:**
```
OPEN → IN_REVIEW → COMPLETED
CANCELLED
```
#### 6.2 Transmittals
**Tables:**
- `transmittals` - transmittal data (1:1 with a Correspondence)
- `transmittal_items` - M:N list of documents being transmitted
**Purpose Types:**
- FOR_APPROVAL
- FOR_INFORMATION
- FOR_REVIEW
- OTHER
---
### 7. 📎 File Management
#### 7.1 Two-Phase Storage Pattern
**Table: `attachments`**
**Phase 1: Temporary Upload**
```sql
INSERT INTO attachments (
original_filename, stored_filename, file_path,
mime_type, file_size, is_temporary, temp_id,
uploaded_by_user_id, expires_at, checksum
)
VALUES (
'document.pdf', 'uuid-document.pdf', '/temp/uuid-document.pdf',
'application/pdf', 1024000, TRUE, 'temp-uuid-123',
1, NOW() + INTERVAL 1 HOUR, 'sha256-hash'
);
```
**Phase 2: Commit to Permanent**
```sql
-- Update attachment to permanent
UPDATE attachments
SET is_temporary = FALSE, expires_at = NULL
WHERE temp_id = 'temp-uuid-123';
-- Link to correspondence
INSERT INTO correspondence_attachments (correspondence_id, attachment_id, is_main_document)
VALUES (1, 123, TRUE);
```
**Junction Tables:**
- `correspondence_attachments` - M:N
- `circulation_attachments` - M:N
- `shop_drawing_revision_attachments` - M:N (with file_type)
- `contract_drawing_attachments` - M:N (with file_type)
**Security Features:**
- Checksum validation (SHA-256)
- Automatic cleanup of expired temporary files
- File type validation via `mime_type`
---
### 8. 🔢 Document Numbering
#### 8.1 Format & Counter System
**Tables:**
- `document_number_formats` - document number format templates
- `document_number_counters` - Running Number Counter with Optimistic Locking
**Format Template Example:**
```
{ORG_CODE}-{TYPE_CODE}-{DISCIPLINE_CODE}-{YEAR}-{SEQ:4}
→ TEAM-RFA-STR-2025-0001
```
**Counter Table Structure:**
```sql
CREATE TABLE document_number_counters (
  project_id INT,
  originator_organization_id INT,
  recipient_organization_id INT DEFAULT 0,
  correspondence_type_id INT,
  sub_type_id INT DEFAULT 0,
  rfa_type_id INT DEFAULT 0,
  discipline_id INT DEFAULT 0, -- NEW v1.4.5
  reset_scope VARCHAR(20) NOT NULL, -- 'NONE' or 'YEAR_<YYYY>'
  current_year INT,
  version INT DEFAULT 0, -- Optimistic Lock
  last_number INT DEFAULT 0,
  PRIMARY KEY (
    project_id,
    originator_organization_id,
    recipient_organization_id,
    correspondence_type_id,
    sub_type_id,
    rfa_type_id,
    discipline_id,
    reset_scope
  )
);
```
**Optimistic Locking Pattern:**
```sql
-- Get next number with version check
UPDATE document_number_counters
SET last_number = last_number + 1,
version = version + 1
WHERE project_id = 1
AND originator_organization_id = 3
AND correspondence_type_id = 1
AND discipline_id = 2
AND current_year = 2025
AND version = @current_version; -- Optimistic lock check
-- If affected rows = 0, retry (conflict detected)
```
---
## 🔐 Security & Audit
### 1. Audit Logging
**Table: `audit_logs`**
Records significant changes:
- User actions (CREATE, UPDATE, DELETE)
- Entity type and entity ID
- Old/New values (JSON)
- IP Address, User Agent
### 2. User Preferences
**Table: `user_preferences`**
Stores per-user settings:
- Language preference
- Notification settings
- UI preferences (JSON)
### 3. JSON Schema Validation
**Table: `json_schemas`**
Stores schemas used to validate JSON fields:
- `correspondence_revisions.details`
- `user_preferences.preferences`
---
## 📈 Performance Optimization
### 1. Indexing Strategy
**Primary Indexes:**
- Primary Keys (AUTO_INCREMENT)
- Foreign Keys (automatic in InnoDB)
- Unique Constraints (business keys)
**Secondary Indexes:**
```sql
-- Correspondence search
CREATE INDEX idx_corr_type_status ON correspondence_revisions(correspondence_type_id, correspondence_status_id);
CREATE INDEX idx_corr_date ON correspondence_revisions(document_date);
-- Virtual columns for JSON
CREATE INDEX idx_v_ref_project ON correspondence_revisions(v_ref_project_id);
CREATE INDEX idx_v_doc_subtype ON correspondence_revisions(v_doc_subtype);
-- User lookup
CREATE INDEX idx_user_email ON users(email);
CREATE INDEX idx_user_org ON users(primary_organization_id, is_active);
```
### 2. Virtual Columns
Use virtual columns to index JSON fields:
```sql
ALTER TABLE correspondence_revisions
ADD COLUMN v_ref_project_id INT GENERATED ALWAYS AS (JSON_UNQUOTE(JSON_EXTRACT(details, '$.ref_project_id'))) VIRTUAL,
ADD INDEX idx_v_ref_project(v_ref_project_id);
```
### 3. Partitioning (Future)
Consider partitioning the `audit_logs` table by year:
```sql
ALTER TABLE audit_logs
PARTITION BY RANGE (YEAR(created_at)) (
PARTITION p2024 VALUES LESS THAN (2025),
PARTITION p2025 VALUES LESS THAN (2026),
PARTITION p_future VALUES LESS THAN MAXVALUE
);
```
---
## 🔄 Migration Strategy
### 1. TypeORM Migrations
Use TypeORM migrations for schema changes:
```typescript
// File: backend/src/migrations/1234567890-AddDisciplineToCorrespondences.ts
import { MigrationInterface, QueryRunner } from 'typeorm';
export class AddDisciplineToCorrespondences1234567890
implements MigrationInterface
{
public async up(queryRunner: QueryRunner): Promise<void> {
await queryRunner.query(`
ALTER TABLE correspondences
      ADD COLUMN discipline_id INT NULL COMMENT 'Discipline (optional)'
AFTER correspondence_type_id
`);
await queryRunner.query(`
ALTER TABLE correspondences
ADD CONSTRAINT fk_corr_discipline
FOREIGN KEY (discipline_id) REFERENCES disciplines(id)
ON DELETE SET NULL
`);
}
public async down(queryRunner: QueryRunner): Promise<void> {
await queryRunner.query(
`ALTER TABLE correspondences DROP FOREIGN KEY fk_corr_discipline`
);
await queryRunner.query(
`ALTER TABLE correspondences DROP COLUMN discipline_id`
);
}
}
```
### 2. Data Seeding
Use seed scripts for master data:
```typescript
// File: backend/src/seeds/1-organizations.seed.ts
export class OrganizationSeeder implements Seeder {
public async run(dataSource: DataSource): Promise<void> {
const repository = dataSource.getRepository(Organization);
await repository.save([
{
organization_code: 'กทท.',
organization_name: 'Port Authority of Thailand',
},
{
organization_code: 'TEAM',
organization_name: 'TEAM Consulting Engineering',
},
// ...
]);
}
}
```
---
## 📚 Best Practices
### 1. Naming Conventions
- **Tables**: `snake_case`, plural (e.g., `correspondences`, `users`)
- **Columns**: `snake_case` (e.g., `correspondence_number`, `created_at`)
- **Foreign Keys**: `{referenced_table_singular}_id` (e.g., `project_id`, `user_id`)
- **Junction Tables**: `{table1}_{table2}` (e.g., `correspondence_tags`)
### 2. Timestamp Columns
Every table should include:
- `created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP`
- `updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP`
### 3. Soft Delete
Use `deleted_at DATETIME NULL` instead of physically deleting rows:
```sql
-- Soft delete
UPDATE correspondences SET deleted_at = NOW() WHERE id = 1;
-- Query active records
SELECT * FROM correspondences WHERE deleted_at IS NULL;
```
### 4. JSON Field Guidelines
- Use for data that does not need to be queried often
- Create virtual columns for fields that must be indexed
- Validate with JSON Schema
- Document the structure in the Data Dictionary
---
## 🔗 Related Documentation
- [System Architecture](../01-requirements/01-02-architecture.md) - overall system architecture
- [API Design](02-02-api-design.md) - API design
- [Data Dictionary v1.4.5](../../docs/4_Data_Dictionary_V1_4_5.md) - details of every table
- [SQL Schema v1.4.5](../../docs/8_lcbp3_v1_4_5.sql) - SQL script for creating the database
- [Functional Requirements](../01-requirements/01-03-functional-requirements.md) - functional requirements
---
## 📝 Version History
| Version | Date | Author | Changes |
| ------- | ---------- | -------------------- | ---------------------------------------------- |
| 1.5.0 | 2025-11-30 | Nattanin Peancharoen | Initial data model documentation |
| 1.4.5 | 2025-11-29 | System | Added disciplines and correspondence_sub_types |

# 🗺️ Network Architecture & Container Services Map (LCBP3-DMS)
This map shows the network segmentation (VLANs), firewall connections (ACLs), and the roles of the two servers (QNAP: application, ASUSTOR: infrastructure).
> 📖 **See server roles and service distribution in:** [README.md](README.md#-hardware-infrastructure)
---
## 1. Data Flow Diagram
```mermaid
flowchart TB
subgraph Internet["🌐 Internet"]
User[("👤 User")]
end
subgraph QNAP["💾 QNAP TS-473A (App Server)"]
NPM["🔲 NPM<br/>(Reverse Proxy)"]
Frontend["📱 Next.js<br/>(Frontend)"]
Backend["⚙️ NestJS<br/>(Backend API)"]
DB["🗄️ MariaDB"]
Redis["📦 Redis"]
ES["🔍 Elasticsearch"]
end
subgraph ASUSTOR["💾 ASUSTOR AS5403T (Infra Server)"]
Portainer["🐳 Portainer"]
Registry["📦 Registry"]
Prometheus["📊 Prometheus"]
Grafana["📈 Grafana"]
Uptime["⏱️ Uptime Kuma"]
Backup["💾 Restic/Borg"]
NFS["📁 NFS Storage"]
end
User -->|HTTPS 443| NPM
NPM --> Frontend
NPM --> Backend
Frontend --> Backend
Backend --> DB
Backend --> Redis
Backend --> ES
DB -.->|Scheduled Backup| Backup
Backup --> NFS
Portainer -.->|Manage| QNAP
Prometheus -.->|Collect Metrics| Backend
Prometheus -.->|Collect Metrics| DB
Uptime -.->|Health Check| NPM
```
---
## 2. Docker Management View
```mermaid
flowchart TB
subgraph Portainer["🐳 Portainer (ASUSTOR - Central Management)"]
direction TB
subgraph LocalStack["📦 Local Infra Stack"]
Registry["Docker Registry"]
Prometheus["Prometheus"]
Grafana["Grafana"]
Uptime["Uptime Kuma"]
Backup["Restic/Borg"]
Loki["Loki (Logs)"]
ClamAV["ClamAV"]
end
subgraph RemoteStack["🔗 Remote: QNAP App Stack"]
Frontend["Next.js"]
Backend["NestJS"]
MariaDB["MariaDB"]
Redis["Redis"]
ES["Elasticsearch"]
NPM["NPM"]
Gitea["Gitea"]
N8N["n8n"]
PMA["phpMyAdmin"]
end
end
```
---
## 3. Security Zones Diagram
```mermaid
flowchart TB
subgraph PublicZone["🌐 PUBLIC ZONE"]
direction LR
NPM["NPM (Reverse Proxy)"]
SSL["SSL/TLS Termination"]
end
subgraph AppZone["📱 APPLICATION ZONE (QNAP)"]
direction LR
Frontend["Next.js"]
Backend["NestJS"]
N8N["n8n"]
Gitea["Gitea"]
end
subgraph DataZone["💾 DATA ZONE (QNAP - Internal Only)"]
direction LR
MariaDB["MariaDB"]
Redis["Redis"]
ES["Elasticsearch"]
end
subgraph InfraZone["🛠️ INFRASTRUCTURE ZONE (ASUSTOR)"]
direction LR
Backup["Backup Services"]
Registry["Docker Registry"]
Monitoring["Prometheus + Grafana"]
Logs["Loki / Syslog"]
end
PublicZone -->|HTTPS Only| AppZone
AppZone -->|Internal API| DataZone
DataZone -.->|Backup| InfraZone
AppZone -.->|Metrics| InfraZone
```
---
## 4. Network Flow Diagram
```mermaid
graph TD
direction TB
    subgraph Flow1["External Access (Public WAN)"]
        User["External User (Internet)"]
end
subgraph Router["Router (ER7206) - Gateway"]
User -- "Port 80/443 (HTTPS/HTTP)" --> ER7206
ER7206["Port Forwarding<br/>TCP 80 → 192.168.10.8:80<br/>TCP 443 → 192.168.10.8:443"]
end
    subgraph VLANs["Internal Network (VLANs & Firewall Rules)"]
direction LR
subgraph VLAN10["VLAN 10: Servers<br/>192.168.10.x"]
QNAP["QNAP NAS<br/>(192.168.10.8)"]
ASUSTOR["ASUSTOR NAS<br/>(192.168.10.9)"]
end
subgraph VLAN20["VLAN 20: MGMT<br/>192.168.20.x"]
AdminPC["Admin PC / Switches"]
end
subgraph VLAN30["VLAN 30: USER<br/>192.168.30.x"]
            OfficePC["Office PCs / Wi-Fi"]
end
subgraph VLAN70["VLAN 70: GUEST<br/>192.168.70.x"]
GuestPC["Guest Wi-Fi"]
end
subgraph Firewall["Firewall ACLs (OC200/ER7206)"]
direction TB
rule1["Rule 1: DENY<br/>Guest (VLAN 70) → All VLANs"]
rule2["Rule 2: DENY<br/>Server (VLAN 10) → User (VLAN 30)"]
rule3["Rule 3: ALLOW<br/>User (VLAN 30) → QNAP<br/>Ports: 443, 80"]
rule4["Rule 4: ALLOW<br/>MGMT (VLAN 20) → All"]
end
GuestPC -.x|rule1| QNAP
QNAP -.x|rule2| OfficePC
OfficePC -- "https://lcbp3.np-dms.work" -->|rule3| QNAP
AdminPC -->|rule4| QNAP
AdminPC -->|rule4| ASUSTOR
end
ER7206 --> QNAP
subgraph DockerQNAP["Docker 'lcbp3' (QNAP - Applications)"]
direction TB
        subgraph PublicServices["Services Exposed Externally via NPM"]
direction LR
NPM["NPM (Nginx Proxy Manager)"]
FrontendC["frontend:3000"]
BackendC["backend:3000"]
GiteaC["gitea:3000"]
PMAC["pma:80"]
N8NC["n8n:5678"]
end
subgraph InternalServices["Internal Services (Backend Only)"]
direction LR
DBC["mariadb:3306"]
CacheC["cache:6379"]
SearchC["search:9200"]
end
NPM -- "lcbp3.np-dms.work" --> FrontendC
NPM -- "backend.np-dms.work" --> BackendC
NPM -- "git.np-dms.work" --> GiteaC
NPM -- "pma.np-dms.work" --> PMAC
NPM -- "n8n.np-dms.work" --> N8NC
BackendC -- "lcbp3 Network" --> DBC
BackendC -- "lcbp3 Network" --> CacheC
BackendC -- "lcbp3 Network" --> SearchC
end
subgraph DockerASUSTOR["Docker 'lcbp3' (ASUSTOR - Infrastructure)"]
direction TB
subgraph InfraServices["Infrastructure Services"]
direction LR
PortainerC["portainer:9443"]
RegistryC["registry:5000"]
PrometheusC["prometheus:9090"]
GrafanaC["grafana:3000"]
UptimeC["uptime-kuma:3001"]
end
subgraph BackupServices["Backup & Storage"]
direction LR
ResticC["restic/borg"]
NFSC["NFS Share"]
end
PortainerC -.->|"Remote Endpoint"| NPM
PrometheusC -.->|"Scrape Metrics"| BackendC
ResticC --> NFSC
end
QNAP --> NPM
ASUSTOR --> PortainerC
DBC -.->|"Scheduled Backup"| ResticC
```
---
## 5. Firewall & Security Configuration
> 📖 **See Firewall ACLs and Port Forwarding details in:** [03_Securities.md](03_Securities.md)
The `03_Securities.md` file covers:
- 🌐 VLAN Segmentation
- 🔥 Firewall Rules (IP Groups, Port Groups, Switch ACL, Gateway ACL)
- 🚪 Port Forwarding Configuration
---
## 6. Container Service Distribution
> 📖 **See container services, ports, and domain mapping details in:** [README.md](README.md#-domain-mapping-npm-proxy)
---
## 7. Backup Flow
```mermaid
flowchart LR
subgraph QNAP["💾 QNAP TS-473A (Source)"]
direction TB
DB["🗄️ MariaDB<br/>(mysqldump)"]
Redis["📦 Redis<br/>(RDB + AOF)"]
Config["⚙️ App Config<br/>+ Volumes"]
end
subgraph ASUSTOR["💾 ASUSTOR AS5403T (Target)"]
direction TB
BackupDB["📁 /volume1/backup/db/<br/>(Restic Repository)"]
BackupRedis["📁 /volume1/backup/redis/"]
BackupConfig["📁 /volume1/backup/config/"]
end
DB -->|"Daily 2AM"| BackupDB
Redis -->|"Daily 3AM"| BackupRedis
Config -->|"Weekly Sun 4AM"| BackupConfig
subgraph Retention["📋 Retention Policy"]
R1["Daily: 7 days"]
R2["Weekly: 4 weeks"]
R3["Monthly: 6 months"]
end
```
---
> 📝 **Note**: This document is based on Architecture Document **v1.8.0** - Last updated: 2026-01-28

# Document Numbering Implementation Guide (Combined)
---
title: 'Implementation Guide: Document Numbering System'
version: 1.6.2
status: APPROVED
owner: Development Team
last_updated: 2025-12-17
related:
- specs/01-requirements/03.11-document-numbering.md
- specs/04-operations/document-numbering-operations.md
- specs/05-decisions/ADR-002-document-numbering-strategy.md
---
## Overview
This document consolidates the implementation details of the Document Numbering system, merging content from:
- `document-numbering.md` - core implementation and database schema
- `document-numbering-add.md` - extended features (Reservation, Manual Override, Monitoring)
---
## Technology Stack
| Component | Technology |
| ----------------- | -------------------- |
| Backend Framework | NestJS 10.x |
| ORM | TypeORM 0.3.x |
| Database | MariaDB 11.8 |
| Cache/Lock | Redis 7.x + Redlock |
| Message Queue | BullMQ |
| Monitoring | Prometheus + Grafana |
---
## 1. Module Structure
```
backend/src/modules/document-numbering/
├── document-numbering.module.ts
├── controllers/
│ ├── document-numbering.controller.ts # General endpoints
│ ├── document-numbering-admin.controller.ts # Admin endpoints
│ └── numbering-metrics.controller.ts # Metrics endpoints
├── services/
│ ├── document-numbering.service.ts # Main orchestration
│ ├── document-numbering-lock.service.ts # Redis Lock
│ ├── counter.service.ts # Sequence counter logic
│ ├── reservation.service.ts # Two-phase commit
│ ├── manual-override.service.ts # Manual number handling
│ ├── format.service.ts # Template formatting
│ ├── template.service.ts # Template CRUD
│ ├── audit.service.ts # Audit logging
│ ├── metrics.service.ts # Prometheus metrics
│ └── migration.service.ts # Legacy import
├── entities/
│ ├── document-number-counter.entity.ts
│ ├── document-number-format.entity.ts
│ ├── document-number-audit.entity.ts
│ ├── document-number-error.entity.ts
│ └── document-number-reservation.entity.ts
├── dto/
│ ├── generate-number.dto.ts
│ ├── preview-number.dto.ts
│ ├── reserve-number.dto.ts
│ ├── confirm-reservation.dto.ts
│ ├── manual-override.dto.ts
│ ├── void-document.dto.ts
│ └── bulk-import.dto.ts
├── validators/
│ └── template.validator.ts
├── guards/
│ └── manual-override.guard.ts
├── decorators/
│ └── audit-numbering.decorator.ts
├── jobs/
│ └── counter-reset.job.ts
└── tests/
├── unit/
├── integration/
└── e2e/
```
---
## 2. Database Schema
### 2.1 Format Template Table
```sql
CREATE TABLE document_number_formats (
id INT AUTO_INCREMENT PRIMARY KEY,
project_id INT NOT NULL,
correspondence_type_id INT NULL, -- NULL = default format for project
format_template VARCHAR(100) NOT NULL,
reset_sequence_yearly TINYINT(1) DEFAULT 1,
description VARCHAR(255),
created_at DATETIME(6) DEFAULT CURRENT_TIMESTAMP(6),
updated_at DATETIME(6) DEFAULT CURRENT_TIMESTAMP(6) ON UPDATE CURRENT_TIMESTAMP(6),
UNIQUE KEY idx_unique_project_type (project_id, correspondence_type_id),
FOREIGN KEY (project_id) REFERENCES projects(id) ON DELETE CASCADE,
FOREIGN KEY (correspondence_type_id) REFERENCES correspondence_types(id) ON DELETE CASCADE
) ENGINE=InnoDB COMMENT='Document Number Format Templates';
```
### 2.2 Counter Table
```sql
CREATE TABLE document_number_counters (
project_id INT NOT NULL,
correspondence_type_id INT NULL,
originator_organization_id INT NOT NULL,
recipient_organization_id INT NOT NULL DEFAULT 0, -- 0 = no recipient (RFA)
sub_type_id INT DEFAULT 0,
rfa_type_id INT DEFAULT 0,
discipline_id INT DEFAULT 0,
reset_scope VARCHAR(20) NOT NULL,
last_number INT DEFAULT 0 NOT NULL,
version INT DEFAULT 0 NOT NULL,
created_at DATETIME(6) DEFAULT CURRENT_TIMESTAMP(6),
updated_at DATETIME(6) DEFAULT CURRENT_TIMESTAMP(6) ON UPDATE CURRENT_TIMESTAMP(6),
PRIMARY KEY (
project_id,
originator_organization_id,
recipient_organization_id,
correspondence_type_id,
sub_type_id,
rfa_type_id,
discipline_id,
reset_scope
),
FOREIGN KEY (project_id) REFERENCES projects(id) ON DELETE CASCADE,
FOREIGN KEY (originator_organization_id) REFERENCES organizations(id) ON DELETE CASCADE,
FOREIGN KEY (correspondence_type_id) REFERENCES correspondence_types(id) ON DELETE CASCADE,
INDEX idx_counter_lookup (project_id, correspondence_type_id, reset_scope),
INDEX idx_counter_org (originator_organization_id, reset_scope),
INDEX idx_counter_updated (updated_at),
CONSTRAINT chk_last_number_positive CHECK (last_number >= 0),
CONSTRAINT chk_reset_scope_format CHECK (
reset_scope = 'NONE' OR
reset_scope LIKE 'YEAR_%'
)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci
COMMENT='Running Number Counters';
```
### 2.3 Audit Table
```sql
CREATE TABLE document_number_audit (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
document_id INT NULL COMMENT 'FK to documents (NULL initially)',
document_type VARCHAR(50),
document_number VARCHAR(100) NOT NULL,
operation ENUM('RESERVE', 'CONFIRM', 'CANCEL', 'MANUAL_OVERRIDE', 'VOID', 'GENERATE') NOT NULL,
status ENUM('RESERVED', 'CONFIRMED', 'CANCELLED', 'VOID', 'MANUAL'),
counter_key JSON NOT NULL COMMENT 'Counter key used (JSON format)',
reservation_token VARCHAR(36) NULL,
originator_organization_id INT NULL,
recipient_organization_id INT NULL,
template_used VARCHAR(200) NOT NULL,
old_value TEXT NULL,
new_value TEXT NULL,
user_id INT NULL COMMENT 'FK to users (Allow NULL for system generation)',
ip_address VARCHAR(45),
user_agent TEXT,
is_success BOOLEAN DEFAULT TRUE,
retry_count INT DEFAULT 0,
lock_wait_ms INT COMMENT 'Lock acquisition time in milliseconds',
total_duration_ms INT COMMENT 'Total generation time',
fallback_used ENUM('NONE', 'DB_LOCK', 'RETRY') DEFAULT 'NONE',
metadata JSON NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
INDEX idx_document_id (document_id),
INDEX idx_user_id (user_id),
INDEX idx_status (status),
INDEX idx_operation (operation),
INDEX idx_document_number (document_number),
INDEX idx_created_at (created_at),
FOREIGN KEY (document_id) REFERENCES documents(id) ON DELETE CASCADE,
FOREIGN KEY (user_id) REFERENCES users(id)
) ENGINE=InnoDB COMMENT='Document Number Generation Audit Trail';
```
### 2.4 Error Log Table
```sql
CREATE TABLE document_number_errors (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
error_type ENUM(
'LOCK_TIMEOUT',
'VERSION_CONFLICT',
'DB_ERROR',
'REDIS_ERROR',
'VALIDATION_ERROR',
'SEQUENCE_EXHAUSTED',
'RESERVATION_EXPIRED',
'DUPLICATE_NUMBER'
) NOT NULL,
error_message TEXT,
stack_trace TEXT,
context_data JSON COMMENT 'Request context (user, project, etc.)',
user_id INT,
ip_address VARCHAR(45),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
resolved_at TIMESTAMP NULL,
INDEX idx_error_type (error_type),
INDEX idx_created_at (created_at),
INDEX idx_user_id (user_id)
) ENGINE=InnoDB COMMENT='Document Numbering Error Log';
```
### 2.5 Reservation Table
```sql
CREATE TABLE document_number_reservations (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
-- Reservation Details
token VARCHAR(36) NOT NULL UNIQUE COMMENT 'UUID v4',
document_number VARCHAR(100) NOT NULL UNIQUE,
status ENUM('RESERVED', 'CONFIRMED', 'CANCELLED', 'VOID') NOT NULL DEFAULT 'RESERVED',
-- Linkage
document_id INT NULL COMMENT 'FK to documents (NULL until confirmed)',
-- Context (for debugging)
project_id INT NOT NULL,
correspondence_type_id INT NOT NULL,
originator_organization_id INT NOT NULL,
recipient_organization_id INT DEFAULT 0,
user_id INT NOT NULL,
-- Timestamps
reserved_at DATETIME(6) DEFAULT CURRENT_TIMESTAMP(6),
expires_at DATETIME(6) NOT NULL,
confirmed_at DATETIME(6) NULL,
cancelled_at DATETIME(6) NULL,
-- Audit
ip_address VARCHAR(45),
user_agent TEXT,
metadata JSON NULL COMMENT 'Additional context',
-- Indexes
INDEX idx_token (token),
INDEX idx_status (status),
INDEX idx_status_expires (status, expires_at),
INDEX idx_document_id (document_id),
INDEX idx_user_id (user_id),
INDEX idx_reserved_at (reserved_at),
-- Foreign Keys
FOREIGN KEY (document_id) REFERENCES documents(id) ON DELETE SET NULL,
FOREIGN KEY (project_id) REFERENCES projects(id) ON DELETE CASCADE,
FOREIGN KEY (correspondence_type_id) REFERENCES correspondence_types(id) ON DELETE CASCADE,
FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci
COMMENT='Document Number Reservations - Two-Phase Commit';
```
---
## 3. Core Services
### 3.1 Number Generation Process
```mermaid
sequenceDiagram
participant C as Client
participant S as NumberingService
participant L as LockService
participant CS as CounterService
participant DB as Database
participant R as Redis
C->>S: generateDocumentNumber(dto)
S->>L: acquireLock(counterKey)
L->>R: REDLOCK acquire
R-->>L: lock acquired
L-->>S: lock handle
S->>CS: incrementCounter(counterKey)
CS->>DB: BEGIN TRANSACTION
CS->>DB: SELECT FOR UPDATE
CS->>DB: UPDATE last_number
CS->>DB: COMMIT
DB-->>CS: newNumber
CS-->>S: sequence
S->>S: formatNumber(template, seq)
S->>L: releaseLock()
L->>R: REDLOCK release
S-->>C: documentNumber
```
### 3.2 Two-Phase Commit (Reserve/Confirm)
```mermaid
sequenceDiagram
participant C as Client
participant RS as ReservationService
participant SS as SequenceService
participant R as Redis
Note over C,R: Phase 1: Reserve
C->>RS: reserve(documentType)
RS->>SS: getNextSequence()
SS-->>RS: documentNumber
RS->>R: SETEX reservation:{token} (TTL: 5min)
RS-->>C: {token, documentNumber, expiresAt}
Note over C,R: Phase 2: Confirm
C->>RS: confirm(token)
RS->>R: GET reservation:{token}
R-->>RS: reservationData
RS->>R: DEL reservation:{token}
RS-->>C: documentNumber (confirmed)
```
### 3.3 Counter Service Implementation
```typescript
// services/counter.service.ts
@Injectable()
export class CounterService {
private readonly logger = new Logger(CounterService.name);
constructor(
@InjectRepository(DocumentNumberCounter)
private counterRepo: Repository<DocumentNumberCounter>,
private dataSource: DataSource,
) {}
async incrementCounter(counterKey: CounterKey): Promise<number> {
const MAX_RETRIES = 2;
for (let attempt = 0; attempt < MAX_RETRIES; attempt++) {
try {
return await this.dataSource.transaction(async (manager) => {
        // Optimistic locking: TypeORM checks the version column on save
const counter = await manager.findOne(DocumentNumberCounter, {
where: this.buildWhereClause(counterKey),
});
if (!counter) {
const newCounter = manager.create(DocumentNumberCounter, {
...counterKey,
lastNumber: 1,
version: 0,
});
await manager.save(newCounter);
return 1;
}
counter.lastNumber += 1;
await manager.save(counter); // Auto-check version
return counter.lastNumber;
});
} catch (error) {
if (error instanceof OptimisticLockVersionMismatchError) {
this.logger.warn(`Version conflict, retry ${attempt + 1}/${MAX_RETRIES}`);
if (attempt === MAX_RETRIES - 1) {
throw new ConflictException('เลขที่เอกสารถูกเปลี่ยน กรุณาลองใหม่');
}
continue;
}
throw error;
}
}
}
}
```
### 3.4 Redis Lock Service
```typescript
// services/document-numbering-lock.service.ts
@Injectable()
export class DocumentNumberingLockService {
private readonly logger = new Logger(DocumentNumberingLockService.name);
private redlock: Redlock;
constructor(@InjectRedis() private readonly redis: Redis) {
this.redlock = new Redlock([redis], {
driftFactor: 0.01,
retryCount: 5,
retryDelay: 100,
retryJitter: 50,
});
}
async acquireLock(counterKey: CounterKey): Promise<Redlock.Lock> {
const lockKey = this.buildLockKey(counterKey);
const ttl = 5000; // 5 seconds
try {
const lock = await this.redlock.acquire([lockKey], ttl);
this.logger.debug(`Acquired lock: ${lockKey}`);
return lock;
} catch (error) {
this.logger.error(`Failed to acquire lock: ${lockKey}`, error);
throw error;
}
}
async releaseLock(lock: Redlock.Lock): Promise<void> {
try {
await lock.release();
} catch (error) {
this.logger.warn('Failed to release lock (may have expired)', error);
}
}
private buildLockKey(key: CounterKey): string {
return `lock:docnum:${key.projectId}:${key.originatorOrgId}:` +
`${key.recipientOrgId ?? 0}:${key.correspondenceTypeId}:` +
`${key.subTypeId}:${key.rfaTypeId}:${key.disciplineId}:${key.year}`;
}
}
```
### 3.5 Reservation Service
```typescript
// services/reservation.service.ts
@Injectable()
export class ReservationService {
private readonly TTL = 300; // 5 minutes
constructor(
private redis: Redis,
private sequenceService: SequenceService,
private auditService: AuditService,
) {}
async reserve(
documentType: string,
scopeValue?: string,
metadata?: Record<string, any>,
): Promise<Reservation> {
// 1. Generate next number
const documentNumber = await this.sequenceService.getNextSequence(
documentType,
scopeValue,
);
// 2. Generate reservation token
const token = uuidv4();
const expiresAt = new Date(Date.now() + this.TTL * 1000);
// 3. Save to Redis
const reservation: Reservation = {
token,
document_number: documentNumber,
document_type: documentType,
scope_value: scopeValue,
expires_at: expiresAt,
metadata,
};
await this.redis.setex(
`reservation:${token}`,
this.TTL,
JSON.stringify(reservation),
);
// 4. Audit log
await this.auditService.log({
operation: 'RESERVE',
document_type: documentType,
document_number: documentNumber,
metadata: { token, scope_value: scopeValue },
});
return reservation;
}
async confirm(token: string, userId: number): Promise<string> {
const reservation = await this.getReservation(token);
if (!reservation) {
throw new ReservationExpiredError(
'Reservation not found or expired. Please reserve a new number.',
);
}
await this.redis.del(`reservation:${token}`);
await this.auditService.log({
operation: 'CONFIRM',
document_type: reservation.document_type,
document_number: reservation.document_number,
user_id: userId,
metadata: { token },
});
return reservation.document_number;
}
async cancel(token: string, userId: number): Promise<void> {
const reservation = await this.getReservation(token);
if (reservation) {
await this.redis.del(`reservation:${token}`);
await this.auditService.log({
operation: 'CANCEL',
document_type: reservation.document_type,
document_number: reservation.document_number,
user_id: userId,
metadata: { token },
});
}
}
  // Safety net only: keys written with SETEX expire on their own.
  // Uses SCAN instead of KEYS so the sweep never blocks Redis.
  @Cron('0 */5 * * * *') // Every 5 minutes
  async cleanupExpired(): Promise<void> {
    let cursor = '0';
    do {
      const [next, keys] = await this.redis.scan(
        cursor, 'MATCH', 'reservation:*', 'COUNT', 100,
      );
      cursor = next;
      for (const key of keys) {
        // TTL of -1 means the key somehow lost its expiry; remove it
        if ((await this.redis.ttl(key)) === -1) {
          await this.redis.del(key);
        }
      }
    } while (cursor !== '0');
  }
}
```
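The TTL semantics above can be modeled without Redis. This in-memory sketch (illustrative names, injectable clock; an assumption, not the spec's implementation) shows why `confirm` must fail once the TTL elapses:

```typescript
// Minimal model of two-phase reserve/confirm with an explicit expiry timestamp
// standing in for the Redis TTL.
interface PendingReservation { token: string; documentNumber: string; expiresAt: number; }

class ReservationStore {
  private byToken = new Map<string, PendingReservation>();
  constructor(private ttlMs: number, private now: () => number = Date.now) {}

  reserve(token: string, documentNumber: string): PendingReservation {
    const r = { token, documentNumber, expiresAt: this.now() + this.ttlMs };
    this.byToken.set(token, r);
    return r;
  }

  confirm(token: string): string {
    const r = this.byToken.get(token);
    if (!r || r.expiresAt <= this.now()) {
      this.byToken.delete(token);
      throw new Error('Reservation not found or expired');
    }
    this.byToken.delete(token); // one-shot: a token cannot be confirmed twice
    return r.documentNumber;
  }
}
```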
---
## 4. Template System
### 4.1 Supported Tokens
| Token | Description | Example Output |
| -------------- | ---------------------------- | -------------- |
| `{PROJECT}` | Project Code | `LCBP3` |
| `{ORIGINATOR}` | Originator Organization Code | `คคง.` |
| `{RECIPIENT}` | Recipient Organization Code | `สคฉ.3` |
| `{CORR_TYPE}` | Correspondence Type Code | `L` |
| `{SUB_TYPE}` | Sub Type Code | `TD` |
| `{RFA_TYPE}` | RFA Type Code | `RFA` |
| `{DISCIPLINE}` | Discipline Code | `CV` |
| `{SEQ:n}` | Sequence Number (n digits) | `0001` |
| `{YEAR:CE}` | Year (Common Era) | `2025` |
| `{YEAR:BE}` | Year (Buddhist Era) | `2568` |
| `{REV}` | Revision Number | `A` |
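To make the token expansion concrete, here is a hedged sketch of a formatter for the tokens above; the function name and context shape are assumptions, not the spec's actual implementation:

```typescript
// Illustrative token expansion for numbering templates.
type TokenContext = {
  project?: string; originator?: string; recipient?: string;
  corrType?: string; subType?: string; rfaType?: string;
  discipline?: string; seq: number; year: number; rev?: string;
};

function formatDocumentNumber(template: string, ctx: TokenContext): string {
  return template.replace(/\{([A-Z_]+)(?::(\w+))?\}/g, (_m: string, name: string, arg: string | undefined) => {
    switch (name) {
      case 'PROJECT': return ctx.project ?? '';
      case 'ORIGINATOR': return ctx.originator ?? '';
      case 'RECIPIENT': return ctx.recipient ?? '';
      case 'CORR_TYPE': return ctx.corrType ?? '';
      case 'SUB_TYPE': return ctx.subType ?? '';
      case 'RFA_TYPE': return ctx.rfaType ?? '';
      case 'DISCIPLINE': return ctx.discipline ?? '';
      case 'SEQ': return String(ctx.seq).padStart(Number(arg ?? 4), '0'); // {SEQ:n} zero-pads to n digits
      case 'YEAR': return String(arg === 'BE' ? ctx.year + 543 : ctx.year); // Buddhist Era = CE + 543
      case 'REV': return ctx.rev ?? '';
      default: throw new Error(`Unknown token: {${name}}`);
    }
  });
}
```

For example, `formatDocumentNumber('LCBP3-RFA-{DISCIPLINE}-{SEQ:4}-{YEAR:CE}', { discipline: 'CV', seq: 7, year: 2025 })` yields `LCBP3-RFA-CV-0007-2025`.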
### 4.2 Template Validation
```typescript
// validators/template.validator.ts
@Injectable()
export class TemplateValidator {
private readonly ALLOWED_TOKENS = [
'PROJECT', 'ORIGINATOR', 'RECIPIENT', 'CORR_TYPE',
'SUB_TYPE', 'RFA_TYPE', 'DISCIPLINE', 'SEQ', 'YEAR', 'REV',
];
validate(template: string, correspondenceType: string): ValidationResult {
const tokens = this.extractTokens(template);
const errors: string[] = [];
// Check for unknown tokens
for (const token of tokens) {
if (!this.ALLOWED_TOKENS.includes(token.name)) {
errors.push(`Unknown token: {${token.name}}`);
}
}
    // Type-specific rules
    if (correspondenceType === 'RFA') {
      if (!tokens.some((t) => t.name === 'PROJECT')) {
        errors.push('RFA templates must include {PROJECT}');
      }
      if (!tokens.some((t) => t.name === 'DISCIPLINE')) {
        errors.push('RFA templates must include {DISCIPLINE}');
      }
    }
    if (correspondenceType === 'TRANSMITTAL') {
      if (!tokens.some((t) => t.name === 'SUB_TYPE')) {
        errors.push('TRANSMITTAL templates must include {SUB_TYPE}');
      }
    }
    // Every template must include {SEQ}
    if (!tokens.some((t) => t.name.startsWith('SEQ'))) {
      errors.push('Template must include {SEQ:n}');
    }
return { valid: errors.length === 0, errors };
}
}
```
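The validator relies on an `extractTokens()` helper the spec does not show. One plausible sketch (an assumption) treats `{NAME}` and `{NAME:arg}` as tokens:

```typescript
// Extracts template tokens such as {PROJECT} and {SEQ:4}.
interface TemplateToken { name: string; arg?: string; }

function extractTokens(template: string): TemplateToken[] {
  const tokens: TemplateToken[] = [];
  const re = /\{([A-Z_]+)(?::(\w+))?\}/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(template)) !== null) {
    tokens.push({ name: m[1], arg: m[2] });
  }
  return tokens;
}
```

With this extraction, `{SEQ:4}` yields `{ name: 'SEQ', arg: '4' }`, which satisfies both the `ALLOWED_TOKENS` lookup and the `startsWith('SEQ')` check in the validator.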
---
## 5. API Endpoints
### 5.1 General Endpoints (`/document-numbering`)
| Endpoint | Method | Permission | Description |
| --------------- | ------ | ------------------------ | --------------------------------- |
| `/logs/audit` | GET | `system.view_logs` | Get audit logs |
| `/logs/errors` | GET | `system.view_logs` | Get error logs |
| `/sequences` | GET | `correspondence.read` | Get counter sequences |
| `/counters/:id` | PATCH | `system.manage_settings` | Update counter value |
| `/preview` | POST | `correspondence.read` | Preview number without generating |
| `/reserve` | POST | `correspondence.create` | Reserve a document number |
| `/confirm` | POST | `correspondence.create` | Confirm a reservation |
| `/cancel` | POST | `correspondence.create` | Cancel a reservation |
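A thin client for the reserve/confirm/cancel endpoints might look as follows. The response envelope (`{ data: ... }`) and the injectable `fetch` are assumptions made for testability, not a confirmed API contract:

```typescript
// Hypothetical client wrapper for the two-phase numbering endpoints.
type Fetch = (url: string, init?: { method?: string; headers?: Record<string, string>; body?: string }) =>
  Promise<{ ok: boolean; json(): Promise<any> }>;

class NumberingClient {
  constructor(private baseUrl: string, private fetchFn: Fetch) {}

  private async post(path: string, body: unknown) {
    const res = await this.fetchFn(`${this.baseUrl}${path}`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(body),
    });
    if (!res.ok) throw new Error(`Request failed: ${path}`);
    return (await res.json()).data; // assumed envelope
  }

  reserve(documentType: string) { return this.post('/document-numbering/reserve', { document_type: documentType }); }
  confirm(token: string) { return this.post('/document-numbering/confirm', { token }); }
  cancel(token: string) { return this.post('/document-numbering/cancel', { token }); }
}
```

A typical flow is `reserve` on form open, then `confirm` on save or `cancel` on abandon, matching the reservation TTL above.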
### 5.2 Admin Endpoints (`/admin/document-numbering`)
| Endpoint | Method | Permission | Description |
| ------------------- | ------ | ------------------------ | ----------------------- |
| `/templates` | GET | `system.manage_settings` | Get all templates |
| `/templates` | POST | `system.manage_settings` | Create/update template |
| `/templates/:id` | DELETE | `system.manage_settings` | Delete template |
| `/metrics` | GET | `system.view_logs` | Get metrics |
| `/manual-override` | POST | `system.manage_settings` | Override counter value |
| `/void-and-replace` | POST | `system.manage_settings` | Void and replace number |
| `/cancel` | POST | `system.manage_settings` | Cancel a number |
| `/bulk-import` | POST | `system.manage_settings` | Bulk import counters |
---
## 6. Monitoring & Observability
### 6.1 Prometheus Metrics
```typescript
@Injectable()
export class NumberingMetrics {
// Counter: Total numbers generated
private readonly numbersGenerated = new Counter({
name: 'numbering_sequences_total',
help: 'Total document numbers generated',
labelNames: ['document_type'],
});
// Gauge: Sequence utilization (%)
private readonly sequenceUtilization = new Gauge({
name: 'numbering_sequence_utilization',
help: 'Sequence utilization percentage',
labelNames: ['document_type'],
});
// Histogram: Lock wait time
private readonly lockWaitTime = new Histogram({
name: 'numbering_lock_wait_seconds',
help: 'Time spent waiting for lock acquisition',
labelNames: ['document_type'],
buckets: [0.01, 0.05, 0.1, 0.5, 1, 2, 5],
});
// Counter: Lock failures
private readonly lockFailures = new Counter({
name: 'numbering_lock_failures_total',
help: 'Total lock acquisition failures',
labelNames: ['document_type', 'reason'],
});
}
```
### 6.2 Alert Rules
| Alert | Condition | Severity | Action |
| ------------------ | ------------------ | -------- | ---------------------- |
| `SequenceCritical` | Utilization > 95% | Critical | Extend max_value |
| `SequenceWarning` | Utilization > 90% | Warning | Plan extension |
| `HighLockWaitTime` | p95 > 1s | Warning | Check Redis health |
| `RedisUnavailable` | Redis cluster down | Critical | Switch to DB-only mode |
| `HighErrorRate` | > 10 errors/sec | Warning | Check logs |
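The utilization thresholds from the table can be encoded directly; this small helper (illustrative, not part of the spec) returns the alert severity for a sequence utilization percentage:

```typescript
// Maps sequence utilization (%) to the alert severity defined above.
function utilizationAlert(utilization: number): 'critical' | 'warning' | null {
  if (utilization > 95) return 'critical'; // SequenceCritical: extend max_value
  if (utilization > 90) return 'warning';  // SequenceWarning: plan extension
  return null;                             // healthy
}
```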
---
## 7. Error Handling
### 7.1 Error Codes
| Code | Name | Description |
| ----- | --------------------------- | -------------------------- |
| NB001 | CONFIG_NOT_FOUND | Config not found for type |
| NB002 | SEQUENCE_EXHAUSTED | Sequence reached max value |
| NB003 | LOCK_TIMEOUT | Failed to acquire lock |
| NB004 | RESERVATION_EXPIRED | Reservation token expired |
| NB005 | DUPLICATE_NUMBER | Number already exists |
| NB006 | INVALID_FORMAT | Number format invalid |
| NB007 | MANUAL_OVERRIDE_NOT_ALLOWED | Manual override disabled |
| NB008 | REDIS_UNAVAILABLE | Redis connection failed |
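The error-code table maps naturally onto a TypeScript enum plus a typed exception; this is one way the backend might surface NB0xx codes to clients (a sketch, not the spec's actual classes):

```typescript
// NB0xx error codes from the table above.
enum NumberingErrorCode {
  CONFIG_NOT_FOUND = 'NB001',
  SEQUENCE_EXHAUSTED = 'NB002',
  LOCK_TIMEOUT = 'NB003',
  RESERVATION_EXPIRED = 'NB004',
  DUPLICATE_NUMBER = 'NB005',
  INVALID_FORMAT = 'NB006',
  MANUAL_OVERRIDE_NOT_ALLOWED = 'NB007',
  REDIS_UNAVAILABLE = 'NB008',
}

// Typed exception carrying the machine-readable code alongside the message.
class NumberingError extends Error {
  constructor(readonly code: NumberingErrorCode, message: string) {
    super(message);
    this.name = 'NumberingError';
  }
}
```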
### 7.2 Fallback Strategy
```mermaid
flowchart TD
A[Generate Number Request] --> B{Redis Available?}
B -->|Yes| C[Acquire Redlock]
B -->|No| D[Use DB-only Lock]
C --> E{Lock Acquired?}
E -->|Yes| F[Increment Counter]
E -->|No| G{Retry < 3?}
G -->|Yes| C
G -->|No| H[Fallback to DB Lock]
D --> F
H --> F
F --> I[Format Number]
I --> J[Return Number]
```
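The control flow in the diagram can be sketched with the lock mechanisms injected, so the retry-then-fallback logic is testable in isolation. All names here are illustrative assumptions:

```typescript
// Fallback flow: prefer Redlock, retry on contention, fall back to a DB lock.
interface FallbackDeps {
  redisAvailable(): Promise<boolean>;
  acquireRedlock(): Promise<boolean>;      // true if the lock was acquired
  incrementWithRedlock(): Promise<number>; // counter increment under Redlock
  incrementWithDbLock(): Promise<number>;  // counter increment under DB-only lock
}

async function nextSequence(deps: FallbackDeps, maxRetries = 3): Promise<number> {
  if (!(await deps.redisAvailable())) {
    return deps.incrementWithDbLock(); // Redis down: DB-only mode
  }
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    if (await deps.acquireRedlock()) {
      return deps.incrementWithRedlock();
    }
  }
  return deps.incrementWithDbLock(); // lock never acquired: fall back to DB lock
}
```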
---
## 8. Testing
### 8.1 Unit Tests
```bash
# Run unit tests
pnpm test -- --testPathPattern=document-numbering
```
### 8.2 Integration Tests
```bash
# Run integration tests
pnpm test:e2e -- --testPathPattern=numbering
```
### 8.3 Concurrency Test
```typescript
// tests/load/concurrency.spec.ts
it('should handle 1000 concurrent requests without duplicates', async () => {
const promises = Array.from({ length: 1000 }, () =>
request(app.getHttpServer())
.post('/document-numbering/reserve')
.send({ document_type: 'COR' })
);
const results = await Promise.all(promises);
const numbers = results.map(r => r.body.data.document_number);
const uniqueNumbers = new Set(numbers);
expect(uniqueNumbers.size).toBe(1000);
});
```
---
## 9. Best Practices
### 9.1 DO's ✅
- ✅ Always use two-phase commit (reserve + confirm)
- ✅ Implement fallback to DB-only if Redis fails
- ✅ Log every operation to audit trail
- ✅ Monitor sequence utilization (alert at 90%)
- ✅ Test under concurrent load (1000+ req/s)
- ✅ Use pessimistic locking in database
- ✅ Set reasonable TTL for reservations (5 min)
- ✅ Validate manual override format
- ✅ Skip cancelled numbers (never reuse)
### 9.2 DON'Ts ❌
- ❌ Never skip validation for manual override
- ❌ Never reuse cancelled numbers
- ❌ Never trust client-generated numbers
- ❌ Never increase sequence without transaction
- ❌ Never deploy without load testing
- ❌ Never modify sequence table directly
- ❌ Never skip audit logging
---
## 10. Environment Variables
```bash
# Redis Configuration
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_PASSWORD=
REDIS_CLUSTER_NODES=redis-1:6379,redis-2:6379,redis-3:6379
# Database
DB_HOST=localhost
DB_PORT=3306
DB_USERNAME=lcbp3
DB_PASSWORD=
DB_DATABASE=lcbp3_db
DB_POOL_SIZE=20
# Numbering Configuration
NUMBERING_LOCK_TIMEOUT=5000 # 5 seconds
NUMBERING_RESERVATION_TTL=300 # 5 minutes
NUMBERING_RETRY_ATTEMPTS=3
NUMBERING_RETRY_DELAY=200 # milliseconds
# Monitoring
PROMETHEUS_PORT=9090
GRAFANA_PORT=3000
```
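The numbering-related variables above map onto a typed config object; a minimal sketch of the parsing (function name and defaults assumed from the values shown):

```typescript
// Parses numbering env vars, falling back to the documented defaults.
function numberingConfig(env: Record<string, string | undefined>) {
  const int = (key: string, fallback: number): number => {
    const v = env[key];
    return v !== undefined && v !== '' ? parseInt(v, 10) : fallback;
  };
  return {
    lockTimeoutMs: int('NUMBERING_LOCK_TIMEOUT', 5000),
    reservationTtlSec: int('NUMBERING_RESERVATION_TTL', 300),
    retryAttempts: int('NUMBERING_RETRY_ATTEMPTS', 3),
    retryDelayMs: int('NUMBERING_RETRY_DELAY', 200),
  };
}
```

In a NestJS app this would typically live behind `@nestjs/config` rather than reading `process.env` directly.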
---
## References
- [Requirements](../01-requirements/01-03.11-document-numbering.md)
- [Operations Guide](../04-operations/04-08-document-numbering-operations.md)
- [ADR-018 Document Numbering](file:///d:/nap-dms.lcbp3/specs/05-decisions/adr-018-document-numbering.md)
- [Backend Guidelines](03-02-backend-guidelines.md)
---
**Document Version**: 2.0.0
**Created By**: Development Team
**Last Updated**: 2025-12-17

---
# Network Segmentation and Firewall Rules Configuration
For the Omada devices (ER7206 + OC200), the core strategy is to use **VLANs (Virtual LANs)** to segment devices into groups, and **Firewall ACLs (Access Control Lists)** to control the traffic between those groups.
The following recommendations follow a "Zero Trust" approach adapted to this architecture.
---
## 1. 🌐 VLAN Segmentation
In the Omada Controller (OC200), go to `Settings > Wired Networks > LAN` and create the following networks (VLANs):
* **VLAN 10: SERVER**
  * **IP Range:** 192.168.10.x
  * **Purpose:** Server devices (QNAP and ASUSTOR)
* **VLAN 20: MGMT (Default)**
  * **IP Range:** 192.168.20.x
  * **Purpose:** Network devices (ER7206, OC200, switches) and the administrator's PC only
* **VLAN 30: USER**
  * **IP Range:** 192.168.30.x
  * **Purpose:** PCs, notebooks, and Wi-Fi for regular staff who need to access the system (e.g. `lcbp3.np-dms.work`)
* **VLAN 40: CCTV**
  * **IP Range:** 192.168.40.x
  * **Purpose:** CCTV devices only
* **VLAN 50: VOICE**
  * **IP Range:** 192.168.50.x
  * **Purpose:** IP Phone devices only
* **VLAN 60: DMZ**
  * **IP Range:** 192.168.60.x
  * **Purpose:** DMZ network only
* **VLAN 70: GUEST / Untrusted** (for guest Wi-Fi)
  * **IP Range:** 192.168.70.x
  * **Purpose:** Guest Wi-Fi; must never be allowed to reach the internal networks
**Switch Port Configuration:**
After creating the VLANs, go to `Devices` > select your switch > `Ports` > assign Port Profiles:
* Port connected to the QNAP NAS: set the profile to **VLAN 10**
* Port connected to a staff PC: set the profile to **VLAN 30**
---
## 2. 🔥 Firewall Rules (ACLs)
This is the heart of the setup. Go to `Settings > Network Security > ACL (Access Control)`.
Firewall rules are evaluated top-down (rule 1 is applied before rule 2).
---
### 2.1 IP Groups & Port Groups
**IP Groups** (create under `Settings > Network Security > Groups`):
| Group Name | Members |
| :----------------- | :------------------------------------------------ |
| `Server` | 192.168.10.8, 192.168.10.9, 192.168.10.111 |
| `Omada-Controller` | 192.168.20.250 (OC200 IP) |
| `DHCP-Gateways` | 192.168.30.1, 192.168.70.1 |
| `QNAP_Services` | 192.168.10.8 |
| `Internal` | 192.168.10.0/24, 192.168.20.0/24, 192.168.30.0/24 |
| `Blacklist` | (Add malicious IPs as needed) |
**Port Groups**:
| Group Name | Ports |
| :----------- | :-------------------------------------- |
| `Web` | TCP 443, 8443, 80, 81, 2222 |
| `Omada-Auth` | TCP 443, 8043, 8088, 8843, 29810-29814 |
| `VoIP` | UDP 5060, 5061, 10000-20000 (SIP + RTP) |
| `DHCP` | UDP 67, 68 |
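When editing these groups it is easy to misplace a host; a tiny CIDR-membership helper (illustrative, not part of the Omada tooling) can sanity-check that, for example, 192.168.10.8 really falls inside the `Internal` group's 192.168.10.0/24:

```typescript
// IPv4-to-integer conversion and CIDR membership test.
function ipToInt(ip: string): number {
  return ip.split('.').reduce((acc, octet) => (acc << 8) + Number(octet), 0) >>> 0;
}

function inCidr(ip: string, cidr: string): boolean {
  const [network, bitsStr] = cidr.split('/');
  const bits = Number(bitsStr);
  const mask = bits === 0 ? 0 : (~0 << (32 - bits)) >>> 0;
  return (ipToInt(ip) & mask) === (ipToInt(network) & mask);
}
```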
---
### 2.2 Switch ACL (for the Omada OC200)
| No. | Name | Policy | Source | Destination | Ports |
| :--- | :------------------------ | :----- | :---------------- | :---------------------------- | :---------------------------------------- |
| 1 | 01 Allow-User-DHCP | Allow | Network → VLAN 30 | IP → 192.168.30.1 | Port Group → DHCP |
| 2 | 02 Allow-Guest-DHCP | Allow | Network → VLAN 70 | IP → 192.168.70.1 | Port Group → DHCP |
| 3 | 03 Allow-WiFi-Auth | Allow | Network → VLAN 30 | IP Group → Omada-Controller | Port Group → Omada-Auth |
| 4 | 04 Allow-Guest-WiFi-Auth | Allow | Network → VLAN 70 | IP Group → Omada-Controller | Port Group → Omada-Auth |
| 5 | 05 Isolate-Guests | Deny | Network → VLAN 70 | Network → VLAN 10, 20, 30, 60 | All |
| 6 | 06 Isolate-Servers | Deny | Network → VLAN 10 | Network → VLAN 30 (USER) | All |
| 7 | 07 Block-User-to-Mgmt | Deny | Network → VLAN 30 | Network → VLAN 20 (MGMT) | All |
| 8 | 08 Allow-User-to-Services | Allow | Network → VLAN 30 | IP → QNAP (192.168.10.8) | Port Group → Web (443,8443, 80, 81, 2222) |
| 9 | 09 Allow-Voice-to-User | Allow | Network → VLAN 50 | Network → VLAN 30,50 | All |
| 10 | 10 Allow-MGMT-to-All | Allow | Network → VLAN 20 | Any | All |
| 11 | 11 Allow-Server-Internal | Allow | IP Group : Server | IP Group : Server | All |
| 12 | 12 Allow-Server → CCTV | Allow | IP Group : Server | Network → VLAN 40 (CCTV) | All |
| 13 | 100 (Default) | Deny | Any | Any | All |
> ⚠️ **Important: ACL ordering**
> 1. **Allow rules first:** DHCP (#1-2) and WiFi-Auth (#3-4) must be at the **top**
> 2. **Isolate/Deny rules next:** (#5-7) block unwanted traffic
> 3. **Specific Allow rules:** (#8-12) permit the remaining traffic
> 4. **Default Deny last:** (#13) blocks everything that did not match
---
### 2.3 Gateway ACL (for the Omada ER7206)
| No. | Name | Policy | Direction | PROTOCOLS | Source | Destination |
| :--- | :---------------------- | :----- | :-------- | :-------- | :------------------- | :--------------------------- |
| 1 | 01 Blacklist | Deny | [WAN2] IN | All | IP Group:Blacklist | IP Group:Internal |
| 2 | 02 Geo | Permit | [WAN2] IN | All | Location Group:Allow | IP Group:Internal |
| 3 | 03 Allow-Voice-Internet | Permit | LAN->WAN | UDP | Network → VLAN 50 | Any |
| 4 | 04 Internal → Internet | Permit | LAN->WAN | All | IP Group:Internal | Domain Group:DomainGroup_Any |
> 💡 **Note:** Rule #3 `Allow-Voice-Internet` allows IP Phones (VLAN 50) to connect to an external Cloud PBX via Port Group → VoIP (UDP 5060, 5061, 10000-20000)
---
## 3. 🚪 Port Forwarding (Exposing Services to the Public)
This is not a firewall ACL, but it is required so that external users can reach the system.
Go to `Settings > Transmission > Port Forwarding`.
Create rules to forward traffic from the WAN (internet) to Nginx Proxy Manager (NPM) on the QNAP (VLAN 10):
* **Name:** Allow-NPM-HTTPS
* **External Port:** 443
* **Internal Port:** 443
* **Internal IP:** `192.168.10.8` (QNAP IP)
* **Protocol:** TCP
* **Name:** Allow-NPM-HTTP (for Let's Encrypt)
* **External Port:** 80
* **Internal Port:** 80
* **Internal IP:** `192.168.10.8` (QNAP IP)
* **Protocol:** TCP
### Connection Flow Summary
1. **External user** -> `https://lcbp3.np-dms.work`
2. **ER7206** receives the request on port 443
3. **Port Forwarding** forwards it to `192.168.10.8:443` (QNAP NPM)
4. **NPM** (on the QNAP) proxies it to `backend:3000` or `frontend:3000` inside Docker
5. **Internal user (office)** -> `https://lcbp3.np-dms.work`
6. **Firewall ACL** (Switch ACL rule #8) allows VLAN 30 to talk to `192.168.10.8:443`
7. (Steps 3-4 then work the same as above)
This configuration cleanly separates the servers from the staff network, which is far safer than placing everything in a single flat LAN.

---
# Deployment Guide: LCBP3-DMS
---
**Project:** LCBP3-DMS (Laem Chabang Port Phase 3 - Document Management System)
**Version:** 1.6.0
**Last Updated:** 2025-12-02
**Owner:** Operations Team
**Status:** Active
---
## 📋 Overview
This guide provides step-by-step instructions for deploying the LCBP3-DMS system on QNAP Container Station using Docker Compose with Blue-Green deployment strategy.
### Deployment Strategy
- **Platform:** QNAP TS-473A with Container Station
- **Orchestration:** Docker Compose
- **Deployment Method:** Blue-Green Deployment
- **Zero Downtime:** Yes
- **Rollback Capability:** Instant rollback via NGINX switch
---
## 🎯 Prerequisites
### Hardware Requirements
| Component | Minimum Specification |
| ---------- | -------------------------- |
| CPU | 4 cores @ 2.0 GHz |
| RAM | 16 GB |
| Storage | 500 GB SSD (System + Data) |
| Network | 1 Gbps Ethernet |
| QNAP Model | TS-473A or equivalent |
### Software Requirements
| Software | Version | Purpose |
| ----------------- | ------- | ------------------------ |
| QNAP QTS | 5.x+ | Operating System |
| Container Station | 3.x+ | Docker Management |
| Docker | 20.10+ | Container Runtime |
| Docker Compose | 2.x+ | Multi-container orchestration |
### Network Requirements
- Static IP address for QNAP server
- Domain name (e.g., `lcbp3-dms.example.com`)
- SSL certificate (Let's Encrypt or commercial)
- Firewall rules:
- Port 80 (HTTP → HTTPS redirect)
- Port 443 (HTTPS)
- Port 22 (SSH for management)
---
## 🏗️ Infrastructure Setup
### 1. Directory Structure
Create the following directory structure on QNAP:
```bash
# SSH into QNAP
ssh admin@qnap-ip
# Create base directory
mkdir -p /volume1/lcbp3
# Create blue-green environments
mkdir -p /volume1/lcbp3/blue
mkdir -p /volume1/lcbp3/green
# Create shared directories
mkdir -p /volume1/lcbp3/shared/uploads
mkdir -p /volume1/lcbp3/shared/logs
mkdir -p /volume1/lcbp3/shared/backups
# Create persistent volumes
mkdir -p /volume1/lcbp3/volumes/mariadb-data
mkdir -p /volume1/lcbp3/volumes/redis-data
mkdir -p /volume1/lcbp3/volumes/elastic-data
# Create NGINX proxy directory
mkdir -p /volume1/lcbp3/nginx-proxy
# Set permissions
chmod -R 755 /volume1/lcbp3
chown -R admin:administrators /volume1/lcbp3
```
**Final Structure:**
```
/volume1/lcbp3/
├── blue/ # Blue environment
│ ├── docker-compose.yml
│ ├── .env.production
│ └── nginx.conf
├── green/ # Green environment
│ ├── docker-compose.yml
│ ├── .env.production
│ └── nginx.conf
├── nginx-proxy/ # Main reverse proxy
│ ├── docker-compose.yml
│ ├── nginx.conf
│ └── ssl/
│ ├── cert.pem
│ └── key.pem
├── shared/ # Shared across blue/green
│ ├── uploads/
│ ├── logs/
│ └── backups/
├── volumes/ # Persistent data
│ ├── mariadb-data/
│ ├── redis-data/
│ └── elastic-data/
├── scripts/ # Deployment scripts
│ ├── deploy.sh
│ ├── rollback.sh
│ └── health-check.sh
└── current # File containing "blue" or "green"
```
### 2. SSL Certificate Setup
```bash
# Option 1: Let's Encrypt (Recommended)
# Install certbot on QNAP
opkg install certbot
# Generate certificate
certbot certonly --standalone \
-d lcbp3-dms.example.com \
--email admin@example.com \
--agree-tos
# Copy to nginx-proxy
cp /etc/letsencrypt/live/lcbp3-dms.example.com/fullchain.pem \
/volume1/lcbp3/nginx-proxy/ssl/cert.pem
cp /etc/letsencrypt/live/lcbp3-dms.example.com/privkey.pem \
/volume1/lcbp3/nginx-proxy/ssl/key.pem
# Option 2: Commercial Certificate
# Upload cert.pem and key.pem to /volume1/lcbp3/nginx-proxy/ssl/
```
---
## 📝 Configuration Files
### 1. Environment Variables (.env.production)
Create `.env.production` in both `blue/` and `green/` directories:
```bash
# File: /volume1/lcbp3/blue/.env.production
# DO NOT commit this file to Git!
# Application
NODE_ENV=production
APP_NAME=LCBP3-DMS
APP_URL=https://lcbp3-dms.example.com
# Database
DB_HOST=lcbp3-mariadb
DB_PORT=3306
DB_USERNAME=lcbp3_user
DB_PASSWORD=<CHANGE_ME_STRONG_PASSWORD>
DB_DATABASE=lcbp3_dms
DB_POOL_SIZE=20
# Redis
REDIS_HOST=lcbp3-redis
REDIS_PORT=6379
REDIS_PASSWORD=<CHANGE_ME_STRONG_PASSWORD>
REDIS_DB=0
# JWT Authentication
JWT_SECRET=<CHANGE_ME_RANDOM_64_CHAR_STRING>
JWT_EXPIRES_IN=8h
JWT_REFRESH_EXPIRES_IN=7d
# File Storage
UPLOAD_PATH=/app/uploads
MAX_FILE_SIZE=52428800
ALLOWED_FILE_TYPES=.pdf,.doc,.docx,.xls,.xlsx,.dwg,.zip
# Email (SMTP)
SMTP_HOST=smtp.gmail.com
SMTP_PORT=587
SMTP_SECURE=false
SMTP_USERNAME=<YOUR_EMAIL>
SMTP_PASSWORD=<YOUR_APP_PASSWORD>
SMTP_FROM=noreply@example.com
# Elasticsearch
ELASTICSEARCH_NODE=http://lcbp3-elasticsearch:9200
ELASTICSEARCH_USERNAME=elastic
ELASTICSEARCH_PASSWORD=<CHANGE_ME>
# Rate Limiting
THROTTLE_TTL=60
THROTTLE_LIMIT=100
# Logging
LOG_LEVEL=info
LOG_FILE_PATH=/app/logs
# ClamAV (Virus Scanning)
CLAMAV_HOST=lcbp3-clamav
CLAMAV_PORT=3310
```
### 2. Docker Compose - Blue Environment
```yaml
# File: /volume1/lcbp3/blue/docker-compose.yml
version: '3.8'
services:
backend:
image: lcbp3-backend:latest
container_name: lcbp3-blue-backend
restart: unless-stopped
env_file:
- .env.production
volumes:
- /volume1/lcbp3/shared/uploads:/app/uploads
- /volume1/lcbp3/shared/logs:/app/logs
depends_on:
- mariadb
- redis
- elasticsearch
networks:
- lcbp3-network
healthcheck:
test: ['CMD', 'curl', '-f', 'http://localhost:3000/health']
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
frontend:
image: lcbp3-frontend:latest
container_name: lcbp3-blue-frontend
restart: unless-stopped
environment:
- NEXT_PUBLIC_API_URL=https://lcbp3-dms.example.com/api
depends_on:
- backend
networks:
- lcbp3-network
healthcheck:
test: ['CMD', 'curl', '-f', 'http://localhost:3000']
interval: 30s
timeout: 10s
retries: 3
mariadb:
image: mariadb:11.8
container_name: lcbp3-mariadb
restart: unless-stopped
    environment:
      # NOTE: in production, use a dedicated root password distinct from the app user's
      MYSQL_ROOT_PASSWORD: ${DB_PASSWORD}
      MYSQL_DATABASE: ${DB_DATABASE}
      MYSQL_USER: ${DB_USERNAME}
      MYSQL_PASSWORD: ${DB_PASSWORD}
volumes:
- /volume1/lcbp3/volumes/mariadb-data:/var/lib/mysql
networks:
- lcbp3-network
command: >
--character-set-server=utf8mb4
--collation-server=utf8mb4_unicode_ci
--max_connections=200
--innodb_buffer_pool_size=2G
healthcheck:
test: ['CMD', 'mysqladmin', 'ping', '-h', 'localhost']
interval: 10s
timeout: 5s
retries: 5
redis:
image: redis:7-alpine
container_name: lcbp3-redis
restart: unless-stopped
command: >
redis-server
--requirepass ${REDIS_PASSWORD}
--appendonly yes
--appendfsync everysec
--maxmemory 2gb
--maxmemory-policy allkeys-lru
volumes:
- /volume1/lcbp3/volumes/redis-data:/data
networks:
- lcbp3-network
healthcheck:
test: ['CMD', 'redis-cli', 'ping']
interval: 10s
timeout: 3s
retries: 3
elasticsearch:
image: elasticsearch:8.11.0
container_name: lcbp3-elasticsearch
restart: unless-stopped
environment:
- discovery.type=single-node
- xpack.security.enabled=true
- ELASTIC_PASSWORD=${ELASTICSEARCH_PASSWORD}
- "ES_JAVA_OPTS=-Xms2g -Xmx2g"
volumes:
- /volume1/lcbp3/volumes/elastic-data:/usr/share/elasticsearch/data
networks:
- lcbp3-network
healthcheck:
test: ['CMD-SHELL', 'curl -f http://localhost:9200/_cluster/health || exit 1']
interval: 30s
timeout: 10s
retries: 5
networks:
lcbp3-network:
name: lcbp3-blue-network
driver: bridge
```
### 3. Docker Compose - NGINX Proxy
```yaml
# File: /volume1/lcbp3/nginx-proxy/docker-compose.yml
version: '3.8'
services:
nginx:
image: nginx:alpine
container_name: lcbp3-nginx
restart: unless-stopped
ports:
- "80:80"
- "443:443"
volumes:
- ./nginx.conf:/etc/nginx/nginx.conf:ro
- ./ssl:/etc/nginx/ssl:ro
- /volume1/lcbp3/shared/logs/nginx:/var/log/nginx
networks:
- lcbp3-blue-network
- lcbp3-green-network
healthcheck:
test: ['CMD', 'nginx', '-t']
interval: 30s
timeout: 10s
retries: 3
networks:
lcbp3-blue-network:
external: true
lcbp3-green-network:
external: true
```
### 4. NGINX Configuration
```nginx
# File: /volume1/lcbp3/nginx-proxy/nginx.conf
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;
events {
worker_connections 1024;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
access_log /var/log/nginx/access.log main;
sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;
client_max_body_size 50M;
# Gzip compression
gzip on;
gzip_vary on;
gzip_proxied any;
gzip_comp_level 6;
gzip_types text/plain text/css text/xml text/javascript
application/json application/javascript application/xml+rss;
# Upstream backends (switch between blue/green)
upstream backend {
server lcbp3-blue-backend:3000 max_fails=3 fail_timeout=30s;
keepalive 32;
}
upstream frontend {
server lcbp3-blue-frontend:3000 max_fails=3 fail_timeout=30s;
keepalive 32;
}
# HTTP to HTTPS redirect
server {
listen 80;
server_name lcbp3-dms.example.com;
return 301 https://$server_name$request_uri;
}
# HTTPS server
server {
listen 443 ssl http2;
server_name lcbp3-dms.example.com;
# SSL configuration
ssl_certificate /etc/nginx/ssl/cert.pem;
ssl_certificate_key /etc/nginx/ssl/key.pem;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers HIGH:!aNULL:!MD5;
ssl_prefer_server_ciphers on;
ssl_session_cache shared:SSL:10m;
ssl_session_timeout 10m;
# Security headers
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
add_header X-Frame-Options "SAMEORIGIN" always;
add_header X-Content-Type-Options "nosniff" always;
add_header X-XSS-Protection "1; mode=block" always;
# Frontend (Next.js)
location / {
proxy_pass http://frontend;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_cache_bypass $http_upgrade;
}
# Backend API
location /api {
proxy_pass http://backend;
proxy_http_version 1.1;
proxy_set_header Connection "";
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# Timeouts for file uploads
proxy_connect_timeout 300s;
proxy_send_timeout 300s;
proxy_read_timeout 300s;
}
# Health check endpoint (no logging)
location /health {
proxy_pass http://backend/health;
access_log off;
}
# Static files caching
location ~* \.(jpg|jpeg|png|gif|ico|css|js|svg|woff|woff2|ttf|eot)$ {
proxy_pass http://frontend;
expires 1y;
add_header Cache-Control "public, immutable";
}
}
}
```
---
## 🚀 Initial Deployment
### Step 1: Prepare Docker Images
```bash
# Build images (on development machine)
cd /path/to/lcbp3/backend
docker build -t lcbp3-backend:1.0.0 .
docker tag lcbp3-backend:1.0.0 lcbp3-backend:latest
cd /path/to/lcbp3/frontend
docker build -t lcbp3-frontend:1.0.0 .
docker tag lcbp3-frontend:1.0.0 lcbp3-frontend:latest
# Save images to tar files
docker save lcbp3-backend:latest | gzip > lcbp3-backend-latest.tar.gz
docker save lcbp3-frontend:latest | gzip > lcbp3-frontend-latest.tar.gz
# Transfer to QNAP
scp lcbp3-backend-latest.tar.gz admin@qnap-ip:/volume1/lcbp3/
scp lcbp3-frontend-latest.tar.gz admin@qnap-ip:/volume1/lcbp3/
# Load images on QNAP
ssh admin@qnap-ip
cd /volume1/lcbp3
docker load < lcbp3-backend-latest.tar.gz
docker load < lcbp3-frontend-latest.tar.gz
```
### Step 2: Initialize Database
```bash
# Start MariaDB only
cd /volume1/lcbp3/blue
docker-compose up -d mariadb
# Wait for MariaDB to be ready
docker exec lcbp3-mariadb mysqladmin ping -h localhost
# Run migrations
docker-compose up -d backend
docker exec lcbp3-blue-backend npm run migration:run
# Seed initial data (if needed)
docker exec lcbp3-blue-backend npm run seed
```
### Step 3: Start Blue Environment
```bash
cd /volume1/lcbp3/blue
# Start all services
docker-compose up -d
# Check status
docker-compose ps
# View logs
docker-compose logs -f
# Wait for health checks
sleep 30
# Test health endpoint
curl http://localhost:3000/health
```
### Step 4: Start NGINX Proxy
```bash
cd /volume1/lcbp3/nginx-proxy
# Create networks (if not exist)
docker network create lcbp3-blue-network
docker network create lcbp3-green-network
# Start NGINX
docker-compose up -d
# Test NGINX configuration
docker exec lcbp3-nginx nginx -t
# Check NGINX logs
docker logs lcbp3-nginx
```
### Step 5: Set Current Environment
```bash
# Mark blue as current
echo "blue" > /volume1/lcbp3/current
```
### Step 6: Verify Deployment
```bash
# Test HTTPS endpoint
curl -k https://lcbp3-dms.example.com/health
# Test API
curl -k https://lcbp3-dms.example.com/api/health
# Check all containers
docker ps --filter "name=lcbp3"
# Check logs for errors
docker-compose -f /volume1/lcbp3/blue/docker-compose.yml logs --tail=100
```
---
## 🔄 Blue-Green Deployment Process
### Deployment Script
```bash
#!/bin/bash
# File: /volume1/lcbp3/scripts/deploy.sh
set -e # Exit on error

# Configuration
LCBP3_DIR="/volume1/lcbp3"
CURRENT=$(cat $LCBP3_DIR/current)
TARGET=$([[ "$CURRENT" == "blue" ]] && echo "green" || echo "blue")

# Load DB credentials (used by the backup step below)
source "$LCBP3_DIR/$CURRENT/.env.production"
echo "========================================="
echo "LCBP3-DMS Blue-Green Deployment"
echo "========================================="
echo "Current environment: $CURRENT"
echo "Target environment: $TARGET"
echo "========================================="
# Step 1: Backup database
echo "[1/9] Creating database backup..."
BACKUP_FILE="$LCBP3_DIR/shared/backups/db-backup-$(date +%Y%m%d-%H%M%S).sql"
docker exec lcbp3-mariadb mysqldump -u root -p${DB_PASSWORD} lcbp3_dms > $BACKUP_FILE
gzip $BACKUP_FILE
echo "✓ Backup created: $BACKUP_FILE.gz"
# Step 2: Pull latest images
echo "[2/9] Pulling latest Docker images..."
cd $LCBP3_DIR/$TARGET
docker-compose pull
echo "✓ Images pulled"
# Step 3: Update configuration
echo "[3/9] Updating configuration..."
# Copy .env if changed
if [ -f "$LCBP3_DIR/.env.production.new" ]; then
cp $LCBP3_DIR/.env.production.new $LCBP3_DIR/$TARGET/.env.production
echo "✓ Configuration updated"
fi
# Step 4: Start target environment
echo "[4/9] Starting $TARGET environment..."
docker-compose up -d
echo "$TARGET environment started"
# Step 5: Wait for services to be ready
echo "[5/9] Waiting for services to be healthy..."
sleep 10
# Check backend health
for i in {1..30}; do
if docker exec lcbp3-${TARGET}-backend curl -f http://localhost:3000/health > /dev/null 2>&1; then
echo "✓ Backend is healthy"
break
fi
if [ $i -eq 30 ]; then
echo "✗ Backend health check failed!"
docker-compose logs backend
exit 1
fi
sleep 2
done
# Step 6: Run database migrations
echo "[6/9] Running database migrations..."
docker exec lcbp3-${TARGET}-backend npm run migration:run
echo "✓ Migrations completed"
# Step 7: Switch NGINX to target environment
echo "[7/9] Switching NGINX to $TARGET..."
sed -i "s/lcbp3-${CURRENT}-backend/lcbp3-${TARGET}-backend/g" $LCBP3_DIR/nginx-proxy/nginx.conf
sed -i "s/lcbp3-${CURRENT}-frontend/lcbp3-${TARGET}-frontend/g" $LCBP3_DIR/nginx-proxy/nginx.conf
docker exec lcbp3-nginx nginx -t
docker exec lcbp3-nginx nginx -s reload
echo "✓ NGINX switched to $TARGET"
# Step 8: Verify new environment
echo "[8/9] Verifying new environment..."
sleep 5
if curl -f -k https://lcbp3-dms.example.com/health > /dev/null 2>&1; then
echo "✓ New environment is responding"
else
echo "✗ New environment verification failed!"
echo "Rolling back..."
./rollback.sh
exit 1
fi
# Step 9: Stop old environment
echo "[9/9] Stopping $CURRENT environment..."
cd $LCBP3_DIR/$CURRENT
docker-compose down
echo "$CURRENT environment stopped"
# Update current pointer
echo "$TARGET" > $LCBP3_DIR/current
echo "========================================="
echo "✓ Deployment completed successfully!"
echo "Active environment: $TARGET"
echo "========================================="
# Send notification (optional)
# /scripts/send-notification.sh "Deployment completed: $TARGET is now active"
```
### Rollback Script
```bash
#!/bin/bash
# File: /volume1/lcbp3/scripts/rollback.sh
set -e
LCBP3_DIR="/volume1/lcbp3"
CURRENT=$(cat $LCBP3_DIR/current)
PREVIOUS=$([[ "$CURRENT" == "blue" ]] && echo "green" || echo "blue")
echo "========================================="
echo "LCBP3-DMS Rollback"
echo "========================================="
echo "Current: $CURRENT"
echo "Rolling back to: $PREVIOUS"
echo "========================================="
# Switch NGINX back
echo "[1/3] Switching NGINX to $PREVIOUS..."
sed -i "s/lcbp3-${CURRENT}-backend/lcbp3-${PREVIOUS}-backend/g" $LCBP3_DIR/nginx-proxy/nginx.conf
sed -i "s/lcbp3-${CURRENT}-frontend/lcbp3-${PREVIOUS}-frontend/g" $LCBP3_DIR/nginx-proxy/nginx.conf
docker exec lcbp3-nginx nginx -s reload
echo "✓ NGINX switched"
# Start previous environment if stopped
echo "[2/3] Ensuring $PREVIOUS environment is running..."
cd $LCBP3_DIR/$PREVIOUS
docker-compose up -d
sleep 10
echo "$PREVIOUS environment is running"
# Verify
echo "[3/3] Verifying rollback..."
if curl -f -k https://lcbp3-dms.example.com/health > /dev/null 2>&1; then
echo "✓ Rollback successful"
echo "$PREVIOUS" > $LCBP3_DIR/current
else
echo "✗ Rollback verification failed!"
exit 1
fi
echo "========================================="
echo "✓ Rollback completed"
echo "Active environment: $PREVIOUS"
echo "========================================="
```
### Make Scripts Executable
```bash
chmod +x /volume1/lcbp3/scripts/deploy.sh
chmod +x /volume1/lcbp3/scripts/rollback.sh
```
---
## 📋 Deployment Checklist
### Pre-Deployment
- [ ] Backup current database
- [ ] Tag Docker images with version
- [ ] Update `.env.production` if needed
- [ ] Review migration scripts
- [ ] Notify stakeholders of deployment window
- [ ] Verify SSL certificate validity (> 30 days)
- [ ] Check disk space (> 20% free)
- [ ] Review recent error logs
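The mechanical items on this list (disk space, SSL validity) can be automated in a pre-flight script. A minimal sketch, assuming the `/volume1/lcbp3` layout used elsewhere in this guide and a certificate at `nginx-proxy/ssl/cert.pem` (adjust both to your setup); the remaining items stay manual:

```shell
#!/bin/bash
# preflight.sh -- sketch of the automatable pre-deployment checks above.
set -u

disk_free_pct() {   # free-space percentage for a mount point
  df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print 100 - $5 }'
}

ssl_days_left() {   # days until a PEM certificate expires
  local end
  end=$(openssl x509 -enddate -noout -in "$1" | cut -d= -f2)
  echo $(( ( $(date -d "$end" +%s) - $(date +%s) ) / 86400 ))
}

preflight() {
  local ok=0
  [ "$(disk_free_pct /volume1)" -ge 20 ] || { echo "✗ disk < 20% free"; ok=1; }
  [ "$(ssl_days_left /volume1/lcbp3/nginx-proxy/ssl/cert.pem)" -ge 30 ] \
    || { echo "✗ SSL certificate expires in < 30 days"; ok=1; }
  return $ok
}
# preflight || exit 1
```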
### During Deployment
- [ ] Pull latest Docker images
- [ ] Start target environment (blue/green)
- [ ] Run database migrations
- [ ] Verify health checks pass
- [ ] Switch NGINX proxy
- [ ] Verify application responds correctly
- [ ] Check for errors in logs
- [ ] Monitor performance metrics
### Post-Deployment
- [ ] Monitor logs for 30 minutes
- [ ] Check performance metrics
- [ ] Verify all features working
- [ ] Test critical user flows
- [ ] Stop old environment
- [ ] Update deployment log
- [ ] Notify stakeholders of completion
- [ ] Archive old Docker images
---
## 🔍 Troubleshooting
### Common Issues
#### 1. Container Won't Start
```bash
# Check logs
docker logs lcbp3-blue-backend
# Check resource usage
docker stats
# Restart container
docker restart lcbp3-blue-backend
```
#### 2. Database Connection Failed
```bash
# Check MariaDB is running
docker ps | grep mariadb
# Test connection
docker exec lcbp3-mariadb mysql -u lcbp3_user -p -e "SELECT 1"
# Check environment variables
docker exec lcbp3-blue-backend env | grep DB_
```
#### 3. NGINX 502 Bad Gateway
```bash
# Check backend is running (from inside the container; port 3000 is not published)
docker exec lcbp3-blue-backend curl -f http://localhost:3000/health
# Check NGINX configuration
docker exec lcbp3-nginx nginx -t
# Check NGINX logs
docker logs lcbp3-nginx
# Reload NGINX
docker exec lcbp3-nginx nginx -s reload
```
#### 4. Migration Failed
```bash
# Check migration status
docker exec lcbp3-blue-backend npm run migration:show
# Revert last migration
docker exec lcbp3-blue-backend npm run migration:revert
# Re-run migrations
docker exec lcbp3-blue-backend npm run migration:run
```
---
## 📊 Monitoring
### Health Checks
```bash
# Backend health
curl https://lcbp3-dms.example.com/health
# Database health
docker exec lcbp3-mariadb mysqladmin ping
# Redis health
docker exec lcbp3-redis redis-cli ping
# All containers status
docker ps --filter "name=lcbp3" --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
```
### Performance Monitoring
```bash
# Container resource usage
docker stats --no-stream
# Disk usage
df -h /volume1/lcbp3
# Database size
docker exec lcbp3-mariadb mysql -u root -p -e "
SELECT table_schema AS 'Database',
ROUND(SUM(data_length + index_length) / 1024 / 1024, 2) AS 'Size (MB)'
FROM information_schema.tables
WHERE table_schema = 'lcbp3_dms'
GROUP BY table_schema;"
```
---
## 🔐 Security Best Practices
1. **Change Default Passwords:** Update all passwords in `.env.production`
2. **SSL/TLS:** Always use HTTPS in production
3. **Firewall:** Only expose ports 80, 443, and 22 (SSH)
4. **Regular Updates:** Keep Docker images updated
5. **Backup Encryption:** Encrypt database backups
6. **Access Control:** Limit SSH access to specific IPs
7. **Secrets Management:** Never commit `.env` files to Git
8. **Log Monitoring:** Review logs daily for suspicious activity
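Rule 3 can be sketched for a plain Linux host (on the QNAP itself the same policy is applied through its firewall UI instead); the admin subnet below is an assumption:

```shell
# Default-deny inbound; allow SSH only from the admin LAN, plus HTTP/HTTPS.
iptables -P INPUT DROP
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -p tcp -s 192.168.1.0/24 --dport 22 -j ACCEPT   # SSH, admin subnet only (assumed range)
iptables -A INPUT -p tcp --dport 80  -j ACCEPT                    # HTTP (redirects to HTTPS)
iptables -A INPUT -p tcp --dport 443 -j ACCEPT                    # HTTPS
```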
---
## 📚 Related Documentation
- [Environment Setup Guide](04-02-environment-setup.md)
- [Backup & Recovery](04-04-backup-recovery.md)
- [Monitoring & Alerting](04-03-monitoring-alerting.md)
- [Maintenance Procedures](04-05-maintenance-procedures.md)
- [ADR-015: Deployment Infrastructure](../05-decisions/ADR-015-deployment-infrastructure.md)
---
**Version:** 1.6.0
**Last Updated:** 2025-12-02
**Next Review:** 2026-06-01

---
# Environment Setup & Configuration
**Project:** LCBP3-DMS
**Version:** 1.6.0
**Last Updated:** 2025-12-02
---
## 📋 Overview
This document describes environment variables, configuration files, and secrets management for LCBP3-DMS deployment.
---
## 🔐 Environment Variables
### Backend (.env)
```bash
# File: backend/.env (DO NOT commit to Git)
# Application
NODE_ENV=production
APP_PORT=3000
APP_URL=https://lcbp3-dms.example.com
# Database
DB_HOST=lcbp3-mariadb
DB_PORT=3306
DB_USER=lcbp3_user
DB_PASS=<STRONG_PASSWORD>
DB_NAME=lcbp3_dms
# Redis
REDIS_HOST=lcbp3-redis
REDIS_PORT=6379
REDIS_PASSWORD=<STRONG_PASSWORD>
# JWT Authentication
JWT_SECRET=<RANDOM_256_BIT_SECRET>
JWT_EXPIRATION=1h
JWT_REFRESH_SECRET=<RANDOM_256_BIT_SECRET>
JWT_REFRESH_EXPIRATION=7d
# File Storage
UPLOAD_DIR=/app/uploads
TEMP_UPLOAD_DIR=/app/uploads/temp
MAX_FILE_SIZE=104857600 # 100MB
ALLOWED_FILE_TYPES=pdf,doc,docx,xls,xlsx,dwg,jpg,png
# SMTP Email
SMTP_HOST=smtp.gmail.com
SMTP_PORT=587
SMTP_USER=noreply@example.com
SMTP_PASS=<APP_PASSWORD>
SMTP_FROM="LCBP3-DMS System <noreply@example.com>"
# LINE Notify (Optional)
LINE_NOTIFY_ENABLED=true
# ClamAV Virus Scanner
CLAMAV_HOST=clamav
CLAMAV_PORT=3310
# Elasticsearch
ELASTICSEARCH_NODE=http://lcbp3-elasticsearch:9200
ELASTICSEARCH_INDEX_PREFIX=lcbp3_
# Logging
LOG_LEVEL=info
LOG_FILE_PATH=/app/logs
# Frontend URL (for email links)
FRONTEND_URL=https://lcbp3-dms.example.com
# Rate Limiting
RATE_LIMIT_TTL=60
RATE_LIMIT_MAX=100
```
### Frontend (.env.local)
```bash
# File: frontend/.env.local (DO NOT commit to Git)
# API Backend
NEXT_PUBLIC_API_URL=https://lcbp3-dms.example.com/api
# Application
NEXT_PUBLIC_APP_NAME=LCBP3-DMS
NEXT_PUBLIC_APP_VERSION=1.5.0
# Feature Flags
NEXT_PUBLIC_ENABLE_NOTIFICATIONS=true
NEXT_PUBLIC_ENABLE_LINE_NOTIFY=true
```
---
## 🐳 Docker Compose Configuration
### Production docker-compose.yml
```yaml
# File: docker-compose.yml
version: '3.8'
services:
# NGINX Reverse Proxy
nginx:
image: nginx:alpine
container_name: lcbp3-nginx
ports:
- '80:80'
- '443:443'
volumes:
- ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
- ./nginx/ssl:/etc/nginx/ssl:ro
- nginx-logs:/var/log/nginx
depends_on:
- backend
- frontend
restart: unless-stopped
networks:
- lcbp3-network
# NestJS Backend
backend:
image: lcbp3-backend:latest
container_name: lcbp3-backend
environment:
- NODE_ENV=production
env_file:
- ./backend/.env
volumes:
- uploads:/app/uploads
- backend-logs:/app/logs
depends_on:
- mariadb
- redis
- elasticsearch
restart: unless-stopped
networks:
- lcbp3-network
healthcheck:
test: ['CMD', 'curl', '-f', 'http://localhost:3000/health']
interval: 30s
timeout: 10s
retries: 3
# Next.js Frontend
frontend:
image: lcbp3-frontend:latest
container_name: lcbp3-frontend
environment:
- NODE_ENV=production
env_file:
- ./frontend/.env.local
restart: unless-stopped
networks:
- lcbp3-network
# MariaDB Database
mariadb:
image: mariadb:11.8
container_name: lcbp3-mariadb
environment:
MYSQL_ROOT_PASSWORD: ${DB_ROOT_PASS}
MYSQL_DATABASE: ${DB_NAME}
MYSQL_USER: ${DB_USER}
MYSQL_PASSWORD: ${DB_PASS}
volumes:
- mariadb-data:/var/lib/mysql
- ./mariadb/init:/docker-entrypoint-initdb.d:ro
ports:
- '3306:3306'
restart: unless-stopped
networks:
- lcbp3-network
command: --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci
# Redis Cache & Queue
redis:
image: redis:7.2-alpine
container_name: lcbp3-redis
command: redis-server --requirepass ${REDIS_PASSWORD}
volumes:
- redis-data:/data
ports:
- '6379:6379'
restart: unless-stopped
networks:
- lcbp3-network
# Elasticsearch
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:8.11.0
container_name: lcbp3-elasticsearch
environment:
- discovery.type=single-node
- 'ES_JAVA_OPTS=-Xms512m -Xmx512m'
- xpack.security.enabled=false
volumes:
- elasticsearch-data:/usr/share/elasticsearch/data
ports:
- '9200:9200'
restart: unless-stopped
networks:
- lcbp3-network
# ClamAV (Optional - for virus scanning)
clamav:
image: clamav/clamav:latest
container_name: lcbp3-clamav
restart: unless-stopped
networks:
- lcbp3-network
networks:
lcbp3-network:
driver: bridge
volumes:
mariadb-data:
redis-data:
elasticsearch-data:
uploads:
backend-logs:
nginx-logs:
```
### Development docker-compose.override.yml
```yaml
# File: docker-compose.override.yml (Local development only)
# Add to .gitignore
version: '3.8'
services:
backend:
build:
context: ./backend
dockerfile: Dockerfile.dev
volumes:
- ./backend:/app
- /app/node_modules
environment:
- NODE_ENV=development
- LOG_LEVEL=debug
ports:
- '3000:3000'
- '9229:9229' # Node.js debugger
frontend:
build:
context: ./frontend
dockerfile: Dockerfile.dev
volumes:
- ./frontend:/app
- /app/node_modules
- /app/.next
ports:
- '3001:3000'
mariadb:
ports:
- '3307:3306' # Avoid conflict with local MySQL
redis:
ports:
- '6380:6379'
elasticsearch:
environment:
- 'ES_JAVA_OPTS=-Xms256m -Xmx256m' # Lower memory for dev
```
---
## 🔑 Secrets Management
### Using Docker Secrets (Recommended for Production)
```yaml
# docker-compose.yml
services:
backend:
secrets:
- db_password
- jwt_secret
environment:
DB_PASS_FILE: /run/secrets/db_password
JWT_SECRET_FILE: /run/secrets/jwt_secret
secrets:
db_password:
file: ./secrets/db_password.txt
jwt_secret:
file: ./secrets/jwt_secret.txt
```
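With the `*_FILE` pattern above, something still has to read each file back into the plain variable the app expects. One option, sketched here as an entrypoint helper, follows the `DB_PASS_FILE → DB_PASS` naming convention from the compose snippet (variable names beyond those two are your own):

```shell
#!/bin/bash
# docker-entrypoint sketch: expand VAR_FILE secrets into plain env vars.
file_env() {
  local var="$1"
  local file_var="${var}_FILE"
  local file="${!file_var:-}"            # indirect lookup of e.g. DB_PASS_FILE
  if [ -n "$file" ] && [ -r "$file" ]; then
    export "$var"="$(cat "$file")"       # e.g. DB_PASS=<contents of secret file>
  fi
}
# In the image's real entrypoint:
# file_env DB_PASS
# file_env JWT_SECRET
# exec "$@"
```

Alternatively, the same `*_FILE` resolution can be done inside the NestJS config layer at startup.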
### Generate Strong Secrets
```bash
# Generate JWT Secret
openssl rand -base64 64
# Generate Database Password
openssl rand -base64 32
# Generate Redis Password
openssl rand -base64 32
```
---
## 📁 Directory Structure
```
lcbp3/
├── backend/
│ ├── .env # Backend environment (DO NOT commit)
│ ├── .env.example # Example template (commit this)
│ └── ...
├── frontend/
│ ├── .env.local # Frontend environment (DO NOT commit)
│ ├── .env.example # Example template
│ └── ...
├── nginx/
│ ├── nginx.conf
│ └── ssl/
│ ├── cert.pem
│ └── key.pem
├── secrets/ # Docker secrets (DO NOT commit)
│ ├── db_password.txt
│ ├── jwt_secret.txt
│ └── redis_password.txt
├── docker-compose.yml # Production config
└── docker-compose.override.yml # Development config (DO NOT commit)
```
---
## ⚙️ Configuration Management
### Environment-Specific Configs
**Development:**
```bash
NODE_ENV=development
LOG_LEVEL=debug
DB_HOST=localhost
```
**Staging:**
```bash
NODE_ENV=staging
LOG_LEVEL=info
DB_HOST=staging-db.internal
```
**Production:**
```bash
NODE_ENV=production
LOG_LEVEL=warn
DB_HOST=prod-db.internal
```
### Configuration Validation
Backend validates environment variables at startup:
```typescript
// File: backend/src/config/env.validation.ts
import * as Joi from 'joi';
export const envValidationSchema = Joi.object({
NODE_ENV: Joi.string()
.valid('development', 'staging', 'production')
.required(),
DB_HOST: Joi.string().required(),
DB_PORT: Joi.number().default(3306),
DB_USER: Joi.string().required(),
DB_PASS: Joi.string().required(),
JWT_SECRET: Joi.string().min(32).required(),
// ...
});
```
---
## 🔒 Security Best Practices
### DO:
- ✅ Use strong, random passwords (minimum 32 characters)
- ✅ Rotate secrets every 90 days
- ✅ Use Docker secrets for production
- ✅ Add `.env` files to `.gitignore`
- ✅ Provide `.env.example` templates
- ✅ Validate environment variables at startup
### DON'T:
- ❌ Commit `.env` files to Git
- ❌ Use weak or default passwords
- ❌ Share production credentials via email/chat
- ❌ Reuse passwords across environments
- ❌ Hardcode secrets in source code
---
## 🛠️ Troubleshooting
### Common Issues
**Backend can't connect to database:**
```bash
# Check database container is running
docker ps | grep mariadb
# Check database logs
docker logs lcbp3-mariadb
# Verify credentials
docker exec lcbp3-backend env | grep DB_
```
**Redis connection refused:**
```bash
# Test Redis connection
docker exec lcbp3-redis redis-cli -a <PASSWORD> ping
# Should return: PONG
```
**Environment variable not loading:**
```bash
# Check if env file exists
ls -la backend/.env
# Check if backend loaded the env
docker exec lcbp3-backend env | grep NODE_ENV
```
---
## 📚 Related Documents
- [Deployment Guide](04-01-deployment-guide.md)
- [Security Operations](04-06-security-operations.md)
- [ADR-005: Technology Stack](../05-decisions/ADR-005-technology-stack.md)
---
**Version:** 1.6.0
**Last Review:** 2025-12-01
**Next Review:** 2026-03-01

---
# Monitoring & Alerting
**Project:** LCBP3-DMS
**Version:** 1.6.0
**Last Updated:** 2025-12-02
---
## 📋 Overview
This document describes monitoring setup, health checks, and alerting rules for LCBP3-DMS.
---
## 🎯 Monitoring Objectives
- **Availability:** System uptime > 99.5%
- **Performance:** API response time < 500ms (P95)
- **Reliability:** Error rate < 1%
- **Capacity:** Resource utilization < 80%
---
## 📊 Key Metrics
### Application Metrics
| Metric | Target | Alert Threshold |
| ----------------------- | ------- | ------------------ |
| API Response Time (P95) | < 500ms | > 1000ms |
| Error Rate | < 1% | > 5% |
| Request Rate | N/A | Sudden ±50% change |
| Active Users | N/A | - |
| Queue Length (BullMQ) | < 100 | > 500 |
### Infrastructure Metrics
| Metric | Target | Alert Threshold |
| ------------ | ------ | ----------------- |
| CPU Usage | < 70% | > 90% |
| Memory Usage | < 80% | > 95% |
| Disk Usage | < 80% | > 90% |
| Network I/O | N/A | Anomaly detection |
### Database Metrics
| Metric | Target | Alert Threshold |
| --------------------- | ------- | --------------- |
| Query Time (P95) | < 100ms | > 500ms |
| Connection Pool Usage | < 80% | > 95% |
| Slow Queries | 0 | > 10/min |
| Replication Lag | 0s | > 30s |
---
## 🔍 Health Checks
### Backend Health Endpoint
```typescript
// File: backend/src/health/health.controller.ts
import { Controller, Get, Inject } from '@nestjs/common';
import {
  HealthCheck,
  HealthCheckService,
  TypeOrmHealthIndicator,
  DiskHealthIndicator,
} from '@nestjs/terminus';
import type { Redis } from 'ioredis';
@Controller('health')
export class HealthController {
  constructor(
    private health: HealthCheckService,
    private db: TypeOrmHealthIndicator,
    private disk: DiskHealthIndicator,
    // ioredis client registered elsewhere in the module (token name is illustrative)
    @Inject('REDIS_CLIENT') private redis: Redis
  ) {}
@Get()
@HealthCheck()
check() {
return this.health.check([
// Database health
() => this.db.pingCheck('database'),
// Disk health
() =>
this.disk.checkStorage('storage', {
path: '/',
thresholdPercent: 0.9,
}),
// Redis health
async () => {
const redis = await this.redis.ping();
return { redis: { status: redis === 'PONG' ? 'up' : 'down' } };
},
]);
}
}
```
### Health Check Response
```json
{
"status": "ok",
"info": {
"database": {
"status": "up"
},
"storage": {
"status": "up",
"freePercent": 0.75
},
"redis": {
"status": "up"
}
},
"error": {},
"details": {
"database": {
"status": "up"
},
"storage": {
"status": "up",
"freePercent": 0.75
},
"redis": {
"status": "up"
}
}
}
```
---
## 🐳 Docker Container Monitoring
### Health Check in docker-compose.yml
```yaml
services:
backend:
healthcheck:
test: ['CMD', 'curl', '-f', 'http://localhost:3000/health']
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
mariadb:
healthcheck:
test: ['CMD', 'mysqladmin', 'ping', '-h', 'localhost']
interval: 30s
timeout: 10s
retries: 3
redis:
healthcheck:
test: ['CMD', 'redis-cli', 'ping']
interval: 30s
timeout: 10s
retries: 3
```
### Monitor Container Status
```bash
#!/bin/bash
# File: /scripts/monitor-containers.sh
# Check all containers are healthy
CONTAINERS=("lcbp3-backend" "lcbp3-frontend" "lcbp3-mariadb" "lcbp3-redis")
for CONTAINER in "${CONTAINERS[@]}"; do
HEALTH=$(docker inspect --format='{{.State.Health.Status}}' $CONTAINER 2>/dev/null)
if [ "$HEALTH" != "healthy" ]; then
echo "ALERT: $CONTAINER is $HEALTH"
# Send alert (email, Slack, etc.)
fi
done
```
---
## 📈 Application Performance Monitoring (APM)
### Log-Based Monitoring (MVP Phase)
```typescript
// File: backend/src/common/interceptors/performance.interceptor.ts
import {
Injectable,
NestInterceptor,
ExecutionContext,
CallHandler,
} from '@nestjs/common';
import { Observable } from 'rxjs';
import { tap } from 'rxjs/operators';
import { logger } from 'src/config/logger.config';
@Injectable()
export class PerformanceInterceptor implements NestInterceptor {
intercept(context: ExecutionContext, next: CallHandler): Observable<any> {
const request = context.switchToHttp().getRequest();
const start = Date.now();
return next.handle().pipe(
tap({
next: () => {
const duration = Date.now() - start;
logger.info('Request completed', {
method: request.method,
url: request.url,
statusCode: context.switchToHttp().getResponse().statusCode,
duration: `${duration}ms`,
userId: request.user?.user_id,
});
// Alert on slow requests
if (duration > 1000) {
logger.warn('Slow request detected', {
method: request.method,
url: request.url,
duration: `${duration}ms`,
});
}
},
error: (error) => {
const duration = Date.now() - start;
logger.error('Request failed', {
method: request.method,
url: request.url,
duration: `${duration}ms`,
error: error.message,
});
},
})
);
}
}
```
---
## 🚨 Alerting Rules
### Critical Alerts (Immediate Action Required)
| Alert | Condition | Action |
| --------------- | ------------------------------------------- | --------------------------- |
| Service Down | Health check fails for 3 consecutive checks | Page on-call engineer |
| Database Down | Cannot connect to database | Page DBA + on-call engineer |
| Disk Full | Disk usage > 95% | Page operations team |
| High Error Rate | Error rate > 10% for 5 min | Page on-call engineer |
### Warning Alerts (Review Within 1 Hour)
| Alert | Condition | Action |
| ------------- | ----------------------- | ---------------------- |
| High CPU | CPU > 90% for 10 min | Notify operations team |
| High Memory | Memory > 95% for 10 min | Notify operations team |
| Slow Queries | > 50 slow queries/min | Notify DBA |
| Queue Backlog | BullMQ queue > 500 jobs | Notify backend team |
### Info Alerts (Review During Business Hours)
| Alert | Condition | Action |
| ------------------ | ------------------------------------ | --------------------- |
| Backup Failed | Daily backup job failed | Email operations team |
| SSL Expiring | SSL certificate expires in < 30 days | Email operations team |
| Disk Space Warning | Disk usage > 80% | Email operations team |
---
## 📧 Alert Notification Channels
### Email Alerts
```bash
#!/bin/bash
# File: /scripts/send-alert-email.sh
TO="ops-team@example.com"
SUBJECT="$1"
MESSAGE="$2"
echo "$MESSAGE" | mail -s "[LCBP3-DMS] $SUBJECT" "$TO"
```
### Slack Alerts
```bash
#!/bin/bash
# File: /scripts/send-alert-slack.sh
WEBHOOK_URL="https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
MESSAGE="$1"
curl -X POST -H 'Content-type: application/json' \
--data "{\"text\":\"🚨 LCBP3-DMS Alert: $MESSAGE\"}" \
"$WEBHOOK_URL"
```
---
## 📊 Monitoring Dashboard
### Metrics to Display
**System Overview:**
- Service status (up/down)
- Overall system health score
- Active user count
- Request rate (req/s)
**Performance:**
- API response time (P50, P95, P99)
- Database query time
- Queue processing time
**Resources:**
- CPU usage %
- Memory usage %
- Disk usage %
- Network I/O
**Business Metrics:**
- Documents created today
- Workflows completed today
- Active correspondences
- Pending approvals
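Until a full dashboard exists, a command-line snapshot covers the resource numbers above (container names as used throughout this guide):

```shell
#!/bin/bash
# dashboard-snapshot sketch: one-shot resource overview.

pct_to_int() {   # "85.3%" -> 85, for threshold comparisons
  printf '%s' "$1" | awk '{ sub(/%/, ""); printf "%d", $1 }'
}

snapshot() {
  docker stats --no-stream \
    --format '{{.Name}}  CPU {{.CPUPerc}}  MEM {{.MemPerc}}' | grep '^lcbp3-'
  df -h /volume1/lcbp3 | awk 'NR==2 { print "disk used: " $5 }'
}
# snapshot
```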
---
## 🔧 Log Aggregation
### Centralized Logging with Docker
```bash
# Configure Docker logging driver
# File: /etc/docker/daemon.json
{
"log-driver": "json-file",
"log-opts": {
"max-size": "10m",
"max-file": "3",
"labels": "service,environment"
}
}
```
### View Aggregated Logs
```bash
# View all LCBP3 container logs
docker-compose logs -f --tail=100
# View specific service logs
docker logs lcbp3-backend -f --since=1h
# Search logs
docker logs lcbp3-backend 2>&1 | grep "ERROR"
# Export logs for analysis
docker logs lcbp3-backend > backend-logs.txt
```
---
## 📈 Performance Baseline
### Establish Baselines
Run load tests to establish performance baselines:
```bash
# Install Apache Bench
apt-get install apache2-utils
# Test API endpoint
ab -n 1000 -c 10 \
-H "Authorization: Bearer <TOKEN>" \
https://lcbp3-dms.example.com/api/correspondences
# Results to record:
# - Requests per second
# - Mean response time
# - P95 response time
# - Error rate
```
### Regular Performance Testing
- **Weekly:** Quick health check (100 requests)
- **Monthly:** Full load test (10,000 requests)
- **Quarterly:** Stress test (find breaking point)
---
## ✅ Monitoring Checklist
### Daily
- [ ] Check service health dashboard
- [ ] Review error logs
- [ ] Verify backup completion
- [ ] Check disk space
### Weekly
- [ ] Review performance metrics trends
- [ ] Analyze slow query log
- [ ] Check SSL certificate expiry
- [ ] Review security alerts
### Monthly
- [ ] Capacity planning review
- [ ] Update monitoring thresholds
- [ ] Test alert notifications
- [ ] Review and tune performance
---
## 🔗 Related Documents
- [Backup & Recovery](04-04-backup-recovery.md)
- [Incident Response](04-07-incident-response.md)
- [ADR-010: Logging Strategy](../05-decisions/ADR-010-logging-monitoring-strategy.md)
---
**Version:** 1.6.0
**Last Review:** 2025-12-01
**Next Review:** 2026-03-01

---
# Backup & Recovery Procedures
**Project:** LCBP3-DMS
**Version:** 1.6.0
**Last Updated:** 2025-12-02
---
## 📋 Overview
This document outlines backup strategies, recovery procedures, and disaster recovery planning for LCBP3-DMS.
---
## 🎯 Backup Strategy
### Backup Schedule
| Data Type | Frequency | Retention | Method |
| ---------------------- | -------------- | --------- | ----------------------- |
| Database (Full) | Daily at 02:00 | 30 days | mysqldump + compression |
| Database (Incremental) | Every 6 hours | 7 days | Binary logs |
| File Uploads | Daily at 03:00 | 30 days | rsync to backup server |
| Configuration Files | Weekly | 90 days | Git repository |
| Elasticsearch Indexes | Weekly | 14 days | Snapshot to S3/NFS |
| Application Logs | Daily | 90 days | Rotation + archival |
### Backup Locations
**Primary Backup:** QNAP NAS `/backup/lcbp3-dms`
**Secondary Backup:** External backup server (rsync)
**Offsite Backup:** Cloud storage (optional - for critical data)
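The incremental tier in the schedule above relies on MariaDB binary logs, which are not enabled by default. A sketch of the required config; the `conf.d` host path is an assumption — match it to the volume mounts in your compose file:

```shell
CONF_DIR="./mariadb/conf.d"   # host dir mounted to /etc/mysql/conf.d (assumption)
mkdir -p "$CONF_DIR"
cat > "$CONF_DIR/binlog.cnf" <<'EOF'
[mysqld]
log_bin          = mysql-bin
binlog_format    = ROW
expire_logs_days = 7
EOF
# docker restart lcbp3-mariadb   # restart to pick up the new config
```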
---
## 💾 Database Backup
### Automated Daily Backup Script
```bash
#!/bin/bash
# File: /scripts/backup-database.sh
set -o pipefail # so "$?" below reflects mysqldump failures, not just gzip's
# Configuration
BACKUP_DIR="/backup/lcbp3-dms/database"
DB_CONTAINER="lcbp3-mariadb"
DB_NAME="lcbp3_dms"
DB_USER="backup_user"
DB_PASS="<BACKUP_USER_PASSWORD>"
RETENTION_DAYS=30
# Create backup directory
BACKUP_FILE="$BACKUP_DIR/lcbp3_$(date +%Y%m%d_%H%M%S).sql.gz"
mkdir -p "$BACKUP_DIR"
# Perform backup
echo "Starting database backup to $BACKUP_FILE"
docker exec $DB_CONTAINER mysqldump \
--user=$DB_USER \
--password=$DB_PASS \
--single-transaction \
--routines \
--triggers \
--databases $DB_NAME \
| gzip > "$BACKUP_FILE"
# Check backup success
if [ $? -eq 0 ]; then
echo "Backup completed successfully"
# Delete old backups
find "$BACKUP_DIR" -name "*.sql.gz" -type f -mtime +$RETENTION_DAYS -delete
echo "Old backups cleaned up (retention: $RETENTION_DAYS days)"
else
echo "ERROR: Backup failed!"
exit 1
fi
```
### Schedule with Cron
```bash
# Edit crontab
crontab -e
# Add backup job (runs daily at 2 AM)
0 2 * * * /scripts/backup-database.sh >> /var/log/backup-database.log 2>&1
```
### Manual Database Backup
```bash
# Backup specific database
docker exec lcbp3-mariadb mysqldump \
-u root -p \
--single-transaction \
lcbp3_dms > backup_$(date +%Y%m%d).sql
# Compress backup
gzip backup_$(date +%Y%m%d).sql
```
---
## 📂 File Uploads Backup
### Automated Rsync Backup
```bash
#!/bin/bash
# File: /scripts/backup-uploads.sh
SOURCE="/var/lib/docker/volumes/lcbp3_uploads/_data"
DEST="/backup/lcbp3-dms/uploads"
RETENTION_DAYS=30
# Create incremental backup with rsync
rsync -av --delete \
--backup --backup-dir="$DEST/backup-$(date +%Y%m%d)" \
"$SOURCE/" "$DEST/current/"
# Cleanup old backups
find "$DEST" -maxdepth 1 -type d -name "backup-*" -mtime +$RETENTION_DAYS -exec rm -rf {} \;
echo "Upload backup completed: $(date)"
```
### Schedule Uploads Backup
```bash
# Run daily at 3 AM
0 3 * * * /scripts/backup-uploads.sh >> /var/log/backup-uploads.log 2>&1
```
---
## 🔄 Database Recovery
### Full Database Restore
```bash
# Step 1: Stop backend application
docker stop lcbp3-backend
# Step 2: Restore database from backup
gunzip < backup_20241201.sql.gz | \
docker exec -i lcbp3-mariadb mysql -u root -p lcbp3_dms
# Step 3: Verify restore
docker exec lcbp3-mariadb mysql -u root -p -e "
USE lcbp3_dms;
SELECT COUNT(*) FROM users;
SELECT COUNT(*) FROM correspondences;
"
# Step 4: Restart backend
docker start lcbp3-backend
```
### Point-in-Time Recovery (Using Binary Logs)
```bash
# Step 1: Restore last full backup
gunzip < backup_20241201_020000.sql.gz | \
docker exec -i lcbp3-mariadb mysql -u root -p lcbp3_dms
# Step 2: Apply binary logs since backup
docker exec lcbp3-mariadb mysqlbinlog \
--start-datetime="2024-12-01 02:00:00" \
--stop-datetime="2024-12-01 14:30:00" \
/var/lib/mysql/mysql-bin.000001 | \
docker exec -i lcbp3-mariadb mysql -u root -p lcbp3_dms
```
---
## 📁 File Uploads Recovery
### Restore from Backup
```bash
# Stop backend to prevent file operations
docker stop lcbp3-backend
# Restore files
rsync -av \
/backup/lcbp3-dms/uploads/current/ \
/var/lib/docker/volumes/lcbp3_uploads/_data/
# Fix ownership on the host path (the backend container is stopped, so
# `docker exec` is unavailable here; 1000:1000 is the default node UID/GID)
chown -R 1000:1000 /var/lib/docker/volumes/lcbp3_uploads/_data
# Restart backend
docker start lcbp3-backend
```
---
## 🚨 Disaster Recovery Plan
### RTO & RPO
- **RTO (Recovery Time Objective):** 4 hours
- **RPO (Recovery Point Objective):** 24 hours (for files), 6 hours (for database)
### DR Scenarios
#### Scenario 1: Database Corruption
**Detection:** Database errors in logs, application errors
**Recovery Time:** 30 minutes
**Steps:**
1. Stop backend
2. Restore last full backup
3. Apply binary logs (if needed)
4. Verify data integrity
5. Restart services
#### Scenario 2: Complete Server Failure
**Detection:** Server unresponsive
**Recovery Time:** 4 hours
**Steps:**
1. Provision new QNAP server or VM
2. Install Docker & Container Station
3. Clone Git repository
4. Restore database backup
5. Restore file uploads
6. Deploy containers
7. Update DNS (if needed)
8. Verify functionality
#### Scenario 3: Ransomware Attack
**Detection:** Encrypted files, ransom note
**Recovery Time:** 6 hours
**Steps:**
1. **DO NOT pay ransom**
2. Isolate infected server
3. Provision clean environment
4. Restore from offsite backup
5. Scan restored backup for malware
6. Deploy and verify
7. Review security logs
8. Implement additional security measures
---
## ✅ Backup Verification
### Weekly Backup Testing
```bash
#!/bin/bash
# File: /scripts/test-backup.sh
# Create temporary test database
docker exec lcbp3-mariadb mysql -u root -p -e "
CREATE DATABASE IF NOT EXISTS test_restore;
"
# Restore latest backup to test database
LATEST_BACKUP=$(ls -t /backup/lcbp3-dms/database/*.sql.gz | head -1)
gunzip < "$LATEST_BACKUP" | \
  sed -e 's/USE `lcbp3_dms`/USE `test_restore`/g' \
      -e '/CREATE DATABASE.*`lcbp3_dms`/d' | \
  docker exec -i lcbp3-mariadb mysql -u root -p
# Verify table counts
docker exec lcbp3-mariadb mysql -u root -p -e "
SELECT COUNT(*) FROM test_restore.users;
SELECT COUNT(*) FROM test_restore.correspondences;
"
# Cleanup
docker exec lcbp3-mariadb mysql -u root -p -e "
DROP DATABASE test_restore;
"
echo "Backup verification completed: $(date)"
```
### Monthly DR Drill
- Test full system restore on standby server
- Document time taken and issues encountered
- Update DR procedures based on findings
---
## 📊 Backup Monitoring
### Backup Status Dashboard
Monitor:
- ✅ Last successful backup timestamp
- ✅ Backup file size (detect anomalies)
- ✅ Backup success/failure rate
- ✅ Available backup storage space
### Alerts
Send alert if:
- ❌ Backup fails
- ❌ Backup file size < 50% of average (possible corruption)
- ❌ No backup in last 48 hours
- ❌ Backup storage < 20% free
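The "no backup in 48 hours" condition above is easy to check from cron; a sketch, using the backup directory from the database backup script:

```shell
#!/bin/bash
# check-backup-freshness.sh -- sketch of the backup-age alert above.
BACKUP_DIR="${BACKUP_DIR:-/backup/lcbp3-dms/database}"

backup_age_hours() {   # hours since the newest *.sql.gz in a directory
  local newest
  newest=$(ls -t "$1"/*.sql.gz 2>/dev/null | head -1)
  [ -n "$newest" ] || return 1
  echo $(( ( $(date +%s) - $(stat -c %Y "$newest") ) / 3600 ))
}

check_backups() {
  local age
  if ! age=$(backup_age_hours "$BACKUP_DIR"); then
    echo "ALERT: no backups found in $BACKUP_DIR"; return 1
  fi
  if [ "$age" -gt 48 ]; then
    echo "ALERT: newest backup is ${age}h old"; return 1
  fi
  echo "OK: newest backup ${age}h old"
}
# check_backups || /scripts/send-alert-email.sh "Backup check failed" "see backup logs"
```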
---
## 🔧 Maintenance
### Optimize Backup Performance
```sql
-- Enable InnoDB compression for large tables
ALTER TABLE correspondences ROW_FORMAT=COMPRESSED;
ALTER TABLE workflow_history ROW_FORMAT=COMPRESSED;
-- Archive old audit logs
-- Move records older than 1 year to archive table
INSERT INTO audit_logs_archive
SELECT * FROM audit_logs
WHERE created_at < DATE_SUB(NOW(), INTERVAL 1 YEAR);
DELETE FROM audit_logs
WHERE created_at < DATE_SUB(NOW(), INTERVAL 1 YEAR);
```
---
## 📚 Backup Checklist
### Daily Tasks
- [ ] Verify automated backups completed
- [ ] Check backup log files for errors
- [ ] Monitor backup storage space
### Weekly Tasks
- [ ] Test restore from random backup
- [ ] Review backup size trends
- [ ] Verify offsite backups synced
### Monthly Tasks
- [ ] Full DR drill
- [ ] Review and update DR procedures
- [ ] Test backup restoration on different server
### Quarterly Tasks
- [ ] Audit backup access controls
- [ ] Review backup retention policies
- [ ] Update backup documentation
---
## 🔗 Related Documents
- [Deployment Guide](04-01-deployment-guide.md)
- [Monitoring & Alerting](04-03-monitoring-alerting.md)
- [Incident Response](04-07-incident-response.md)
---
**Version:** 1.6.0
**Last Review:** 2025-12-01
**Next Review:** 2026-03-01

---
# Maintenance Procedures
**Project:** LCBP3-DMS
**Version:** 1.6.0
**Last Updated:** 2025-12-02
---
## 📋 Overview
This document outlines routine maintenance tasks, update procedures, and optimization guidelines for LCBP3-DMS.
---
## 📅 Maintenance Schedule
### Daily Tasks
- Monitor system health and backups
- Review error logs
- Check disk space
### Weekly Tasks
- Database optimization
- Log rotation and cleanup
- Security patch review
- Performance monitoring review
### Monthly Tasks
- SSL certificate check
- Dependency updates (Security patches)
- Database maintenance
- Backup restoration test
### Quarterly Tasks
- Full system update
- Capacity planning review
- Security audit
- Disaster recovery drill
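The recurring jobs in this schedule can be wired together in one crontab. A sketch using the script paths defined elsewhere in this document (the weekly run times are illustrative):

```shell
# m h dom mon dow  command
0 2 * * *   /scripts/backup-database.sh        >> /var/log/backup-database.log 2>&1
0 3 * * *   /scripts/backup-uploads.sh         >> /var/log/backup-uploads.log  2>&1
0 4 * * 0   /scripts/weekly-db-maintenance.sh  >> /var/log/db-maintenance.log  2>&1
30 4 * * 0  /scripts/test-backup.sh            >> /var/log/test-backup.log     2>&1
```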
---
## 🔄 Update Procedures
### Application Updates
#### Backend Update
```bash
#!/bin/bash
# File: /scripts/update-backend.sh
# Step 1: Backup database
/scripts/backup-database.sh
# Step 2: Pull latest code
cd /app/lcbp3/backend
git pull origin main
# Step 3: Install dependencies
docker exec lcbp3-backend npm install
# Step 4: Run migrations
docker exec lcbp3-backend npm run migration:run
# Step 5: Build application
docker exec lcbp3-backend npm run build
# Step 6: Restart backend
docker restart lcbp3-backend
# Step 7: Verify health
sleep 10
docker exec lcbp3-backend curl -f http://localhost:3000/health || {
echo "Health check failed! Rolling back..."
docker exec lcbp3-backend npm run migration:revert
docker restart lcbp3-backend
exit 1
}
echo "Backend updated successfully"
```
#### Frontend Update
```bash
#!/bin/bash
# File: /scripts/update-frontend.sh
# Step 1: Pull latest code
cd /app/lcbp3/frontend
git pull origin main
# Step 2: Install dependencies
docker exec lcbp3-frontend npm install
# Step 3: Build application
docker exec lcbp3-frontend npm run build
# Step 4: Restart frontend
docker restart lcbp3-frontend
# Step 5: Verify
sleep 10
curl -f http://localhost:3001 || {
echo "Frontend failed to start!"
exit 1
}
echo "Frontend updated successfully"
```
### Zero-Downtime Deployment
```bash
#!/bin/bash
# File: /scripts/zero-downtime-deploy.sh
# Using blue-green deployment strategy
# Step 1: Start the new "green" backend alongside the running "blue" one
docker-compose -f docker-compose.green.yml up -d backend
# Step 2: Wait for the green instance to pass its health check
HEALTHY=0
for i in {1..30}; do
  curl -sf http://localhost:3002/health && HEALTHY=1 && break
  sleep 2
done
if [ "$HEALTHY" -ne 1 ]; then
  echo "Green backend never became healthy; aborting deployment"
  exit 1
fi
# Step 3: Switch NGINX to the green upstream
# (update the upstream block in nginx.conf to point at port 3002 before this reload)
docker exec lcbp3-nginx nginx -s reload
# Step 4: Stop old "blue" backend
docker stop lcbp3-backend-blue
echo "Deployment completed with zero downtime"
```
---
## 🗄️ Database Maintenance
### Weekly Database Optimization
```sql
-- File: /scripts/optimize-database.sql
-- Optimize tables
OPTIMIZE TABLE correspondences;
OPTIMIZE TABLE rfas;
OPTIMIZE TABLE workflow_instances;
OPTIMIZE TABLE attachments;
-- Analyze tables for query optimization
ANALYZE TABLE correspondences;
ANALYZE TABLE rfas;
-- Check for table corruption
CHECK TABLE correspondences;
CHECK TABLE rfas;
-- Rebuild indexes if fragmented
ALTER TABLE correspondences ENGINE=InnoDB;
```
```bash
#!/bin/bash
# File: /scripts/weekly-db-maintenance.sh
# -i is required so the host-side redirect reaches mysql's stdin; -p"$VAR" avoids an interactive prompt
docker exec -i lcbp3-mariadb mysql -u root -p"$DB_ROOT_PASSWORD" lcbp3_dms < /scripts/optimize-database.sql
echo "Database optimization completed: $(date)"
```
### Monthly Database Cleanup
```sql
-- Archive old audit logs (older than 1 year)
INSERT INTO audit_logs_archive
SELECT * FROM audit_logs
WHERE created_at < DATE_SUB(NOW(), INTERVAL 1 YEAR);
DELETE FROM audit_logs
WHERE created_at < DATE_SUB(NOW(), INTERVAL 1 YEAR);
-- Clean up deleted notifications (older than 90 days)
DELETE FROM notifications
WHERE deleted_at IS NOT NULL
AND deleted_at < DATE_SUB(NOW(), INTERVAL 90 DAY);
-- Clean up expired temp uploads (older than 24h)
DELETE FROM temp_uploads
WHERE created_at < DATE_SUB(NOW(), INTERVAL 1 DAY);
-- Optimize after cleanup
OPTIMIZE TABLE audit_logs;
OPTIMIZE TABLE notifications;
OPTIMIZE TABLE temp_uploads;
```
---
## 📦 Dependency Updates
### Security Patch Updates (Monthly)
```bash
#!/bin/bash
# File: /scripts/update-dependencies.sh
cd /app/lcbp3/backend
# Check for security vulnerabilities
npm audit
# Update security patches only (no major versions)
npm audit fix
# Run tests
npm test
# If tests pass, commit and deploy
git add package*.json
git commit -m "chore: security patch updates"
git push origin main
```
### Major Version Updates (Quarterly)
```bash
# Check for outdated packages
npm outdated
# Update one major dependency at a time
npm install @nestjs/core@latest
# Test thoroughly
npm test
npm run test:e2e
# If successful, commit
git commit -am "chore: update @nestjs/core to vX.X.X"
```
---
## 🧹 Log Management
### Log Rotation Configuration
```bash
# File: /etc/logrotate.d/lcbp3-dms
/app/logs/*.log {
daily
rotate 30
compress
delaycompress
missingok
notifempty
create 0640 node node
sharedscripts
postrotate
docker exec lcbp3-backend kill -USR1 1
endscript
}
```
### Manual Log Cleanup
```bash
#!/bin/bash
# File: /scripts/cleanup-logs.sh
# Delete logs older than 90 days
find /app/logs -name "*.log" -type f -mtime +90 -delete
# Compress logs older than 7 days
find /app/logs -name "*.log" -type f -mtime +7 -exec gzip {} \;
# Prune unused Docker images/containers older than 30 days
# (do NOT add --volumes here: volume pruning ignores the "until" filter and can delete data volumes)
docker system prune -f --filter "until=720h"
echo "Log cleanup completed: $(date)"
```
---
## 🔐 SSL Certificate Renewal
### Check Certificate Expiry
```bash
#!/bin/bash
# File: /scripts/check-ssl-cert.sh
CERT_FILE="/app/nginx/ssl/cert.pem"
EXPIRY_DATE=$(openssl x509 -enddate -noout -in "$CERT_FILE" | cut -d= -f2)
EXPIRY_EPOCH=$(date -d "$EXPIRY_DATE" +%s)
NOW_EPOCH=$(date +%s)
DAYS_LEFT=$(( ($EXPIRY_EPOCH - $NOW_EPOCH) / 86400 ))
echo "SSL certificate expires in $DAYS_LEFT days"
if [ $DAYS_LEFT -lt 30 ]; then
echo "WARNING: SSL certificate expires soon!"
# Send alert
/scripts/send-alert-email.sh "SSL Certificate Expiring" "Certificate expires in $DAYS_LEFT days"
fi
```
### Renew SSL Certificate (Let's Encrypt)
```bash
#!/bin/bash
# File: /scripts/renew-ssl.sh
# Renew certificate
certbot renew --webroot -w /app/nginx/html
# Copy new certificate
cp /etc/letsencrypt/live/lcbp3-dms.example.com/fullchain.pem /app/nginx/ssl/cert.pem
cp /etc/letsencrypt/live/lcbp3-dms.example.com/privkey.pem /app/nginx/ssl/key.pem
# Reload NGINX
docker exec lcbp3-nginx nginx -s reload
echo "SSL certificate renewed: $(date)"
```
---
## 🧪 Performance Optimization
### Database Query Optimization
```sql
-- Find slow queries
SELECT * FROM mysql.slow_log
ORDER BY query_time DESC
LIMIT 10;
-- Add indexes for frequently queried columns
CREATE INDEX idx_correspondences_status ON correspondences(status);
CREATE INDEX idx_rfas_workflow_status ON rfas(workflow_status);
CREATE INDEX idx_attachments_entity ON attachments(entity_type, entity_id);
-- Analyze query execution plan
EXPLAIN SELECT * FROM correspondences
WHERE status = 'PENDING'
AND created_at > DATE_SUB(NOW(), INTERVAL 30 DAY);
```
### Redis Cache Optimization
```bash
#!/bin/bash
# File: /scripts/optimize-redis.sh
# Check Redis memory usage
docker exec lcbp3-redis redis-cli INFO memory
# Set max memory policy
docker exec lcbp3-redis redis-cli CONFIG SET maxmemory 1gb
docker exec lcbp3-redis redis-cli CONFIG SET maxmemory-policy allkeys-lru
# Save configuration
docker exec lcbp3-redis redis-cli CONFIG REWRITE
# Clear stale cache (if needed)
docker exec lcbp3-redis redis-cli FLUSHDB
```
### Application Performance Tuning
```typescript
// Enable production optimizations in NestJS
// File: backend/src/main.ts
async function bootstrap() {
const app = await NestFactory.create(AppModule, {
logger:
process.env.NODE_ENV === 'production'
? ['error', 'warn']
: ['log', 'error', 'warn', 'debug'],
});
// Enable compression
app.use(compression());
// Enable caching
app.useGlobalInterceptors(new CacheInterceptor());
// Set global timeout
app.use(timeout('30s'));
await app.listen(3000);
}
```
---
## 🔒 Security Maintenance
### Monthly Security Tasks
```bash
#!/bin/bash
# File: /scripts/security-maintenance.sh
# Update system packages
apt-get update && apt-get upgrade -y
# Update ClamAV virus definitions
docker exec lcbp3-clamav freshclam
# Scan for rootkits
rkhunter --check --skip-keypress
# Check for unauthorized users
awk -F: '($3 >= 1000) {print $1}' /etc/passwd
# Review sudo access
cat /etc/sudoers
# Check firewall rules
iptables -L -n -v
echo "Security maintenance completed: $(date)"
```
---
## ✅ Maintenance Checklist
### Pre-Maintenance
- [ ] Announce maintenance window to users
- [ ] Backup database and files
- [ ] Document current system state
- [ ] Prepare rollback plan
### During Maintenance
- [ ] Put system in maintenance mode (if needed)
- [ ] Perform updates/changes
- [ ] Run smoke tests
- [ ] Monitor system health
### Post-Maintenance
- [ ] Verify all services running
- [ ] Run full test suite
- [ ] Monitor performance metrics
- [ ] Communicate completion to users
- [ ] Document changes made
---
## 🔧 Emergency Maintenance
### Unplanned Maintenance Procedures
1. **Assess Urgency**
- Can it wait for scheduled maintenance?
- Is it causing active issues?
2. **Communicate Impact**
- Notify stakeholders immediately
- Estimate downtime
- Provide updates every 30 minutes
3. **Execute Carefully**
- Always backup first
- Have rollback plan ready
- Test in staging if possible
4. **Post-Maintenance Review**
- Document what happened
- Identify preventive measures
- Update runbooks
---
## 📚 Related Documents
- [Deployment Guide](04-01-deployment-guide.md)
- [Backup & Recovery](04-04-backup-recovery.md)
- [Monitoring & Alerting](04-03-monitoring-alerting.md)
---
**Version:** 1.6.0
**Last Review:** 2025-12-01
**Next Review:** 2026-03-01

# Security Operations
**Project:** LCBP3-DMS
**Version:** 1.6.0
**Last Updated:** 2025-12-02
---
## 📋 Overview
This document outlines security monitoring, access control management, vulnerability management, and security incident response for LCBP3-DMS.
---
## 🔒 Access Control Management
### User Access Review
**Monthly Tasks:**
```bash
#!/bin/bash
# File: /scripts/audit-user-access.sh
# Export active users
docker exec lcbp3-mariadb mysql -u root -p -e "
SELECT user_id, username, email, primary_organization_id, is_active, last_login_at
FROM lcbp3_dms.users
WHERE is_active = 1
ORDER BY last_login_at DESC;
" > /reports/active-users-$(date +%Y%m%d).csv
# Find dormant accounts (no login > 90 days)
docker exec lcbp3-mariadb mysql -u root -p -e "
SELECT user_id, username, email, last_login_at,
DATEDIFF(NOW(), last_login_at) AS days_inactive
FROM lcbp3_dms.users
WHERE is_active = 1
AND (last_login_at IS NULL OR last_login_at < DATE_SUB(NOW(), INTERVAL 90 DAY));
"
echo "User access audit completed: $(date)"
```
### Role & Permission Audit
```sql
-- Review users with elevated permissions
SELECT u.username, u.email, r.role_name, r.scope
FROM users u
JOIN user_assignments ua ON u.user_id = ua.user_id
JOIN roles r ON ua.role_id = r.role_id
WHERE r.role_name IN ('Superadmin', 'Document Controller', 'Project Manager')
ORDER BY r.role_name, u.username;
-- Review Global scope roles (highest privilege)
SELECT u.username, r.role_name
FROM users u
JOIN user_assignments ua ON u.user_id = ua.user_id
JOIN roles r ON ua.role_id = r.role_id
WHERE r.scope = 'Global';
```
---
## 🛡️ Security Monitoring
### Log Monitoring for Security Events
```bash
#!/bin/bash
# File: /scripts/monitor-security-events.sh
# Check for failed login attempts
docker logs lcbp3-backend | grep "Failed login" | tail -20
# Check for unauthorized access attempts (403)
docker logs lcbp3-backend | grep "403" | tail -20
# Check for unusual activity patterns
docker logs lcbp3-backend | grep -E "DELETE|DROP|TRUNCATE" | tail -20
# Check for SQL injection attempts (raw SQL fragments appearing in request logs)
docker logs lcbp3-backend | grep -iE "union[[:space:]]+select|or 1=1|;[[:space:]]*drop[[:space:]]" | tail -20
```
### Failed Login Monitoring
```sql
-- Find accounts with multiple failed login attempts
SELECT username, failed_attempts, locked_until
FROM users
WHERE failed_attempts >= 3
ORDER BY failed_attempts DESC;
-- Unlock user account after verification
UPDATE users
SET failed_attempts = 0, locked_until = NULL
WHERE user_id = ?;
```
---
## 🔐 Secrets & Credentials Management
### Password Rotation Schedule
| Credential | Rotation Frequency | Owner |
| ---------------------- | ------------------------ | ------------ |
| Database Root Password | Every 90 days | DBA |
| Database App Password | Every 90 days | DevOps |
| JWT Secret | Every 180 days | Backend Team |
| Redis Password | Every 90 days | DevOps |
| SMTP Password | When provider requires | Operations |
| SSL Private Key | With certificate renewal | Operations |
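A rotation-due check can be scripted directly from this table; a minimal sketch (the 90-day window matches the table, the sample date is illustrative):
```bash
#!/bin/sh
# check_rotation LAST_ROTATED_DATE MAX_AGE_DAYS -- prints DUE or OK with the credential's age
check_rotation() {
  last_epoch=$(date -d "$1" +%s)                        # GNU date, as used in the other scripts here
  age_days=$(( ( $(date +%s) - last_epoch ) / 86400 ))
  if [ "$age_days" -ge "$2" ]; then
    echo "DUE ($age_days days old)"
  else
    echo "OK ($age_days days old)"
  fi
}
check_rotation "2020-01-01" 90   # long past the 90-day window
```
Run it per credential from the table, feeding the last-rotated date recorded in your password manager.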
### Password Rotation Procedure
```bash
#!/bin/bash
# File: /scripts/rotate-db-password.sh
# Generate new password (strip /, + and = so it is safe inside SQL strings and the sed below)
NEW_PASSWORD=$(openssl rand -base64 32 | tr -d '/+=')
# Update database user password
docker exec lcbp3-mariadb mysql -u root -p -e "
ALTER USER 'lcbp3_user'@'%' IDENTIFIED BY '$NEW_PASSWORD';
FLUSH PRIVILEGES;
"
# Update application .env file (| delimiter avoids clashes with characters in the password)
sed -i "s|^DB_PASS=.*|DB_PASS=$NEW_PASSWORD|" /app/backend/.env
# Restart backend to apply new password
docker restart lcbp3-backend
# Verify connection
sleep 10
curl -f http://localhost:3000/health || {
echo "FAILED: Backend cannot connect with new password"
# Rollback procedure...
exit 1
}
echo "Database password rotated successfully: $(date)"
# Store password securely (e.g., password manager)
```
---
## 🚨 Vulnerability Management
### Dependency Vulnerability Scanning
```bash
#!/bin/bash
# File: /scripts/scan-vulnerabilities.sh
# Backend dependencies
cd /app/backend
npm audit --production
# Critical/High vulnerabilities
VULNERABILITIES=$(npm audit --production --json | jq '.metadata.vulnerabilities.high + .metadata.vulnerabilities.critical')
if [ "$VULNERABILITIES" -gt 0 ]; then
echo "WARNING: $VULNERABILITIES critical/high vulnerabilities found!"
npm audit --production > /reports/security-audit-$(date +%Y%m%d).txt
# Send alert
/scripts/send-alert-email.sh "Security Vulnerabilities Detected" "Found $VULNERABILITIES critical/high vulnerabilities"
fi
# Frontend dependencies
cd /app/frontend
npm audit --production
```
### Container Image Scanning
```bash
#!/bin/bash
# File: /scripts/scan-images.sh
# Install Trivy (if not installed)
# wget -qO - https://aquasecurity.github.io/trivy-repo/deb/public.key | apt-key add -
# echo "deb https://aquasecurity.github.io/trivy-repo/deb $(lsb_release -sc) main" | tee -a /etc/apt/sources.list.d/trivy.list
# apt-get update && apt-get install trivy
# Scan Docker images
trivy image --severity HIGH,CRITICAL lcbp3-backend:latest
trivy image --severity HIGH,CRITICAL lcbp3-frontend:latest
trivy image --severity HIGH,CRITICAL mariadb:11.8
trivy image --severity HIGH,CRITICAL redis:7.2-alpine
```
---
## 🔍 Security Hardening
### Server Hardening Checklist
- [ ] Disable root SSH login
- [ ] Use SSH key authentication only
- [ ] Configure firewall (allow only necessary ports)
- [ ] Enable automatic security updates
- [ ] Remove unnecessary services
- [ ] Configure fail2ban for brute-force protection
- [ ] Enable SELinux/AppArmor
- [ ] Regular security patch updates
### Docker Security
```yaml
# docker-compose.yml - Security best practices
services:
backend:
# Run as non-root user
user: 'node:node'
# Read-only root filesystem
read_only: true
# No new privileges
security_opt:
- no-new-privileges:true
# Limit capabilities
cap_drop:
- ALL
cap_add:
- NET_BIND_SERVICE
# Resource limits
deploy:
resources:
limits:
cpus: '2'
memory: 2G
reservations:
memory: 512M
```
### Database Security
```sql
-- Remove anonymous users
DELETE FROM mysql.user WHERE User='';
-- Remove test database
DROP DATABASE IF EXISTS test;
-- Remove remote root login
DELETE FROM mysql.user WHERE User='root' AND Host NOT IN ('localhost', '127.0.0.1');
-- Create dedicated backup user with minimal privileges
CREATE USER 'backup_user'@'localhost' IDENTIFIED BY 'STRONG_PASSWORD';
GRANT SELECT, LOCK TABLES, SHOW VIEW, EVENT, TRIGGER ON lcbp3_dms.* TO 'backup_user'@'localhost';
-- Enable SSL for database connections
-- GRANT USAGE ON *.* TO 'lcbp3_user'@'%' REQUIRE SSL;
FLUSH PRIVILEGES;
```
---
## 🚨 Security Incident Response
### Incident Classification
| Type | Examples | Response Time |
| ----------------------- | ---------------------------- | ---------------- |
| **Data Breach** | Unauthorized data access | Immediate (< 1h) |
| **Account Compromise** | Stolen credentials | Immediate (< 1h) |
| **DDoS Attack** | Service unavailable | Immediate (< 1h) |
| **Malware/Ransomware** | Infected systems | Immediate (< 1h) |
| **Unauthorized Access** | Failed authentication spikes | High (< 4h) |
| **Suspicious Activity** | Unusual patterns | Medium (< 24h) |
### Data Breach Response
**Immediate Actions:**
1. **Contain the breach**
```bash
# Block suspicious IPs at firewall level
iptables -A INPUT -s <SUSPICIOUS_IP> -j DROP
# Disable compromised user accounts
docker exec lcbp3-mariadb mysql -u root -p -e "
UPDATE lcbp3_dms.users
SET is_active = 0
WHERE user_id = <COMPROMISED_USER_ID>;
"
```
2. **Assess impact**
```sql
-- Check audit logs for unauthorized access
SELECT * FROM audit_logs
WHERE user_id = <COMPROMISED_USER_ID>
AND created_at >= '<SUSPECTED_START_TIME>'
ORDER BY created_at DESC;
-- Check what documents were accessed
SELECT DISTINCT entity_id, entity_type, action
FROM audit_logs
WHERE user_id = <COMPROMISED_USER_ID>;
```
3. **Notify stakeholders**
- Security officer
- Management
- Affected users (if applicable)
- Legal team (if required by law)
4. **Document everything**
- Timeline of events
- Data accessed/compromised
- Actions taken
- Lessons learned
### Account Compromise Response
```bash
#!/bin/bash
# File: /scripts/respond-account-compromise.sh
USER_ID=$1
# 1. Immediately disable account
docker exec lcbp3-mariadb mysql -u root -p -e "
UPDATE lcbp3_dms.users
SET is_active = 0,
locked_until = DATE_ADD(NOW(), INTERVAL 24 HOUR)
WHERE user_id = $USER_ID;
"
# 2. Invalidate all sessions
# DEL does not accept glob patterns -- enumerate matching keys with SCAN first
docker exec lcbp3-redis sh -c \
  "redis-cli --scan --pattern 'session:user:$USER_ID:*' | xargs -r redis-cli DEL"
# 3. Generate audit report
docker exec lcbp3-mariadb mysql -u root -p -e "
SELECT * FROM lcbp3_dms.audit_logs
WHERE user_id = $USER_ID
AND created_at >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
ORDER BY created_at DESC;
" > /reports/compromise-audit-user-$USER_ID-$(date +%Y%m%d).txt
# 4. Notify security team
/scripts/send-alert-email.sh "Account Compromise" "User ID $USER_ID has been compromised and disabled"
echo "Account compromise response completed for User ID: $USER_ID"
```
---
## 📊 Security Metrics & KPIs
### Monthly Security Report
| Metric | Target | Actual |
| --------------------------- | --------- | ------ |
| Failed Login Attempts | < 100/day | Track |
| Locked Accounts | < 5/month | Track |
| Critical Vulnerabilities | 0 | Track |
| High Vulnerabilities | < 5 | Track |
| Unpatched Systems | 0 | Track |
| Security Incidents | 0 | Track |
| Mean Time To Detect (MTTD) | < 1 hour | Track |
| Mean Time To Respond (MTTR) | < 4 hours | Track |
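Filling the "Actual" column can be automated with a threshold check; a minimal sketch with hypothetical counts (real values would come from the queries earlier in this document):
```bash
#!/bin/sh
# alert_if_over METRIC VALUE TARGET -- flags a tracked value that exceeds its target
alert_if_over() {
  if [ "$2" -gt "$3" ]; then
    echo "ALERT: $1 = $2 (target <= $3)"
  else
    echo "OK: $1 = $2"
  fi
}
alert_if_over "failed_logins_per_day" 142 100
alert_if_over "locked_accounts_month" 2 5
```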
---
## 🔐 Compliance & Audit
### Audit Log Retention
- **Access Logs:** 1 year
- **Security Events:** 2 years
- **Admin Actions:** 3 years
- **Data Changes:** 7 years (as required)
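The retention classes translate directly into delete-before cutoff dates; a sketch using GNU `date` (class names mirror the list above; the actual cleanup SQL lives in the maintenance guide):
```bash
#!/bin/sh
# cutoff RETENTION -- the delete-before date for a retention window, e.g. cutoff "2 years"
cutoff() { date -d "-$1" +%F; }
for spec in "access_logs:1 year" "security_events:2 years" "admin_actions:3 years" "data_changes:7 years"; do
  printf '%s\tdelete audit rows created before %s\n' "${spec%%:*}" "$(cutoff "${spec#*:}")"
done
```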
### Compliance Checklist
- [ ] Regular security audits (quarterly)
- [ ] Penetration testing (annually)
- [ ] Access control reviews (monthly)
- [ ] Encryption at rest and in transit
- [ ] Secure password policies enforced
- [ ] Multi-factor authentication (if required)
- [ ] Data backup and recovery tested
- [ ] Incident response plan documented and tested
---
## ✅ Security Operations Checklist
### Daily
- [ ] Review security alerts and logs
- [ ] Monitor failed login attempts
- [ ] Check for unusual access patterns
- [ ] Verify backup completion
### Weekly
- [ ] Review user access logs
- [ ] Scan for vulnerabilities
- [ ] Update virus definitions
- [ ] Review firewall logs
### Monthly
- [ ] User access audit
- [ ] Role and permission review
- [ ] Security patch application
- [ ] Compliance review
### Quarterly
- [ ] Full security audit
- [ ] Penetration testing
- [ ] Disaster recovery drill
- [ ] Update security policies
---
## 🔗 Related Documents
- [Incident Response](04-07-incident-response.md)
- [Monitoring & Alerting](04-03-monitoring-alerting.md)
- [ADR-004: RBAC Implementation](../05-decisions/ADR-004-rbac-implementation.md)
---
**Version:** 1.6.0
**Last Review:** 2025-12-01
**Next Review:** 2026-03-01

# Incident Response Procedures
**Project:** LCBP3-DMS
**Version:** 1.6.0
**Last Updated:** 2025-12-02
---
## 📋 Overview
This document outlines incident classification, response procedures, and post-incident reviews for LCBP3-DMS.
---
## 🚨 Incident Classification
### Severity Levels
| Severity | Description | Response Time | Examples |
| ----------------- | ---------------------------- | ----------------- | ----------------------------------------------- |
| **P0 - Critical** | Complete system outage | 15 minutes | Database down, All services unavailable |
| **P1 - High** | Major functionality impaired | 1 hour | Authentication failing, Cannot create documents |
| **P2 - Medium** | Degraded performance | 4 hours | Slow response time, Some features broken |
| **P3 - Low** | Minor issues | Next business day | UI glitch, Non-critical bug |
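The table maps mechanically to a small triage helper for on-call tooling; a sketch (deadlines copied from the table above):
```bash
#!/bin/sh
# response_deadline SEVERITY -- echoes the acknowledgement deadline for P0..P3
response_deadline() {
  case "$1" in
    P0) echo "15 minutes" ;;
    P1) echo "1 hour" ;;
    P2) echo "4 hours" ;;
    P3) echo "next business day" ;;
    *)  echo "unknown severity: $1" >&2; return 1 ;;
  esac
}
response_deadline P1
```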
---
## 📞 Incident Response Team
### Roles & Responsibilities
**Incident Commander (IC)**
- Coordinates response efforts
- Makes final decisions
- Communicates with stakeholders
**Technical Lead (TL)**
- Diagnoses technical issues
- Implements fixes
- Coordinates with engineers
**Communications Lead (CL)**
- Updates stakeholders
- Manages internal/external communications
- Documents incident timeline
**On-Call Engineer**
- First responder
- Initial triage and investigation
- Escalates to appropriate team
---
## 🔄 Incident Response Workflow
```mermaid
flowchart TD
Start([Incident Detected]) --> Acknowledge[Acknowledge Incident]
Acknowledge --> Assess[Assess Severity]
Assess --> P0{Severity?}
P0 -->|P0/P1| Alert[Page Incident Commander]
P0 -->|P2/P3| Assign[Assign to On-Call]
Alert --> Investigate[Investigate Root Cause]
Assign --> Investigate
Investigate --> Mitigate[Implement Mitigation]
Mitigate --> Verify[Verify Resolution]
Verify --> Resolved{Resolved?}
Resolved -->|No| Escalate[Escalate/Re-assess]
Escalate --> Investigate
Resolved -->|Yes| Communicate[Communicate Resolution]
Communicate --> PostMortem[Schedule Post-Mortem]
PostMortem --> End([Close Incident])
```
---
## 📋 Incident Response Playbooks
### P0: Database Down
**Symptoms:**
- Backend returns 500 errors
- Cannot connect to database
- Health check fails
**Immediate Actions:**
1. **Verify Issue**
```bash
docker ps | grep mariadb
docker logs lcbp3-mariadb --tail=50
```
2. **Attempt Restart**
```bash
docker restart lcbp3-mariadb
```
3. **Check Database Process**
```bash
docker exec lcbp3-mariadb ps aux | grep mysql
```
4. **If Restart Fails:**
```bash
# Check disk space
df -h
# Check database logs for corruption
docker exec lcbp3-mariadb cat /var/log/mysql/error.log
# If corrupted, restore from backup
# See backup-recovery.md
```
5. **Escalate to DBA** if not resolved in 30 minutes
---
### P0: Complete System Outage
**Symptoms:**
- All services return 502/503
- Health checks fail
- Users cannot access system
**Immediate Actions:**
1. **Check Container Status**
```bash
docker-compose ps
# Identify which containers are down
```
2. **Restart All Services**
```bash
docker-compose restart
```
3. **Check QNAP Server Resources**
```bash
top
df -h
free -h
```
4. **Check Network**
```bash
ping 8.8.8.8
netstat -tlnp
```
5. **If Server Issue:**
- Reboot QNAP server
- Contact QNAP support
---
### P1: Authentication System Failing
**Symptoms:**
- Users cannot log in
- JWT validation fails
- 401 errors
**Immediate Actions:**
1. **Check Redis (Session Store)**
```bash
docker exec lcbp3-redis redis-cli ping
# Should return PONG
```
2. **Check JWT Secret Configuration**
```bash
docker exec lcbp3-backend env | grep JWT_SECRET
# Verify not empty
```
3. **Check Backend Logs**
```bash
docker logs lcbp3-backend --tail=100 | grep "JWT\|Auth"
```
4. **Temporary Mitigation:**
```bash
# Restart backend to reload config
docker restart lcbp3-backend
```
---
### P1: File Upload Failing
**Symptoms:**
- Users cannot upload files
- 500 errors on file upload
- "Disk full" errors
**Immediate Actions:**
1. **Check Disk Space**
```bash
df -h /var/lib/docker/volumes/lcbp3_uploads
```
2. **If Disk Full:**
```bash
# Clean up temp uploads
find /var/lib/docker/volumes/lcbp3_uploads/_data/temp \
-type f -mtime +1 -delete
```
3. **Check ClamAV (Virus Scanner)**
```bash
docker logs lcbp3-clamav --tail=50
docker restart lcbp3-clamav
```
4. **Check File Permissions**
```bash
docker exec lcbp3-backend ls -la /app/uploads
```
---
### P2: Slow Performance
**Symptoms:**
- Pages load slowly
- API response time > 2s
- Users complain about slowness
**Actions:**
1. **Check System Resources**
```bash
docker stats
# Identify high CPU/memory containers
```
2. **Check Database Performance**
```sql
-- Show slow queries
SHOW PROCESSLIST;
-- Check connections
SHOW STATUS LIKE 'Threads_connected';
```
3. **Check Redis**
```bash
docker exec lcbp3-redis redis-cli --stat
```
4. **Check Application Logs**
```bash
docker logs lcbp3-backend | grep "Slow request"
```
5. **Temporary Mitigation:**
- Restart slow containers
- Clear Redis cache if needed
- Kill long-running queries
---
### P2: Email Notifications Not Sending
**Symptoms:**
- Users not receiving emails
- Email queue backing up
**Actions:**
1. **Check Email Queue**
```bash
# Access BullMQ dashboard or check Redis
docker exec lcbp3-redis redis-cli LLEN bull:email:wait
```
2. **Check Email Processor Logs**
```bash
docker logs lcbp3-backend | grep "email\|SMTP"
```
3. **Test SMTP Connection**
```bash
docker exec lcbp3-backend node -e "
const nodemailer = require('nodemailer');
const transport = nodemailer.createTransport({
host: process.env.SMTP_HOST,
port: process.env.SMTP_PORT,
auth: {
user: process.env.SMTP_USER,
pass: process.env.SMTP_PASS
}
});
transport.verify().then(console.log).catch(console.error);
"
```
4. **Check SMTP Credentials**
- Verify not expired
- Check firewall/network access
---
## 📝 Incident Documentation
### Incident Report Template
```markdown
# Incident Report: [Brief Description]
**Incident ID:** INC-YYYYMMDD-001
**Severity:** P1
**Status:** Resolved
**Incident Commander:** [Name]
## Timeline
| Time | Event |
| ----- | --------------------------------------------------------- |
| 14:00 | Alert: High error rate detected |
| 14:05 | On-call engineer acknowledged |
| 14:10 | Identified root cause: Database connection pool exhausted |
| 14:15 | Implemented mitigation: Increased pool size |
| 14:20 | Verified resolution |
| 14:30 | Incident resolved |
## Impact
- **Duration:** 30 minutes
- **Affected Users:** ~50 users
- **Affected Services:** Document creation, Search
- **Data Loss:** None
## Root Cause
Database connection pool was exhausted due to slow queries not releasing connections.
## Resolution
1. Increased connection pool size from 10 to 20
2. Optimized slow queries
3. Added connection pool monitoring
## Action Items
- [ ] Add connection pool size alert (Owner: DevOps, Due: Next Sprint)
- [ ] Implement automatic query timeouts (Owner: Backend, Due: 2025-12-15)
- [ ] Review all queries for optimization (Owner: DBA, Due: 2025-12-31)
## Lessons Learned
- Connection pool monitoring was insufficient
- Need automated remediation for common issues
```
---
## 🔍 Post-Incident Review (PIR)
### PIR Meeting Agenda
1. **Timeline Review** (10 min)
- What happened and when?
- What was the impact?
2. **Root Cause Analysis** (15 min)
- Why did it happen?
- What were the contributing factors?
3. **What Went Well** (10 min)
- What did we do right?
- What helped us resolve quickly?
4. **What Went Wrong** (15 min)
- What could we have done better?
- What slowed us down?
5. **Action Items** (10 min)
- What changes will prevent this?
- Who owns each action?
- When will they be completed?
### PIR Best Practices
- **Blameless Culture:** Focus on systems, not individuals
- **Actionable Outcomes:** Every PIR should produce concrete actions
- **Follow Through:** Track action items to completion
- **Share Learnings:** Distribute PIR summary to entire team
---
## 📊 Incident Metrics
### Track & Review Monthly
- **MTTR (Mean Time To Resolution):** Average time to resolve incidents
- **MTBF (Mean Time Between Failures):** Average time between incidents
- **Incident Frequency:** Number of incidents per month
- **Severity Distribution:** Breakdown by P0/P1/P2/P3
- **Repeat Incidents:** Same root cause occurring multiple times
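MTTR can be computed straight from incident open/close timestamps; a sketch over epoch-second pairs (the sample pairs are illustrative, not real incident data):
```bash
#!/bin/sh
# Mean Time To Resolution in whole minutes from "detected_epoch,resolved_epoch" lines on stdin
mttr_minutes() {
  awk -F, '{ total += ($2 - $1); n++ } END { printf "%d\n", total / n / 60 }'
}
# Two incidents: 30 min and 10 min to resolve -> mean 20 min
printf '%s\n' "1000,2800" "5000,5600" | mttr_minutes
```
The same pipeline over first-occurrence timestamps gives MTBF; severity distribution is a simple `sort | uniq -c` over the severity column of the incident log.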
---
## ✅ Incident Response Checklist
### During Incident
- [ ] Acknowledge incident in tracking system
- [ ] Assess severity and assign IC
- [ ] Create incident channel (Slack/Teams)
- [ ] Begin documenting timeline
- [ ] Investigate and implement mitigation
- [ ] Communicate status updates every 30 min (P0/P1)
- [ ] Verify resolution
- [ ] Communicate resolution to stakeholders
### After Incident
- [ ] Create incident report
- [ ] Schedule PIR within 48 hours
- [ ] Identify action items
- [ ] Assign owners and deadlines
- [ ] Update runbooks/playbooks
- [ ] Share learnings with team
---
## 🔗 Related Documents
- [Monitoring & Alerting](04-03-monitoring-alerting.md)
- [Backup & Recovery](04-04-backup-recovery.md)
- [Security Operations](04-06-security-operations.md)
---
**Version:** 1.6.0
**Last Review:** 2025-12-01
**Next Review:** 2026-03-01

# Document Numbering Operations Guide
---
title: 'Document Numbering Operations Guide'
version: 1.7.0
status: APPROVED
owner: Operations Team
last_updated: 2025-12-18
related:
- specs/01-requirements/03.11-document-numbering.md
- specs/03-implementation/03-08-document-numbering.md
- specs/04-operations/04-08-monitoring-alerting.md
- specs/05-decisions/ADR-002-document-numbering-strategy.md
---
## Overview
This document describes the operations procedures, monitoring, and troubleshooting for the Document Numbering system.
## 1. Performance Requirements
### 1.1. Response Time Targets
| Metric           | Target  | Measurement              |
| ---------------- | ------- | ------------------------ |
| 95th percentile  | ≤ 2 s   | From request to response |
| 99th percentile  | ≤ 5 s   | From request to response |
| Normal operation | ≤ 500ms | No retries               |
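The percentile targets can be checked against collected latency samples with a nearest-rank calculation; a minimal sketch (one latency sample per line in ms; the input here is illustrative):
```bash
#!/bin/sh
# Nearest-rank 95th percentile over one numeric sample per line on stdin
p95_ms() {
  sort -n | awk '{ v[NR] = $1 } END { idx = int((NR * 95 + 99) / 100); print v[idx] }'
}
# p95 of the samples 1..100 ms is the 95th value
seq 1 100 | p95_ms
```
In practice the samples would come from the `docnum_generation_duration_ms` histogram described in the monitoring section below, where Prometheus computes the quantiles server-side.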
### 1.2. Throughput Targets
| Load Level     | Target      | Notes                    |
| -------------- | ----------- | ------------------------ |
| Normal load    | ≥ 50 req/s  | Typical usage            |
| Peak load      | ≥ 100 req/s | Busy periods             |
| Burst capacity | ≥ 200 req/s | Short duration (< 1 min) |
### 1.3. Availability SLA
- **Uptime**: ≥ 99.5% (excluding planned maintenance)
- **Maximum downtime**: ≤ 3.6 hours/month (≈ 7.2 minutes/day)
- **Recovery Time Objective (RTO)**: ≤ 30 minutes
- **Recovery Point Objective (RPO)**: ≤ 5 minutes
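The downtime budget follows arithmetically from the uptime target; a one-liner that reproduces the 3.6 hours/month figure for a 30-day month:
```bash
#!/bin/sh
# Allowed downtime per 30-day month for a given uptime SLA (in percent)
allowed_downtime_minutes() {
  awk -v sla="$1" 'BEGIN { printf "%.0f\n", (100 - sla) / 100 * 30 * 24 * 60 }'
}
# 99.5% uptime -> 0.5% of 43,200 minutes = 216 minutes (= 3.6 hours)
allowed_downtime_minutes 99.5
```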
## 2. Infrastructure Setup
### 2.1. Database Configuration
#### MariaDB Connection Pool
```typescript
// ormconfig.ts
{
  type: 'mysql',
  host: process.env.DB_HOST,
  port: parseInt(process.env.DB_PORT, 10),
  username: process.env.DB_USERNAME,
  password: process.env.DB_PASSWORD,
  database: process.env.DB_DATABASE,
  extra: {
    connectionLimit: 20, // Pool size
    queueLimit: 0, // Unlimited queue
    acquireTimeout: 10000, // 10s timeout
    retryAttempts: 3,
    retryDelay: 1000,
  },
}
```
#### High Availability Setup
```yaml
# docker-compose.yml
services:
  mariadb-master:
    # NOTE: MYSQL_REPLICATION_MODE / MYSQL_MASTER_* are Bitnami image conventions
    # (bitnami/mariadb); the official mariadb image requires manual replication
    # setup (server-id, log-bin, CHANGE MASTER TO).
    image: mariadb:11.8
    environment:
      MYSQL_REPLICATION_MODE: master
      MYSQL_ROOT_PASSWORD: ${DB_ROOT_PASSWORD}
    volumes:
      - mariadb-master-data:/var/lib/mysql
    networks:
      - backend
  mariadb-replica:
    image: mariadb:11.8
    environment:
      MYSQL_REPLICATION_MODE: slave
      MYSQL_MASTER_HOST: mariadb-master
      MYSQL_MASTER_ROOT_PASSWORD: ${DB_ROOT_PASSWORD}
    volumes:
      - mariadb-replica-data:/var/lib/mysql
    networks:
      - backend
```
### 2.2. Redis Configuration
#### Redis Sentinel for High Availability
```yaml
# docker-compose.yml
services:
redis-master:
image: redis:7-alpine
command: redis-server --appendonly yes
volumes:
- redis-master-data:/data
networks:
- backend
redis-replica:
image: redis:7-alpine
command: redis-server --replicaof redis-master 6379 --appendonly yes
volumes:
- redis-replica-data:/data
networks:
- backend
redis-sentinel:
image: redis:7-alpine
command: >
redis-sentinel /etc/redis/sentinel.conf
--sentinel monitor mymaster redis-master 6379 2
--sentinel down-after-milliseconds mymaster 5000
--sentinel failover-timeout mymaster 10000
networks:
- backend
```
#### Redis Connection Pool
```typescript
// redis.config.ts
import IORedis from 'ioredis';
export const redisConfig = {
  host: process.env.REDIS_HOST || 'localhost',
  port: parseInt(process.env.REDIS_PORT || '6379', 10),
  password: process.env.REDIS_PASSWORD,
  maxRetriesPerRequest: 3,
  enableReadyCheck: true,
  lazyConnect: false,
  // Note: each ioredis client holds a single connection (there is no poolSize
  // option); use separate clients for subscribers and blocking commands.
  retryStrategy: (times: number) => {
    if (times > 3) {
      return null; // Stop retrying after 3 attempts
    }
    return Math.min(times * 100, 3000); // Backoff capped at 3s
  },
};
```
### 2.3. Load Balancing
#### Nginx Configuration
```nginx
# nginx.conf
upstream backend {
least_conn; # Least connections algorithm
server backend-1:3000 max_fails=3 fail_timeout=30s weight=1;
server backend-2:3000 max_fails=3 fail_timeout=30s weight=1;
server backend-3:3000 max_fails=3 fail_timeout=30s weight=1;
keepalive 32;
}
server {
listen 80;
server_name api.lcbp3.local;
location /api/v1/document-numbering/ {
proxy_pass http://backend;
proxy_http_version 1.1;
proxy_set_header Connection "";
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_next_upstream error timeout;
proxy_connect_timeout 10s;
proxy_send_timeout 30s;
proxy_read_timeout 30s;
}
}
```
#### Docker Compose Scaling
```yaml
# docker-compose.yml
services:
backend:
image: lcbp3-backend:latest
deploy:
replicas: 3
resources:
limits:
cpus: '1.0'
memory: 1G
reservations:
cpus: '0.5'
memory: 512M
restart_policy:
condition: on-failure
delay: 5s
max_attempts: 3
environment:
NODE_ENV: production
DB_POOL_SIZE: 20
networks:
- backend
```
## 3. Monitoring & Metrics
### 3.1. Prometheus Metrics
#### Key Metrics to Collect
```typescript
// metrics.service.ts
import { Counter, Histogram, Gauge } from 'prom-client';
// Lock acquisition metrics
export const lockAcquisitionDuration = new Histogram({
name: 'docnum_lock_acquisition_duration_ms',
help: 'Lock acquisition time in milliseconds',
labelNames: ['project', 'type'],
buckets: [10, 50, 100, 200, 500, 1000, 2000, 5000],
});
export const lockAcquisitionFailures = new Counter({
name: 'docnum_lock_acquisition_failures_total',
help: 'Total number of lock acquisition failures',
labelNames: ['project', 'type', 'reason'],
});
// Generation metrics
export const generationDuration = new Histogram({
name: 'docnum_generation_duration_ms',
help: 'Total document number generation time',
labelNames: ['project', 'type', 'status'],
buckets: [100, 200, 500, 1000, 2000, 5000],
});
export const retryCount = new Histogram({
name: 'docnum_retry_count',
help: 'Number of retries per generation',
labelNames: ['project', 'type'],
buckets: [0, 1, 2, 3, 5, 10],
});
// Connection health
export const redisConnectionStatus = new Gauge({
name: 'docnum_redis_connection_status',
help: 'Redis connection status (1=up, 0=down)',
});
export const dbConnectionPoolUsage = new Gauge({
name: 'docnum_db_connection_pool_usage',
help: 'Database connection pool usage percentage',
});
```
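The intended usage pattern for these histograms is to time the critical section and observe the elapsed milliseconds together with the metric's labels. A minimal sketch, with a stub standing in for prom-client's `Histogram` (the wrapper name and interface are illustrative, not the actual service code):

```typescript
// Stub interface standing in for prom-client's Histogram.observe().
interface HistogramLike {
  observe(labels: Record<string, string>, value: number): void;
}

// Minimal in-memory stand-in, useful for unit tests.
class FakeHistogram implements HistogramLike {
  public samples: number[] = [];
  observe(_labels: Record<string, string>, value: number): void {
    this.samples.push(value);
  }
}

// Time an async critical section (e.g. lock acquisition) and record the
// duration even when it throws, so failures are measured too.
async function withLockTiming<T>(
  histogram: HistogramLike,
  labels: Record<string, string>,
  acquire: () => Promise<T>,
): Promise<T> {
  const start = Date.now();
  try {
    return await acquire();
  } finally {
    histogram.observe(labels, Date.now() - start);
  }
}
```

In the real service, `lockAcquisitionDuration` defined above would be passed in place of the fake, e.g. `withLockTiming(lockAcquisitionDuration, { project, type }, () => redis.acquireLock(key))`.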
### 3.2. Prometheus Alert Rules
```yaml
# prometheus/alerts.yml
groups:
- name: document_numbering_alerts
interval: 30s
rules:
# CRITICAL: Redis unavailable
- alert: RedisUnavailable
expr: docnum_redis_connection_status == 0
for: 1m
labels:
severity: critical
component: document-numbering
annotations:
summary: "Redis is unavailable for document numbering"
description: "System is falling back to DB-only locking. Performance degraded by 30-50%."
runbook_url: "https://wiki.lcbp3/runbooks/redis-unavailable"
# CRITICAL: High lock failure rate
- alert: HighLockFailureRate
expr: |
rate(docnum_lock_acquisition_failures_total[5m]) > 0.1
for: 5m
labels:
severity: critical
component: document-numbering
annotations:
summary: "Lock acquisition failure rate > 10%"
description: "Check Redis and database performance immediately"
runbook_url: "https://wiki.lcbp3/runbooks/high-lock-failure"
# WARNING: Elevated lock failure rate
- alert: ElevatedLockFailureRate
expr: |
rate(docnum_lock_acquisition_failures_total[5m]) > 0.05
for: 5m
labels:
severity: warning
component: document-numbering
annotations:
summary: "Lock acquisition failure rate > 5%"
description: "Monitor closely. May escalate to critical soon."
# WARNING: Slow lock acquisition
- alert: SlowLockAcquisition
expr: |
histogram_quantile(0.95,
rate(docnum_lock_acquisition_duration_ms_bucket[5m])
) > 1000
for: 5m
labels:
severity: warning
component: document-numbering
annotations:
summary: "P95 lock acquisition time > 1 second"
description: "Lock acquisition is slower than expected. Check Redis latency."
# WARNING: High retry count
- alert: HighRetryCount
expr: |
sum by (project) (
rate(docnum_retry_count_sum[1h])
) > 100
for: 1h
labels:
severity: warning
component: document-numbering
annotations:
summary: "Retry count > 100 per hour in project {{ $labels.project }}"
description: "High contention detected. Consider scaling."
# WARNING: Slow generation
- alert: SlowDocumentNumberGeneration
expr: |
histogram_quantile(0.95,
rate(docnum_generation_duration_ms_bucket[5m])
) > 2000
for: 5m
labels:
severity: warning
component: document-numbering
annotations:
summary: "P95 generation time > 2 seconds"
description: "Document number generation is slower than SLA target"
```
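These alert rules lean heavily on `histogram_quantile`, which locates the bucket containing the target rank and interpolates linearly within it. A simplified sketch of that calculation (omitting Prometheus's special handling of the `+Inf` bucket; bucket counts are cumulative, as Prometheus histograms export them):

```typescript
// Simplified histogram_quantile: given cumulative buckets sorted by upper
// bound `le`, find the bucket containing rank q*total, then interpolate
// linearly between the bucket's lower and upper bounds.
function histogramQuantile(
  q: number,
  buckets: Array<{ le: number; count: number }>,
): number {
  const total = buckets[buckets.length - 1].count;
  const rank = q * total;
  let prevLe = 0;
  let prevCount = 0;
  for (const b of buckets) {
    if (b.count >= rank) {
      const bucketCount = b.count - prevCount;
      const fraction = bucketCount === 0 ? 0 : (rank - prevCount) / bucketCount;
      return prevLe + (b.le - prevLe) * fraction;
    }
    prevLe = b.le;
    prevCount = b.count;
  }
  return prevLe;
}

// 100 samples: 50 under 10ms, 80 under 50ms, all 100 under 100ms.
const buckets = [
  { le: 10, count: 50 },
  { le: 50, count: 80 },
  { le: 100, count: 100 },
];
console.log(histogramQuantile(0.95, buckets)); // 87.5 (ms)
```

This is why P95 estimates depend on bucket layout: the `SlowLockAcquisition` threshold of 1000 ms only resolves well because 1000 is itself a bucket boundary in `docnum_lock_acquisition_duration_ms`.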
### 3.3. AlertManager Configuration
```yaml
# alertmanager/config.yml
global:
resolve_timeout: 5m
slack_api_url: ${SLACK_WEBHOOK_URL}
route:
group_by: ['alertname', 'severity', 'project']
group_wait: 30s
group_interval: 5m
repeat_interval: 4h
receiver: 'ops-team'
routes:
# CRITICAL alerts → PagerDuty + Slack
- match:
severity: critical
receiver: 'pagerduty-critical'
continue: true
- match:
severity: critical
receiver: 'slack-critical'
continue: false
# WARNING alerts → Slack only
- match:
severity: warning
receiver: 'slack-warnings'
receivers:
- name: 'pagerduty-critical'
pagerduty_configs:
- service_key: ${PAGERDUTY_SERVICE_KEY}
description: '{{ .GroupLabels.alertname }}: {{ .CommonAnnotations.summary }}'
details:
firing: '{{ .Alerts.Firing | len }}'
resolved: '{{ .Alerts.Resolved | len }}'
runbook: '{{ .CommonAnnotations.runbook_url }}'
- name: 'slack-critical'
slack_configs:
- channel: '#lcbp3-critical-alerts'
title: '🚨 CRITICAL: {{ .GroupLabels.alertname }}'
text: |
*Summary:* {{ .CommonAnnotations.summary }}
*Description:* {{ .CommonAnnotations.description }}
*Runbook:* {{ .CommonAnnotations.runbook_url }}
color: 'danger'
- name: 'slack-warnings'
slack_configs:
- channel: '#lcbp3-alerts'
title: '⚠️ WARNING: {{ .GroupLabels.alertname }}'
text: '{{ .CommonAnnotations.description }}'
color: 'warning'
- name: 'ops-team'
email_configs:
- to: 'ops@example.com'
subject: '[LCBP3] {{ .GroupLabels.alertname }}'
```
### 3.4. Grafana Dashboard
Key dashboard panels:
1. **Lock Acquisition Success Rate** (Gauge)
- Query: `1 - (rate(docnum_lock_acquisition_failures_total[5m]) / rate(docnum_lock_acquisition_total[5m]))`
- Alert threshold: < 95%
2. **Lock Acquisition Time Percentiles** (Graph)
- P50: `histogram_quantile(0.50, rate(docnum_lock_acquisition_duration_ms_bucket[5m]))`
- P95: `histogram_quantile(0.95, rate(docnum_lock_acquisition_duration_ms_bucket[5m]))`
- P99: `histogram_quantile(0.99, rate(docnum_lock_acquisition_duration_ms_bucket[5m]))`
3. **Generation Rate** (Stat)
- Query: `sum(rate(docnum_generation_duration_ms_count[1m])) * 60`
- Unit: documents/minute
4. **Error Rate by Type** (Graph)
- Query: `sum by (reason) (rate(docnum_lock_acquisition_failures_total[5m]))`
5. **Redis Connection Status** (Stat)
- Query: `docnum_redis_connection_status`
- Thresholds: 0 = red, 1 = green
6. **DB Connection Pool Usage** (Gauge)
- Query: `docnum_db_connection_pool_usage`
- Alert threshold: > 80%
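Note that Panel 1's query references `docnum_lock_acquisition_total`, a counter that is not among the metrics defined in section 3.1 and would need to be registered alongside the failure counter. The arithmetic behind the panel itself is simply:

```typescript
// Success rate = 1 - (failure rate / total attempt rate), both in events/sec.
// When there is no traffic in the window, report healthy rather than NaN.
function lockSuccessRate(failuresPerSec: number, totalPerSec: number): number {
  if (totalPerSec === 0) return 1;
  return 1 - failuresPerSec / totalPerSec;
}

console.log(lockSuccessRate(0.02, 1.0) >= 0.95); // true → panel stays green
```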
## 4. Troubleshooting Runbooks
### 4.1. Scenario: Redis Unavailable
**Symptoms:**
- Alert: `RedisUnavailable`
- System falls back to DB-only locking
- Performance degraded 30-50%
**Action Steps:**
1. **Check Redis status:**
```bash
docker exec lcbp3-redis redis-cli ping
# Expected: PONG
```
2. **Check Redis logs:**
```bash
docker logs lcbp3-redis --tail=100
```
3. **Restart Redis (if needed):**
```bash
docker restart lcbp3-redis
```
4. **Verify failover (if using Sentinel):**
```bash
docker exec lcbp3-redis-sentinel redis-cli -p 26379 SENTINEL masters
```
5. **Monitor recovery:**
- Check metric: `docnum_redis_connection_status` returns to 1
- Check performance: P95 latency returns to normal (< 500ms)
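The fallback behaviour this runbook describes (Redis-first locking, DB-only locking when Redis is unreachable) can be sketched as follows; the interfaces and names here are illustrative stubs, not the actual service code:

```typescript
// Stub for any lock backend (Redis SET NX, or a DB row lock).
interface LockBackend {
  acquire(key: string, ttlMs: number): Promise<boolean>;
}

// Try Redis first; if Redis itself errors (connection refused, timeout),
// degrade to the database lock. A Redis "false" (lock held by someone else)
// is NOT a reason to fall back -- only infrastructure failure is.
async function acquireWithFallback(
  redis: LockBackend,
  db: LockBackend,
  key: string,
  ttlMs: number,
): Promise<{ backend: 'redis' | 'db'; acquired: boolean }> {
  try {
    return { backend: 'redis', acquired: await redis.acquire(key, ttlMs) };
  } catch {
    // Redis down → DB-only locking: correct but 30-50% slower, which is
    // exactly the degradation the RedisUnavailable alert warns about.
    return { backend: 'db', acquired: await db.acquire(key, ttlMs) };
  }
}
```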
### 4.2. Scenario: High Lock Failure Rate
**Symptoms:**
- Alert: `HighLockFailureRate` (> 10%)
- Users report "ระบบกำลังยุ่ง" ("system is busy") errors
**Action Steps:**
1. **Check concurrent load:**
```bash
# Check current request rate
curl http://prometheus:9090/api/v1/query?query=rate(docnum_generation_duration_ms_count[1m])
```
2. **Check database connections:**
```sql
SHOW PROCESSLIST;
-- Look for waiting/locked queries
```
3. **Check Redis memory:**
```bash
docker exec lcbp3-redis redis-cli INFO memory
```
4. **Scale up if needed:**
```bash
# Increase backend replicas
docker-compose up -d --scale backend=5
```
5. **Check for deadlocks:**
```sql
SHOW ENGINE INNODB STATUS;
-- Look for LATEST DETECTED DEADLOCK section
```
### 4.3. Scenario: Slow Performance
**Symptoms:**
- Alert: `SlowDocumentNumberGeneration`
- P95 > 2 seconds
**Action Steps:**
1. **Check database query performance:**
```sql
SELECT * FROM document_number_counters USE INDEX (idx_counter_lookup)
WHERE project_id = 2 AND correspondence_type_id = 6 AND current_year = 2025;
-- Check execution plan
EXPLAIN SELECT ...;
```
2. **Check for missing indexes:**
```sql
SHOW INDEX FROM document_number_counters;
```
3. **Check Redis latency:**
```bash
docker exec lcbp3-redis redis-cli --latency
```
4. **Check network latency:**
```bash
ping mariadb-master
ping redis-master
```
5. **Review slow query log:**
```bash
docker exec lcbp3-mariadb-master cat /var/log/mysql/slow.log
```
### 4.4. Scenario: Version Conflicts
**Symptoms:**
- High retry count
- Users report "เลขที่เอกสารถูกเปลี่ยน" ("document number was changed") errors
**Action Steps:**
1. **Check concurrent requests to same counter:**
```sql
SELECT
project_id,
correspondence_type_id,
COUNT(*) as concurrent_requests
FROM document_number_audit
WHERE created_at > NOW() - INTERVAL 5 MINUTE
GROUP BY project_id, correspondence_type_id
HAVING COUNT(*) > 10
ORDER BY concurrent_requests DESC;
```
2. **Investigate specific counter:**
```sql
SELECT * FROM document_number_counters
WHERE project_id = X AND correspondence_type_id = Y;
-- Check audit trail
SELECT * FROM document_number_audit
WHERE counter_key LIKE '%project_id:X%'
ORDER BY created_at DESC
LIMIT 20;
```
3. **Check for application bugs:**
- Review error logs for stack traces
- Check if retry logic is working correctly
4. **Temporary mitigation:**
- Increase retry count in application config
- Consider manual counter adjustment (last resort)
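The retry logic referenced in steps 3 and 4 can be sketched as an optimistic-concurrency loop: read the counter with its version, attempt a conditional update, and retry on version conflict. The store interface and names below are illustrative, not the actual implementation:

```typescript
// Stub for a versioned counter store (in practice: an UPDATE ... WHERE
// version = ? against document_number_counters).
interface CounterStore {
  read(key: string): { value: number; version: number };
  // Returns false when the stored version no longer matches (conflict).
  compareAndSwap(key: string, expectedVersion: number, newValue: number): boolean;
}

// Allocate the next number, retrying on version conflicts up to maxRetries
// times. High contention shows up here as the docnum_retry_count metric.
function nextNumber(store: CounterStore, key: string, maxRetries = 3): number {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const { value, version } = store.read(key);
    if (store.compareAndSwap(key, version, value + 1)) {
      return value + 1;
    }
    // Conflict: another request incremented the counter first; re-read and retry.
  }
  throw new Error('version conflict: retries exhausted');
}
```

Raising `maxRetries` (the temporary mitigation above) trades longer worst-case latency for fewer user-visible failures; it does not reduce the underlying contention.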
## 5. Maintenance Procedures
### 5.1. Counter Reset (Manual)
**Requires:** SUPER_ADMIN role + 2-person approval
**Steps:**
1. **Request approval via API:**
```bash
POST /api/v1/document-numbering/configs/{configId}/reset-counter
{
"reason": "เหตุผลที่ชัดเจน อย่างน้อย 20 ตัวอักษร",
"approver_1": "user_id",
"approver_2": "user_id"
}
```
2. **Verify in audit log:**
```sql
SELECT * FROM document_number_config_history
WHERE config_id = X
ORDER BY changed_at DESC
LIMIT 1;
```
### 5.2. Template Update
**Best Practices:**
1. Always test template in staging first
2. Preview generated numbers before applying
3. Document reason for change
4. Template changes do NOT affect existing documents
**API Call:**
```bash
PUT /api/v1/document-numbering/configs/{configId}
{
"template": "{ORIGINATOR}-{RECIPIENT}-{SEQ:4}-{YEAR:B.E.}",
"change_reason": "เหตุผลในการเปลี่ยนแปลง"
}
```
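For reference, the template above can be expanded as in this hypothetical renderer. The token semantics assumed here ({SEQ:4} zero-pads the sequence to 4 digits; {YEAR:B.E.} converts the Gregorian year to Buddhist Era by adding 543) are illustrative assumptions, not a specification from this document:

```typescript
// Hypothetical renderer for the template syntax shown above.
function renderTemplate(
  template: string,
  ctx: { originator: string; recipient: string; seq: number; year: number },
): string {
  return template
    .replace('{ORIGINATOR}', ctx.originator)
    .replace('{RECIPIENT}', ctx.recipient)
    // {SEQ:n} → sequence zero-padded to n digits
    .replace(/\{SEQ:(\d+)\}/, (_m: string, w: string) =>
      String(ctx.seq).padStart(Number(w), '0'))
    // {YEAR:B.E.} → Buddhist Era year (CE + 543)
    .replace('{YEAR:B.E.}', String(ctx.year + 543));
}

console.log(
  renderTemplate('{ORIGINATOR}-{RECIPIENT}-{SEQ:4}-{YEAR:B.E.}', {
    originator: 'PAT', recipient: 'CSC', seq: 42, year: 2025,
  }),
); // PAT-CSC-0042-2568
```

Because rendering is pure string substitution over the counter state, changing the template only affects numbers generated afterwards, which is why existing documents are untouched (point 4 above).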
### 5.3. Database Maintenance
**Weekly Tasks:**
- Check slow query log
- Optimize tables if needed:
```sql
OPTIMIZE TABLE document_number_counters;
OPTIMIZE TABLE document_number_audit;
```
**Monthly Tasks:**
- Review and archive old audit logs (> 2 years)
- Check index usage:
```sql
SELECT * FROM sys.schema_unused_indexes
WHERE object_schema = 'lcbp3_db';
```
## 6. Backup & Recovery
### 6.1. Backup Strategy
**Database:**
- Full backup: Daily at 02:00 AM
- Incremental backup: Every 4 hours
- Retention: 30 days
**Redis:**
- AOF (Append-Only File) enabled
- Snapshot every 1 hour
- Retention: 7 days
### 6.2. Recovery Procedures
See: [Backup & Recovery Guide](./04-04-backup-recovery.md)
## References
- [Requirements](../01-requirements/01-03.11-document-numbering.md)
- [Implementation Guide](../03-implementation/03-04-document-numbering.md)
- [ADR-002 Document Numbering Strategy](../05-decisions/ADR-002-document-numbering-strategy.md)
- [Monitoring & Alerting](../04-operations/04-03-monitoring-alerting.md)
- [Incident Response](../04-operations/04-07-incident-response.md)

---
# Operations Documentation
**Project:** LCBP3-DMS (Laem Chabang Port Phase 3 - Document Management System)
**Version:** 1.7.0
**Last Updated:** 2025-12-18
---
## 📋 Overview
This directory contains operational documentation for deploying, maintaining, and monitoring the LCBP3-DMS system.
---
## 📚 Documentation Index
### Deployment & Infrastructure
| Document | Description | Status |
| ---------------------------------------------- | ------------------------------------------------------ | ---------- |
| [deployment-guide.md](04-01-deployment-guide.md) | Docker deployment procedures on QNAP Container Station | ✅ Complete |
| [environment-setup.md](04-02-environment-setup.md) | Environment variables and configuration management | ✅ Complete |
### Monitoring & Maintenance
| Document | Description | Status |
| -------------------------------------------------------- | --------------------------------------------------- | ---------- |
| [monitoring-alerting.md](04-03-monitoring-alerting.md) | Monitoring setup, health checks, and alerting rules | ✅ Complete |
| [backup-recovery.md](04-04-backup-recovery.md) | Backup strategies and disaster recovery procedures | ✅ Complete |
| [maintenance-procedures.md](04-05-maintenance-procedures.md) | Routine maintenance and update procedures | ✅ Complete |
### Security & Compliance
| Document | Description | Status |
| -------------------------------------------------- | ---------------------------------------------- | ---------- |
| [security-operations.md](04-06-security-operations.md) | Security monitoring and incident response | ✅ Complete |
| [incident-response.md](04-07-incident-response.md) | Incident classification and response playbooks | ✅ Complete |
---
## 🚀 Quick Start for Operations Team
### Initial Setup
1. **Read Deployment Guide** - [deployment-guide.md](04-01-deployment-guide.md)
2. **Configure Environment** - [environment-setup.md](04-02-environment-setup.md)
3. **Setup Monitoring** - [monitoring-alerting.md](04-03-monitoring-alerting.md)
4. **Configure Backups** - [backup-recovery.md](04-04-backup-recovery.md)
### Daily Operations
1. Monitor system health via logs and metrics
2. Review backup status (automated daily)
3. Check for security alerts
4. Review system performance metrics
### Weekly/Monthly Tasks
- Review and update SSL certificates (90 days before expiry)
- Database optimization and cleanup
- Log rotation and archival
- Security patch review and application
---
## 🏗️ Infrastructure Overview
### QNAP Container Station Architecture
```mermaid
graph TB
subgraph "QNAP Server"
subgraph "Container Station"
NGINX[NGINX<br/>Reverse Proxy<br/>Port 80/443]
Backend[NestJS Backend<br/>Port 3000]
Frontend[Next.js Frontend<br/>Port 3001]
MariaDB[(MariaDB 11.8<br/>Port 3306)]
Redis[(Redis 7.2<br/>Port 6379)]
ES[(Elasticsearch<br/>Port 9200)]
end
Volumes[("Persistent Volumes<br/>- database<br/>- uploads<br/>- logs")]
end
Internet([Internet]) --> NGINX
NGINX --> Frontend
NGINX --> Backend
Backend --> MariaDB
Backend --> Redis
Backend --> ES
MariaDB --> Volumes
Backend --> Volumes
```
### Container Services
| Service | Container Name | Ports | Persistent Volume |
| ------------- | ------------------- | ------- | ----------------------------- |
| NGINX | lcbp3-nginx | 80, 443 | /config/nginx |
| Backend | lcbp3-backend | 3000 | /app/uploads, /app/logs |
| Frontend | lcbp3-frontend | 3001 | - |
| MariaDB | lcbp3-mariadb | 3306 | /var/lib/mysql |
| Redis | lcbp3-redis | 6379 | /data |
| Elasticsearch | lcbp3-elasticsearch | 9200 | /usr/share/elasticsearch/data |
---
## 👥 Roles & Responsibilities
### System Administrator
- Deploy and configure infrastructure
- Manage QNAP server and Container Station
- Configure networking and firewall rules
- SSL certificate management
### Database Administrator (DBA)
- Database backup and recovery
- Performance tuning and optimization
- Migration execution
- Access control management
### DevOps Engineer
- CI/CD pipeline maintenance
- Container orchestration
- Monitoring and alerting setup
- Log aggregation
### Security Officer
- Security monitoring
- Incident response coordination
- Access audit reviews
- Vulnerability management
---
## 📞 Support & Escalation
### Support Tiers
**Tier 1: User Support**
- User access issues
- Password resets
- Basic troubleshooting
**Tier 2: Technical Support**
- Application errors
- Performance issues
- Feature bugs
**Tier 3: Operations Team**
- Infrastructure failures
- Database issues
- Security incidents
### Escalation Path
1. **Minor Issues** → Tier 1/2 Support → Resolution within 24h
2. **Major Issues** → Tier 3 Operations → Resolution within 4h
3. **Critical Issues** → Immediate escalation to System Architect → Resolution within 1h
---
## 🔗 Related Documentation
- [Architecture Documentation](../02-architecture/)
- [Implementation Guidelines](../03-implementation/)
- [Architecture Decision Records](../05-decisions/)
- [Backend Development Tasks](../06-tasks/)
---
## 📝 Document Maintenance
- **Review Frequency:** Monthly
- **Owner:** Operations Team
- **Last Review:** 2025-12-01
- **Next Review:** 2026-01-01
---
**Version:** 1.7.0
**Last Updated:** 2025-12-18
**Status:** Active
**Classification:** Internal Use Only

---
# Installing the Monitoring Stack on ASUSTOR
## **📝 Notes and Considerations**
> ⚠️ **Note**: The entire monitoring stack is installed on the **ASUSTOR AS5403T**, not the QNAP,
> to keep the application workload separate from the infrastructure/monitoring workload.
The monitoring stack consists of:
| Service | Port | Purpose | Host |
| :---------------- | :--------------------------- | :--------------------------------- | :------ |
| **Prometheus** | 9090 | Metrics and time-series storage | ASUSTOR |
| **Grafana** | 3000 | Dashboards for visualizing metrics | ASUSTOR |
| **Node Exporter** | 9100 | Host system metrics | Both |
| **cAdvisor** | 8080 (ASUSTOR) / 8088 (QNAP) | Docker container metrics | Both |
| **Uptime Kuma** | 3001 | Service availability monitoring | ASUSTOR |
| **Loki** | 3100 | Log aggregation | ASUSTOR |
| **Promtail** | - | Log shipper (sender) | ASUSTOR |
---
## 🏗️ Architecture Overview
```
┌─────────────────────────────────────────────────────────────────────────┐
│ ASUSTOR AS5403T (Monitoring Hub) │
├─────────────────────────────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Prometheus │───▶│ Grafana │ │ Uptime Kuma │ │
│ │ :9090 │ │ :3000 │ │ :3001 │ │
│ └──────┬──────┘ └─────────────┘ └─────────────┘ │
│ │ │
│ │ Scrape Metrics │
│ ▼ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │node-exporter│ │ cAdvisor │ │ Promtail │ │
│ │ :9100 │ │ :8080 │ │ (Log Ship) │ │
│ │ (Local) │ │ (Local) │ │ (Local) │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
│ Remote Scrape
┌─────────────────────────────────────────────────────────────────────────┐
│ QNAP TS-473A (App Server) │
├─────────────────────────────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │node-exporter│ │ cAdvisor │ │ Backend │ │
│ │ :9100 │ │ :8080 │ │ /metrics │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
```
---
## Set Directory Permissions (on ASUSTOR)
```bash
# SSH into ASUSTOR
ssh admin@192.168.10.9
# Create directories
mkdir -p /volume1/np-dms/monitoring/prometheus/data
mkdir -p /volume1/np-dms/monitoring/prometheus/config
mkdir -p /volume1/np-dms/monitoring/grafana/data
mkdir -p /volume1/np-dms/monitoring/uptime-kuma/data
mkdir -p /volume1/np-dms/monitoring/loki/data
mkdir -p /volume1/np-dms/monitoring/promtail/config
# Set ownership to match the user ID used inside each container
# Prometheus (UID 65534 - nobody)
chown -R 65534:65534 /volume1/np-dms/monitoring/prometheus
chmod -R 750 /volume1/np-dms/monitoring/prometheus
# Grafana (UID 472)
chown -R 472:472 /volume1/np-dms/monitoring/grafana/data
chmod -R 750 /volume1/np-dms/monitoring/grafana/data
# Uptime Kuma (UID 1000)
chown -R 1000:1000 /volume1/np-dms/monitoring/uptime-kuma/data
chmod -R 750 /volume1/np-dms/monitoring/uptime-kuma/data
# Loki (UID 10001)
chown -R 10001:10001 /volume1/np-dms/monitoring/loki/data
chmod -R 750 /volume1/np-dms/monitoring/loki/data
# Promtail (runs as root to read docker logs - no specific chown needed for the config dir if created by admin)
# But ensure the config file is readable
chmod -R 755 /volume1/np-dms/monitoring/promtail/config
```
---
## 🔗 Create the Docker Network (one-time, first setup only)
> ⚠️ **The network must be created before deploying any of the docker-compose stacks**, because every service uses `lcbp3` as an external network.
### Create via Portainer (recommended)
1. Open **Portainer** → select the ASUSTOR environment
2. Go to **Networks** → **Add network**
3. Fill in:
   - **Name:** `lcbp3`
   - **Driver:** `bridge`
4. Click **Create the network**
### Create via SSH
```bash
# SSH into ASUSTOR
ssh admin@192.168.10.9
# Create the external network
docker network create lcbp3
# Verify
docker network ls | grep lcbp3
docker network inspect lcbp3
```
> 📖 The **QNAP** must also have a network named `lcbp3` (create it via Container Station or SSH).
> See the [README.md Quick Reference](README.md#-quick-reference) for the QNAP commands.
---
## Note: NPM Proxy Configuration (NPM runs on QNAP → forwards to ASUSTOR)
> ⚠️ Because NPM runs on the **QNAP** while the monitoring services run on the **ASUSTOR**,
> use the **IP address** (`192.168.10.9`) instead of the container name (container names do not resolve across hosts).
| Domain Names | Scheme | Forward Hostname | Forward Port | Block Common Exploits | Websockets | Force SSL | HTTP/2 |
| :--------------------- | :----- | :--------------- | :----------- | :-------------------- | :--------- | :-------- | :----- |
| grafana.np-dms.work | `http` | `192.168.10.9` | 3000 | [x] | [x] | [x] | [x] |
| prometheus.np-dms.work | `http` | `192.168.10.9` | 9090 | [x] | [ ] | [x] | [x] |
| uptime.np-dms.work | `http` | `192.168.10.9` | 3001 | [x] | [x] | [x] | [x] |
---
## Docker Compose File (ASUSTOR)
```yaml
# File: /volume1/np-dms/monitoring/docker-compose.yml
# DMS Container v1.8.0: Application name: lcbp3-monitoring
# Deploy on: ASUSTOR AS5403T
# Services: prometheus, grafana, node-exporter, cadvisor, uptime-kuma, loki, promtail
x-restart: &restart_policy
restart: unless-stopped
x-logging: &default_logging
logging:
driver: "json-file"
options:
max-size: "10m"
max-file: "5"
networks:
lcbp3:
external: true
services:
# ----------------------------------------------------------------
# 1. Prometheus (Metrics Collection & Storage)
# ----------------------------------------------------------------
prometheus:
<<: [*restart_policy, *default_logging]
image: prom/prometheus:v2.48.0
container_name: prometheus
stdin_open: true
tty: true
deploy:
resources:
limits:
cpus: "1.0"
memory: 1G
reservations:
cpus: "0.25"
memory: 256M
environment:
TZ: "Asia/Bangkok"
command:
- '--config.file=/etc/prometheus/prometheus.yml'
- '--storage.tsdb.path=/prometheus'
- '--storage.tsdb.retention.time=30d'
- '--web.enable-lifecycle'
ports:
- "9090:9090"
networks:
- lcbp3
volumes:
- "/volume1/np-dms/monitoring/prometheus/config:/etc/prometheus:ro"
- "/volume1/np-dms/monitoring/prometheus/data:/prometheus"
healthcheck:
test: ["CMD", "wget", "--spider", "-q", "http://localhost:9090/-/healthy"]
interval: 30s
timeout: 10s
retries: 3
# ----------------------------------------------------------------
# 2. Grafana (Dashboard & Visualization)
# ----------------------------------------------------------------
grafana:
<<: [*restart_policy, *default_logging]
image: grafana/grafana:10.2.2
container_name: grafana
stdin_open: true
tty: true
deploy:
resources:
limits:
cpus: "1.0"
memory: 512M
reservations:
cpus: "0.25"
memory: 128M
environment:
TZ: "Asia/Bangkok"
GF_SECURITY_ADMIN_USER: admin
GF_SECURITY_ADMIN_PASSWORD: "Center#2025"
GF_SERVER_ROOT_URL: "https://grafana.np-dms.work"
GF_INSTALL_PLUGINS: grafana-clock-panel,grafana-piechart-panel
ports:
- "3000:3000"
networks:
- lcbp3
volumes:
- "/volume1/np-dms/monitoring/grafana/data:/var/lib/grafana"
depends_on:
- prometheus
healthcheck:
test: ["CMD-SHELL", "wget --spider -q http://localhost:3000/api/health || exit 1"]
interval: 30s
timeout: 10s
retries: 3
# ----------------------------------------------------------------
# 3. Uptime Kuma (Service Availability Monitoring)
# ----------------------------------------------------------------
uptime-kuma:
<<: [*restart_policy, *default_logging]
image: louislam/uptime-kuma:1
container_name: uptime-kuma
deploy:
resources:
limits:
cpus: "0.5"
memory: 256M
environment:
TZ: "Asia/Bangkok"
ports:
- "3001:3001"
networks:
- lcbp3
volumes:
- "/volume1/np-dms/monitoring/uptime-kuma/data:/app/data"
healthcheck:
test: ["CMD-SHELL", "curl -f http://localhost:3001/api/entry-page || exit 1"]
interval: 30s
timeout: 10s
retries: 3
# ----------------------------------------------------------------
# 4. Node Exporter (Host Metrics - ASUSTOR)
# ----------------------------------------------------------------
node-exporter:
<<: [*restart_policy, *default_logging]
image: prom/node-exporter:v1.7.0
container_name: node-exporter
deploy:
resources:
limits:
cpus: "0.5"
memory: 128M
environment:
TZ: "Asia/Bangkok"
command:
- '--path.procfs=/host/proc'
- '--path.sysfs=/host/sys'
- '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
ports:
- "9100:9100"
networks:
- lcbp3
volumes:
- /proc:/host/proc:ro
- /sys:/host/sys:ro
- /:/rootfs:ro
healthcheck:
test: ["CMD", "wget", "--spider", "-q", "http://localhost:9100/metrics"]
interval: 30s
timeout: 10s
retries: 3
# ----------------------------------------------------------------
# 5. cAdvisor (Container Metrics - ASUSTOR)
# ----------------------------------------------------------------
cadvisor:
<<: [*restart_policy, *default_logging]
image: gcr.io/cadvisor/cadvisor:v0.47.2
container_name: cadvisor
privileged: true
devices:
- /dev/kmsg
deploy:
resources:
limits:
cpus: "0.5"
memory: 256M
environment:
TZ: "Asia/Bangkok"
ports:
- "8088:8088"
networks:
- lcbp3
volumes:
- /:/rootfs:ro
- /var/run:/var/run:ro
- /sys:/sys:ro
- /var/lib/docker/:/var/lib/docker:ro
- /dev/disk/:/dev/disk:ro
healthcheck:
test: ["CMD", "wget", "--spider", "-q", "http://localhost:8080/healthz"]
interval: 30s
timeout: 10s
retries: 3
# ----------------------------------------------------------------
# 6. Loki (Log Aggregation)
# ----------------------------------------------------------------
loki:
<<: [*restart_policy, *default_logging]
image: grafana/loki:2.9.0
container_name: loki
deploy:
resources:
limits:
cpus: "0.5"
memory: 512M
environment:
TZ: "Asia/Bangkok"
command: -config.file=/etc/loki/local-config.yaml
ports:
- "3100:3100"
networks:
- lcbp3
volumes:
- "/volume1/np-dms/monitoring/loki/data:/loki"
healthcheck:
test: ["CMD", "wget", "--spider", "-q", "http://localhost:3100/ready"]
interval: 30s
timeout: 10s
retries: 3
# ----------------------------------------------------------------
# 7. Promtail (Log Shipper)
# ----------------------------------------------------------------
promtail:
<<: [*restart_policy, *default_logging]
image: grafana/promtail:2.9.0
container_name: promtail
user: "0:0"
deploy:
resources:
limits:
cpus: "0.5"
memory: 256M
environment:
TZ: "Asia/Bangkok"
command: -config.file=/etc/promtail/promtail-config.yml
networks:
- lcbp3
volumes:
- "/volume1/np-dms/monitoring/promtail/config:/etc/promtail:ro"
- "/var/run/docker.sock:/var/run/docker.sock:ro"
- "/var/lib/docker/containers:/var/lib/docker/containers:ro"
depends_on:
- loki
```
---
## QNAP Node Exporter & cAdvisor
Install node-exporter, cAdvisor, and mysqld-exporter on the QNAP so that Prometheus on the ASUSTOR can scrape their metrics:
```yaml
# File: /share/np-dms/monitoring/docker-compose.yml (QNAP)
# Exporters only - metrics are scraped by Prometheus on the ASUSTOR
version: '3.8'
networks:
lcbp3:
external: true
services:
node-exporter:
image: prom/node-exporter:v1.7.0
container_name: node-exporter
restart: unless-stopped
command:
- '--path.procfs=/host/proc'
- '--path.sysfs=/host/sys'
- '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
ports:
- "9100:9100"
networks:
- lcbp3
volumes:
- /proc:/host/proc:ro
- /sys:/host/sys:ro
- /:/rootfs:ro
cadvisor:
image: gcr.io/cadvisor/cadvisor:v0.47.2
container_name: cadvisor
restart: unless-stopped
privileged: true
ports:
- "8088:8080"
networks:
- lcbp3
volumes:
- /:/rootfs:ro
- /var/run:/var/run:ro
- /sys:/sys:ro
- /var/lib/docker/:/var/lib/docker:ro
- /sys/fs/cgroup:/sys/fs/cgroup:ro
mysqld-exporter:
image: prom/mysqld-exporter:v0.15.0
container_name: mysqld-exporter
restart: unless-stopped
user: root
command:
- '--config.my-cnf=/etc/mysql/my.cnf'
ports:
- "9104:9104"
networks:
- lcbp3
volumes:
- "/share/np-dms/monitoring/mysqld-exporter/.my.cnf:/etc/mysql/my.cnf:ro"
```
---
## Prometheus Configuration
Create the file `/volume1/np-dms/monitoring/prometheus/config/prometheus.yml` on the ASUSTOR:
```yaml
global:
scrape_interval: 15s
evaluation_interval: 15s
scrape_configs:
# Prometheus self-monitoring (ASUSTOR)
- job_name: 'prometheus'
static_configs:
- targets: ['localhost:9090']
# ============================================
# ASUSTOR Metrics (Local)
# ============================================
# Host metrics from Node Exporter (ASUSTOR)
- job_name: 'asustor-node'
static_configs:
- targets: ['node-exporter:9100']
labels:
host: 'asustor'
# Container metrics from cAdvisor (ASUSTOR)
- job_name: 'asustor-cadvisor'
static_configs:
- targets: ['cadvisor:8080']
labels:
host: 'asustor'
# ============================================
# QNAP Metrics (Remote - 192.168.10.8)
# ============================================
# Host metrics from Node Exporter (QNAP)
- job_name: 'qnap-node'
static_configs:
- targets: ['192.168.10.8:9100']
labels:
host: 'qnap'
# Container metrics from cAdvisor (QNAP)
- job_name: 'qnap-cadvisor'
static_configs:
- targets: ['192.168.10.8:8088']
labels:
host: 'qnap'
# Backend NestJS application (QNAP)
- job_name: 'backend'
static_configs:
- targets: ['192.168.10.8:3000']
labels:
host: 'qnap'
metrics_path: '/metrics'
# MariaDB Exporter (QNAP)
- job_name: 'mariadb'
static_configs:
- targets: ['192.168.10.8:9104']
labels:
host: 'qnap'
```
---
## Uptime Kuma Monitors
Once Uptime Kuma is up, add the following monitors:
| Monitor Name | Type | URL / Host | Interval |
| :------------ | :--- | :--------------------------------- | :------- |
| QNAP NPM | HTTP | https://npm.np-dms.work | 60s |
| Frontend | HTTP | https://lcbp3.np-dms.work | 60s |
| Backend API | HTTP | https://backend.np-dms.work/health | 60s |
| MariaDB | TCP | 192.168.10.8:3306 | 60s |
| Redis | TCP | 192.168.10.8:6379 | 60s |
| Elasticsearch | HTTP | http://192.168.10.8:9200 | 60s |
| Gitea | HTTP | https://git.np-dms.work | 60s |
| n8n | HTTP | https://n8n.np-dms.work | 60s |
| Grafana | HTTP | https://grafana.np-dms.work | 60s |
| QNAP Host | Ping | 192.168.10.8 | 60s |
| ASUSTOR Host | Ping | 192.168.10.9 | 60s |
---
## Grafana Dashboards
### Recommended Dashboards to Import
| Dashboard ID | Name | Purpose |
| :----------- | :--------------------------- | :----------------------------- |
| 1860 | Node Exporter Full | Host system metrics |
| 14282 | cAdvisor exporter | Container metrics |
| 11074 | Node Exporter for Prometheus | Node overview |
| 893 | Docker and Container | Docker overview |
| 7362 | MySQL | MySQL view |
| 1214 | Redis | Redis view |
| 14204 | Elasticsearch | Elasticsearch view |
| 13106 | MySQL/MariaDB Overview | Detailed MySQL/MariaDB metrics |
### Import Dashboard via Grafana UI
1. Go to **Dashboards → Import**
2. Enter Dashboard ID (e.g., `1860`)
3. Select Prometheus data source
4. Click **Import**
---
## 🚀 Deploy lcbp3-monitoring on ASUSTOR
### 📋 Prerequisites Checklist
| # | Step | Status |
| :--- | :------------------------------------------------------------------------------------------ | :----- |
| 1 | Can SSH into the ASUSTOR (`ssh admin@192.168.10.9`) | ✅ |
| 2 | Docker network `lcbp3` has been created (see the Docker Network section above) | ✅ |
| 3 | Directories created and permissions set (see the directory permissions section above) | ✅ |
| 4 | `prometheus.yml` created (see the Prometheus Configuration section above) | ✅ |
| 5 | `promtail-config.yml` created (see Step 1.2 below) | ✅ |
---
### Step 1: Create prometheus.yml
```bash
# SSH into ASUSTOR
ssh admin@192.168.10.9
# Create prometheus.yml
cat > /volume1/np-dms/monitoring/prometheus/config/prometheus.yml << 'EOF'
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'asustor-node'
    static_configs:
      - targets: ['node-exporter:9100']
        labels:
          host: 'asustor'

  - job_name: 'asustor-cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']
        labels:
          host: 'asustor'

  - job_name: 'qnap-node'
    static_configs:
      - targets: ['192.168.10.8:9100']
        labels:
          host: 'qnap'

  - job_name: 'qnap-cadvisor'
    static_configs:
      - targets: ['192.168.10.8:8088']
        labels:
          host: 'qnap'

  - job_name: 'backend'
    static_configs:
      - targets: ['192.168.10.8:3000']
        labels:
          host: 'qnap'
    metrics_path: '/metrics'
EOF
# ตรวจสอบ
cat /volume1/np-dms/monitoring/prometheus/config/prometheus.yml
```
### Step 1.2: สร้าง promtail-config.yml
ต้องสร้าง Config ให้ Promtail อ่าน logs จาก Docker containers และส่งไป Loki:
```bash
# สร้างไฟล์ promtail-config.yml
cat > /volume1/np-dms/monitoring/promtail/config/promtail-config.yml << 'EOF'
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
  - job_name: docker
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
        refresh_interval: 5s
    relabel_configs:
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: 'container'
      - source_labels: ['__meta_docker_container_log_stream']
        target_label: 'stream'
EOF

# ตรวจสอบ
cat /volume1/np-dms/monitoring/promtail/config/promtail-config.yml
```

---

## ขั้นตอนการเตรียมระบบที่ QNAP (ก่อน Deploy Stack)

### 1. สร้าง Monitoring User ใน MariaDB

รันคำสั่ง SQL นี้ผ่าน **phpMyAdmin** หรือ `docker exec`:

```sql
CREATE USER 'exporter'@'%' IDENTIFIED BY 'Center2025' WITH MAX_USER_CONNECTIONS 3;
GRANT PROCESS, REPLICATION CLIENT, SELECT, SLAVE MONITOR ON *.* TO 'exporter'@'%';
FLUSH PRIVILEGES;
```

### 2. สร้างไฟล์คอนฟิก .my.cnf บน QNAP

เพื่อให้ `mysqld-exporter` อ่านรหัสผ่านที่มีตัวอักษรพิเศษได้ถูกต้อง:

1. **SSH เข้า QNAP** (หรือใช้ File Station สร้าง Folder):

```bash
ssh admin@192.168.10.8
```

2. **สร้าง Directory สำหรับเก็บ Config**:

```bash
mkdir -p /share/np-dms/monitoring/mysqld-exporter
```

3. **สร้างไฟล์ .my.cnf**:

```bash
cat > /share/np-dms/monitoring/mysqld-exporter/.my.cnf << 'EOF'
[client]
user=exporter
password=Center2025
host=mariadb
EOF
```

4. **กำหนดสิทธิ์ไฟล์** (เพื่อให้ Container อ่านไฟล์ได้):

```bash
chmod 644 /share/np-dms/monitoring/mysqld-exporter/.my.cnf
```
---
### Step 2: Deploy ผ่าน Portainer (แนะนำ)
1. เปิด **Portainer** → เลือก Environment ของ **ASUSTOR**
2. ไปที่ **Stacks** → **Add stack**
3. กรอกข้อมูล:
- **Name:** `lcbp3-monitoring`
- **Build method:** เลือก **Web editor**
4. วาง (Paste) เนื้อหาจาก [Docker Compose File (ASUSTOR)](#docker-compose-file-asustor) ด้านบน
5. กด **Deploy the stack**
> ⚠️ **สำคัญ:** ตรวจสอบ Password ของ Grafana (`GF_SECURITY_ADMIN_PASSWORD`) ใน docker-compose ก่อน deploy
### Deploy ผ่าน SSH (วิธีสำรอง)
```bash
# SSH เข้า ASUSTOR
ssh admin@192.168.10.9
# คัดลอก docker-compose.yml ไปยัง path
# (วางไฟล์ที่ /volume1/np-dms/monitoring/docker-compose.yml)
# Deploy
cd /volume1/np-dms/monitoring
docker compose up -d
# ตรวจสอบ container status
docker compose ps
```
---
### Step 3: Verify Services
```bash
# ตรวจสอบ containers ทั้งหมด
docker ps --filter "name=prometheus" --filter "name=grafana" \
--filter "name=uptime-kuma" --filter "name=node-exporter" \
--filter "name=cadvisor" --filter "name=loki" --filter "name=promtail"
```
| Service | วิธีตรวจสอบ | Expected Result |
| :---------------- | :----------------------------------------------------------------- | :------------------------------------ |
| ✅ **Prometheus** | `curl http://192.168.10.9:9090/-/healthy` | `Prometheus Server is Healthy` |
| ✅ **Grafana** | เปิด `https://grafana.np-dms.work` (หรือ `http://192.168.10.9:3000`) | หน้า Login |
| ✅ **Uptime Kuma** | เปิด `https://uptime.np-dms.work` (หรือ `http://192.168.10.9:3001`) | หน้า Setup |
| ✅ **Node Exp.** | `curl http://192.168.10.9:9100/metrics \| head` | Metrics output |
| ✅ **cAdvisor** | `curl http://192.168.10.9:8080/healthz` | `ok` |
| ✅ **Loki** | `curl http://192.168.10.9:3100/ready` | `ready` |
| ✅ **Promtail** | เช็ค Logs: `docker logs promtail` | ไม่ควรมี Error + เห็น connection success |
---
### Step 4: Deploy QNAP Exporters
ติดตั้ง node-exporter และ cAdvisor บน QNAP เพื่อให้ Prometheus scrape ข้ามเครื่องได้:
#### ผ่าน Container Station (QNAP)
1. เปิด **Container Station** บน QNAP Web UI
2. ไปที่ **Applications** → **Create**
3. ตั้งชื่อ Application: `lcbp3-exporters`
4. วาง (Paste) เนื้อหาจาก [QNAP Node Exporter & cAdvisor](#qnap-node-exporter--cadvisor)
5. กด **Create**
#### ตรวจสอบจาก ASUSTOR
```bash
# ตรวจว่า Prometheus scrape QNAP ได้
curl -s http://localhost:9090/api/v1/targets | grep -E '"qnap-(node|cadvisor)"'
# หรือเปิด Prometheus UI → Targets
# URL: http://192.168.10.9:9090/targets
# ดูว่า qnap-node, qnap-cadvisor เป็น State: UP
```
---
### Step 5: ตั้งค่า Grafana & Uptime Kuma
#### Grafana — First Login
1. เปิด `https://grafana.np-dms.work`
2. Login: `admin` / `Center#2025` (หรือ password ที่ตั้งไว้)
3. ไปที่ **Connections** → **Data sources** → **Add data source**
4. เลือก **Prometheus**
- URL: `http://prometheus:9090`
- กด **Save & Test** → ต้องขึ้น ✅
5. Import Dashboards (ดูรายละเอียดในหัวข้อ [6. Grafana Dashboards Setup](#6-grafana-dashboards-setup))
#### Uptime Kuma — First Setup
1. เปิด `https://uptime.np-dms.work`
2. สร้าง Admin account
3. เพิ่ม Monitors ตาม [ตาราง Uptime Kuma Monitors](#uptime-kuma-monitors)
---
### 6. Grafana Dashboards Setup
เพื่อการ Monitor ที่สมบูรณ์ แนะนำให้ Import Dashboards ต่อไปนี้:
#### 6.1 Host Monitoring (Node Exporter)
* **Concept:** ดู resource ของเครื่อง Host (CPU, RAM, Disk, Network)
* **Dashboard ID:** `1860` (Node Exporter Full)
* **วิธี Import:**
1. ไปที่ **Dashboards** → **New** → **Import**
2. ช่อง **Import via grafana.com** ใส่เลข `1860` กด **Load**
3. เลือก Data source: **Prometheus**
4. กด **Import**
#### 6.2 Container Monitoring (cAdvisor)
* **Concept:** ดู resource ของแต่ละ Container (เชื่อม Logs ด้วย)
* **Dashboard ID:** `14282` (Cadvisor exporter)
* **วิธี Import:**
1. ใส่เลข `14282` กด **Load**
2. เลือก Data source: **Prometheus**
3. กด **Import**
#### 6.3 Logs Monitoring (Loki Integration)
เพื่อให้ Dashboard ของ Container แสดง Logs จาก Loki ได้ด้วย:
1. เปิด Dashboard **Cadvisor exporter** ที่เพิ่ง Import มา
2. กดปุ่ม **Add visualization** (หรือ Edit dashboard)
3. เลือก Data source: **Loki**
4. ในช่อง Query ใส่: `{container="$name"}`
* *(Note: `$name` มาจาก Variable ของ Dashboard 14282)*
5. ปรับ Visualization type เป็น **Logs**
6. ตั้งชื่อ Panel ว่า **"Container Logs"**
7. กด **Apply** และ **Save Dashboard**
ตอนนี้เราจะเห็นทั้ง **กราฟการกินทรัพยากร** และ **Logs** ของ Container นั้นๆ ในหน้าเดียวกันครับ
#### 6.4 Integrated Dashboard (Recommended)
ผมได้เตรียม JSON file ที่รวม Metrics และ Logs ไว้ให้แล้วครับ:
1. ไปที่ **Dashboards** → **New** → **Import**
2. ลากไฟล์ หรือ Copy เนื้อหาจากไฟล์:
`specs/08-infrastructure/grafana/dashboards/lcbp3-docker-monitoring.json`
3. กด **Load** และ **Import**
## 7.3 Backup / Export Dashboards
เมื่อปรับแต่ง Dashboard จนพอใจแล้ว ควร Export เก็บเป็นไฟล์ JSON ไว้ backup หรือ version control:
1. เปิด Dashboard ที่ต้องการ backup
2. ไปที่ปุ่ม **Share Dashboard** (ไอคอน 🔗 หรือ Share มุมซ้ายบน)
3. เลือก Tab **Export**
4. เปิดตัวเลือก **Export for sharing externally** (เพื่อให้ลบ hardcoded value)
5. กด **Save to file**
6. นำไฟล์ JSON มาเก็บไว้ที่ path: `specs/08-infrastructure/grafana/dashboards/`
---
> 📝 **หมายเหตุ**: เอกสารนี้อ้างอิงจาก Architecture Document **v1.8.0** - Monitoring Stack deploy บน ASUSTOR AS5304T

---
# Backup Strategy สำหรับ LCBP3-DMS
> 📍 **Deploy on:** ASUSTOR AS5304T (Infrastructure Server)
> 🎯 **Backup Target:** QNAP TS-473A (Application & Database)
> 📄 **Version:** v1.8.0
---
## Overview
ระบบ Backup แบบ Pull-based: ASUSTOR ดึงข้อมูลจาก QNAP เพื่อความปลอดภัย
หาก QNAP ถูกโจมตี ผู้โจมตีจะไม่สามารถลบ Backup บน ASUSTOR ได้
```
┌─────────────────────────────────────────────────────────────────┐
│ BACKUP ARCHITECTURE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ QNAP (Source) ASUSTOR (Backup Target) │
│ 192.168.10.8 192.168.10.9 │
│ │
│ ┌──────────────┐ SSH/Rsync ┌──────────────────────┐ │
│ │ MariaDB │ ─────────────▶ │ /volume1/backup/db/ │ │
│ │ (mysqldump) │ Daily 2AM │ (Restic Repository) │ │
│ └──────────────┘ └──────────────────────┘ │
│ │
│ ┌──────────────┐ ┌──────────────────────┐ │
│ │ Redis RDB │ ─────────────▶ │ /volume1/backup/ │ │
│ │ + AOF │ Daily 3AM │ redis/ │ │
│ └──────────────┘ └──────────────────────┘ │
│ │
│ ┌──────────────┐ ┌──────────────────────┐ │
│ │ App Config │ ─────────────▶ │ /volume1/backup/ │ │
│ │ + Volumes │ Weekly Sun │ config/ │ │
│ └──────────────┘ └──────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
```
---
## 1. MariaDB Backup
### 1.1 Daily Database Backup Script
```bash
#!/bin/bash
# File: /volume1/np-dms/scripts/backup-mariadb.sh
# Run on: ASUSTOR (Pull from QNAP)
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/volume1/backup/db"
QNAP_IP="192.168.10.8"
DB_NAME="lcbp3_db"
DB_USER="root"
DB_PASSWORD="${MARIADB_ROOT_PASSWORD}"
echo "🔄 Starting MariaDB backup at $DATE"
# Create backup directory
mkdir -p $BACKUP_DIR
# Remote mysqldump via SSH
ssh admin@$QNAP_IP "docker exec mariadb mysqldump \
--single-transaction \
--routines \
--triggers \
-u $DB_USER -p$DB_PASSWORD $DB_NAME" > $BACKUP_DIR/lcbp3_$DATE.sql
# Compress
gzip $BACKUP_DIR/lcbp3_$DATE.sql
# Add to Restic repository
restic -r $BACKUP_DIR/restic-repo backup $BACKUP_DIR/lcbp3_$DATE.sql.gz
# Keep only last 30 days of raw files
find $BACKUP_DIR -name "lcbp3_*.sql.gz" -mtime +30 -delete
echo "✅ MariaDB backup complete: lcbp3_$DATE.sql.gz"
```
### 1.2 Cron Schedule (ASUSTOR)
```cron
# MariaDB daily backup at 2 AM
0 2 * * * /volume1/np-dms/scripts/backup-mariadb.sh >> /var/log/backup-mariadb.log 2>&1
```
---
## 2. Redis Backup
### 2.1 Redis Backup Script
```bash
#!/bin/bash
# File: /volume1/np-dms/scripts/backup-redis.sh
# Run on: ASUSTOR (Pull from QNAP)
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/volume1/backup/redis"
QNAP_IP="192.168.10.8"
echo "🔄 Starting Redis backup at $DATE"
mkdir -p $BACKUP_DIR
# Trigger BGSAVE on QNAP Redis
ssh admin@$QNAP_IP "docker exec cache redis-cli BGSAVE"
sleep 10
# Copy RDB and AOF files
scp admin@$QNAP_IP:/share/np-dms/services/cache/data/dump.rdb $BACKUP_DIR/redis_$DATE.rdb
scp admin@$QNAP_IP:/share/np-dms/services/cache/data/appendonly.aof $BACKUP_DIR/redis_$DATE.aof
# Compress (ใช้ -C เพื่อให้ archive เก็บ relative path ทำให้ restore ง่ายขึ้น)
tar -czf $BACKUP_DIR/redis_$DATE.tar.gz -C $BACKUP_DIR \
  redis_$DATE.rdb \
  redis_$DATE.aof
# Cleanup raw files
rm $BACKUP_DIR/redis_$DATE.rdb $BACKUP_DIR/redis_$DATE.aof
echo "✅ Redis backup complete: redis_$DATE.tar.gz"
```
### 2.2 Cron Schedule
```cron
# Redis daily backup at 3 AM
0 3 * * * /volume1/np-dms/scripts/backup-redis.sh >> /var/log/backup-redis.log 2>&1
```
---
## 3. Application Config Backup
### 3.1 Weekly Config Backup Script
```bash
#!/bin/bash
# File: /volume1/np-dms/scripts/backup-config.sh
# Run on: ASUSTOR (Pull from QNAP)
DATE=$(date +%Y%m%d)
BACKUP_DIR="/volume1/backup/config"
QNAP_IP="192.168.10.8"
echo "🔄 Starting config backup at $DATE"
mkdir -p $BACKUP_DIR
# Sync Docker compose files and configs
rsync -avz --delete \
admin@$QNAP_IP:/share/np-dms/ \
$BACKUP_DIR/np-dms_$DATE/ \
--exclude='*/data/*' \
--exclude='*/logs/*' \
--exclude='node_modules'
# Compress (ใช้ -C เพื่อให้ archive เก็บ relative path)
tar -czf $BACKUP_DIR/config_$DATE.tar.gz -C $BACKUP_DIR np-dms_$DATE
# Cleanup
rm -rf $BACKUP_DIR/np-dms_$DATE
echo "✅ Config backup complete: config_$DATE.tar.gz"
```
### 3.2 Cron Schedule
```cron
# Config weekly backup on Sunday at 4 AM
0 4 * * 0 /volume1/np-dms/scripts/backup-config.sh >> /var/log/backup-config.log 2>&1
```
---
## 4. Retention Policy
| Backup Type | Frequency | Retention | Storage Est. |
| :---------- | :-------- | :-------- | :----------- |
| MariaDB | Daily | 30 days | ~5GB/month |
| Redis | Daily | 7 days | ~500MB |
| Config | Weekly | 4 weeks | ~200MB |
| Restic | Daily | 6 months | Deduplicated |
---
## 5. Restic Repository Setup
```bash
# Initialize Restic repository (one-time)
restic init -r /volume1/backup/restic-repo
# Set password in environment
export RESTIC_PASSWORD="your-secure-backup-password"
# Check repository status
restic -r /volume1/backup/restic-repo snapshots
# Prune old snapshots (keep 30 daily, 4 weekly, 6 monthly)
restic -r /volume1/backup/restic-repo forget \
--keep-daily 30 \
--keep-weekly 4 \
--keep-monthly 6 \
--prune
```
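ถ้าต้องการให้ forget/prune รันอัตโนมัติ ตัวอย่าง cron ด้านล่างเป็น sketch (password เป็นค่าสมมติ และต้องให้ `RESTIC_PASSWORD` มองเห็นได้ใน environment ของ cron ด้วย):

```cron
# Restic forget + prune ทุกวันที่ 1 เวลา 5 AM
0 5 1 * * RESTIC_PASSWORD=your-secure-backup-password restic -r /volume1/backup/restic-repo forget --keep-daily 30 --keep-weekly 4 --keep-monthly 6 --prune >> /var/log/restic-prune.log 2>&1
```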
---
## 6. Verification Script
```bash
#!/bin/bash
# File: /volume1/np-dms/scripts/verify-backup.sh
echo "📋 Backup Verification Report"
echo "=============================="
echo ""
# Check latest MariaDB backup
LATEST_DB=$(ls -t /volume1/backup/db/*.sql.gz 2>/dev/null | head -1)
if [ -n "$LATEST_DB" ]; then
echo "✅ Latest DB backup: $LATEST_DB"
echo " Size: $(du -h $LATEST_DB | cut -f1)"
else
echo "❌ No DB backup found!"
fi
# Check latest Redis backup
LATEST_REDIS=$(ls -t /volume1/backup/redis/*.tar.gz 2>/dev/null | head -1)
if [ -n "$LATEST_REDIS" ]; then
echo "✅ Latest Redis backup: $LATEST_REDIS"
else
echo "❌ No Redis backup found!"
fi
# Check Restic repository
echo ""
echo "📦 Restic Snapshots:"
restic -r /volume1/backup/restic-repo snapshots --latest 5
```
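นอกจากเช็คว่าไฟล์ backup มีอยู่แล้ว ควรเช็ค "อายุ" ของ backup ล่าสุดด้วย ตัวอย่างด้านล่างเป็น sketch (path เป็นค่าสมมติ) ที่เตือนเมื่อ backup ล่าสุดเก่ากว่า 2 วัน:

```shell
#!/bin/bash
# Sketch: เตือนถ้า backup ล่าสุดใน directory เก่ากว่า 2 วัน
BACKUP_DIR="${1:-/volume1/backup/db}"
LATEST=$(ls -t "$BACKUP_DIR"/*.sql.gz 2>/dev/null | head -1)

if [ -z "$LATEST" ]; then
  echo "STALE: no backup found in $BACKUP_DIR"
elif [ -n "$(find "$LATEST" -mtime +2 2>/dev/null)" ]; then
  echo "STALE: $LATEST older than 2 days"
else
  echo "OK: $LATEST"
fi
```

สามารถเรียกจาก cron แล้วส่งผลลัพธ์เข้า Uptime Kuma / n8n เพื่อแจ้งเตือนต่อได้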
---
> 📝 **หมายเหตุ**: เอกสารนี้อ้างอิงจาก Architecture Document **v1.8.0**

---
# Disaster Recovery Plan สำหรับ LCBP3-DMS
> 📍 **Version:** v1.8.0
> 🖥️ **Primary Server:** QNAP TS-473A (Application & Database)
> 💾 **Backup Server:** ASUSTOR AS5304T (Infrastructure & Backup)
---
## RTO/RPO Targets
| Scenario | RTO | RPO | Priority |
| :-------------------------- | :------ | :----- | :------- |
| Single backend node failure | 0 min | 0 | P0 |
| Redis failure | 5 min | 0 | P0 |
| MariaDB failure | 10 min | 0 | P0 |
| QNAP total failure | 2 hours | 15 min | P1 |
| Data corruption | 4 hours | 1 day | P2 |
---
## 1. Quick Recovery Procedures
### 1.1 Service Not Responding
```bash
# Check container status
docker ps -a | grep <service-name>
# Restart specific service
docker restart <container-name>
# Check logs for errors
docker logs <container-name> --tail 100
```
### 1.2 Redis Failure
```bash
# Check status
docker exec cache redis-cli ping
# Restart
docker restart cache
# Verify
docker exec cache redis-cli ping
```
### 1.3 MariaDB Failure
```bash
# Check status
docker exec mariadb mysql -u root -p -e "SELECT 1"
# Restart
docker restart mariadb
# Wait for startup
sleep 30
# Verify
docker exec mariadb mysql -u root -p -e "SHOW DATABASES"
```
---
## 2. Full System Recovery
### 2.1 Recovery Prerequisites (ASUSTOR)
ตรวจสอบว่า Backup files พร้อมใช้งาน:
```bash
# SSH to ASUSTOR
ssh admin@192.168.10.9
# List available backups
ls -la /volume1/backup/db/
ls -la /volume1/backup/redis/
ls -la /volume1/backup/config/
# Check Restic snapshots
restic -r /volume1/backup/restic-repo snapshots
```
### 2.2 QNAP Recovery Script
```bash
#!/bin/bash
# File: /volume1/np-dms/scripts/disaster-recovery.sh
# Run on: ASUSTOR (Push to QNAP)
QNAP_IP="192.168.10.8"
BACKUP_DIR="/volume1/backup"
echo "🚨 Starting Disaster Recovery..."
echo "================================"
# 1. Restore Docker Network
echo "1⃣ Creating Docker network..."
ssh admin@$QNAP_IP "docker network create lcbp3 || true"
# 2. Restore config files
echo "2⃣ Restoring configuration files..."
LATEST_CONFIG=$(ls -t $BACKUP_DIR/config/*.tar.gz | head -1)
tar -xzf $LATEST_CONFIG -C /tmp/
rsync -avz /tmp/np-dms_*/ admin@$QNAP_IP:/share/np-dms/
# 3. Start infrastructure services
echo "3⃣ Starting MariaDB..."
ssh admin@$QNAP_IP "cd /share/np-dms/mariadb && docker-compose up -d"
sleep 30
# 4. Restore database
echo "4⃣ Restoring database..."
LATEST_DB=$(ls -t $BACKUP_DIR/db/*.sql.gz | head -1)
gunzip -c $LATEST_DB | ssh admin@$QNAP_IP "docker exec -i mariadb mysql -u root -p\$MYSQL_ROOT_PASSWORD lcbp3_db"
# 5. Start Redis
echo "5⃣ Starting Redis..."
ssh admin@$QNAP_IP "cd /share/np-dms/services && docker-compose up -d cache"
# 6. Restore Redis data (if needed)
echo "6⃣ Restoring Redis data..."
LATEST_REDIS=$(ls -t $BACKUP_DIR/redis/*.tar.gz | head -1)
tar -xzf $LATEST_REDIS -C /tmp/
scp /tmp/redis_*.rdb admin@$QNAP_IP:/share/np-dms/services/cache/data/dump.rdb
ssh admin@$QNAP_IP "docker restart cache"
# 7. Start remaining services
echo "7⃣ Starting application services..."
ssh admin@$QNAP_IP "cd /share/np-dms/services && docker-compose up -d"
ssh admin@$QNAP_IP "cd /share/np-dms/npm && docker-compose up -d"
# 8. Health check
echo "8⃣ Running health checks..."
sleep 60
curl -f https://lcbp3.np-dms.work/health || echo "⚠️ Frontend not ready"
curl -f https://backend.np-dms.work/health || echo "⚠️ Backend not ready"
echo ""
echo "✅ Disaster Recovery Complete"
echo "⚠️ Please verify system functionality manually"
```
---
## 3. Data Corruption Recovery
### 3.1 Point-in-Time Recovery (Database)
```bash
# List available Restic snapshots
restic -r /volume1/backup/restic-repo snapshots
# Restore specific snapshot
restic -r /volume1/backup/restic-repo restore <snapshot-id> --target /tmp/restore/
# Apply restored backup
gunzip -c /tmp/restore/lcbp3_*.sql.gz | \
ssh admin@192.168.10.8 "docker exec -i mariadb mysql -u root -p\$MYSQL_ROOT_PASSWORD lcbp3_db"
```
### 3.2 Selective Table Recovery
```bash
# Extract specific table from backup (from its CREATE TABLE to UNLOCK TABLES)
gunzip -c /volume1/backup/db/lcbp3_YYYYMMDD.sql.gz | \
  sed -n '/CREATE TABLE `documents`/,/UNLOCK TABLES;/p' > /tmp/documents_table.sql
# Restore specific table
ssh admin@192.168.10.8 "docker exec -i mariadb mysql -u root -p\$MYSQL_ROOT_PASSWORD lcbp3_db" < /tmp/documents_table.sql
```
---
## 4. Communication & Escalation
### 4.1 Incident Response
| Severity | Response Time | Notify |
| :------- | :------------ | :----------------------------- |
| P0 | Immediate | Admin Team + Management |
| P1 | 30 minutes | Admin Team |
| P2 | 2 hours | Admin Team (next business day) |
### 4.2 Post-Incident Checklist
- [ ] Identify root cause
- [ ] Document timeline of events
- [ ] Verify all services restored
- [ ] Check data integrity
- [ ] Update monitoring alerts if needed
- [ ] Create incident report
---
## 5. Testing Schedule
| Test Type | Frequency | Last Tested | Next Due |
| :---------------------- | :-------- | :---------- | :------- |
| Backup Verification | Weekly | - | - |
| Single Service Recovery | Monthly | - | - |
| Full DR Test | Quarterly | - | - |
---
> 📝 **หมายเหตุ**: เอกสารนี้อ้างอิงจาก Architecture Document **v1.8.0**

---
# ADR-004: RBAC Implementation with 4-Level Scope
**Status:** Accepted
**Date:** 2025-11-30
**Decision Makers:** Development Team, Security Team
**Related Documents:**
- [System Architecture](../02-architecture/02-01-system-architecture.md)
- [Access Control Requirements](../01-requirements/01-04-access-control.md)
---
## Context and Problem Statement
LCBP3-DMS ต้องจัดการสิทธิ์การเข้าถึงที่ซับซ้อน:
- **Multi-Organization:** หลายองค์กรใช้ระบบร่วมกัน แต่ต้องแยกข้อมูล
- **Project-Based:** แต่ละ Project มี Contracts แยกกัน
- **Hierarchical Permissions:** สิทธิ์ระดับบนครอบคลุมระดับล่าง
- **Dynamic Roles:** Role และ Permission ต้องปรับได้โดยไม่ต้อง Deploy
### Key Requirements
1. User หนึ่งคนสามารถมีหลาย Roles ในหลาย Scopes
2. Permission Inheritance (Global → Organization → Project → Contract)
3. Fine-grained Access Control (e.g., "ดู Correspondence ได้เฉพาะ Project A")
4. Performance (Check permission ต้องเร็ว < 10ms)
---
## Decision Drivers
- **Security:** ป้องกันการเข้าถึงข้อมูลที่ไม่มีสิทธิ์
- **Flexibility:** ปรับ Roles/Permissions ได้ง่าย
- **Performance:** Check permission รวดเร็ว
- **Usability:** Admin กำหนดสิทธิ์ได้ง่าย
- **Scalability:** รองรับ Users/Organizations จำนวนมาก
---
## Considered Options
### Option 1: Simple Role-Based (No Scope)
**แนวทาง:** Users มี Roles (Admin, Editor, Viewer) เท่านั้น ไม่มี Scope
**Pros:**
- ✅ Very simple implementation
- ✅ Easy to understand
**Cons:**
- ❌ ไม่รองรับ Multi-organization
- ❌ Superadmin เห็นข้อมูลทุก Organization
- ❌ ไม่ยืดหยุ่น
### Option 2: Organization-Only Scope
**แนวทาง:** Roles ผูกกับ Organization เท่านั้น
**Pros:**
- ✅ แยกข้อมูลระหว่าง Organizations ได้
- ✅ Moderate complexity
**Cons:**
- ❌ ไม่รองรับ Project/Contract level permissions
- ❌ User ใน Organization เห็นทุก Project
### Option 3: **4-Level Hierarchical RBAC** ⭐ (Selected)
**แนวทาง:** Global → Organization → Project → Contract
**Pros:**
- ✅ **Maximum Flexibility:** ครอบคลุมทุก Use Case
- ✅ **Inheritance:** Global Admin เห็นทุกอย่าง
- ✅ **Isolation:** Project Manager เห็นแค่ Project ของตน
- ✅ **Fine-grained:** Contract Admin จัดการแค่ Contract เดียว
- ✅ **Dynamic:** Roles/Permissions configurable
**Cons:**
- ❌ Complex implementation
- ❌ Performance concern (need optimization)
- ❌ Learning curve for admins
---
## Decision Outcome
**Chosen Option:** Option 3 - 4-Level Hierarchical RBAC
### Rationale
เลือก 4-Level RBAC เนื่องจาก:
1. **Business Requirements:** Project มีหลาย Contracts ที่ต้องแยกสิทธิ์
2. **Future-proof:** รองรับการเติบโตในอนาคต
3. **CASL Integration:** ใช้ library ที่รองรับ complex permissions
4. **Redis Caching:** แก้ปัญหา Performance ด้วย Cache
---
## Implementation Details
### Database Schema
```sql
-- Roles with Scope
CREATE TABLE roles (
  role_id INT PRIMARY KEY AUTO_INCREMENT,
  role_name VARCHAR(100) NOT NULL,
  scope ENUM('Global', 'Organization', 'Project', 'Contract') NOT NULL,
  description TEXT,
  is_system BOOLEAN DEFAULT FALSE
);

-- Permissions
CREATE TABLE permissions (
  permission_id INT PRIMARY KEY AUTO_INCREMENT,
  permission_name VARCHAR(100) NOT NULL UNIQUE,
  description TEXT,
  module VARCHAR(50),
  scope_level ENUM('GLOBAL', 'ORG', 'PROJECT')
);

-- Role-Permission Mapping
CREATE TABLE role_permissions (
  role_id INT,
  permission_id INT,
  PRIMARY KEY (role_id, permission_id),
  FOREIGN KEY (role_id) REFERENCES roles(role_id) ON DELETE CASCADE,
  FOREIGN KEY (permission_id) REFERENCES permissions(permission_id) ON DELETE CASCADE
);

-- User Role Assignments with Scope Context
CREATE TABLE user_assignments (
  id INT PRIMARY KEY AUTO_INCREMENT,
  user_id INT NOT NULL,
  role_id INT NOT NULL,
  organization_id INT NULL,
  project_id INT NULL,
  contract_id INT NULL,
  assigned_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  FOREIGN KEY (user_id) REFERENCES users(user_id) ON DELETE CASCADE,
  FOREIGN KEY (role_id) REFERENCES roles(role_id) ON DELETE CASCADE,
  FOREIGN KEY (organization_id) REFERENCES organizations(id) ON DELETE CASCADE,
  FOREIGN KEY (project_id) REFERENCES projects(id) ON DELETE CASCADE,
  FOREIGN KEY (contract_id) REFERENCES contracts(id) ON DELETE CASCADE,
  CONSTRAINT chk_scope CHECK (
    (organization_id IS NOT NULL AND project_id IS NULL AND contract_id IS NULL) OR
    (organization_id IS NULL AND project_id IS NOT NULL AND contract_id IS NULL) OR
    (organization_id IS NULL AND project_id IS NULL AND contract_id IS NOT NULL) OR
    (organization_id IS NULL AND project_id IS NULL AND contract_id IS NULL)
  )
);
```
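จาก constraint `chk_scope` ข้างบน scope ของ assignment แต่ละแถวสามารถอนุมานได้จากคอลัมน์ที่ไม่เป็น NULL ตัวอย่างด้านล่างเป็น sketch (ชื่อ function เป็นค่าสมมติ ไม่ใช่โค้ดจริงในระบบ):

```typescript
// Sketch: หา scope ของ assignment หนึ่งแถวจากคอลัมน์ที่ไม่เป็น NULL
// ให้สอดคล้องกับ constraint chk_scope ด้านบน
type Assignment = {
  organization_id: number | null;
  project_id: number | null;
  contract_id: number | null;
};

function scopeOf(a: Assignment): 'Global' | 'Organization' | 'Project' | 'Contract' {
  if (a.organization_id !== null) return 'Organization';
  if (a.project_id !== null) return 'Project';
  if (a.contract_id !== null) return 'Contract';
  return 'Global'; // ทุกคอลัมน์เป็น NULL = Global scope
}
```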
### CASL Ability Rules
```typescript
// ability.factory.ts
import { Injectable } from '@nestjs/common';
import { AbilityBuilder, PureAbility } from '@casl/ability';

export type AppAbility = PureAbility<[string, any]>;

@Injectable()
export class AbilityFactory {
  async createForUser(user: User): Promise<AppAbility> {
    const { can, cannot, build } = new AbilityBuilder<AppAbility>(PureAbility);

    // Get user assignments (from cache or DB)
    const assignments = await this.getUserAssignments(user.user_id);

    for (const assignment of assignments) {
      const role = await this.getRole(assignment.role_id);
      const permissions = await this.getRolePermissions(role.role_id);

      for (const permission of permissions) {
        // permission format: 'correspondence.create', 'project.view'
        const [subject, action] = permission.permission_name.split('.');

        // Apply scope-based conditions (scope is defined on the role)
        switch (role.scope) {
          case 'Global':
            can(action, subject);
            break;
          case 'Organization':
            can(action, subject, {
              organization_id: assignment.organization_id,
            });
            break;
          case 'Project':
            can(action, subject, {
              project_id: assignment.project_id,
            });
            break;
          case 'Contract':
            can(action, subject, {
              contract_id: assignment.contract_id,
            });
            break;
        }
      }
    }

    return build();
  }
}
```
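เพื่อให้เห็นภาพว่า condition object ที่ส่งให้ `can()` ทำงานอย่างไร ด้านล่างเป็น mini-matcher ที่เขียนเลียนแบบพฤติกรรมของ CASL เพื่ออธิบายเท่านั้น (ไม่ใช่โค้ดจริงของ library):

```typescript
// Mini-matcher (เพื่ออธิบาย): rule จะ match เมื่อทุก key ใน conditions
// มีค่าตรงกับ field เดียวกันบน resource — ถ้าไม่มี conditions (Global) จะผ่านเสมอ
type Conditions = Record<string, unknown>;

function matchesConditions(
  conditions: Conditions | undefined,
  resource: Record<string, unknown>
): boolean {
  if (!conditions) return true; // Global scope: ไม่มีเงื่อนไข
  return Object.entries(conditions).every(([key, value]) => resource[key] === value);
}
```

เช่น rule ที่มี conditions `{ project_id: 1 }` จะอนุญาตเฉพาะ resource ที่อยู่ใน Project 1 เท่านั้น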
### Permission Guard
```typescript
// permission.guard.ts
import { CanActivate, ExecutionContext, Injectable } from '@nestjs/common';
import { Reflector } from '@nestjs/core';
import Redis from 'ioredis';
import { PureAbility } from '@casl/ability';
import { AbilityFactory, AppAbility } from './ability.factory';

@Injectable()
export class PermissionGuard implements CanActivate {
  constructor(
    private reflector: Reflector,
    private abilityFactory: AbilityFactory,
    private redis: Redis
  ) {}

  async canActivate(context: ExecutionContext): Promise<boolean> {
    // Get required permission from decorator
    const permission = this.reflector.get<string>(
      'permission',
      context.getHandler()
    );
    if (!permission) return true;

    const request = context.switchToHttp().getRequest();
    const user = request.user;

    // Check cache first (30 min TTL) — cache เก็บ serialized CASL rules
    const cacheKey = `user:${user.user_id}:permissions`;
    const cachedRules = await this.redis.get(cacheKey);

    let ability: AppAbility;
    if (cachedRules) {
      // Rebuild the ability object from cached rules
      ability = new PureAbility(JSON.parse(cachedRules));
    } else {
      ability = await this.abilityFactory.createForUser(user);
      await this.redis.set(cacheKey, JSON.stringify(ability.rules), 'EX', 1800);
    }

    // permission format: '<subject>.<action>' (เช่น 'correspondence.create')
    const [subject, action] = permission.split('.');
    const resource = request.params || request.body;

    return ability.can(action, subject, resource);
  }
}
```
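Guard และ Factory ต้องแยก permission string ด้วย convention เดียวกันคือ `<subject>.<action>` ตัวอย่าง helper สมมติ (ไม่ได้อยู่ในโค้ดจริง) ที่ทำให้ลำดับนี้ชัดเจน:

```typescript
// Hypothetical helper: แยก '<subject>.<action>' เช่น 'correspondence.create'
function parsePermission(permission: string): { subject: string; action: string } {
  const [subject, action] = permission.split('.');
  if (!subject || !action) {
    throw new Error(`Invalid permission format: ${permission}`);
  }
  return { subject, action };
}
```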
### Usage Example
```typescript
@Controller('correspondences')
@UseGuards(JwtAuthGuard, PermissionGuard)
export class CorrespondenceController {
  @Post()
  @RequirePermission('correspondence.create')
  async create(@Body() dto: CreateCorrespondenceDto) {
    // Only users with create permission can access
  }

  @Get(':id')
  @RequirePermission('correspondence.view')
  async findOne(@Param('id') id: string) {
    // Check if user has view permission for this project
  }
}
```
---
## Permission Checking Flow
```mermaid
sequenceDiagram
    participant Client
    participant Guard as Permission Guard
    participant Redis as Redis Cache
    participant Factory as Ability Factory
    participant DB as Database

    Client->>Guard: Request with JWT
    Guard->>Redis: Get user permissions (cache)

    alt Cache Hit
        Redis-->>Guard: Cached permissions
    else Cache Miss
        Guard->>Factory: createForUser(user)
        Factory->>DB: Get user_assignments
        Factory->>DB: Get role_permissions
        Factory->>Factory: Build CASL ability
        Factory-->>Guard: Ability object
        Guard->>Redis: Cache permissions (TTL: 30min)
    end

    Guard->>Guard: Check ability.can(action, subject, context)
    alt Permission Granted
        Guard-->>Client: Allow access
    else Permission Denied
        Guard-->>Client: 403 Forbidden
    end
```
---
## 4-Level Scope Hierarchy
```
Global (ทั้งระบบ)
├─ Organization 1 (ระดับองค์กร)
│  ├─ Project A (ระดับโครงการ)
│  │  └─ Contract A (ระดับสัญญา)
│  └─ Project B
│     └─ Contract B
└─ Organization 2
   └─ Project C
```
### Example Assignments
```typescript
// User A: Superadmin (Global)
{
  user_id: 1,
  role_id: 1, // Superadmin
  organization_id: null,
  project_id: null,
  contract_id: null
}
// Can access EVERYTHING

// User B: Document Control in TEAM Organization
{
  user_id: 2,
  role_id: 3, // Document Control
  organization_id: 3, // TEAM
  project_id: null,
  contract_id: null
}
// Can manage documents in TEAM organization (all projects)

// User C: Project Manager for LCBP3
{
  user_id: 3,
  role_id: 6, // Project Manager
  organization_id: null,
  project_id: 1, // LCBP3
  contract_id: null
}
// Can manage only LCBP3 project (all contracts within)

// User D: Contract Admin for Contract-1
{
  user_id: 4,
  role_id: 7, // Contract Admin
  organization_id: null,
  project_id: null,
  contract_id: 5 // Contract-1
}
// Can manage only Contract-1
```
---
## Consequences
### Positive
1. ✅ **Fine-grained Control:** แยกสิทธิ์ได้ละเอียดมาก
2. ✅ **Flexible:** User มีหลาย Roles ใน Scopes ต่างกันได้
3. ✅ **Inheritance:** Global → Org → Project → Contract
4. ✅ **Performant:** Redis cache ทำให้เร็ว (< 10ms)
5. ✅ **Auditable:** ทุก Assignment บันทึกใน DB
### Negative
1. ❌ **Complexity:** ซับซ้อนในการ Setup และ Maintain
2. ❌ **Cache Invalidation:** ต้อง Invalidate ถูกต้องเมื่อเปลี่ยน Roles
3. ❌ **Learning Curve:** Admin ต้องเข้าใจ Scope hierarchy
4. ❌ **Testing:** ต้อง Test ทุก Combination
### Mitigation Strategies
- **Complexity:** สร้าง Admin UI ที่ใช้งานง่าย
- **Cache:** Auto-invalidate เมื่อมีการเปลี่ยนแปลง
- **Documentation:** เขียน Guide ชัดเจน
- **Testing:** Integration tests ครอบคลุม Permissions
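ตัวอย่าง sketch ของข้อ "Cache: Auto-invalidate" (interface และชื่อ function เป็นค่าสมมติ): เมื่อ role assignment ของ user เปลี่ยน ให้ลบ cache key เพื่อบังคับ rebuild จาก DB ใน request ถัดไป

```typescript
// Sketch: ลบ cached ability ของ user เมื่อ assignment เปลี่ยน
interface CacheLike {
  del(key: string): Promise<number>;
}

async function invalidateUserPermissions(cache: CacheLike, userId: number): Promise<void> {
  // key รูปแบบเดียวกับที่ PermissionGuard ใช้ cache
  await cache.del(`user:${userId}:permissions`);
}
```

ควรเรียก helper ลักษณะนี้ทุกจุดที่มีการ assign/revoke role หรือแก้ role_permissions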
---
## Compliance
เป็นไปตาม:
- [Requirements Section 4](../01-requirements/01-04-access-control.md) - Access Control
- [Backend Plan Section 2 RBAC](../../docs/2_Backend_Plan_V1_4_5.md#rbac)
---
## Related ADRs
- [ADR-005: Redis Usage Strategy](./ADR-005-redis-usage-strategy.md) - Permission caching
- [ADR-001: Unified Workflow Engine](./ADR-001-unified-workflow-engine.md) - Workflow permission guards
---
## References
- [CASL Documentation](https://casl.js.org/v6/en/guide/intro)
- [RBAC Best Practices](https://csrc.nist.gov/publications/detail/sp/800-162/final)

---
# ADR-007: API Design & Error Handling Strategy
**Status:** ✅ Accepted
**Date:** 2025-12-01
**Decision Makers:** Backend Team, System Architect
**Related Documents:** [Backend Guidelines](../03-implementation/03-02-backend-guidelines.md), [ADR-005: Technology Stack](./ADR-005-technology-stack.md)
---
## Context and Problem Statement
ระบบ LCBP3-DMS ต้องการมาตรฐานการออกแบบ API ที่ชัดเจนและสม่ำเสมอทั้งระบบ รวมถึงกลยุทธ์การจัดการ Error และ Validation ที่เหมาะสม
### ปัญหาที่ต้องแก้:
1. **API Consistency:** ทำอย่างไรให้ API response format สม่ำเสมอทั้งระบบ
2. **Error Handling:** จัดการ error อย่างไรให้ client เข้าใจและแก้ไขได้
3. **Validation:** Validate request อย่างไรให้ครอบคลุมและให้ feedback ที่ดี
4. **Status Codes:** ใช้ HTTP status codes อย่างไรให้ถูกต้องและสม่ำเสมอ
---
## Decision Drivers
- 🎯 **Developer Experience:** Frontend developers ต้องใช้ API ได้ง่าย
- 🔒 **Security:** ป้องกัน Information Leakage จาก Error messages
- 📊 **Debuggability:** ต้องหา Root cause ของ Error ได้ง่าย
- 🌍 **Internationalization:** รองรับภาษาไทยและอังกฤษ
- 📝 **Standards Compliance:** ใช้มาตรฐานที่เป็นที่ยอมรับ (REST, JSON:API)
---
## Considered Options
### Option 1: Standard REST with Custom Error Format
**รูปแบบ:**
```typescript
// Success
{
  "data": { ... },
  "meta": { "timestamp": "..." }
}

// Error
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Validation failed",
    "details": [...]
  }
}
```
**Pros:**
- ✅ Simple และเข้าใจง่าย
- ✅ Flexible สำหรับ Custom needs
- ✅ ไม่ต้อง Follow spec ที่ซับซ้อน
**Cons:**
- ❌ ไม่มี Standard specification
- ❌ ต้องสื่อสารภายในทีมให้ชัดเจน
- ❌ อาจไม่สม่ำเสมอหากไม่ระวัง
### Option 2: JSON:API Specification
**รูปแบบ:**
```typescript
{
  "data": {
    "type": "correspondences",
    "id": "1",
    "attributes": { ... },
    "relationships": { ... }
  },
  "included": [...]
}
```
**Pros:**
- ✅ มาตรฐานที่เป็นที่ยอมรับ
- ✅ มี Libraries ช่วย
- ✅ รองรับ Relationships ได้ดี
**Cons:**
- ❌ ซับซ้อนเกินความจำเป็น
- ❌ Verbose (ข้อมูลซ้ำซ้อน)
- ❌ Learning curve สูง
### Option 3: GraphQL
**Pros:**
- ✅ Client เลือกข้อมูลที่ต้องการได้
- ✅ ลด Over-fetching/Under-fetching
- ✅ Strong typing
**Cons:**
- ❌ Complexity สูง
- ❌ Caching ยาก
- ❌ ไม่เหมาะกับ Document-heavy system
- ❌ Team ยังไม่มีประสบการณ์
---
## Decision Outcome
**Chosen Option:** **Option 1 - Standard REST with Custom Error Format + NestJS Exception Filters**
### Rationale
1. **Simplicity:** ทีมคุ้นเคยกับ REST API และ NestJS มี Built-in support ที่ดี
2. **Flexibility:** สามารถปรับแต่งตาม Business needs ได้ง่าย
3. **Performance:** Lightweight กว่า JSON:API และ GraphQL
4. **Team Capability:** ทีมมีประสบการณ์ REST มากกว่า GraphQL
---
## Implementation Details
### 1. Success Response Format
```typescript
// Single resource
{
  "data": {
    "id": 1,
    "document_number": "CORR-2024-0001",
    "subject": "...",
    ...
  },
  "meta": {
    "timestamp": "2024-01-01T00:00:00Z",
    "version": "1.0"
  }
}

// Collection with pagination
{
  "data": [
    { "id": 1, ... },
    { "id": 2, ... }
  ],
  "meta": {
    "pagination": {
      "page": 1,
      "limit": 20,
      "total": 100,
      "totalPages": 5
    },
    "timestamp": "2024-01-01T00:00:00Z"
  }
}
```
### 2. Error Response Format
```typescript
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Validation failed on input data",
    "statusCode": 400,
    "timestamp": "2024-01-01T00:00:00Z",
    "path": "/api/correspondences",
    "details": [
      {
        "field": "subject",
        "message": "Subject is required",
        "value": null
      }
    ]
  }
}
```
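รูปแบบ response ทั้ง success และ error ข้างต้นสามารถประกอบจาก helper กลางเพื่อบังคับความสม่ำเสมอได้ ตัวอย่างด้านล่างเป็น sketch (ชื่อ function เป็นค่าสมมติ):

```typescript
// Hypothetical helpers: ประกอบ envelope ให้ตรงกับ format ข้างต้น
function successEnvelope<T>(data: T) {
  return {
    data,
    meta: { timestamp: new Date().toISOString(), version: '1.0' },
  };
}

function errorEnvelope(code: string, message: string, statusCode: number, path: string) {
  return {
    error: { code, message, statusCode, timestamp: new Date().toISOString(), path },
  };
}
```

ในโค้ดจริงฝั่ง NestJS แนว success envelope มักทำผ่าน Interceptor ส่วน error envelope ทำผ่าน Exception Filter (ดูหัวข้อถัดไป)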
### 3. HTTP Status Codes
| Status | Use Case |
| ------------------------- | ------------------------------------------- |
| 200 OK | Successful GET, PUT, PATCH |
| 201 Created | Successful POST |
| 204 No Content | Successful DELETE |
| 400 Bad Request | Validation error, Invalid input |
| 401 Unauthorized | Missing or invalid JWT token |
| 403 Forbidden | Insufficient permissions (RBAC) |
| 404 Not Found | Resource not found |
| 409 Conflict | Duplicate resource, Business rule violation |
| 422 Unprocessable Entity | Business logic error |
| 429 Too Many Requests | Rate limit exceeded |
| 500 Internal Server Error | Unexpected server error |
### 4. Global Exception Filter
```typescript
// File: backend/src/common/filters/global-exception.filter.ts
import {
  ExceptionFilter,
  Catch,
  ArgumentsHost,
  HttpException,
  HttpStatus,
} from '@nestjs/common';
import { Request, Response } from 'express';

@Catch()
export class GlobalExceptionFilter implements ExceptionFilter {
  catch(exception: unknown, host: ArgumentsHost) {
    const ctx = host.switchToHttp();
    const response = ctx.getResponse<Response>();
    const request = ctx.getRequest<Request>();
let status = HttpStatus.INTERNAL_SERVER_ERROR;
let code = 'INTERNAL_SERVER_ERROR';
let message = 'An unexpected error occurred';
let details = null;
if (exception instanceof HttpException) {
status = exception.getStatus();
const exceptionResponse = exception.getResponse();
if (typeof exceptionResponse === 'object') {
code = (exceptionResponse as any).error || exception.name;
message = (exceptionResponse as any).message || exception.message;
details = (exceptionResponse as any).details;
} else {
message = exceptionResponse;
}
}
// Log error (but don't expose internal details to client)
console.error('Exception:', exception);
response.status(status).json({
error: {
code,
message,
statusCode: status,
timestamp: new Date().toISOString(),
path: request.url,
...(details && { details }),
},
});
}
}
```
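The filter only takes effect once it is registered. A typical global registration, assuming the standard NestJS bootstrap in `main.ts`, might look like:

```typescript
// File: backend/src/main.ts (sketch — assumes the usual NestJS bootstrap)
import { NestFactory } from '@nestjs/core';
import { AppModule } from './app.module';
import { GlobalExceptionFilter } from './common/filters/global-exception.filter';

async function bootstrap() {
  const app = await NestFactory.create(AppModule);
  // Register globally so every unhandled exception is normalized
  // into the error envelope defined in this ADR.
  app.useGlobalFilters(new GlobalExceptionFilter());
  await app.listen(3000);
}
bootstrap();
```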
### 5. Custom Business Exception
```typescript
// File: backend/src/common/exceptions/business.exception.ts
import { HttpException, HttpStatus } from '@nestjs/common';

export class BusinessException extends HttpException {
constructor(message: string, code: string = 'BUSINESS_ERROR') {
super(
{
error: code,
message,
},
HttpStatus.UNPROCESSABLE_ENTITY
);
}
}
// Usage
throw new BusinessException(
'Cannot approve correspondence in current status',
'INVALID_WORKFLOW_TRANSITION'
);
```
### 6. Validation Pipe Configuration
```typescript
// File: backend/src/main.ts
import { HttpException, HttpStatus, ValidationPipe } from '@nestjs/common';

app.useGlobalPipes(
new ValidationPipe({
whitelist: true, // Strip properties not in DTO
forbidNonWhitelisted: true, // Throw error if unknown properties
transform: true, // Auto-transform payloads to DTO instances
transformOptions: {
enableImplicitConversion: true,
},
exceptionFactory: (errors) => {
const details = errors.map((error) => ({
field: error.property,
message: Object.values(error.constraints || {}).join(', '),
value: error.value,
}));
return new HttpException(
{
error: 'VALIDATION_ERROR',
message: 'Validation failed',
details,
},
HttpStatus.BAD_REQUEST
);
},
})
);
```
---
## Consequences
### Positive Consequences
1. **Consistency:** API responses follow a uniform format across the entire system
2. **Developer Friendly:** Frontend developers can consume the API easily
3. **Debuggability:** Error messages carry enough information for debugging
4. **Security:** Internal error details are never exposed to the client
5. **Maintainability:** Built on NestJS built-in features, so the code is easy to maintain
### Negative Consequences
1. **No Standard Spec:** Not a published standard like JSON:API, so the format must be documented clearly
2. **Manual Documentation:** The API response format has to be documented by the team itself
3. **Learning Curve:** New team members must learn the error code conventions
### Mitigation Strategies
- **Documentation:** Use Swagger/OpenAPI to auto-generate API docs
- **Code Generation:** Generate TypeScript interfaces for the frontend from the DTOs
- **Error Code Registry:** Maintain a centralized list of error codes with descriptions
- **Testing:** Write integration tests to validate response formats
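The Error Code Registry strategy can be sketched as a single `as const` map that exception classes, docs, and tests all draw from. All names here are hypothetical:

```typescript
// Hypothetical centralized error-code registry: one source of truth
// for codes, default messages, and HTTP status codes.
const ERROR_CODES = {
  VALIDATION_ERROR: { statusCode: 400, message: 'Validation failed' },
  INVALID_WORKFLOW_TRANSITION: {
    statusCode: 422,
    message: 'Cannot perform this workflow transition',
  },
  BUSINESS_ERROR: { statusCode: 422, message: 'Business rule violated' },
} as const;

// The union type keeps callers from passing an unregistered code.
type ErrorCode = keyof typeof ERROR_CODES;

function describeError(code: ErrorCode) {
  return { code, ...ERROR_CODES[code] };
}
```

With this in place, `BusinessException` could accept only `ErrorCode` values, and the registry can be dumped into the API documentation automatically.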
---
## Related ADRs
- [ADR-005: Technology Stack](./ADR-005-technology-stack.md) - Selection of NestJS
- [ADR-004: RBAC Implementation](./ADR-004-rbac-implementation.md) - 403 Forbidden errors
---
## References
- [NestJS Exception Filters](https://docs.nestjs.com/exception-filters)
- [HTTP Status Codes](https://httpstatuses.com/)
- [REST API Best Practices](https://restfulapi.net/)
---
**Last Updated:** 2025-12-01
**Next Review:** 2026-06-01
