690503:0135 Update workflow #01
CI / CD Pipeline / build (push) Failing after 6m6s
CI / CD Pipeline / deploy (push) Has been skipped

2026-05-03 01:35:05 +07:00
parent d239b58387
commit 2c24991f88
85 changed files with 6335 additions and 100 deletions
@@ -2,6 +2,7 @@
**Status:** Accepted
**Date:** 2026-02-24
**Last Amended:** 2026-05-02
**Decision Makers:** Development Team, System Architect
**Related Documents:**
@@ -42,6 +43,34 @@ LCBP3-DMS ต้องจัดการเอกสารหลายประ
---
## Clarifications
### Session 2026-05-02 (Round 1 — ADR-001-add.md merge)
- Q: Event handling: Outbox Pattern or BullMQ (ADR-008)? → A: **BullMQ only**. WorkflowEngine enqueues the BullMQ job directly, with no outbox table; consistent with ADR-008
- Q: Concurrency control: optimistic lock vs. Redis Redlock vs. separate concerns? → A: **Separate concerns**. `version_no` optimistic lock for state transitions; Redis Redlock only for Document Numbering (ADR-002)
- Q: Context schema: where is it validated, and at what scope? → A: **Two-phase validation** (save-time + transition-time); schema scope is **per `workflow_definition` version**
- Q: Condition Engine library? → A: **`json-logic-js` in-process** in `WorkflowDslService`; fall back to a custom parser if production issues arise
- Q: Auto-action worker: extend the existing one or use a dedicated queue? → A: **Dedicated `workflow-events` BullMQ queue**, separate from `notification-queue`
### Session 2026-05-02 (Round 2 — ADR-001 full review)
- Q: DDL gap: add `version_no` + `context_schema` to the DDL? → A: **Yes**. `version_no INT NOT NULL DEFAULT 1` in `workflow_instances`; `context_schema JSON NULL` in `workflow_definitions`
- Q: ConflictException retry strategy? → A: **Return 409 to the frontend** via `BusinessException` (ADR-007); the frontend shows the toast "กรุณาลองใหม่" ("Please try again"), with no auto-retry
- Q: Redis cache TTL/invalidation strategy? → A: **TTL 1h + event invalidation** when an admin saves/activates a DSL; key `wf:def:{workflow_code}:{version}`
- Q: WorkflowEventsWorker concurrency/retry config? → A: **concurrency 5, retry 3 + exponential backoff + dead-letter queue**
- Q: RBAC for DSL authoring? → A: **Super Admin only** (`system.manage_all`): create/update/activate/deactivate workflow definitions
### Session 2026-05-02 (Round 3 — ADR-019 compliance + ops)
- Q: `action_by_user_id INT NULL` in `workflow_histories`: ADR-019 compliance? → A: **Keep the INT FK + `@Exclude()`** on the Entity; add `action_by_user_uuid VARCHAR(36) NULL` for API responses
- Q: `validateContext()` fails at transition-time: which HTTP status? → A: **422 Unprocessable Entity** via `ValidationException` (ADR-007 Validation tier) with field-level errors
- Q: Dead-letter queue `workflow-events-failed`: ops procedure? → A: **n8n webhook alert + Bull Board UI** for manual requeue
- Q: n8n webhook URL: where is it stored? → A: **`N8N_WEBHOOK_URL` environment variable** in `docker-compose.yml`; read through `ConfigService`
- Q: `context_schema.required`: is it actually enforced? → A: **Enforce strictly**. A missing required field throws a 422 `ValidationException`; this does not block the transition permanently (the caller fixes the context and retries)
---
## Decision Drivers
- **DRY Principle:** Don't Repeat Yourself; reduces duplicated code
@@ -206,8 +235,9 @@ CREATE TABLE workflow_definitions (
workflow_code VARCHAR(50) NOT NULL,
version INT NOT NULL DEFAULT 1,
description TEXT NULL,
dsl JSON NOT NULL, -- Raw DSL from user
compiled JSON NOT NULL, -- Validated and optimized for Runtime
context_schema JSON NULL, -- JSON Schema for context validation (two-phase)
is_active BOOLEAN DEFAULT TRUE,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
@@ -221,6 +251,7 @@ CREATE TABLE workflow_instances (
entity_type VARCHAR(50) NOT NULL, -- e.g. "correspondence", "rfa"
entity_id VARCHAR(50) NOT NULL,
current_state VARCHAR(50) NOT NULL,
version_no INT NOT NULL DEFAULT 1, -- Optimistic lock (@VersionColumn); prevents race conditions
status ENUM('ACTIVE', 'COMPLETED', 'CANCELLED', 'TERMINATED') DEFAULT 'ACTIVE',
context JSON NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
@@ -235,7 +266,8 @@ CREATE TABLE workflow_histories (
from_state VARCHAR(50) NOT NULL,
to_state VARCHAR(50) NOT NULL,
action VARCHAR(50) NOT NULL,
action_by_user_id INT NULL, -- Internal FK (@Exclude() in Entity); must not be exposed in the API
action_by_user_uuid VARCHAR(36) NULL, -- UUID for API responses (ADR-019)
comment TEXT NULL,
metadata JSON NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
@@ -250,6 +282,14 @@ CREATE TABLE workflow_histories (
"workflow": "CORRESPONDENCE_ROUTING",
"version": 1,
"description": "Standard correspondence routing",
"context_schema": {
"type": "object",
"properties": {
"requiresLegal": { "type": "number" },
"hasRecipient": { "type": "boolean" }
},
"required": []
},
"states": [
{
"name": "DRAFT",
@@ -261,7 +301,10 @@ CREATE TABLE workflow_histories (
"role": ["Admin"],
"user": "123"
},
"condition": {
"type": "json-logic",
"rule": { ">": [{ "var": "requiresLegal" }, 0] }
},
"events": [
{
"type": "notify",
@@ -299,6 +342,8 @@ CREATE TABLE workflow_histories (
}
```
> **⚠️ Note:** `condition` must use the JSON Logic format (`{ "type": "json-logic", "rule": {...} }`) only. JS string expressions (`"context.x === true"`) are forbidden because they are a security risk (code injection).
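A save-time check for this rule might look like the sketch below. `JsonLogicCondition`, `isJsonLogicCondition` and `assertConditionFormat` are hypothetical names introduced for illustration; the ADR does not say where this check lives or how `WorkflowDslService` implements it.

```typescript
// Hypothetical shape of a DSL condition, modelled on the example above.
interface JsonLogicCondition {
  type: 'json-logic';
  rule: Record<string, unknown>;
}

// Type guard: accepts only { type: 'json-logic', rule: { ... } }.
function isJsonLogicCondition(value: unknown): value is JsonLogicCondition {
  if (typeof value !== 'object' || value === null || Array.isArray(value)) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  const rule = candidate.rule;
  return (
    candidate.type === 'json-logic' &&
    typeof rule === 'object' &&
    rule !== null &&
    !Array.isArray(rule)
  );
}

// Rejects JS string expressions (and any other shape) before the DSL is stored.
function assertConditionFormat(condition: unknown): JsonLogicCondition {
  if (!isJsonLogicCondition(condition)) {
    throw new Error('condition must be { "type": "json-logic", "rule": {...} }');
  }
  return condition;
}
```

Because the guard only inspects the shape, it never evaluates the condition, so a string such as `"context.requiresLegal > 0"` is rejected without being executed.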
### NestJS Module Structure
```typescript
@@ -325,6 +370,11 @@ export class WorkflowEngineService {
order: { version: 'DESC' },
});
// Validate initial context against context_schema (save-time phase 1)
if (definition.compiled.contextSchema) {
this.dslService.validateContext(initialContext, definition.compiled.contextSchema);
}
// Initial state directly from compiled DSL
const initialState = definition.compiled.initialState;
@@ -333,6 +383,7 @@ export class WorkflowEngineService {
entityType,
entityId,
currentState: initialState,
versionNo: 1, // TypeORM @VersionColumn — optimistic lock
status: WorkflowStatus.ACTIVE,
context: initialContext,
});
@@ -345,19 +396,46 @@ export class WorkflowEngineService {
comment?: string,
payload: Record<string, unknown> = {}
) {
// Validate context values against schema (transition-time phase 2)
if (definition.compiled.contextSchema) {
this.dslService.validateContext(instance.context, definition.compiled.contextSchema);
}
// Evaluation via WorkflowDslService (uses json-logic-js in-process)
const evaluation = this.dslService.evaluate(compiled, instance.currentState, action, context);
// Optimistic lock: update state only if current_state + version_no match
// ❌ Do not use Redis Redlock in workflow transitions (Redlock is only for Document Numbering, ADR-002)
const updated = await this.instanceRepo
.createQueryBuilder()
.update(WorkflowInstance)
.set({
currentState: evaluation.nextState,
versionNo: () => 'version_no + 1',
})
.where('id = :id AND current_state = :state AND version_no = :ver', {
id: instance.id,
state: instance.currentState,
ver: instance.versionNo,
})
.execute();
if (updated.affected === 0) {
throw new ConflictException('Concurrent transition detected — please retry');
}
if (compiled.states[evaluation.nextState].terminal) {
instance.status = WorkflowStatus.COMPLETED;
}
// Dispatch events async via dedicated BullMQ queue 'workflow-events' (ADR-008)
// ❌ Never dispatch events synchronously in the request thread
if (evaluation.events && evaluation.events.length > 0) {
await this.workflowEventsQueue.add('dispatch', {
instanceId: instance.id,
events: evaluation.events,
context,
});
}
}
}
@@ -365,6 +443,80 @@ export class WorkflowEngineService {
---
## 🏭 Production Architecture
### Runtime Flow
```
[ API / Service Layer ]
        ↓
[ WorkflowEngineService ]
  - validate context (two-phase: save-time + transition-time)
  - evaluate condition (json-logic-js in-process, WorkflowDslService)
  - optimistic lock: UPDATE WHERE current_state = ? AND version_no = ?
  - write workflow_histories
  - enqueue BullMQ job → queue: 'workflow-events'
        ↓
[ DB (workflow_instances + workflow_histories) ]
        ↓ (async, dedicated queue)
[ WorkflowEventsWorker (BullMQ: 'workflow-events') ]
        ↓
┌───────────────┐
│      n8n      │ (webhook / notification dispatch)
└───────────────┘
```
### Production Rules (Non-Negotiable)
| # | Rule | Detail |
|---|------|--------|
| 1 | **Source of Truth** | Workflow state = DB only; never keep state in memory/cache |
| 2 | **Deterministic Execution** | Every transition MUST be declared in the DSL; no dynamic transitions |
| 3 | **No Inline Code Execution** | Conditions MUST use the JSON Logic format; no JS string eval |
| 4 | **Async Side Effects** | Every event MUST go through the BullMQ `workflow-events` queue; no sync dispatch |
| 5 | **Idempotency** | Transitions MUST be safe to retry; the optimistic lock prevents double-apply |
| 6 | **Instance Isolation** | In-progress instances keep their original `workflow_definition` version; no rebinding |
### Concurrency Control (separate concerns)
| Concern | Mechanism | Scope |
|---------|-----------|-------|
| Workflow state transition | `version_no` optimistic lock (TypeORM `@VersionColumn`) | `workflow_instances` table |
| Document Numbering | Redis Redlock (ADR-002) | Number generation only |
> ❌ **Do not use Redis Redlock in the workflow transition layer.** Redlock is only for Document Numbering.
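The guarded update behind the `version_no` lock can be illustrated without a database. The sketch below replaces the `workflow_instances` table with an in-memory `Map`; `compareAndSetState` and the state names `IN_REVIEW` and `REJECTED` are assumptions made for the example and do not come from the ADR.

```typescript
interface InstanceRow {
  id: number;
  currentState: string;
  versionNo: number;
}

// In-memory stand-in for the workflow_instances table (illustration only).
const table = new Map<number, InstanceRow>([
  [1, { id: 1, currentState: 'DRAFT', versionNo: 1 }],
]);

// Mimics: UPDATE ... SET current_state = ?, version_no = version_no + 1
//         WHERE id = ? AND current_state = ? AND version_no = ?
// Returns the number of affected rows (0 or 1).
function compareAndSetState(
  id: number,
  expectedState: string,
  expectedVersion: number,
  nextState: string,
): number {
  const row = table.get(id);
  if (!row || row.currentState !== expectedState || row.versionNo !== expectedVersion) {
    return 0;
  }
  row.currentState = nextState;
  row.versionNo += 1;
  return 1;
}

// Two requests read the same snapshot, then both attempt a transition.
const current = table.get(1);
if (!current) throw new Error('instance 1 not found');
const snapshot = { ...current };
const firstAffected = compareAndSetState(1, snapshot.currentState, snapshot.versionNo, 'IN_REVIEW');
const secondAffected = compareAndSetState(1, snapshot.currentState, snapshot.versionNo, 'REJECTED');
```

The first call succeeds and bumps `versionNo`; the second finds that its snapshot is stale and affects no rows, which is the case the engine turns into a `ConflictException`.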
### Condition Engine
- **Library:** `json-logic-js` (npm); evaluated in-process in `WorkflowDslService`
- **Fallback:** migrate to a custom parser if performance/complexity problems appear in production
- **Forbidden:** arbitrary JS string evaluation (`eval`, `new Function`, string conditions)
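To show why a data-only rule is safe to evaluate, here is a deliberately reduced stand-in evaluator. It is not `json-logic-js` and supports only `var`, `>`, `===`, `and` and `!`; the name `evaluateRule` is hypothetical, and unlike the real library it coerces `and` to a boolean.

```typescript
type RuleContext = Record<string, unknown>;

// Reduced evaluator: rules are plain data, so nothing is ever passed to eval().
function evaluateRule(rule: unknown, data: RuleContext): unknown {
  // Literals (numbers, strings, booleans, null) evaluate to themselves.
  if (typeof rule !== 'object' || rule === null) return rule;
  if (Array.isArray(rule)) return rule.map((item) => evaluateRule(item, data));

  const entries = Object.entries(rule as Record<string, unknown>);
  const entry = entries[0];
  if (entries.length !== 1 || entry === undefined) {
    throw new Error('a rule must have exactly one operator');
  }
  const [operator, rawArgs] = entry;

  // `var` reads a top-level key from the context.
  if (operator === 'var') return data[String(rawArgs)];

  const list: unknown[] = Array.isArray(rawArgs) ? rawArgs : [rawArgs];
  const args = list.map((arg) => evaluateRule(arg, data));
  switch (operator) {
    case '>':
      return Number(args[0]) > Number(args[1]);
    case '===':
      return args[0] === args[1];
    case 'and':
      return args.every(Boolean);
    case '!':
      return !args[0];
    default:
      // Unknown operators are rejected instead of being interpreted as code.
      throw new Error(`unsupported operator: ${operator}`);
  }
}

const legalRule = { '>': [{ var: 'requiresLegal' }, 0] };
const needsLegal = evaluateRule(legalRule, { requiresLegal: 2 });
```

With the real library the equivalent call is `jsonLogic.apply(rule, data)`; the stand-in exists only to keep the example dependency-free.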
### Context Schema Validation
- `context_schema` is stored per `workflow_definition` version (supports schema evolution)
- **Phase 1 (Save-time):** validate the schema structure when an admin saves the DSL
- **Phase 2 (Transition-time):** validate context values against the schema before evaluating the condition
- **Required field enforcement:** the `required` array in the schema is **enforced strictly**: a missing required field throws `ValidationException` (ADR-007) → HTTP 422 + field-level errors
- **Failure response:** `{ field: "<context_field>", message: "required field missing" }`. This does not block the transition permanently; the caller must fix the context and retry
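The transition-time phase can be sketched as a function that collects field-level errors. `collectContextErrors` and the narrow `ContextSchema` type are assumptions for this example; the ADR names only `validateContext()` and does not specify which JSON Schema features it supports.

```typescript
// Narrow subset of JSON Schema, enough for the context_schema example in the DSL.
interface ContextSchema {
  type: 'object';
  properties: Record<string, { type: 'number' | 'boolean' | 'string' }>;
  required: string[];
}

interface FieldError {
  field: string;
  message: string;
}

// Checks required fields and primitive types; returns one entry per problem.
function collectContextErrors(
  context: Record<string, unknown>,
  schema: ContextSchema,
): FieldError[] {
  const errors: FieldError[] = [];
  for (const field of schema.required) {
    if (context[field] === undefined) {
      errors.push({ field, message: 'required field missing' });
    }
  }
  for (const [field, spec] of Object.entries(schema.properties)) {
    const value = context[field];
    if (value !== undefined && typeof value !== spec.type) {
      errors.push({ field, message: `expected ${spec.type}, got ${typeof value}` });
    }
  }
  return errors;
}

// Same properties as the DSL example, with one field marked required for the demo.
const demoSchema: ContextSchema = {
  type: 'object',
  properties: {
    requiresLegal: { type: 'number' },
    hasRecipient: { type: 'boolean' },
  },
  required: ['requiresLegal'],
};
const missingFieldErrors = collectContextErrors({ hasRecipient: true }, demoSchema);
```

A caller would presumably wrap a non-empty result in `ValidationException` so that the client receives the 422 with the error list; that wiring is not shown here.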
### Event Queue
- Queue name: `workflow-events` (dedicated BullMQ queue, separate from `notification-queue`)
- Worker: `WorkflowEventsWorker`, configured with:
  - **concurrency:** 5
  - **attempts:** 3 (exponential backoff)
  - **dead-letter queue:** `workflow-events-failed` once attempts are exhausted
- **n8n webhook URL:** `N8N_WEBHOOK_URL` env var (in `docker-compose.yml`), read through `ConfigService`; never hardcode it
- **Dead-letter ops:**
  - When a job lands in `workflow-events-failed` → trigger the n8n webhook to alert the ops team
  - Manual requeue through the **Bull Board UI** (admin panel)
  - ❌ No auto-requeue: prevents a retry loop when the failure is a permanent bug
- ❌ No Outbox Pattern (polling a DB table): BullMQ already provides retry/dead-letter/persistence
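The retry policy above can be expressed as a small routing function, independent of BullMQ. This is a sketch: `RetryPolicy`, `routeFailedJob` and the 1000 ms base delay are assumptions, since the ADR fixes only the attempt count and the dead-letter queue name.

```typescript
interface RetryPolicy {
  attempts: number; // total tries allowed; the ADR specifies 3
  baseDelayMs: number; // assumption: the ADR does not state a base delay
}

type FailureRoute =
  | { target: 'retry'; delayMs: number }
  | { target: 'workflow-events-failed' };

// Doubling schedule: base, 2 x base, 4 x base, ... keyed by attempts already made.
function backoffDelayMs(attemptsMade: number, policy: RetryPolicy): number {
  return policy.baseDelayMs * Math.pow(2, attemptsMade - 1);
}

// Decides what happens after a job has failed `attemptsMade` times.
function routeFailedJob(attemptsMade: number, policy: RetryPolicy): FailureRoute {
  if (attemptsMade >= policy.attempts) {
    // Attempts exhausted: park the job for manual requeue, never auto-requeue.
    return { target: 'workflow-events-failed' };
  }
  return { target: 'retry', delayMs: backoffDelayMs(attemptsMade, policy) };
}

const workflowEventsPolicy: RetryPolicy = { attempts: 3, baseDelayMs: 1000 };
const afterFirstFailure = routeFailedJob(1, workflowEventsPolicy);
const afterThirdFailure = routeFailedJob(3, workflowEventsPolicy);
```

Under these assumed values a job is retried after 1 s and 2 s, and the third failure sends it to `workflow-events-failed`, where the n8n alert and Bull Board requeue take over.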
---
## Consequences
### Positive
@@ -388,7 +540,8 @@ export class WorkflowEngineService {
- **Complexity:** build a UI Builder for Workflow Design in the future
- **Learning Curve:** write clear Documentation and Examples
- **Performance:** Redis Cache for `workflow_definitions`; key: `wf:def:{workflow_code}:{version}`, TTL: 1h, invalidated immediately when an admin saves/activates a new DSL
- **Concurrency Conflict:** `ConflictException` is surfaced as `BusinessException` (ADR-007) → 409 to the frontend; the user retries manually, with no auto-retry
- **Debugging:** build a Workflow Visualization Tool
- **Testing:** write Comprehensive Unit Tests for the Engine
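The caching rule in the Performance item (fixed key format, one-hour TTL, explicit invalidation) can be sketched with an in-memory map standing in for Redis. `DefinitionCache` and `definitionCacheKey` are hypothetical names, and the injectable clock exists only to make expiry testable.

```typescript
const ONE_HOUR_MS = 60 * 60 * 1000;

// Builds the key format given in the ADR: wf:def:{workflow_code}:{version}
function definitionCacheKey(workflowCode: string, version: number): string {
  return `wf:def:${workflowCode}:${version}`;
}

// In-memory stand-in for the Redis cache (illustration only).
class DefinitionCache<T> {
  private readonly entries = new Map<string, { value: T; expiresAt: number }>();
  private readonly now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  set(key: string, value: T, ttlMs: number = ONE_HOUR_MS): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }

  // Returns undefined for unknown or expired keys; expired entries are dropped.
  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  // Called when an admin saves or activates a DSL.
  invalidate(key: string): void {
    this.entries.delete(key);
  }
}

const routingKey = definitionCacheKey('CORRESPONDENCE_ROUTING', 1);
```

Because the version is part of the key, a newly activated version gets its own entry, and invalidation only needs to target the key that changed.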
@@ -400,6 +553,9 @@ export class WorkflowEngineService {
- [Backend Guidelines](../05-Engineering-Guidelines/05-02-backend-guidelines.md#workflow-engine-integration) - Unified Workflow Engine
- [Unified Workflow Requirements](../01-Requirements/01-03-modules/01-03-06-unified-workflow.md) - Unified Workflow Specification
- [ADR-007 Error Handling](./ADR-007-error-handling-strategy.md) - `BusinessException` + 409 conflict response pattern
- [ADR-008 Notifications](./ADR-008-email-notification-strategy.md) - BullMQ `workflow-events` queue pattern
- [ADR-016 Security](./ADR-016-security-authentication.md) - `system.manage_all` required for DSL authoring
---
@@ -409,6 +565,8 @@ export class WorkflowEngineService {
- The Admin UI for managing Workflows will be developed in Phase 2
- A Migration Tool is needed for Workflow Definition Changes
- Consider BPMN 2.0 Notation in the future (if a Visual Workflow Designer is needed)
- **Required env vars:** `N8N_WEBHOOK_URL` must be set in `docker-compose.yml` for every environment before deploy
- **Bull Board UI:** install `@bull-board/nestjs` for visibility into the `workflow-events` and `workflow-events-failed` queues
---
@@ -429,12 +587,16 @@ export class WorkflowEngineService {
| Version | Date | Changes | Status |
|---------|------|---------|--------|
| 1.0 | 2026-02-24 | Initial version - DSL-based Unified Workflow Engine | ✅ Active |
| 1.1 | 2026-05-02 | Production hardening: JSON Logic condition engine, optimistic lock concurrency, BullMQ dedicated queue, context schema two-phase validation, async-only auto-action rule | ✅ Active |
---
## Related ADRs
- [ADR-002: Document Numbering Strategy](./ADR-002-document-numbering-strategy.md) - the Workflow Engine triggers Document Number Generation; Redis Redlock is only for numbering
- [ADR-007: Error Handling Strategy](./ADR-007-error-handling-strategy.md) - `ConflictException` → `BusinessException` → 409 pattern
- [ADR-008: Email/Notification Strategy](./ADR-008-email-notification-strategy.md) - BullMQ `workflow-events` dedicated queue
- [ADR-016: Security & Authentication](./ADR-016-security-authentication.md) - `system.manage_all` RBAC guard for DSL authoring
- [RBAC Matrix](../01-Requirements/01-02-business-rules/01-02-01-rbac-matrix.md) - Permission Guards in Workflow Transitions
---