Main: revise specs to 1.5.0 (completed)

2025-12-01 01:28:32 +07:00
parent 241022ada6
commit 71c091055a
69 changed files with 28252 additions and 74 deletions
--- a/specs/05-decisions/ADR-006-redis-caching-strategy.md
+++ b/specs/05-decisions/ADR-006-redis-caching-strategy.md
@@ -0,0 +1,438 @@
+# ADR-006: Redis Usage and Caching Strategy
+
+**Status:** Accepted
+**Date:** 2025-11-30
+**Decision Makers:** Development Team, System Architect
+**Related Documents:**
+
+- [System Architecture](../02-architecture/system-architecture.md)
+- [Performance Requirements](../01-requirements/06-non-functional.md)
+
+---
+
+## Context and Problem Statement
+
+LCBP3-DMS ต้องการ High Performance ในการ:
+
+- Check Permissions (ทุก Request)
+- Document Numbering (Concurrent Safe)
+- Master Data Access (ถูกเรียกบ่อยมาก)
+- Session Management
+- Background Job Queue
+
+**Challenges:**
+
+- Database queries ช้า (แม้มี indexing)
+- Concurrent access ต้องมี Locking mechanism
+- Permission checking ต้องเร็ว (< 10ms)
+- Master data แทบไม่เปลี่ยน แต่ถูก query บ่อย
+
+---
+
+## Decision Drivers
+
+- **Performance:** Response time < 200ms (p90)
+- **Scalability:** รองรับ 100+ concurrent users
+- **Consistency:** Data consistency with database
+- **Reliability:** Cache must not cause data loss
+- **Cost-Effectiveness:** ใช้ Resource น้อยที่สุด
+
+---
+
+## Considered Options
+
+### Option 1: No Caching (Database Only)
+
+**Pros:**
+
+- ✅ Simple, no cache invalidation
+- ✅ Always consistent
+
+**Cons:**
+
+- ❌ Slow permission checks (JOIN tables)
+- ❌ High DB load
+- ❌ No distributed locking
+
+### Option 2: Application-Level In-Memory Cache
+
+**Pros:**
+
+- ✅ Very fast (local memory)
+- ✅ No external dependency
+
+**Cons:**
+
+- ❌ Not shared across instances
+- ❌ No distributed locking
+- ❌ Cache invalidation issues
+
+### Option 3: **Redis as Distributed Cache + Lock** ⭐ (Selected)
+
+**Pros:**
+
+- ✅ **Fast:** In-memory, < 1ms access
+- ✅ **Distributed:** Shared across instances
+- ✅ **Locking:** Redis locks for concurrency
+- ✅ **Pub/Sub:** Cache invalidation broadcasting
+- ✅ **Queue:** BullMQ for background jobs
+
+**Cons:**
+
+- ❌ External dependency
+- ❌ Requires Redis cluster for HA
+
+---
+
+## Decision Outcome
+
+**Chosen Option:** Redis as Distributed Cache + Lock Provider
+
+---
+
+## Redis Usage Patterns
+
+### 1. Distributed Locking (Redlock)
+
+**Use Cases:**
+
+- Document Number Generation
+- Critical Sections
+
+**Implementation:**
+
+```typescript
+const lock = await redlock.acquire([lockKey], 3000); // 3sec TTL
+try {
+  // Critical section
+} finally {
+  await lock.release();
+}
+```
+
+**Configuration:**
+
+- TTL: 2-5 seconds
+- Retry: Exponential backoff, max 3 retries
+
+---
+
+### 2. Permission Caching
+
+**Cache Structure:**
+
+```typescript
+// Key: user:{user_id}:permissions
+// Value: JSON array of CASL rules
+// TTL: 30 minutes
+await redis.set(
+  `user:${userId}:permissions`,
+  JSON.stringify(abilityRules),
+  'EX',
+  1800
+);
+```
+
+**Invalidation Strategy:**
+
+- Role changed → Invalidate all users with that role
+- User assignment changed → Invalidate that user
+- Permission modified → Invalidate all affected roles
+
+---
+
+### 3. Master Data Caching
+
+**Cached Data:**
+
+- Organizations (TTL: 1 hour)
+- Projects (TTL: 1 hour)
+- Correspondence Types (TTL: 24 hours)
+- RFA Status Codes (TTL: 24 hours)
+- Roles & Permissions (TTL: 30 minutes)
+
+**Cache Pattern:**
+
+```typescript
+async getOrganizations(): Promise<Organization[]> {
+  const cacheKey = 'master:organizations';
+  let cached = await redis.get(cacheKey);
+
+  if (!cached) {
+    const organizations = await this.orgRepo.find({ where: { is_active: true } });
+    await redis.set(cacheKey, JSON.stringify(organizations), 'EX', 3600);
+    return organizations;
+  }
+
+  return JSON.parse(cached);
+}
+```
+
+**Invalidation:**
+
+- On CREATE/UPDATE/DELETE → Invalidate immediately
+- Publish event to Redis Pub/Sub for multi-instance sync
+
+---
+
+### 4. Session Management
+
+**Structure:**
+
+```typescript
+// Key: session:{session_id}
+// Value: User session data
+// TTL: 8 hours
+interface SessionData {
+  user_id: number;
+  username: string;
+  organization_id: number;
+  last_activity: Date;
+}
+```
+
+**Refresh Strategy:**
+
+- Update `last_activity` on every request
+- Extend TTL if activity within last 1 hour
+
+---
+
+### 5. Rate Limiting
+
+**Implementation:**
+
+```typescript
+const key = `rate_limit:${userId}:${endpoint}`;
+const current = await redis.incr(key);
+if (current === 1) {
+  await redis.expire(key, 3600); // 1 hour window
+}
+if (current > limit) {
+  throw new TooManyRequestsException();
+}
+```
+
+**Limits:**
+
+- File Upload: 50 req/hour per user
+- Search: 500 req/hour per user
+- Anonymous: 100 req/hour per IP
+
+---
+
+### 6. Background Job Queue (BullMQ)
+
+**Queues:**
+
+1. **Email Queue:** Send email notifications
+2. **Notification Queue:** LINE Notify
+3. **Indexing Queue:** Elasticsearch indexing
+4. **Cleanup Queue:** Delete temp files
+5. **Report Queue:** Generate PDF reports
+
+**Configuration:**
+
+```typescript
+const emailQueue = new Queue('email', {
+  connection: redisConnection,
+  defaultJobOptions: {
+    attempts: 3,
+    backoff: {
+      type: 'exponential',
+      delay: 2000,
+    },
+    removeOnComplete: 100, // Keep last 100
+    removeOnFail: 500,
+  },
+});
+```
+
+---
+
+## Cache Invalidation Strategy
+
+### 1. Time-Based Expiration (TTL)
+
+| Data Type      | TTL        | Rationale                     |
+| :------------- | :--------- | :---------------------------- |
+| Permissions    | 30 minutes | Balance freshness/performance |
+| Master Data    | 1 hour     | Rarely changes                |
+| Session        | 8 hours    | Match JWT expiration          |
+| Search Results | 15 minutes | Data changes frequently       |
+
+### 2. Event-Based Invalidation
+
+**Pattern:**
+
+```typescript
+@Injectable()
+export class CacheInvalidationService {
+  async invalidateUserPermissions(userId: number): Promise<void> {
+    await this.redis.del(`user:${userId}:permissions`);
+
+    // Broadcast to other instances
+    await this.redis.publish(
+      'cache:invalidate',
+      JSON.stringify({
+        pattern: 'user:permissions',
+        userId,
+      })
+    );
+  }
+
+  async invalidateMasterData(entity: string): Promise<void> {
+    await this.redis.del(`master:${entity}`);
+    await this.redis.publish(
+      'cache:invalidate',
+      JSON.stringify({
+        pattern: 'master',
+        entity,
+      })
+    );
+  }
+}
+```
+
+### 3. Write-Through Cache
+
+**For Master Data:**
+
+```typescript
+async updateOrganization(id: number, dto: UpdateOrgDto): Promise<Organization> {
+  const org = await this.orgRepo.save({ id, ...dto });
+
+  // Invalidate cache immediately
+  await this.cache.invalidateMasterData('organizations');
+
+  return org;
+}
+```
+
+---
+
+## Redis Configuration
+
+### Production Setup
+
+```yaml
+# docker-compose.yml
+redis:
+  image: redis:7-alpine
+  command: >
+    redis-server
+    --appendonly yes
+    --appendfsync everysec
+    --maxmemory 2gb
+    --maxmemory-policy allkeys-lru
+  volumes:
+    - redis-data:/data
+  ports:
+    - '6379:6379'
+  healthcheck:
+    test: ['CMD', 'redis-cli', 'ping']
+    interval: 10s
+    timeout: 3s
+    retries: 3
+```
+
+**Key Settings:**
+
+- `appendonly yes`: AOF persistence
+- `appendfsync everysec`: Write every second (balance performance/durability)
+- `maxmemory 2gb`: Limit memory usage
+- `maxmemory-policy allkeys-lru`: Evict least recently used keys
+
+---
+
+### High Availability Considerations
+
+**Future Improvements:**
+
+1. **Redis Sentinel:** Auto-failover
+2. **Redis Cluster:** Horizontal scaling
+3. **Read Replicas:** Offload read queries
+
+**Current:** Single Redis instance (sufficient for MVP)
+
+---
+
+## Monitoring
+
+### Key Metrics
+
+```typescript
+@Injectable()
+export class RedisMonitoringService {
+  @Cron('*/5 * * * *') // Every 5 minutes
+  async captureMetrics(): Promise<void> {
+    const info = await this.redis.info();
+
+    // Parse and log metrics
+    metrics.record({
+      'redis.memory.used': parseMemoryUsed(info),
+      'redis.memory.peak': parseMemoryPeak(info),
+      'redis.keyspace.hits': parseHits(info),
+      'redis.keyspace.misses': parseMisses(info),
+      'redis.connections.active': parseConnections(info),
+    });
+  }
+}
+```
+
+**Alert Thresholds:**
+
+- Memory usage > 80%
+- Hit rate < 70%
+- Connections > 90% of max
+
+---
+
+## Consequences
+
+### Positive
+
+1. ✅ **Fast Permission Check:** < 5ms (vs 50ms from DB)
+2. ✅ **Reduced DB Load:** 70% reduction in queries
+3. ✅ **Distributed Locking:** No race conditions
+4. ✅ **Queue Management:** Background jobs reliable
+5. ✅ **Scalability:** รองรับ Multi-instance deployment
+
+### Negative
+
+1. ❌ **Dependency:** Redis ต้อง Available เสมอ
+2. ❌ **Memory Limit:** ต้อง Monitor และ Evict
+3. ❌ **Complexity:** Cache invalidation ซับซ้อน
+4. ❌ **Data Loss Risk:** ถ้า Redis crash (with AOF mitigates this)
+
+### Mit Strategies
+
+- **Dependency:** Health checks + Fallback to DB
+- **Memory:** Set max memory + LRU eviction policy
+- **Complexity:** Centralize invalidation logic
+- **Data Loss:** Enable AOF persistence
+
+---
+
+## Compliance
+
+เป็นไปตาม:
+
+- [System Architecture Section 3.5](../02-architecture/system-architecture.md#redis)
+- [Performance Requirements](../01-requirements/06-non-functional.md)
+
+---
+
+## Related ADRs
+
+- [ADR-002: Document Numbering Strategy](./ADR-002-document-numbering-strategy.md) - Redis locks
+- [ADR-004: RBAC Implementation](./ADR-004-rbac-implementation.md) - Permission caching
+
+---
+
+## References
+
+- [Redis Documentation](https://redis.io/docs/)
+- [Redlock Algorithm](https://redis.io/topics/distlock)
+- [BullMQ Documentation](https://docs.bullmq.io/)
+- [Cache Invalidation Strategies](https://martinfowler.com/bliki/TwoHardThings.html)