np-dms/lcbp3

Fork 0

Files

T

admin 71c5e88181

CI / CD Pipeline / build (push) Has been skipped

Details

CI / CD Pipeline / deploy (push) Has been skipped

Details

690611:1705 ADR-035-235 #00 [skip CI]

2026-06-11 17:05:17 +07:00

29 KiB

Raw Blame History

LCBP3 / NAP-DMS Context

ระบบจัดการเอกสารงานก่อสร้าง (DMS) สำหรับโครงการ LCBP3 — เน้นการควบคุม Correspondence, RFA, Transmittal, Drawing พร้อมผู้ช่วย AI แบบ on-premises ที่ทำงานภายใต้ Workflow Engine กลางและขอบเขต AI ที่เข้มงวด (ADR-023A/ADR-033)

Agent/ tooling context: สำหรับ Hermes Agent, Telegram Bridge, และ DevOps tooling → ดู specs/06-Decision-Records/CONTEXT-ADR-031.md Typhoon OCR context: สำหรับ Typhoon OCR-3B และ typhoon2.1-gemma3-4b integration → ดู specs/06-Decision-Records/ADR-032-typhoon-ocr-integration.md

Language

Documents

Correspondence: ซองจดหมาย/เอกสารทุกประเภทที่หมุนเวียนในโครงการ เป็น parent ของ RFA / Transmittal / Memo Avoid: Letter, Communication, Document (generic)

RFA (Request For Approval): Correspondence ประเภทขออนุมัติ มี revision และอ้างอิง Drawing Revision ผ่าน rfa_items Avoid: Approval Request, Submit for Approval

Transmittal: Correspondence ที่ใช้ส่งมอบเอกสาร/แบบ ไม่ใช่จดหมายปะหน้า Avoid: Delivery Note, Cover Letter

Shop Drawing: แบบที่ผู้รับเหมาจัดทำเพื่อขออนุมัติก่อนก่อสร้าง Avoid: Construction Drawing

Contract Drawing: แบบต้นฉบับตามสัญญา ไม่ใช่ Shop Drawing Avoid: Design Drawing, Blueprint

Workflow

Workflow Engine: State machine กลาง DSL-based (ADR-001) — authority เดียวของการเปลี่ยน state ของทุก Correspondence Avoid: Approval Flow, Process Engine, RFA status flow (เป็นเพียง definition หนึ่ง)

Workflow Definition: Row ใน workflow_definitions ระบุ DSL ของ flow เช่น RFA_FLOW_V1, CORRESPONDENCE_FLOW_V1 Avoid: Approval logic, Hardcoded flow

Workflow Instance: Row ใน workflow_instances = สถานะปัจจุบันของเอกสารหนึ่งฉบับ — source of truth ของ state Avoid: Status, Stage (ใช้ภายใน DSL ได้แต่ห้ามแทน instance)

Workflow Transition: การเปลี่ยน state ที่บันทึกใน workflow_histories พร้อม actor_user_id (มนุษย์เท่านั้น) Avoid: Auto-execute, AI-driven approval

Intent Classification

Intent Classifier: Service ที่แปลงคำถามธรรมชาติ (ไทย/อังกฤษปน) → Server-side Intent enum ใช้ Hybrid strategy: Pattern First → LLM Fallback (ADR-024) Avoid: NLU, NLP router, LangChain router

Server-side Intent: Enum ของคำขอที่ AI Gateway รองรับ — สร้างจาก ai_intent_definitions table ไม่ใช่ hardcode Avoid: Tool, LLM tool, LangChain tool

Pattern Layer: ชั้นแรกของ Intent Classifier — keyword/regex match จาก ai_intent_patterns table, cache ใน Redis TTL 5 min Avoid: Rule engine, NLU pipeline

LLM Fallback: ชั้นที่สอง of Intent Classifier — synchronous Ollama call (gemma4:e4b Q8_0) เมื่อ Pattern Layer ไม่ match, ใช้ semaphore max=3 Avoid: BullMQ-based classification, async intent routing

AI

AI Document Assistant: ผู้ช่วยที่ให้ Insight + Suggest + Notify โดยไม่เปลี่ยน state ของเอกสารเอง (ADR-023A) Avoid: AI Document Controller, AI Agent, Autonomous Agent

AI Gateway: NestJS module ที่เป็นจุดเข้าเดียวของทุกคำขอ AI — enforce CASL + tenant scope ก่อนส่งงานเข้า BullMQ Avoid: AI Service (generic), Tool Layer

Document Chunk: Row ใน ai_document_chunks (MariaDB) เก็บ chunk text + metadata, ground truth สำหรับ re-embed Avoid: ai_embeddings, embedding row

Vector Point: Point ใน Qdrant — เก็บแค่ chunk_public_id, vector, และ payload { project_public_id, document_public_id, chunk_index } Avoid: Embedding (ambiguous), Vector record

RAG Query: Pipeline: embed query → QdrantService.search(projectPublicId, vector) → ดึง chunk_text จาก MariaDB → ส่งเข้า LLM พร้อม context Avoid: Semantic search (overloaded), Vector search (incomplete)

OCR Service: Container สำเร็จรูป (FastAPI Sidecar บน Desk-5439) ทำหน้าที่ประมวลผล OCR และสื่อสารผ่าน X-API-Key ป้องกันความปลอดภัย (ADR-032/033) Avoid: OCR microservice (ที่ขาดการป้องกัน)

Prompt Version: Immutable snapshot ของ prompt template ใน ai_prompts table — ทุกครั้งที่ admin กด "บันทึก" จะสร้าง version ใหม่ (version_number เพิ่มทีละ 1) version เก่ายังอยู่ใน history ลบได้ยกเว้น active version (ADR-029) Avoid: Prompt config, Prompt setting, Editable prompt

Active Prompt: Prompt Version ที่มี is_active = 1 ต่อ prompt_type — ใช้โดยทั้ง OCR Sandbox และ processMigrateDocument พร้อมกัน, cached ใน Redis TTL 60s; invalidated เมื่อ admin activate version อื่น (ADR-029) Avoid: Production prompt (sandbox และ migrate-document ใช้เดียวกัน)

Prompt Template: String ที่มี {{ocr_text}} placeholder บังคับ — backend validate ก่อน save; processor แทนที่ด้วย OCR output ก่อนส่งเข้า Ollama (ADR-029) Avoid: Prompt string, Prompt text (ambiguous)

Human-in-the-loop: ทุก AI suggestion ต้องผ่านการ accept/reject โดย user ก่อนกลายเป็น state change — บันทึกใน ai_audit_logs Avoid: Auto-apply, AI auto-execute

AI Tool Layer: Bridge layer ระหว่าง AI Gateway กับ business modules — dispatch โดย AI Gateway หลังได้ Server-side Intent, enforce CASL ภายใน tool เอง (ADR-025) Avoid: LLM function calling, Tool plugin, LangChain tool

Tool Registry: Static map ใน AiToolRegistryService ที่ map ServerIntent → tool handler — Intent ที่ไม่มีใน registry route ไป RAG หรือ FALLBACK Avoid: Dynamic plugin registry, Runtime-loaded tools

ToolResult DTO: LLM-friendly response object จาก tool — มีเฉพาะ publicId + business codes, ไม่มี INT id (ADR-019), ไม่มี TypeORM relations Avoid: Raw entity, Full entity response

ToolCallResult: Result wrapper ที่ tool คืนให้ Gateway: { ok: true, data } หรือ { ok: false, reason, message } — ไม่ throw exception Avoid: Throw exception from tool, Untyped error

Relationships

A Correspondence has a 1:1 specialization to RFA / Transmittal / etc. (table inheritance)
A RFA has 1:N RFA Revisions, each linking to one or more Shop Drawing Revisions via rfa_items
A Workflow Instance governs exactly one Correspondence; its current state is projected into entity columns (e.g. rfa_revisions.rfa_status_code_id) but workflow_instances is the source of truth
A Prompt Version lives in ai_prompts; exactly one per prompt_type has is_active = 1 — this is the Active Prompt consumed by both OCR Sandbox and processMigrateDocument; cached in Redis TTL 60s
A Document Chunk (MariaDB) has a 1:1 Vector Point in Qdrant via shared chunk_public_id (UUIDv7)
An AI Document Assistant suggestion produces an ai_audit_logs row; if user accepts, it triggers a normal Workflow Transition (AI never writes the transition itself)
Qdrant queries MUST be filtered by project_public_id — enforced at compile time by QdrantService signature
An Intent Classifier receives user query → returns Server-side Intent + confidence; Pattern Layer (DB table) checked first, LLM Fallback (Ollama sync) used only when pattern miss
An Intent Definition (ai_intent_definitions) has 1:N Intent Patterns (ai_intent_patterns); Admin จัดการได้ runtime
AI Gateway dispatches to AI Tool Layer directly (server-side) after receiving Intent — LLM never calls tools itself; Tool Registry maps Intent → handler; each handler returns ToolCallResult wrapper
A ToolResult DTO contains only publicId + business codes — injected into LLM prompt as JSON context (v1, max 500 tokens)

AI authority scope (resolved)

Scope	Allowed?	Mechanism
Read-only insight (summarise, explain)	✅	AI Gateway → service → CASL-guarded query
Suggest action (UI shows button)	✅	Response shape `{ suggestedAction, confidence, reasoning }`
Auto-trigger side-effects (notify, alert, comment)	✅	BullMQ job (ADR-008); MUST NOT change workflow state
Auto-execute workflow transition	❌	Forbidden Tier 1 — every transition needs human `actor_user_id`

Upload pipeline (resolved)

Stage	Mode	Queue	Notes
1. Upload → temp + return `tempUploadId`	Sync	—	<1s
2. ClamAV scan + MIME whitelist	Sync	—	block ก่อน commit (ADR-016)
3. User commit (metadata + ย้าย permanent)	Sync	—	สร้าง `documents` row, ใช้ `Idempotency-Key`
4. Classification/Tagging (3 pages แรก)	Async	`ai-realtime`	suggest metadata; user accept/reject (human-in-the-loop)
5. RAG Embedding (full doc; OCR ถ้า text-layer < 100 chars/page)	Async	`ai-batch`	trigger AUTO หลัง commit, parallel กับ stage 4
6. Qdrant upsert + `ai_document_chunks.embedded_at = NOW()`	Async	(worker)	gap = DB full-text fallback

กฎ:

❌ ห้าม OCR/embed ใน HTTP request handler
✅ BullMQ jobId = chunk_public_id (UUIDv7) กัน duplicate
✅ Embed fail → graceful degrade (เอกสารยังใช้งานได้, AI feature ลด)
✅ Revision ใหม่ → chunks เก่า mark superseded_at, ไม่ลบ vector
✅ Frontend ใช้ AiStatusBanner แสดง progress

Identifier rules (ADR-019, AI subsystem)

Boundary	Identifier ที่ใช้
API (FE ↔ AI Gateway)	`publicId` (UUIDv7 string) เท่านั้น; INT `id` มี `@Exclude()`
Server-side Intent payload	`*PublicId` strings; service แปลงเป็น INT FK ภายใน
LLM context (prompt)	`publicId` + business code (`rfa_number`, `drawing_code`) ห้ามเห็น INT
Qdrant payload	`project_public_id`, `document_public_id`, `chunk_public_id`
`ai_document_chunks` internals	INT FK ใช้ได้ภายใน DB; identity ที่ expose = `chunk_public_id BINARY(16)`
Business codes (e.g. `drawing_code = "A-101"`)	รับเป็น input ได้ แต่ resolve → `publicId` ก่อน query

Forbidden (Tier 1 CI blocker):

parseInt(<*PublicId>), Number(<*PublicId>), +<*PublicId>
publicId ?? id ?? '' fallback chain
DTO ที่มีทั้ง { id, uuid, publicId }

AI integration architecture (resolved)

มีแล้ว (Infrastructure):

AI Gateway — NestJS module, CASL-guarded, enqueue jobs ไป BullMQ
n8n — Workflow orchestrator บน QNAP (Migration Phase + simple routing)
Ollama — Local LLM inference บน Admin Desktop (ADR-034: typhoon2.5-np-dms + typhoon-np-dms-ocr + nomic-embed-text)
QdrantService — Vector search แบบ project-isolated
AiRagService — RAG pipeline (embed query → Qdrant → LLM context)
OcrService / sidecar — ระบบประมวลผล OCR ปลอดภัยด้วย API Key และ dynamic model swapping (ADR-033)

Glossary Updates (from ADR-034)

Term	Definition	Avoid
Thai-Optimized Model	โมเดล AI ที่ถูก fine-tune มาสำหรับภาษาไทยโดยเฉพาะ (เช่น Typhoon series จาก SCB10X)	Generic model, English-only model
Model Unload/Load	กระบวนการยกเลิกโหลดโมเดลจาก VRAM และโหลดโมเดลใหม่เข้าไปแทน เพื่อสลับการใช้งานระหว่างโมเดลต่างๆ	Model switching (ambiguous), Hot swap
Cold Start Penalty	ความล่าช้า 5-15 วินาทีที่เกิดจากการโหลดโมเดล weights เข้า VRAM หลังจากโมเดลถูก unload (keep_alive: 0)	Initial delay, First-run latency
Canonical AI Model Identity	ชื่อโมเดลหลักที่ระบบ backend, admin console และเอกสารสถาปัตยกรรมใช้อ้างอิงร่วมกันเป็น source of truth เดียว	Alias-only model name, temporary deploy tag
Adaptive OCR Residency	นโยบาย keep_alive ของ OCR model ที่ปรับตาม VRAM headroom และ active model ขณะนั้น แทนการค้างหรือ unload แบบตายตัว	Fixed keep_alive, always-resident OCR
Execution Profile	สัญญาณเชิงนโยบายที่ caller ส่งมาเพื่อบอกระดับความเร็ว/ความแม่นยำ/บริบทที่ต้องการ โดย backend map ต่อไปเป็น model และ parameters ที่อนุญาต	Free-form model key, direct model override
Canonical Profile Set	ชุดค่า `Execution Profile` มาตรฐานที่คงที่ระดับ contract เช่น `fast`, `balanced`, `thai-accurate`, `large-context` แทนการแตก profile ตาม internal pipeline	Job-specific routing key, per-endpoint profile taxonomy
Policy-Enforced Profile Override	กฎที่ backend มีสิทธิ์บังคับ profile สำหรับงานที่มีผลต่อข้อมูลหรือ metadata โดยไม่ยึดค่าที่ caller ส่งมา	Caller-controlled quality for write-affecting jobs, advisory-only governance
LLM-First GPU Ownership	นโยบายจัดลำดับสิทธิ์ VRAM ที่ให้ main LLM และ OCR path มาก่อน embedding/reranking; retrieval side ใช้ GPU ได้เฉพาะเมื่อมี headroom ผ่าน policy	Flat shared GPU pool, equal-priority GPU consumers
CPU Fallback Retrieval	พฤติกรรม degrade ของ embedding/reranking ที่สลับกลับไปใช้ CPU ทันทีเมื่อ GPU headroom ไม่พอ โดยไม่รอคิว GPU	GPU wait queue for retrieval, hard failure on low VRAM
Selective Realtime Concurrency	นโยบายเพิ่ม concurrency ของ `ai-realtime` ได้เฉพาะ job type ที่ไม่แตะ OCR path หรือ model switching; pause/resume coordination หลักยังคงอยู่	Global realtime concurrency uplift, scheduler rewrite
Lightweight Realtime Job	งานใน `ai-realtime` ที่ไม่เรียก OCR, ไม่บังคับ model switch, และไม่พึ่ง GPU-heavy generation path จึงมีสิทธิ์อยู่ใน concurrency uplift set	RAG query, OCR-triggering job, GPU-heavy generation
Generation-Centric RAG Query	การจัดประเภท `rag-query` ว่าเป็นงาน generation เป็นหลัก โดย retrieval ทำหน้าที่เตรียม context และยอม degrade ได้	Retrieval-first RAG, search-only job
Restricted Large-Context Profile	โปรไฟล์ `large-context` เป็นความสามารถพิเศษที่จำกัดใช้เฉพาะ admin หรือ special workflows ที่ backend อนุญาต ไม่ใช่ตัวเลือกทั่วไปของ `rag-query`	Public long-context option, caller-driven context inflation
Big Bang AI Runtime Rollout	การเปลี่ยน runtime policy, model identity, และ GPU scheduling หลายส่วนพร้อมกันในรอบ deploy เดียว เพราะระบบยังไม่เปิด production	Phase-gated rollout, incremental policy cutover
Big Bang Cutover Gate	เกณฑ์ผ่านก่อน cutover ที่บังคับให้ policy contract, model switching, adaptive OCR residency, และ RAG fallback ต้องผ่านครบทั้งชุด ไม่รับ partial success	Best-effort rollout, partial completion gate
Executable-First Verification	เกณฑ์ยืนยันผลหลักของ AI runtime rollout ต้องอิง test, log, metric, หรือ trace ที่รันซ้ำได้ แต่แต่ละแกนต้องมี manual validation path สำหรับยืนยันพฤติกรรมเชิงใช้งานจริงประกบเสมอ	Manual-only signoff, unverifiable smoke check
Single-Name Canonical Model Policy	เมื่อประกาศ canonical model identity ใหม่ ชื่อเดียวกันต้องถูกใช้สอดคล้องกันทุกชั้นของระบบที่ผู้ใช้และนักพัฒนาเห็น ส่วนชื่อ base runtime จริงเป็น implementation detail ใน ops/runtime internals เท่านั้น	Dual naming, mixed canonical and base model labels
Canonical OCR Identity	OCR model ต้องใช้ชื่อ canonical เดียวทุกชั้นของระบบเช่น `np-dms-ocr` โดยไม่เปิดชื่อ runtime เดิมเป็น public/internal contract หลัก	Legacy OCR runtime label as primary name, mixed OCR naming
Profile-Only Parameter Governance	API caller ส่งได้เพียง `Execution Profile`; ค่า temperature, top_p, max tokens และ runtime parameters จริงถูกกำหนดโดย backend policy เท่านั้น	Caller parameter override, free-form runtime tuning
Integrated Retrieval Acceleration Policy	การเร่งความเร็ว retrieval เช่น BGE embedding/reranking บน GPU เป็นส่วนหนึ่งของ AI runtime resource policy เดียวกับ main model และ OCR ไม่ใช่งาน optimization แยกอิสระ	Standalone retrieval tuning, separate GPU policy for RAG only

System readiness summary (resolved)

Component	สถานะ	หมายเหตุ
Infrastructure	✅ พร้อม	NestJS + Next.js + MariaDB + Redis + Elasticsearch
Workflow Engine	✅ พร้อม	DSL-based, ADR-001/021
AI Boundary	✅ พร้อม	ADR-023A — Ollama isolation, no direct DB access
RAG Pipeline	✅ พร้อม	Qdrant service ป้องกันการรั่วไหลระหว่างโปรเจกต์
Intent Router	✅ พร้อม	ADR-024 Active — Intent Classifier (Pattern→LLM Fallback) ทำงานเสร็จสมบูรณ์
AI Tool Layer	✅ พร้อม	ADR-025 Active — Tool Layer Bridge functions พัฒนาเสร็จสมบูรณ์
Document Chat UI	✅ พร้อม	ADR-026 Active — แผงควบคุม Side-panel Chat UI พัฒนาเสร็จสมบูรณ์
AI Admin Console	✅ พร้อม	ADR-027 Active — แผงควบคุม Dynamic prompt & model control
Dynamic Prompt Mgmt	✅ พร้อม	ADR-029 Active — พัฒนาเสร็จสมบูรณ์ทั้ง Entity, API, Sandbox, Cache และ UI
Active Model & OCR Switch	✅ พร้อม	ADR-033 Active — สลับโมเดลแบบ Synchronous, GPU VRAM Auto-release และ API Key sidecar protection

Flagged ambiguities

"approval logic" ในเอกสารเก่าใช้คาบเกี่ยวระหว่าง rfa_approve_codes (business outcome เช่น 1A/1B) กับ workflow_definitions (state transition rules) — resolved: เป็นคนละสิ่ง
"ai_embeddings" vs "ai_document_chunks" — resolved: ใช้ ai_document_chunks (metadata + text) + Qdrant (vector only); ห้ามเก็บ vector ใน MariaDB
"Tool Layer" ในเอกสาร AI — resolved: ไม่ใช่ LLM-callable tools, เป็น Server-side Intents ที่ NestJS controlใน AI Gateway
"AI = Document Controller" — resolved: ใช้ AI Document Assistant (Suggest + Insight) แทน เพื่อกัน scope creep ไปทาง autonomous agent
OpenRAG vs ADR-023A — resolved: ADR-023A เป็น canonical source — ใช้ Qdrant + nomic-embed-text สำหรับ vector search; Elasticsearch ใช้สำหรับ keyword/full-text เท่านั้น; specs/03-Data-and-Storage/03-07-OpenRAG.md เป็นเอกสาร reference แต่ไม่ใช่ active spec
".agents/ กับ Production AI" — resolved: .agents/ คือ Dev AI toolkit (ช่วยเขียนโค้ด); Production AI คือ AI Gateway + n8n + Ollama — เป็นคนละ layer กัน
"np-dms-ai" vs typhoon2.5-np-dms:latest — resolved: ถ้าเดินตาม AI refactor ใหม่ np-dms-ai คือ Canonical AI Model Identity ใหม่ของระบบ ไม่ใช่แค่ deploy alias
"OCR keep_alive" — resolved: policy ใหม่ควรถูกอธิบายเป็น Adaptive OCR Residency ตาม VRAM headroom และ active model ไม่ใช่ fixed 0 หรือ fixed 300
"model.key ใน API job request" — resolved: caller ไม่ควรเลือกชื่อโมเดลตรง ๆ; ควรส่ง Execution Profile แล้วให้ backend policy เป็นคน map ไป model/parameters ที่อนุญาต
"profile names" — resolved: ใช้ Canonical Profile Set แบบเล็กและเสถียร (fast, balanced, thai-accurate, large-context) แทนการแตกชื่อ profile ตาม job ภายใน
"profile สำหรับ migrate-document / auto-fill-document / OCR extraction" — resolved: ใช้ Policy-Enforced Profile Override; backend บังคับ profile เองสำหรับงานที่มีผลต่อข้อมูล ไม่เปิดให้ caller เลือกคุณภาพอย่างอิสระ
"BGE-M3 / Reranker บน GPU" — resolved: ถ้าย้ายขึ้น GPU ต้องอยู่ใต้ LLM-First GPU Ownership; LLM/OCR มี priority สูงกว่า retrieval path เสมอ
"embed/rerank ตอน VRAM ไม่พอ" — resolved: ใช้ CPU Fallback Retrieval; retrieval path ต้อง degrade ไป CPU ทันที ไม่รอ GPU queue
"ai-realtime = 2" — resolved: ใช้ Selective Realtime Concurrency; เพิ่มได้เฉพาะงาน realtime ที่ไม่ชนกับ OCR/model switching และยังคง pause/resume model เดิมเป็นแกนหลัก
"งานไหนได้สิทธิ์ realtime concurrency 2" — resolved: จำกัดเฉพาะ Lightweight Realtime Job; ไม่รวม rag-query
"rag-query ควรถูกมองเป็นอะไร" — resolved: ใช้ Generation-Centric RAG Query; main model path เป็น policy หลัก ส่วน retrieval เป็นขั้นเตรียม context ที่ fallback CPU ได้
"large-context ใช้กับอะไร" — resolved: ใช้ Restricted Large-Context Profile; จำกัดเฉพาะ admin/special workflows ไม่เปิดเป็นตัวเลือกทั่วไปของ rag-query
"rollout ของ AI refactor" — resolved: ใช้ Big Bang AI Runtime Rollout แม้มีหลาย runtime policy changes พร้อมกัน เพราะระบบยังไม่เปิด production
"อะไรคือเกณฑ์ผ่านของ big bang" — resolved: ใช้ Big Bang Cutover Gate; ต้องผ่านครบทั้ง policy contract, model switching, adaptive OCR residency และ RAG fallback
"evidence แบบไหนนับว่าผ่าน gate" — resolved: ใช้ Executable-First Verification เป็นหลัก แต่ต้องมี manual validation path ควบคู่ในแต่ละแกน
"np-dms-ai ควรตั้งชื่ออย่างไรในระบบ" — resolved: ใช้ Single-Name Canonical Model Policy; np-dms-ai เป็นชื่อเดียวทุกชั้นที่ผู้ใช้และนักพัฒนาเห็น
"np-dms-ocr ควรเดินตาม naming policy เดียวกันไหม" — resolved: ใช้ Canonical OCR Identity; np-dms-ocr เป็นชื่อ canonical เดียวทุกชั้นเหมือน np-dms-ai
"temperature/topP/maxTokens ใครคุม" — resolved: ใช้ Profile-Only Parameter Governance; caller ส่งได้แค่ profile ส่วน runtime parameters จริงให้ backend policy คุมทั้งหมด
"BGE GPU uplift อยู่ใน scope เดียวกันไหม" — resolved: ใช้ Integrated Retrieval Acceleration Policy; retrieval acceleration เป็นส่วนหนึ่งของ runtime resource policy เดียวกัน

ADRs ที่เกี่ยวข้องกับ AI Runtime Layer

ADR	หัวข้อ	ตัดสินใจอะไร	สถานะ
ADR-024	Intent Classification Strategy	Hybrid: Pattern First → LLM Fallback	✅ Accepted
ADR-025	AI Tool Layer Architecture	Bridge pattern, CASL enforcement, response shape	✅ Accepted
ADR-026	Document Chat UI Pattern	Side-panel vs modal vs separate page	✅ Accepted
ADR-027	AI Admin Console & Dynamic Control	Admin Panel + dynamic model/prompt/intent control	✅ Accepted
ADR-028	Migration Architecture Refactor	Staging Queue & post-migration cleanup	✅ Active
ADR-029	Dynamic Prompt Management	`ai_prompts` table, versioned OCR extraction prompt	✅ Active
ADR-032	Typhoon OCR Integration	Typhoon OCR-3B + typhoon2.1-gemma3-4b on Admin Desktop	✅ Active
ADR-033	Active Model & OCR Management	Synchronous Model switch, GPU VRAM Auto-release, Sidecar API Key protection	✅ Active

หมายเหตุ: ADR-023A ยังคงเป็น canonical สำหรับ infrastructure — ADR-024/025/026/027 เพิ่ม runtime layer; ADR-028 ปรับ Migration Pipeline; ADR-033 จัดระบบโมเดลและ OCR

สิ่งที่ควรทำในอนาคต (Future Maintenance & Security Tasks)

Axios Dependency: ได้รับการอัปเกรด dependencies เป็นรุ่นปลอดภัยล่าสุดและแก้ไขช่องโหว่ Prototype Pollution เรียบร้อยแล้ว (pnpm audit CLEAN 100%)
ความปลอดภัยของ Sidecar และ GPU: นำระบบ API Key Header verification (X-API-Key) และกลไก Unload model (keep_alive: 0) มาประยุกต์ใช้อย่างสมบูรณ์บนเครื่องประมวลผลโลคัล Desk-5439

29 KiB Raw Blame History