Modern LLM System Architecture (2025) – End‑to‑End Flow

From pre‑processing and orchestration to agentic execution, governance, and delivery.

Stages (left to right): Users & Channels • Pre‑processing & Orchestration • Model Inference • Tools & Data • Post‑processing & Governance • Observability & Audit

Users & Channels
  Web, Mobile, API, Voice
  Chat UI, Email, Agents
  Locale & consent notices

Ingress & Triage
  Intent routing • Language detect
  Rate limits • Abuse screening
  PII redaction • Risk tiering
  Jurisdiction/policy gates

Context Builder
  Conversation state & memory
  RAG: retrieve docs/web/db
  Prompt‑injection defences
  Tool selection & capability map
  Schema/format expectations

Prompt Assembler
  System + developer + user
  Retrieved context & citations
  Guardrail reminders & roles

LLM Core (Transformer)
  Tokenisation • Attention
  Instruction‑tuned & aligned
  Adapters/LoRA as needed

Decoding & Constraints
  Temperature • Top‑p • Stops
  Constrained JSON/regex/CFG
  Log‑probs • Length caps

Function‑Calling Loop
  Emit tool call → execute
  Return results → continue
  Budgets • Circuit‑breakers
  Tool‑result attribution

Test‑time Reasoning
  Plan‑and‑execute • ReAct
  Self‑critique • Consistency
  Deliberate multi‑pass

Tool Registry & Sandboxes
  Calculators • Code exec
  SQL/Vector DB • Search/Web
  Email/Calendar • CRM/APIs
  Allow/deny • Permissions

Data Sources
  Vector DB (embeddings)
  Document stores & files
  Enterprise systems/APIs
  Web retrieval (guarded)

Policy & Budgets
  Spend/time caps • Rate limits
  Jurisdiction & licensing
  Data‑minimisation

Safety Filters
  Toxicity • Self‑harm • Extremism
  PII leakage & secrets
  IP/quotations policy

Verification & Grounding
  Schema/JSON validation
  Code/SQL tests in sandbox
  Groundedness checks (RAG)
  Link/number validation

Human‑in‑the‑Loop (as needed)
  Approvals • Escalations
  High‑risk actions

Formatting & Delivery
  Citations • Tables • Redactions
  Tone/locale • Accessibility
  Channels (chat/email/docs)

Obs & Audit
  Telemetry • Evals • Event log • Approvals • Retention • Provenance

Agentic Controller
  Planner • Memory • Budget Manager • Policy‑aware executor

Policy Engine
  Org policy & jurisdiction
  Safety configs & blocklists
  Data residency & privacy
  Licensing/usage controls

Legend
  Pre‑processing & orchestration
  Model inference
  Tools / policy
  Data sources
  Safety, verification, delivery
  Observability & audit / policy
  Agentic controller scope

Notes: Apply data‑minimisation, GDPR/CCPA compliance, and content provenance where applicable. Maintain immutable audit trails for sensitive workflows.
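
The Function‑Calling Loop box (emit tool call → execute → return results → continue, bounded by budgets and circuit‑breakers, with tool‑result attribution) can be sketched roughly as below. This is a minimal illustration, not a reference implementation: `ToolCall`, `Budget`, `run_tool_loop`, and the scripted `model_step` callable are all hypothetical names, and a real system would replace the scripted model with an actual inference call.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    name: str   # tool the model wants to invoke
    args: dict  # keyword arguments for that tool


@dataclass
class Budget:
    max_calls: int     # cap on tool calls per request
    max_failures: int  # consecutive failures before the breaker trips
    calls: int = 0
    failures: int = 0

    def exhausted(self) -> bool:
        return self.calls >= self.max_calls

    def tripped(self) -> bool:
        return self.failures >= self.max_failures


def run_tool_loop(model_step, tools, allowed, budget):
    """model_step(transcript) returns a ToolCall or a final answer string."""
    transcript = []  # each entry records which tool produced which result
    while True:
        step = model_step(transcript)
        if isinstance(step, str):  # the model produced a final answer
            return step, transcript
        if budget.exhausted():
            return "stopped: tool budget exhausted", transcript
        budget.calls += 1
        if step.name not in allowed or step.name not in tools:
            # Allow/deny check: denied calls count as failures.
            budget.failures += 1
            transcript.append({"tool": step.name, "ok": False, "result": "denied"})
        else:
            try:
                result = tools[step.name](**step.args)
                budget.failures = 0  # a success resets the breaker
                transcript.append({"tool": step.name, "ok": True, "result": result})
            except Exception as exc:
                budget.failures += 1
                transcript.append({"tool": step.name, "ok": False, "result": str(exc)})
        if budget.tripped():
            return "stopped: circuit breaker tripped", transcript


def scripted_model(transcript):
    # Stand-in for a model: call the calculator once, then answer.
    if not transcript:
        return ToolCall("add", {"a": 2, "b": 3})
    return f"The sum is {transcript[-1]['result']}"


answer, log = run_tool_loop(
    scripted_model,
    tools={"add": lambda a, b: a + b},
    allowed={"add"},
    budget=Budget(max_calls=3, max_failures=2),
)
print(answer)
```

The transcript entries carry the tool name alongside each result, which is one simple way to keep tool‑result attribution available to later verification and audit steps.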

Tip: Use this as a teaching aid. Walk left→right to explain the flow, then highlight the dashed agentic area to show planning, tools, and verification loops. The Observability strip underscores governance throughout.
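
For a hands-on version of the left→right walk, the toy pipeline below chains four stages and appends an audit event after each one, mirroring how the Observability strip spans every stage. It is only a sketch under stated assumptions: the stage functions, the crude email regex standing in for PII redaction, the placeholder document id `doc-1`, and the echo stub in place of a model call are all hypothetical.

```python
import re


def ingress(req):
    # Crude email masking stands in for PII redaction at ingress.
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.-]+", "[EMAIL]", req["text"])
    return {**req, "text": text}


def build_context(req):
    # Placeholder for retrieval: attach a fixed, hypothetical document id.
    return {**req, "context": ["doc-1"]}


def infer(req):
    # Stub in place of a real model call.
    return {**req, "draft": f"Echo: {req['text']}"}


def deliver(req):
    # Attach a citation to the first retrieved document before delivery.
    return {**req, "response": f"{req['draft']} [source: {req['context'][0]}]"}


def run_pipeline(req, stages, audit_log):
    """Run stages in order, recording one audit event per stage."""
    for name, stage in stages:
        req = stage(req)
        audit_log.append({"stage": name, "keys": sorted(req)})
    return req


STAGES = [
    ("ingress", ingress),
    ("context", build_context),
    ("inference", infer),
    ("delivery", deliver),
]

audit = []
result = run_pipeline({"text": "Contact me at jane@example.com please"}, STAGES, audit)
print(result["response"])
print([event["stage"] for event in audit])
```

Logging only the field names per stage, rather than the values, is one way to keep the audit trail useful while staying close to the data‑minimisation note in the diagram.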