Enterprise agents create value when they can execute workflows, not just generate text.
Most enterprise tasks are multi-step and cross-functional. To complete them reliably, an agent must be able to:
- break a business goal into executable tasks,
- invoke the right tools in the right sequence,
- recover safely from errors and retries,
- and resume from saved state with full context.
Delivering this in production requires strong orchestration plus dependable connectivity to APIs, databases, document systems, and internal platforms. APIs trigger actions in SaaS and line-of-business applications, databases provide live operational state for correct decisions, document systems provide policy and procedure context, and internal platforms connect execution to real enterprise workflows. If any layer is missing, handoffs fail and end-to-end execution becomes unreliable.
So how do we achieve all of this?
First: Kill the "Multi-Agent Committee" hype. Not every workflow needs autonomous agents talking to each other. In fact, for 80% of enterprise processes, a multi-agent topology is an over-engineered nightmare that destroys determinism. What enterprises actually need are rigid, code-driven state machines that use single LLMs as pure functional operators—not autonomous coordinators.
In practice, this means abandoning the fantasy of a "coordinator agent" that dynamically plans and assigns tasks. Instead, use hardcoded routing. A traditional state machine translates a business objective into a workflow plan, assigns subtasks, and enforces guardrails at each stage. Single-purpose LLM calls then execute focused responsibilities. This separation improves quality because each call is heavily constrained, while the code manages sequencing, dependency checks, and rollback or escalation decisions when something fails.
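To make this concrete, here is a minimal sketch of the pattern: a hardcoded routing table owns the control flow, and the LLM is reduced to a pure function per stage. The `call_llm` stub and the stage names are hypothetical placeholders; in production each would be one constrained prompt with a structured output schema.

```python
from enum import Enum, auto

class Stage(Enum):
    TRIAGE = auto()
    DRAFT = auto()
    REVIEW = auto()
    DONE = auto()
    ESCALATED = auto()

# Hypothetical stand-in for a single-purpose LLM call. It is a pure
# functional operator: input in, structured result out, no planning.
def call_llm(task: str, payload: dict) -> dict:
    return {"task": task, "ok": True, "payload": payload}

# Hardcoded routing table: code, not an LLM planner, decides what runs next.
ROUTES = {
    Stage.TRIAGE: Stage.DRAFT,
    Stage.DRAFT: Stage.REVIEW,
    Stage.REVIEW: Stage.DONE,
}

def run_workflow(payload: dict, max_retries: int = 2) -> Stage:
    stage = Stage.TRIAGE
    while stage not in (Stage.DONE, Stage.ESCALATED):
        for _attempt in range(max_retries + 1):
            result = call_llm(stage.name.lower(), payload)
            if result["ok"]:
                break
        else:
            # Escalation is a code decision, not an agent negotiation.
            return Stage.ESCALATED
        stage = ROUTES[stage]
    return stage
```

Note that the guardrail logic (retry budget, escalation) lives entirely outside the model, which is what keeps the workflow deterministic.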
Code-driven orchestration also enables true safe parallelism. Independent subtasks can run concurrently to reduce cycle time without the unpredictable latency and compounding hallucinations of agents trying to agree with each other.
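The parallelism point can be sketched with nothing more than a thread pool: because each subtask is an independent, single-purpose call, code can fan them out and merge results deterministically. The three subtask functions below are hypothetical stand-ins for constrained LLM calls.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical single-purpose LLM calls; each is independent, so there is
# no inter-agent negotiation and no compounding of errors between them.
def summarize(doc: str) -> str:
    return f"summary:{doc}"

def classify(doc: str) -> str:
    return f"label:{doc}"

def extract_entities(doc: str) -> str:
    return f"entities:{doc}"

def fan_out(doc: str) -> dict:
    subtasks = {
        "summary": summarize,
        "label": classify,
        "entities": extract_entities,
    }
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = {name: pool.submit(fn, doc) for name, fn in subtasks.items()}
        # Code merges the results in a fixed order; no agent consensus step.
        return {name: f.result() for name, f in futures.items()}
```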
For most teams, the best baseline is:
- LangGraph for strict graph-based orchestration and hardcoded control flow (not dynamic planning)
- OpenAI Agents SDK strictly for structured tool calling, not autonomous delegation
- Only use CrewAI/multi-agent patterns when human-like brainstorming or creative exploration is required
- A state-machine topology with explicit code-based routing, not an LLM planner
- Shared state backbone (Redis + Postgres) for handoffs, checkpoints, and consistency
- Observability by default (OpenTelemetry + Grafana) for traceable execution
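On the observability bullet: the list recommends OpenTelemetry, but the core idea, one structured span per stage with a propagated trace ID, can be shown with the standard library alone. This is a sketch of the pattern, not the OpenTelemetry SDK.

```python
import json
import logging
import time
import uuid

logger = logging.getLogger("agent.trace")

def traced(stage, fn, *args, trace_id=None):
    """Run one workflow stage and emit a structured, correlatable span."""
    trace_id = trace_id or str(uuid.uuid4())
    start = time.monotonic()
    result = fn(*args)
    # One JSON log line per stage; the shared trace_id stitches the
    # whole workflow together across handoffs.
    logger.info(json.dumps({
        "trace_id": trace_id,
        "stage": stage,
        "duration_ms": round((time.monotonic() - start) * 1000, 2),
    }))
    return result
```

In a real deployment the same trace ID would ride along on every tool call and queue message, which is what makes end-to-end execution debuggable.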
Second: Stop pretending that standardized tool interfaces (like MCP) are a silver bullet. Exposing a clean JSON Schema doesn't solve the real enterprise bottleneck: implicit business logic. Tool integration isn't just about common contracts; it's about context.
In practice, an agent might know how to call the Salesforce API because of a beautiful OpenAPI spec, but standardizing the interface doesn't teach it whether it's politically or operationally safe to do so. A unified error taxonomy doesn't stop an agent from updating a record it shouldn't have touched. The reality is that "plug-and-play" agents are a myth; heavy custom middleware and explicit business rules are here to stay.
While standardized interfaces are necessary, they are vastly insufficient. Teams still need deep custom glue code to map enterprise reality to agent capabilities, maintain auditability, and ensure that each invocation actually adheres to unspoken company policies.
For most teams, the realistic baseline is:
- Thick middleware wrapping MCP-compatible tool adapters with explicit business logic guardrails
- JSON Schema / OpenAPI contracts used for validation, but heavily augmented with semantic context
- OAuth2 or service-account auth profiles strictly bounded by least-privilege principles
- Idempotency keys + correlation IDs for safe retries and end-to-end tracing
- Unified error taxonomy (retryable, non-retryable, policy-blocked)
- Manual human-in-the-loop reviews for any tool call that mutates sensitive state
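A thin sketch ties these bullets together: a wrapper around a tool adapter that enforces an allowlist, forces approval for mutating calls, attaches idempotency and correlation headers, and maps failures into the unified error taxonomy. The tool names and policy tables are hypothetical; a real system would load them from a governance service.

```python
import uuid

# The unified error taxonomy from the list above.
class PolicyBlocked(Exception): ...
class Retryable(Exception): ...
class NonRetryable(Exception): ...

# Hypothetical policy tables standing in for real governance config.
ALLOWED_TOOLS = {"lookup_record", "update_record"}
MUTATING_TOOLS = {"update_record", "delete_record"}

def guarded_call(tool, args, raw_call, approved=False):
    """Wrap an MCP-style tool adapter with explicit business guardrails."""
    if tool not in ALLOWED_TOOLS:
        raise PolicyBlocked(f"{tool} is not in the least-privilege allowlist")
    if tool in MUTATING_TOOLS and not approved:
        raise PolicyBlocked(f"{tool} mutates state; human approval required")
    headers = {
        "Idempotency-Key": str(uuid.uuid4()),   # safe retries
        "X-Correlation-Id": str(uuid.uuid4()),  # end-to-end tracing
    }
    try:
        return {"headers": headers, "result": raw_call(tool, args)}
    except TimeoutError as exc:
        raise Retryable(str(exc))      # transient: safe to replay
    except ValueError as exc:
        raise NonRetryable(str(exc))   # bad input: replaying won't help
```

The point is that none of this logic lives in the schema: the contract validates the shape of the call, while the middleware decides whether the call should happen at all.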
Third: Acknowledge the clash between Agent Autonomy and Event-Driven Architecture. If you wrap an agent in Kafka queues, dead-letter queues, and rigid timeout budgets, is it still an autonomous agent, or have you just built the world's slowest, most expensive microservice? Enterprises must accept a controversial trade-off: you either get true autonomous reasoning, or you get traditional event-driven reliability. You rarely get both without massive latency.
In practice, if you force workflow execution to be driven by explicit events rather than dynamic reasoning, you restrict the agent's ability to pivot. Each stage emitting state transitions into queues means the process is bounded by rigid backoff policies and timeout rules. While this model keeps long-running enterprise processes resilient, it directly guts the very autonomy that makes agents appealing in the first place.
Event-driven architecture provides operational control at the cost of agent intelligence. Teams can prioritize jobs and replay failed stages, but they do so by treating the LLM as just another dumb worker in a queue. Because every transition must be event logged, execution is observable, but heavily constrained.
For most teams navigating this trade-off, the baseline is:
- Message queues (Kafka, SQS) to connect isolated LLM tasks, sacrificing true autonomous chaining
- Retry policies + dead-letter queues, accepting that LLMs will frequently fail in unpredictable ways
- Aggressive timeout budgets because agents will hallucinate and get stuck in loops
- Strict workflow state machines instead of dynamic LLM planning
- Human approval events as forced bottlenecks to prevent autonomous disasters
- Structured event logs + trace IDs to debug the inevitable collisions between autonomy and queues
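The queue-centric shape of this baseline can be sketched with the standard library: a work queue with bounded retries and a dead-letter list for jobs that keep failing. Backoff delays and the Kafka/SQS transport are elided; the `handler` is a hypothetical stand-in for one isolated LLM task.

```python
import queue

def process_events(events, handler, max_attempts=3):
    """Drain a work queue with bounded retries; exhausted jobs go to a DLQ."""
    work, dead_letter, done = queue.Queue(), [], []
    for ev in events:
        work.put({"event": ev, "attempts": 0})
    while not work.empty():
        job = work.get()
        job["attempts"] += 1
        try:
            # The LLM is "just another dumb worker in a queue" here.
            done.append(handler(job["event"]))
        except Exception:
            if job["attempts"] >= max_attempts:
                dead_letter.append(job["event"])  # parked for human replay
            else:
                work.put(job)  # retry later (backoff policy elided)
    return done, dead_letter
```

Everything that makes this reliable, bounded attempts, the DLQ, the replay path, is exactly what strips the agent of the freedom to reason its way out of a failure.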
Finally: Your complex memory architecture might already be legacy tech. We are still building elaborate state-management systems (Redis + Postgres + Vector DBs) based on the limitations of 8k context windows. With the advent of multi-million token context windows, the most contrarian (and perhaps most effective) approach to state is simply dumping the entire historical event log into the prompt. Stop building complex RAG pipelines for state when brute-force context stuffing works better and requires zero architecture.
In practice, while orchestrators should persist task checkpoints and tool outputs, the need to separate memory into fragmented "durable layers" is waning. Instead of complicated semantic retrieval and working context juggling, you can pass the full historical transcript. Workflows can resume exactly where they stopped simply by re-reading the entire thread.
Brute-force context management improves quality because the LLM sees the entire historical context, not just the chunks retrieved by a flawed similarity search. It enforces policy constraints by having the entire policy document in the prompt, providing a complete audit trail for what was literally injected into the model's brain at execution time.
For most forward-looking teams, the debatable baseline is:
- Postgres for raw event logs, checkpoints, and audit records (the source of truth)
- Massive Context Windows (1M+ tokens) instead of complex short-term/medium-term memory layers
- Zero Vector Stores for state—dump SOPs and historical cases directly into the prompt
- Session and task IDs that fetch the entire transcript to bind prompts to workflows
- Checkpoint and resume APIs that rebuild the full context window on the fly
- Retention and redaction policies applied directly to the unstructured transcript
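These bullets compose into a short sketch: fetch every event for a task from the log, apply redaction directly to the raw text, and rebuild the full transcript as the prompt. The in-memory `EVENT_LOG` stands in for the Postgres source of truth, and the card-number regex is a deliberately simplified example of a redaction rule.

```python
import re

# Hypothetical stand-in for the Postgres event log (the source of truth).
EVENT_LOG = [
    {"task_id": "t1", "step": 1, "role": "user", "text": "Refund order 99"},
    {"task_id": "t1", "step": 2, "role": "tool", "text": "card 4111-1111-1111-1111"},
    {"task_id": "t2", "step": 1, "role": "user", "text": "unrelated task"},
]

def redact(text):
    # Redaction applied directly to the unstructured transcript;
    # this toy rule only masks dash-separated 16-digit card numbers.
    return re.sub(r"\b(?:\d{4}-){3}\d{4}\b", "[REDACTED-PAN]", text)

def rebuild_context(task_id):
    """Resume a workflow by stuffing its entire event history into the prompt."""
    rows = sorted((r for r in EVENT_LOG if r["task_id"] == task_id),
                  key=lambda r: r["step"])
    return "\n".join(f'{r["role"]}: {redact(r["text"])}' for r in rows)
```

No retrieval pipeline, no chunking, no embedding drift: the session ID binds the prompt to the workflow, and the checkpoint API is just "read the whole thread again."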
Orchestration and tool connectivity are the execution backbone of enterprise agents. If a platform cannot coordinate tools reliably under real production constraints, it cannot deliver sustained business outcomes.
AI Enterprise Agent Series (1) - Secure by Design
AI Enterprise Agent Series (3) - Operations Reliability
AI Enterprise Agent Series (4) - Governance
AI Enterprise Agent Series (5) - Improving Delivery Through Platform Experience
