DevOps
A routing pattern that puts a semantic router in front of MCP tools so the LLM sees only the subset it needs. The router uses vector embeddings and cosine similarity to match user intent to tools dynamically. It can reduce input tokens by up to 96% when dealing with large tool catalogs.
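The matching step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the template's implementation: the tool catalog, the `embed` function, and the `route` helper are all hypothetical, and `embed` is a toy bag-of-words stand-in for a real embedding model. Only the cosine-similarity ranking reflects the technique described above.

```python
import math
from collections import Counter

# Hypothetical tool catalog (name -> description). In practice this would be
# built from the tool listings of the MCP servers behind the router.
TOOLS = {
    "create_issue": "create a new issue or bug ticket in the tracker",
    "query_database": "run a sql query against the analytics database",
    "send_email": "send an email message to a recipient",
    "deploy_service": "deploy a service build to the staging or production cluster",
}


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Embed every tool description once, so each request only embeds the intent.
TOOL_VECTORS = {name: embed(desc) for name, desc in TOOLS.items()}


def route(intent: str, top_k: int = 1) -> list[str]:
    """Return the names of the top_k tools most similar to the intent."""
    query = embed(intent)
    ranked = sorted(
        TOOL_VECTORS,
        key=lambda name: cosine_similarity(query, TOOL_VECTORS[name]),
        reverse=True,
    )
    return ranked[:top_k]


if __name__ == "__main__":
    print(route("send an email to the team"))
    print(route("run a sql query", top_k=2))
```

Only the tools returned by `route` would be passed to the LLM, which is where the token savings come from: the rest of the catalog never enters the prompt. Swapping `embed` for a real embedding model changes the vectors but not the ranking logic.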
Full FlowZap Code
Host {
n1: circle label="User sends prompt"
n2: rectangle label="Agent extracts intent"
n3: rectangle label="Send intent to router"
n4: rectangle label="Receive routed result"
n5: rectangle label="Agent responds to user"
n1.handle(right) -> n2.handle(left)
n2.handle(right) -> n3.handle(left)
n3.handle(bottom) -> Router.n6.handle(top) [label="Intent + tool request"]
n4.handle(right) -> n5.handle(left)
}
Router {
n6: rectangle label="Receive intent"
n7: rectangle label="Semantic match via embeddings"
n8: diamond label="Which MCP server?"
n9: rectangle label="Forward to Server A"
n10: rectangle label="Forward to Server B"
n11: rectangle label="Normalize and return result"
n6.handle(right) -> n7.handle(left)
n7.handle(right) -> n8.handle(left)
n8.handle(bottom) -> n9.handle(top) [label="Route A"]
n8.handle(right) -> n10.handle(left) [label="Route B"]
n9.handle(bottom) -> ServerA.n12.handle(top) [label="Call Server A"]
n10.handle(bottom) -> ServerB.n14.handle(top) [label="Call Server B"]
n11.handle(top) -> Host.n4.handle(bottom) [label="Final result"]
}
ServerA {
n12: rectangle label="Execute tool A"
n13: rectangle label="Return A result"
n12.handle(right) -> n13.handle(left)
n13.handle(top) -> Router.n11.handle(bottom) [label="Result A"]
}
ServerB {
n14: rectangle label="Execute tool B"
n15: rectangle label="Return B result"
n14.handle(right) -> n15.handle(left)
n15.handle(top) -> Router.n11.handle(left) [label="Result B"]
}
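The Router lane in the diagram (choose a server, forward the call, normalize the result) could look roughly like the sketch below. Everything here is assumed for illustration: the two server functions are in-process stand-ins for real MCP servers, and the route table, the result shapes, and the `dispatch` and `normalize` names are hypothetical.

```python
from typing import Any, Callable, Dict, Tuple


def server_a_execute(args: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical Server A: returns its payload under the key "output".
    return {"output": f"A handled {args['query']}"}


def server_b_execute(args: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical Server B: returns its payload under "data" plus a status.
    return {"data": f"B handled {args['query']}", "status": "ok"}


# Hypothetical route table: tool name -> (server label, callable).
# A real router would hold MCP client sessions here instead of functions.
ROUTES: Dict[str, Tuple[str, Callable[[Dict[str, Any]], Dict[str, Any]]]] = {
    "tool_a": ("ServerA", server_a_execute),
    "tool_b": ("ServerB", server_b_execute),
}


def normalize(server: str, raw: Dict[str, Any]) -> Dict[str, Any]:
    """Map each server's result shape onto one shape for the host."""
    payload = raw.get("output", raw.get("data"))
    return {"server": server, "result": payload}


def dispatch(tool_name: str, args: Dict[str, Any]) -> Dict[str, Any]:
    """Forward the call to the server that owns the tool and normalize the reply."""
    if tool_name not in ROUTES:
        raise KeyError(f"No route for tool: {tool_name}")
    server, execute = ROUTES[tool_name]
    return normalize(server, execute(args))


if __name__ == "__main__":
    print(dispatch("tool_a", {"query": "status"}))
    print(dispatch("tool_b", {"query": "status"}))
```

Normalizing in the router means the host only ever handles one result shape, whichever server answered.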
Related templates
DevOps
A caching and compression layer that sits between agents and MCP servers, intercepting redundant context requests before they hit the wire. Uses TTL-based cache invalidation, Brotli compression, and semantic caching. Can cut token usage by up to 95% and significantly lower LLM bills.
DevOps
The simplest MCP pattern: a direct connection between the host application and the MCP server over stdio or HTTP. No extra hops, lowest latency, easiest debugging. Perfect for MVPs, hackathons, and single-team setups where security governance is not yet a concern.
DevOps
An API gateway pattern that sits between agents and MCP servers to handle authentication, rate limits, and auditing. The gateway enforces OAuth 2.0, SAML, SSO, tool-level rate limiting, and team-based quotas. Essential for multi-team or multi-tenant MCP deployments.
DevOps
A multi-agent mesh pattern where agents communicate through a shared context broker backed by MCP. Enables coordinated tool access and state synchronization across multiple specialized agents (planner, coder, reviewer, operator). Supports both orchestrated and choreographed interaction patterns.
DevOps
A resilience pattern that wraps MCP calls with health-aware gates using three states: Closed (normal), Open (failures detected, fast-fail), and Half-Open (testing recovery). Prevents cascading failures when tools become unresponsive. Essential for production-grade reliability.
Architecture
A single-agent AI architecture where one agent handles everything: parsing requests, reasoning, calling tools via MCP, and generating responses. This is the default architecture for prototypes and simple automations—easy to debug but hits context-window limits quickly and is hard to parallelize. Ideal for MVPs and solo builders shipping fast.