Second, it’s about where the project is headed. Because archgw is built as a proxy server for agents, it’s designed to support emerging low-level protocols like A2A and MCP in a consistent, unified way—so developers can focus purely on high-level agent logic. This borrows from the same design decision that made Envoy successful for microservices: offload infrastructure concerns to a specialized layer, and keep application code clean. In our next big release, you will be able to run archgw as a sidecar proxy for improved orchestration and observability of agents. Something that other projects just won't be able to do.
Kong was designed for APIs. Envoy was built for microservices. Arch is built for agents.
And since everything is an API, Kong also supports MCP natively (among many other protocols, including all LLMs): https://konghq.com/blog/product-releases/securing-observing-...
This was the insight behind Envoy's initial design. Handle north/south and east/west traffic equally well as a universal data plane.
It looks like you’re even building on Envoy as the foundation for the system which just makes it more compelling.
vm_config: runtime: "envoy.wasm.runtime.v8" code: local: filename: "/etc/envoy/proxy-wasm-plugins/prompt_gateway.wasm"
…is where the integration happens. There's no need to modify envoy.bootstrap.wasm; instead, Arch loads the WASM modules at runtime using standard Envoy config templating. The filters (prompt_gateway for ingress, and llm_gateway for egress sit in the request path and do things like prompt inspection, model routing, header rewrites, and telemetry collection.
And was pleased with what I was able to do. Thanks
And importantly, some things just can’t be done well in a framework. For example, enforcing global rate limits across LLMs isn’t realistic when each agent instance holds its own local state. That kind of cross-cutting concern needs to live in infrastructure—not in application code.