A2A: when the workflow IS the broker

The reference architecture distinguishes request-based and message-driven agent communication. For in-process orchestration, MAF's workflow runtime IS the broker. For distributed agent-to-agent, agent-framework-a2a is what you reach for.

June 3, 2026 · Engineers designing multi-agent communication

MAFA2ACommunicationArchitecture

The reference architecture's Chapter 5 splits agent-to-agent communication into two paradigms: request-based and message-driven. For the first round of this project I wrote both — a OrchestratorClient for request/reply and an InProcessBroker with topic subscriptions and a correlation-id-based request/response helper. About 300 lines.

Then I read MAF more carefully and deleted all of it.

The realisation

The MAF workflow runtime — SequentialBuilder, ConcurrentBuilder, HandoffBuilder, WorkflowBuilder — already handles message routing, ordering, back-pressure, error propagation, and OpenTelemetry tracing. For agents in the same process, the workflow IS the broker. There is no value in a separate pub/sub layer.

What's actually useful from a dedicated agent-to-agent package is the distributed case — when agents run as separate services and need to call each other over HTTP. That's exactly what agent-framework-a2a is for. It implements the A2A protocol.

So the project's communication/ module shrank to two helpers around A2A:

from agent_framework_a2a import A2AAgent, A2AExecutor


def wrap_remote_agent(*, url, name=None, id=None, description=None) -> A2AAgent:
    return A2AAgent(url=url, name=name, id=id, description=description)


def as_workflow_executor(agent, *, stream: bool = False) -> A2AExecutor:
    return A2AExecutor(agent=agent, stream=stream)

That's it. 30 lines. The complicated thing is choosing when to reach for them.

In-process: just use the builder

sequenceDiagram
    autonumber
    actor User
    participant Builder as SequentialBuilder
    participant Planner
    participant Researcher
    participant Writer
    User->>Builder: workflow.run(prompt)
    Note over Builder: routing + back-pressure<br/>+ tracing handled here
    Builder->>Planner: pass conversation
    Planner-->>Builder: append msg
    Builder->>Researcher: pass conversation
    Researcher-->>Builder: append msg
    Builder->>Writer: pass conversation
    Writer-->>Builder: append msg
    Builder-->>User: get_outputs()

Six agents in the same process, four workflow shapes to compose them. No queue, no topic, no envelope. The builder is the queue.

Why I find this convincing: every MAF agent invocation already gets an OTel span. Every workflow edge gets a executor.process * span. Every tool call gets a span. If you put a pub/sub layer in front, you have to either bridge those spans through your broker — and write the trace-context propagation yourself — or accept losing trace correlation across the broker boundary. Both are bad outcomes that the workflow runtime avoids by being the broker.

Distributed: A2A as the wire format

When agents run as separate services, you need a wire format. A2A is that — REST endpoints with a documented schema for "send a task" and "stream the response."

The consumer side:

from agent_framework.orchestrations import SequentialBuilder
from multi_agent.agents import make_orchestrator, make_writer
from multi_agent.communication import wrap_remote_agent
from multi_agent.providers import build_chat_client

client = build_chat_client()
orchestrator = make_orchestrator(client)
writer = make_writer(client)
remote_researcher = wrap_remote_agent(url="https://researcher.example.com/a2a")

workflow = SequentialBuilder(
    participants=[orchestrator, remote_researcher, writer],
).build()

The A2AAgent looks like any other MAF Agent to the workflow. The workflow doesn't know the researcher is remote. That's the whole point.

The producer side (exposing your local agent over A2A) is A2AExecutor. You wrap the agent and it gives you the executor that a workflow on the consumer's side can stream from. The repo doesn't ship a hosted A2A server because it runs in-process by default; if you wanted one, the MAF A2A samples are the path.

sequenceDiagram
    autonumber
    actor User
    participant Orch as Orchestrator (local)
    participant A2A as A2AAgent wrapper
    participant Remote as Remote Researcher service
    participant LLM
    User->>Orch: prompt
    Orch->>A2A: agent.run("research X")
    A2A->>Remote: POST /a2a/tasks
    Remote->>LLM: complete
    LLM-->>Remote: result
    Remote-->>A2A: A2A response
    A2A-->>Orch: AgentResponse
    Orch-->>User: aggregated answer

The A2A wrapper preserves the Agent interface. The orchestrator code doesn't change.

The streaming rule

The reference architecture's Chapter 5 on Request-Based communication has a strong recommendation worth quoting: stream only between the orchestrator and the end client, not between the orchestrator and internal expert agents.

sequenceDiagram
    autonumber
    User->>Orch: prompt
    rect rgba(255, 220, 220, 0.4)
        Note over Orch,SpecB: internal calls = NON-streaming
        Orch->>SpecA: full request
        SpecA-->>Orch: full response
        Orch->>SpecB: full request
        SpecB-->>Orch: full response
    end
    rect rgba(220, 255, 220, 0.4)
        Note over User,Orch: only this edge streams
        Orch-->>User: chunk 1
        Orch-->>User: chunk 2
        Orch-->>User: chunk N
    end

The rationale is operational: streaming across multiple hops makes error recovery complex (the orchestrator has to reconcile fragmented streams), increases architectural complexity (ordering, session state, message ordering at every boundary), and obscures observability (independent streams are hard to trace as one workflow).

MAF's workflow runtime honours this by default — internal SequentialBuilder edges are request/reply; only the workflow's run(..., stream=True) exposes streaming to the caller. If you swap an internal call for an A2AAgent you should keep it non-streaming. The repo's as_workflow_executor(agent, stream=False) defaults to that.

What I was missing

The first time I read the architecture I thought "request-based" and "message-driven" were two implementations I'd have to choose between. They're not. They're two ends of a spectrum:

Synchronous request/reply, in-process — workflow runtime.
Synchronous request/reply, over HTTP — A2A.
Asynchronous, durable queue — a real broker (Service Bus, Kafka, RabbitMQ). MAF doesn't ship this; you bring your own and emit messages from inside workflow executors.

Many systems use all three. The orchestrator might fan out to local agents via the workflow, call a remote validation agent via A2A, and emit a "task completed" event to Service Bus for downstream analytics. Three transports, one orchestration layer.

What I won't do again

I won't write a pub/sub in front of the workflow. The workflow IS the pub/sub for in-process orchestration, and bridging real spans through a custom broker is more work than it pays for.

The communication chapter is one of the chapters where MAF gives you exactly what the architecture asks for, and the right code is the code that uses it. The hard work — back-pressure, ordering, error propagation, trace context — is already done. Let it stay done.