Five MAF orchestration shapes — adding Group Chat and Magentic
The first ten posts treated MAF as having four orchestration patterns. The official docs say five. Here are the two I missed — Group Chat and Magentic — with the API surface, when to pick each, and the test path that catches them at build time.
The earlier posts in this series covered four orchestration patterns: Sequential, Concurrent, Handoff, and Custom Graph. That's the set the project shipped with — and it's what most multi-agent diagrams show.
Then I read the 804-page official Microsoft Agent Framework PDF cover to cover and found that the docs describe five orchestrations, not four. The two I'd missed are Group Chat and Magentic. This post adds them, with offline tests, and explains when each one is the right call.
The full set
| Pattern | Topology | When to pick |
|---|---|---|
| Sequential | linear | Refinement pipeline — each step builds on the last |
| Concurrent | fan-out / fan-in | Independent perspectives in parallel |
| Handoff | mesh | Agents transfer control to each other (no central orchestrator) |
| Group Chat | star | Multiple agents take turns; a manager (or selector) picks who speaks |
| Magentic | star + planning | Open-ended task; manager maintains a task ledger and progress ledger |
| Custom graph | arbitrary DAG | When none of the above fit — loops, conditionals, sub-workflows |
The first four are linear / fan-out / mesh / star variations on the same idea. Magentic is qualitatively different: it adds planning to the orchestrator's job, with a design grounded in the Magentic-One paper and three independent loop caps (rounds / stalls / resets).
Group Chat
agent_framework.orchestrations.GroupChatBuilder takes:
- participants: the agents in the room
- exactly one of: orchestrator_agent, orchestrator, or selection_func
- max_rounds: hard cap on iteration
The orchestrator picks who speaks next each round. For testing and simple cases, a deterministic selection_func is cheaper than spinning up a manager LLM:
def _round_robin_selector(participant_names: list[str]):
    """Build a GroupChatSelectionFunction that alternates speakers."""
    # A one-item dict lets the closure mutate the counter between calls.
    index = {"value": 0}

    def select(state) -> str:
        name = participant_names[index["value"] % len(participant_names)]
        index["value"] += 1
        return name

    return select


def build_group_chat_workflow(client=None, *, max_rounds: int = 4):
    client = client or build_chat_client()
    writer = make_writer(client, require_per_service_call_history_persistence=True)
    critic = make_critic(client, require_per_service_call_history_persistence=True)
    return GroupChatBuilder(
        participants=[writer, critic],
        selection_func=_round_robin_selector(["writer", "critic"]),
        max_rounds=max_rounds,
        output_from="all",
    ).build()
The whole thing — including OTel wrapper and metrics — is in workflows/group_chat.py.
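To see what the selector does in isolation, here is a minimal sketch that rebuilds the same closure and calls it without any workflow. Passing None as the state is an assumption made for illustration; it works only because this particular selector never reads its argument.

```python
from typing import Any, Callable


def round_robin_selector(participant_names: list[str]) -> Callable[[Any], str]:
    """Return a selector that cycles through participant_names in order."""
    index = {"value": 0}

    def select(state: Any) -> str:
        name = participant_names[index["value"] % len(participant_names)]
        index["value"] += 1
        return name

    return select


# None stands in for the real group chat state, which this selector ignores.
select = round_robin_selector(["writer", "critic"])
speakers = [select(None) for _ in range(4)]
print(speakers)  # ['writer', 'critic', 'writer', 'critic']
```

Each call to round_robin_selector creates its own counter, so two workflows built from the same factory do not share turn order.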
When to pick Group Chat over Handoff
The line is subtle. From the docs:
| Group Chat | Handoff |
|---|---|
| Star topology with a manager picking speakers | Mesh topology with agents transferring control to each other |
| Iterative refinement (writer ↔ reviewer rounds) | Routing the conversation to the right specialist |
| Manager owns the orchestration | Each agent owns its handoff decision |
| Shared context — all agents see history | Full context handed off; receiver owns the task |
A writer-critic loop is Group Chat. A customer-support triage that routes to refund/order/return agents is Handoff.
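The difference in who owns the next-speaker decision can be sketched in plain Python. This is a toy model, not MAF code: the agents are stub functions, and run_group_chat and run_handoff are hypothetical names invented for the illustration.

```python
from typing import Callable, Optional

# A toy agent takes the shared history and returns (reply, next_agent).
# next_agent is only consulted by the handoff loop; None means "I am done".
Agent = Callable[[list[str]], tuple[str, Optional[str]]]


def run_group_chat(
    agents: dict[str, Agent],
    selector: Callable[[list[str]], str],
    max_rounds: int,
) -> list[str]:
    """Star topology: a central selector decides who speaks each round."""
    history: list[str] = []
    for _ in range(max_rounds):
        name = selector(history)
        reply, _ignored = agents[name](history)
        history.append(f"{name}: {reply}")
    return history


def run_handoff(agents: dict[str, Agent], start: str, max_hops: int) -> list[str]:
    """Mesh topology: the current agent decides who takes over next."""
    history: list[str] = []
    current: Optional[str] = start
    hops = 0
    while current is not None and hops < max_hops:
        reply, next_agent = agents[current](history)
        history.append(f"{current}: {reply}")
        current = next_agent
        hops += 1
    return history


chat = run_group_chat(
    {"writer": lambda h: ("draft", None), "critic": lambda h: ("feedback", None)},
    selector=lambda h: ["writer", "critic"][len(h) % 2],
    max_rounds=4,
)
handoff = run_handoff(
    {
        "triage": lambda h: ("routing to refund", "refund"),
        "refund": lambda h: ("refund issued", None),
    },
    start="triage",
    max_hops=5,
)
print(chat)
print(handoff)
```

In the first loop the agents never name a successor; in the second there is no selector at all. That is the whole distinction the table draws.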
Magentic
MagenticBuilder is the heaviest pattern. The manager maintains two ledgers and adapts in real time:
- Task ledger — facts, plan, educated guesses (updated on initial planning and on replans)
- Progress ledger — is the task complete? are we looping? next speaker? (updated every round)
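As a rough mental model, the two ledgers can be pictured as small records with different update cadences. The sketch below is an assumption, not the framework's types: the progress fields borrow the attribute names read in the event loop later in this post, while the TaskLedger field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TaskLedger:
    """Rewritten on initial planning and on each replan (hypothetical shape)."""

    facts: list[str] = field(default_factory=list)
    plan: list[str] = field(default_factory=list)
    educated_guesses: list[str] = field(default_factory=list)


@dataclass
class ProgressLedger:
    """Produced fresh every round (hypothetical shape)."""

    is_request_satisfied: bool = False
    is_in_loop: bool = False
    next_speaker: Optional[str] = None


task_ledger = TaskLedger(
    facts=["Audience is beginner programmers"],
    plan=["research topics", "draft schedule", "review draft"],
)

# One progress ledger per round; the task ledger stays fixed until a replan.
rounds = [
    ProgressLedger(next_speaker="researcher"),
    ProgressLedger(next_speaker="writer"),
    ProgressLedger(is_request_satisfied=True),
]
finished_at = next(i for i, p in enumerate(rounds, start=1) if p.is_request_satisfied)
print(finished_at)  # 3
```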
The Python signature with all the knobs:
def build_magentic_workflow(
    client=None,
    *,
    max_round_count: int = 6,          # total rounds before forced termination
    max_stall_count: int = 3,          # consecutive stalls before manager replans
    max_reset_count: int = 2,          # full plan resets before final termination
    enable_plan_review: bool = False,  # HITL signoff on initial plan + replans
) -> Workflow:
    manager = _make_manager(client)
    researcher = make_researcher(client, require_per_service_call_history_persistence=True)
    writer = make_writer(client, require_per_service_call_history_persistence=True)
    critic = make_critic(client, require_per_service_call_history_persistence=True)
    return MagenticBuilder(
        participants=[researcher, writer, critic],
        manager_agent=manager,
        max_round_count=max_round_count,
        max_stall_count=max_stall_count,
        max_reset_count=max_reset_count,
        enable_plan_review=enable_plan_review,
        output_from="all",
    ).build()
Why three independent loop caps? Each protects against a different failure mode:
| Cap | What it prevents |
|---|---|
| max_round_count | Runaway iteration when no agent signals completion |
| max_stall_count | The manager wasting rounds asking the same question; hitting the cap triggers a replan |
| max_reset_count | An unsolvable plan; the run gives up after N full plan rewrites |
Without all three, Magentic can loop indefinitely on tasks it shouldn't try in the first place.
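How the three caps interact is easier to see in a toy state machine. The sketch below is not MAF's implementation: run_with_caps and its "done" / "progress" / "stall" signals are hypothetical, and the exact comparison semantics inside the framework may differ.

```python
from typing import Iterable


def run_with_caps(
    signals: Iterable[str],
    *,
    max_round_count: int,
    max_stall_count: int,
    max_reset_count: int,
) -> str:
    """Consume one signal per round and report why the run ended."""
    rounds = stalls = resets = 0
    for signal in signals:
        if rounds >= max_round_count:
            return "stopped: max_round_count"
        rounds += 1
        if signal == "done":
            return "completed"
        if signal == "progress":
            stalls = 0  # any progress clears the consecutive-stall counter
            continue
        # Anything else is treated as a stalled round.
        stalls += 1
        if stalls >= max_stall_count:
            if resets >= max_reset_count:
                return "stopped: max_reset_count"
            resets += 1  # replan: fresh plan, stall counter starts over
            stalls = 0
    return "stopped: signals exhausted"


finished = run_with_caps(
    ["progress", "stall", "done"],
    max_round_count=6, max_stall_count=3, max_reset_count=2,
)
runaway = run_with_caps(
    ["progress"] * 10,
    max_round_count=6, max_stall_count=3, max_reset_count=2,
)
unsolvable = run_with_caps(
    ["stall"] * 20,
    max_round_count=100, max_stall_count=3, max_reset_count=2,
)
print(finished, runaway, unsolvable, sep=" | ")
```

The runaway case never stalls, so only the round cap can stop it; the unsolvable case only ever stalls, so it burns through two replans and then hits the reset cap on the ninth round.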
When to pick Magentic
The official docs put it bluntly:
"Magentic orchestration has the same architecture as the Group Chat orchestration pattern, with a very powerful manager that uses planning to coordinate agent collaboration. If your scenario requires simpler coordination without complex planning, consider using the Group Chat pattern instead."
In other words: try Group Chat first. Reach for Magentic when the task has open structure (no fixed pipeline), requires research + computation across multiple specialists, and the manager genuinely needs to plan.
An example task, condensed from the Magentic sample in the official docs: "Prepare a report comparing energy efficiency of ResNet-50, BERT-base, and GPT-2 on Azure Standard_NC6s_v3 VMs, including CO₂ estimates for 24-hour training, with tables and a final recommendation per task type." That's the kind of task where a manager picking the next speaker every round buys you something a fixed pipeline can't.
The three orchestrator events Magentic emits
Worth knowing because they're observable from your event-handling code:
async for ev in workflow_run.watch_stream_async():
    if isinstance(ev, MagenticPlanCreatedEvent):
        print(f"[plan] {ev.full_task_ledger.text}")
    elif isinstance(ev, MagenticReplannedEvent):
        print(f"[replan] {ev.full_task_ledger.text}")
    elif isinstance(ev, MagenticProgressLedgerUpdatedEvent):
        ledger = ev.progress_ledger
        print(f"[progress] complete={ledger.is_request_satisfied}, "
              f"loop={ledger.is_in_loop}, next={ledger.next_speaker}")
These give you a live view into the manager's state machine. Wire them into your observability dashboard alongside the standard workflow.* spans.
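One way to feed a dashboard is to fold the stream into counters. The sketch below runs without MAF installed, so it uses stand-in event classes and a fake stream; PlanCreated, Replanned, ProgressUpdated, fake_stream, and collect_metrics are all hypothetical names, and the real events would carry the ledger objects shown above instead of flat fields.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Union


@dataclass
class PlanCreated:  # stand-in for the plan-created event
    text: str


@dataclass
class Replanned:  # stand-in for the replanned event
    text: str


@dataclass
class ProgressUpdated:  # stand-in for the progress-ledger-updated event
    is_request_satisfied: bool
    is_in_loop: bool
    next_speaker: str


Event = Union[PlanCreated, Replanned, ProgressUpdated]


async def fake_stream() -> AsyncIterator[Event]:
    """Yield a canned run: plan, two progress updates, a replan, completion."""
    events: list[Event] = [
        PlanCreated("v1"),
        ProgressUpdated(False, False, "researcher"),
        ProgressUpdated(False, True, "researcher"),
        Replanned("v2"),
        ProgressUpdated(True, False, "writer"),
    ]
    for ev in events:
        yield ev


async def collect_metrics(stream: AsyncIterator[Event]) -> dict[str, int]:
    """Count each event kind and how often the manager flagged a loop."""
    metrics = {"plans": 0, "replans": 0, "progress_updates": 0, "loop_flags": 0}
    async for ev in stream:
        if isinstance(ev, PlanCreated):
            metrics["plans"] += 1
        elif isinstance(ev, Replanned):
            metrics["replans"] += 1
        elif isinstance(ev, ProgressUpdated):
            metrics["progress_updates"] += 1
            metrics["loop_flags"] += int(ev.is_in_loop)
    return metrics


metrics = asyncio.run(collect_metrics(fake_stream()))
print(metrics)
# {'plans': 1, 'replans': 1, 'progress_updates': 3, 'loop_flags': 1}
```

A rising ratio of loop_flags to progress_updates is the kind of signal that would be worth alerting on before the stall cap fires.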
Testing without burning tokens
Both workflows are built offline without contacting an LLM. The MAF builders just wire participants and config — the LLM is only called when you await workflow.run(prompt). That means we can verify the build path with unit tests:
# tests/test_group_chat_workflow.py
import pytest


def test_group_chat_workflow_builds_offline() -> None:
    workflow = build_group_chat_workflow(client=_DummyClient(), max_rounds=2)
    assert workflow is not None


@pytest.mark.parametrize("max_rounds", [1, 2, 4, 8])
def test_group_chat_workflow_respects_max_rounds(max_rounds: int) -> None:
    workflow = build_group_chat_workflow(client=_DummyClient(), max_rounds=max_rounds)
    assert workflow is not None


# tests/test_magentic_workflow.py
import pytest


@pytest.mark.parametrize("rounds,stalls,resets", [(2, 1, 1), (4, 2, 1), (6, 3, 2), (10, 5, 3)])
def test_magentic_workflow_loop_caps(rounds, stalls, resets) -> None:
    workflow = build_magentic_workflow(
        client=_DummyClient(),
        max_round_count=rounds,
        max_stall_count=stalls,
        max_reset_count=resets,
    )
    assert workflow is not None
These tests run in milliseconds in CI. They don't catch runtime bugs in the orchestrations, but they catch the most common kind of failure: I've passed the wrong kwarg, or the participants list is empty, or the build is missing an orchestrator. (Group Chat requires one of orchestrator_agent / orchestrator / selection_func — forget to pass one and the builder raises ValueError. The test catches it.)
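The tests above exercise only the happy path. A negative-path test could pin the ValueError as well. The sketch below shows the shape of such a check against a stand-in validator, since it has to run without MAF installed: check_exactly_one_orchestrator is hypothetical, and rejecting two options at once is an assumption read from the "exactly one of" wording, not confirmed builder behavior.

```python
from typing import Any, Optional


def check_exactly_one_orchestrator(
    *,
    orchestrator_agent: Optional[Any] = None,
    orchestrator: Optional[Any] = None,
    selection_func: Optional[Any] = None,
) -> str:
    """Return the name of the single option provided, else raise ValueError."""
    options = {
        "orchestrator_agent": orchestrator_agent,
        "orchestrator": orchestrator,
        "selection_func": selection_func,
    }
    provided = [name for name, value in options.items() if value is not None]
    if len(provided) != 1:
        raise ValueError(
            "expected exactly one of orchestrator_agent, orchestrator, "
            f"selection_func; got {provided}"
        )
    return provided[0]


def raises_value_error(**kwargs: Any) -> bool:
    """Report whether the validator rejects the given keyword arguments."""
    try:
        check_exactly_one_orchestrator(**kwargs)
    except ValueError:
        return True
    return False


missing = raises_value_error()
both = raises_value_error(orchestrator=object(), selection_func=lambda state: "writer")
valid = check_exactly_one_orchestrator(selection_func=lambda state: "writer")
print(missing, both, valid)  # True True selection_func
```

In the real suite the same check would be a pytest.raises(ValueError) block wrapped around the builder call.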
The full suite is now 83 tests, all offline, running in about 2 seconds. It covers the four original patterns, the two new ones, and the content / structured / approval helpers.
Running them
# Group Chat — writer ↔ critic round-robin, max 4 rounds
make group_chat PROMPT="Draft a one-paragraph product launch announcement."
# Magentic — planning manager + researcher + writer + critic
make magentic PROMPT="Plan a one-week curriculum for a beginner Python class."
Both honor the same MODEL= override the other workflows do. The Ollama default (granite4.1:3b) handles Group Chat fine. Magentic genuinely needs a stronger model — try qwen3.5:latest if you have it pulled, or point OLLAMA_MODEL= at a 7B+ model.
What this changes about the project
The orchestration patterns table in README.md now lists six rows (five MAF patterns + custom graph) instead of four. Two new modules — workflows/group_chat.py and workflows/magentic.py — each under 150 lines. Twelve new tests. Two new Makefile targets.
One piece of advice surfaced while reading the PDF and is worth repeating: try Group Chat before Magentic. Most of the time the simpler manager is enough, and the three loop caps in Magentic exist because the manager can get stuck. Save Magentic for the open-ended tasks that actually justify a planning loop.