The shape
A coordinator agent needs to fan out to N specialist agents in parallel. If any fails fatally, cancel the rest. If they all succeed, collect their results. Standard fan-out / fan-in shape.
golang.org/x/sync/errgroup handles this in ~15 lines.
The basic pattern
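import (
	"context"

	"golang.org/x/sync/errgroup"
)

// dispatchParallel runs every specialist concurrently and returns their
// results in order, or the first error encountered.
func dispatchParallel(ctx context.Context, query Query) ([]Result, error) {
	g, ctx := errgroup.WithContext(ctx)
	results := make([]Result, len(specialists))
	for i, agent := range specialists {
		i, agent := i, agent // capture per-iteration copies (needed before Go 1.22)
		g.Go(func() error {
			resp, err := agent.Run(ctx, query)
			if err != nil {
				return err
			}
			results[i] = resp // each goroutine writes only its own index, so no lock is needed
			return nil
		})
	}
	if err := g.Wait(); err != nil {
		return nil, err
	}
	return results, nil
}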
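errgroup.WithContext returns a derived context that’s cancelled the first time a function passed to g.Go returns a non-nil error (or when g.Wait returns, whichever comes first). The other in-flight goroutines see that cancellation through ctx, but only if they check it: agent.Run has to honour ctx for the early exit to happen. g.Wait then returns the first non-nil error.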
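Cancellation is cooperative, so it helps to see what a worker that honours ctx might look like. The following is a minimal sketch, not the article's agent code: slowWork is a hypothetical stand-in for agent.Run, and the example uses only the standard library so it runs without the errgroup module.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// slowWork is a hypothetical stand-in for agent.Run. It simulates a long
// operation but returns early with ctx.Err() if ctx is cancelled first.
func slowWork(ctx context.Context, d time.Duration) error {
	select {
	case <-time.After(d):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// cancelledEarly reports whether slowWork stops because of cancellation
// instead of waiting out its full duration.
func cancelledEarly() bool {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // simulate a sibling goroutine having failed already
	err := slowWork(ctx, time.Hour)
	return errors.Is(err, context.Canceled)
}

// finishesNormally reports whether slowWork succeeds when nothing cancels it.
func finishesNormally() bool {
	return slowWork(context.Background(), time.Millisecond) == nil
}

func main() {
	fmt.Println(cancelledEarly(), finishesNormally()) // prints: true true
}
```

A worker that never looks at ctx (for example, one that blocks in a plain time.Sleep) would keep running to completion even after a sibling failed, and g.Wait would block until it finished.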
The bounded variant
Without a limit, fan-out spawns N goroutines. For large N, that’s wasteful or hostile to downstream services. Bound concurrency with g.SetLimit:
g, ctx := errgroup.WithContext(ctx)
g.SetLimit(8) // at most 8 goroutines run at once
for _, item := range items {
	item := item // capture (needed before Go 1.22)
	g.Go(func() error {
		return process(ctx, item)
	})
}
if err := g.Wait(); err != nil {
	return err
}
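SetLimit was added to errgroup later than the original API; it’s the right default for any fan-out where N isn’t small. Once the limit is reached, g.Go blocks until a running goroutine finishes, so the launching loop is throttled too.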
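To make the effect of a limit concrete, the sketch below reproduces the idea with only the standard library: a buffered channel acts as a counting semaphore, and the launching loop blocks once every slot is taken. It illustrates the concept and is not errgroup's source; peakConcurrency and its parameters are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// peakConcurrency launches n short tasks with at most limit running at once
// and returns the highest number of tasks observed running simultaneously.
func peakConcurrency(n, limit int) int {
	sem := make(chan struct{}, limit) // one slot per allowed goroutine
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex // guards running and peak
		running int
		peak    int
	)
	for i := 0; i < n; i++ {
		sem <- struct{}{} // blocks here while limit tasks are already running
		wg.Add(1)
		go func() {
			defer wg.Done()
			defer func() { <-sem }() // free the slot once the task is done

			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			time.Sleep(time.Millisecond) // simulated work

			mu.Lock()
			running--
			mu.Unlock()
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	// The peak never exceeds the limit, however many tasks are queued.
	fmt.Println(peakConcurrency(50, 8) <= 8) // prints: true
}
```

Because the slot is taken before the goroutine starts and released only after the work finishes, the number of running tasks can never exceed the limit.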
The collect-some-errors variant
By default, the first error cancels everything. Sometimes you want to collect all the errors and decide afterwards:
g, ctx := errgroup.WithContext(ctx)
var (
	errs []error
	mu   sync.Mutex // guards errs
)
for _, agent := range specialists {
	agent := agent // capture (needed before Go 1.22)
	g.Go(func() error {
		// Run returns (Result, error); result handling is omitted here.
		if _, err := agent.Run(ctx, query); err != nil {
			mu.Lock()
			errs = append(errs, err)
			mu.Unlock()
		}
		return nil // don't propagate; collect instead
	})
}
_ = g.Wait() // always nil here, because every function returns nil
if len(errs) > 0 {
	return nil, fmt.Errorf("%d agents failed: %w", len(errs), errors.Join(errs...))
}
errors.Join (Go 1.20+) wraps multiple errors into one. Useful for the “some agents failed; tell me about all of them” reporting case.
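One useful property of errors.Join is that the combined error still matches each of its parts with errors.Is, even through the extra fmt.Errorf wrapper. A minimal sketch follows; errTimeout, summarize, and hasTimeout are hypothetical names used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errTimeout is a hypothetical sentinel error an agent might return.
var errTimeout = errors.New("agent timed out")

// summarize folds the collected errors into one, or returns nil if there
// were none.
func summarize(errs []error) error {
	if len(errs) == 0 {
		return nil
	}
	return fmt.Errorf("%d agents failed: %w", len(errs), errors.Join(errs...))
}

// hasTimeout reports whether any of the joined errors is errTimeout.
func hasTimeout(err error) bool {
	return errors.Is(err, errTimeout)
}

func main() {
	err := summarize([]error{errTimeout, errors.New("bad input")})
	fmt.Println(hasTimeout(err))            // prints: true
	fmt.Println(hasTimeout(summarize(nil))) // prints: false
}
```

This lets the caller decide afterwards, for example retrying only when every failure was a timeout.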
The pitfall: shared state
The classic mistake — appending to a shared slice without a lock:
// WRONG: concurrent append, data race
var results []Result
g.Go(func() error {
	r, err := agent.Run(ctx, query)
	if err != nil {
		return err
	}
	results = append(results, r) // RACE: unsynchronised write to a shared slice
	return nil
})
Fix: pre-allocate and write by index (first pattern) or guard the slice with a mutex (collect pattern). The race detector (go test -race) catches this if your tests exercise the parallel path.
The pitfall: outer context not propagated
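errgroup cancels only the derived context it returns, never the parent you passed in. A worker that uses the parent context never sees a sibling’s failure: it keeps running until it finishes on its own, and g.Wait blocks until it does. Always pass the derived context to your worker goroutines (the patterns above do, by shadowing ctx); don’t use the parent context inside the worker. The reverse also matters: the derived context is cancelled once g.Wait returns, so don’t reuse it for work that happens after the group is done.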
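The direction of cancellation can be checked with the standard library alone. This sketch uses context.WithCancel as a stand-in for the derived context that errgroup.WithContext returns (an assumption made so the example runs without the errgroup module); cancelStates is a hypothetical helper.

```go
package main

import (
	"context"
	"fmt"
)

// cancelStates cancels a derived context and reports whether the derived
// and the parent contexts are done afterwards.
func cancelStates() (derivedDone, parentDone bool) {
	parent := context.Background()
	derived, cancel := context.WithCancel(parent)
	cancel() // stands in for the group cancelling after the first error

	return derived.Err() != nil, parent.Err() != nil
}

func main() {
	derivedDone, parentDone := cancelStates()
	// Cancellation flows from parent to child, never the other way round.
	fmt.Println(derivedDone, parentDone) // prints: true false
}
```

A worker holding the parent context is looking at the value that stays false, which is why it never notices that the group has given up.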
What Genie’s coordinator does
agents/financial_supervisor fans out to specialists (analyzer, forecaster, anomaly_detector, recommender) in parallel with the bounded variant. Limit of 6. First error from any specialist cancels the rest — the supervisor has nothing useful to do without all specialists’ input.
The pattern shows up everywhere a multi-agent system parallelises. Worth knowing exactly; worth not over-engineering past errgroup + SetLimit for most cases.
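For truly large fan-outs (thousands of items), errgroup with SetLimit isn’t ideal: it still starts one goroutine per item, even though only a limit’s worth run at once. A worker-pool pattern, with a fixed set of workers reading from a job channel, avoids that per-item overhead. As a rough rule of thumb, the threshold is around 1000 goroutines; below that, errgroup wins on readability.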
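For reference, a worker pool with a job channel might look roughly like the sketch below. It uses only the standard library and keeps the first-error-cancels-the-rest behaviour; runPool, sumWithPool, and their parameters are hypothetical names, not part of errgroup or of Genie.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
)

// runPool processes items with a fixed number of worker goroutines.
// The first error cancels the shared context and is returned.
func runPool(ctx context.Context, workers int, items []int,
	process func(context.Context, int) error) error {

	ctx, cancel := context.WithCancel(ctx)
	defer cancel()

	jobs := make(chan int)
	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for item := range jobs {
				if err := process(ctx, item); err != nil {
					once.Do(func() { // record only the first error
						firstErr = err
						cancel()
					})
				}
			}
		}()
	}

feed:
	for _, item := range items {
		select {
		case jobs <- item:
		case <-ctx.Done():
			break feed // stop handing out work after a failure
		}
	}
	close(jobs)
	wg.Wait()

	if firstErr != nil {
		return firstErr
	}
	return ctx.Err() // non-nil only if the caller's context was cancelled
}

// sumWithPool adds 1..n using the pool and returns the total, or -1 on error.
func sumWithPool(n, workers int) int64 {
	items := make([]int, n)
	for i := range items {
		items[i] = i + 1
	}
	var total int64
	err := runPool(context.Background(), workers, items,
		func(_ context.Context, item int) error {
			atomic.AddInt64(&total, int64(item))
			return nil
		})
	if err != nil {
		return -1
	}
	return total
}

func main() {
	fmt.Println(sumWithPool(100, 4)) // prints: 5050
}
```

The number of goroutines here is fixed at workers regardless of how many items there are, which is the property that matters at large N. The cost is noticeably more code than the errgroup version.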