Go Goroutine Leak Debugging: 5 Structured Concurrency Patterns to Eliminate Leaks in 2026
Have You Ever Encountered This Mysterious Scenario?
Your production service runs fine for a few hours, then memory usage keeps climbing, CPU hits 90%, and when you check pprof — thousands of goroutines are stuck on channel receives, waiting for data that never arrives. After a restart, everything returns to normal, but the issue reappears hours later. This is the classic goroutine leak, the most insidious and deadly bug in Go concurrent programming.
What's worse, goroutine leaks don't throw errors directly. They slowly consume system resources like a chronic poison until the OOM Killer steps in. It's 2026 — we should no longer write concurrent code with "fire-and-forget" goroutines. Structured Concurrency is the solution.
What Is Structured Concurrency?
Structured concurrency originated from Martin Odersky's 2018 proposal. The core idea: the lifecycle of concurrent subtasks must be strictly controlled by the parent task. Just as structured programming eliminated the chaos of goto, structured concurrency eliminates the goroutine leaks caused by fire-and-forget.
| Feature | Unstructured Concurrency | Structured Concurrency |
|---|---|---|
| Lifecycle | Uncontrollable, may leak | Subtasks exit when parent exits |
| Error Propagation | Manual handling required | Automatic upward propagation |
| Resource Cleanup | No guarantee | Guaranteed cleanup |
| Debugging Difficulty | Very high | Traceable |
| Representative Pattern | go func(){} |
errgroup, Context cancellation |
While Go doesn't have language-level structured concurrency support, we can build a complete structured concurrency system using standard libraries like context, errgroup, and sync.
5 Root Causes of Goroutine Leaks
1. Sending to Unbuffered Channel with No Receiver
func leak1() {
ch := make(chan int)
go func() {
ch <- 42 // Blocks forever, no receiver
}()
// Function returns, goroutine leaks
}
2. Receiving from Never-Closed Channel
func leak2() {
ch := make(chan int, 10)
go func() {
for v := range ch { // ch never closes, blocks forever
fmt.Println(v)
}
}()
}
3. Context Not Cancelled
func leak3(ctx context.Context) {
ctx2 := context.Background() // Doesn't inherit cancellation signal
go func() {
select {
case <-ctx2.Done(): // Never triggers
case <-time.After(10 * time.Hour):
}
}()
}
4. WaitGroup Count Mismatch
func leak4() {
var wg sync.WaitGroup
wg.Add(5)
for i := 0; i < 3; i++ { // Only 3 started, but Add was 5
go func() { wg.Done() }()
}
wg.Wait() // Blocks forever
}
5. HTTP Response Body Not Closed
func leak5() {
resp, _ := http.Get("https://example.com")
// defer resp.Body.Close() // Forgot to close!
// Causes underlying transport goroutine leak
}
Pattern 1: errgroup — Concurrency with Error Propagation
errgroup is the most fundamental building block of structured concurrency. It guarantees: all sub-goroutines complete (or when any fails, the rest are cancelled), then the parent task continues.
Step-by-Step Guide
Step 1: Install errgroup
go get golang.org/x/sync/errgroup
Step 2: Basic usage — concurrent API requests
package main
import (
"context"
"fmt"
"net/http"
"time"
"golang.org/x/sync/errgroup"
)
func fetchAPI(ctx context.Context, url string) ([]byte, error) {
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
if err != nil {
return nil, err
}
resp, err := http.DefaultClient.Do(req)
if err != nil {
return nil, err
}
defer resp.Body.Close()
return nil, nil
}
func fetchAll(ctx context.Context, urls []string) error {
g, ctx := errgroup.WithContext(ctx)
results := make([][]byte, len(urls))
for i, url := range urls {
i, url := i, url
g.Go(func() error {
data, err := fetchAPI(ctx, url)
if err != nil {
return err
}
results[i] = data
return nil
})
}
if err := g.Wait(); err != nil {
return fmt.Errorf("fetch failed: %w", err)
}
return nil
}
func main() {
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
urls := []string{
"https://api.example.com/users",
"https://api.example.com/orders",
"https://api.example.com/products",
}
if err := fetchAll(ctx, urls); err != nil {
fmt.Println("Error:", err)
}
}
Key Point: The ctx created by errgroup.WithContext is automatically cancelled when any goroutine returns an error, and other goroutines sense this via ctx.Done() and exit.
Pattern 2: Semaphore — Limit Concurrency
When you need to control concurrency (e.g., limit DB connections), use the semaphore package.
Complete Code
package main
import (
"context"
"fmt"
"sync/atomic"
"time"
"golang.org/x/sync/errgroup"
"golang.org/x/sync/semaphore"
)
func processWithLimit(ctx context.Context, tasks []string, maxConcurrency int64) error {
sem := semaphore.NewWeighted(maxConcurrency)
g, ctx := errgroup.WithContext(ctx)
var activeCount int64
for _, task := range tasks {
task := task
if err := sem.Acquire(ctx, 1); err != nil {
return fmt.Errorf("acquire semaphore: %w", err)
}
g.Go(func() error {
defer sem.Release(1)
current := atomic.AddInt64(&activeCount, 1)
defer atomic.AddInt64(&activeCount, -1)
fmt.Printf("Processing %s (active: %d/%d)\n", task, current, maxConcurrency)
select {
case <-time.After(500 * time.Millisecond):
return nil
case <-ctx.Done():
return ctx.Err()
}
})
}
return g.Wait()
}
func main() {
ctx := context.Background()
tasks := make([]string, 20)
for i := range tasks {
tasks[i] = fmt.Sprintf("task-%d", i)
}
if err := processWithLimit(ctx, tasks, 5); err != nil {
fmt.Println("Error:", err)
}
}
Pattern 3: Pipeline — Stream Processing
The Pipeline pattern flows data through multiple stages, each stage being a set of goroutines connected by channels.
Complete Code
package main
import (
"context"
"fmt"
"sync"
"golang.org/x/sync/errgroup"
)
func generate(ctx context.Context, nums ...int) <-chan int {
out := make(chan int)
go func() {
defer close(out)
for _, n := range nums {
select {
case out <- n:
case <-ctx.Done():
return
}
}
}()
return out
}
func square(ctx context.Context, in <-chan int) <-chan int {
out := make(chan int)
go func() {
defer close(out)
for n := range in {
select {
case out <- n * n:
case <-ctx.Done():
return
}
}
}()
return out
}
func filter(ctx context.Context, in <-chan int, predicate func(int) bool) <-chan int {
out := make(chan int)
go func() {
defer close(out)
for n := range in {
if predicate(n) {
select {
case out <- n:
case <-ctx.Done():
return
}
}
}
}()
return out
}
func runPipeline(ctx context.Context) error {
g, ctx := errgroup.WithContext(ctx)
var mu sync.Mutex
results := []int{}
ch := generate(ctx, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
ch = square(ctx, ch)
ch = filter(ctx, ch, func(n int) bool { return n > 20 })
g.Go(func() error {
for n := range ch {
mu.Lock()
results = append(results, n)
mu.Unlock()
}
return nil
})
if err := g.Wait(); err != nil {
return err
}
fmt.Println("Results:", results)
return nil
}
func main() {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
runPipeline(ctx)
}
Pattern 4: Fan-out/Fan-in — Distribute and Aggregate
Distribute tasks to multiple workers (fan-out), then aggregate results (fan-in).
Complete Code
package main
import (
"context"
"fmt"
"sync"
"golang.org/x/sync/errgroup"
)
func fanOut(ctx context.Context, in <-chan int, workerCount int) []<-chan int {
channels := make([]<-chan int, workerCount)
for i := 0; i < workerCount; i++ {
channels[i] = processWorker(ctx, in, i)
}
return channels
}
func processWorker(ctx context.Context, in <-chan int, id int) <-chan int {
out := make(chan int)
go func() {
defer close(out)
for n := range in {
result := n * n
select {
case out <- result:
case <-ctx.Done():
return
}
}
}()
return out
}
func fanIn(ctx context.Context, channels ...<-chan int) <-chan int {
out := make(chan int)
var wg sync.WaitGroup
wg.Add(len(channels))
for _, ch := range channels {
ch := ch
go func() {
defer wg.Done()
for n := range ch {
select {
case out <- n:
case <-ctx.Done():
return
}
}
}()
}
go func() {
wg.Wait()
close(out)
}()
return out
}
func main() {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
in := make(chan int, 10)
for i := 1; i <= 10; i++ {
in <- i
}
close(in)
workerChs := fanOut(ctx, in, 3)
merged := fanIn(ctx, workerChs...)
for result := range merged {
fmt.Println(result)
}
}
Pattern 5: Worker Pool — Controlled Work Pool
The most versatile structured concurrency pattern, strictly controlling goroutine count and lifecycle.
Complete Code
package main
import (
"context"
"fmt"
"sync"
"time"
)
type Task struct {
ID int
Input string
}
type Result struct {
TaskID int
Output string
Err error
}
type WorkerPool struct {
workerCount int
tasks chan Task
results chan Result
wg sync.WaitGroup
cancel context.CancelFunc
}
func NewWorkerPool(ctx context.Context, workerCount, taskBufSize int) *WorkerPool {
ctx, cancel := context.WithCancel(ctx)
pool := &WorkerPool{
workerCount: workerCount,
tasks: make(chan Task, taskBufSize),
results: make(chan Result, taskBufSize),
cancel: cancel,
}
pool.start(ctx)
return pool
}
func (p *WorkerPool) start(ctx context.Context) {
for i := 0; i < p.workerCount; i++ {
p.wg.Add(1)
go func(workerID int) {
defer p.wg.Done()
for {
select {
case task, ok := <-p.tasks:
if !ok {
return
}
result := p.process(ctx, workerID, task)
select {
case p.results <- result:
case <-ctx.Done():
return
}
case <-ctx.Done():
return
}
}
}(i)
}
}
func (p *WorkerPool) process(ctx context.Context, workerID int, task Task) Result {
select {
case <-time.After(100 * time.Millisecond):
return Result{
TaskID: task.ID,
Output: fmt.Sprintf("worker-%d processed: %s", workerID, task.Input),
}
case <-ctx.Done():
return Result{TaskID: task.ID, Err: ctx.Err()}
}
}
func (p *WorkerPool) Submit(task Task) { p.tasks <- task }
func (p *WorkerPool) Results() <-chan Result { return p.results }
func (p *WorkerPool) Shutdown() {
close(p.tasks)
p.wg.Wait()
close(p.results)
}
func (p *WorkerPool) Cancel() { p.cancel() }
func main() {
ctx := context.Background()
pool := NewWorkerPool(ctx, 4, 100)
go func() {
for i := 0; i < 20; i++ {
pool.Submit(Task{ID: i, Input: fmt.Sprintf("data-%d", i)})
}
pool.Shutdown()
}()
for result := range pool.Results() {
fmt.Printf("Task %d: %s (err: %v)\n", result.TaskID, result.Output, result.Err)
}
}
Pitfall Guide
Pitfall 1: Swallowing Errors in errgroup
// ❌ Wrong: returning nil, error is swallowed
g.Go(func() error {
if err := doWork(); err != nil {
log.Println(err)
return nil
}
return nil
})
// ✅ Correct: must return the error
g.Go(func() error {
return doWork()
})
Pitfall 2: Not Propagating Context
// ❌ Wrong: using context.Background(), cancellation signal lost
g.Go(func() error {
return fetch(context.Background(), url)
})
// ✅ Correct: use errgroup's ctx
g.Go(func() error {
return fetch(ctx, url)
})
Pitfall 3: Loop Variable Capture Error
// ❌ Wrong: Before Go 1.22, i is a loop variable reference
for i := 0; i < 10; i++ {
g.Go(func() error {
return process(i) // i might all be 10
})
}
// ✅ Correct: Go 1.22+ fixes this automatically, or manually bind
for i := 0; i < 10; i++ {
i := i
g.Go(func() error {
return process(i)
})
}
Pitfall 4: Channel Not Closed Causing Range Deadlock
// ❌ Wrong: producer doesn't close channel
func produce() <-chan int {
ch := make(chan int)
go func() {
for i := 0; i < 10; i++ {
ch <- i
}
// forgot close(ch)
}()
return ch
}
// ✅ Correct: must defer close
func produce() <-chan int {
ch := make(chan int)
go func() {
defer close(ch)
for i := 0; i < 10; i++ {
ch <- i
}
}()
return ch
}
Pitfall 5: Semaphore Used Outside errgroup Causing Leak
// ❌ Wrong: Acquire outside g.Go, semaphore occupied but goroutine not started
sem.Acquire(ctx, 1)
g.Go(func() error {
defer sem.Release(1)
return work()
})
// ✅ Correct: Acquire inside g.Go
g.Go(func() error {
if err := sem.Acquire(ctx, 1); err != nil {
return err
}
defer sem.Release(1)
return work()
})
Error Troubleshooting
| # | Error Message | Cause | Solution |
|---|---|---|---|
| 1 | fatal error: all goroutines are asleep - deadlock! |
All goroutines blocked on channel ops | Check channel has producer/consumer, ensure channel will be closed |
| 2 | context deadline exceeded |
Operation timeout, context cancelled | Increase timeout or optimize performance, check for slow queries |
| 3 | errgroup: Go called after Wait |
Go called after Wait | Ensure all Go calls complete before Wait |
| 4 | panic: sync: WaitGroup is reused before previous Wait has returned |
WaitGroup misuse | Create new WaitGroup each time, or ensure previous Wait completes |
| 5 | panic: close of closed channel |
Double close on channel | Use sync.Once to protect close, or have one goroutine responsible |
| 6 | panic: send on closed channel |
Send after channel closed | Ensure sender knows when channel closes, use select to detect |
| 7 | goroutine profile: total XXX (count keeps growing) |
Goroutine leak | Use pprof to investigate, check non-exiting goroutine stacks |
| 8 | runtime: out of memory |
Too many goroutines exhausting memory | Limit concurrency, use worker pool or semaphore |
| 9 | signal: killed (OOM Killer) |
System out of memory | Reduce goroutine count, increase memory limit or optimize usage |
| 10 | import cycle not allowed |
Circular dependency in concurrent code | Reorganize package structure, extract shared types to separate package |
Advanced Optimization
1. Monitor Goroutine Count with runtime/pprof
go func() {
ticker := time.NewTicker(30 * time.Second)
defer ticker.Stop()
for range ticker.C {
fmt.Printf("Goroutines: %d\n", runtime.NumGoroutine())
}
}()
2. Add Timeout Protection for Each Goroutine
func safeGoroutine(ctx context.Context, fn func() error) error {
ctx, cancel := context.WithTimeout(ctx, 30*time.Second)
defer cancel()
done := make(chan error, 1)
go func() { done <- fn() }()
select {
case err := <-done:
return err
case <-ctx.Done():
return fmt.Errorf("goroutine timed out: %w", ctx.Err())
}
}
3. Use go-leak Test Framework to Detect Leaks
import "go.uber.org/goleak"
func TestMain(m *testing.M) {
goleak.VerifyTestMain(m)
}
4. Structured Logging with Goroutine Correlation
func tracedGoroutine(ctx context.Context, name string, fn func(context.Context) error) error {
logger := slog.With("goroutine", name, "traceID", ctx.Value("traceID"))
logger.Info("goroutine started")
err := fn(ctx)
logger.Info("goroutine finished", "error", err)
return err
}
Comparison Analysis
| Dimension | errgroup | Semaphore | Pipeline | Fan-out/Fan-in | Worker Pool |
|---|---|---|---|---|---|
| Concurrency Control | Unlimited | Limited | Staged serial | Unlimited | Fixed workers |
| Error Handling | Auto propagate | Manual | Manual | Manual | Manual |
| Use Case | Concurrent requests | Rate limiting | Stream processing | Compute-intensive | General tasks |
| Complexity | Low | Medium | Medium | High | High |
| Goroutine Safe | ✅ | ✅ | ⚠️ Care needed | ⚠️ Care needed | ✅ |
| Backpressure | ❌ | ✅ | ✅ | ❌ | ✅ |
| Cancel Propagation | ✅ Auto | ✅ Auto | ⚠️ Manual | ⚠️ Manual | ✅ |
Summary: The essence of goroutine leaks is "uncontrolled concurrency" — subtask lifecycles escape parent task control. Structured concurrency ensures concurrency safety through five patterns: errgroup (error propagation), semaphore (concurrency limiting), pipeline (stream processing), fan-out/fan-in (distribute and aggregate), and worker pool (controlled pooling). In 2026, say goodbye to bare
go func(){}and embrace structured concurrency.
Recommended Online Tools
- JSON Formatter: /en/json/format
- Base64 Encode/Decode: /en/encode/base64
- cURL to Code: /en/dev/curl-to-code
- Go Playground: /en/dev/go-playground
Try these browser-local tools — no sign-up required →