---
title: "Worker Pool Isolation Pattern"
description: "Separate worker pools per task type so a slow or failing dependency can't starve unrelated work — the bulkhead pattern applied to concurrent processing."
kind: snippet
maturity: seedling
confidence: medium
origin: ai-drafted
author: "Agent"
directedBy: "krow"
tags: [architecture, concurrency]
published: 2026-04-07
modified: 2026-04-21
wordCount: 470
readingTime: 3
related: [pipeline-stage-communication, go-dns-scanner-4000qps, aimd-rate-limiting]
url: https://krowdev.com/snippet/worker-pool-isolation/
---
## Agent Context

- Canonical: https://krowdev.com/snippet/worker-pool-isolation/
- Markdown: https://krowdev.com/snippet/worker-pool-isolation.md
- Full corpus: https://krowdev.com/llms-full.txt
- Kind: snippet
- Maturity: seedling
- Confidence: medium
- Origin: ai-drafted
- Author: Agent
- Directed by: krow
- Published: 2026-04-07
- Modified: 2026-04-21
- Words: 470 (3 min read)
- Tags: architecture, concurrency
- Related: pipeline-stage-communication, go-dns-scanner-4000qps, aimd-rate-limiting
- Content map:
  - h2: The Problem
  - h2: The Fix: One Pool Per Concern
  - h2: Sizing
  - h2: Key Details
  - h2: When to Use This
  - h2: Sources
- Crawl policy: same canonical content is exposed through HTML, Markdown, and llms-full; no crawler-specific content gate.

Run different categories of work in separate, bounded pools. A spike in one category can't starve the others. This pairs naturally with [pipeline stage communication](/snippet/pipeline-stage-communication/) — each stage gets its own pool. For rate-sensitive pools, add [AIMD rate limiting](/note/aimd-rate-limiting/).

## The Problem

A single shared worker pool handles API calls, file processing, and database writes. The API starts responding slowly. Workers pile up waiting on API responses. File processing and database writes — which are fine — queue behind them and stall. One slow dependency takes down everything.

This is the same failure mode the [Go DNS scanner](/article/go-dns-scanner-4000qps/) had to avoid when network-bound probes and local parsing shared the same concurrency budget.

## The Fix: One Pool Per Concern

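```go
package main

// Job is one unit of work (assumed here to be a plain func).
type Job func()

type WorkerPool struct {
    name    string
    workers int
    queue   chan Job      // bounded queue: a full buffer is the backpressure signal
    sem     chan struct{} // bounds concurrency
}

func NewPool(name string, workers, queueSize int) *WorkerPool {
    p := &WorkerPool{
        name:    name,
        workers: workers,
        queue:   make(chan Job, queueSize),
        sem:     make(chan struct{}, workers),
    }
    go p.run()
    return p
}

// run dispatches queued jobs, with at most p.workers running at once.
func (p *WorkerPool) run() {
    for job := range p.queue {
        p.sem <- struct{}{} // wait for a free worker slot
        go func(j Job) {
            defer func() { <-p.sem }() // release the slot
            j()
        }(job)
    }
}

func main() {
    pools := map[string]*WorkerPool{
        "api":   NewPool("api", 10, 100),
        "files": NewPool("files", 4, 50),
        "db":    NewPool("db", 8, 200),
    }
    pools["files"].queue <- func() { /* process a file */ }
}
```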

The API pool fills up? The file and database pools keep moving. Each pool has its own concurrency limit and backpressure via its own queue.
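
To make the isolation concrete, here is a minimal sketch, assuming a stripped-down `pool` type that is hypothetical and not the scanner's actual code. It stalls every worker in one pool and checks that a job on a second pool still completes. The `stall` channel stands in for a dependency that has stopped responding.

```go
package main

import (
    "fmt"
    "time"
)

// pool is a hypothetical minimal pool: fixed workers draining a bounded queue.
type pool struct {
    queue chan func()
}

func newPool(workers, queueSize int) *pool {
    p := &pool{queue: make(chan func(), queueSize)}
    for i := 0; i < workers; i++ {
        go func() {
            for job := range p.queue {
                job()
            }
        }()
    }
    return p
}

// filesFinishWhileAPIStalls blocks every api worker, then reports whether
// a job on the separate files pool still completes within one second.
func filesFinishWhileAPIStalls() bool {
    api := newPool(2, 4)
    files := newPool(2, 4)

    stall := make(chan struct{}) // stands in for an unresponsive dependency
    defer close(stall)           // release the stuck workers on return
    for i := 0; i < 2; i++ {
        api.queue <- func() { <-stall }
    }

    done := make(chan struct{})
    files.queue <- func() { close(done) }

    select {
    case <-done:
        return true
    case <-time.After(time.Second):
        return false
    }
}

func main() {
    fmt.Println("files pool unaffected:", filesFinishWhileAPIStalls())
}
```

If both kinds of job went through a single two-worker pool instead, the file job would sit behind the stalled API jobs until the one-second timeout fired.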

## Sizing

| Pool | Size by | Watch for |
|------|---------|-----------|
| I/O-bound (API calls, network) | Number of connections you can sustain | Queue depth growing = upstream is slow |
| CPU-bound (parsing, transforms) | Number of cores | Throughput stays flat as workers are added = pool is larger than the cores can use |
| External writes (DB, storage) | Connection pool limit of the backend | Timeouts = reduce pool or batch writes |

Start small, measure, increase. An oversized pool adds contention (context switches, lock waits, connection pressure on the backend) without adding throughput.
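
One way to turn the table into starting values is sketched below. The `Kind` type, the `initialWorkers` helper, and the "half the limit" starting point are all assumptions made for illustration; the only fixed rule from the table is that CPU-bound pools follow the core count.

```go
package main

import (
    "fmt"
    "runtime"
)

// Kind labels a pool by the resource that limits it (names are illustrative).
type Kind int

const (
    IOBound Kind = iota
    CPUBound
    ExternalWrite
)

// initialWorkers returns a starting worker count. limit is the number of
// connections the dependency or backend can sustain; it is ignored for
// CPU-bound pools.
func initialWorkers(kind Kind, limit int) int {
    if kind == CPUBound {
        return runtime.GOMAXPROCS(0) // cores the Go scheduler may use
    }
    // Assumption: begin at half the limit, then raise it after measuring.
    n := limit / 2
    if n < 1 {
        n = 1
    }
    return n
}

func main() {
    fmt.Println("api:  ", initialWorkers(IOBound, 20))
    fmt.Println("parse:", initialWorkers(CPUBound, 0))
    fmt.Println("db:   ", initialWorkers(ExternalWrite, 16))
}
```

The returned numbers are only a first guess. The "Watch for" column decides whether to move them up or down.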

## Key Details

**Bounded queues, not unbounded.** An unbounded queue hides backpressure — memory grows silently until the process crashes. Use a buffered channel or ring buffer with a hard cap. When the queue is full, reject or apply backpressure to the caller.
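
A sketch of the "reject when full" option, using a buffered channel and a non-blocking send. `boundedQueue`, `TrySubmit`, and `ErrQueueFull` are hypothetical names introduced for this example.

```go
package main

import (
    "errors"
    "fmt"
)

// ErrQueueFull is a hypothetical sentinel error for a saturated pool.
var ErrQueueFull = errors.New("pool queue full")

// boundedQueue wraps a buffered channel; its capacity is the hard cap.
type boundedQueue struct {
    jobs chan func()
}

func newBoundedQueue(capacity int) *boundedQueue {
    return &boundedQueue{jobs: make(chan func(), capacity)}
}

// TrySubmit enqueues without blocking and rejects when the buffer is full.
func (q *boundedQueue) TrySubmit(job func()) error {
    select {
    case q.jobs <- job:
        return nil
    default:
        return ErrQueueFull
    }
}

func main() {
    q := newBoundedQueue(2) // no workers attached, so the queue fills up
    for i := 1; i <= 3; i++ {
        fmt.Println("submit", i, "->", q.TrySubmit(func() {}))
    }
}
```

With a capacity of 2 and nothing draining the queue, the third submission is rejected. Replacing the `select` with a plain `q.jobs <- job` gives the other option: the caller blocks until a slot frees up, which pushes backpressure upstream instead of dropping work.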

**Per-pool timeouts.** API calls might need a 30-second timeout. File operations might need 5 seconds. A shared timeout is wrong for both. Set deadlines per pool based on the expected latency profile of that work type.
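
Per-pool deadlines could be wired up with `context.WithTimeout`, as in the sketch below. `timedPool`, `sleepJob`, and `runDemo` are hypothetical, and the durations are scaled down to milliseconds so the demo finishes quickly; real values would follow the latency profile of each work type.

```go
package main

import (
    "context"
    "fmt"
    "time"
)

// timedPool carries the deadline for one category of work.
type timedPool struct {
    name    string
    timeout time.Duration
}

// do runs job with a context that expires after the pool's timeout.
// The job must honor ctx; the pool cannot forcibly stop it.
func (p timedPool) do(job func(ctx context.Context) error) error {
    ctx, cancel := context.WithTimeout(context.Background(), p.timeout)
    defer cancel()
    return job(ctx)
}

// sleepJob simulates work that takes d unless the context expires first.
func sleepJob(d time.Duration) func(context.Context) error {
    return func(ctx context.Context) error {
        select {
        case <-time.After(d):
            return nil
        case <-ctx.Done():
            return ctx.Err()
        }
    }
}

// runDemo sends the same 50 ms job through both pools and returns each result.
func runDemo() (apiErr, filesErr error) {
    api := timedPool{name: "api", timeout: 200 * time.Millisecond}
    files := timedPool{name: "files", timeout: 20 * time.Millisecond}
    work := sleepJob(50 * time.Millisecond)
    return api.do(work), files.do(work)
}

func main() {
    apiErr, filesErr := runDemo()
    fmt.Println("api:  ", apiErr)
    fmt.Println("files:", filesErr)
}
```

The same job succeeds under the longer deadline and fails with `context.DeadlineExceeded` under the shorter one, which is why a single shared timeout cannot fit both pools.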

**Monitor each pool independently.** Track queue depth, active workers, completion rate, and error rate per pool. A healthy aggregate hides a sick pool.
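
A minimal sketch of per-pool counters, assuming Go 1.19 or later for the typed atomics in `sync/atomic`. `meteredPool`, `Stats`, and `errUpstream` are hypothetical names. Completion and error rates are not stored directly; they would come from sampling `Snapshot` at intervals and diffing the counters.

```go
package main

import (
    "errors"
    "fmt"
    "sync"
    "sync/atomic"
)

var errUpstream = errors.New("upstream error") // placeholder failure

// Stats is a point-in-time snapshot of one pool.
type Stats struct {
    QueueDepth, Active int
    Completed, Failed  int64
}

type meteredPool struct {
    name      string
    queue     chan func() error
    active    atomic.Int64
    completed atomic.Int64
    failed    atomic.Int64
    wg        sync.WaitGroup
}

func newMeteredPool(name string, workers, queueSize int) *meteredPool {
    p := &meteredPool{name: name, queue: make(chan func() error, queueSize)}
    for i := 0; i < workers; i++ {
        p.wg.Add(1)
        go func() {
            defer p.wg.Done()
            for job := range p.queue {
                p.active.Add(1)
                err := job()
                p.active.Add(-1)
                if err != nil {
                    p.failed.Add(1)
                } else {
                    p.completed.Add(1)
                }
            }
        }()
    }
    return p
}

// Snapshot reads the counters; len on a buffered channel is its current depth.
func (p *meteredPool) Snapshot() Stats {
    return Stats{
        QueueDepth: len(p.queue),
        Active:     int(p.active.Load()),
        Completed:  p.completed.Load(),
        Failed:     p.failed.Load(),
    }
}

// drain closes the queue and waits for in-flight jobs (used by the demo only).
func (p *meteredPool) drain() {
    close(p.queue)
    p.wg.Wait()
}

func main() {
    api := newMeteredPool("api", 2, 8)
    for i := 0; i < 5; i++ {
        fail := i%2 == 1
        api.queue <- func() error {
            if fail {
                return errUpstream
            }
            return nil
        }
    }
    api.drain()
    fmt.Printf("%s: %+v\n", api.name, api.Snapshot())
}
```

Because every pool owns its counters, a rising `Failed` count or a growing `QueueDepth` on the API pool stays visible even when the file and database pools look healthy.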

## When to Use This

- Multiple dependency types with different latency profiles
- Any system where one slow path shouldn't block unrelated fast paths
- Worker counts that need independent tuning per workload

This is the bulkhead pattern from ship design — compartments that prevent a hull breach from flooding the entire vessel. Same idea, applied to goroutines.

## Sources

- Microsoft Learn, [Bulkhead pattern](https://learn.microsoft.com/en-us/azure/architecture/patterns/bulkhead)
- The Go Blog, [Go Concurrency Patterns: Pipelines and cancellation](https://go.dev/blog/pipelines)