
Your 1M RPS Service is a Ticking Time Bomb

“Goroutines are cheap.” Every Go developer has heard this. And for common use cases, it’s true. But when you start aiming for 1 Million Requests Per Second, that saying becomes dangerous.

I ran some tests to see exactly when the “Naive” approach to Go concurrency falls apart and how an adaptive worker pool changes the game.

I used a 10-core machine and pushed 1 Million requests per second for 30 seconds. Each request was designed to be somewhat realistic: 20ms of processing and a 500KB memory allocation (like handling a decoded JSON object).
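Each simulated request looked roughly like the process function below. This is my reconstruction of the described workload, not the benchmark's exact code: 20ms of work holding a 500KB buffer.

import (
    "context"
    "time"
)

// process simulates one request: allocate ~500KB (as if a JSON payload
// had just been decoded) and hold it for 20ms of "work".
func process(ctx context.Context) {
    payload := make([]byte, 500*1024) // the 500KB allocation

    select {
    case <-time.After(20 * time.Millisecond): // the 20ms of processing
    case <-ctx.Done(): // bail out early if the request is cancelled
    }

    _ = payload // keep the buffer live until the work finishes
}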

The naive approach:

func handle(w http.ResponseWriter, r *http.Request) {
    go process(r.Context()) // Just spawn and hope
}

The adaptive pool approach:

pool, _ := adaptivepool.New(
    adaptivepool.WithMaxWorkers(5000),  // hard cap on concurrent workers
    adaptivepool.WithQueueSize(100000), // bounded backlog; beyond this, reject
)

// ... in the handler
err := pool.Submit(ctx, job)
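In the handler, a rejected submission becomes an immediate error response. The sketch below makes two assumptions that you should check against the library's actual API: the shape of the job argument, and that Submit returns a non-nil error when the queue is full.

func handle(w http.ResponseWriter, r *http.Request) {
    job := func(ctx context.Context) { process(ctx) } // assumed job signature
    if err := pool.Submit(r.Context(), job); err != nil {
        // Pool saturated: shed load with a fast 503 instead of letting
        // latency build up for everyone.
        http.Error(w, "overloaded, try again later", http.StatusServiceUnavailable)
        return
    }
    w.WriteHeader(http.StatusAccepted)
}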

The naive version immediately tried to spawn over 100,000 goroutines. The result was a mess: the CPU spent more time on context switching and garbage collection than on actual work.
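You can watch this meltdown happen on your own service with nothing but the standard library; a minimal monitoring sketch (the one-second interval is arbitrary):

import (
    "log"
    "runtime"
    "time"
)

// watch logs goroutine count and GC pressure once per second, so you
// can see the scheduler and the collector drowning under load.
func watch() {
    var m runtime.MemStats
    for range time.Tick(time.Second) {
        runtime.ReadMemStats(&m)
        log.Printf("goroutines=%d heap=%dMB gc_cpu=%.1f%%",
            runtime.NumGoroutine(),
            m.HeapAlloc/(1024*1024),
            m.GCCPUFraction*100)
    }
}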

With the adaptive pool, even though we were rejecting millions of incoming requests, we completed 4.6 times as many jobs as the naive version. By capping the workers at 5,000, the CPU could focus on finishing tasks rather than managing them.
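To be concrete about what "capping" means mechanically: the sketch below is not the adaptive pool itself, just the fixed-size core of the idea, N workers draining a bounded queue and rejecting overflow.

import "errors"

// ErrQueueFull is returned when the bounded queue cannot accept more work.
var ErrQueueFull = errors.New("queue full")

// Pool is a minimal fixed-size worker pool: N goroutines drain a
// bounded queue, and Submit fails fast when the queue is full.
type Pool struct {
    queue chan func()
}

func NewPool(workers, queueSize int) *Pool {
    p := &Pool{queue: make(chan func(), queueSize)}
    for i := 0; i < workers; i++ {
        go func() {
            for job := range p.queue {
                job()
            }
        }()
    }
    return p
}

func (p *Pool) Submit(job func()) error {
    select {
    case p.queue <- job:
        return nil
    default: // queue full: reject instead of blocking or spawning
        return ErrQueueFull
    }
}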

RAM usage is where the naive approach really fails.

  • Naive: Hit 50.86 GB of RAM. Most standard servers would have crashed long before this.
  • Adaptive Pool: Stayed steady at 1.41 GB.
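The number is no mystery: 100,000+ in-flight goroutines, each holding a 500KB allocation, works out to roughly 100,000 × 500 KB ≈ 50 GB of live heap before the garbage collector can reclaim any of it.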

The naive approach is essentially a landmine. It works fine during your dev tests, but in production it’ll just eat RAM until the OS kills your process.

When the system is overwhelmed, users feel it.

  • Naive: Average latency shot up to 2,063 ms. That’s a 100x increase for a 20ms task.
  • Adaptive Pool: Kept it around 161 ms under the same pressure.

The pool rejected 31 million tasks. That sounds bad, but it's exactly what you want: it's better to give some users an immediate error than to make every user wait 2 seconds before the whole server crashes anyway.
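Rejection at this scale is just arithmetic: 5,000 workers each finishing a 20ms task can sustain at most 5,000 / 0.02 s = 250,000 completed jobs per second, so a 1M RPS firehose forces the pool to shed roughly three quarters of the incoming traffic to stay healthy.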

Metric        | Naive Approach         | Adaptive Pool
--------------|------------------------|----------------------
Reliability   | High risk of OOM       | Resource-bounded
Latency       | 2,000ms+ (Unusable)    | 160ms (Acceptable)
System Health | Near-zero CPU for work | Focused on execution
Result        | Server crash           | System stays up

Scaling to high throughput isn't about how many goroutines you can create. It's about how many you can manage before the overhead kills your system.

Concurrency is a resource, just like memory or CPU. If you don’t bound it, your system isn’t production-ready.

Check out the code on GitHub: go-adaptive-pool