Your 1M RPS Service is a Ticking Time Bomb
“Goroutines are cheap.” Every Go developer has heard this. And for common use cases, it’s true. But when you start aiming for 1 Million Requests Per Second, that saying becomes dangerous.
I ran some tests to see exactly when the “Naive” approach to Go concurrency falls apart and how an adaptive worker pool changes the game.
The Test: 1 Million RPS Burst
I used a 10-core machine and pushed 1 Million requests per second for 30 seconds. Each request was designed to be somewhat realistic: 20ms of processing and a 500KB memory allocation (like handling a decoded JSON object).
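To make that workload concrete, here is a minimal sketch of what each simulated request does. The function name and exact shape are my own illustration, not the benchmark's actual code; the two properties that matter are the 20ms of work and the 500KB allocation.

```go
package main

import (
	"context"
	"time"
)

// process simulates one request: ~20ms of work plus a ~500KB allocation,
// roughly the footprint of a decoded JSON payload.
func process(ctx context.Context) {
	payload := make([]byte, 500*1024) // 500KB held while the job runs
	_ = payload

	select {
	case <-time.After(20 * time.Millisecond): // simulated processing time
	case <-ctx.Done(): // give up early if the caller cancels
	}
}

func main() {
	process(context.Background()) // run one simulated request
}
```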
Case A: The Naive approach (No Pool)
```go
func handle(w http.ResponseWriter, r *http.Request) {
	go process(r.Context()) // Just spawn and hope
}
```
Case B: Adaptive Pool
```go
pool, _ := adaptivepool.New(
	adaptivepool.WithMaxWorkers(5000),
	adaptivepool.WithQueueSize(100000),
)

// ... in handler
err := pool.Submit(ctx, job)
```
The Results
The Stability Paradox
The Naive version immediately tried to spawn over 100,000 goroutines. The result was a mess. The CPU spent more time on context switching and garbage collection than actual work.
With the adaptive pool, even though we were rejecting millions of incoming requests, we finished 4.6 times more jobs than the naive version did. By capping the workers at 5,000, the CPU could actually focus on finishing tasks rather than managing them.
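For readers who haven't built one, here is what "capping the workers" looks like underneath: a bounded queue drained by a fixed set of workers, with Submit failing fast when the queue is full. This is a minimal, non-adaptive sketch, not the adaptivepool implementation (the real library also grows and shrinks the worker count), and the error value is made up for illustration.

```go
package pool

import (
	"context"
	"errors"
)

// ErrQueueFull is an illustrative sentinel error for a saturated queue.
var ErrQueueFull = errors.New("pool: queue full")

// Pool runs jobs on a fixed set of workers fed by a bounded queue.
type Pool struct {
	queue chan func()
}

// New starts `workers` goroutines that drain a queue holding up to `queueSize` jobs.
func New(workers, queueSize int) *Pool {
	p := &Pool{queue: make(chan func(), queueSize)}
	for i := 0; i < workers; i++ {
		go func() {
			for job := range p.queue {
				job() // each worker runs one job at a time
			}
		}()
	}
	return p
}

// Submit enqueues a job, or fails fast if the context is done or the queue is full.
func (p *Pool) Submit(ctx context.Context, job func()) error {
	if err := ctx.Err(); err != nil {
		return err
	}
	select {
	case p.queue <- job:
		return nil
	default:
		return ErrQueueFull // shed load instead of blocking the caller
	}
}
```

The point of this shape is that the worst case is fixed in advance: at most `workers` goroutines and `queueSize` queued jobs exist at any moment, no matter what the ingress rate does.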
The Memory Tax
RAM usage is where the naive approach really fails.
- Naive: Hit 50.86 GB of RAM. Most standard servers would have crashed long before this.
- Adaptive Pool: Stayed steady at 1.41 GB.
The arithmetic is straightforward: roughly 100,000 in-flight goroutines each holding a ~500KB payload adds up to about 50 GB of live heap before the GC can reclaim anything. The naive approach is essentially a landmine. It works fine during your dev tests, but in production it'll just eat RAM until the OS kills your process.
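If you want to watch this failure mode in your own service, the Go runtime exposes both numbers directly. A minimal sketch you could run alongside a load test (print interval and formatting are arbitrary):

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

func main() {
	for {
		var m runtime.MemStats
		runtime.ReadMemStats(&m) // snapshot of heap statistics

		fmt.Printf("goroutines=%d heap=%.2f GB\n",
			runtime.NumGoroutine(),
			float64(m.HeapAlloc)/(1<<30))

		time.Sleep(time.Second)
	}
}
```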
Latency Issues
When the system is overwhelmed, users feel it.
- Naive: Average latency shot up to 2,063 ms. That’s a 100x increase for a 20ms task.
- Adaptive Pool: Kept it around 161 ms under the same pressure.
The pool rejected 31 million tasks, which sounds bad, but it's exactly what you want. It's better to give some users an immediate error than to make every single user wait 2 seconds before the whole server crashes anyway.
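In the handler, that trade-off is a single branch. Here is a minimal load-shedding sketch that uses a plain buffered channel as a stand-in for the pool's bounded queue (deliberately not the adaptivepool API): when the queue is full, the request gets an immediate 503 instead of joining an unbounded backlog.

```go
package main

import (
	"log"
	"net/http"
)

// jobs is the bounded queue; a fixed set of workers drains it elsewhere.
var jobs = make(chan func(), 100_000)

func handle(w http.ResponseWriter, r *http.Request) {
	job := func() {
		// Do the ~20ms of work here, using data already copied out of r.
	}

	select {
	case jobs <- job:
		w.WriteHeader(http.StatusAccepted) // queued for processing
	default:
		// Queue full: fail fast for this caller instead of making
		// every caller wait behind a growing backlog.
		http.Error(w, "server busy, try again later", http.StatusServiceUnavailable)
	}
}

func main() {
	http.HandleFunc("/", handle)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

The `default` branch is what keeps latency flat: a saturated queue becomes an immediate error for some callers rather than a 2-second wait for all of them.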
Comparison Table
| Metric | Naive Approach | Adaptive Pool |
|---|---|---|
| Reliability | High risk of OOM | Resource-bounded |
| Latency | 2,000ms+ (Unusable) | 160ms (Acceptable) |
| System Health | Near-zero CPU for work | Focused on execution |
| Result | Server crash | System stays up |
Final Thoughts
Scaling to high throughput isn't about how many goroutines you can create. It's about how many you can manage before the overhead kills your system.
Concurrency is a resource, just like memory or CPU. If you don’t bound it, your system isn’t production-ready.
Check out the code on GitHub: go-adaptive-pool