> Consider we have an API receiving 50 requests per second. Say we want to keep the average response time under 200 ms so that users don’t get frustrated. We can estimate the number of concurrent requests that should be in the system and adjust the number of threads accordingly: L=λW=50×0.2=10 So, we need around 10 concurrent workers processing requests!
Uh... What? Concept seems misapplied
Example: if we only want to keep it under 1 sec (our users are more patient), then we need 50 x 1 = 50, even MORE workers?