
Why Disaster Happens at the Edges: An Introduction to Queue Theory



  • When it comes to IT performance, amateurs look at averages. Professionals look at distributions.
  • The greater the variance in a system, the more of those outlier experiences there will be — and the more expensive it is to handle or avoid them.
  • Queues are everywhere in digital systems: executors, sockets, locks. Any process that operates asynchronously probably depends on a queue.
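To see why distributions matter more than averages, here is a minimal Python sketch with made-up latency numbers: a small fraction of slow outliers barely moves the mean but dominates the 99th percentile.

```python
import random
import statistics

random.seed(42)

# Simulated request latencies in ms: 98% cluster around 20 ms,
# 2% are slow outliers around 400 ms.
latencies = ([random.gauss(20, 2) for _ in range(980)] +
             [random.gauss(400, 50) for _ in range(20)])

def percentile(data, p):
    """Nearest-rank percentile: the value below which roughly p% of samples fall."""
    ordered = sorted(data)
    k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[k]

mean = statistics.mean(latencies)
p50 = percentile(latencies, 50)
p99 = percentile(latencies, 99)
print(f"mean={mean:.1f} ms  p50={p50:.1f} ms  p99={p99:.1f} ms")
```

The mean lands only a few milliseconds above the median, while the p99 sits an order of magnitude higher — the average hides exactly the experiences users complain about.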

A queue’s performance depends on several factors, including:

  1. Arrival rate: how many jobs arrive at the queue in a certain amount of time
  2. Service rate: how many jobs can be served in a certain amount of time
  3. Service time: how long it takes to process each job
  4. Service discipline: how jobs are prioritized (FIFO/LIFO/Priority)

This suggests an important rule of thumb: for decent performance, keep utilization below 75%.
That means provisioning not for typical loads but for extreme ones. Without overcapacity, queues will form and latency will increase.
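The 75% rule of thumb falls out of basic queueing math. As a sketch, the classic M/M/1 model (random arrivals, a single server) gives mean time in system W = 1/(μ − λ); the figures below assume a hypothetical service rate of 100 jobs per second, i.e. a 10 ms service time.

```python
def mm1_time_in_system(arrival_rate, service_rate):
    """Mean time a job spends in an M/M/1 queue (waiting + service).
    Only valid while utilization = arrival_rate / service_rate < 1."""
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: utilization >= 100%")
    return 1.0 / (service_rate - arrival_rate)

service_rate = 100.0  # jobs per second -> 10 ms mean service time
for utilization in (0.50, 0.75, 0.90, 0.99):
    w = mm1_time_in_system(utilization * service_rate, service_rate)
    print(f"utilization {utilization:.0%}: mean time in system {w * 1000:.0f} ms")
```

At 50% utilization a job spends 20 ms in the system; at 75%, 40 ms; at 90%, 100 ms; at 99%, a full second. Latency does not degrade linearly — it explodes as utilization approaches 100%.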


The lesson: whatever its source, variance is the enemy of performance.
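One way to quantify that lesson is Kingman's approximation for a single-server queue, which makes the variance term explicit: mean wait ≈ (ρ/(1−ρ)) · ((Ca² + Cs²)/2) · τ, where ρ is utilization, τ the mean service time, and Ca, Cs the coefficients of variation of inter-arrival and service times. A quick sketch with hypothetical numbers:

```python
def kingman_wait(utilization, cv_arrival, cv_service, mean_service):
    """Kingman's approximation for mean queueing delay in a G/G/1 queue.
    cv_* are coefficients of variation (std dev / mean)."""
    return (utilization / (1 - utilization)) * \
           ((cv_arrival ** 2 + cv_service ** 2) / 2) * mean_service

# Same 80% utilization, same 10 ms mean service time -- only variance differs.
low_var = kingman_wait(0.8, 0.5, 0.5, 10.0)
high_var = kingman_wait(0.8, 2.0, 2.0, 10.0)
print(f"low-variance wait:  {low_var:.0f} ms")
print(f"high-variance wait: {high_var:.0f} ms")
```

Holding utilization and mean service time constant, quadrupling the coefficients of variation multiplies the mean wait sixteen-fold — variance, not average speed, is what fills the queue.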


The question, then, is what to do about it. Fundamentally it’s a trade-off between latency and error rate:
if we don’t return errors, or find an alternative way to shed the excess load, latency will inevitably increase.


  • One approach, as we’ve seen, is to cap queue size and shed any job over the limit. However it’s handled, the goal is to remove excess requests from the overloaded queue.
  • Another closely related approach is to throttle the arrival rate. Instead of regulating the absolute number of jobs in the queue, we can calculate how much work the queue can handle in a given amount of time and start shedding load when the arrival rate exceeds the target.
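Both approaches can be sketched in a few lines of Python. The class and parameter names here are illustrative, not from any particular library: a bounded queue that sheds overflow, and a token bucket that caps the arrival rate.

```python
import collections
import time

class SheddingQueue:
    """Bounded FIFO that sheds (rejects) jobs once it is full."""
    def __init__(self, max_size):
        self.max_size = max_size
        self.jobs = collections.deque()
        self.shed_count = 0

    def offer(self, job):
        if len(self.jobs) >= self.max_size:
            self.shed_count += 1  # shed: caller should fail fast or retry later
            return False
        self.jobs.append(job)
        return True

class TokenBucket:
    """Throttles the arrival rate: admits at most `rate` jobs/second on average,
    with short bursts up to `burst`."""
    def __init__(self, rate, burst):
        self.rate, self.burst = rate, burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def try_acquire(self):
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

The difference in emphasis: `SheddingQueue` bounds the *backlog* (absolute jobs waiting), while `TokenBucket` bounds the *rate* at which work is admitted; production systems often combine both.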

Architecture Takeaways

To sum up: Variance is the enemy of performance and the source of much of the latency we encounter when using software.

To keep latency to a minimum:

  • As a rule of thumb, target utilization below 75%
  • Steer slower workloads to paths with lower utilization
  • Limit variance as much as possible when utilization is high
  • Implement backpressure in systems where it is not built in
  • Use throttling and load shedding to reduce pressure on downstream queues
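The backpressure point above can be sketched with Python's standard-library bounded queue: when the buffer fills, a blocking `put` naturally paces the producer to the consumer's speed. The timings here are illustrative.

```python
import queue
import threading
import time

# A bounded queue gives natural backpressure: when it is full, the producer
# blocks instead of piling up unbounded work for the consumer.
buf = queue.Queue(maxsize=4)

def consumer():
    while True:
        job = buf.get()
        if job is None:  # sentinel: shut down
            break
        time.sleep(0.01)  # simulate slow downstream work

t = threading.Thread(target=consumer)
t.start()

start = time.monotonic()
for i in range(20):
    buf.put(i)  # blocks once 4 jobs are queued -> producer runs at consumer pace
buf.put(None)
t.join()
elapsed = time.monotonic() - start
print(f"producer was paced by backpressure: ran for {elapsed:.2f}s")
```

Without the `maxsize` bound, the producer would finish almost instantly while the queue grew without limit — exactly the failure mode backpressure exists to prevent.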

It follows that developers should aim to design and implement software that delivers not just high average performance but consistent performance, i.e., performance with low variance.
Even an incremental reduction in variance can improve the user experience more than an equivalent increase in raw average speed.

The mean isn’t meaningless. It’s just deceptive.
To transform user experience, the place to put your effort isn’t in the middle of the curve but on the ends.
It’s the rare events — the outliers — that tell you what you need to know.

Developer Weekly 16

Write-Ahead Log
Taming Complexity: The COLA Application Architecture

Practical Techniques for Preventing Code Rot
Learning Flink: Flink on YARN
Tech Share: How Prometheus Stores Data (Chen Hao)
Reading Notes on The Pragmatic Programmer
When you need file storage for your project, website, or application, Web3.Storage is here for you.
A Long-Form Deep Dive into The Pragmatic Programmer
Log-Pilot: A Handy Tool for Container Log Collection
Arthas, the Java Production Troubleshooting Tool: Quick Start and How It Works
Monitoring and Logging Series: How Filebeat Works
Inside Spring Boot: The ClassLoader Hierarchy and Its Effects
kkndme's Legendary Tianya Forum Thread on Housing Prices
Java Performance Tuning