Queueing Theory
The most complete free simulator on the market: 5 analytical models (M/M/1, M/M/c, Erlang-B/C), Discrete Event Simulation (DES), NHPP, priority queues, queueing networks and real-time animation — in the browser, no install required.
Model multiple customer types with distinct proportions and priorities. With PRIO discipline, higher-priority customers jump the queue — the tool calculates separate Wq and Lq per type.
Every system starts in transient regime — the queue is still converging to equilibrium. Set the warm-up period to discard it from metrics and see the 🟢 Steady-state or 🟡 Transient badge.
Queueing Theory: Complete Guide with Practical Examples
Queueing Theory is the branch of applied mathematics that models systems in which entities (customers, patients, data packets, industrial parts) arrive, wait when all servers are busy, and are then served. Created in 1909 by Danish engineer Agner Krarup Erlang to size Copenhagen’s telephone exchanges, it has become, over the following century, the basis for decisions in:
- Industrial engineering: line balancing, dock sizing, production bottlenecks
- Healthcare: ICU capacity, hospital triage, outpatient queues
- Technology: web servers, databases, API gateways, network printers
- Telecom: PBX trunks, call centers, network buffers
- Retail and logistics: airport check-in, supermarket checkouts, toll booths, gas stations
Software such as Arena (Rockwell Automation), Simul8 and AnyLogic use Discrete Event Simulation (DES) for the most complex cases. For the vast majority of real-world problems, however, the analytical formulas in this tool deliver exact results — in milliseconds.
Why this is the best free queueing simulator online
No other free tool combines — without installation or sign-up — the features this simulator offers:
| Feature | This simulator | Free web calculators | Arena / Simul8 |
|---|---|---|---|
| M/M/1, M/M/c, M/D/1, M/M/1/K, Erlang-B | ✅ | partial | ✅ |
| M/G/c (Weibull, Lognormal, Erlang-k…) | ✅ | ❌ | ✅ |
| Discrete Event Simulation (DES) | ✅ | ❌ | ✅ |
| Real-time queue animation | ✅ | ❌ | ✅ |
| NHPP — variable λ per time slot | ✅ | ❌ | ✅ |
| Priority queues (VIP, urgency) | ✅ | ❌ | ✅ |
| Queue networks — series, parallel, conditional | ✅ | ❌ | ✅ |
| Distribution fitting with KS test | ✅ | ❌ | ❌ |
| Multi-server comparison table | ✅ | ❌ | ❌ |
| 24-step guided tutorial | ✅ | ❌ | ❌ |
| Free, no installation, no login | ✅ | ✅ | ❌ |
| Works on mobile | ✅ | partial | ❌ |
Arena and Simul8 cost thousands of dollars per year and require installation on Windows. JaamSim and SimPy are free but require programming. This simulator delivers 24 out of 24 evaluated features — the highest score among all free tools and matching or surpassing commercial tools in accessibility.
When to use this simulator vs. Arena/Simul8
Use this simulator when you need to:
- Quickly size call centers, bank queues, service desks, ICU beds, toll booths
- Validate with DES if the system has non-exponential distributions or variable load
- Model a production network with stages in series, parallel, or conditional routing
- Teach queueing theory — the 22 pre-configured scenarios and 24-step tutorial are ideal for classroom use
Use Arena/Simul8 when you have: systems with hundreds of interconnected queues with complex feedback loops, 3D factory modeling, or highly customized business logic requiring advanced scripting.
Kendall Notation — how to describe any queue
Every queueing system is described by the compact notation A/S/c/K:
| Position | What it represents | Common symbols |
|---|---|---|
| A | Arrival process | M (Poisson/random), D (fixed/deterministic) |
| S | Service time distribution | M (exponential), D (deterministic/fixed) |
| c | Number of parallel servers | 1, 2, 3… |
| K | Maximum system capacity | number or ∞ (omitted = unlimited) |
Reading the notation — examples
| Real system | Notation | Reading |
|---|---|---|
| Hospital triage desk | M/M/1 | Random arrivals, variable service, 1 nurse |
| Bank branch with 2 tellers | M/M/2 | Random arrivals, variable service, 2 parallel tellers |
| Paced assembly line | M/D/1 | Random arrivals, fixed cycle time, 1 station |
| Parking lot with 15 spaces | M/M/1/15 | Random arrivals, 1 “server”, maximum capacity 15 |
| PBX with 4 trunks | M/M/4/4 | Erlang-B: 4 servers, no queue, excess is lost |
The three fundamental parameters
λ — Arrival rate
Average number of entities arriving per unit of time. Always obtained from real observation or historical data.
| Scenario | λ | Unit |
|---|---|---|
| Bank branch (peak hour) | 20 | customers/hour |
| Web server in production | 120 | requests/second |
| Emergency room | 15 | patients/hour |
| Toll booth (per lane) | 120 | cars/hour |
| Airport check-in desk | 180 | passengers/hour |
How to measure λ: count arrivals during a representative period (e.g., 1 hour) and divide by time. Repeat on different days and times to capture variability.
μ — Service rate of one server
Average number of services that a single server completes per unit of time. Average service time is 1/μ.
| Scenario | μ | Average service time |
|---|---|---|
| Bank teller | 15 /h | 4 minutes |
| Physician in clinic | 4 /h | 15 minutes |
| Hospital pharmacy (dispensing) | 80 /h | 45 seconds |
| Maintenance technician | 2 /h | 30 minutes |
| Database worker | 250 /s | 4 ms |
Tip: if service time is highly variable (CV ≈ 1), prefer the M/M/c model. If nearly constant (CV < 0.15), use M/D/1 or M/D/c for more accurate results.
ρ — Server utilization
ρ represents the fraction of time each server is busy:

$$\rho = \frac{\lambda}{c\,\mu}$$

For stable systems: ρ < 1. When ρ ≥ 1 the queue grows without bound.
The exponential effect of utilization
The relationship between ρ and waiting time is not linear — it is explosive. For the M/M/1 model:
| ρ | Wq (multiples of 1/μ) |
|---|---|
| 50% | 1× |
| 70% | 2.3× |
| 80% | 4× |
| 90% | 9× |
| 95% | 19× |
| 99% | 99× |

Going from 80% to 90% utilization more than doubles the waiting time.
M/M/1 Model — the fundamental case
When to use: single-server system with random arrivals and variable service time.
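Formulas

$$\rho = \frac{\lambda}{\mu}, \qquad L = \frac{\rho}{1-\rho}, \qquad L_q = \frac{\rho^2}{1-\rho}, \qquad W = \frac{1}{\mu-\lambda}, \qquad W_q = \frac{\rho}{\mu-\lambda}$$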
Example: Emergency room (triage)
15 patients arrive per hour at a triage desk with 1 nurse. Average triage time: 3 minutes (μ = 20/h).
ρ = 15/20 = 0.75, so the nurse is busy 75% of the time, and Wq = ρ/(μ − λ) = 9 minutes. If the flow rises to 18/h (ρ = 0.90), Wq jumps to 27 minutes. With 2 nurses (M/M/2), Wq drops to under 1 minute.
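The triage figures follow directly from the closed-form M/M/1 expressions. A minimal sketch, assuming λ and μ are given in the same time unit; the function name `mm1_metrics` is illustrative and not part of the tool:

```python
def mm1_metrics(lam: float, mu: float) -> dict:
    """Steady-state M/M/1 metrics; lam and mu must share the same time unit."""
    if lam >= mu:
        raise ValueError("unstable: requires lam < mu")
    rho = lam / mu
    wq = rho / (mu - lam)      # mean wait in queue
    w = 1.0 / (mu - lam)       # mean time in system (wait + service)
    return {"rho": rho, "Wq": wq, "W": w, "Lq": lam * wq, "L": lam * w}


# Emergency-room triage: mu = 20 patients/hour, arrivals at 15/h and 18/h
for lam in (15, 18):
    m = mm1_metrics(lam, mu=20)
    print(f"lam={lam}/h  rho={m['rho']:.2f}  Wq={m['Wq'] * 60:.1f} min")
```

Results come back in the unit of 1/μ (hours here), so the loop multiplies by 60 to report minutes.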
Example: Web server (single thread)
120 req/s, average processing time of 6 ms (μ = 167 req/s).
A single thread gives ρ = 0.72 and Wq ≈ 15 ms (W ≈ 21 ms). For latencies below 10 ms total, multiple threads are required (M/M/c). With 2 threads, Wq drops to ~0.9 ms.
M/M/c Model — multiple parallel servers (Erlang-C)
When to use: several identical servers draw from the same queue — call centers, bank tellers, airport desks, thread pools.
Erlang-C formula
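$$C(c, a) = \frac{\dfrac{a^c}{c!}\,\dfrac{1}{1-\rho}}{\displaystyle\sum_{n=0}^{c-1}\frac{a^n}{n!} + \frac{a^c}{c!}\,\frac{1}{1-\rho}}, \qquad W_q = \frac{C(c, a)}{c\mu - \lambda}, \qquad L_q = \lambda W_q$$

where a = λ/μ is traffic in Erlangs and ρ = a/c.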
Example: Bank branch with 2 tellers
20 customers/hour, each teller serves 15 customers/hour. Model M/M/2.
Erlang-C → P(wait) ≈ 53.3% → Wq ≈ 3.2 minutes
With only 1 teller (ρ = 1.33): system unstable, the queue grows without bound. The second teller is essential. A third would reduce Wq to ~26 seconds.
Example: Call center — SLA-based staffing
90 calls/hour, average handling time 4 minutes (μ = 15/h). SLA: 80% of calls answered in ≤ 20 seconds.
| Agents (c) | ρ | P(wait) | P(Wq ≤ 20s) |
|---|---|---|---|
| 7 | 85.7% | 61.4% | ~44% ❌ |
| 8 | 75.0% | 35.7% | ~70% ❌ |
| 9 | 66.7% | 19.6% | ~85% ✅ |
| 10 | 60.0% | 10.1% | ~93% |
With 9 agents the SLA is met; 8 agents fall short at about 70%. In practice, one extra agent (10) is often scheduled to absorb absences and traffic spikes.
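The staffing table can be recomputed from the Erlang-C formula together with the standard M/M/c waiting-time tail, P(Wq ≤ t) = 1 − P(wait)·e^(−(cμ−λ)t). The sketch below is one way to do it, assuming FCFS service; the function names are illustrative and not taken from the tool:

```python
import math


def erlang_c(c: int, a: float) -> float:
    """Probability that an arrival must wait in M/M/c (requires a < c)."""
    if a >= c:
        raise ValueError("unstable: offered load a must be below c")
    rho = a / c
    tail = a**c / math.factorial(c) / (1.0 - rho)
    head = sum(a**n / math.factorial(n) for n in range(c))
    return tail / (head + tail)


def service_level(lam: float, mu: float, c: int, t: float) -> float:
    """P(Wq <= t) for FCFS M/M/c; t uses the same time unit as 1/lam."""
    p_wait = erlang_c(c, lam / mu)
    return 1.0 - p_wait * math.exp(-(c * mu - lam) * t)


def agents_for_sla(lam: float, mu: float, target: float, t: float) -> int:
    """Smallest number of servers whose service level reaches the target."""
    c = math.floor(lam / mu) + 1          # first stable staffing level
    while service_level(lam, mu, c, t) < target:
        c += 1
    return c


lam, mu, t = 90.0, 15.0, 20 / 3600        # rates per hour; 20 s expressed in hours
for c in range(7, 11):
    print(c, f"P(wait)={erlang_c(c, lam / mu):.3f}",
          f"P(Wq<=20s)={service_level(lam, mu, c, t):.3f}")
print("smallest c meeting 80%/20s:", agents_for_sla(lam, mu, 0.80, t))
```

The search starts at the first stable staffing level and adds one agent at a time, which mirrors reading the comparison table from top to bottom.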
Example: Airport check-in
180 passengers/hour, service time 2 minutes (μ = 30/h), 8 desks.
With 8 desks, ρ = 75% and Wq ≈ 21 seconds. If one desk closes (c=7, ρ = 85.7%), Wq rises to ~1.2 min, more than triple, leaving little margin on peak days.
Example: Database — worker pool sizing
200 queries/s, average execution 4 ms (μ = 250/s). Pool of 4 workers.
With ρ = 20%, the system is well within capacity. Wq < 0.1 ms. With 2 workers (ρ = 40%), still excellent. The pool of 4 provides headroom for bursts up to 480 queries/s without degradation.
Example: Gas station
20 cars/hour arrive, average fueling 4 minutes (μ = 15/h), 3 pumps.
P(wait) ≈ 18.1%. Wq ≈ 26 seconds. Well-dimensioned system. With 2 pumps (ρ = 0.667), Wq would rise to ~3.2 min.
Example: Maintenance team
3 service calls per hour arrive for a 2-technician team. Each repair takes 30 min on average (μ = 2/h).
P(wait) ≈ 64.3%. Wq ≈ 39 minutes before a technician arrives. The cost of machine downtime during this period often justifies analyzing a 3rd technician (which would reduce Wq to ~5 min).
M/D/1 Model — fixed service time (Pollaczek-Khinchine)
When to use: service time is constant — paced production line, network printer, pharmacy with standardized protocol, electronic toll collection.
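P-K formula for zero variance

$$W_q = \frac{\rho}{2\mu(1-\rho)}, \qquad L_q = \lambda W_q = \frac{\rho^2}{2(1-\rho)}$$

Key insight: with fixed service time, the waiting time Wq and the queue length Lq are exactly half those of the equivalent M/M/1 case.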
Example: Hospital pharmacy
40 orders/hour arrive. Standardized dispensing time: 45 seconds fixed (μ = 80/h).
With fixed time (M/D/1): ρ = 0.5 and Wq ≈ 22.5 seconds. With variable time (M/M/1, same mean): Wq ≈ 45 seconds. Standardizing the dispensing process cuts waiting time in half.
Example: Supermarket checkout (standardized time)
15 customers/hour, fixed time of 3 min per customer (μ = 20/h).
Training cashiers to achieve a standardized handling time halves the perceived wait — without hiring anyone.
Example: Electronic toll collection
120 cars/hour per lane. Passage always takes 20 seconds (μ = 180/h).
With fixed passage time (M/D/1), Wq = 20 s. If the system fell back to manual toll collection (variable time, same mean), Wq would double to ~40 s. During peak hours with λ = 160 cars/h (ρ = 0.889), M/D/1 gives Wq ≈ 1.3 min vs. ≈ 2.7 min for M/M/1.
Example: Paced assembly line
25 parts/hour arrive at a station with fixed cycle time of 2 minutes (μ = 30/h).
The station is running at 83% capacity. If demand rises to 28 parts/h (ρ = 0.933), M/D/1 Wq reaches 14 min — a critical bottleneck.
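The assembly-line numbers can be checked by evaluating the M/D/1 expression next to its M/M/1 counterpart. A small sketch, with illustrative function names:

```python
def wq_md1(lam: float, mu: float) -> float:
    """Mean wait in queue for M/D/1 (Pollaczek-Khinchine with zero variance)."""
    rho = lam / mu
    if rho >= 1.0:
        raise ValueError("unstable: requires lam < mu")
    return rho / (2.0 * mu * (1.0 - rho))


def wq_mm1(lam: float, mu: float) -> float:
    """Mean wait in queue for M/M/1 with the same mean service time."""
    if lam >= mu:
        raise ValueError("unstable: requires lam < mu")
    return (lam / mu) / (mu - lam)


# Paced assembly line: mu = 30 parts/hour, demand of 25 and 28 parts/hour
for lam in (25, 28):
    print(f"lam={lam}/h  M/D/1 Wq={wq_md1(lam, 30) * 60:.1f} min"
          f"  M/M/1 Wq={wq_mm1(lam, 30) * 60:.1f} min")
```

At any stable load the M/D/1 value is half the M/M/1 value, which is the "key insight" stated for this model.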
M/M/1/K Model — finite maximum capacity
When to use: the system has a physical limit — parking spaces, waiting room seats, call buffer, number of orders in processing.
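Formulas

With ρ = λ/μ (ρ ≠ 1) and K the maximum number of entities in the system:

$$P_n = \frac{(1-\rho)\,\rho^n}{1-\rho^{K+1}} \quad (n = 0, \dots, K), \qquad P_{\text{block}} = P_K, \qquad \lambda_{\text{eff}} = \lambda\,(1 - P_K)$$

$$L = \sum_{n=0}^{K} n\,P_n, \qquad W = \frac{L}{\lambda_{\text{eff}}}, \qquad W_q = W - \frac{1}{\mu}$$

For ρ = 1, every state is equally likely: P_n = 1/(K+1).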
Example: Parking lot with 15 spaces
10 cars/hour arrive. Average stay: 2 hours (μ = 0.5/h). Capacity: 15 spaces.
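With these figures ρ = λ/μ = 20, so the single-server M/M/1/15 formula gives P(K=15) ≈ 95%: almost every arriving car is turned away. Treating each space as its own server instead (M/M/15/15, a = 20 Erlangs), Erlang-B gives about 33% blocking. In both readings the lot is undersized, and if demand triples (λ = 30) blocking rises further and cars queue on the street.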
Example: Call buffer (M/M/1/5)
50 calls/hour arrive at 1 agent with μ = 20/h. Maximum queue: 5 positions.
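About 60% of the calls are blocked. Because ρ = 2.5 > 1, enlarging the buffer cannot fix this: blocking never falls below 1 − 1/ρ = 60%, however large K is. Capacity has to be added instead: 2 agents (M/M/2, ρ = 1.25) are still unstable, while 3 agents (M/M/3, ρ = 0.833) are stable with Wq ≈ 4.2 min.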
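Blocking in a finite-capacity queue is easiest to explore numerically. The sketch below builds the M/M/1/K state probabilities by normalising ρⁿ, which also covers the ρ = 1 case; `mm1k_metrics` is an illustrative name, and K is taken to count everyone in the system, including the entity in service:

```python
def mm1k_metrics(lam: float, mu: float, k: int) -> dict:
    """Steady-state M/M/1/K metrics; k counts everyone in the system."""
    rho = lam / mu
    weights = [rho**n for n in range(k + 1)]       # unnormalised P_n
    total = sum(weights)
    probs = [w / total for w in weights]
    p_block = probs[k]                             # arrival finds system full
    lam_eff = lam * (1.0 - p_block)                # rate of accepted arrivals
    L = sum(n * p for n, p in enumerate(probs))
    W = L / lam_eff                                # Little's Law on accepted flow
    return {"p_block": p_block, "lam_eff": lam_eff,
            "L": L, "W": W, "Wq": W - 1.0 / mu}


# Call buffer: 50 calls/hour, one agent at 20/hour, growing buffer sizes
for k in (5, 10, 20):
    m = mm1k_metrics(50.0, 20.0, k)
    print(f"K={k:2d}  blocking={m['p_block']:.3f}  Wq={m['Wq'] * 60:.1f} min")
```

Running it for several values of K shows how little a larger buffer helps once the arrival rate exceeds the service rate.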
M/M/c/c Model — Erlang-B (no queue, pure loss)
When to use: excess arrivals are lost, not queued — PBX trunks, hospital beds by specialty, restaurant tables, concurrent software licenses.
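Erlang-B formula

$$B(c, a) = \frac{a^c / c!}{\displaystyle\sum_{n=0}^{c} a^n / n!}, \qquad a = \frac{\lambda}{\mu}$$

An equivalent recursion that avoids large factorials: B(0, a) = 1 and B(n, a) = a·B(n−1, a) / (n + a·B(n−1, a)).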
Example: PBX trunks
Office with 4 trunks. 30 calls/hour, average duration 5 min (μ = 12/h). a = 2.5 Erlangs.
| Trunks (c) | Blocking |
|---|---|
| 3 | 28.2% |
| 4 | 15.0% |
| 5 | 7.0% |
| 6 | 2.8% |
| 7 | 1.0% |
With 4 trunks, roughly 1 in 7 calls gets a busy signal. For ≤ 2% blocking (commercial standard): 7 trunks.
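Blocking tables like this one are usually produced with the Erlang-B recursion rather than the factorial form, because it stays numerically stable for large c. A minimal sketch, with illustrative function names:

```python
def erlang_b(c: int, a: float) -> float:
    """Blocking probability for c servers offered a Erlangs (M/M/c/c)."""
    b = 1.0                                  # B(0, a) = 1
    for n in range(1, c + 1):
        b = a * b / (n + a * b)              # B(n, a) from B(n-1, a)
    return b


def servers_for_blocking(a: float, target: float) -> int:
    """Smallest number of servers with blocking at or below the target."""
    c = 0
    while erlang_b(c, a) > target:
        c += 1
    return c


a = 30 / 12                                  # 30 calls/hour, mu = 12/hour
for c in range(3, 8):
    print(f"{c} trunks: blocking = {erlang_b(c, a):.1%}")
print("trunks for <= 2% blocking:", servers_for_blocking(a, 0.02))
```

The same two functions apply unchanged to beds, tables or licenses: only the offered load `a` and the target change.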
Example: ICU beds
10 beds. 2 admissions/day arrive. Average stay 5 days (μ = 0.2/day). a = 10 Erlangs.
| Beds | Blocking |
|---|---|
| 10 | 21.5% |
| 14 | 5.7% |
| 18 | 0.7% |
| 22 | 0.04% |
With 10 beds, about 1 in 5 critical patients finds no vacancy. For blocking < 5%: 15 beds (≈ 3.6%). This calculation underpins hospital capacity planning.
Example: Restaurant tables during peak hour
20 tables. 40 groups arrive/hour at peak. Average meal: 45 min (μ = 1.33 groups/hour/table). a ≈ 30 Erlangs.
B(20, 30) ≈ 38% of groups find no table.
For ≤ 10% blocking: ~32 tables. Practical alternative: a reservation system reduces random arrivals (Poisson → deterministic), dramatically cutting the excess.
Example: Floating software licenses
A company has 4 floating ERP licenses. 10 users attempt to connect/hour, each session lasts 20 min (μ = 3/h). a = 10/3 ≈ 3.33 Erlangs.
B(4, 3.33) ≈ 24% of access requests are blocked. With 6 licenses: blocking drops to ~7%. With 8: ~1.4%.
Little’s Law — the universal connection
Valid for any stable system regardless of model or distribution:

$$L = \lambda\,W, \qquad L_q = \lambda\,W_q$$
Practical example: On an assembly line there are on average 8 parts in the system. Total cycle time is 12 minutes. Throughput = L/W = 8/12 = 0.667 parts/min = 40 parts/hour.
Little’s Law allows measuring any quantity (L, W or λ) from the other two — without relying on any mathematical model.
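As a quick illustration of using the law in any direction, the helper below returns whichever of L, λ or W was left out. It is a sketch only; `littles_law` is a hypothetical name:

```python
def littles_law(L=None, lam=None, W=None):
    """Return the one quantity left as None, using L = lam * W."""
    if [L, lam, W].count(None) != 1:
        raise ValueError("provide exactly two of L, lam, W")
    if L is None:
        return lam * W
    if lam is None:
        return L / W
    return L / lam


# Assembly line: 8 parts in the system, 12 minutes of total cycle time
throughput = littles_law(L=8, W=12)          # parts per minute
print(f"{throughput:.3f} parts/min = {throughput * 60:.0f} parts/hour")
```

The units of the result follow from the inputs: W in minutes gives a throughput per minute.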
Distribution fitting from sample
Before choosing a model, determine whether your arrivals and service times follow Poisson/Exponential. The “Distribution fitting” section automatically analyzes your sample:
Coefficient of Variation (CV = σ/mean)
| CV measured | Suggested distribution | Recommended model |
|---|---|---|
| CV < 0.15 | Nearly constant | M/D/1 or M/D/c |
| 0.15 ≤ CV < 0.65 | Erlang-k (k ≈ 1/CV²) | M/Ek/c (use simulation) |
| 0.65 ≤ CV < 1.35 | Exponential (Markovian) | M/M/c |
| CV ≥ 1.35 | Hyper-exponential | DES simulation recommended |
Kolmogorov-Smirnov test
For CV ≈ 1, the KS test formally checks whether the data are compatible with an exponential distribution at the 5% significance level (95% confidence), displaying the observed D against the critical D.
How to use
- Collect successive inter-arrival times (or service times)
- Paste values in the field (comma, semicolon or line break separated)
- Click “Analyze” — the tool returns CV, histogram, suggestion and KS result
- Click “Apply λ/μ” to populate the parameter and recalculate
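The analysis steps can be approximated offline with the standard library. The sketch below is not the tool's implementation. It assumes the sample standard deviation for CV, the thresholds from the CV table, and the asymptotic 5% critical value 1.36/√n, which is conservative when the rate is estimated from the same sample. All function names are illustrative:

```python
import math
import re
import statistics


def parse_sample(text: str) -> list[float]:
    """Split on commas, semicolons or whitespace (including line breaks)."""
    return [float(tok) for tok in re.split(r"[,;\s]+", text.strip()) if tok]


def suggest_model(cv: float) -> str:
    """Map a measured CV to a suggestion, using the CV table thresholds."""
    if cv < 0.15:
        return "nearly constant: M/D/1 or M/D/c"
    if cv < 0.65:
        return f"Erlang-k with k={max(1, round(1 / cv**2))}: M/Ek/c (simulate)"
    if cv < 1.35:
        return "exponential: M/M/c"
    return "hyper-exponential: DES recommended"


def ks_exponential(sample: list[float]) -> tuple[float, float]:
    """Return (observed D, approximate 5% critical D) against Exp(1/mean)."""
    xs = sorted(sample)
    n = len(xs)
    rate = 1.0 / statistics.fmean(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = 1.0 - math.exp(-rate * x)
        # Largest gap between the empirical step function and the fitted CDF
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d, 1.36 / math.sqrt(n)


times = parse_sample("2.1, 0.4; 5.3\n1.2, 0.9; 3.8\n0.2, 1.7")   # made-up sample
cv = statistics.stdev(times) / statistics.fmean(times)
d, d_crit = ks_exponential(times)
print(f"CV={cv:.2f} -> {suggest_model(cv)}")
print(f"KS: D={d:.3f}, critical D={d_crit:.3f}, "
      f"rate estimate={1 / statistics.fmean(times):.3f}")
```

If the observed D is below the critical D, the exponential assumption is not rejected, and the rate estimate (1/mean) is the value to apply as λ or μ.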
Discrete Event Simulation (DES)
The 🎲 Simulate button opens a full DES simulation — the same principle as Arena and Simul8, running in the browser.
How it works
- Arrivals: inter-arrival time = −ln(U)/λ (inverse-transform of the exponential)
- Exponential service: sampled via inverse-transform
- Erlang-k: sum of k exponential samples at rate k·μ
- Weibull, Lognormal, Normal, Triangular, Uniform: sampled via inverse-transform or acceptance-rejection
- Allocation: earliest available server; if all busy → virtual queue
- Blocking: M/M/1/K rejects when full; M/M/c/c rejects when all servers busy
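The mechanics listed above fit in a few lines for the basic M/M/c case. The following sketch is an assumption-laden illustration, not the tool's code. It covers FCFS service with exponential sampling only, has no blocking and no warm-up discard, and uses a min-heap of server-free times to pick the earliest available server:

```python
import heapq
import math
import random


def simulate_mmc(lam, mu, c, n_customers, seed=0):
    """Estimate the mean wait in queue of FCFS M/M/c by simulation."""
    rng = random.Random(seed)

    def exp_sample(rate):
        # Inverse transform: -ln(U)/rate, with U drawn from (0, 1]
        return -math.log(1.0 - rng.random()) / rate

    free_at = [0.0] * c          # min-heap: when each server next becomes free
    clock = 0.0
    total_wait = 0.0
    for _ in range(n_customers):
        clock += exp_sample(lam)                  # next arrival time
        earliest = heapq.heappop(free_at)         # earliest available server
        start = max(clock, earliest)              # wait only if all are busy
        total_wait += start - clock
        heapq.heappush(free_at, start + exp_sample(mu))
    return total_wait / n_customers


lam, mu = 15.0, 20.0                              # triage example, per hour
simulated = simulate_mmc(lam, mu, c=1, n_customers=100_000, seed=42)
analytical = (lam / mu) / (mu - lam)
print(f"simulated Wq = {simulated * 60:.2f} min, "
      f"analytical Wq = {analytical * 60:.2f} min")
```

Because customers are processed in arrival order, no explicit event list is needed here; priorities, blocking or routing would require a full event queue.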
What the results show
| Output | Meaning |
|---|---|
| Simulation vs Analytical | Validates the formulas for the given parameters |
| Wq histogram | Full waiting time distribution (real percentiles) |
| Queue over time | Dynamic evolution — identifies bursts and idle periods |
| Simulated blocking rate | Real proportion of rejected entities |
For critical analyses, use N = 500 entities. Agreement within ±10% of analytical results confirms the parameters are correct.
Non-exponential distributions — M/G/1 and M/G/c
In practice, service time does not always follow an exponential distribution. The tool lets you select seven distributions in the “Service time distribution” field, all with analytical results via P-K and Kingman formulas:
| Distribution | Parameters | Typical CV | When to use |
|---|---|---|---|
| Exponential | μ | 1.00 | Standard M/M/c — high random variability |
| Erlang-k | k phases, μ | 1/√k | Services with k sequential equal-duration steps |
| Weibull | α (shape), β (scale) | depends on α | Equipment lifetimes, repair times |
| Lognormal | μln, σln | depends on σln | Cognitive tasks, service with long tail |
| Truncated Normal | μ, σ | low (< 1) | Well-trained processes with small variability |
| Triangular | min, mode, max | low | Expert estimates without historical data |
| Uniform | a, b | at most 1/√3 ≈ 0.58 (reached when a = 0) | Services guaranteed to take between a and b |
Coefficient of Variation (CV = σ/mean)
CV is the key indicator. The tool displays CV = value beside the selector in real time.
- CV < 1: more predictable than exponential → smaller queue than M/M/c
- CV = 1: equivalent to exponential → identical results to M/M/c
- CV > 1: more variable than exponential → larger queue than M/M/c
Pollaczek-Khinchine formula (M/G/1)
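For 1 server and any distribution with mean E[S] = 1/μ and variance Var[S]:

$$W_q = \frac{\lambda\,\bigl(\mathrm{Var}[S] + E[S]^2\bigr)}{2\,(1-\rho)} = \frac{1 + CV^2}{2}\cdot\frac{\rho}{\mu\,(1-\rho)}$$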
When CV = 1 (exponential), the P-K formula reproduces the M/M/1 result exactly.
Kingman VCA approximation (M/G/c)
For c parallel servers, Kingman’s approximation corrects M/M/c by CV:

$$W_q^{M/G/c} \approx \frac{1 + CV^2}{2}\; W_q^{M/M/c}$$
The factor (1 + CV²)/2 equals exactly 1 for CV = 1 (exponential) and drops to 0.5 when CV → 0 (constant time, equivalent to M/D/c). This approximation has error typically < 5% for ρ < 0.90.
Example: call center with lognormal service times
80 calls/hour, 9 agents, μ = 15/h (ρ ≈ 0.59). Historical data shows CV = 1.8 (typical long tail of technical support calls).
| Model | Lq | Wq |
|---|---|---|
| M/M/c (CV = 1) | 0.162 | 7.3 s |
| M/G/c with CV = 1.8 (Kingman) | 0.344 | 15.5 s |
Real variability more than doubles waiting time: the correction factor is (1 + 1.8²)/2 = 2.12. Modelling the correct distribution drives more accurate staffing decisions.
Example: assembly line with Erlang-4 service
25 parts/hour, 1 workstation, each operation has 4 equal sequential steps of ~30 s each (μ = 30/h, k = 4 → CV = 0.5).
With Erlang-4 distribution the queue is 37.5% smaller than M/M/1 would predict. Correctly modelling the distribution can significantly change the decision on whether to add servers.
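The CV correction is simple to apply once the M/M/c waiting time is available. The sketch below computes Erlang-C through the Erlang-B recursion and scales the result by (1 + CV²)/2; this is exact for c = 1 (Pollaczek-Khinchine) and only an approximation for c > 1. Function names are illustrative:

```python
def erlang_c(c: int, a: float) -> float:
    """P(wait) for M/M/c, obtained from the Erlang-B recursion."""
    b = 1.0
    for n in range(1, c + 1):
        b = a * b / (n + a * b)              # Erlang-B blocking B(n, a)
    rho = a / c
    return b / (1.0 - rho * (1.0 - b))       # convert blocking to P(wait)


def wq_mmc(lam: float, mu: float, c: int) -> float:
    """Mean wait in queue for M/M/c."""
    if lam >= c * mu:
        raise ValueError("unstable: requires lam < c * mu")
    return erlang_c(c, lam / mu) / (c * mu - lam)


def wq_mgc(lam: float, mu: float, c: int, cv: float) -> float:
    """(1 + CV^2)/2 correction: exact for c = 1, approximate for c > 1."""
    return (1.0 + cv**2) / 2.0 * wq_mmc(lam, mu, c)


# Assembly line: 25 parts/hour, mu = 30/hour, Erlang-4 service (CV = 0.5)
base = wq_mgc(25.0, 30.0, 1, cv=1.0)
erl4 = wq_mgc(25.0, 30.0, 1, cv=0.5)
print(f"M/M/1 Wq = {base * 60:.2f} min, M/E4/1 Wq = {erl4 * 60:.2f} min, "
      f"reduction = {1 - erl4 / base:.1%}")
```

Setting `cv=0.0` reproduces the M/D/1 halving, and `cv=1.0` returns the plain M/M/c value.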
Quick guide: which model to use?
Is physical queuing possible?
├── NO → M/M/c/c (Erlang-B) trunks, beds, tables, licenses
└── YES
├── Is there a maximum capacity K?
│ └── YES → M/M/1/K parking, buffer, finite waiting room
└── NO (unlimited queue)
├── Service time distribution
│ ├── Exponential (CV ≈ 1)
│ │ ├── 1 server → M/M/1
│ │ └── c servers → M/M/c (Erlang-C)
│ ├── Fixed/constant (CV = 0) → M/D/1
│ └── General (Erlang, Weibull, Lognormal, Normal, Triangular, Uniform)
│ ├── 1 server → P-K formula (M/G/1)
│ └── c servers → Kingman VCA (M/G/c)
Frequently asked questions
1. How do I convert between time units?
Keep λ and μ in the same unit. For example, with λ = 90/hour and μ = 15/hour, working in minutes gives λ = 90/60 = 1.5/min and μ = 15/60 = 0.25/min. All W and Wq results will be in the chosen unit. The tool automatically displays results in s, min or h according to magnitude.
2. Which service time distribution should I choose?
Use the distribution fitting button (🔬): paste a sample of service times and the tool calculates CV and suggests a distribution. Quick reference:
| Measured CV | Suggested distribution |
|---|---|
| CV ≈ 0 (< 0.15) | Deterministic → use M/D/1 |
| 0.15–0.65 | Erlang-k with k ≈ 1/CV² |
| 0.65–1.35 | Exponential → M/M/c |
| > 1.35 | Hyper-exponential or Lognormal → use Lognormal in the tool |
For any distribution selected, the tool automatically computes analytical KPIs via P-K (c=1) or Kingman (c>1) and allows simulation to validate.
3. The system is unstable (ρ ≥ 1). What to do?
Three options: (1) Increase μ — reduce service time; (2) Increase c — add parallel servers; (3) Reduce λ — scheduling, triage or routing. The server comparison table (M/M/c result) shows quantitatively the impact of adding each server.
4. What is the difference between Erlangs and ρ?
Erlangs (a = λ/μ) is the total offered traffic — can be > 1. ρ (= a/c) is utilization per server, between 0 and 1. Erlang-B uses a; Erlang-C uses both.
5. When to use simulation instead of formulas?
Use simulation when: (a) you want to validate analytical results; (b) the system has multiple chained stages (queueing network); (c) there are priorities or complex routing; (d) λ and μ vary throughout the day; (e) you want the full Wq distribution (percentiles), not just the mean. For all models and distributions supported by this tool, analytical and simulation results converge with N ≥ 300 — validating the modelling before moving to more complex software.
6. How does Arena relate to this tool?
Arena and Simul8 use DES for arbitrarily complex cases. The integrated simulation in this tool uses the same principle for the basic models. For models covered by closed-form formulas, analytical and simulation results converge with sufficient N — validating the modeling before moving to more complex software.
7. How to size a maintenance team?
Use M/M/c with λ = call rate and μ = service rate per technician. The comparison table shows Wq for c-1, c, c+1, c+2. Compare the marginal cost of one technician with the hourly cost of machine downtime: cost_downtime × Wq × λ = downtime cost per hour.
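The trade-off can be tabulated by adding the wage bill to the downtime term cost_downtime × Wq × λ for each team size. In the sketch below the two cost figures are hypothetical placeholders, and the function names are illustrative:

```python
def wq_mmc(lam: float, mu: float, c: int) -> float:
    """Mean wait in queue for M/M/c (Erlang-B recursion, then Erlang-C)."""
    a = lam / mu
    if a >= c:
        raise ValueError("unstable: requires lam < c * mu")
    b = 1.0
    for n in range(1, c + 1):
        b = a * b / (n + a * b)
    p_wait = b / (1.0 - (a / c) * (1.0 - b))
    return p_wait / (c * mu - lam)


def hourly_cost(lam, mu, c, wage, downtime_cost):
    """Staff cost plus downtime cost per hour: c*wage + downtime_cost*Wq*lam."""
    return c * wage + downtime_cost * wq_mmc(lam, mu, c) * lam


lam, mu = 3.0, 2.0                    # calls/hour, repairs/hour per technician
WAGE, DOWNTIME = 40.0, 300.0          # hypothetical costs per hour
for c in range(2, 6):
    print(c, f"Wq={wq_mmc(lam, mu, c) * 60:.1f} min",
          f"total cost={hourly_cost(lam, mu, c, WAGE, DOWNTIME):.0f}/h")
```

The cheapest team size is the one where an extra technician's wage is no longer offset by the reduction in downtime cost.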
8. How to calculate waiting time SLA for M/M/c?
To find the smallest c guaranteeing 80% of customers with Wq ≤ 20 s, use the comparison table and identify the smallest c at which this probability exceeds 80%.