How Uber's Surge Pricing Actually Works — Real-Time Geospatial State at Planetary Scale
Surge pricing is not a slider in an admin panel. It is not a city-wide setting. It is a real-time distributed systems problem: track every driver's location across the planet, track every rider's intent-to-request across the planet, bucket both into millions of tiny geographic cells, compute a supply/demand ratio per cell every few seconds, and push the resulting multiplier back out to every app that might care — all with sub-second latency on mobile networks. The "1.4x" you see on your screen is the visible tip of one of the most sophisticated real-time geospatial systems ever deployed to consumers, and almost none of the engineering that makes it possible is visible to the rider or the driver.
This post walks through how that system is actually built, grounded in Uber's own public engineering blog posts, the open-source H3 library Uber released in 2018, and public conference talks by Uber engineers. Nothing here is insider speculation — every architectural choice described is publicly documented somewhere, if you know where to look.
Why Traditional Geo-Indexing Does Not Work at Uber Scale
The first instinct of any engineer asked to "group drivers by location" is to reach for a grid. Divide the map into squares, assign each driver to a square, count drivers per square, done. This works for a small app. It falls apart at Uber's scale for a few specific reasons:
- Square grids distort near the poles. A degree of longitude shrinks as you move away from the equator, so one-degree cells that look square near the equator become squashed slivers in Norway and southern Chile.
- Neighbors are not equidistant. In a square grid, diagonal neighbors are farther away than horizontal or vertical neighbors. When you want to ask "how many drivers are near this cell," that asymmetry becomes a problem.
- Zoom levels do not nest cleanly. Aggregating from small cells to larger cells in a square grid introduces edge artifacts.
- Cell boundaries cross natural features like rivers and highways in ways that do not match real-world travel patterns.
Uber's answer to all of this was to build their own geospatial index from scratch. They called it H3, and in 2018 they released it as open source. You can read the source on GitHub. It is one of the most important pieces of infrastructure most people have never heard of.
H3 — Hexagonal Hierarchical Geospatial Indexing
H3 divides the surface of the Earth into hexagons. Every hexagon has a unique 64-bit identifier that encodes both its location and its resolution. There are 16 resolutions (0 through 15), ranging from cells that cover continents down to cells roughly a square meter in size.
Hexagons have three properties that make them ideal for real-time geospatial state:
- Uniform neighbor distances. Every hexagonal cell has exactly six neighbors, all roughly the same distance from its center: no diagonals, no special cases. (H3 does include twelve pentagon cells per resolution, an unavoidable consequence of wrapping a hex grid around a sphere, but they are placed over open ocean where ride demand is effectively zero.) Aggregating "this cell plus its neighbors" is always the same operation.
- Clean hierarchy. H3 cells nest approximately from coarse to fine resolutions, so you can zoom out from a block-level cell to a neighborhood cell to a city cell with a simple ID manipulation.
- Consistent area. Unlike latitude-longitude rectangles, H3 cells have roughly consistent area anywhere on the globe. A resolution-9 cell in Tokyo is the same size as a resolution-9 cell in Cape Town.
At Uber, H3 resolution 8 is a common choice for surge pricing — each cell averages roughly 0.7 square kilometers, a good match for the "block or two" granularity that surge operates at. Some cities may use finer resolutions in dense areas and coarser resolutions in sparser zones.
Every location event in Uber's system — a driver GPS ping, a rider opening the app, a ride request being created — is tagged with the H3 cell ID at the moment it happens. That ID becomes the sharding key for everything that follows.
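As a toy illustration of that tagging step, here is a minimal Python sketch. It uses a naive latitude/longitude square grid as a stand-in for H3 (a real implementation would call the H3 library to get a true hexagonal cell ID), and the function and class names are invented for this example:

```python
import math
from dataclasses import dataclass

def cell_key(lat: float, lng: float, resolution: int) -> str:
    """Bucket a coordinate into a grid cell; finer resolution = smaller cells.
    A crude square-grid stand-in for a real H3 index."""
    size = 1.0 / (2 ** resolution)  # cell edge length in degrees
    row = math.floor(lat / size)
    col = math.floor(lng / size)
    return f"r{resolution}:{row}:{col}"

@dataclass
class LocationEvent:
    driver_id: str
    lat: float
    lng: float
    cell: str = ""

def tag_event(event: LocationEvent, resolution: int = 8) -> LocationEvent:
    # The cell key becomes the sharding key for everything downstream.
    event.cell = cell_key(event.lat, event.lng, resolution)
    return event

evt = tag_event(LocationEvent("driver-42", 37.7749, -122.4194))
```

Two drivers a few meters apart land in the same cell and therefore route to the same shard, which is the whole point of the tagging step.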
Geo-Sharding — The Trick That Makes Real-Time State Possible
Here is the core architectural insight: if every event is tagged with an H3 cell, you can assign each cell to a specific server. Events flow to the server that owns their cell. Every update to the cell's state happens in one place. Every query for the cell's state reads from one place. No cross-server coordination required for the common case.
This is called geo-sharding, and it is the reason surge pricing is tractable at Uber's scale. A naive implementation would have every driver location update write to a central database, creating a global bottleneck that could never keep up with millions of events per second. Geo-sharding splits the world into pieces small enough that each piece fits on one server, and the pieces rarely need to talk to each other.
The abstract architecture looks like:
- Driver apps stream location updates every few seconds over a persistent connection to Uber's mobile gateway.
- The gateway tags each event with an H3 cell ID based on the driver's current GPS coordinates.
- Events are routed to the geo-shard owning that cell using a consistent-hash function. The routing layer knows which server is currently responsible for which H3 cells.
- The geo-shard updates its in-memory state for that cell — active drivers, recent requests, and derived values like the supply/demand ratio.
- Aggregated state is continuously published to a downstream service that computes the surge multiplier and broadcasts it to apps that are watching nearby cells.
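The routing step in the list above can be sketched with rendezvous (highest-random-weight) hashing, one common way to build a consistent-hash function. This is an illustrative stdlib-only sketch, not Uber's actual router, and the server names are made up:

```python
import hashlib

SHARD_SERVERS = ["shard-a", "shard-b", "shard-c"]  # hypothetical server names

def owner_of(cell_id: str, servers=SHARD_SERVERS) -> str:
    """Rendezvous hashing: each cell deterministically maps to one server,
    and removing a server only remaps the cells that server owned."""
    def score(server: str) -> int:
        digest = hashlib.sha256(f"{server}|{cell_id}".encode()).hexdigest()
        return int(digest, 16)
    return max(servers, key=score)
```

The useful property is minimal disruption: if shard-b disappears, only shard-b's cells move, while every other cell keeps its existing owner and its in-memory state.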
Uber has publicly discussed using Apache Kafka as the streaming backbone that carries these events between services. Each event stream is partitioned by a geo-aware key, which means downstream consumers naturally process events for the same cells together. This is textbook "locality of reference" applied to a streaming system.
Computing the Surge Multiplier
Once every H3 cell has a continuously-updating view of its local supply and demand, the actual surge calculation is conceptually simple:
    supply = count of active drivers in cell (and adjacent cells)
    demand = count of open requests + recent app opens + abandoned sessions
    ratio  = demand / max(supply, 1)

    if ratio < threshold_low:
        multiplier = 1.0
    elif ratio > threshold_high:
        multiplier = f(ratio)    # non-linear curve, capped at a max
    else:
        multiplier = smoothed_transition
The math is deliberately vague because the exact curve is business logic, not a systems problem — and Uber keeps that curve proprietary. What matters for the infrastructure is that the inputs are cheap to compute locally (because of geo-sharding), the output is a small number, and the update frequency is bounded (the multiplier does not change on every single event, it updates on a regular tick with smoothing so it does not flicker).
A few operational details that matter a lot in practice:
- Demand is not just ride requests. A user opening the app and looking at the map is a demand signal, even if they do not tap to book. Uber uses a weighted combination of these signals to avoid gaming attacks where riders could suppress surge by not tapping the book button.
- Supply includes drivers who are nearby but not in the exact cell. Uber's aggregation usually looks at a "ring" of cells around the target cell, using H3's ring function. This prevents edge-of-cell artifacts where a driver one meter outside a cell boundary is invisible to it.
- The multiplier is smoothed over time. If a big event ends and 200 people suddenly tap "book," the multiplier does not instantly spike to 5x. A rate limiter on the multiplier itself keeps it moving in controlled steps, which is a classic control-systems trick to prevent oscillation.
- Different product lines use different logic. UberX, Uber Pool, Uber Black, and Uber Eats all have their own supply/demand graphs and their own multipliers, running on the same underlying geo-sharded infrastructure.
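The rate-limiting idea from the list above fits in a few lines. The step size and cap below are invented for illustration; Uber's actual curve and limits are proprietary:

```python
def step_multiplier(current: float, target: float,
                    max_step: float = 0.2, cap: float = 5.0) -> float:
    """Move the published multiplier toward the freshly computed target,
    but never by more than max_step per tick. Values are illustrative."""
    delta = max(-max_step, min(max_step, target - current))
    return max(1.0, min(cap, current + delta))

# A sudden demand spike ramps over several ticks instead of jumping:
m = 1.0
history = []
for _ in range(5):
    m = step_multiplier(m, target=2.0)
    history.append(round(m, 2))
# history climbs in controlled steps rather than leaping straight to 2.0
```

Clamping the per-tick delta is the simplest form of the control-systems smoothing described above; a production system might use an exponential moving average or a full PID loop instead.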
How the Multiplier Reaches Your Phone
Once a new multiplier is computed for a cell, it has to get back to every app that might care. Here "cares" means "is currently open and looking at a map that includes this cell, or is about to request a ride in this cell."
The last hop is a push — Uber's mobile apps maintain a persistent connection to a gateway, and the gateway forwards multiplier updates for the cells the user is currently viewing. When you drag the map in the Uber app, the app tells the gateway "I now care about these cells," and the gateway updates its subscription list. This is why surge heatmaps feel live — they are not polled, they are streamed.
A subtle point: the gateway does not need to know the exact multiplier for every cell on Earth. It only needs to know the multiplier for cells that at least one app is watching. That keeps the push fanout tractable. Cells with no active viewers do not have their multipliers broadcast — the state still updates in the geo-shards, but no network traffic is wasted sending it anywhere.
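A minimal sketch of that viewport-subscription pattern, with invented class and method names (a real gateway is far more involved):

```python
from collections import defaultdict

class SurgeGateway:
    """Viewport-based fanout: multipliers are pushed only for cells
    that at least one connected app is currently watching."""
    def __init__(self):
        self.watchers = defaultdict(set)  # cell_id -> set of session ids
        self.sent = []                    # stand-in for the push channel

    def update_viewport(self, session: str, cells: set):
        # Called when a user drags the map: replace their subscription set.
        for sessions in self.watchers.values():
            sessions.discard(session)
        for cell in cells:
            self.watchers[cell].add(session)

    def publish(self, cell: str, multiplier: float):
        # Only fan out if someone is actually looking at this cell.
        for session in self.watchers.get(cell, ()):
            self.sent.append((session, cell, multiplier))
```

Publishing to a cell nobody is watching costs nothing on the network, which is exactly the property that keeps the fanout tractable.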
The Dispatch System — Where Surge Feeds Into Matching
Surge pricing is only one output of this geospatial state pipeline. The same real-time cell state feeds Uber's dispatch system (publicly referred to as "DISCO" in older Uber Engineering posts), which handles the actual rider-to-driver matching.
When you request a ride, the dispatch service queries the geo-shard for your cell (and nearby cells) to find candidate drivers. It then runs a matching algorithm that takes into account driver proximity, estimated time of arrival, driver ratings, and several business objectives Uber does not publish in detail. The result is a dispatched driver, and this dispatch also updates the cell state — the driver is now "committed" and is temporarily removed from the available supply pool, which can itself move the surge multiplier.
This feedback loop is the reason surge can change between opening the app and booking. Opening the app registered you as demand. Other people booked, reducing supply. The multiplier went up. The new quote reflects the new state. None of this happens in a central coordinator — it is all emergent from the geo-sharded state flowing through streams.
The Hard Parts Nobody Tells You About
Geo-sharded real-time state sounds elegant on paper. In practice, it has several properties that make it one of the more difficult architectures to operate in production.
Rebalancing when cells get hot
A concert ends. Thousands of people open the Uber app in the same few hex cells within seconds. The geo-shards owning those cells are suddenly handling 100x their normal event rate. You cannot simply add more servers — the sharding function means those specific cells are still pinned to the same server. Uber handles this with dynamic rebalancing: hot cells can be split off to a new owner, with state handover done carefully to avoid losing events. This is a classic consistent-hashing-with-moves problem, and getting it right without losing events is the hardest part of operating the system.
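One simple way to sketch the "split a hot cell off to a new owner" idea is an override table consulted before the hash function. This deliberately ignores the hard part noted above, the state handover; all names are invented for illustration:

```python
import hashlib

class ShardRouter:
    """Hot-cell rebalancing sketch: a small override table pins hot cells
    to dedicated servers without rehashing the rest of the world."""
    def __init__(self, servers):
        self.servers = servers
        self.overrides = {}  # cell_id -> server, for split-off hot cells

    def _hash(self, cell_id: str) -> int:
        return int(hashlib.sha256(cell_id.encode()).hexdigest(), 16)

    def owner_of(self, cell_id: str) -> str:
        if cell_id in self.overrides:
            return self.overrides[cell_id]
        return self.servers[self._hash(cell_id) % len(self.servers)]

    def split_hot_cell(self, cell_id: str, new_server: str):
        # In production, routing flips only after the new owner has
        # received the cell's state; here we just flip the table.
        self.overrides[cell_id] = new_server
```

The override table keeps the move surgical: one hot cell changes owner while every cold cell keeps its existing placement.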
Cell boundary artifacts
What happens when a driver is sitting exactly on the boundary between two H3 cells? The GPS reading flickers between the two cells every few seconds, and the driver appears to teleport. Without smoothing, this would cause the supply count in both cells to jitter. The fix is to use hysteresis — a driver is not considered to have left a cell until they have been continuously outside it for a short period. This is the same trick thermostats use to avoid flickering between "heat on" and "heat off."
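Hysteresis for cell membership can be sketched like this; the exit timeout is an invented illustrative value:

```python
class CellMembership:
    """A driver only 'leaves' a cell after being continuously outside it
    for exit_after seconds, so GPS flicker at a boundary does not jitter
    the supply counts."""
    def __init__(self, exit_after: float = 10.0):
        self.exit_after = exit_after
        self.current_cell = None
        self.outside_since = None

    def observe(self, cell: str, now: float) -> str:
        if self.current_cell is None:
            self.current_cell = cell
        if cell == self.current_cell:
            self.outside_since = None   # back inside: reset the timer
        elif self.outside_since is None:
            self.outside_since = now    # first reading outside
        elif now - self.outside_since >= self.exit_after:
            self.current_cell = cell    # stayed out long enough: commit
            self.outside_since = None
        return self.current_cell
```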
Clock skew and event ordering
Events from drivers on different phones arrive at the gateway with potentially inconsistent timestamps. When you are computing a real-time supply count, a late-arriving event from a driver who actually left the cell can temporarily inflate the count. Uber uses event-time watermarking — a concept borrowed from Apache Beam and Flink — to decide when a cell's state for a given time window is "final" versus still accumulating.
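A watermark can be sketched as "the maximum event time seen so far, minus an allowed-lateness budget"; a window is final once the watermark passes its end. This mirrors the Beam/Flink concept in miniature, with invented names and values:

```python
class WatermarkTracker:
    """Event-time watermark sketch: late events within the lateness budget
    still count; once the watermark passes a window's end, that window's
    supply/demand counts are treated as final."""
    def __init__(self, allowed_lateness: float = 5.0):
        self.allowed_lateness = allowed_lateness
        self.max_event_time = float("-inf")

    def observe(self, event_time: float):
        self.max_event_time = max(self.max_event_time, event_time)

    @property
    def watermark(self) -> float:
        return self.max_event_time - self.allowed_lateness

    def window_is_final(self, window_end: float) -> bool:
        return self.watermark >= window_end
```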
Cross-region consistency
Uber operates in many countries, and a driver crossing an international border (rare, but possible) touches cells in different regions with different backend clusters. The geo-shards for the cells on either side of the border may be in different data centers. Handovers require cross-region coordination that is carefully avoided in the common case but must still be correct when it happens.
The DevOps Patterns You Can Steal From This
You will likely never build a ride-hailing app. But the patterns Uber used to solve this problem show up everywhere in modern distributed systems, and they are worth learning because they apply far beyond ride-hailing.
- Sharding by a domain-specific key — H3 cell IDs are a geospatial sharding key. The same idea applies to any system where state has locality: shard by user ID for social apps, by customer ID for SaaS, by tenant ID in multi-tenant systems. Related work happens on the same server, unrelated work stays away.
- Streaming aggregation instead of query aggregation. Do not compute "active drivers in this cell" by querying a database — maintain it continuously in memory as events flow in. Apache Flink, ksqlDB, and Kinesis Data Analytics are all tools for this pattern. It is dramatically cheaper than per-query aggregation.
- Push, not poll, for real-time state. The Uber app does not poll for surge every second — it subscribes to updates for cells it cares about. This same pattern is how modern observability systems (Grafana Live, CloudWatch metric streams) deliver real-time data to dashboards without destroying the backend.
- Hot cell rebalancing. Any sharded system eventually has hot shards. Build in the ability to split and move shards without downtime before you need it. Retrofitting this is painful; designing for it is not.
- Smoothing and rate-limiting outputs. Raw state signals are noisy. Surge pricing, autoscaling, alert thresholds, cache expiries — every real-time system needs smoothing to prevent oscillation. A moving average or a PID controller is often the difference between a system that works and a system that thrashes.
These are not Uber-specific tricks. They are general distributed systems patterns that happen to be visible in the surge pricing example because the feature is so publicly exposed. Any system that tracks state with strong locality and needs to react to changes in near real time will converge on something that looks a lot like this architecture.
Frequently Asked Questions
What is H3 and why did Uber build it?
H3 is Uber's open-source hexagonal geospatial indexing library. It divides the Earth into hexagonal cells at 16 resolutions, provides uniform neighbor distances, and makes aggregation across cells clean and efficient. Uber built it because square grids have distortion and neighbor-inconsistency problems that break at planetary scale.
Why does my Uber fare sometimes jump right after I open the app?
Opening the app is itself a demand signal. If enough riders open the app in the same cell at once, the cell's supply/demand ratio can cross a threshold and trigger a higher multiplier. The fare you see reflects the state when your quote was generated, which is why waiting a moment and retrying sometimes gives a different number.
Is surge pricing the same everywhere in a city?
No. Surge is localized to H3 cells, typically a few blocks in size. Two riders standing a short distance apart can see different multipliers if they are in different cells. Moving half a block sometimes drops the fare.
How does Uber handle real-time location updates from millions of drivers?
Driver location events flow into a streaming pipeline (Uber publicly uses Kafka for much of this), geo-sharded by H3 cell. Each server owns a slice of cells and handles all state updates for those cells in memory, avoiding any central database bottleneck.
Does Uber use AWS for this?
Uber is well known for running much of its infrastructure on-premises and on customized cloud setups. The specific cloud provider is less important than the architectural pattern — which is cloud-agnostic and works on AWS, GCP, or a private data center equally well.
Next Steps
If the distributed-systems patterns in this post interested you, these guides go deeper on related topics:
- How Netflix-Scale DRM Works — another "the visible feature is the tip of the iceberg" deep dive, this time on license servers and CENC
- Kubernetes RBAC Explained — authorization patterns that share DNA with Uber's geo-sharded state
- AWS IAM Best Practices — sharding authorization decisions in cloud infrastructure
- Free DevOps resources — production architecture guides and interview prep