Unit 4 — routing baselines, residual post-processing, and a stable countdown

Pickup & Trip ETA Prediction

Routing (Unit 1) answers "what path?" This unit answers the harder question the rider actually stares at: "how long will that path take, right now?" The same road takes four minutes at 2 a.m. and eleven at 8:30. A good ETA sets expectations, feeds dispatch scoring, and anchors the fare; a jittery or wrong one erodes trust faster than almost anything else in the app.

Uber frames ETA as a hybrid problem: a physical routing model produces a baseline, then a machine-learned model predicts the residual between that baseline and reality. You will build exactly that ladder — baseline → congestion model → residual post-processing → smoothed display — implementing each piece, checking it with python test.py, and unlocking the next with a deterministic checkpoint.
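
To make the hybrid framing concrete, here is a minimal sketch of "baseline plus learned residual". Everything in it is an illustrative assumption rather than Uber's implementation: the `Segment` type, the function names, the per-segment speeds, and the toy residual function that stands in for a trained model.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Segment:
    """One road segment on a route (hypothetical type for this sketch)."""
    length_m: float
    speed_mps: float  # assumed typical travel speed on this segment


def baseline_eta_s(route: Sequence[Segment]) -> float:
    """Physical baseline: sum of length / speed over the route's segments."""
    return sum(seg.length_m / seg.speed_mps for seg in route)


def hybrid_eta_s(route: Sequence[Segment],
                 residual_model: Callable[[float], float]) -> float:
    """Baseline plus a predicted residual, floored at zero seconds."""
    base = baseline_eta_s(route)
    return max(0.0, base + residual_model(base))


# Toy stand-in for a learned model: assume reality runs 20% slower than baseline.
def toy_residual(base_s: float) -> float:
    return 0.2 * base_s


route = [Segment(length_m=500.0, speed_mps=10.0),
         Segment(length_m=1200.0, speed_mps=15.0)]

print(baseline_eta_s(route))              # 130.0 seconds
print(hybrid_eta_s(route, toy_residual))  # 156.0 seconds
```

A real residual model would take many more features than the baseline alone (time of day, location, live traffic); the single-argument signature here only keeps the sketch short.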

The problem: predicting arrival time

Produce accurate pickup ETA (driver → rider) and trip ETA (pickup → destination), keep them current as traffic and position change, and present them as a stable countdown — all within a few milliseconds, because this is the highest-QPS prediction in the whole system.

Functional

Estimate pickup and trip ETA for any origin/destination pair.
Update the estimate as the driver moves and traffic shifts.
Feed ETAs to dispatch (Unit 3), pricing (Unit 5), and matching (Unit 6).
Present a smooth, non-flickering countdown to the rider.
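
The last requirement, a countdown that does not flicker, can be sketched as a small smoothing step applied at each refresh. This is one possible approach, not the unit's prescribed one: the function name `smooth_display`, the blend factor `alpha`, and the per-refresh cap `max_jump_s` are all assumptions chosen for illustration.

```python
def smooth_display(prev_display_s: float,
                   new_estimate_s: float,
                   elapsed_s: float,
                   alpha: float = 0.3,
                   max_jump_s: float = 30.0) -> float:
    """Return the ETA (seconds) to show after a refresh.

    alpha and max_jump_s are hypothetical tuning knobs, not values from the course.
    """
    # 1. Let the countdown tick down by the wall-clock time that has passed.
    expected = max(0.0, prev_display_s - elapsed_s)
    # 2. Move only part of the way toward the fresh model estimate.
    target = expected + alpha * (new_estimate_s - expected)
    # 3. Cap how far the displayed value may move in a single refresh.
    delta = max(-max_jump_s, min(max_jump_s, target - expected))
    return max(0.0, expected + delta)


# The model suddenly doubles its estimate; the display moves up by at most 30 s.
shown = smooth_display(prev_display_s=300.0, new_estimate_s=600.0, elapsed_s=5.0)
print(shown)
```

The trade-off is responsiveness versus stability: a smaller `alpha` or tighter cap gives a calmer display but takes longer to reflect a genuine change in conditions.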

Non-functional

Return an ETA within a few milliseconds (highest-QPS model at Uber).
Keep mean absolute error within a small fraction of the true duration, at every time of day.
Keep the display stable, with no large jumps between refreshes.
Continuously recalibrate from completed trips.
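
The accuracy target above can be checked against completed trips with a few lines of code. The sketch below takes one reading of "a small fraction of true duration", namely the mean of per-trip absolute error divided by actual duration; that interpretation, the function names, and the sample numbers are assumptions.

```python
from typing import Sequence


def mean_absolute_error_s(predicted_s: Sequence[float],
                          actual_s: Sequence[float]) -> float:
    """Mean absolute error in seconds over completed trips."""
    if len(predicted_s) != len(actual_s) or not actual_s:
        raise ValueError("need two equal-length, non-empty sequences")
    return sum(abs(p - a) for p, a in zip(predicted_s, actual_s)) / len(actual_s)


def mean_relative_error(predicted_s: Sequence[float],
                        actual_s: Sequence[float]) -> float:
    """Mean of |predicted - actual| / actual; assumes every actual duration > 0."""
    if len(predicted_s) != len(actual_s) or not actual_s:
        raise ValueError("need two equal-length, non-empty sequences")
    return sum(abs(p - a) / a for p, a in zip(predicted_s, actual_s)) / len(actual_s)


# Made-up sample: three completed trips, durations in seconds.
predicted = [100.0, 210.0, 290.0]
actual = [100.0, 200.0, 300.0]

mae = mean_absolute_error_s(predicted, actual)
rel = mean_relative_error(predicted, actual)
print(f"MAE: {mae:.2f} s, mean relative error: {rel:.2%}")
```

Computing the same metrics per hour of day, rather than once over all trips, is what would reveal whether accuracy holds "across the day" or only on average.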

Constraints

True segment speeds vary by time of day, weather, and incidents.
Live probe data is noisy and arrives with delay.
The exact route a driver takes is not known in advance.
The map is a model — it cannot perfectly capture conditions on the ground.

That last constraint is the crux. As Uber's Maps team puts it, the map is not the terrain: a road graph is a model, and even a perfect shortest-path query returns an ETA conditioned on a route the rider and driver may not actually take. The entire unit is about closing the gap between the graph's answer and the real world.
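
A toy calculation shows why an ETA conditioned on a single route can mislead. The candidate routes, their ETAs, and the probabilities that a driver takes each one are all made-up numbers, and weighting by route probability is just one illustrative way to express the gap, not a method stated in this unit.

```python
# Hypothetical candidates: (route name, ETA in seconds if taken,
# assumed probability that the driver actually takes it).
candidates = [
    ("graph shortest path", 240.0, 0.7),
    ("familiar main road", 300.0, 0.3),
]

# What a perfect shortest-path query reports: the ETA of the best route only.
shortest_path_eta_s = min(eta for _, eta, _ in candidates)

# What the rider experiences on average if drivers do not always follow it.
expected_eta_s = sum(eta * prob for _, eta, prob in candidates)

# The part of the error that no amount of routing accuracy can remove.
route_choice_gap_s = expected_eta_s - shortest_path_eta_s

print(shortest_path_eta_s, expected_eta_s, route_choice_gap_s)
```

Even with exact segment speeds, the gap stays positive whenever some drivers deviate from the graph's preferred route, which is one reason a learned correction on top of the baseline is useful.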
