Building Real-Time Shipment Tracking and ETA Intelligence with RisingWave

The Challenge: Maintaining Accurate ETAs in Real-Time Logistics

In modern logistics and supply chain systems, real-time shipment visibility is no longer a nice-to-have feature—it is a core operational requirement. Delivery vehicles continuously stream GPS location updates, traffic conditions fluctuate throughout the day, and every order comes with a promised delivery deadline that must be met. The challenge is not simply tracking where a shipment is, but continuously answering a harder question: will it arrive on time, and if not, how early can we detect that risk?

Traditional approaches fall short in two key ways. Batch pipelines introduce unavoidable latency: ETAs are computed minutes later, after the vehicle has already moved and the traffic situation has changed. Meanwhile, building a custom streaming system from scratch requires complex state management, non-trivial joins across multiple live data sources, and significant operational overhead. This creates a gap for teams that want real-time accuracy without adopting a heavyweight streaming framework.

This is the gap our demo addresses: turning raw, continuously changing logistics streams into actionable, real-time ETA and delay insights using SQL—no complex stream-processing code required.

The Solution: A SQL-Native Real-Time Logistics Pipeline with RisingWave

We built an end-to-end demo that combines Kafka for ingestion and RisingWave for continuous processing and serving. RisingWave sits at the center of the pipeline: it ingests multiple streams, maintains up-to-date state via materialized views, and exposes the latest results to a lightweight backend that powers the dashboard.

The core design principle is simple: treat each real-world signal as a stream, convert the streams into queryable tables, and let RisingWave continuously maintain the “current truth” about every shipment—its latest location, current traffic context, predicted arrival time, and delay status.

How It Works: The Data Pipeline Step by Step

The overall architecture: three Kafka topics feed RisingWave, which maintains the entire processing chain as materialized views and serves the freshest results to a lightweight backend powering the dashboard.

Step 1: Ingest Three Real-Time Streams from Kafka

The pipeline starts by ingesting three Kafka topics into RisingWave as stream tables.

gps_stream represents raw GPS updates from vehicles. Each message carries a timestamp, vehicle identifier, coordinates, and instantaneous speed.

-- 1) GPS stream (JSON)
CREATE TABLE sc.gps_raw (
    event_ts TIMESTAMPTZ,
    vehicle_id VARCHAR,
    lat DOUBLE PRECISION,
    lon DOUBLE PRECISION,
    speed_kmh DOUBLE PRECISION
) WITH (
    connector = 'kafka',
    topic = 'gps_stream',
    properties.bootstrap.server = '127.0.0.1:9092',
    scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;
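A message on gps_stream might look like the following (illustrative values; the field names match the table schema above):

```json
{
  "event_ts": "2024-05-01T08:30:00Z",
  "vehicle_id": "veh-042",
  "lat": 31.2304,
  "lon": 121.4737,
  "speed_kmh": 38.5
}
```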

traffic_stream represents traffic snapshots. Each traffic event is keyed by a region_id and includes a bounding box along with congestion and speed.

-- 2) Traffic stream (JSON)
CREATE TABLE sc.traffic_raw (
    event_ts TIMESTAMPTZ,
    region_id VARCHAR,
    min_lat DOUBLE PRECISION,
    min_lon DOUBLE PRECISION,
    max_lat DOUBLE PRECISION,
    max_lon DOUBLE PRECISION,
    congestion DOUBLE PRECISION,
    traffic_speed_kmh DOUBLE PRECISION
) WITH (
    connector = 'kafka',
    topic = 'traffic_stream',
    properties.bootstrap.server = '127.0.0.1:9092',
    scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;

order_stream represents operational order updates. Each event includes the order’s destination, promised delivery timestamp, current status, and update time. This stream is what turns vehicle movement into a shipment-tracking problem: it binds each order to its SLA and destination.

-- 3) Order updates stream (JSON)
CREATE TABLE sc.order_updates_raw (
    order_id VARCHAR,
    vehicle_id VARCHAR,
    dest_lat DOUBLE PRECISION,
    dest_lon DOUBLE PRECISION,
    promised_ts TIMESTAMPTZ,
    status VARCHAR,
    updated_at TIMESTAMPTZ
) WITH (
    connector = 'kafka',
    topic = 'order_stream',
    properties.bootstrap.server = '127.0.0.1:9092',
    scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;

Step 2: Normalize Streams into Typed Events and a Shared Spatial Key

Raw streams are already structured, but the pipeline still normalizes them into typed event views that add one critical field: a shared spatial key grid_id.

For GPS events, grid_id is computed by snapping latitude and longitude into a small grid of 0.002-degree buckets, then concatenating the bucketed lat/lon into a string. For traffic events, grid_id is computed from the center point of the traffic region’s bounding box, snapped with the same grid rule. This makes GPS and traffic events joinable on a simple equality key instead of expensive spatial predicates.

-- GPS events (typed)
CREATE MATERIALIZED VIEW sc.gps_events AS
SELECT
    event_ts,
    vehicle_id,
    lat,
    lon,
    speed_kmh,
    (
      CAST(FLOOR(lat / 0.002) AS BIGINT)::TEXT
      || '_' ||
      CAST(FLOOR(lon / 0.002) AS BIGINT)::TEXT
    ) AS grid_id
FROM sc.gps_raw;

-- Traffic events (typed)
CREATE MATERIALIZED VIEW sc.traffic_events AS
SELECT
    event_ts,
    region_id,
    min_lat,
    min_lon,
    max_lat,
    max_lon,
    congestion,
    traffic_speed_kmh,
    (
      CAST(FLOOR(((min_lat + max_lat) / 2) / 0.002) AS BIGINT)::TEXT
      || '_' ||
      CAST(FLOOR(((min_lon + max_lon) / 2) / 0.002) AS BIGINT)::TEXT
    ) AS grid_id
FROM sc.traffic_raw;
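For instance, a GPS fix at (31.2304, 121.4737) (illustrative coordinates) yields $\lfloor 31.2304 / 0.002 \rfloor = 15615$ and $\lfloor 121.4737 / 0.002 \rfloor = 60736$, so grid_id is 15615_60736. Any traffic region whose bounding-box center snaps into the same 0.002-degree cell produces the identical key, and the two events join on plain string equality.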

Step 3: Convert Event Streams into Latest State Snapshots

Real-time dashboards and ETA computation need the current state, not a full append-only history. We therefore derive three latest-state snapshots (sc.orders, sc.vehicle_latest, and sc.traffic_latest) by ranking each stream with row_number() over its time column and keeping exactly one row per key.

-- Latest order snapshot (one row per order_id)
CREATE MATERIALIZED VIEW sc.orders AS
SELECT
    order_id,
    vehicle_id,
    dest_lat,
    dest_lon,
    promised_ts,
    status,
    updated_at
FROM (
    SELECT *, row_number() OVER (PARTITION BY order_id ORDER BY updated_at DESC) AS rn
    FROM sc.order_updates_raw
) t
WHERE rn = 1;
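The other two snapshots follow the same pattern. Here is a sketch of how sc.vehicle_latest and sc.traffic_latest might be defined (the exact definitions live in the demo’s SQL scripts; the gps_ts alias is inferred from the join in Step 4):

```sql
-- Latest position per vehicle (one row per vehicle_id)
CREATE MATERIALIZED VIEW sc.vehicle_latest AS
SELECT vehicle_id, lat, lon, speed_kmh, grid_id, event_ts AS gps_ts
FROM (
    SELECT *, row_number() OVER (PARTITION BY vehicle_id ORDER BY event_ts DESC) AS rn
    FROM sc.gps_events
) t
WHERE rn = 1;

-- Latest traffic reading per grid cell (one row per grid_id)
CREATE MATERIALIZED VIEW sc.traffic_latest AS
SELECT grid_id, region_id, congestion, traffic_speed_kmh, event_ts
FROM (
    SELECT *, row_number() OVER (PARTITION BY grid_id ORDER BY event_ts DESC) AS rn
    FROM sc.traffic_events
) t
WHERE rn = 1;
```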

This is one of the places where RisingWave matters: materialized views maintain these “latest” snapshots continuously and incrementally. Every new Kafka message updates only the affected keys, keeping these state tables always fresh for downstream joins.

Step 4: Join Current Order + Vehicle + Traffic into One Live Operational Row

With the latest-state tables in place, we join the three tables using vehicle_id and grid_id. To ensure the ETA pipeline remains stable even when traffic signals are sparse, the join defines two important defaults. If traffic data is missing, congestion falls back to 0.0, and traffic_speed_kmh falls back to the vehicle’s own speed.

WITH joined AS (
    SELECT
        o.order_id,
        o.vehicle_id,
        o.status,
        o.dest_lat,
        o.dest_lon,
        o.promised_ts,
        o.updated_at,

        v.lat AS vehicle_lat,
        v.lon AS vehicle_lon,
        v.speed_kmh AS vehicle_speed_kmh,
        v.gps_ts,
        v.grid_id,

        t.region_id,
        coalesce(t.congestion, 0.0) AS congestion,
        coalesce(t.traffic_speed_kmh, v.speed_kmh) AS traffic_speed_kmh
    FROM sc.orders o
        JOIN sc.vehicle_latest v ON o.vehicle_id = v.vehicle_id
        LEFT JOIN sc.traffic_latest t ON v.grid_id = t.grid_id
)

Step 5: Compute Distance, Effective Speed, ETA, ETA Timestamp, and Delay

calc_dist computes the Haversine distance in kilometers between the vehicle’s current coordinates and the order destination. This produces distance_km, which is the foundation for travel time.

calc_dist AS (
    SELECT
        *,
        -- Haversine distance (km)
        (2 * 6371 * asin(
          sqrt(
            power(sin(radians(dest_lat - vehicle_lat)/2), 2) +
            cos(radians(vehicle_lat)) * cos(radians(dest_lat)) *
            power(sin(radians(dest_lon - vehicle_lon)/2), 2)
          )
        )) AS distance_km
    FROM joined
)

calc_speed converts raw speed signals into an effective speed. It takes the minimum of vehicle speed and traffic speed, applies a congestion penalty $1 - congestion \times 0.6$, and floors the result at 5.0 km/h to avoid unstable ETAs at extremely low speeds. This yields effective_speed_kmh.

calc_speed AS (
    SELECT
        *,
        -- Effective speed (km/h), floor at 5
        greatest(
          5.0,
          least(vehicle_speed_kmh, traffic_speed_kmh) * (1 - congestion * 0.6)
        ) AS effective_speed_kmh
    FROM calc_dist
)

calc_eta converts distance and effective speed into minutes using $\frac{distance\_km}{effective\_speed\_kmh} \times 60.0$ and outputs eta_minutes.

calc_eta AS (
    SELECT
        *,
        -- Travel time (minutes)
        (distance_km / effective_speed_kmh * 60.0) AS eta_minutes
    FROM calc_speed
)
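Plugging in illustrative numbers: a vehicle reporting 40 km/h in a cell whose traffic reads 30 km/h with congestion 0.5 gets an effective speed of $\min(40, 30) \times (1 - 0.5 \times 0.6) = 30 \times 0.7 = 21$ km/h; with 10.5 km remaining, $eta\_minutes = \frac{10.5}{21} \times 60.0 = 30$.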

The SELECT computes the two business-critical timestamps. The predicted arrival timestamp eta_ts is computed as $gps\_ts + eta\_minutes \times 1\ minute$, which ties ETA to the real event time of the latest GPS update. Delay is computed as an interval $delay\_interval = eta\_ts - promised\_ts$: a positive interval means the shipment is predicted to arrive after its promised deadline.

SELECT
    order_id,
    vehicle_id,
    status,
    dest_lat, dest_lon,
    promised_ts,
    updated_at,

    vehicle_lat, vehicle_lon,
    vehicle_speed_kmh,
    gps_ts,
    grid_id,

    region_id,
    congestion,
    traffic_speed_kmh,

    distance_km,
    effective_speed_kmh,
    eta_minutes,

    -- ETA timestamp based on event time
    (gps_ts + (eta_minutes * INTERVAL '1 minute')) AS eta_ts,

    -- Delay vs promised time (interval)
    ((gps_ts + (eta_minutes * INTERVAL '1 minute')) - promised_ts) AS delay_interval
FROM calc_eta;

This entire chain is maintained as a single materialized view. New GPS messages, traffic changes, or order updates cause RisingWave to incrementally update only the affected orders, keeping the view always ready for the dashboard to query.

A Look Inside the Dashboard: Real-Time Shipment Monitoring

The dashboard is powered by RisingWave’s continuously updated results rather than raw event streams. A lightweight backend exposes REST endpoints that query the live view and return two types of data: snapshot metrics for the KPI header and order rows for the table and map.

The KPI section reflects operational health in real time. It shows how many orders are actively in transit, how many are already late relative to their promised timestamp, what the average remaining ETA looks like, and when the system last refreshed. These numbers are computed in SQL inside RisingWave, meaning they update as soon as new data arrives.
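As a sketch of what those KPI queries can look like (assuming the chained view from Step 5 is named sc.order_eta and that 'in_transit' is one of the status values; both names are illustrative and may differ in the demo):

```sql
SELECT
    count(*) FILTER (WHERE status = 'in_transit')         AS in_transit_orders,
    count(*) FILTER (WHERE eta_ts > promised_ts)          AS late_orders,
    avg(eta_minutes) FILTER (WHERE status = 'in_transit') AS avg_eta_minutes,
    max(gps_ts)                                           AS last_refresh_ts
FROM sc.order_eta;
```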

The map visualization focuses on immediate situational awareness. Vehicle positions and destinations are rendered live, and routes are visually annotated to indicate risk or delay. Hovering over a vehicle provides order-level context including ETA minutes, ETA timestamp, promised timestamp, delay duration, vehicle speed, and traffic speed. This allows an operator to quickly understand what is happening without switching views.

The order list complements the map by providing sortable, filterable rows that represent the current live truth for each active delivery.

Clicking any order opens a detail panel that places predicted arrival and promised delivery side by side, making delays explicit and explainable.

Build It Yourself in Minutes

Ready to see it in action? This entire demo is open-source, with step-by-step instructions to get you up and running in under an hour.

  • RisingWave SQL scripts (source tables and materialized views)

  • A JavaScript script for generating mock data

  • Frontend and backend code for the visual dashboard

  • A README document

Get Started with RisingWave

  • Try RisingWave Today:

  • Talk to Our Experts: Have a complex use case or want to see a personalized demo? Contact us to discuss how RisingWave can address your specific challenges.

  • Join Our Community: Connect with fellow developers, ask questions, and share your experiences in our vibrant Slack community.

