Published April 2026 · Sim-validated · No live customer fleet · No paying customer yet
WRIE.
A sim-validated warehouse-robotics digital twin. The same 15,000-line C++ binary is designed to bind to either a simulator or a real fleet at the device-driver layer — no sim-to-real translation, because there is no second program. Validated against the Gazebo backend and an internal 50-robot load run; awaiting first live deployment.
Walkthrough on request · screen capture of the binary booting in Gazebo + the same binary on hardware · ~60 sec · NDA on request for buyers
One contract surface
213+
Endpoints
[ A ] · The problem
Warehouse robotics is fragmented. Every fleet runs its own language.
A modern warehouse runs a heterogeneous fleet - AGVs from one vendor, lifters from another, ASRS shelving from a third, scout robots stitched in for cycle counts. Each speaks its own protocol, exposes its own dashboard, and ships its own ‘simulator’ that drifts from the real robot inside a year. The operator ends up running four parallel stacks and a fifth stack to glue them together.
The break shows up the first time a sim-trained behaviour hits the real floor and surprises the fleet. A pose normalisation differs by 0.3°. A handover timeout differs by 200 ms. A charge-station scheduler in the sim assumes a battery curve the real cell never had. Each gap is small. Together they kill trust in the simulator, and the operator stops using it.
WRIE was built to remove the gap by removing the second program. There is one binary. The simulator and the real fleet are two backends that bind to the same control loop at the device-driver layer.
[ B ] · The argument
The sim and the real are one program with two backends. Identical at the bit level.
Fifteen thousand lines of C++17, running the same control loop, the same MAPF planner, the same VDA5050 message bus - switching only at the device-driver layer between a Gazebo backend and the real fleet on the floor. Bug fixes apply to both because there is no second place for them to apply. Sim-trained behaviours transfer to the real fleet because the code paths are the same code.
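The "one program, two backends" shape can be sketched in a few lines. This is an illustration in Python, not the C++17 source: the names `DeviceDriver`, `SimDriver`, `RecordingDriver` and `control_step` are hypothetical, and the kinematics are a toy. The point it shows is that the control step is written once, against the driver contract, and never learns which backend it is talking to.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Pose:
    x: float
    y: float
    theta: float


class DeviceDriver(ABC):
    """Hypothetical driver contract; the control loop sees only this."""

    @abstractmethod
    def read_pose(self) -> Pose: ...

    @abstractmethod
    def send_velocity(self, linear: float, angular: float) -> None: ...


class SimDriver(DeviceDriver):
    """Stand-in for a simulator backend: integrates commands itself."""

    def __init__(self) -> None:
        self._pose = Pose(0.0, 0.0, 0.0)

    def read_pose(self) -> Pose:
        return self._pose

    def send_velocity(self, linear: float, angular: float) -> None:
        # Toy kinematics: move along x only, one unit time step.
        self._pose = Pose(self._pose.x + linear, self._pose.y, self._pose.theta)


class RecordingDriver(DeviceDriver):
    """Stand-in for a hardware backend: records what would go on the wire."""

    def __init__(self, pose: Pose) -> None:
        self._pose = pose
        self.sent: List[Tuple[float, float]] = []

    def read_pose(self) -> Pose:
        return self._pose

    def send_velocity(self, linear: float, angular: float) -> None:
        self.sent.append((linear, angular))


def control_step(driver: DeviceDriver, target_x: float, gain: float = 0.5) -> float:
    """One proportional-control step; the same code for every backend."""
    error = target_x - driver.read_pose().x
    command = gain * error
    driver.send_velocity(command, 0.0)
    return command
```

Calling `control_step` with either driver produces the same command for the same pose; only the side effect of `send_velocity` differs.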
[ C ] · Architecture
One binary. Two backends. Eight layers.
Each layer is a contract - defined, versioned, tested. The implementation serves the contract; the contract never serves the implementation. Replace any layer in isolation and the rest of the stack does not notice.
Autonomy core
C++17 control loop · MAPF planner · trajectory optimisation · pose normalisation
Hardware abstraction
Pluggable device-driver layer - Gazebo backend (sim) · real-fleet backend (production)
Message bus
VDA5050-aligned MQTT · canonical pose envelope · audit log on every publish
Fleet management
Multi-Agent Path Finding · zone arbitration · charge-station scheduling · failure handover
API surface
213+ FastAPI endpoints · contract-bound · OpenAPI-generated SDKs
Control surface
React + WebSocket dashboard · live fleet state · order intake · teleoperation
Replay engine
Fleet trace recorder + deterministic re-driver - re-run any incident through the binary for forensic analysis
Test harness
Contract · property · soak suites · 1.7K+ tests · independent peer-model audit gate
[ D ] · The numbers
Endpoints (built)
213+
FastAPI · contract-bound
Robot models (sim)
18
Across 5 real-world vendors
Shared binary
15K
Lines of C++17 · sim-ready
Standard
VDA5050
Aligned messaging layer
Stage
Sim-validated
Awaiting first live deployment
Tests
1.7K+
Contract · property · soak
[ E ] · The fleet
Eighteen robots. One contract.
Across five real-world vendors. Every robot - wheel-based, lift-class, scout, or shelving system - speaks a single VDA5050-shaped surface to the binary. Vendor-specific quirks live in their drivers and stay there.
R/01
AGV - Automated Guided Vehicles
Wheel-based transport robots. Handle pallet moves, point-to-point delivery, and zone-to-zone shuttling. WRIE drives 8 AGV models from 3 vendors over a single VDA5050 surface.
R/02
LIFTER - Forklift-class robots
Vertical-lift robots for high-bay storage, pallet stacking, and inbound dock work. Tightly coupled with the warehouse graph so the planner reasons about lift height, fork engagement, and drop tolerance.
R/03
SCOUT - Inventory & inspection bots
Lightweight perimeter robots that patrol aisles, validate barcode/QR labels on shelves, and feed cycle-count data back to the WMS. Run on a faster control loop than transport robots.
R/04
ASRS - Lift integrations
Automated storage and retrieval systems - vertical and horizontal carousels, mini-loads, shuttle racks. WRIE speaks the vendor protocols (Daifuku, SSI Schäfer, Murata) and exposes them through the same job interface as a free-roaming robot.
[ F ] · Operations
What the operator actually touches.
Autonomy handles the easy 99%. The hard 1% - the stuck robots, the picks that need a human, the post-incident forensics - is where WRIE earns its keep.
O/01
Teleoperation
When a robot reports a state outside the planner's recovery envelope (stuck, perception fault, environmental anomaly), an operator is paged with full context: the live camera feed, the planner's intended path, the last successful pose, and a one-click hand-back when the robot is clear.
O/02
Pick stations
Robots arriving at a pick station present their tote and load to a human picker through a dedicated UI: target SKU, quantity, target tote, and a confirmation barcode. WRIE locks the robot in place until the pick is confirmed and the receipt is signed.
O/03
Fleet replay
Every motion message, every endpoint call, every state transition is recorded to a fleet trace. Any historical trace can be re-driven through the binary deterministically - same inputs, same outputs - to reproduce incidents and validate fixes before they ship to the floor.
O/04
Audit surface
Every endpoint emits a structured audit log: who, what, when, why (the originating order or operator), and the resulting state delta. The audit trail is queryable, retention-managed, and exportable for safety-case reviews.
[ G ] · Under the hood
What’s actually compiled into the binary.
Every layer below traces back to a real source path, a real test count, or a real architectural decision record. No aspirational names. What isn’t built yet sits in the Roadmap section.
S/01
Sim Backend
Gazebo Fortress · simulated multi-vendor LiDAR
Digital-twin runtime
Gazebo Fortress hosts the warehouse, the physics, and the robots. Ten production-vendor SDF models ship today - Clearpath (Jackal, Ridgeback, Dingo), Robotis (TurtleBot 3, TurtleBot 4), Husarion (ROSbot 2 PRO, ROSbot XL), Neobotix (MP-400, MPO-500), and Robotnik (Summit XL).
Engineering twist
Each robot SDF carries the vendor's actual LiDAR geometry - SICK 270°/20m on the Jackal, Hokuyo 360°/12m on the Dingo, dual SICK 270°/30m on the MPO-500 - so the sim's perception matches the real chassis it shadows.
S/02
Fleet Core
C++17 fleet_core · ~12,287 LOC · 15 Hz
FleetManager orchestration loop
A single compiled C++17 binary runs the FleetManager, A* pathfinder + Dijkstra, NodeReservation (ILP-coupled), QuadTree spatial index, MotionController (P-controller), BatteryModel, ObstacleHandler, BTCPP v4 behaviour engine, TCP server, REST server and a native mongocxx driver. The orchestration loop runs at 15 Hz with a 67 ms budget per cycle.
Engineering twist
Per-cycle critical path is 15–38 ms across nine phases (telemetry receive → state update → behaviour tick → task allocation → A* path generation → ILP reservation → command send → async Mongo flush → event dispatch). Hardened with UAF prevention via shared_ptr capture, per-robot async-write mutexes, NaN guards on motion targets, and bounded reads against malicious framing.
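The budget arithmetic is simple: at 15 Hz each cycle has 1/15 s, about 67 ms, so a 15-38 ms critical path leaves roughly half the cycle or more as headroom. A minimal sketch of timing phases against that budget follows. It is written in Python for illustration only; `run_cycle`, the phase list and the injectable `clock` are hypothetical names, not the fleet_core API.

```python
import time
from typing import Callable, Dict, List, Tuple

CYCLE_HZ = 15
CYCLE_BUDGET_S = 1.0 / CYCLE_HZ  # ~0.0667 s, i.e. the 67 ms budget

Phase = Tuple[str, Callable[[], None]]


def run_cycle(
    phases: List[Phase],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Dict[str, float], float, bool]:
    """Run each phase once, in order.

    Returns per-phase durations, the total elapsed time, and whether
    the whole cycle stayed within the budget.
    """
    timings: Dict[str, float] = {}
    start = clock()
    for name, phase in phases:
        phase_start = clock()
        phase()
        timings[name] = clock() - phase_start
    elapsed = clock() - start
    return timings, elapsed, elapsed <= CYCLE_BUDGET_S
```

Passing the clock in as a parameter is a design choice for testability: a test can feed synthetic timestamps and check the overrun flag without sleeping.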
S/03
Planner
MAPF · CBS + PIBT dual solver
Multi-Agent Path Finding
Two solvers coexist behind one API. CBS (Conflict-Based Search) returns the optimal joint plan but goes exponential beyond 200 agents. PIBT (Priority Inheritance with Backtracking) runs in linear time and handles 500+ agents at the cost of optimality.
Engineering twist
Operators pick speed or optimality per use case via `/api/mapf/step` - the same endpoint feeds the 15 Hz fleet cycle, the congestion heatmap on the dashboard, and the bottleneck-node KPI. ADR-008 records the decision permanently.
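For readers new to CBS: the search branches each time it finds a conflict between two agents' individual paths. The sketch below shows that detection step only, under textbook assumptions (paths are non-empty lists of node ids indexed by timestep, and an agent waits at its goal once its path ends). The function names are hypothetical and this is not the WRIE planner code.

```python
from typing import Hashable, List, Optional, Tuple

Path = List[Hashable]
Conflict = Tuple[str, int, object]


def position_at(path: Path, t: int) -> Hashable:
    """Node occupied at timestep t; the agent waits at its last node."""
    return path[min(t, len(path) - 1)]


def first_conflict(a: Path, b: Path) -> Optional[Conflict]:
    """Return the earliest vertex or edge (swap) conflict, or None."""
    horizon = max(len(a), len(b))
    for t in range(horizon):
        # Vertex conflict: both agents occupy the same node at time t.
        if position_at(a, t) == position_at(b, t):
            return ("vertex", t, position_at(a, t))
        if t + 1 < horizon:
            a0, a1 = position_at(a, t), position_at(a, t + 1)
            b0, b1 = position_at(b, t), position_at(b, t + 1)
            # Edge conflict: the agents swap nodes between t and t + 1.
            if a0 == b1 and a1 == b0 and a0 != a1:
                return ("edge", t, (a0, a1))
    return None
```

In full CBS, each conflict found this way spawns two child nodes, one constraining each agent, which is where the exponential worst case comes from.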
S/04
TCP Protocol
ProtocolV1 · 33-field frames · CRC32
Robot ↔ Fleet wire format
Newline-delimited frames on port 65123. Each frame carries 33 fields plus a CRC32 checksum. Bidirectional 15 Hz: telemetry up, commands down. The same wire protocol is exercised today against the Gazebo backend and an internal multi-robot load harness; no live warehouse fleet on the wire yet.
Engineering twist
Asio coroutines on the TCP server handle 10,000+ concurrent connections on 4 threads. The 64 KB max-frame guard came from a real DoS scenario in the audit; bounded read is now load-bearing.
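A sketch of how a checksummed, length-guarded frame of this kind can be encoded and validated. The field count, the newline delimiter and the 64 KB guard come from the description above; everything else is an assumption made for illustration, including the comma separator, the ASCII encoding, and the CRC32 being carried as eight hex digits in a trailing token. The real ProtocolV1 layout may differ.

```python
import zlib
from typing import List

FIELD_COUNT = 33
MAX_FRAME_BYTES = 64 * 1024  # the 64 KB max-frame guard


def encode_frame(fields: List[str]) -> bytes:
    """Join fields, append a CRC32 of the body, terminate with newline."""
    if len(fields) != FIELD_COUNT:
        raise ValueError(f"expected {FIELD_COUNT} fields, got {len(fields)}")
    if any("," in f or "\n" in f for f in fields):
        raise ValueError("fields must not contain ',' or newline")
    body = ",".join(fields).encode("ascii")
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return body + b"," + f"{crc:08x}".encode("ascii") + b"\n"


def decode_frame(frame: bytes) -> List[str]:
    """Validate size, delimiter, checksum and field count, in that order."""
    if len(frame) > MAX_FRAME_BYTES:
        raise ValueError("frame exceeds the 64 KB guard")
    if not frame.endswith(b"\n"):
        raise ValueError("frame is not newline-terminated")
    body, sep, crc_token = frame[:-1].rpartition(b",")
    if not sep:
        raise ValueError("frame has no checksum token")
    try:
        expected = int(crc_token.decode("ascii"), 16)
    except ValueError as exc:
        raise ValueError("checksum token is not hexadecimal") from exc
    if zlib.crc32(body) & 0xFFFFFFFF != expected:
        raise ValueError("CRC32 mismatch")
    fields = body.decode("ascii").split(",")
    if len(fields) != FIELD_COUNT:
        raise ValueError(f"expected {FIELD_COUNT} fields, got {len(fields)}")
    return fields
```

Checking the size before anything else is the bounded-read idea in miniature: an oversized frame is rejected before any parsing work is spent on it.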
S/05
Message Bus
VDA5050 · Eclipse Mosquitto 2 · SCRAM-SHA-256
One shape of message across vendors
VDA5050-aligned MQTT 5.0 over Mosquitto 2. All six VDA5050 message types implemented (order, instantActions, factsheet, state, visualization, connection) with a 4-state lifecycle machine (INIT → IDLE → EXECUTING → ERROR).
Engineering twist
98 conformance tests cover the 6-message surface and the lifecycle. Three independent audit passes verified the mapping. MQTT auth uses SCRAM-SHA-256, and the broker rejects unauthenticated publishers - the wire is locked down before the protocol gets a chance to negotiate.
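The four-state lifecycle can be pictured as a small transition table. Only the state names and the forward chain INIT → IDLE → EXECUTING → ERROR come from the text above. The other edges in this sketch (returning to IDLE after a completed order, entering ERROR from any state, recovering from ERROR to IDLE) are assumptions added to make the example complete, and `transition` is a hypothetical helper.

```python
from enum import Enum
from typing import Dict, Set


class Lifecycle(Enum):
    INIT = "INIT"
    IDLE = "IDLE"
    EXECUTING = "EXECUTING"
    ERROR = "ERROR"


# Assumed transition table; only the forward chain is stated in the text.
ALLOWED: Dict[Lifecycle, Set[Lifecycle]] = {
    Lifecycle.INIT: {Lifecycle.IDLE, Lifecycle.ERROR},
    Lifecycle.IDLE: {Lifecycle.EXECUTING, Lifecycle.ERROR},
    Lifecycle.EXECUTING: {Lifecycle.IDLE, Lifecycle.ERROR},
    Lifecycle.ERROR: {Lifecycle.IDLE},
}


def transition(current: Lifecycle, target: Lifecycle) -> Lifecycle:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

A table like this is also what makes conformance testing tractable: every allowed and every forbidden pair can be enumerated and asserted.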
S/06
Behaviour Engine
BTCPP v4 · 700 LOC C++ · per-robot trees
Robot decision trees, ticked every cycle
BehaviorTree.CPP v4 powers the per-robot decision logic. Each robot owns a tree that ticks once per cycle at 15 Hz, alongside the planner. Trees compose recursively from leaf actions (move, dock, charge, pick, hand-off) and control nodes (sequence, fallback, parallel).
Engineering twist
Recursion guard catches runaway trees. Behaviour tick time is part of the cycle budget - if a tree blows the deadline it gets pre-empted, not allowed to starve the rest of the fleet.
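The semantics of the sequence and fallback control nodes fit in a few lines. This sketch is plain Python and does not use the BehaviorTree.CPP API. The node names, the 0.2 battery threshold and `make_robot_tree` are hypothetical, and both control nodes here are the memoryless kind that restart from their first child on every tick.

```python
from enum import Enum
from typing import Callable, List


class Status(Enum):
    SUCCESS = 1
    FAILURE = 2
    RUNNING = 3


Node = Callable[[], Status]


def sequence(children: List[Node]) -> Node:
    """Succeeds only if every child succeeds; stops at the first that does not."""
    def tick() -> Status:
        for child in children:
            status = child()
            if status is not Status.SUCCESS:
                return status
        return Status.SUCCESS
    return tick


def fallback(children: List[Node]) -> Node:
    """Tries children in order; stops at the first that does not fail."""
    def tick() -> Status:
        for child in children:
            status = child()
            if status is not Status.FAILURE:
                return status
        return Status.FAILURE
    return tick


def make_robot_tree(state: dict, log: List[str]) -> Node:
    """Move if the battery allows it, otherwise fall back to charging."""
    def battery_ok() -> Status:
        return Status.SUCCESS if state["battery"] > 0.2 else Status.FAILURE

    def move() -> Status:
        log.append("move")
        return Status.SUCCESS

    def charge() -> Status:
        log.append("charge")
        return Status.RUNNING

    return fallback([sequence([battery_ok, move]), charge])
```

Ticking the tree with a low battery runs `charge` and reports RUNNING; once the battery value rises above the threshold, the next tick runs `move` instead.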
S/07
ASRS + Pick Station
4 ASRS types + goods-to-person picking
Storage + retrieval + dispatch
Full ASRS subsystem with crane and shuttle controllers covering unit-load, mini-load, AutoStore, and carousel rack systems. Pick Station ships 10 endpoints, 138 tests, three concurrent stations, and three optimiser algorithms (nearest_sku, FIFO, priority).
Engineering twist
ASRS↔AMR handoff uses an explicit formula - `AMR_dispatch = ASRS_ETA − AMR_travel − buffer` - so the AMR arrives just-in-time at the dock and the buffer doesn't overflow. In load runs, throughput reached 450+ picks/hour.
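A worked instance of the handoff formula, as a sketch. All three inputs are taken to be seconds measured from now, and `buffer` is read as a safety margin, so the AMR arrives that many seconds before the ASRS delivers. Clamping a negative result to zero ("dispatch immediately") is an assumption added here and is not stated in the formula.

```python
def amr_dispatch_delay(asrs_eta_s: float, amr_travel_s: float, buffer_s: float) -> float:
    """Seconds to wait before dispatching the AMR.

    Implements AMR_dispatch = ASRS_ETA - AMR_travel - buffer.
    A negative result means the AMR is already late, so it is
    clamped to 0.0 (assumption: dispatch right away).
    """
    delay = asrs_eta_s - amr_travel_s - buffer_s
    return max(delay, 0.0)


# ASRS delivers in 120 s, the AMR needs 45 s to reach the dock,
# and a 10 s margin is kept: dispatch after 120 - 45 - 10 = 65 s.
example_delay = amr_dispatch_delay(120.0, 45.0, 10.0)
```

With these numbers the AMR leaves 65 s from now and reaches the dock at 110 s, 10 s ahead of the tote.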
S/08
Backend
FastAPI 0.100+ · Pydantic V2 · 213 endpoints
Schema-validated control plane
FastAPI on Python 3.11 with Pydantic V2 contracts everywhere. 213 endpoints across fleet, tasks, WES, WCS, WMS, inventory, MAPF, VDA5050, ASRS, pick stations, teleop, replay, and the saas_platform (auth, billing, reports). Client libraries: motor (async MongoDB), redis, influxdb-client, aiomqtt, and pika (RabbitMQ).
Engineering twist
Rate limiting is Redis-backed. Audit logging carries correlation IDs through every layer. Stripe webhooks use timestamp tolerance + event-ID dedup against replay attacks. Token rotation revokes refresh tokens on use.
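The two webhook defences named above, timestamp tolerance and event-ID dedup, can be sketched with the standard library alone. The signing scheme follows Stripe's documented format (a `t=...,v1=...` header, HMAC-SHA256 over `"{t}.{payload}"`). The function names, the in-memory `seen_ids` set (a production system would more likely use a shared store such as Redis) and the 300-second tolerance are assumptions for illustration, not the WRIE code.

```python
import hashlib
import hmac
import json
import time
from typing import List, Optional, Set

TOLERANCE_S = 300  # assumed tolerance window


def sign_header(payload: bytes, secret: str, timestamp: int) -> str:
    """Test helper: build a signature header the way the sender would."""
    signed = f"{timestamp}.".encode("ascii") + payload
    digest = hmac.new(secret.encode("utf-8"), signed, hashlib.sha256).hexdigest()
    return f"t={timestamp},v1={digest}"


def verify_webhook(payload: bytes, header: str, secret: str,
                   seen_ids: Set[str], now: Optional[float] = None) -> dict:
    """Check freshness, signature and uniqueness; return the parsed event."""
    timestamp: Optional[int] = None
    signatures: List[str] = []
    for item in header.split(","):
        key, _, value = item.strip().partition("=")
        if key == "t":
            timestamp = int(value)
        elif key == "v1":
            signatures.append(value)
    if timestamp is None or not signatures:
        raise ValueError("malformed signature header")
    current = time.time() if now is None else now
    if abs(current - timestamp) > TOLERANCE_S:
        raise ValueError("timestamp outside tolerance")
    signed = f"{timestamp}.".encode("ascii") + payload
    expected = hmac.new(secret.encode("utf-8"), signed, hashlib.sha256).hexdigest()
    # Constant-time comparison against every v1 signature in the header.
    if not any(hmac.compare_digest(expected.encode(), s.encode()) for s in signatures):
        raise ValueError("signature mismatch")
    event = json.loads(payload)
    if event["id"] in seen_ids:
        raise ValueError("duplicate event")
    seen_ids.add(event["id"])
    return event
```

The timestamp check bounds how long a captured request stays usable; the event-ID set closes the remaining window by rejecting a replay that arrives inside the tolerance.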
S/09
Dashboard
React 19 · R3F · Three.js · MessagePack WS
Live fleet in 3D
React 19.1 + Vite + Tailwind CSS + Zustand 5 for state. Three.js (0.183) via @react-three/fiber and @react-three/drei renders the live fleet, ASRS racks, pick stations, and MAPF congestion heatmap. WebSocket transport carries MessagePack-encoded payloads with delta compression for the live state stream.
Engineering twist
JWT auth lives in the first WebSocket message instead of the URL - no token leakage to logs. CSP headers locked down. Frontend serves from Docker after `npm ci` against a strict lockfile.
S/10
Verification
pytest + httpx + GTest + Playwright + Vitest
199 test files · 2,400+ tests
152 Python test files, 24 C++ GTest files, 9 Playwright specs, 14 Vitest suites. At the time of writing, 17/17 E2E API tests were passing and 20/20 cross-vector audit tests were passing. Three-auditor LLM verification gate before release; external audit issues that get closed are turned into regression tests.
Engineering twist
Six new multi-user load tests cover port isolation, concurrent auth, and path traversal. Internal 50-robot load run completed against the simulator backend. No paying customer fleet yet — under concept review with one warehousing partner. Every shipped fix names its commit hash so the bug ledger and the test suite are the same artefact.
[ H ] · Roadmap
What ships next.
The architecture leaves room for the next milestones on purpose. Each item below is named, scoped, and traceable to a decision in the strategic product roadmap - not aspiration, planned work.
R/01 · Sensor calibration
Real-fleet noise harvesting back into Gazebo
Today the simulated LiDAR uses Gazebo's default noise model on top of vendor-accurate geometry. The next step is to harvest live noise distributions from the running physical fleet and fold them into the SDF sensor models so sim-trained behaviours degrade gracefully under real-world perception artefacts.
R/02 · Warehouse import
BIM / IFC pipeline
Current warehouse imports come through ezdxf (DXF/CAD). The roadmap is a Revit IFC pipeline so the sim floor matches the real floor down to rack-frame tolerances without a manual conversion step.
R/03 · Localisation
EKF primary fusion + ENU canonicalisation
AMCL is wired in as a confidence fallback today and graph matching is primary. Promoting an EKF that fuses wheel odometry, IMU, and LiDAR-AMCL corrections into a single ENU-normalised pose with quaternion-only rotation is the next localisation milestone - one canonical frame across every vendor.
R/04 · Replay
Hash-chained tamper-evident traces
The current ReplayEngine queries InfluxDB time-series for any historical session via `/api/replay/sessions`, `/snapshot`, `/timeline`. The next iteration adds SHA-256 hash-chained binary traces so a recorded session can be replayed deterministically through the binary and proven untampered for safety-case reviews.
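The hash-chaining idea on the roadmap can be sketched as follows. Each entry's SHA-256 covers the previous entry's hash plus the current record, so altering any record invalidates every hash after it. This is an illustration of the technique only: the JSON record encoding, the all-zero genesis value and the function names are assumptions, and the planned traces are described as binary rather than JSON.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # assumed starting value for the chain


def _digest(prev_hash: str, record: Dict[str, Any]) -> str:
    """SHA-256 over the previous hash plus a canonical JSON encoding."""
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def chain_records(records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Wrap each record with the previous hash and its own hash."""
    chained = []
    prev = GENESIS
    for record in records:
        current = _digest(prev, record)
        chained.append({"record": record, "prev": prev, "hash": current})
        prev = current
    return chained


def verify_chain(chained: List[Dict[str, Any]]) -> bool:
    """Recompute every hash; False if any link or record was altered."""
    prev = GENESIS
    for entry in chained:
        if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True
```

Sorting keys and fixing separators gives a canonical encoding, so the same record always hashes to the same value regardless of dictionary ordering.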
R/05 · API contracts
Idempotency keys on every state-mutating endpoint
Idempotent allocation, release, and reuse already exist on the resource layer. Extending idempotency keys to every state-mutating endpoint (and every webhook callback) is the planned hardening pass.
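What an idempotency key buys can be shown with a minimal in-memory sketch: a repeated request with the same key returns the stored result instead of mutating state a second time. The class and method names are hypothetical, the store is a plain dictionary, and concurrency and expiry are deliberately left out. A real deployment would need both.

```python
from typing import Any, Callable, Dict


class IdempotencyStore:
    """Remembers the result of the first call made under each key."""

    def __init__(self) -> None:
        self._results: Dict[str, Any] = {}

    def execute(self, key: str, operation: Callable[[], Any]) -> Any:
        """Run the operation once per key; replay the stored result after."""
        if key in self._results:
            return self._results[key]
        result = operation()
        self._results[key] = result
        return result


# Usage: a retried allocation request does not allocate twice.
allocated = []


def allocate_dock() -> str:
    allocated.append("dock-1")
    return "dock-1"


store = IdempotencyStore()
first = store.execute("req-42", allocate_dock)
retry = store.execute("req-42", allocate_dock)
```

After both calls `allocated` holds a single entry, and `first` and `retry` carry the same result.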
R/06 · Industrial bridges
OPC UA · ISO 3691-4 · MPC trajectories
OPC UA bridge for industrial control integration, ISO 3691-4 conformance documentation, and replacing the P-controller MotionController with an MPC-based trajectory optimiser are tracked in the strategic product roadmap.
R/07 · Edge resilience
Full offline mode on the C++ FMS
When the cloud link drops, the C++ Fleet Core continues to orchestrate locally. The roadmap closes the gaps - local persistence of WES order state, deferred replication, and reconciliation on reconnect - to minimise throughput loss during a long outage.
R/08 · LLM integration
Natural-language warehouse design + ops queries
A planned LLM layer that turns plain-language briefs ('add 12 mecanum AGVs, 4 pick stations, 80 racks') into a fully simulated warehouse, and lets operators ask 'why is bay 7 congested?' against the live MAPF + InfluxDB telemetry.
[ I ] · Design scenarios
Three workloads the binary is designed to handle. Simulated, not yet customer-deployed.
U/01
E-commerce fulfilment centre — order-to-dispatch in under 6 minutes.
Design scenario, exercised in the simulator. Inbound order → routed to the nearest AGV pool → planner pulls from the warehouse graph → robot retrieves tote from ASRS → delivers to pick station → human picks → confirmation triggers dispatch label. The simulator coordinates 40+ robots concurrently in a 9,500-square-metre virtual site. No live customer site yet.
U/02
Cold storage — coordinated lift stacking under temperature constraints.
Design scenario. Lifter robots stack pallets up to 11 metres in −22 °C operating windows. The control core adapts trajectory profiles for thermal contraction tolerances and schedules charge cycles around defrost windows to reduce the chance of stalls during a thaw. Simulated only; no cold-storage customer yet.
U/03
Mixed-fleet retrofit — AGV vendor migration without downtime.
Design scenario. A 3PL operator migrating from a legacy AGV vendor would run both fleets in parallel; the same binary speaks VDA5050 to both, so the floor team would not need to track which vendor is moving which pallet. Modelled in the simulator; not yet executed against a customer site.
Argument · WRIE
The sim and the real are not two systems.
They are one binary with two backends.