Prediction of delays
A rail transport company needed more accurate delay predictions to better inform travelers and optimize traffic regulation. We took over their existing deep learning model, improved it to reach the target of a 30% improvement over the reference system, and deployed it to production for real-time use.

Problem
Train delays are frequent and unavoidable. What can be avoided is predicting them poorly. The company already had a prediction algorithm, but its accuracy was insufficient: only a 10% improvement over the reference system. Travelers received unreliable estimates, and traffic control teams could not rely on the predictions to make operational decisions in real time. The objective was clear: triple the accuracy gain to reach a 30% improvement, and expose the predictions through a real-time API.

Solution
What we built
We assigned two data scientists to take over the existing model, significantly improve it, and put it into production.
Step 1 — Cleaning of observational data. Development of an algorithm for cleaning the input data — train observations — to eliminate noise and anomalies that degraded the performance of the model.
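To give a sense of what a cleaning pass of this kind might look like, here is a minimal sketch that drops duplicate observations and filters implausible delay values with a median absolute deviation rule. The record layout, the `clean_observations` name, and the threshold are assumptions made for this example; the algorithm actually used in the project is not described here.

```python
from statistics import median


def clean_observations(observations, max_deviations=5.0):
    """Remove duplicates and outlying delay values from raw train observations.

    Each observation is a hypothetical (train_id, station, delay_minutes) tuple.
    """
    # Drop exact duplicates while preserving arrival order.
    unique = list(dict.fromkeys(observations))
    if not unique:
        return []

    delays = [obs[2] for obs in unique]
    center = median(delays)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median(abs(d - center) for d in delays)
    if mad == 0:
        # At least half of the delays equal the median; nothing to scale by.
        return unique

    return [obs for obs in unique if abs(obs[2] - center) / mad <= max_deviations]


raw = [
    ("T1", "A", 2.0),
    ("T1", "A", 2.0),      # duplicate message
    ("T2", "A", 3.0),
    ("T3", "B", 4.0),
    ("T4", "B", 5.0),
    ("T5", "C", 6000.0),   # illustrative sensor glitch
]
cleaned = clean_observations(raw)
```

A median-based rule is used here because a single extreme value would distort a mean and standard deviation; other filters could serve equally well depending on the data.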
Step 2 — Improvement of network embeddings. Redesign of vector representations of the rail network to better capture spatial and geographic information: topology of lines, interconnections, distances and physical constraints of the network.
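As a rough illustration of how network topology can be turned into vectors, the sketch below builds a small hand-crafted feature vector per station: its number of connections and its hop distance to a few anchor stations. This is not the representation used in the project, which is not detailed here; the toy network, the function names, and the choice of features are all assumptions. Learned embeddings would typically replace or extend features like these.

```python
from collections import deque


def hop_distances(adjacency, source):
    """Breadth-first search: number of track segments from source to each station."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def station_features(edges, anchors):
    """Build one vector per station: [degree, hops to each anchor station]."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    per_anchor = [hop_distances(adjacency, anchor) for anchor in anchors]
    return {
        # -1 marks a station that cannot be reached from the anchor.
        station: [len(neighbors)] + [d.get(station, -1) for d in per_anchor]
        for station, neighbors in adjacency.items()
    }


# Toy network: a line A-B-C-D with a branch B-E.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "E")]
features = station_features(edges, anchors=["A", "D"])
```

In this toy network the junction station B ends up with a higher degree than its neighbors, which is the kind of structural signal a delay model can exploit.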
Step 3 — Optimization of the Deep Learning model. Improvement of the existing Transformer model with attention mechanisms: optimization of hyperparameters, enrichment of input data, adjustment of the loss function. Increase from 10% to 30% improvement compared to the reference system.
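The "improvement compared to the reference system" figure can be read as a relative error reduction. The sketch below shows one way to compute it, assuming mean absolute error as the underlying metric; the metric actually used in the project is not stated, and the function names and sample values are illustrative only.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute gap between predicted and observed delays (minutes)."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def improvement_over_reference(model_pred, reference_pred, actual):
    """Relative error reduction of the model versus the reference system."""
    model_mae = mean_absolute_error(model_pred, actual)
    reference_mae = mean_absolute_error(reference_pred, actual)
    return 1.0 - model_mae / reference_mae


actual = [5.0, 10.0, 0.0, 20.0]
reference = [10.0, 15.0, 5.0, 25.0]   # reference MAE = 5.0 minutes
model = [8.5, 13.5, 3.5, 23.5]        # model MAE = 3.5 minutes
gain = improvement_over_reference(model, reference, actual)  # about 0.30
```

Under this reading, a 30% improvement means the model's average error is 70% of the reference system's error on the same observations.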
Step 4 — Real-time production. Deployment of the model in production via an API exposing delay predictions in real time, feeding both passenger information and traffic control tools.
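For readers curious about the shape such an API could take, here is a minimal sketch using only the Python standard library. The `/delay` endpoint, the `train_id` parameter, the response fields, and the `predict_delay` placeholder are all hypothetical; the production service and its framework are not described here.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse


def predict_delay(train_id):
    """Placeholder for the trained model; returns a fixed delay in minutes."""
    return 4.5


def build_response(path):
    """Map a request path to (HTTP status, JSON-serializable payload)."""
    url = urlparse(path)
    if url.path != "/delay":
        return 404, {"error": "unknown endpoint"}
    train_ids = parse_qs(url.query).get("train_id")
    if not train_ids:
        return 400, {"error": "missing train_id parameter"}
    train_id = train_ids[0]
    return 200, {
        "train_id": train_id,
        "predicted_delay_minutes": predict_delay(train_id),
    }


class DelayHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, payload = build_response(self.path)
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(port=8000):
    """Create the server; call .serve_forever() on the result to start it."""
    return ThreadingHTTPServer(("127.0.0.1", port), DelayHandler)
```

Calling `make_server().serve_forever()` would start the service locally, after which a request to `/delay?train_id=T1` returns a JSON body with the predicted delay. Keeping the routing logic in `build_response` separates it from the HTTP plumbing, so the same function could feed both a passenger information client and a traffic control tool.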