Data Science in Transport: SNCF Transilien Internship

Optimizing Passenger Flow: Data Science Tackles Congestion on Paris’s Transilien


The Challenge of Mass Transit: Synchronizing Millions

Transilien SNCF, responsible for mass transit rail in the Île-de-France region, faces a monumental task: seamlessly coordinating the movements of millions of passengers with thousands of trains daily. To address this challenge, the Datalab ‘Mass Transit’ was established as an innovation hub focused on using data to improve passenger flow and operational efficiency.

The goal is to synchronize millions of travelers with thousands of trains, using data to create a fluid passenger experience and an efficient transit system.

Datalab ‘Mass Transit’, Transilien SNCF

Analyzing Passenger Distribution: A Data-Driven Approach

One critical area of focus is understanding how passengers distribute themselves along train platforms. Research indicates that passengers do not spread out evenly, leading to potential bottlenecks, safety concerns, and delays. Justine Lebrun, a recent graduate of ENSAI (École Nationale de la Statistique et de l’Analyse de l’Information), dedicated her end-of-study internship to this issue and is now continuing her work as a doctoral candidate.

Identifying Factors Influencing Passenger Positioning

Lebrun’s initial research involved identifying the key factors that influence where passengers choose to stand on the platform. This included a comprehensive review of existing literature and the analysis of diverse datasets, including geospatial data. By understanding these factors, researchers can develop models to predict passenger distribution and implement strategies to encourage more even spacing.

Currently, urban planning and transportation engineering rely heavily on predictive models to manage crowd flow. For example, Transport for London uses real-time data and predictive algorithms to manage passenger flow in the London Underground, with the aim of reducing congestion and improving safety. Similarly, the New York City Subway employs various strategies, including platform markings and announcements, to encourage even distribution of passengers.

Modeling and Forecasting: Tools and Techniques

Lebrun’s work relies heavily on data processing and statistical modeling. She primarily uses the R programming language, employing techniques such as linear regression, clustering, and random forests to analyze the data and build predictive models. These models aim to forecast passenger positioning based on the identified variables.

A significant portion of my work involved processing diverse data types, including geodata. I primarily used R for coding and employed methods like linear regression, clustering, and random forests.

Justine Lebrun, ENSAI Graduate and Doctoral Candidate
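
As a rough illustration of the simplest technique mentioned above, linear regression, the sketch below fits a one-variable least-squares line in Python. The article states that R was used for the actual work; Python appears here only for illustration. The feature (distance from the platform entrance to each platform section), the target (share of waiting passengers in that section), and all numbers are hypothetical and do not come from Transilien data.

```python
# Illustrative sketch only: synthetic numbers, not Transilien data.

def fit_simple_ols(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Predicted value of y for a given x under the fitted line."""
    return intercept + slope * x


# Hypothetical feature: distance (m) from the entrance to each section's centre.
distance_m = [10, 30, 50, 70, 90]
# Hypothetical target: share of waiting passengers observed in each section.
share = [0.40, 0.25, 0.18, 0.10, 0.07]

intercept, slope = fit_simple_ols(distance_m, share)
print(f"intercept={intercept:.4f}, slope={slope:.5f}")
print(f"predicted share at 60 m: {predict(intercept, slope, 60):.3f}")
```

With these made-up values the fitted slope is negative, meaning the predicted share falls as the distance from the entrance grows. A realistic model would draw on many more variables than a single distance.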

From Internship to Doctorate: Continued Research and Growth

Lebrun’s research is ongoing as she pursues her doctoral thesis, co-supervised by Paris-Saclay University and Gustave Eiffel University. Her thesis, titled “Modeling, forecasting, and managing the positioning of passengers on platforms and on board trains in dense areas,” builds directly on her internship work.

Future Applications: Improving Comfort, Safety, and Punctuality

While Lebrun’s initial work was research-focused, the ultimate goal is to translate these findings into practical applications that improve the passenger experience. By accurately predicting passenger distribution, Transilien SNCF can implement targeted interventions, such as:

  • Optimizing train car placement to match passenger demand along the platform.
  • Deploying staff to guide passengers to less crowded areas.
  • Using real-time information displays to encourage even distribution.

These measures can contribute to increased passenger comfort, enhanced safety, and improved train punctuality.
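
To make the idea of guiding passengers toward less crowded areas concrete, here is a minimal Python sketch that takes predicted passenger counts per platform section and lists the sections falling below an even split, least crowded first. The section labels, the counts, and the function name sections_to_recommend are all hypothetical; nothing here describes a system Transilien SNCF actually operates.

```python
# Illustrative sketch only: hypothetical section labels and predicted counts.

def sections_to_recommend(predicted_counts):
    """Return labels of sections predicted to hold fewer passengers than
    an even split would give, ordered from least to most crowded."""
    total = sum(predicted_counts.values())
    if total == 0:
        return []  # nobody expected on the platform, nothing to recommend
    even_share = total / len(predicted_counts)
    under = [label for label, count in predicted_counts.items()
             if count < even_share]
    return sorted(under, key=lambda label: predicted_counts[label])


# Hypothetical model output: predicted passengers waiting in sections A to E.
predicted = {"A": 120, "B": 80, "C": 45, "D": 25, "E": 30}
recommended = sections_to_recommend(predicted)
print(recommended)
```

A display or a staff member could then point passengers toward the first sections in the returned list. A real deployment would also need to account for walking distance, train length, and how quickly predictions change.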

The Path to Data Science: Lebrun’s Journey to ENSAI

Lebrun’s path to becoming a data scientist involved a strong foundation in mathematics and statistics. After completing a preparatory program and earning a degree in applied mathematics from the University of Rennes, she sought to deepen her knowledge of statistics and data science. ENSAI, with its reputation for providing solid statistical training and diverse specializations, proved to be the ideal choice.

Choosing a Generalist Approach: Statistical Engineering

During her second year at ENSAI, Lebrun opted for the statistical engineering specialization, a generalist track that allowed her to explore various career paths in both industry and research. Her interest in ecology further influenced her decision, as she considered this track the most suitable for her broader interests.

Interested in a career in data science? Explore the ENSAI engineer curriculum for more information.
