What is the best way to visualize a train timetable effectively? There are probably many ways to visualize this kind of data, and one of them is the stem-and-leaf plot (also called a stemplot). With this technique, the amount of data to display (hours and minutes) can be reduced.

Why are stem-and-leaf plots useful in this case? This kind of plot is a method for showing the frequency with which certain classes of values occur. You could achieve the same with a frequency distribution table or a histogram of the values, or you can use a stem-and-leaf plot and let the numbers themselves show pretty much the same information.

In our case, we have a train timetable that can be displayed this way:



Pretty boring, right? Can you see how frequently trains leave at a given hour? Which are the rush hours? How many trains leave at 11pm? In fact, it’s really hard to detect any pattern here.

Using a stem-and-leaf plot we can show the train schedule as below, where for each departure time the hour digits are the stem (left column) and the minute digits are the leaves (right columns). The hour digits are not repeated over and over, and the minute digits are stacked, so we obtain a kind of histogram of the departure times.
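To make the idea concrete, here is a minimal Python sketch of how such a plot could be built from a list of departure times. The function names `stem_and_leaf` and `render`, and the sample departures, are my own assumptions for illustration; they are not the timetable shown in this post.

```python
from collections import defaultdict


def stem_and_leaf(departures):
    """Group "HH:MM" strings by hour (stem) and collect the minutes (leaves)."""
    plot = defaultdict(list)
    for time in departures:
        hour, minute = time.split(":")
        plot[int(hour)].append(int(minute))
    # Sort the stems and the leaves so the plot reads top-down, left-to-right.
    return {hour: sorted(minutes) for hour, minutes in sorted(plot.items())}


def render(plot):
    """Format the plot as text: one line per hour, minutes stacked to the right."""
    return "\n".join(
        f"{hour:02d} | " + " ".join(f"{m:02d}" for m in minutes)
        for hour, minutes in plot.items()
    )


# Hypothetical sample data, invented for this example.
sample = ["06:15", "06:45", "07:05", "07:25", "07:45",
          "08:05", "08:25", "08:45", "23:30"]

print(render(stem_and_leaf(sample)))
# 06 | 15 45
# 07 | 05 25 45
# 08 | 05 25 45
# 23 | 30
```

The length of each line already tells you how busy that hour is, which is the histogram effect described above.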



We could further reduce the amount of data to display by grouping the departure hours that have the same minute-departure frequency, obtaining this plot:
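One possible reading of this grouping step, sketched in Python: hours whose lists of departure minutes are identical are merged into a single row. That interpretation, the function name `group_hours`, and the sample data are assumptions of mine, not something taken from the original plot.

```python
def group_hours(plot):
    """Merge hours that have exactly the same list of departure minutes."""
    groups = {}
    for hour, minutes in plot.items():
        # A tuple of minutes is hashable, so it can serve as the grouping key.
        groups.setdefault(tuple(minutes), []).append(hour)
    return [(hours, list(minutes)) for minutes, hours in groups.items()]


# Hypothetical stem-and-leaf data (hour -> sorted minutes), invented for this example.
plot = {6: [15, 45], 7: [5, 25, 45], 8: [5, 25, 45], 23: [30]}

for hours, minutes in group_hours(plot):
    stems = ", ".join(f"{h:02d}" for h in hours)
    leaves = " ".join(f"{m:02d}" for m in minutes)
    print(f"{stems} | {leaves}")
# 06 | 15 45
# 07, 08 | 05 25 45
# 23 | 30
```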



A great example of this technique can be found at many train stations in Japan, where this plot is widely used to display timetables. The resulting plot is useful because it gives a quick overview of the distribution of departures and also lets you find a specific departure really fast, which is the main purpose of a train timetable.

Timetable at Yokohama’s Minato Mirai train station in Japan, illustrating the widespread use of stem-and-leaf graphs in the country. Author: Eliazar

However, the data could be displayed in a different way, so I’ve spent some time playing with the plot to find another way to visualize it. Looking at the plot, a lot of departure hours share several departure minutes, so a lot of minute digits are repeated over and over… Also, with the stem-and-leaf plot it is easy to detect the density of departures by hour, but what about the density of departures by minute? Maybe that does not seem really useful, but who knows? After playing with these questions for a while, I ended up with the table below:



The graphic is nothing new, and the result is not a stem-and-leaf plot anymore, but it has its roots in it. As you can see, hour digits and minute digits are displayed only once, and the black dots are the departure times. Color is also used as a visual cue to communicate the density of departures for each hour and each minute. Although the distance between the points is not precise at all (the spatial distance between the minutes is not proportional), the distance between dots gives you an idea of the time to wait until the next departure.
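The structure behind a table like this can be sketched in a few lines of Python. This is only a rough text approximation under my own assumptions: the function name `dot_matrix` and the sample departures are hypothetical, and plain counts per hour and per minute stand in for the color scale of the graphic.

```python
from collections import Counter


def dot_matrix(departures):
    """Build an hour-by-minute grid where True marks a departure."""
    times = {(int(t[:2]), int(t[3:])) for t in departures}
    hours = sorted({h for h, _ in times})
    minutes = sorted({m for _, m in times})
    grid = [[(h, m) in times for m in minutes] for h in hours]
    # Marginal counts: how many departures fall on each hour and each minute.
    per_hour = Counter(h for h, _ in times)
    per_minute = Counter(m for _, m in times)
    return hours, minutes, grid, per_hour, per_minute


# Hypothetical sample data, invented for this example.
sample = ["06:15", "06:45", "07:05", "07:25", "07:45",
          "08:05", "08:25", "08:45", "23:30"]

hours, minutes, grid, per_hour, per_minute = dot_matrix(sample)

# Each minute appears once as a column header, each hour once as a row label.
print("     " + " ".join(f"{m:02d}" for m in minutes))
for h, row in zip(hours, grid):
    dots = "  ".join("*" if cell else "." for cell in row)
    print(f"{h:02d} | {dots}   ({per_hour[h]})")
print("     " + " ".join(f"{per_minute[m]:2d}" for m in minutes))
```

The count at the end of each row is the density by hour, and the last line is the density by minute, which is the question the stem-and-leaf plot could not answer directly.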

Some improvements can still be added to the graphic, but it illustrates how data can be displayed in a graphical format in order to better communicate the information contained in the data itself.

3 Comments

  1. by vpascual

    10th June 2013

    11:49 am

    Very nice approach! ;)
    Maybe the last graphic would be even better using the whole time in the cells instead of just points! (just guessing…). Good job anyway!!

  2. by Xavi Gimenez

    10th June 2013

    11:59 am

    Good suggestion! Certainly a weakness of this graphic is the time that you spend to obtain the time of a point… I will think about it! ;)

  3. by Gosh Ant

    26th April 2018

    11:36 pm

    No, I think the last one is very clear=obvious like that. Adding minutes all around would destroy its simplicity and would be just copying normal timetables
