Another day brings more data. What frustrates me most with the Ironman Athlete Tracker is that so many details are only available via the athlete page. It was relatively trivial, mistakes aside, to retrieve the full list of results and put that into a spreadsheet, but all the splits – with additional timing mats at this year’s race – were buried away. So I dusted off some old skills, dug out my books on Perl, and yesterday evening surprised myself with how easily I could remember regular expression syntax. A bit of graft using Craig Alexander’s page as a sample produced a set of code that could extract all the data from the page and spit it out into a far more useful CSV format. I’ll spare you more details, but with the tricky step done I just had to iterate through every athlete page and collate the results.
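For anyone curious what that extraction step looks like, here is a minimal sketch in Python rather than the Perl I actually used. The page markup, the `SAMPLE_PAGE` string, the column names and the function names are all invented for illustration; the real athlete page is laid out differently and the patterns would need adapting to it.

```python
import csv
import io
import re

# Hypothetical markup and made-up values: the real athlete page is not
# reproduced here, so this sample only stands in for its general shape.
SAMPLE_PAGE = """
<h1>Example Athlete</h1>
<table>
<tr><td>Bike 5.5 mi</td><td>0:13:02</td><td>1:06:45</td></tr>
<tr><td>Bike 28 mi</td><td>0:52:10</td><td>1:58:55</td></tr>
</table>
"""

# One pattern for the athlete name, one for each row of the splits table.
NAME_RE = re.compile(r"<h1>([^<]+)</h1>")
ROW_RE = re.compile(
    r"<tr><td>(?P<split>[^<]+)</td>"
    r"<td>(?P<split_time>[\d:]+)</td>"
    r"<td>(?P<race_time>[\d:]+)</td></tr>"
)


def extract_splits(html):
    """Return one (athlete, split, split_time, race_time) tuple per table row."""
    name_match = NAME_RE.search(html)
    name = name_match.group(1).strip() if name_match else ""
    return [
        (name, m.group("split").strip(), m.group("split_time"), m.group("race_time"))
        for m in ROW_RE.finditer(html)
    ]


def splits_to_csv(pages):
    """Collate the splits from several athlete pages into one CSV string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["athlete", "split", "split_time", "race_time"])
    for html in pages:
        writer.writerows(extract_splits(html))
    return buffer.getvalue()


if __name__ == "__main__":
    print(splits_to_csv([SAMPLE_PAGE]))
```

Iterating over every athlete then amounts to fetching each page and passing the HTML into `splits_to_csv`. Regular expressions are brittle against markup changes, so an HTML parser would be the safer choice for anything longer lived than a one-off scrape.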
You can access the detailed splits for the Ironman World Championship 2012 on Google Docs. If you want to play with the data yourself, take a copy. I’ve yet to really pick through it; for now I’ve taken the top ten in the male and female professional ranks and looked at how their races unfolded.
With the exception of Raelert, Kienle and Bracht, the male pros are largely swimming within a minute or two of each other and would have left a slickly executed transition close together. There’s more variance among the female pro swim times, but still a number of the top ten would have passed through transition together. If your bike and run are good enough in the women’s field, you can make the top ten despite a weaker swim, as Tajsich and Badmann demonstrate.
The disadvantage of including ten athletes on one chart is that when they are all close together it’s hard to read. Pace and performance throughout the bike section are consistent across the board; without going back to the Ironman feed, I would assume a number of these guys were riding in the same pack, particularly later in the race. The women’s data follows a similar trend, with almost identical pacing (bar the erroneous data point at mile 11). Although the field was more split early in the race, I would imagine a number rode much of the stage in a pack.
All the top ten men arrived in T2 within 10 minutes of each other. The race that unfolded afterwards, with finishing times spread over a 20 minute range, comes down to slight differences in pacing over the run. Trends in speed are again broadly the same, but there’s more variance and you can clearly see when an athlete is struggling, Dirk Bockel at mile 13.1 for example. The women were more broadly spread, over 20 minutes apart at T2, and the differences in run pacing (allowing for missing data) are less distinct than for the men. Ultimately a few places change through small differences in overall run speed, but the spread from first to tenth remains just over 20 minutes.
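The pacing trends come from simple arithmetic on the timing mat data: the time between two consecutive mats divided by the distance between them. A rough Python sketch of that calculation follows. The function names and the input format (distance in miles paired with elapsed run time as `h:mm:ss`) are my assumptions here, not the spreadsheet’s actual layout.

```python
def to_seconds(clock):
    """Convert an 'h:mm:ss' or 'mm:ss' string into a number of seconds."""
    seconds = 0
    for part in clock.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def segment_paces(mats):
    """Return (mat_distance, minutes_per_mile) for each segment between mats.

    `mats` is a list of (distance_in_miles, elapsed_time) pairs in race order.
    A mat whose distance or time does not move forward is treated as a bad
    reading and skipped, so the next segment is measured from the last good mat.
    """
    paces = []
    prev_dist, prev_secs = 0.0, 0
    for dist, clock in mats:
        secs = to_seconds(clock)
        if dist <= prev_dist or secs <= prev_secs:
            continue  # missing or erroneous data point
        minutes = (secs - prev_secs) / 60
        paces.append((dist, minutes / (dist - prev_dist)))
        prev_dist, prev_secs = dist, secs
    return paces


if __name__ == "__main__":
    # Made-up elapsed run times, purely to show the input format.
    example = [(5.0, "0:30:00"), (13.1, "1:26:42"), (26.2, "3:05:00")]
    for distance, pace in segment_paces(example):
        print(f"to mile {distance}: {pace:.2f} min/mile")
```

Skipping a bad reading means the following segment covers a longer stretch, which smooths over the gap instead of producing a spike on the chart.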
There’s a wealth of data in the spreadsheet and I’ve only scratched the surface, picking out the easiest information to produce. I’m quite interested in comparing age group pacing on the run against the professionals, and also in seeing whether there’s much difference between age group champions and those further back in their category: a question of whether these athletes are simply faster or also racing smarter. Otherwise I’m open to suggestions of how to further examine the information.