When I wrote about European and North American times in Kona a couple of weeks ago my intention was simply to explore another possible use of the data I’d gathered and to take a shallow look at the commonly held viewpoint – at least in Europe – that Europeans are faster. I was happy to use a single year as the basis for my comparison and to simply compare distributions of athletes before drawing some softly worded conclusions. Then John and Bevan got in touch asking if I’d like to do an interview on the IMTalk podcast about the post; clearly it’s a slow news week in the world of Ironman, but if I was going to discuss the results and – potentially – defend the position that Europeans are faster I needed to do more research.
I’d already tried removing the Lottery and Legacy slots to examine their impact on the North American performances (small), but fundamentally one year is not enough to form a conclusion – for all I knew 2012 was an outlier where Europeans performed better than expected. I decided to look at a couple of angles: firstly retrieving data for every Ironman World Championship in the athlete tracker and secondly comparing by age group to see how that impacted comparative performance. While my tools were busy fetching the historical results I looked at the difference in performance between European and North American age divisions in the 2012 data.
Having already seen the distribution of finish times for 2012, the difference in average finish times for male age groups in the above graph wasn’t a surprise. On average, over the age of 30, Europeans are faster than their North American counterparts. Of course the faster average could be a consequence of a few unusually slow North Americans while the bulk are actually faster, so I added standard deviation lines to give some indication of the spread of results for each group. European results are generally faster and typically not as widely distributed as their North American counterparts, although the presence of far more North Americans in Hawaii may contribute to the wider spread.
When I plotted the same graph for the women’s field the results were noticeably different. The Europeans are faster for the most part, but there is a far greater overlap between the two – it would be harder to separate the two territories and say for certain one was faster than the other. But we are still looking at a single year in Kona which might, for all we know, be an exception.
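For anyone wanting to reproduce this kind of summary, here is a minimal sketch of the calculation behind each point and its spread: a mean and standard deviation of finish times per age group and region. It is written in Python using only the standard library rather than the tools I actually used, and the `results` rows and the `summarise` name are hypothetical, made up purely for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical sample rows: (region, age_group, finish_time_in_minutes).
# These are invented values, not real Kona results.
results = [
    ("Europe", "M35-39", 585),
    ("Europe", "M35-39", 610),
    ("Europe", "M35-39", 640),
    ("North America", "M35-39", 600),
    ("North America", "M35-39", 665),
    ("North America", "M35-39", 730),
]


def summarise(rows):
    """Return {(age_group, region): (mean, sample standard deviation)}."""
    groups = defaultdict(list)
    for region, age_group, minutes in rows:
        groups[(age_group, region)].append(minutes)
    # The sample standard deviation needs at least two finishers per group,
    # so groups with a single finisher are left out.
    return {
        key: (mean(times), stdev(times))
        for key, times in groups.items()
        if len(times) > 1
    }


for (age_group, region), (avg, spread) in sorted(summarise(results).items()):
    print(f"{age_group} {region}: mean {avg:.1f} min, sd {spread:.1f} min")
```

A lower mean with a smaller standard deviation is the pattern described above for the European men, but two overlapping spreads are not proof of anything by themselves.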
While I was plotting these graphs my tools had successfully retrieved a number of other years. I opened up 2010 and quickly generated the same pair of charts.
Alarmingly, given I was scheduled to talk about fast Europeans and slower North Americans, my faster Europeans appeared to have gone. In 2010 there was a less obvious separation of results; some age groups appear faster, but it’s hard to determine visually if they actually are given the level of overlap. Unlike the 2012 results I’m not able to strip out the lottery athletes – it had proved hard enough to find names for 2012 without trying for every other year – so Europeans don’t appear notably faster even when a large number of North American lottery athletes are included. The women showed even less distinction in results (graph included in the collection below).
I worked through results from 2004 onwards waiting to see whether I would be explaining how Europeans appear to be faster or not. The results from 8 years of data are charted below, each year by gender and age group. There is not much to distinguish between European and North American women’s performance in Hawaii, not enough to claim European women are faster with any certainty. On the other hand during the same period, perhaps with the exception of 2010, European men have often been faster; not always to the same degree, but generally ahead.
Is that enough? You might be happy to eyeball the graphs and declare one side faster than the other, but having been encouraged to be more statistically rigorous I decided to also use this as an opportunity to experiment in R and perhaps apply some significance testing. Let’s put a huge caveat here: I am not a statistician. I opted for a Mann-Whitney test because I’d heard of it (possibly even used it, I forget now) in the distant past, it seemed simple to apply, and after a bit of research I felt it might give a degree of useful insight. Effectively it ranks the members of two populations – Europeans and North Americans – together, allowing us to determine whether the distribution of some attribute – finishing time – is significantly different between them or not. It seemed appropriate, although there are probably better ways to approach this (and potentially it’s entirely inappropriate; bear in mind my caveat).
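To make the ranking idea concrete, here is a rough sketch of a Mann-Whitney U test in plain Python. It is not the R code I ran: the `mann_whitney_u` function name and the sample times are hypothetical, and it uses the normal approximation with a tie correction and no continuity correction, so its p-values will not exactly match what a statistics package reports, particularly for small groups.

```python
from math import erfc, sqrt


def mann_whitney_u(a, b):
    """Return (U statistic for sample a, two-sided p-value).

    Uses the normal approximation with a tie correction and no continuity
    correction, so treat it as illustrative for small samples.
    """
    # Pool both samples, remembering which sample each value came from.
    pooled = sorted(
        (value, label) for label, sample in ((0, a), (1, b)) for value in sample
    )
    n = len(pooled)
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Positions i..j-1 hold tied values; each gets the average 1-based rank.
        average_rank = (i + 1 + j) / 2
        for k in range(i, j):
            ranks[k] = average_rank
        ties = j - i
        tie_term += ties**3 - ties
        i = j

    n1, n2 = len(a), len(b)
    rank_sum_a = sum(r for r, (_, label) in zip(ranks, pooled) if label == 0)
    # U counts the pairs where a value from a exceeds one from b (ties count half).
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        # Every value is identical, so there is nothing to distinguish.
        return u_a, 1.0
    z = (u_a - mean_u) / sqrt(var_u)
    return u_a, erfc(abs(z) / sqrt(2))


# Hypothetical finish times in minutes for one age group, not real results.
europeans = [585, 598, 610, 622, 640, 655]
north_americans = [600, 630, 665, 680, 700, 730]
u, p = mann_whitney_u(europeans, north_americans)
print(f"U = {u}, p = {p:.3f}")
```

A small U for the first sample means its times mostly rank below the second sample’s, which for finish times means faster; the p-value indicates how surprising that separation would be if both groups came from the same distribution.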
Taking the first two years I examined, 2010 and 2012, by way of example, the women’s age divisions unsurprisingly show no significant difference in either year, exactly the conclusion I’d draw glancing at the graph. On the men’s side the test confirmed that in 2010 there are fewer significantly different North American and European performances than in 2012. Test results for male athletes between the ages of 30 and 45 support faster Europeans consistently since 2004; outside of that the picture is more mixed. Below 25 and in the pro ranks the difference is rarely significant, and above 50 some age groups show a significant difference while others don’t. For those who like data – I’m prepared for critiques of the statistics – I’ve provided the table of results from the Mann-Whitney-Wilcoxon tests below.
It does appear that, broadly speaking at least, 30 to 50 year old European men are faster than their North American counterparts in Hawaii. That’s not to say there aren’t fast Americans or slow Europeans, just that the distribution of results seems to separate in this way. There are caveats around the statistics: I haven’t eliminated lottery athletes from every set of results for example, but my impression is that there is a difference in the male field. I can’t say if this is also true outside of Hawaii; comparing results across races and years is beyond the current analysis. Nor can I do more than speculate as to why a difference exists – I pinned it on differences in competition for slots in my previous post.
Hopefully though, I sounded reasonably knowledgeable and impartial when I spoke to Bevan and John yesterday afternoon. The interview should be up on the podcast sometime in the next few months; I’ll update this post when it’s available.