If you can’t avoid it, analyse it! #WorldCup

I am not a fan of football, but whether I like it or not, after three weeks of seeing numerous grown men cry on the pitch, we have made it to the final of the 2014 World Cup. And even though there were no vuvuzelas this time, fans have still managed to make a lot of ‘noise’ on social media platforms. It is possible that the 2014 World Cup might become the most discussed topic on social media, ever. There have been heated debates, office sweepstakes, match analyses… all trying to answer the most important question of all: who will win?

Can we predict who will win the World Cup by using social media data (rather than octopi or camels)?

And this is where it gets exciting (for me at least) – Crimson Hexagon, a company dedicated to social media analytics, broke down their prediction for the winner into a number of components:

  • each team’s FIFA ranking (determined by their performance over the last four years).
  • the fan excitement index, which is the volume of conversation about each team and its players on the day leading up to the game, compared against the average total posts for that team since last April.

 FIFA Rank + Fan excitement Index = Prediction

Using this ratio, they calculated how ‘excitement’ is building about a team in the lead-up to its match. They used these components to predict the winner of each game and, according to their calculations, Argentina will beat Germany on Sunday by 5-2.
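To make the ratio a bit more concrete, here is a minimal Python sketch of the fan excitement index as described above. The function name, the team names and the post counts are all made up for illustration, and I am assuming the "average" is an average daily post count. How Crimson Hexagon actually combined the index with the FIFA ranking is not spelled out here, so the sketch stops at the index itself.

```python
def excitement_index(posts_day_before: int, average_posts: float) -> float:
    """Ratio of pre-match post volume to the team's average post volume.

    Hypothetical helper: assumes both figures are counted over the same
    unit of time (for example, posts per day).
    """
    if average_posts <= 0:
        raise ValueError("average_posts must be positive")
    return posts_day_before / average_posts


# Invented numbers, for illustration only.
teams = {
    "Team A": {"posts_day_before": 150_000, "average_posts": 50_000},
    "Team B": {"posts_day_before": 120_000, "average_posts": 80_000},
}

indices = {name: excitement_index(**volumes) for name, volumes in teams.items()}

for name, index in indices.items():
    print(f"{name}: excitement index {index:.2f}")
```

A value above 1 would mean a team is being talked about more than usual in the run-up to its match.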


Researchers at Yahoo Labs had a slightly different approach: they analysed billions of Tumblr messages that answered the question ‘who will win the World Cup?’. According to their hypothesis, the popularity of a particular team, as well as that of its players and their mentions on Tumblr blogs, directly correlates with the team’s chances of winning the World Cup. They used a number of factors for their prediction analysis:

  • historical number of goals scored
  • number of goals opponents have scored
  • FIFA ranking
  • team mentions, average number of player mentions and standard deviation of player mentions per team

According to their analysis, Brazil would face Spain in the final, with Brazil winning the World Cup.
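The mention-based factors in the list above are simple summary statistics, and a rough Python sketch shows what they might look like for one team. The function name, the player names and the counts are invented, and I am assuming a population standard deviation. The Yahoo Labs researchers' actual feature pipeline is not described here.

```python
from statistics import mean, pstdev


def mention_features(team_mentions: int, player_mentions: dict[str, int]) -> dict[str, float]:
    """Summarise mention counts for one team (hypothetical helper)."""
    counts = list(player_mentions.values())
    return {
        "team_mentions": float(team_mentions),
        # Average number of mentions per player.
        "player_mentions_mean": mean(counts),
        # Spread of mentions across players (population standard deviation).
        "player_mentions_std": pstdev(counts),
    }


# Invented counts, for illustration only.
features = mention_features(1000, {"Player 1": 10, "Player 2": 20, "Player 3": 30})
print(features)
```

A high standard deviation would suggest that the conversation is concentrated on one or two star players rather than spread across the squad.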


Andrew Yuan defined a predictive model for the World Cup for his final year project by using FIFA rankings since 1993. Andrew found a correlation between the teams’ relative positions in the ranking table, the location where the match took place (home, away or neutral) and the proportion of matches won. He simulated every possible outcome of the World Cup and calculated their probabilities. For further details on his methodology, see http://andrewyuan.github.io/methodology.html; you can also contact Andrew through his website.

His prediction showed Spain and Germany fighting for the title, which Spain would eventually win.
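To give a feel for what "simulating every possible outcome" involves, here is a toy Python sketch that enumerates every result of a four-team knockout and adds up each team's chance of lifting the trophy. This is not Andrew's model: the logistic `win_probability` curve, its `scale` parameter, the team names and the ranks are all assumptions of mine, and match location is ignored entirely.

```python
from itertools import product


def win_probability(rank_a: int, rank_b: int, scale: float = 10.0) -> float:
    """Hypothetical chance that team A beats team B.

    Lower rank numbers are better, so a better-ranked team gets more than 50%.
    """
    return 1.0 / (1.0 + 10 ** ((rank_a - rank_b) / scale))


def champion_probabilities(ranks: dict[str, int], semi_finals) -> dict[str, float]:
    """Enumerate every outcome of two semi-finals plus a final."""
    totals = {team: 0.0 for team in ranks}
    (a, b), (c, d) = semi_finals
    for w1, w2 in product((a, b), (c, d)):
        l1 = b if w1 == a else a  # loser of the first semi-final
        l2 = d if w2 == c else c  # loser of the second semi-final
        p_semis = (win_probability(ranks[w1], ranks[l1])
                   * win_probability(ranks[w2], ranks[l2]))
        p_final = win_probability(ranks[w1], ranks[w2])
        totals[w1] += p_semis * p_final
        totals[w2] += p_semis * (1.0 - p_final)
    return totals


# Invented ranks, for illustration only.
ranks = {"A": 1, "B": 2, "C": 3, "D": 4}
probs = champion_probabilities(ranks, (("A", "B"), ("C", "D")))

for team, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{team}: {p:.3f}")
```

With 32 teams and a group stage the number of possible outcomes grows very quickly, but the bookkeeping stays the same: multiply the probabilities along each path and sum them per team.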



Liverpool FC also took a different (and less scientific) approach to predicting how the World Cup would play out, by analysing the representation of countries on their Facebook page. According to that analysis, England would (should?) have won the World Cup.


Even though these predictions have so far been incorrect, I will still be keeping an eye on Sunday’s final to see whether the predicted ‘5-2’ to Argentina will actually happen!

UPDATE: After watching the final last night, I can safely say that none of the social media predictions got it right. (Final result: Germany 1-0 Argentina.)
