Colombia v Columbia: How many people get it wrong?

tldr; I reckon about a quarter of people get it wrong but I’m open to being corrected…

I was recently in Colombia (not Columbia) and whilst I was out there I noticed wristbands and T-Shirts etc for sale with the “It’s Colombia not Columbia” branding similar to below.

Screen Shot 2016-11-29 at 13.31.36.png
Credit: Miami Herald 

Colombians were understandably annoyed by how consistently their country’s name was misspelt, so this campaign was an attempt to highlight and change this. When I saw the campaign, I was curious to know the numbers behind the misspelling.

The tragic news of the Colombian plane crash gave me a chance to do a quick and dirty analysis of the scale of the problem.

Because the plane crash was such a big, worldwide news event, searches for Colombia or Columbia around that time were likely to be overwhelmingly about Colombia the country, rather than Columbia the university, clothing company or the state.

To quantify the scale of the problem and look at the worst culprits, I compared Colombia and Columbia in Google Trends for a timespan of one day.

Screen Shot 2016-11-29 at 13.49.26.png
screen-shot-2016-11-29-at-13-50-15
screen-shot-2016-11-29-at-13-49-36

If we assume that most searches around this day were about the plane crash, and therefore should have been ‘Colombia’, then you could see that most people were getting it right when they searched, with a hefty chunk getting it wrong. But it’s a big assumption that most people were searching for the plane crash.

So I then looked at just the News searches in Google Trends by narrowing down the category. That’s much less of an assumption.

The chart below shows searches for Colombia (blue) and Columbia (red) over the period of one day (the spikes are when news of the crash broke). On breaking news, the red peaks at around 35 relative to the blue’s 100, so 35 as a percentage of 35 + 100 is about 26%. That feels like a fairly good finger-in-the-air estimate of the share of people who misspell Colombia.
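For anyone who wants to check the sum, here is the same back-of-an-envelope calculation as a minimal Python sketch. The function name `misspelling_share` is my own invention for illustration, and the two inputs are just the approximate peak values read off the chart, so treat the result as a rough estimate rather than a measurement.

```python
def misspelling_share(wrong_peak: float, right_peak: float) -> float:
    """Share of the misspelt term in the combined peak interest of both spellings."""
    return wrong_peak / (wrong_peak + right_peak)


# Approximate peak values read off the chart: Columbia ~35, Colombia 100.
share = misspelling_share(35, 100)
print(f"{share:.1%}")  # 25.9%
```

Google Trends values are relative (scaled so the highest point is 100), which is why only the ratio between the two peaks matters here, not the absolute numbers.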

Screen Shot 2016-11-29 at 13.54.53.png

A quick Google search showed that even respected news organisations get the spelling of Colombia wrong. Interestingly, when I clicked on the links in Google’s ‘In the news’ results, below, it looked like the Columbia spelling had been corrected on the Business Insider site (but not in its page title) and on the Mirror (which presumably had initially published with Columbia, unless Google autocorrected), and was still misspelt on the Manchester Evening News.

Screen Shot 2016-11-29 at 13.40.32.png
Google autocorrecting or news editors correcting?

 

 

10 amazing things your internet search data is used for

In the days before the internet, libraries were a much more important source of free information for many people. What they lent, when and where from was, until recently, recorded by the Public Lending Right (PLR) body in the UK.

I (vaguely) remember the data that the PLR used to collect being used by the media as a gauge of public interest in a particular topic, book or genre. However, as it was by its nature out of date by the time it was published, it was of limited use.

Google is our librarian now and the data that it gathers not only tells us what people were interested in, but what in particular (a bit like being able to tell which specific paragraph of a book people were interested in, rather than just the fact they were interested in that book).

The data also tells us when specifically they were interested in that topic, where they were, if they were satisfied with the information they received and even what demographic they were likely to belong to.

Coupled with all of this extra information, we also know what they wanted last week, rather than waiting for six months and, as the internet provides information on virtually any topic, we have a full-on data set that details virtually anything that anyone has ever been interested in knowing.

So what I’m trying to say is that the data collected by search engines and ISPs when you search for something online is a bit of a treasure chest. That little query string at the end of your URL may not mean much taken on its own (apart from to those closest to you), but when combined with many others, then aggregated, sliced, analysed and interpreted, it becomes one of the most powerful sources of potential knowledge and insight into human behaviour that we have today.

In my opinion, that knowledge is largely untapped, partly due to a lack of awareness that it exists, partly due to the expense of getting more than a limited view of the data through third party tools and partly, perhaps, due to the lack of high profile use cases.

There are some researchers and organisations however that are making use of this data, mostly in Health, Finance and Marketing, paving the way for others.

I’ve made a start in listing them here. If you know of any other examples, feel free to comment and I’ll add them into the list.

1. Predicting unemployment

In March 2013, four academics from Beijing’s Renmin and Tsinghua universities published a paper detailing how search engine data had outperformed traditional methods of predicting unemployment.

Similar results were achieved by German researchers from Bonn University in May 2009.

2. Knowing when people are abusing drugs

In November 2012, the journal Clinical Toxicology (Philadelphia) published a paper detailing how internet search data could be used to detect outbreaks of people abusing drugs known as “bath salts”.

3. Measuring public awareness of Erectile Dysfunction

BJU International, the journal of the British Association of Urological Surgeons, published a paper in December 2012 looking at public awareness of erectile dysfunction in Ireland, following a series of public awareness campaigns.

4. Predicting outbreaks of Dengue fever

In August 2011, a paper in PLOS Neglected Tropical Diseases concluded that “Internet search terms predict incidence and periods of large incidence of dengue with high accuracy and may prove useful in areas with underdeveloped surveillance systems.”

5. Predicting outbreaks of the flu

Google.org have long been predicting flu outbreaks and have a sleek website that really brings the data to life.

6. Making loads of money from the stock market

Okay, so there’s a little bit of supposition in that but there have been studies linking search data to stock market activity and if anyone knows how to use data to make money, it’s got to be stockbrokers, right?

7. Helping computers understand humans

Microsoft looked at using search data to help machines understand human speech in this paper.

8. Predicting house prices

A study by researchers from MIT said: “We found evidence that queries submitted to Google’s Search Engine are correlated with both the volume of housing sales as well as a house price index.”

9. Knowing when we’re more likely to spend

The Bank of England were reportedly using search data to help them understand consumer confidence in the UK.

10. Selling you things online

Google and other search engines have long made their search data available for advertisers to research what their website visitors are most likely to search for, and so shape their ads, content and even website architecture accordingly.

7 Most Googled athletes of the 2012 Olympics


This year’s Olympics have really captured the imagination of the UK public.  Millions of us went online before, after or even during events to find out more about those on our screens.

So the question is, who were we searching for and what did we want to know?  The most popular personalities, in order of searches to Wikipedia, are listed below, with the top 5 searches involving their name listed beneath them.

1.  Jessica Ennis

  • jessica ennis boyfriend
  • jessica ennis hot
  • jessica ennis bikini
  • jessica ennis parents
  • jessica ennis sexy

2.  Usain Bolt

  • usain bolt 2012 olympics
  • usain bolt girlfriend
  • how old is usain bolt
  • how tall is usain bolt
  • usain bolt facts

3.  Bradley Wiggins

  • bradley wiggins twitter
  • bradley wiggins wife
  • bradley wiggins piers morgan
  • bradley wiggins diet
  • bradley wiggins olympics

4.  Mo Farah

  • mo farah foundation
  • mo farah olympics 2012
  • mo farah wife
  • mo farah the cube
  • mo farah medal ceremony

5.  Joanna Rowsell

  • joanna rowsell alopecia
  • joanna rowsell bald
  • joanna rowsell hair
  • joanna rowsell olympics
  • joanna rowsell cyclist illness

6.  Victoria Pendleton

  • victoria pendleton hot
  • victoria pendleton fhm
  • victoria pendleton pictures
  • victoria pendleton boyfriend
  • victoria pendleton disqualified

7.  Ian Thorpe

  • ian thorpe gay
  • is ian thorpe gay
  • ian thorpe girlfriend
  • ian thorpe boyfriend
  • ian thorpe 2012 olympics

Data from Hitwise, looking at searches made by the UK public over 4 weeks, ending 12th Aug 2012

Summertime searches

What do different countries search for in the summer?  Nations’ characters are revealed through their summertime searching…

(method at the bottom of this post)

UK

Top 3 summertime searches

  1. Hedge trimmers
  2. Float fishing
  3. Grass snake


Germany

Top 3 summertime searches

  1. Karwendelhaus (Rental homes in the Karwendel mountains)
  2. Brukenfahrt (cruise / tours)
  3. Putter (golf putter)

USA

Top 3 summertime searches

  1. 1st anniversary
  2. Tours Chicago
  3. Remove mildew

Italy

Top 3 summertime searches

  1. Corsa in Montagne (mountain race)
  2. Versicolor (a fungal skin infection)
  3. sr 50 (scooter)

Russia

Top 3 summertime searches

  1. кемер (Resort town in Turkey)
  2. пластиковые (Plastics packaging!?!)
  3. рейс (flight)

Method

I was looking for seasonally linked health searches in the UK and decided that the best way to do this was to use Google Correlate, a tool that can take your data (in my case, British summer temperatures from the Met Office website) and map search queries to that data.

What you end up with is a chart similar to the one below, which gives you the search term for your chosen country that most closely matches the pattern of your data, i.e. the search term that follows your trend, not the search term that is searched for the most.

I then exported all of the terms as a CSV file (which it lets you do if you’re logged in with a Google account) and, to get over the translation problem, I just pasted them all into Google Translate and from there into Wordle, where I edited the colours to match those of the country’s flag.
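To make the idea of “matching the pattern, not the volume” a bit more concrete, here is a minimal sketch in Python. It assumes the similarity measure is a plain Pearson correlation between your own series and each search term’s series. The function names (`pearson`, `rank_terms`) and all of the numbers are made up for illustration; they are not the tool’s real data or code.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def rank_terms(target, term_series):
    """Rank search terms by how closely each one follows the target series."""
    scores = {term: pearson(target, series) for term, series in term_series.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up monthly values (Jan to Dec), purely for illustration.
temperature = [4, 5, 7, 10, 13, 16, 18, 18, 15, 11, 7, 5]
searches = {
    "hedge trimmers": [1, 2, 5, 9, 14, 18, 20, 19, 13, 7, 3, 1],
    "news": [50, 52, 49, 51, 50, 48, 52, 50, 49, 51, 50, 52],
    "christmas tree": [1, 1, 1, 1, 1, 1, 1, 1, 2, 5, 30, 100],
}

ranked = rank_terms(temperature, searches)
for term, score in ranked:
    print(f"{term}: {score:.2f}")
```

In this toy data “news” has by far the most searches in most months, but “hedge trimmers” comes out on top because its shape follows the temperature curve.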

Feel free to comment below, or try it yourself and stick a link to it here…

Most searched new year’s resolutions

Recently, I’ve noticed a group of search terms that can be classified as ‘self improvement’ type searches.  For instance, whilst looking into what languages people wanted to learn online, I noticed that pretty much all of them had a surge of interest in the first week of the new year.

I decided to look at all of these searches together to see what the overall most searched for new year’s resolution was and if this corresponds to any other data.

I grouped the ‘self improvement’ searches together into 5 categories…

and then popped them into Google Insights for Search

**Yellow = Job**  **Green = Money** **Blue = Weight** **Purple = Gym** **Red = Learn**

(I did include ‘volunteer’ as a group and searches did peak in January for this term; however, the number of people searching for it was so low, you could barely make it out on the chart).

You can see they all have peaks in January, but what is interesting here is the effect of the recession, first pointed out to me by www.nicholine.com in her paper looking at the recession’s effect on search behaviour.

If you look at every year, searches including the term ‘jobs’ not only had the biggest overall volume but also the biggest difference in volume between the year as a whole and the spike in January. Last year, however, this appeared to drop off dramatically and almost get overtaken for the first time by ‘money’ searches…

**yellow = searches for ‘jobs’** and **green = searches for ‘money’**

My interpretation of this is that fewer people were seeking to change career in the new year last year, looking to hold on to what they had during a gloomy outlook whilst more people were interested in ‘money’ whether that was saving, not spending or earning.
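One rough way to put a number on the “January spike versus the year as a whole” comparison is to divide January’s search interest by the average for the year. The sketch below assumes twelve monthly values with January first. The function name `january_uplift` and both sets of figures are invented for illustration and are not the values behind the charts.

```python
from statistics import mean


def january_uplift(monthly_interest):
    """Ratio of January interest to the average across the whole year.

    Expects twelve values with January first. A result of 1.0 means no spike.
    """
    january = monthly_interest[0]
    return january / mean(monthly_interest)


# Made-up relative interest for 'jobs' searches, January to December.
jobs_typical_year = [100, 80, 78, 75, 74, 72, 70, 71, 73, 72, 68, 55]
jobs_last_year = [80, 76, 75, 74, 73, 72, 70, 71, 72, 71, 68, 58]

print(f"typical year: {january_uplift(jobs_typical_year):.2f}")
print(f"last year:    {january_uplift(jobs_last_year):.2f}")
```

A falling ratio from one year to the next would be consistent with the flattening of the January ‘jobs’ peak described above, although it says nothing on its own about why it happened.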

Steady learning and rising weight

People searching to learn a new language or skill in the new year have remained steady since 2004, and yet people looking to lose weight in the new year have slowly climbed in a very consistent way since 2004, with last year being a particularly high peak, coinciding with a relatively high rise in ‘gym’ searches.

How they compare

It’s difficult to find reliable data online; however, this report from Marketing Charts seems to back up our findings, whereas this from Bing suggests fitness is first (although they don’t mention whether or not they actually looked at any finance related data).

US searches for new year’s resolutions

All the above data was looking at UK searches; however, if you compare them to searches in the US, you can see that weight is a much (weightier) concern…

Again, **Yellow = Job**  **Green = Money** **Blue = Weight** **Purple = Gym** **Red = Learn**

Interesting to see also in this case that whilst searches for ‘weight’ (including weight loss etc) are a lot higher than in the UK, searches for the gym are actually a lot smaller comparatively – do you want to tell them or shall I?

I love getting and answering questions so feel free to comment using the form below.

Why are cupcakes so popular?

Inspired by the wheel of hunger visualisation from the other week I started to look into food trends around the world with the ultimate aim of making a “World food trends” visual in the near future.

Along the way I stumbled across a beautiful demonstration of how Google Insights for Search can show a trend sweeping the globe.

The video below shows searches for cupcakes from 2004 to the present day (the darker blue a country gets, the more popular cupcake searches are).  I like to think you could probably substitute ‘cupcake’ for ‘American fad’.

And this is what happened where it all started, in America.  Note, California starts the trend swiftly followed by those North East regions.

And here is the global search demand for cupcakes mapped out since 2004.  Note all of the peaks occurring in the early years in October (Halloween?) and now we have peaks in February and March too.

Global searches for cupcakes since 2004

Cupcake sites I came across during my research

Spiders and Conkers

My wife and I had a debate last night on the back of me having to catch and set free 4 massive house spiders over the course of two weeks.  My Twitter stream was also full of people mentioning the fact that they’d seen huge spiders, and the debate we had was around whether or not we should go out and gather some conkers!

My opinion is that the ability of conkers to deter spiders is an old myth.  But why would people think they had any effect in the first place?

Using what people search for on the internet provides compelling evidence…

Google searches for 'spiders' since 2004

So every year people search for spiders around the same time (May – June and then a big peak in September).  This year’s peak is particularly high, reflecting the unseasonably warm weather we’ve had in the UK, which is encouraging spiders to come out and find mates.

So far so good, but what about the conkers? Well…

The red line represents searches for ‘conkers’ and at first glance there is indeed a correlation between the two search terms, but take a closer look…

And you can see that as searches for conkers are peaking, the spiders are declining, i.e. by the time of year that conkers fall to the ground, spiders naturally stop mating and so are seen less.

So it’s easy to see why people would have thought that the conkers they had brought into their houses were keeping the spiders away, confusing correlation with causation, and hence the birth of an old wives’ tale.
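The “closer look” boils down to comparing when each search term peaks. Here is a minimal Python sketch of that check. The helper name `peak_index` is my own and the monthly numbers are invented to mimic the shape of the charts, not taken from them.

```python
def peak_index(series):
    """Position of the highest value in a series."""
    return max(range(len(series)), key=lambda i: series[i])


# Made-up relative search interest, one value per month, purely for illustration.
months = ["Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
spiders = [30, 45, 100, 60, 25, 15]
conkers = [2, 10, 55, 100, 20, 5]

spider_peak = peak_index(spiders)
conker_peak = peak_index(conkers)

print(f"spiders peak in {months[spider_peak]}, conkers peak in {months[conker_peak]}")
print("spiders already declining at the conker peak:",
      spiders[conker_peak] < spiders[spider_peak])
```

If the conker peak lands after the spider peak, the spiders were already on their way out before the conkers came in, which is the pattern that makes the conkers look effective.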

Note:  Despite the above evidence, we’re still going out to try and find some conkers tonight.

The wheel of hunger

(click image to enlarge)

Big thanks to Adam Hinks for his help in making this data look magnificent and to James Webb for his invaluable feedback and direction.  Want to see other ways search engine data can be made awesome?

Background

Our brains process 400 billion bits of information every second.

Sounds like a lot, doesn’t it?  Well it is, but fortunately we only ‘experience’ around 2,000 bits of this.

This filtering enables us as human beings to make decisions based upon the data we receive. Without it, we’d be paralysed by the overwhelming possibilities of everything we see, touch, hear, store, smell and sense.

In much the same way, when we look at what people are searching for online, it’s easy to be overwhelmed by rows upon rows of search terms, dates and volumes (even if it ‘only’ amounts to less than a million rows).

For someone looking at the data all day with advanced spreadsheet skills and tools it can be a little easier, but we can’t all be data ninjas and nor would we want to be.

That’s why the visualisation above was created.  It takes a mass of data around what people search for online and filters it down through stages in order to answer the important question for anyone interested in creating content around food ‘what should we be talking about / linking to / sharing / discussing / creating / revamping?’

The filter process:

What people search for

(look at a big dataset of search terms that send traffic to over 3,000 recipe websites, in this case Hitwise)

What (UK) people search for around food

(restrict data to only those searches that ended up visiting a food related website)

What (UK) people search for around food every month

(Download the data on a month by month basis)

The most trending searches around food every month

(Compare each month’s data to the previous month’s data and chart those searches that have risen the most in volume.  This is to make sure that we’re looking at search terms that are ‘big’ due to the month and not just those that are ‘big’ all year round).
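The last step of the filter can be sketched in a few lines of Python. This is only an illustration under my own assumptions: that “risen the most” means the absolute rise in volume from one month to the next, and that a term missing from the previous month counts as rising from zero. The function name `top_risers` and the sample figures are invented and are not the Hitwise data.

```python
def top_risers(previous_month, current_month, n=20):
    """Return the n search terms whose volume rose the most since last month.

    Both arguments map a search term to its volume for that month.
    Terms missing from the previous month are treated as rising from zero.
    """
    rises = {
        term: volume - previous_month.get(term, 0)
        for term, volume in current_month.items()
    }
    ranked = sorted(rises.items(), key=lambda item: item[1], reverse=True)
    # Keep only terms that actually went up.
    return [(term, rise) for term, rise in ranked if rise > 0][:n]


# Made-up monthly volumes, purely for illustration.
september = {"chicken recipes": 9000, "pumpkin soup": 800, "bbq ribs": 3000}
october = {"chicken recipes": 9100, "pumpkin soup": 4200, "bbq ribs": 1200,
           "toffee apples": 1500}

trending = top_risers(september, october, n=2)
print(trending)  # [('pumpkin soup', 3400), ('toffee apples', 1500)]
```

Notice that “chicken recipes” is the biggest term in both months but doesn’t make the cut, because it is ‘big’ all year round rather than ‘big’ because of the month.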

What’s the point?

In creating this visualisation we’ve hopefully made something that helps us better connect to our audiences.

As every marketer knows, messages need to be delivered to the right people at the right time in the right place, using their language.  In refining it to just the top twenty trending search terms for each month we’ve hopefully made something that will not just make sense but will be used to make a difference.


You can download the infographic in PDF format here BBC_Food_Infographic_v5

Most other search visualisations from this site are here

Please take a while to comment below.

Search engine data visualisations

I’ve decided I need a single place to put all of the search engine data visuals that I’ve been working on.

The visuals are made up of thousands of actual queries put into search engines by UK users over the course of a year.  This gives us an idea of ‘search demand’ which can/may/should equal actual, offline demand for a topic.

Feel free to republish however please link to this blog and also to James Webb who helped to create them.

They can be downloaded as PDFs at the bottom of this page.

Overall

Gardening

Health

Science

Nature

History

Questions

Download as PDF

Click the links below to open the visuals in PDF format for better quality printing / viewing.

Overall

Gardening

Health

Science

Nature

History

Questions