The science of prediction



This blog post is a short introduction to the science of prediction, a topic I have been totally immersed in over the last few months and recently presented on at the 2014 ESOMAR Congress with Hubertus Hofkirchner. I thought I would share some of what I have learnt.


The accuracy of any prediction is based roughly on this formula...

Prediction accuracy = Quality of the information x Effort put into making the prediction x (1 - Difficulty of accurately aggregating all the dependent variables) x Objectivity with which you can do this x Randomness of the event

P = Q x E x (1 - D) x O x R

Here is the thinking behind this:
  • If you have none of the right information, your prediction will be unreliable
  • If you don't put any effort into processing the information, your prediction may well be unreliable
  • The more complex the task of weighing up and analysing the information needed to make a prediction, the less likely it is that the prediction will be correct
  • Unless you stand back from the prediction and look at things objectively, your prediction could be subject to biases that lead you to an inaccurate result
  • Ultimately, prediction accuracy is capped by the randomness of the event. For example, predicting the outcome of tossing a coin once versus 10,000 times involves completely different levels of prediction reliability.
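To make the formula concrete, here is a toy sketch of it in Python. The 0-to-1 scales and factor names are my own illustrative assumptions; the formula is a rough conceptual guide rather than a calibrated model.

```python
# Toy sketch of P = Q x E x (1 - D) x O x R, with every factor scored 0-1.
# The scales and example numbers are illustrative assumptions, not calibrated values.

def prediction_accuracy(quality, effort, difficulty, objectivity, randomness_cap):
    """Rough accuracy score: each argument is a value between 0 and 1."""
    for factor in (quality, effort, difficulty, objectivity, randomness_cap):
        if not 0.0 <= factor <= 1.0:
            raise ValueError("all factors must be between 0 and 1")
    return quality * effort * (1 - difficulty) * objectivity * randomness_cap

# A well-informed, diligent, objective forecaster on a fairly simple,
# mostly non-random event still only scores ~0.49:
print(prediction_accuracy(0.9, 0.8, 0.2, 0.9, 0.95))
```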

Realize that prediction accuracy is not directly linked to sample size


You might note, as a market researcher, that this formula does not directly depend on sample size; i.e. one person with access to the right information, who is prepared to put in enough effort, has the skills needed to process the data and is able to remain completely objective, can make as good a prediction as a global network of market research companies interviewing millions of people on the same subject! I cite as an example Nate Silver's achievement of single-handedly predicting the 2012 US election result in all 50 states.

Now obviously we are not all as smart as Nate Silver: we don't have access to as much information, few of us would be prepared to put in the same amount of effort, and many of us may not be able to process the information as objectively.

So it does help to have more than one person involved, so that the errors caused by one person's lack of information, or another's lack of effort or objectivity, can be accounted for.

So how many people do you need to make a prediction?


Now this is a good question, and the answer, obviously, is that it depends.

It depends, firstly, on how much expertise the people making the prediction have on the subject individually, and how much effort they are prepared to make. If they all know their stuff, or are prepared to do some research and put some thought into it, then you need far fewer people than you might think.

16 seems to be about the ideal size of an active, intelligent prediction group

In 2007, Jed Christiansen of the University of Buckingham took a look. He used a future event with very little general coverage and impact, rowing competitions, and asked participants to predict the winners. A daunting task, as there are no clever pundits airing their opinions in the press, as there are in soccer. However, Christiansen recruited his participant pool from the teams and their (smallish) fan base through a rowing community website - in other words, he found experts. He found that the magic number was as low as 16: markets with 16 traders or more were well calibrated; below that number, prices could not be driven far enough.

The Iowa Electronic Markets, probably the most famous prediction system out there, which has successfully been used to predict over 600 elections, has, I understand, involved an average of fewer than 20 traders per prediction.

Taking account of ignorance


However, for every completely ignorant person you add into the mix, who effectively makes a random prediction, you will instantly start to corrupt the prediction. And in many situations the scarcity of experts means that, to separate ignorant predictions from expert ones, you often need to interview a lot more people than 16.

Take, for example, trying to predict tomorrow's weather. Imagine that 10% of the people you ask have seen the weather forecast and know it will not rain - these could be described as the experts - and the rest simply guess, 50% guessing it will rain and 50% that it won't. It is easy to see that if, by chance, enough of the random guessers happen to predict rain - which is entirely possible - the group prediction will be wrong. Run the maths, and for 95% certainty you need a margin of error of less than 10%, which means you will have to ask 86 people.

It gets even harder if the experts themselves are somewhat split in their opinions. Say, for example, you were trying to predict who will win a tennis match, and 10% of the sample you ask are keen tennis fans (the experts) who predict 2:1 that player A will win, while the rest guess randomly, 50% player A and 50% player B. Because of the division among the experts you now need a margin of error of less than 7% to be 95% confident, which means you will need to interview around 200 people.
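As a rough illustration of the arithmetic behind these figures, here is the standard sample-size calculation for a given margin of error at 95% confidence. This is a sketch using the usual normal approximation with p = 0.5; the figures quoted above were presumably derived with slightly different assumptions.

```python
import math

def sample_size(margin_of_error, p=0.5, z=1.96):
    """People needed so the margin of error on a proportion p stays below
    margin_of_error at 95% confidence (normal approximation)."""
    return math.ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)

print(sample_size(0.10))  # margin of error under 10 points -> about 97 people
print(sample_size(0.07))  # margin of error under 7 points  -> about 196 people
```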

Taking account of cognitive bias


It gets harder still if you start to take into account the cognitive biases of the random sample. For example, just by asking whether you think it will rain tomorrow, more people will randomly say yes than no, because of latent acquiescence bias. We have tested this out in experiments: if you ask people to predict how many wine drinkers prefer red wine, the prediction will be 54%; if you instead ask them to predict how many wine drinkers prefer white wine, the share given to red wine drops to 46%. So it is easy to see how cognitive biases like this make predicting things difficult.

In the weather example above, this effect would instantly cancel out the opinions of the experts, and no matter how many people you interviewed you would never get an accurate weather prediction from the crowd unless you accounted for this bias.
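One simple way to account for a framing bias like this - my own illustration rather than a method described in the post - is to ask the question both ways round and average the mirrored answers, so the acquiescence effect cancels out:

```python
# Illustrative debiasing sketch (an assumption of mine, not the post's method):
# ask the prediction question in both framings and average the mirrored results.
red_when_asked_about_red   = 0.54  # predicted share preferring red (red-framed question)
red_when_asked_about_white = 0.46  # same quantity implied by the white-framed question

debiased_red = (red_when_asked_about_red + red_when_asked_about_white) / 2
print(f"debiased estimate that red is preferred: {debiased_red:.0%}")  # 50%
```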

This is just one of a number of biases that impact the accuracy of our predictions, one of the worst being our emotions.

Asking a Manchester United fan to predict the result of their team's match is nigh on useless, as it is almost impossible for them to envisage losing a match, due to their emotional attachment to the team.

This makes political predictions particularly difficult.

Prediction biases can be introduced simply as a result of how you ask the question


Imagine I were doing some research to get people to predict how often a tossed coin comes up heads, and I asked the question "If I toss this coin, predict if it will be heads or tails". For the reasons explained above, on average around 68% of people will say heads. The question has been asked in a stupid way, so it delivers a wildly inaccurate aggregated prediction. If you change the question to "If a coin were tossed 10,000 times, predict how often it would be heads", you probably need no more than a sample of 1 to get an accurate prediction. Now this might sound obvious, but this issue sits at the root of many inaccurate predictions in market research.

Imagine you asked 15 people to predict the "% chance" of it raining tomorrow, and 5 of them happen to have seen the forecast and know there is a 20% chance of rain, while the rest randomly guess numbers between 0% and 100%. If their average random guess is 50%, this will push the average prediction up to 40% rain. As long as the sample contains the same mix of informed and uninformed predictors, it does not matter how many more people you interview: the average prediction will never improve and will always be out by 20 points.
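A quick worked version of that arithmetic:

```python
# 5 informed predictions of 20% rain mixed with 10 uninformed guesses averaging 50%.
informed = [20] * 5
uninformed = [50] * 10

crowd_mean = sum(informed + uninformed) / (len(informed) + len(uninformed))
print(crowd_mean)  # 40.0 -- out by 20 points, and adding more people with the
                   # same informed/uninformed mix never moves it closer to 20
```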

This runs very much counter to how we tend to think about things in market research, where it's nearly all about gathering large, robust samples. In the world of prediction, it's all about isolating experts and making calibrations to take account of biases.

The often second-hand way in which we ask questions can exacerbate this.

"Do you like this ad" for example is not the same question as whether you think its going to be a successful ad. The question is a Chinese whisper away from what you want to know.

A successful ad is not an ad I like; it's an ad that lots of people will like. Change the question and motivate the participants to really think, and we have found that the sample needed to make an accurate prediction about the success of an ad drops from around 150 to as low as 40.

Picking the right aggregation process


The basics

Imagine you were asking people to predict the trading price of a product and a sample of  predictions from participants looks like this.

$1, $1.2,  $0.9, $0.89, $1.1,  $0.99, $0.01, $1.13,  $0.7,  $10,000  

Your mean = roughly $1,000... Whoops, that joker putting in $10k really messed up our prediction.

For this reason you cannot use the mean. For basic prediction aggregation we recommend using the median. The median of these predictions is $1, which looks a lot more sensible.

An alternative is simply to discard the "outliers" and use all the data that look sensible. In this example it is the $0.01 and the $10,000 that look out of sync with the rest; removing these and averaging what is left gives about $0.99, which makes use of more of the data and feels a little more precise.
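Here is that comparison worked through with Python's statistics module (the trimming rule used here is my own illustrative choice):

```python
from statistics import mean, median

predictions = [1.00, 1.20, 0.90, 0.89, 1.10, 0.99, 0.01, 1.13, 0.70, 10_000]

print(f"mean:   ${mean(predictions):,.2f}")    # ~$1,000.79 -- wrecked by the $10k joker
print(f"median: ${median(predictions):,.2f}")  # ~$1 -- far more sensible

# Discard the obvious outliers, then average what is left
trimmed = [p for p in predictions if 0.10 <= p <= 100]
print(f"trimmed mean: ${mean(trimmed):,.2f}")  # ~$0.99
```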

Weighting individual predictions


The importance of measuring prediction confidence

In the world of prediction it is all about working out how to differentiate the good predictors from the bad, and one of the simplest techniques is simply to ask people how confident they are in their prediction.

For example, if I had watched the weather forecast I would be a lot more confident in predicting tomorrow's weather than if I had not. So it would be sensible, when asking people to predict tomorrow's weather, to also ask them whether they had seen the forecast and how confident they are. From this information you could easily isolate the "signal" from the "noise".
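A minimal sketch of that filtering idea, using an invented survey layout (the field names and numbers below are assumptions, not real data):

```python
from statistics import median

# Hypothetical responses: predicted % chance of rain plus whether they saw the forecast.
responses = [
    {"rain_chance": 20, "saw_forecast": True},
    {"rain_chance": 25, "saw_forecast": True},
    {"rain_chance": 60, "saw_forecast": False},
    {"rain_chance": 70, "saw_forecast": False},
    {"rain_chance": 80, "saw_forecast": False},
]

everyone = [r["rain_chance"] for r in responses]
informed = [r["rain_chance"] for r in responses if r["saw_forecast"]]

print(median(everyone))                                     # 60 -- dominated by the guessers
print(median(informed) if informed else median(everyone))   # 22.5 -- the "signal"
```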

The trick with all prediction protocols is to find a way of isolating the people who are better informed than others, and better at objectively analysing that information; but in most cases it is not as easy as asking whether they have seen the weather forecast.

For more complex predictions, like the result of a sports match, the relationship between prediction confidence and prediction accuracy is not linear, but confidence weighting can certainly help if it is carefully calibrated. How you go about this is a topic for another blog post.

In the meantime, if you are interested in finding out more about prediction science, read our recently published ESOMAR paper titled Predicting the future.





How to make the perfect guess in a pub quiz



Having spent the last few months researching and studying the science of prediction, and also being quite fond of pub quizzes, here is my guide to making the perfect guess in a pub quiz, using some of what we have learnt.


Step 1: Ideation


Ask people to think of the first answer that comes into their heads.

If someone thinks of an answer they should not shout it out, as this could corrupt the purity of the other participants' thinking. They should put up their hand to indicate they have thought of an answer, and write it down. They should also write down how confident they are on a scale of 1 to 3. Each player can think of more than one answer, but they must score their confidence in each one.

Confidence range:
1 = a hunch
2 =  quite confident
3 = certain

Answer time:
Under 5 seconds = certain
5+ seconds = assign certainty based on personal confidence measure...

Step 2: Initial idea evaluation


Once everyone has given up, share the answers across the team, along with the confidence level behind each one.

Rules for deciding if the answer is correct:
  • If more than one person has come up with the same answer in under 5 seconds, it is almost certain that this answer is correct.
  • If anyone is certain about their answer, there is a high chance it is correct.
  • If more than one person comes up with the same answer and the combined confidence score is higher than 3, there is quite a high chance that answer is correct, and I suggest you opt for it.
If there is a conflict, or no answer scores more than 2 points, go to step 3....

If nobody has come up with an answer the team is satisfied with go to step 4....

Step 3: Answer market trading


Each person must rate each answer by buying or selling shares in it with some "virtual money". They can buy or sell up to 2 shares in each answer.

Tip: if a person has 2 ideas that are both "hunches", research has shown that the first idea is around 30% more likely to be correct. Take this into consideration when making your buy/sell decisions.

E.g. if I think an answer is definitely correct, I buy 2 shares. If I think it is correct but I am unsure, I buy 1 share. If I think it is definitely not correct, I sell 2 shares. If I am feeling a little uncomfortable and suspect it is wrong, I sell 1 share. Everyone has to commit to buying or selling - nobody is allowed to sit on the fence.

Add up the total money traded in each idea and choose the winner.

If you want to be super nerdy about how you do this, then don't simply add up the amount bet. Answers should be weighted somewhat, as there is not a linear relationship between betting confidence and prediction accuracy. Having studied data from a large number of predictions, we have found that the prediction accuracy of someone who claims to be very confident is not twice as good as that of someone who has a hunch - it is only about 20% better (see chart below). And people with a hunch are only about 10% better than people making a total guess. Interestingly, there is little difference between someone who has a hunch and someone who says they are fairly sure.


Furthermore, when you compare people betting against things with people betting for things, the predictive value of the amount bet varies in an odd way. We found that smaller negative bets are slightly more predictive than large negative bets. Strong positive bets, on the other hand, were more predictive than small positive bets, but those who bet more than 2 were actually slightly less predictive than those who bet 2. Hence our 2-point betting scale.


A more accurate betting aggregation process should score the amount bet like this:

-2 =  -20% 
-1 =  -20%
+1 = +10% 
+2 = +20% 
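Here is a sketch of how that weighted aggregation could work in practice. The weights are the ones above; the answers and bets are invented examples.

```python
# Weighted answer-market aggregation: map each bet to the weights above, sum per
# answer, and only trust answers with a positive total. Example data is invented.
BET_WEIGHTS = {-2: -0.20, -1: -0.20, +1: +0.10, +2: +0.20}

bets = {
    "Botswana": [+2, +1, -1],   # each player's buy/sell position in this answer
    "Namibia":  [+1, -2, -2],
}

scores = {answer: sum(BET_WEIGHTS[b] for b in bs) for answer, bs in bets.items()}
best_answer, best_score = max(scores.items(), key=lambda item: item[1])

if best_score > 0:
    print(f"Go with {best_answer} (score {best_score:+.2f})")
else:
    print("No answer trades positive -- go to step 4")
```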

If on either of these aggregation processes no idea has a positive trading value then go to step 4....

Step 4: Idea stimulation


If you are not satisfied with any answer, then all the team members should voice any "clues" they may be thinking of, e.g. "I think his name begins with B" or "I think it's something to do with football". Your thoughts could help another person think up the answer.

The scientific term for this is "dialectical bootstrapping", which basically means the sharing and discussion of ideas, and it has been shown to improve crowd wisdom generation processes. Find out more in Herzog and Hertwig (2009).

The more small clues you share, the greater the chance of one of them triggering a thought in a team member. Note that these can also be negative clues, e.g. "it's definitely not..."

If this process stimulates any ideas then go back to step 3 to evaluate them...

Step 5:  Picking the best of a bad bunch of guesses



If you are left with more than one answer that nobody is particularly satisfied with, then pick the first answer the first person thought of. This one has the highest chance of being correct. It won't necessarily be right, but it will have a slightly higher chance than the others.

Advanced techniques:


Performance weighting your team's predictions

If you keep track of each individual's answer-trading record over several quizzes (i.e. if they bought 2 shares in an answer that eventually proved to be correct, their personal balance would be +2), you can start to weight your team's market predictions. You can do this by giving each person in the team a different total pot of money to bet, based on how much money their past predictions would have won.
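Here is a sketch of what that bookkeeping might look like; the exact rule for turning a past balance into a betting pot is my own assumption, not something from the post.

```python
# Track each player's running balance (+shares when the answer they backed was
# correct, -shares otherwise), then scale their next betting pot accordingly.
# The allocation rule (base pot plus balance, floor of 2) is an invented example.

history = {
    # player: [(shares bought(+) or sold(-), did the answer turn out correct?), ...]
    "Alice": [(+2, True), (+1, True), (-2, False), (+2, False)],
    "Bob":   [(+1, False), (+2, False), (-1, True), (+1, True)],
}

def balance(bets):
    """Add the shares when the answer was correct, subtract them when it wasn't."""
    return sum(shares if correct else -shares for shares, correct in bets)

balances = {player: balance(bets) for player, bets in history.items()}
print(balances)  # {'Alice': 3, 'Bob': -3}

BASE_POT = 10
pots = {player: max(2, BASE_POT + bal) for player, bal in balances.items()}
print(pots)      # better past predictors get more virtual money to trade with
```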

Note that it would take several weeks, studying at least 100 predictions, to get a good idea of each player's prediction ability, so it would be a mistake to calibrate this after only one or two quizzes - luck plays a far more important role than skill in the short term.

You might also want to assess how each individual's confidence levels change after they have drunk 1, 2 or 3 units of alcohol, and start removing budget (or indeed giving extra budget!) as the night progresses.

Encouraging the team to think like foxes not hedgehogs

What buggers up the predictions of many pub quiz teams is the bullish viewpoint of one or two individuals. Having strong opinions about things, I am afraid, does not generally correlate very well with actually being good at making predictions. If you want to read up on the evidence for this, I recommend you order this book - all will be explained.



The team should foster an atmosphere where it is OK to change your mind, where it is not a battle between right and wrong, and where nobody is scared of failure.

Avoiding decision making biases

If the question is multiple choice, make sure that your answer is not biased by order effects or anchoring in the way the question is asked. For example, in yes/no questions more people pick yes than no for irrational reasons. When presented with multiple-choice options, slightly more people pick the first option for irrational reasons. By being aware of this you can make sure your decisions are made objectively.

Important Note/disclaimer:

This advice is a fantasy methodology for making a perfect prediction. I don't advocate using it in a real pub quiz. Firstly, for practical reasons: at the speed most pub quizzes progress, you probably would not have time to implement this approach. Secondly, it may not be in the spirit of a fair pub quiz to use this technique in real life - it might be considered cheating!


