What’s wrong with our polls?
Up until the last moment, most polls failed to foresee Donald Trump winning the American presidential election. Recently, polls have also misjudged the support for right-wing populist parties, Brexit and the Colombian peace deal. Some would say that there’s always a chance of being wrong; polling deals with probabilities, which are by nature uncertain, as anyone who has played a game involving dice would know (right, Settlers lovers?). But what can explain the recent inaccuracies in major elections and referendums? I argue that the problem lies in our assumption that the past can predict the future.
A poll is an activity in which people are asked questions in order to find out what they think about something. So, when conducting a poll, one might ask “who are you going to vote for in the upcoming election?” and get the reply “Hillary Clinton”. After asking a certain number of people, we generally assume that we have an accurate picture of what the results would look like if everyone voted. If we could ask everyone in the U.S. (and get them to answer truthfully) there would be little risk of inaccuracy (and no need for an election). But this is neither possible nor necessary. By using the right tools, we can usually get an accurate picture of the entire population by asking a few. The objective is to get as reliable a result as possible with as little effort as possible.
One way to do this is to randomly pick people out of a phone book and call them up to ask whatever it is we want to know. The method may sound erratic but is in fact fairly reliable. The probability of getting systematic errors in the collected answers is quite small, assuming that everyone answers. But increasingly, people don’t. Many young people don’t own landlines and don’t pick up when an unknown number calls their cell phone, while older people tend to have landlines and answer them. This means that different groups will be over- or underrepresented in the responses, ultimately giving an inaccurate picture.
Another difficulty is getting people to stay on the line and answer the questions sincerely. The difference between polls and the actual outcome on election day has been dubbed the “Brexit effect”; however, similar effects such as the “Shy Tory Factor” and the “Bradley Effect” have been known for some time. People tend to lie in polling situations if they feel cornered or feel that their self-image is threatened. As a result, there can be systematic errors in the responses, because we cannot tell who is lying and who is telling the truth.
Another method, often used together with the one described above, is to create quotas and try to fill them. If you, for example, know that 50% of the population are women, 30% are Hispanic and 15% are unemployed, this should be reflected in your quotas. So out of the people who answered your poll, you would have to check how many fall into each “quota” and adjust the results accordingly. Therefore, if only 5% of those who answered your poll were Hispanic, you would scale up their answers so that they count for 30% of the result, letting their views stand in for the views of all Hispanics in the general population. The problem here is obviously that the few people you reached might very well not share the views of the many you did not, creating an inaccurate depiction. This has been reported to have happened in some polls prior to the U.S. election.
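To make the adjustment concrete, here is a minimal sketch of this kind of weighting in Python. Everything in it is made up for illustration: the function name `weighted_share`, the candidates “A” and “B” and the individual answers are hypothetical, and the population is simplified to two non-overlapping groups. Only the 5% and 30% figures come from the example above. It is not how any particular polling firm works, just the basic arithmetic.

```python
# Illustrative sketch only: the groups, candidates and answers are made up.
from collections import Counter


def weighted_share(responses, population_share, candidate):
    """Estimate support for `candidate` after weighting each group
    up or down to its share of the general population.

    responses: list of (group, choice) pairs, one per respondent
    population_share: dict mapping group -> share of the population (sums to 1)
    """
    n = len(responses)
    sample_counts = Counter(group for group, _ in responses)
    # A respondent's weight is the group's population share divided by
    # the group's share of the sample.
    weights = {
        group: population_share[group] / (count / n)
        for group, count in sample_counts.items()
    }
    total = sum(weights[group] for group, _ in responses)
    support = sum(
        weights[group] for group, choice in responses if choice == candidate
    )
    return support / total


# Hypothetical sample of 100 people: 5 Hispanic respondents (5%), 95 others.
responses = (
    [("hispanic", "A")] * 4 + [("hispanic", "B")] * 1
    + [("other", "A")] * 38 + [("other", "B")] * 57
)
population_share = {"hispanic": 0.30, "other": 0.70}

raw = sum(1 for _, choice in responses if choice == "A") / len(responses)
weighted = weighted_share(responses, population_share, "A")
print(f"raw: {raw:.2f}, weighted: {weighted:.2f}")
```

In this invented sample, candidate A goes from 42% in the raw answers to 52% after weighting. Each of the five Hispanic respondents now counts six times over, so if just one of them had answered differently the estimate would move by six percentage points. That is the fragility described above.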
But even if you manage to get everyone you call to answer, and do so truthfully, and if you correct any mishaps by “weighting” certain groups (filling out the quotas), how do you know who is actually going to turn out on election day and vote? Come election day, will the people who answered your poll find themselves in the voting booth, and will they vote the way they said they would? As with the quotas, this can be accounted for to a certain extent by looking at people’s voting tendencies in the past. Less educated and less affluent people tend to vote to a lesser extent in most countries, and women are slightly more likely to vote than men. By adjusting the answers to the expected participation in different groups, we assume that we can get a relatively clear picture of what’s going to happen on election day.
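The turnout adjustment can be sketched in the same spirit. The assumption in this Python example is that each respondent is counted in proportion to how likely their group is to vote, based on past elections. The function name `likely_voter_share`, the two groups, the turnout rates and the answers are all hypothetical, chosen only to show how much the result depends on the turnout guess.

```python
# Illustrative sketch only: groups, turnout rates and answers are made up.

def likely_voter_share(responses, turnout, candidate):
    """Estimate support for `candidate` among expected voters.

    responses: list of (group, choice) pairs, one per respondent
    turnout: dict mapping group -> assumed probability of voting
    """
    # Each respondent counts in proportion to their group's assumed turnout.
    total = sum(turnout[group] for group, _ in responses)
    support = sum(
        turnout[group] for group, choice in responses if choice == candidate
    )
    return support / total


# Hypothetical sample: 60 people who voted last time, 40 who did not.
responses = (
    [("voted_before", "A")] * 36 + [("voted_before", "B")] * 24
    + [("did_not_vote", "A")] * 10 + [("did_not_vote", "B")] * 30
)

# Assumption based on the past: previous non-voters mostly stay home again.
historical = {"voted_before": 0.8, "did_not_vote": 0.2}
# Alternative: previous non-voters turn out at the same rate as everyone else.
surprise = {"voted_before": 0.8, "did_not_vote": 0.8}

expected = likely_voter_share(responses, historical, "A")
actual = likely_voter_share(responses, surprise, "A")
print(f"expected: {expected:.2f}, if non-voters show up: {actual:.2f}")
```

With the historical turnout guess, candidate A appears to lead with about 55%. If the people who stayed home last time show up after all, the very same answers give A about 46%. Nobody lied and nobody changed their mind; only the assumption about who votes was wrong.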
The bottom line is that we assume that things will progress in the same way they have in the past, that some trends will persist long into the future. We assume that social groups will vote similarly and that individuals in these groups therefore can be weighted to represent each other. We assume that the groups that didn’t vote last time will not do so this time either, and that they therefore can be removed from the equation.
But it is possible that the current state of affairs is different from that of the past, and that the future therefore cannot be predicted by relying on old data. While it is tempting to see right-wing populism as a fad (“their support will disappear when the economy picks up”) and Brexit as an accident (“some didn’t understand that it was the actual referendum they voted in”), the reality is that many of us seem to stand at an ambiguous crossroads between the old and the new. If this is true, and Trump, Brexit and right-wing populism today represent something “new” when it comes to political behaviour (even if the message is old), then the methods we’ve used so far risk becoming inadequate for predicting what’s next.