Fooled by randomness in the US election: why we shouldn’t pay too much attention to daily polls

I’ve been in an argument this morning with someone on my desk, who thinks Barack Obama should be seriously worried by Mitt Romney’s momentum in the polls. A Gallup/USA Today poll yesterday put Romney ahead by four points among likely voters in the crucial swing states; it’s not definitive, says my opponent, but the numbers have been building up, and if you look at the results of polls day-by-day, you can see a trend.

My colleague has been watching US elections for a lot longer than I have, and with a lot more interest, and is a lot more politically knowledgeable than I’ll ever be. But I suspect that looking at the polls day-by-day is the worst thing you can do, and seeing trends where there are none is one of our greatest weaknesses.

In Fooled By Randomness, Nassim Nicholas Taleb’s book about the human tendency to imagine patterns in noise, the author points out that the very worst thing an investor can do is to get daily updates on the performance of her stock. He imagines a good investor, whose purchases will earn an average of 15 per cent above Treasury bills with 10 per cent volatility: no need to get into the numbers here, but essentially a steady profit made with impressive reliability.

He then works out the probability that the investor will show a profit on her investment over given timescales. Over a year, she’s 93 per cent likely to be in profit. Over a quarter, 77 per cent, and over a month, 67 per cent. But if she checks her investments daily, she’ll only be in profit 54 per cent of the time; hourly, just 51.3 per cent.
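For anyone who does want to get into the numbers, here is a rough sketch of how figures like these can be reproduced. It is my own reconstruction rather than Taleb’s working, and it rests on assumptions that are mine: returns are normally distributed, the 15 per cent and 10 per cent are annual figures, the average return grows in proportion to time while the volatility grows with its square root, and there are 252 trading days in a year with 8 trading hours in each.

```python
import math

# Assumptions for illustration only (not taken from the book):
TRADING_DAYS_PER_YEAR = 252
TRADING_HOURS_PER_DAY = 8


def prob_in_profit(years, excess_return=0.15, volatility=0.10):
    """Chance that the return over `years` is above zero, assuming
    normally distributed returns with the given annual mean and volatility."""
    mean = excess_return * years            # mean grows linearly with time
    sd = volatility * math.sqrt(years)      # volatility grows with sqrt(time)
    z = mean / sd
    # Standard normal cumulative distribution, written with math.erf
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


horizons = {
    "year": 1.0,
    "quarter": 1.0 / 4,
    "month": 1.0 / 12,
    "day": 1.0 / TRADING_DAYS_PER_YEAR,
    "hour": 1.0 / (TRADING_DAYS_PER_YEAR * TRADING_HOURS_PER_DAY),
}

for name, years in horizons.items():
    print(f"{name:>8}: {prob_in_profit(years):.1%}")
```

Under those assumptions the output should land close to the percentages quoted above. The point to notice is that the signal (the average return) shrinks faster than the noise (the volatility) as the window gets shorter, so the shorter the interval between looks, the closer each look gets to a coin toss.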

So almost half the time she looks at her stock, it’ll look like she’s doing badly. We are more loss-averse than gain-seeking: as the psychologists Amos Tversky and Daniel Kahneman showed in the 1970s, most of us who are offered a gamble on the toss of a coin are unwilling to take it unless we are offered a pay-off at least double what we might lose. Irrationally, we fear losses twice as much as we crave gains. So that daily check on her stock will feel agonising: dimly felt gains almost perfectly matched by gut-wrenching losses. She’ll be driven to sell stocks on the basis of a few random dips: an investor who only checked once a month, or better yet once a year, would be far less likely to suffer the wounds of randomness, and therefore more likely to make rational decisions, and thus make money on her investments. (This has been shown experimentally, by the way.)

I think there is probably a parallel, an imperfect parallel, with electoral polls. The result of the US election is of real importance to billions of people, especially commentators who have pegged their credibility to one or other candidate (see for example Andrew Sullivan, who goes into fits of uncontrollable weeping at every positive Romney poll). The New York Times’s wonderfully wonkish and stats-nerdy FiveThirtyEight blog has been saying for some time that the recent swings back and forth seem to be statistical anomalies: Romney was leading in national polls but losing in state ones, and so on. Conveniently, yesterday, they published a post entitled “Distracted by Polling Noise”, in which they talk about the problems of the hundreds of state and national polls that have been going on, and how easy it is to see what you want to see – or, of course, as in the case of Mr Sullivan, what you don’t want to see. The problem is compounded when lots of the polls have small samples with five- or six-point margins of error, essentially useless in a race which will be decided by one or two per cent of the vote, and by subtle but profound ways of unconsciously biasing your data, such as reading too much into subsets (“Working-class white male voters in Ohio”, etc), which themselves will be even smaller samples. (“Swing states”, an imperfectly defined subset, are especially tricky, apparently.)
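To see why small samples and subsets are so treacherous, here is a minimal sketch of the textbook margin-of-error calculation. It assumes a simple random sample, a 95 per cent confidence level and a race near 50-50; real pollsters weight their samples, which generally makes matters somewhat worse. The sample sizes are hypothetical, chosen only for illustration.

```python
import math


def margin_of_error(sample_size, share=0.5, z=1.96):
    """Approximate 95% margin of error for one candidate's share,
    assuming a simple random sample (no weighting or design effect)."""
    return z * math.sqrt(share * (1.0 - share) / sample_size)


# Hypothetical sample sizes: a large poll, a small poll,
# and a demographic subset pulled out of the small poll.
for label, n in [("large poll", 1000), ("small poll", 300), ("subset", 60)]:
    print(f"{label:>10} (n={n}): +/- {margin_of_error(n) * 100:.1f} points")
```

On these assumptions a poll of 300 people carries a margin of error of between five and six points, and a subset of 60 respondents one of more than twelve: far too wide to say anything about a race separated by a point or two.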

These daily polls, and especially the over-examining of subsets, are the equivalent of the hourly stock updates that played merry hell with our imaginary investor. There’s simply too much randomness, and if you’re Andrew Sullivan, you’ll be seeing your Obama-stock dropping on far too many occasions, and it’ll hurt, and you’ll panic. (Equally, of course, people are prone to “confirmation bias”: hence all those pundits who are smugly certain that one or other candidate will win, because they’ve seen a poll which puts their guy in the lead.)

There are other issues with the data: an earlier FiveThirtyEight blog post talks about the problem of “herding”, in which less rigorous, less reputable polling firms seem to cheat off the main ones, making sure that their own improperly gathered data doesn’t stand out from the crowd. So if one major poll finds one or other man has moved ahead, some of the lesser pollsters might follow it, giving the impression of a more statistically reliable change than there really is.

Does this mean that we can’t trust anything that the pollsters say and we should ignore it all? No. But we should a) only really pay attention to the big aggregating ones (I recommend FiveThirtyEight because its author Nate Silver admits to being as prone to human biases as anyone, so he uses an algorithm to put the data together with no input from him) and b) be aware that there’s an awful lot of randomness and uncertainty even then.

So who’ll win? I don’t know. Neither does anyone else (although a lot of people have made very confident predictions one way or the other; some of them will be right, and some of them will be wrong, but all of them will have been guessing beforehand). The best we can do is look at the data we have, and the margins of error from them. FiveThirtyEight’s tracking poll has it at Obama 64 per cent, Romney 36 per cent; Betfair has about the same, Obama at one to two, Romney at two to one against. agrees. Essentially, from the available numbers right now, Obama is about twice as likely to win as Romney. It might change in the future, but anyone who gets much more confident than that has gone further than the data allow; and checking the polls every three hours will not tell you anything new.
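For the record, here is the arithmetic behind turning those betting odds into probabilities, as a minimal sketch. It treats the quoted odds as fair, ignoring any bookmaker’s margin or exchange commission, which is a simplifying assumption on my part.

```python
from fractions import Fraction


def implied_probability(odds_against):
    """Convert fractional odds against (e.g. 2/1 -> Fraction(2, 1))
    into the implied probability of winning, assuming fair odds."""
    return 1 / (1 + odds_against)


obama = implied_probability(Fraction(1, 2))    # "one to two", i.e. odds-on
romney = implied_probability(Fraction(2, 1))   # "two to one against"

print(f"Obama:  {float(obama):.0%}")
print(f"Romney: {float(romney):.0%}")
print(f"Ratio:  {obama / romney}")
```

Odds of one to two work out at a two-in-three chance and two to one against at one in three, which is where “about twice as likely” comes from, and which sits close to the 64-36 split quoted above.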


