# Bayes law, Nate Silver and voodoo economics

Written by Michael Roberts

Thursday, 22 November 2012 02:54

Nate Silver is the new hero of the liberal left in the US. This mathematician and statistician correctly forecast Obama’s victory in the presidential election and in the Senate and the result for the electoral college in all 50 states. On the morning of 6 November 2012, the final update of Silver’s model gave President Barack Obama a 90.9% chance of winning a majority of the 538 electoral votes. Both in summary tables and in an electoral map, Silver forecast the winner of each state, and his model got every one of the 50 states right. In contrast, individual pollsters were less successful. For example, Rasmussen Reports, widely quoted by the right wing, “missed on six of its nine swing-state polls”.

Silver has now published a new book that is already a best seller and he now regularly appears on TV talk shows. Silver brilliantly exposed the biased commentaries of the right-wing TV channels and papers whose pundits regularly appeared on screen or in print to say that they ‘had a hunch’ that Romney would win or that the polls were ‘biased’ against the Republican candidates. Silver, in the meantime, quietly presented a statistical analysis of the polls and concluded the probability of Obama winning was over 80% and rising. His forecast was dead right. On November 12th, his new book, The Signal and the Noise (print edition) was named Amazon’s Best Book of the Year for 2012.

The evidence is that statistical analysis is way better at forecasting things than ‘hunches’ or human intuition. Indeed, out of the one hundred studies comparing the accuracy of actuarial statistics (probability analysis) and intuition, there has not been one case of humans doing better (Stuart Sutherland, Irrationality, p200). In most studies, actuarial analysis was way better. Take bank loans: nowadays 90% of loan applications are reviewed by computers, which weigh client details against aggregate evidence on bank accounts, jobs etc to gauge risk. Loans granted by computer using statistical probabilities turn out to have far fewer defaults than loans to borrowers chosen by bankers on their own judgement. Insurance companies have applied the same methods to risk in life expectancy and accidents for many years. So when somebody tells you that their intuition delivers better results, they are talking out of their hats. Why would you not choose statistical methods to raise your chances of getting things right even if nothing is 100% certain?

Take the stock market. We are continually told in investment adverts by expensive investment advisers that they can make your money work for you more than just tracking a stock index, like the S&P-500. In other words, they can beat the market. But a host of statistical studies prove the opposite. Sure, some advisers can do better than the index for a few years, but eventually, they all come a cropper. It’s just so much snake oil voodoo investing.

But not everything is entirely random. If you were to read Nassim Nicholas Taleb’s book, The Black Swan (see my book, The Great Recession, chapter 31), you would think that it was. Or to be more exact, even the most unlikely can happen under the law of chance. It was assumed that there were only white swans until Europeans got to Australia and found black ones. It was the ‘unknown unknowns’, to quote Bush’s neo-con Secretary of Defense, Donald Rumsfeld. The most unlikely can happen but you cannot know everything. For Taleb, the Great Recession was one such event that could not have been predicted and therefore bankers, politicians and above all, economists are not at fault. This was the excuse used by bankers when giving evidence to the US Congress and to the UK parliament.

But modern statistical methods do have predictive power – all is not random. In his book, Silver offers detailed case studies from baseball, elections, climate change, the financial crash, poker and weather forecasting. Using as much data as possible, statistical techniques can provide degrees of probability, like “the probability of Obama winning the electoral college is 83% and the probability of him winning the popular vote is 50.1%”. This is different from much of the statistical method taught in colleges and universities today, which relies on idealized modelling assumptions that rarely hold true. Often such models reduce complex questions to overly simple “hypothesis tests” using arbitrary “significance levels” to “accept or reject” a single parameter value. In contrast, the practical statistician needs a sound understanding of how baseball, poker, elections or other uncertain processes work, which measures are reliable and which are not, and what scales of aggregation are useful, and must then use the statistical tool kit as well as possible. You need extensive data sets, preferably collected over long periods of time, so that statistical techniques can be used to move probabilities incrementally up or down relative to prior data.

This is the modern form of what is called the Bayesian approach, named after the 18th-century minister Thomas Bayes, who discovered a simple formula for updating probabilities using new data. The essence of the Bayesian approach is to provide a mathematical rule explaining how you should change your existing beliefs in the light of new evidence. In other words, it allows scientists to combine new data with their existing knowledge or expertise.

What is it about the Bayesian approach that led to Nate Silver’s accurate forecasts? Let me try to explain as best I can, with the help of examples provided by Eliezer Yudkowsky in his excellent blog (http://yudkowsky.net/).

Suppose it is an established fact through other studies that 1% of women at age forty who participate in routine screening have breast cancer. Second, 80% of women with breast cancer will get positive mammographies. But 9.6% of women without breast cancer will also get positive mammographies. A woman in this age group has a positive mammography in a routine screening. What is the probability that she actually has breast cancer? The correct answer is 7.8%, obtained as follows: out of 10,000 women, 100 have breast cancer; 80 of those 100 have positive mammographies. From the same 10,000 women, 9,900 will not have breast cancer and of those 9,900 women, about 950 (9.6%) will also get positive mammographies. This makes the total number of women with positive mammographies 950+80 or 1,030. Of those 1,030 women with positive mammographies, 80 will have cancer. Expressed as a proportion, this is 80/1,030 or 0.07767 or 7.8%. So the answer is neither the 1% of women who have cancer nor the 80% of women with cancer who test positive.
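The arithmetic above can be checked in a few lines of Python. This is only a minimal sketch: the function name `posterior` and its parameter names are my own illustrative choices, not anything taken from Yudkowsky or Silver. It works with the exact rates rather than rounded head counts, so it gives roughly 0.0776 rather than the 0.07767 obtained from 80/1,030; both round to 7.8%.

```python
def posterior(prior, p_pos_if_cancer, p_pos_if_healthy):
    """Probability of cancer given a positive test (hypothetical helper)."""
    # Share of all women who have cancer AND test positive
    true_positives = prior * p_pos_if_cancer
    # Share of all women who are healthy AND still test positive
    false_positives = (1 - prior) * p_pos_if_healthy
    # Of everyone who tests positive, the fraction who actually have cancer
    return true_positives / (true_positives + false_positives)


# Figures from the mammography example in the text
p = posterior(prior=0.01, p_pos_if_cancer=0.80, p_pos_if_healthy=0.096)
print(f"{p:.1%}")
```

Changing the first argument shows how strongly the answer depends on the starting prevalence: the same test applied to a group where cancer is more common yields a much higher revised probability.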

The original proportion of patients with breast cancer is known as the prior probability. The chance that a patient with breast cancer gets a positive mammography and the chance that a patient without breast cancer gets a positive mammography are known as the two conditional probabilities. Collectively, this initial information is known as the priors. The final answer – the estimated probability that a patient has breast cancer, given that we know she has a positive result on her mammography – is known as the revised probability or the posterior probability. The mammography doesn’t increase the probability that a positive-testing woman has breast cancer by increasing the number of women with breast cancer – of course not; if mammography increased the number of women with breast cancer, no one would ever take the test! However, requiring a positive mammography is a membership test that eliminates many more women without breast cancer than women with cancer. The number of women without breast cancer diminishes by a factor of more than ten, from 9,900 to 950, while the number of women with breast cancer is diminished only from 100 to 80. Thus, the proportion of 80 within 1,030 is much larger than the proportion of 100 within 10,000. The evidence of the positive mammography slides the prior probability of 1% to the posterior probability of 7.8%.
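For readers who prefer symbols, the same calculation can be written as Bayes’ theorem. The notation is my own shorthand, not the source’s: $C$ stands for “has breast cancer” and $+$ for “positive mammography”. The prior is $P(C)$, the two conditional probabilities are $P(+ \mid C)$ and $P(+ \mid \neg C)$, and the posterior is $P(C \mid +)$.

```latex
P(C \mid +)
  = \frac{P(+ \mid C)\,P(C)}{P(+ \mid C)\,P(C) + P(+ \mid \neg C)\,P(\neg C)}
  = \frac{0.80 \times 0.01}{0.80 \times 0.01 + 0.096 \times 0.99}
  \approx 0.078
```

The numerator is the 80 women in 10,000 who have cancer and test positive; the denominator is everyone who tests positive, roughly 1,030 in 10,000.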

Actually, priors are true or false just like the final answer – they reflect reality and can be judged by comparing them against reality. For example, if you think that 920 out of 10,000 women in a sample have breast cancer and the actual number is 100 out of 10,000, then your priors are wrong. In this case, the priors might have been established by three studies – a study on the case histories of women with breast cancer to see how many of them tested positive on a mammography, a study on women without breast cancer to see how many of them test positive on a mammography, and an epidemiological study on the prevalence of breast cancer in some specific demographic.

Let’s say you’re a woman who’s just undergone a mammography. Previously, you figured that you had a very small chance of having breast cancer; we’ll suppose that you read the statistics somewhere and so you know the chance is 1%. When the positive mammography comes in, your estimated chance should now shift to 7.8%. There is no room to say something like, “Oh, well, a positive mammography isn’t definite evidence, some healthy women get positive mammographies too. I don’t want to despair too early, and I’m not going to revise my probability until more evidence comes in. Why? Because I’m an optimist.” And there is similarly no room for saying, “Well, a positive mammography may not be definite evidence, but I’m going to assume the worst until I find otherwise. Why? Because I’m a pessimist.” Your revised probability should go to 7.8%, no more, no less.

What’s so great about Bayes’ theorem is that it can be used for reasoning about the physical universe. But I think Bayes law also shows two other things that are useful to remember in economic analysis.

The first is the power of data or facts over theory and models. Neoclassical mainstream economics is not just voodoo economics because it is ideologically biased, an apology for the capitalist mode of production. But in making assumptions about individual consumer behaviour, about the inherent equilibrium of capitalist production etc, it is also based on theoretical models that bear no relation to reality: the known facts or priors. In contrast, a scientific approach would aim to test theory against the evidence on a continual basis, not just to falsify it (as Karl Popper would have it) but also to strengthen its explanatory power – unless a better explanation of the facts comes along. Newton’s theory of gravity explained very much about the universe and was tested by the evidence, but then Einstein’s theory of relativity came along and better explained the facts (or widened our understanding to things that could not be explained by Newton’s laws). This approach using statistical methods like Bayes law is what mainstream economics does not do.

The second thing we can glean from the use of Bayes law and Nate Silver’s results is the power of the aggregate. The best economic theory and explanation come from looking at the aggregate, the average and its outliers. Data based on a few studies or data points provide no explanatory power. That may sound obvious but it seems that many political pundits were prepared to forecast the result of the US election based on virtually no aggregated evidence. It’s the same with much of economic forecasting. Sure, what happened in the past is no certain guide to what may happen in the future, but aggregated evidence over time is a hell of a sight better than ignoring history.
