The Decision Maker's Handbook to Data Science by Stylianos Kampakis

Author: Stylianos Kampakis
Language: eng
Format: epub
ISBN: 9781484254943
Publisher: Apress


In Depth: Bayesian vs. Frequentist Statistics

Maybe you have heard of the term “Bayesian.” There are many models that include the word “Bayes”: “Naïve Bayes,” “Bayesian networks,” “Bayesian inference,” and so on.

All these refer to a school of thought within statistics. It all started with the British mathematician Reverend Thomas Bayes, who came up with this simple formula:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

This formula describes the probability of event A taking place, given that event B has taken place. It was later developed further by Pierre-Simon Laplace and Sir Harold Jeffreys into what now forms the foundation of Bayesian statistics.
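To make the formula concrete, here is a minimal sketch in Python. The function name `bayes_posterior` and all the numbers are illustrative assumptions, not taken from the book: think of A as "a patient has a rare condition" and B as "a test comes back positive."

```python
def bayes_posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A | B) = P(B | A) * P(A) / P(B)."""
    # P(B) via the law of total probability: B can happen with or without A.
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1.0 - prior_a)
    return p_b_given_a * prior_a / p_b


# Hypothetical inputs: P(A) = 1%, P(B | A) = 90%, P(B | not A) = 5%.
posterior = bayes_posterior(prior_a=0.01, p_b_given_a=0.9, p_b_given_not_a=0.05)
print(round(posterior, 3))  # 0.154
```

Even with a fairly accurate test, the probability of A given B stays modest under these assumed numbers, because A was unlikely to begin with.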

One of the main ideas behind Bayesian statistics is that it is possible to incorporate prior knowledge about something into our analysis, be it a model or a significance test. Frequentist statistics (the more widely used of the two theories), on the other hand, has no mechanism for incorporating prior knowledge in this way.

So, for example, let’s say that you want to run an experiment to test whether ghosts exist. You have two hypotheses: ghosts exist, or they do not. Based on common sense, a Bayesian statistician can pre-assign a very small probability to the hypothesis that ghosts exist. Hence, the experiment would have to find overwhelming evidence in favor of the ghost hypothesis before the result counts as significant. A frequentist doesn’t have this choice.
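The effect of a skeptical prior can be sketched with Bayes' rule written in terms of odds: posterior odds equal prior odds multiplied by how much more likely the evidence is if ghosts exist than if they do not. The prior of one in a million and both evidence strengths below are made-up numbers chosen purely for illustration, and `posterior_probability` is a hypothetical helper.

```python
def posterior_probability(prior, likelihood_ratio):
    """Update a prior probability given a likelihood ratio.

    likelihood_ratio = P(evidence | hypothesis) / P(evidence | not hypothesis)
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    # Convert odds back into a probability.
    return posterior_odds / (1.0 + posterior_odds)


ghost_prior = 1e-6  # assumed "common sense" prior that ghosts exist

# Evidence that is 20 times more likely if ghosts exist (assumed value).
moderate = posterior_probability(ghost_prior, 20.0)

# Evidence that is 100 million times more likely if ghosts exist (assumed value).
overwhelming = posterior_probability(ghost_prior, 1e8)

print(f"moderate evidence:     {moderate:.6f}")
print(f"overwhelming evidence: {overwhelming:.4f}")
```

Under these assumptions, moderate evidence barely moves the belief, while only overwhelming evidence pushes it close to certainty.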

Let’s see another example. Frequentist statistics is formulated around the idea of repeating an experiment an infinite number of times. For example, if you flip a fair coin a large number of times (thousands or more), the observed proportion of heads will converge to the true probability of 0.5.
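The coin-flip idea is easy to try out in a simulation. The following is a small sketch using Python's standard `random` module; the helper name `proportion_of_heads`, the seed, and the chosen numbers of flips are arbitrary choices for illustration.

```python
import random


def proportion_of_heads(n_flips, rng):
    """Flip a simulated fair coin n_flips times and return the share of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips


rng = random.Random(42)  # fixed seed so the run is repeatable
for n in (10, 100, 10_000, 100_000):
    print(n, proportion_of_heads(n, rng))
```

With only a handful of flips the proportion can be far from 0.5, but it tends to settle closer and closer to 0.5 as the number of flips grows.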

However, how can this analogy work for events such as football games or elections? These events are unique. A game between two teams at a specific point in time is not going to be repeated. Even if it were, the conditions underlying it (e.g., how tired the players are, or who the members of a political party are) might change. We can’t repeat these events an infinite number of times. A frequentist is in trouble in this case, but this is not a problem for a Bayesian. In Bayesian statistics, probabilities express subjective degrees of belief that an event will happen, not frequencies over repeated experiments.

Bayesian statistics has proven to be very useful, and it also provides a theoretical framework for parts of machine learning. There are still many disagreements in the statistics community as to which theory is better. In practice, frequentist statistics is used more often, but there are also many applications based on Bayesian statistics. Once again, it’s about finding the right tool for the job!


