The Art of Statistics: Learning From Data (Pelican Books) by David Spiegelhalter



He was right – it really is an extraordinary law of nature.

How Does This Theory Help Us Work Out the Accuracy of Our Estimates?

All this theory is fine for proving things about distributions of statistics based on data drawn from known populations, but that is not what we are mostly interested in. We have to find a way of reversing the process: instead of going from known populations to saying something about possible samples, we need to go from a single sample back to saying something about a possible population. This is the process of inductive inference outlined in Chapter 3.

Suppose I have a coin, and I ask you for your probability that it will come up heads. You happily answer ‘50:50’, or similar. Then I flip it, cover up the result before either of us sees it, and again ask for your probability that it is heads. If you are typical of my experience, you may, after a pause, rather grudgingly say ‘50:50’. Then I take a quick look at the coin, without showing you, and repeat the question. Again, if you are like most people, you eventually mumble ‘50:50’.

This simple exercise reveals a major distinction between two types of uncertainty: what is known as aleatory uncertainty before I flip the coin – the ‘chance’ of an unpredictable event – and epistemic uncertainty after I flip the coin – an expression of our personal ignorance about an event that is fixed but unknown. The same difference exists between a lottery ticket (where the outcome depends on chance) and a scratch card (where the outcome is already decided, but you don’t know what it is).

Statistics are used when we have epistemic uncertainty about some quantity in the world. For example, we conduct a survey when we don’t know the true proportion of a population who consider themselves religious, or we run a pharmaceutical trial when we don’t know the true average effect of a drug. As we have seen, these fixed but unknown quantities are called parameters and are often given a Greek letter.fn4

Just like my coin-flipping example, before we do these experiments we have aleatory uncertainty about what the outcomes may be, because of the random sampling of individuals or the random allocation of patients to the drug or a dummy tablet. Then after we have done the study and got the data, we use this probability model to get a handle on our current epistemic uncertainty, just as you were eventually prepared to say ‘50:50’ about the covered-up coin. So probability theory, which tells us what to expect in the future, is used to tell us what we can learn from what we have observed in the past. This is the (rather remarkable) basis for statistical inference.
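A minimal simulation may make this concrete (my sketch, not the book’s; the ‘true’ proportion theta and sample size n are invented for illustration). It shows the forward, aleatory step: if we knew the population parameter, probability theory would tell us where the survey statistic is likely to land.

```python
import numpy as np

rng = np.random.default_rng(42)

theta = 0.5        # hypothetical true population proportion (unknown in a real survey)
n = 1000           # survey sample size
num_surveys = 100_000

# Simulate the sample proportion from many hypothetical surveys of size n.
sample_props = rng.binomial(n, theta, size=num_surveys) / n

# Central 95% range of the statistic: our aleatory uncertainty about the
# survey outcome before any data are collected.
low, high = np.percentile(sample_props, [2.5, 97.5])
print(f"If theta = {theta}, a survey of n = {n} gives a sample proportion "
      f"in [{low:.3f}, {high:.3f}] about 95% of the time.")
# Close to the theoretical interval theta +/- 1.96 * sqrt(theta*(1-theta)/n),
# roughly 0.5 +/- 0.031 here.
```

The simulated interval comes out near 0.47 to 0.53. The task of inference is to run this logic in reverse once a single survey has actually been observed.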

The procedure for deriving an uncertainty interval around our estimate, or equivalently a margin of error, is based on this fundamental idea. There are three stages:

1. We use probability theory to tell us, for any particular population parameter, an interval in which we expect the observed statistic to lie with 95% probability.

2. We then observe a particular statistic.

3. Finally, we work out the range of possible population parameters for which our statistic lies within their 95% intervals. This range is known as a ‘95% confidence interval’, and is sketched in code below.
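Here is a sketch of the three stages (my illustration, not the book’s, with an invented survey result of 47% out of n = 1,000). The confidence interval is found by ‘inverting’ the 95% intervals: scan candidate values of the parameter and keep those for which the observed statistic would not be surprising.

```python
import numpy as np
from scipy.stats import binom

n = 1000           # survey sample size (invented for illustration)
observed = 0.47    # Stage 2: the hypothetical sample proportion we observed
x = int(observed * n)

# Stages 1 and 3: for each candidate parameter theta, find the central 95%
# interval of the statistic, and keep theta if our observation lies inside it.
candidates = np.linspace(0.0, 1.0, 10_001)
kept = [theta for theta in candidates
        if binom.ppf(0.025, n, theta) <= x <= binom.ppf(0.975, n, theta)]
print(f"95% confidence interval by inversion: [{min(kept):.3f}, {max(kept):.3f}]")

# The familiar shortcut gives almost the same answer: the estimate plus or
# minus a margin of error of 1.96 standard errors.
se = np.sqrt(observed * (1 - observed) / n)
print(f"Estimate +/- margin of error: [{observed - 1.96 * se:.3f}, "
      f"{observed + 1.96 * se:.3f}]")
```

Both methods land on roughly 44% to 50%: the range of populations from which a survey result of 47% would be unsurprising.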


