Common Errors in Statistics (and How to Avoid Them) by Phillip I. Good & James W. Hardin

Author: Phillip I. Good & James W. Hardin
Language: eng
Publisher: Wiley
Published: 2012-06-03T16:00:00+00:00


RECOGNIZING AND REPORTING BIASES

Very few studies can avoid bias at some point in sample selection, study conduct, and results interpretation. We focus on the wrong end points, participants and co-investigators see through our blinding schemes, or the effects of neglected and unobserved confounding factors overwhelm and outweigh the effects of our variables of interest. With careful and prolonged planning, we may reduce or eliminate many potential sources of bias, but seldom will we be able to eliminate all of them. Accept bias as inevitable and then endeavor to recognize and report all exceptions that do slip through the cracks.

Most biases occur during data collection, often as a result of taking observations from an unrepresentative subset of the population rather than from the population as a whole. The example of the erroneous forecast of Dewey over Truman was cited in Chapter 3. In Chapter 6, we considered a study that was flawed because of a failure to include planes that did not return from combat.
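The effect of drawing from an unrepresentative subset can be illustrated with a small simulation. This is only a sketch under invented assumptions: the population size, the 40% "easy to reach" subgroup, and the support rates of 60% and 40% are hypothetical numbers chosen for illustration, not figures from any actual poll.

```python
import random


def simulate_poll(n_population=100_000, n_sample=2_000, seed=0):
    """Compare a random sample with one drawn only from an easy-to-reach subgroup."""
    rng = random.Random(seed)
    population = []
    for _ in range(n_population):
        reachable = rng.random() < 0.4           # assumed: 40% are easy to reach
        p_support = 0.60 if reachable else 0.40  # assumed support rates per group
        population.append((reachable, rng.random() < p_support))

    def share(people):
        # Fraction of the given people who support the candidate.
        return sum(supports for _, supports in people) / len(people)

    true_share = share(population)
    # Sample drawn from the population as a whole.
    random_estimate = share(rng.sample(population, n_sample))
    # Sample drawn only from the subgroup that is easy to reach.
    reachable_only = [person for person in population if person[0]]
    biased_estimate = share(rng.sample(reachable_only, n_sample))
    return true_share, random_estimate, biased_estimate


if __name__ == "__main__":
    print(simulate_poll())
```

Under these assumptions the random sample lands close to the population share, while the subgroup-only sample overstates it by a wide margin however large the sample is made.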

When analyzing extended time series in seismological and neurological investigations, investigators typically select specific cuts (a set of consecutive observations in time) for detailed analysis, rather than trying to examine all the data (a near impossibility). Not surprisingly, such “cuts” usually possess one or more intriguing features not to be found in run-of-the-mill samples. Too often, theories evolve from these very biased selections. We expand on this point in Chapter 10 in our discussion of the limitations on the range over which a model may be applied.
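A rough sketch of why selected cuts mislead: even in a series of pure noise, the cut chosen because it looks most striking will differ markedly from the series as a whole. The series length, cut length, and the use of the cut mean as the "intriguing feature" are assumptions made for illustration.

```python
import random


def most_striking_cut(n_points=10_000, cut_length=50, seed=1):
    """Return the overall mean of a noise series and the mean of its most extreme cut."""
    rng = random.Random(seed)
    # Pure noise: there is no real structure to discover.
    series = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
    # Split the series into consecutive, non-overlapping cuts.
    cuts = [series[i:i + cut_length] for i in range(0, n_points, cut_length)]
    cut_means = [sum(cut) / len(cut) for cut in cuts]
    overall_mean = sum(series) / len(series)
    # Select the cut whose mean is farthest from zero, as an eye would.
    selected_mean = max(cut_means, key=abs)
    return overall_mean, selected_mean


if __name__ == "__main__":
    print(most_striking_cut())
```

The selected cut shows an apparent shift in level that is absent from the full series, which is the kind of feature a theory could mistakenly be built on.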

Limitations in the measuring instrument, such as censoring at either end of the scale, can result in biased estimates. Current methods of estimating cloud optical depth from satellite measurements produce biased results that depend strongly on satellite viewing geometry. In this and in similar cases in the physical sciences, absent the appropriate nomograms and conversion tables, interpretation is impossible.
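The bias introduced by censoring can be seen in a minimal simulation. The instrument scale of 0 to 10 and the normal distribution of true values (mean 8, standard deviation 2) are hypothetical choices, made so that a noticeable share of true values exceeds the top of the scale.

```python
import random


def censored_mean_bias(n=50_000, true_mean=8.0, sd=2.0,
                       lower=0.0, upper=10.0, seed=2):
    """Return the mean of the true values and the mean of the censored readings."""
    rng = random.Random(seed)
    true_values = [rng.gauss(true_mean, sd) for _ in range(n)]
    # The instrument cannot report anything outside [lower, upper].
    readings = [min(max(value, lower), upper) for value in true_values]
    return sum(true_values) / n, sum(readings) / n


if __name__ == "__main__":
    print(censored_mean_bias())
```

Because far more true values fall above the upper limit than below the lower one in this setup, the mean of the readings understates the mean of the true values, and collecting more observations does not remove the discrepancy.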

Over- and underreporting plague meta-analysis (discussed in Chapter 7). Positive results are reported for publication; negative findings are suppressed or ignored. Medical records are known to underemphasize conditions such as arthritis, for which there is no immediately available treatment, while overemphasizing the disease of the day. (See, for example, Callaham et al., 1998.)
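The consequence of selective reporting for a pooled estimate can be sketched as follows. All of the settings are assumptions for illustration: a small true effect of 0.1, studies of 25 observations with known unit variance, and a rule that only studies significant at the one-sided 5% level are published.

```python
import math
import random


def simulate_publication_bias(n_studies=2_000, n_per_study=25,
                              true_effect=0.1, seed=3):
    """Return the mean estimate over all studies, over published ones, and their count."""
    rng = random.Random(seed)
    critical_z = 1.645  # one-sided 5% threshold for a "positive" result
    all_estimates = []
    published = []
    for _ in range(n_studies):
        data = [rng.gauss(true_effect, 1.0) for _ in range(n_per_study)]
        estimate = sum(data) / n_per_study
        z = estimate * math.sqrt(n_per_study)  # unit variance assumed known
        all_estimates.append(estimate)
        if z > critical_z:
            # Only "positive" findings reach the literature.
            published.append(estimate)
    mean_all = sum(all_estimates) / len(all_estimates)
    mean_published = sum(published) / len(published)
    return mean_all, mean_published, len(published)


if __name__ == "__main__":
    print(simulate_publication_bias())
```

Averaging every study recovers the true effect closely, whereas averaging only the published studies yields an estimate several times larger, since each published estimate had to exceed the significance threshold to appear at all.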

Collaboration between the statistician and the domain expert is essential if all sources of bias are to be detected and corrected for, as many biases are specific to a given application area. In the measurement of price indices, for example, the three principal sources are substitution bias, quality change bias, and new product bias.

Two distinct kinds of statistical bias effects arise with astronomical distance indicators (DIs), depending on the method used. These next paragraphs are taken with minor changes from Willick [1999, Section 9].

In one approach, the redshifts of objects whose DI-inferred distances are within a narrow range of some value d are averaged. Subtracting d from the resulting mean redshift yields a peculiar velocity estimate; dividing the mean redshift by d gives an estimate of the parameter of interest. These estimates will be biased because the distance estimate d itself is biased and is not the mean true distance of the objects in question.
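The mechanism described above can be sketched in a simulation. Everything numerical here is an assumption rather than part of Willick's treatment: objects are spread uniformly in volume out to a hypothetical limit, DI errors are lognormal with a 20% scatter, distances are expressed in velocity units so that the true ratio of redshift to distance is 1, and true peculiar velocities average zero.

```python
import math
import random


def simulate_di_bias(n_objects=200_000, r_max=100.0, d=50.0, half_width=2.0,
                     log_error_sd=0.2, velocity_sd=5.0, seed=4):
    """Average redshifts of objects whose inferred distance lies near d."""
    rng = random.Random(seed)
    selected_true = []
    selected_redshift = []
    for _ in range(n_objects):
        # True distance, uniform in volume (density proportional to r squared).
        r = r_max * rng.random() ** (1.0 / 3.0)
        # DI-inferred distance with multiplicative (lognormal) error.
        inferred = r * math.exp(rng.gauss(0.0, log_error_sd))
        if abs(inferred - d) <= half_width:
            # Redshift = true distance (velocity units) + zero-mean peculiar velocity.
            selected_true.append(r)
            selected_redshift.append(r + rng.gauss(0.0, velocity_sd))
    mean_true = sum(selected_true) / len(selected_true)
    mean_redshift = sum(selected_redshift) / len(selected_redshift)
    velocity_estimate = mean_redshift - d  # unbiased value would be near 0
    ratio_estimate = mean_redshift / d     # unbiased value would be near 1
    return mean_true, velocity_estimate, ratio_estimate


if __name__ == "__main__":
    print(simulate_di_bias())
```

Because there are more objects at large true distances available to scatter into the selected range than at small ones, the mean true distance of the selected objects exceeds d in this setup. Both derived estimates inherit that excess even though the simulated peculiar velocities average zero.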


