Statistical Inference as Severe Testing by Deborah G Mayo

Author: Deborah G Mayo
Language: eng
Format: epub
ISBN: 9781107054134
Publisher: Cambridge University Press
Published: 2018-09-30T04:00:00+00:00


J. Berger and Sellke, and Casella and R. Berger.

Berger and Sellke (1987a) make out the conflict between P-values and Bayesian posteriors by considering the two-sided test of the Normal mean, H0: μ = 0 vs. H1: μ ≠ 0. “Suppose that X = (X1, …, Xn), where the Xi are IID N(μ, σ²), σ² known” (p. 112). Then the test statistic is d(X) = √n |X̄ − μ0|/σ (here μ0 = 0), and the P-value will be twice the P-value of the corresponding one-sided test.
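The relation between the two P-values can be checked numerically. The following is a minimal sketch, assuming the statistic d(X) = √n |X̄ − μ0|/σ and a standard Normal reference distribution; the function names and the sample values are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def d_statistic(xbar, n, sigma, mu0=0.0):
    """Standardized distance of the sample mean from mu0 (sigma assumed known)."""
    return sqrt(n) * abs(xbar - mu0) / sigma


def p_values(d):
    """Return (one-sided, two-sided) P-values for an observed d >= 0."""
    one_sided = 1.0 - NormalDist().cdf(d)
    return one_sided, 2.0 * one_sided


# Hypothetical data: n = 50, sigma = 1, sample mean placed 1.96 standard errors from 0.
n, sigma = 50, 1.0
xbar = 1.96 * sigma / sqrt(n)

d = d_statistic(xbar, n, sigma)
one_sided, two_sided = p_values(d)
print(f"d = {d:.3f}, one-sided P = {one_sided:.4f}, two-sided P = {two_sided:.4f}")
```

With these illustrative inputs the one-sided P-value is about 0.025 and the two-sided P-value about 0.05, the case discussed for n = 50.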

Starting with a lump of prior, generally 0.5, on the point hypothesis H0, they find the posterior probability in H0 is larger than the P-value for a variety of different priors on the alternative. However, the result depends entirely on how the remaining 0.5 is allocated or smeared over the alternative (a move dubbed spike and smear). Using what they call a Jeffreys-type prior, the 0.5 is spread out over the alternative parameter values as if the parameter is itself distributed N(μ0, σ). Now Harold Jeffreys recommends the lump prior only to capture cases where a special value of a parameter is deemed plausible, for instance, the GTR deflection effect λ = 1.75″, after about 1960. The rationale is to avoid a 0 prior on H0 and enable it to receive a reasonable posterior probability.

By subtitling their paper “The irreconcilability of P-values and evidence,” Berger and Sellke imply that if P-values disagree with posterior assessments, they can’t be measures of evidence at all. Casella and R. Berger (1987) retort that “reconciling” is at hand, if you move away from the lump prior. So let’s see how this unfolds. I assume throughout, as do the critics, that the P-values are “audited,” so that neither selection effects nor violated model assumptions are in question at this stage. I see no other way to engage their arguments.

Table 4.1 gives the values of Pr(H0 | x). We see that we would declare no evidence against the null, and even evidence for it (to the degree indicated by the posterior) whenever d(x) fails to reach a 2.5 or 3 standard error difference. With n = 50, “one can classically ‘reject H0 at significance level p = 0.05,’ although Pr(H0 | x) = 0.52 (which would actually indicate that the evidence favors H0)” (J. Berger and Sellke 1987, p. 113).

Table 4.1 Pr(H0 | x) for Jeffreys-type prior

P (one-sided)   zα       n = 10   n = 20   n = 50   n = 100   n = 1000
0.05            1.645    0.47     0.56     0.65     0.72      0.89
0.025           1.960    0.37     0.42     0.52     0.60      0.82
0.005           2.576    0.14     0.16     0.22     0.27      0.53
0.0005          3.291    0.024    0.026    0.034    0.045     0.124

(From Table 1, J. Berger and T. Sellke (1987), p. 113, using the one-sided P-value)
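The spike-and-smear computation behind such posteriors can be sketched in a few lines. This is an illustration under stated assumptions, not a reproduction of Berger and Sellke's own calculation: it assumes prior mass 0.5 on H0, reads the Jeffreys-type smear as a Normal prior centered at μ0 with variance σ², and uses the resulting closed-form Bayes factor for a Normal mean with known σ. The function name `posterior_h0` is hypothetical.

```python
from math import exp, sqrt


def posterior_h0(z, n, prior_h0=0.5):
    """Pr(H0 | x) for a point null with a Normal smear on the alternative.

    Assumes the alternative prior is N(mu0, sigma^2), so the sample mean
    has marginal variance sigma^2 * (1 + 1/n) under H1. z is the observed
    standardized distance d(x); n is the sample size.
    """
    # Bayes factor in favor of H0: density of the data under the point null
    # divided by its marginal density under the smeared alternative.
    bf01 = sqrt(1.0 + n) * exp(-0.5 * z * z / (1.0 + 1.0 / n))
    prior_odds = prior_h0 / (1.0 - prior_h0)
    posterior_odds = prior_odds * bf01
    return posterior_odds / (1.0 + posterior_odds)


# Same observed z, growing sample size: the posterior on H0 climbs.
for n in (10, 20, 50, 100, 1000):
    print(n, round(posterior_h0(1.960, n), 2))
```

Under these assumptions the function returns about 0.52 for z = 1.960 and n = 50, the figure quoted from Berger and Sellke. Individual cells may differ slightly from the printed table if the assumptions do not match those of the original. Holding z fixed while n grows raises the posterior on H0, which is the pattern visible across each row of the table.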


