Understanding Regression Analysis by Arias Andrea L.; Westfall Peter H.;

Author: Arias, Andrea L.; Westfall, Peter H.
Language: eng
Format: epub
Publisher: CRC Press LLC
Published: 2020-08-15T00:00:00+00:00


These tests tell you whether results are explainable by chance alone, or are not easily explained by chance alone, and that is all. Tests do not address the question, “Should the variable be kept in the model?”

You might think that these F tests are redundant with the information already contained in the ordinary regression output. Look at the coefficients for the interaction terms in the full model, for example: they are all “insignificant,” based on their p-values, corroborating the small F-statistic 0.7548. And look at the common slope model: all four of the location difference estimates are “significant,” based on their p-values, corroborating the large F-statistic 48.4442 shown just above. Isn’t it sufficient to just look at the parameter estimates and their p-values, rather than bother with the full model/restricted model F tests?

The answer is, “No.” For one thing, what if some of the tests are “significant” and some are not? How many have to be “significant” in order to call the global test “significant”? In particular, p-values are uniformly distributed when the data are produced by the restricted model, so with 20 tests from the restricted model, you expect about one of the p-values (20 × 0.05 = 1) to be less than 0.05 purely by chance.
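The arithmetic behind that expectation can be checked directly. The following is a minimal sketch in Python (not the R code the book works with). It assumes the 20 tests are independent, which the text does not state, and the seed and number of replications are arbitrary choices.

```python
import random

N_TESTS = 20   # number of tests, as in the text
ALPHA = 0.05   # significance threshold

# Under the restricted model each p-value is Uniform(0, 1), so
# P(p < ALPHA) = ALPHA and the expected count of "significant" results is:
expected_count = N_TESTS * ALPHA

# Assumption (not from the text): the tests are independent.
prob_at_least_one = 1 - (1 - ALPHA) ** N_TESTS


def simulate_mean_count(n_reps=10_000, seed=1):
    """Average number of p-values below ALPHA over simulated sets of tests."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_reps):
        # Draw N_TESTS uniform p-values and count those below ALPHA.
        total += sum(rng.random() < ALPHA for _ in range(N_TESTS))
    return total / n_reps


simulated_mean = simulate_mean_count()
print(expected_count, round(prob_at_least_one, 3), simulated_mean)
```

Under the independence assumption, the chance of seeing at least one p-value below 0.05 among the 20 is about 0.64, so an isolated “significant” coefficient is unsurprising even when the restricted model is true.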

Further, it can easily happen that all of the coefficients of the indicator variables are “insignificant,” yet the global test is “highly significant.” This can happen, for example, when the reference category has a small sample size. In that case, the comparisons against the reference category might all have large p-values. But there may be big differences among the remaining categories, which will be correctly detected by the full model/restricted model F statistic.
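A small numerical illustration of this situation, sketched in Python with made-up data: the model has a single categorical predictor with three levels, the reference category “A” has only two observations, and “B” and “C” have twenty each. The group names, values, and sample sizes are all hypothetical. With indicator variables only, each coefficient is the difference between a group mean and the reference mean, which is what the code uses.

```python
from math import sqrt

# Made-up data: reference group "A" is tiny; "B" and "C" are large and differ.
groups = {
    "A": [9.0, 11.0],          # reference category, n = 2, mean 10
    "B": [8.0, 10.0] * 10,     # n = 20, mean 9
    "C": [10.0, 12.0] * 10,    # n = 20, mean 11
}


def mean(values):
    return sum(values) / len(values)


n_total = sum(len(v) for v in groups.values())
n_groups = len(groups)
grand_mean = mean([y for v in groups.values() for y in v])

# Full model: a separate mean for each group.
sse_full = sum((y - mean(v)) ** 2 for v in groups.values() for y in v)
df_full = n_total - n_groups

# Restricted model: one common mean (all indicator coefficients zero).
sse_restricted = sum((y - grand_mean) ** 2 for v in groups.values() for y in v)
df_restricted = n_total - 1

# Full model/restricted model F statistic.
f_stat = ((sse_restricted - sse_full) / (df_restricted - df_full)) / (sse_full / df_full)

# T statistics for the indicator coefficients (each group versus reference "A").
s = sqrt(sse_full / df_full)
ref = groups["A"]
t_stats = {
    name: (mean(v) - mean(ref)) / (s * sqrt(1 / len(v) + 1 / len(ref)))
    for name, v in groups.items()
    if name != "A"
}
print(t_stats, f_stat)
```

With these numbers both |T| statistics are about 1.30, below the conventional 5% cutoff of roughly 2.0, while the global F statistic is about 18.57 on 2 and 39 degrees of freedom, far above the cutoff of roughly 3.2. The large difference between “B” and “C” is invisible in the comparisons against the small reference group.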

It can also happen that the individual |T| statistics are small, yet the global F statistic is large, when the X variables are multicollinear (this case is called predictive multicollinearity).

Finally, note that the “ordinary” F test printed at the bottom of the lm output, which was discussed in Chapter 8, is also an example of the full model/restricted model test. The full model is Y = β0 + β1X1 + … + βkXk + ε; the restricted model is Y = β0 + ε, and the restriction is that β1 = … = βk = 0. The least-squares estimate of β0 in the restricted model is just the average of the Y data, so the “SST” term used in the F statistic that was discussed in Chapter 8 is just the error sum of squares for this particular restricted model, which has only the intercept term, β0.
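To make the connection concrete, here is a minimal Python sketch (again, not the book's R code) that computes the overall F statistic as a full model/restricted model comparison for a single-predictor regression. The data are made up, and the variable names are illustrative only. As a consistency check, the same statistic is recomputed from R-squared using the standard identity.

```python
# Made-up data for a single-predictor regression.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]
n, k = len(y), 1   # k = number of slope parameters in the full model

x_bar = sum(x) / n
y_bar = sum(y) / n

# Full model: least-squares fit of Y = b0 + b1*X.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
sse_full = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
df_full = n - k - 1

# Restricted model: intercept only. Its least-squares estimate is y_bar,
# so its error sum of squares is exactly SST.
sst = sum((yi - y_bar) ** 2 for yi in y)
df_restricted = n - 1

# Full model/restricted model F statistic.
f_stat = ((sst - sse_full) / (df_restricted - df_full)) / (sse_full / df_full)

# The same statistic written in terms of R-squared.
r2 = 1 - sse_full / sst
f_from_r2 = (r2 / k) / ((1 - r2) / df_full)
print(f_stat, f_from_r2)
```

The two printed values agree up to rounding error, which reflects that the overall F test is nothing more than the comparison of the fitted model against the intercept-only model.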


