Educational Measurement for Applied Researchers by Margaret Wu, Hak Ping Tam & Tsung-Hau Jen

Author: Margaret Wu, Hak Ping Tam & Tsung-Hau Jen
Language: eng
Format: epub
Publisher: Springer Singapore, Singapore


Summary

To use fit statistics in item analysis, we need to understand the properties of these statistics. In particular, the impact of the sample size on the fit statistics needs to be taken into account. If we use a fixed range of fit mean-square values as the criterion for accepting or rejecting items, we are likely to declare that all items fit well when the sample size is large enough. On the other hand, if we set limits on fit t values as the criterion for detecting misfit, we are likely to reject most items when the sample size is large enough.
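The sample-size effect described above can be illustrated with a small simulation. This is a minimal sketch, not the procedure used by any particular software: it assumes the generating abilities and item difficulty are known (real analyses use estimates), and the function names, the sample sizes of 100 and 2000, and the `true_slope=1.3` misfit condition are all hypothetical choices made for illustration.

```python
import math
import random
import statistics


def rasch_prob(theta, delta, slope=1.0):
    """Probability of a correct response; slope=1.0 gives the Rasch model."""
    return 1.0 / (1.0 + math.exp(-slope * (theta - delta)))


def outfit_mnsq(responses, thetas, delta):
    """Outfit mean-square: the average squared standardized residual,
    with residuals taken against the Rasch expectation."""
    total = 0.0
    for x, theta in zip(responses, thetas):
        p = rasch_prob(theta, delta)
        total += (x - p) ** 2 / (p * (1.0 - p))
    return total / len(responses)


def simulate_outfit(n_persons, n_reps, true_slope=1.0, delta=0.0, seed=0):
    """Repeat the experiment n_reps times and collect the outfit values.
    true_slope != 1.0 generates data that do not follow the Rasch model
    (an assumption used here only to induce misfit)."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_reps):
        thetas = [rng.gauss(0.0, 1.0) for _ in range(n_persons)]
        responses = [
            1 if rng.random() < rasch_prob(t, delta, true_slope) else 0
            for t in thetas
        ]
        values.append(outfit_mnsq(responses, thetas, delta))
    return values


small = simulate_outfit(100, 200)                    # fitting item, N = 100
large = simulate_outfit(2000, 200)                   # fitting item, N = 2000
steep = simulate_outfit(2000, 200, true_slope=1.3)   # misfitting item, N = 2000

for label, vals in [("fit, N=100", small), ("fit, N=2000", large),
                    ("steeper item, N=2000", steep)]:
    print(f"{label}: mean={statistics.mean(vals):.3f}, "
          f"sd={statistics.pstdev(vals):.3f}")
```

For a fitting item, the spread of the mean-square across replications narrows as the sample grows, so a fixed acceptance range is rarely violated in large samples. A t value, by contrast, is scaled by that shrinking spread, so even the modest departure produced by the steeper item becomes many standard deviations away from one when the sample is large.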

Some textbooks or other resources make recommendations on the range of acceptable mean-square values or t values for residual-based fit statistics. There are probably no right or wrong answers. You will need to understand the issues with these fit statistics when you apply rules of thumb.

More importantly, fit statistics should serve as an indication for detecting problematic items rather than for setting concrete rules for accepting or rejecting items. Based on the fit statistics, one should examine the items and look for sources of misfit. Improve or reject items if sources of misfit can be identified. The fit statistics should not be used blindly to reject items, particularly those that “over-fit”, as you may remove the best items in your test because the rest of the items are not as “good” as these items.

Furthermore, when residual-based fit statistics show that items fit the Rasch model, this is not sufficient to conclude that you have the best test. The reliability of the test and item discrimination indices should also be considered in making an overall assessment.

Additional Notes

Figure 8.14 shows the theoretical, or expected, item characteristic curve (ICC) for an item, with four points, A, B, C, and D, denoting four regions where the observed ICC may fall. Point A denotes the region above the theoretical ICC and to the right of the vertical line where θ = δ, the ability at which there is a 50% chance of obtaining the correct answer. Point B denotes the region below the theoretical ICC and to the right of the vertical line θ = δ. Point C denotes the region above the theoretical ICC but to the left of the θ = δ line. Point D denotes the region below the theoretical ICC and to the left of the θ = δ line. It can be shown mathematically that the contribution of observed points in the A and D regions to the outfit mean-square has an expectation less than one, while the expectation of the contribution for points in the C and B regions is greater than one. It is clear, then, that the fit mean-square value provides a test of whether the “slope” of the observed ICC is the same as the theoretical one. Given that the theoretical one can be regarded as an “average” of all items, the fit mean-square value tests whether the slope of the observed ICC for this item is the same as the slopes of the other items.
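The claim about the four regions can be checked numerically. The sketch below assumes the standard definition of the outfit mean-square as the average squared standardized residual, and computes the expected value of that squared residual when the observed proportion correct sits slightly above or below the Rasch expectation. The function names, the abilities of ±1 logit, and the shift of 0.05 are hypothetical values chosen only for illustration.

```python
import math


def rasch_prob(theta, delta):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))


def expected_z2(theta, delta, observed_p):
    """Expected squared standardized residual at ability theta when
    responses are actually correct with probability observed_p,
    while residuals are computed against the Rasch expectation p."""
    p = rasch_prob(theta, delta)
    # E[(x - p)^2] for a 0/1 response x with success probability observed_p
    numerator = observed_p * (1.0 - p) ** 2 + (1.0 - observed_p) * p ** 2
    return numerator / (p * (1.0 - p))


delta = 0.0   # item difficulty (hypothetical)
shift = 0.05  # distance of the observed ICC from the theoretical ICC

right = rasch_prob(1.0, delta)    # theoretical ICC to the right of theta = delta
left = rasch_prob(-1.0, delta)    # theoretical ICC to the left of theta = delta

regions = {
    "A": expected_z2(1.0, delta, right + shift),   # above, right
    "B": expected_z2(1.0, delta, right - shift),   # below, right
    "C": expected_z2(-1.0, delta, left + shift),   # above, left
    "D": expected_z2(-1.0, delta, left - shift),   # below, left
}

for name, value in regions.items():
    side = "less than one" if value < 1.0 else "greater than one"
    print(f"Region {name}: expected contribution {value:.3f} ({side})")
```

Regions A and D together describe an observed ICC that is steeper than the theoretical one, and regions B and C a flatter one, which is why the mean-square behaves as a test of slope.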
