Statistical Models for Test Equating, Scaling, and Linking by Alina A. Davier

Author: Alina A. Davier
Language: eng
Format: epub
Publisher: Springer New York, New York, NY


12.2 Terminology

The equating of scores on two different forms of a test is often accomplished on the basis of data collected when the groups of examinees taking the two forms are not of equal ability but are linked by a common “anchor” test taken by both groups. This data collection plan is often referred to as the nonequivalent groups with anchor test (NEAT) design. In this chapter, X and Y will refer to the scores on the two test forms; A will refer to the score on the anchor test. The examinees taking the two forms will be assumed to be sampled from different populations, referred to as P and Q (corresponding to test forms X and Y, respectively).

Several methods have been proposed for equating test scores on the basis of data from a NEAT design. Some of those methods constrain the equating relationship to be of the form Y = α + βX. Those methods will be referred to as linear equating methods. Other methods do not impose this constraint; instead, they estimate the function that transforms the distribution of X into the distribution of Y in some specified population of examinees. Those methods will be referred to as nonlinear equating methods. The specified population will be referred to as S and is assumed to be a composite of populations P and Q, represented in the ratio w to (1 − w).
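As a rough illustration of the linear form and of the composite population S, the sketch below assumes that the score distributions of X and Y in S are already available as probability lists over integer scores 0, 1, 2, and so on (how to estimate them under a NEAT design is a separate matter). It also assumes the common choice of α and β that matches means and standard deviations; the text above does not prescribe this choice, and all function names are hypothetical.

```python
import math


def mix(dist_p, dist_q, w):
    """Distribution in S, combining P and Q in the ratio w to (1 - w)."""
    if len(dist_p) != len(dist_q):
        raise ValueError("distributions must cover the same score range")
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie between 0 and 1")
    return [w * p + (1.0 - w) * q for p, q in zip(dist_p, dist_q)]


def mean_sd(dist):
    """Mean and SD of integer scores 0..len(dist)-1 with probabilities dist."""
    mean = sum(score * prob for score, prob in enumerate(dist))
    var = sum((score - mean) ** 2 * prob for score, prob in enumerate(dist))
    return mean, math.sqrt(var)


def linear_equating(dist_x_s, dist_y_s):
    """Return (alpha, beta) for Y = alpha + beta * X.

    Assumption: alpha and beta are chosen so that the transformed X has
    the same mean and SD as Y in population S.
    """
    mean_x, sd_x = mean_sd(dist_x_s)
    mean_y, sd_y = mean_sd(dist_y_s)
    if sd_x == 0:
        raise ValueError("X has zero variance in S")
    beta = sd_y / sd_x
    alpha = mean_y - beta * mean_x
    return alpha, beta
```

A nonlinear method would replace `linear_equating` with a function that matches the whole distribution of X to that of Y in S rather than only the first two moments.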

To equate test scores on the basis of data from a NEAT design, it is necessary to assume that some characteristics of the bivariate distributions of test and anchor scores are population invariant, that is, that they are the same in populations P, Q, and S. One common assumption is that the conditional distributions of X and Y, given A, are population invariant. That assumption makes it possible to estimate the distributions of scores X and Y in population S and then use those estimated distributions to equate X to Y. Equating on the basis of this assumption will be referred to as poststratification equating. The linear version of poststratification equating is the Braun-Holland method (Braun & Holland, 1982). Other linear equating methods based on similar assumptions include the Tucker method and the Levine method (Kolen & Brennan, 2004, pp. 105–132). The nonlinear version of poststratification equating, described by Angoff (1971/1984, p. 113), is commonly known as “frequency estimation” equating.
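The poststratification step can be sketched as follows. This is a minimal illustration under the stated invariance assumption, not a full equating procedure: it takes the joint distribution of a test score and the anchor score in the population that took the test, reweights the conditional distributions by the anchor distribution of the other population, and mixes the two results to obtain the test score distribution in S. The function and argument names are hypothetical, and scores are assumed to be indexed 0, 1, 2, and so on.

```python
def poststratified_distribution(joint_own, anchor_other, w_own):
    """Estimate a test score distribution in S = w_own * own + (1 - w_own) * other.

    joint_own[t][a]: probability of (test score t, anchor score a) in the
                     population that actually took the test.
    anchor_other[a]: probability of anchor score a in the other population.
    Assumption: the conditional distribution of the test score given the
    anchor score is the same in both populations.
    """
    n_t = len(joint_own)
    n_a = len(anchor_other)
    # Anchor marginal in the population that took the test.
    anchor_own = [sum(joint_own[t][a] for t in range(n_t)) for a in range(n_a)]
    for a in range(n_a):
        if anchor_own[a] == 0 and anchor_other[a] > 0:
            raise ValueError("no data to estimate the conditional at anchor score %d" % a)

    dist_s = []
    for t in range(n_t):
        # Observed marginal probability in the population that took the test.
        p_own = sum(joint_own[t])
        # Estimated probability in the other population: conditional from the
        # own population, weighted by the other population's anchor scores.
        p_other = sum(
            joint_own[t][a] / anchor_own[a] * anchor_other[a]
            for a in range(n_a)
            if anchor_own[a] > 0
        )
        dist_s.append(w_own * p_own + (1.0 - w_own) * p_other)
    return dist_s
```

For score X, the joint distribution comes from P, the anchor distribution from Q, and `w_own` is w; for score Y the roles are reversed and `w_own` is 1 − w. The two estimated distributions can then be passed to either a linear or a nonlinear equating function.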

An alternative set of assumptions is that the symmetric linking relationships of X to A and of Y to A are population invariant. Equating methods based on these assumptions link score X to score A by assuming the linking relationship in population S to be the same as in population P; they then link score A to score Y by assuming the linking relationship in population S to be the same as in population Q. These methods will be referred to as chained equating. The linear and nonlinear versions of chained equating are commonly known as chained linear and chained equipercentile equating.
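A minimal sketch of chained linear equating follows. It assumes that each link is the linear function matching means and standard deviations in the population where both scores were observed (X and A in P, A and Y in Q), which is one common form of symmetric linear link; the helper names and the numbers in the usage example are hypothetical.

```python
def linear_link(mean_from, sd_from, mean_to, sd_to):
    """Linear link matching means and SDs of two scores in one population."""
    if sd_from <= 0 or sd_to <= 0:
        raise ValueError("standard deviations must be positive")
    beta = sd_to / sd_from
    alpha = mean_to - beta * mean_from
    return lambda score: alpha + beta * score


def chained_linear(x_to_a_in_p, a_to_y_in_q):
    """Compose the X-to-A link (estimated in P) with the A-to-Y link (estimated in Q)."""
    return lambda x: a_to_y_in_q(x_to_a_in_p(x))


# Hypothetical summary statistics, for illustration only.
x_to_a = linear_link(mean_from=50.0, sd_from=10.0, mean_to=20.0, sd_to=4.0)  # in P
a_to_y = linear_link(mean_from=22.0, sd_from=4.0, mean_to=54.0, sd_to=10.0)  # in Q
equate = chained_linear(x_to_a, a_to_y)
y_equivalent = equate(60.0)
```

Chained equipercentile equating has the same two-step structure, with each linear link replaced by an equipercentile link.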


