## Sunday, October 18, 2009

### More on Scoring Rates

Following on from my last blog, the table below shows the raw scoring rates with two additional items, H and K. Item H is twice as hard as Item I and is only addressed by Student B. Item K is twice as easy as Item J and is only addressed by Student A. This scenario mimics what a computer-based adaptive arithmetic test might generate, since such a test presents more difficult items to more able students and easier items to less able students.

| Raw Rates | Item H | Item I | Item J | Item K | Session Mean |
|---|---|---|---|---|---|
| Student A | | 4 | 8 | 16 | 9.33 |
| Student B | 4 | 8 | 16 | | 9.33 |
| Item Mean | 4 | 6 | 12 | 16 | 9.33 |

From the table, the adaptive component of the computer-based arithmetic test has had much the same effect as a very good handicapper in a horse race. By presenting more difficult items to the more able student and easier items to the less able student, it has produced a dead heat in the result. An examiner looking only at the raw rates might be misled into thinking that both students had the same ability. Hence the need to adjust the results to take into account the difficulty of the items presented to each student.
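The dead heat can be checked with a few lines of Python. This is only a sketch: the dictionary layout and the helper name `mean_present` are my own choices for illustration, and `None` stands for an item that was not presented to that student.

```python
# Raw scoring rates from the table; None marks an item not presented.
raw = {
    "Student A": {"H": None, "I": 4, "J": 8, "K": 16},
    "Student B": {"H": 4, "I": 8, "J": 16, "K": None},
}


def mean_present(values):
    """Average only the cells that actually hold a rate."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present)


# Each session mean is taken over the three items that student addressed.
session_means = {s: mean_present(row.values()) for s, row in raw.items()}

for student, mean in session_means.items():
    print(f"{student}: {mean:.2f}")
```

Both students come out at 9.33, even though Student B scored twice Student A's rate on each of the two items they shared.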

Similarly, the adaptive component of the test has distorted the item mean scoring rates of those items presented to only one student. Take Item H. The item mean scoring rate is shown as 4. However, had the item been presented to Student A, from the stated assumptions of the example, one might have expected the scoring rate to have been 2 capm (half of Student B's rate of 4), and the item mean scoring rate would then have been 3, not 4. In the case of Item K, the item mean scoring rate is shown as 16. From the stated assumptions of the example, had this item been presented to Student B, one might have expected the scoring rate to have been 32 capm (twice Student A's rate of 16), and the item mean scoring rate would then have been 24. Item H has been made to look relatively easier than it is, because it was only presented to the more able student, and Item K has been made to look relatively harder than it is, because it was only presented to the less able student.
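The "what if" arithmetic for Items H and K can be sketched as follows. The constant `ABILITY_RATIO` is my own label for the assumption, read off the shared items in the table, that Student B scores at twice Student A's rate on any given item.

```python
# Assumption from the example: Student B's rate is twice Student A's.
ABILITY_RATIO = 2

observed_h_b = 4    # Item H, Student B (the only observed rate for H)
observed_k_a = 16   # Item K, Student A (the only observed rate for K)

# Rates one might have expected had the missing cells been filled.
expected_h_a = observed_h_b / ABILITY_RATIO
expected_k_b = observed_k_a * ABILITY_RATIO

# Item means over both students, using the expected rates.
full_mean_h = (expected_h_a + observed_h_b) / 2
full_mean_k = (observed_k_a + expected_k_b) / 2

print(f"Item H: shown as {observed_h_b}, would have been {full_mean_h:g}")
print(f"Item K: shown as {observed_k_a}, would have been {full_mean_k:g}")
```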

The scoring rate quotients calculate out as follows:

| Quotients | Item H | Item I | Item J | Item K | Session Mean |
|---|---|---|---|---|---|
| Student A | | 0.43 | 0.86 | 1.71 | 1.00 |
| Student B | 0.43 | 0.86 | 1.71 | | 1.00 |
| Item Mean | 0.43 | 0.64 | 1.29 | 1.71 | |
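For anyone wanting to reproduce the item quotients, here is a minimal sketch. It assumes a quotient is simply a rate divided by the grand mean of all observed rates, which is what the figures in the table imply; the variable and function names are mine.

```python
# Raw scoring rates; None marks an item not presented.
raw = {
    "Student A": {"H": None, "I": 4, "J": 8, "K": 16},
    "Student B": {"H": 4, "I": 8, "J": 16, "K": None},
}
items = ["H", "I", "J", "K"]


def mean_present(values):
    """Average only the cells that hold a rate."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present)


# Grand mean over every observed cell (56 / 6, about 9.33).
grand_mean = mean_present(v for row in raw.values() for v in row.values())

# Item means use whichever students actually addressed the item.
item_means = {i: mean_present(raw[s][i] for s in raw) for i in items}

# Assumed definition: quotient = item mean / grand mean.
item_quotients = {i: m / grand_mean for i, m in item_means.items()}

for i in items:
    print(f"Item {i}: mean {item_means[i]:.2f}, quotient {item_quotients[i]:.2f}")
```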

The session rates can then be adjusted, using the item quotients to calculate the adjusted rate. I am reversing the order here from my previous blog. In the previous blog I adjusted the item rates first, but in this example, it is the session rates which most clearly "need" adjusting, and the item rates in fact cannot be adjusted.

| Adjusted Rates | Item H | Item I | Item J | Item K | Session Mean |
|---|---|---|---|---|---|
| Student A | | 6.22 | 6.22 | 9.33 | 7.26 |
| Student B | 9.33 | 12.44 | 12.44 | | 11.41 |
| Item Mean | 9.33 | 9.33 | 9.33 | 9.33 | 9.33 |
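The adjustment itself can be sketched in Python. I am assuming here that each raw rate is divided by the quotient of its item (item mean over grand mean), which reproduces the adjusted figures; the names are illustrative only.

```python
# Raw scoring rates; None marks an item not presented.
raw = {
    "Student A": {"H": None, "I": 4, "J": 8, "K": 16},
    "Student B": {"H": 4, "I": 8, "J": 16, "K": None},
}
items = ["H", "I", "J", "K"]


def mean_present(values):
    """Average only the cells that hold a rate."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present)


grand_mean = mean_present(v for row in raw.values() for v in row.values())
item_quotients = {
    i: mean_present(raw[s][i] for s in raw) / grand_mean for i in items
}

# Assumed adjustment: divide each observed rate by its item quotient.
adjusted = {
    s: {i: (r / item_quotients[i] if r is not None else None)
        for i, r in row.items()}
    for s, row in raw.items()
}
adjusted_session_means = {
    s: mean_present(row.values()) for s, row in adjusted.items()
}

for s, m in adjusted_session_means.items():
    print(f"{s}: adjusted session mean {m:.2f}")
```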

The adjusted mean scoring rate for Student B is now higher than that for Student A, but not by a factor of 2. Just looking at the numbers for this example, it is very clear that the data set is "incomplete". The "missing" data from Student A on Item H and from Student B on Item K is distorting the results. The quotient method of transforming the data offsets the distortion partially, but not completely.
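To put a number on "partially, but not completely", the sketch below compares the ratio of the adjusted session means with the ratio one would get from a hypothetical complete data set. The filled-in cells (2 for Student A on Item H, 32 for Student B on Item K) are the expected rates under the example's assumptions, not observed data.

```python
# Adjusted session means as given in the adjusted rates table.
adjusted_a, adjusted_b = 7.26, 11.41
adjusted_ratio = adjusted_b / adjusted_a

# Hypothetical complete data set, items H, I, J, K in order.
# The first cell for A and the last for B are expected, not observed.
complete = {
    "Student A": [2, 4, 8, 16],
    "Student B": [4, 8, 16, 32],
}
complete_means = {s: sum(r) / len(r) for s, r in complete.items()}
complete_ratio = complete_means["Student B"] / complete_means["Student A"]

print(f"Ratio B/A after adjustment:   {adjusted_ratio:.2f}")
print(f"Ratio B/A with complete data: {complete_ratio:.2f}")
```

On these figures the adjustment recovers a ratio of about 1.57, against the 2 that a complete data set would show.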

And in this example, adjusting the item rates, using the session mean quotients, is not very useful, as the session means were identical, and the session mean quotients were all unity. It follows that iterations would not achieve much, because no matter how many times you divide by one, you move no further forward.
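A quick sketch of that last point, again assuming a session quotient is the session mean divided by the grand mean (the names are mine): both session quotients come out at unity, so dividing the raw rates by them returns the raw rates unchanged.

```python
# Raw scoring rates; None marks an item not presented.
raw = {
    "Student A": {"H": None, "I": 4, "J": 8, "K": 16},
    "Student B": {"H": 4, "I": 8, "J": 16, "K": None},
}


def mean_present(values):
    """Average only the cells that hold a rate."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present)


grand_mean = mean_present(v for row in raw.values() for v in row.values())

# Assumed definition: session quotient = session mean / grand mean.
session_quotients = {
    s: mean_present(row.values()) / grand_mean for s, row in raw.items()
}

# Dividing each rate by its session quotient (unity) changes nothing.
item_rates_adjusted = {
    s: {i: (r / session_quotients[s] if r is not None else None)
        for i, r in row.items()}
    for s, row in raw.items()
}

print(session_quotients)
print(item_rates_adjusted == raw)
```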