In my last few blogs I have used Java to generate a dataset simulating 64 students sitting a 64-item test. I have talked about the "prox" method of estimation discussed in the Winsteps documentation. I shall now show the results of seven iterations through the dataset using this formula:
di′ = mi − sqrt(1 + si²/2.9) × di
where di′ is the revised item difficulty and di is the previous estimate of item difficulty, mi is the mean ability of the children from the most recent estimate, and si is the standard deviation of those abilities. The chart immediately below shows the raw mean item difficulty as di0 and the mean item difficulty from the six subsequent iterations. The chart below that shows the variance of raw item difficulty as di0 and the variance from the six subsequent iterations.
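One pass of this difficulty update can be sketched in Java as follows. This is a minimal illustration of the formula above, not the code used to generate the charts; the class and method names, and the toy numbers, are my own.

```java
import java.util.Arrays;

public class ProxDifficultyUpdate {

    // One PROX update of an item difficulty estimate:
    //   d' = m - sqrt(1 + s^2 / 2.9) * d
    // where m and s are the mean and standard deviation of the
    // current ability estimates.
    static double updateDifficulty(double d, double[] abilities) {
        double m = Arrays.stream(abilities).average().orElse(0.0);
        // Population variance of the ability estimates.
        double var = Arrays.stream(abilities)
                .map(a -> (a - m) * (a - m))
                .average().orElse(0.0);
        return m - Math.sqrt(1.0 + var / 2.9) * d;
    }

    public static void main(String[] args) {
        double[] abilities = {-1.0, 0.0, 1.0}; // toy ability estimates
        double d = 0.5;                        // previous difficulty estimate
        System.out.println(updateDifficulty(d, abilities));
    }
}
```

In a full iteration this update would be applied to every item, with mi and si recomputed from the latest ability estimates each time.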
It is clear from these charts that the best estimate of difficulty, and the tightest distribution, comes from the original dataset. Four additional iterations were carried out but not charted, because the gyrations of the mean and the variance went off the scale.
The prox estimation method requires similar calculations to be carried out for both item difficulty and student ability, and the corresponding formula for student ability is:
ai′ = mi + sqrt(1 + si²/2.9) × ai
where ai′ is the revised student ability and ai is the previous estimate of student ability, mi is the mean difficulty of the items from the most recent estimate, and si is the standard deviation of those difficulties. The charts below show the means and variances of ability through six iterations.
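The ability update is the mirror image of the difficulty update, with the sign of the scaled term flipped and the roles of the two sets of measures swapped. A minimal sketch, again with illustrative names and toy numbers:

```java
import java.util.Arrays;

public class ProxAbilityUpdate {

    // One PROX update of a student ability estimate:
    //   a' = m + sqrt(1 + s^2 / 2.9) * a
    // where m and s are the mean and standard deviation of the
    // current item difficulty estimates.
    static double updateAbility(double a, double[] difficulties) {
        double m = Arrays.stream(difficulties).average().orElse(0.0);
        // Population variance of the difficulty estimates.
        double var = Arrays.stream(difficulties)
                .map(d -> (d - m) * (d - m))
                .average().orElse(0.0);
        return m + Math.sqrt(1.0 + var / 2.9) * a;
    }

    public static void main(String[] args) {
        double[] difficulties = {-1.0, 0.0, 1.0}; // toy difficulty estimates
        double a = 0.5;                           // previous ability estimate
        System.out.println(updateAbility(a, difficulties));
    }
}
```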
Once again we know from the model parameters that the best estimate of ability and the tightest distribution is given by the original dataset.
In the Winsteps documentation it says the iterations should continue until: "the increase in the range of the person or item measures is smaller than 0.5 logits". The charts below show the "range" of ability and difficulty in the first estimate and the six subsequent iterations.
These charts clearly show the range of ability and difficulty increasing, and by growing increments, so it is not immediately clear when to stop. On closer examination, the abilities range increases by less than 0.5 logits after the first iteration, and the difficulties range increases by less than 0.5 logits for the first four iterations.
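The stopping rule quoted above can be expressed as a small check on successive sets of measures. This is my own sketch of the test as I read the documentation, not Winsteps code:

```java
public class ProxStoppingRule {

    // Range (max - min) of a set of logit measures.
    static double range(double[] measures) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double m : measures) {
            min = Math.min(min, m);
            max = Math.max(max, m);
        }
        return max - min;
    }

    // Stop iterating once the increase in the range of the measures
    // between successive iterations falls below 0.5 logits.
    static boolean shouldStop(double[] previous, double[] current) {
        return range(current) - range(previous) < 0.5;
    }

    public static void main(String[] args) {
        double[] prev = {-2.0, 2.0};  // range 4.0
        double[] curr = {-2.2, 2.2};  // range 4.4, an increase of 0.4
        System.out.println(shouldStop(prev, curr));
    }
}
```

On this reading, the abilities would stop after one iteration and the difficulties after four, which matches the behaviour seen in the charts.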
The documentation also says at least two iterations are carried out, although it is not clear whether this includes the first "transformation" of the raw scores. If it does, and the process stops at ab1 and di1, then the process will not have strayed too far from what we know (from the model parameters) to be the "true" abilities and difficulties. But neither will it have improved on the raw data, or on the first raw transformation.