Friday, January 8, 2010

Parameter Estimation

The Rasch book and the Winsteps documentation, both of which I have written about at some length, are essentially about parameter estimation. One of the objectives I want to achieve with my Java applet is real-time parameter estimation. I revisited the Rasch book and spent some time trying to follow the argument, when really I should have gone straight to a maths textbook.

Rasch presents his mathematical argument as if it were new or somehow unique to the paradigm he is presenting; perhaps it was at the time, although given his own references to physics in the introduction I doubt it. Perhaps in those days it was conventional to present an argument in full, rather than simply refer the reader to one or more generic methods and assume the reader is able to look them up. Certainly the book could have been a lot shorter had he done so.

At the end of the day there is no substitute for the Law of Large Numbers, which essentially states that once your sample becomes large enough, the observed frequency of events will approximate the expected frequency, and the need for esoteric and highfalutin mathematics is dispensed with. That is why, regardless of whether people accept my preference for scoring rates over raw scores, there is a strong argument for establishing a global item bank of frequently used items for the estimation of numeracy and other psychometric parameters.
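To make the Law of Large Numbers point concrete, here is a minimal simulation sketch. It is not taken from Rasch or Winsteps; the function name observed_frequency, the fixed seed and the sample sizes are my own illustrative assumptions. It simply shows the observed success rate settling towards the true probability as the number of trials grows.

```python
import random


def observed_frequency(p, n, seed=0):
    """Simulate n independent trials with success probability p
    and return the observed proportion of successes.

    The seed is fixed only so that repeated runs give the same figures;
    it is an illustrative choice, not part of the argument.
    """
    rng = random.Random(seed)
    successes = sum(1 for _ in range(n) if rng.random() < p)
    return successes / n


if __name__ == "__main__":
    true_p = 0.5  # a fair coin
    for n in (10, 100, 10_000, 100_000):
        freq = observed_frequency(true_p, n)
        print(f"n={n:>7}  observed={freq:.4f}  error={abs(freq - true_p):.4f}")
```

With only ten tosses the observed figure can be well off; with a hundred thousand it should sit very close to 0.5, which is the whole case for a large shared item bank.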

The one feature of Rasch parameter estimation which is perhaps unique, or at least unusual, is the need to estimate two parameters simultaneously. If you toss a coin a hundred times and get roughly 50 heads, you can deduce that the probability of throwing a head is roughly 50%. If a child sits ten tests and scores an average of 95%, you can deduce that the mean probability of success on the test items was roughly 95%, but in Rasch theory that probability itself derives from two parameters: the ability of the child and the difficulty of the test. These results in isolation do not tell you whether the child is very clever or the test is very easy. This is the problem which Winsteps seeks to address with its data transformations and iterations, and which I have sought to address with my scoring rate quotient. Unfortunately, although Rasch showed in theory how to isolate each parameter in turn, he offered little advice on practical methodology, so to some extent there is no single authoritative source on the matter.
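The ambiguity can be shown in a few lines. The sketch below assumes the usual dichotomous form of the Rasch model, in which the probability of success depends only on ability minus difficulty, both measured in logits. The function names and the example ability and difficulty values are hypothetical, chosen purely for illustration; this is not how Winsteps or my scoring rate quotient works.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model.

    Both arguments are in logits; only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


# A 95% success rate fixes only the gap between ability and difficulty.
gap = logit(0.95)

# Three hypothetical (ability, difficulty) pairs sharing that same gap:
# an able child on an average test, an average child on an easy test,
# and a very able child on a hard test.
pairs = [(gap, 0.0), (0.0, -gap), (5.0, 5.0 - gap)]

if __name__ == "__main__":
    print(f"ability - difficulty = {gap:.3f} logits")
    for ability, difficulty in pairs:
        p = rasch_probability(ability, difficulty)
        print(f"ability={ability:6.3f}  difficulty={difficulty:6.3f}  P(success)={p:.3f}")
```

All three pairs give the same 95% success probability, because the gap is ln(19), about 2.94 logits, in each case. The data from one child on one test pin down the difference and nothing more, which is why something extra (many children, many items, or an external anchor such as an item bank) is needed to separate the two parameters.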