Wednesday, April 8, 2009

Charting Data from a Java-based Probabilistic Model

In my last blog I used a few lines of Java to simulate the selection of beans from a set of 64 boxes, each containing 64 beans, of which 1 to 64 respectively were red. The idea was to simulate the results from a test comprising 64 items of smoothly graduated difficulty. For the time being the test candidates are assumed to have a neutral effect on the results; I pictured children randomly selecting beans from the boxes.

In the chart below I have plotted item scores against item number as a simple scatter. Before carrying out any Rasch analysis, I ran a simple linear regression on the data. The slope coefficient was unity (to 2 decimal places) and the intercept was 0.3. R squared was 0.97. This is a pretty good fit to the theoretical line predicted in the previous blog. When data from a probabilistic model fits a theoretical prediction this well, it suggests that the generated dataset is large enough to be useful.
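For readers who want to reproduce the regression figures without a spreadsheet, here is a minimal sketch of an ordinary least-squares fit in plain Java. The class name SimpleRegression and the five data points in main are my own illustration, not the scores from the simulation; substitute the 64 item numbers and item scores to get the slope, intercept and R squared for the real data.

```java
// Minimal ordinary least-squares fit of y = intercept + slope * x.
// SimpleRegression is a hypothetical helper, not part of the original blog code.
final class SimpleRegression {
    final double slope;
    final double intercept;
    final double rSquared;

    SimpleRegression(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two arrays of equal length >= 2");
        }
        int n = x.length;

        // Means of x and y.
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Sums of squares and cross-products about the means.
        double sxx = 0.0, sxy = 0.0, syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxx += dx * dx;
            sxy += dx * dy;
            syy += dy * dy;
        }

        slope = sxy / sxx;
        intercept = meanY - slope * meanX;
        // For a one-predictor fit, R squared is the squared correlation.
        rSquared = (sxy * sxy) / (sxx * syy);
    }

    public static void main(String[] args) {
        // Made-up illustrative data, NOT the simulated item scores.
        double[] itemNumber = {1, 2, 3, 4, 5};
        double[] itemScore = {1, 3, 2, 5, 4};
        SimpleRegression fit = new SimpleRegression(itemNumber, itemScore);
        System.out.printf("slope=%.2f intercept=%.2f R^2=%.2f%n",
                fit.slope, fit.intercept, fit.rSquared);
    }
}
```

The sketch makes no attempt to handle degenerate input such as all x values being equal, in which case the results are NaN.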

Next I ran the "transformation" described in the Winsteps documentation. The formula for the transformation is:

y' = ln(y/(64 - y))

where y' is the transformed item score and y is the raw item score. The chart is shown below.
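The transformation is just the natural log of the odds, so it is easy to sketch in Java. The class and method names below are hypothetical, and the maximum score of 64 is taken from the formula above. Note that a raw score of 0 or 64 has no finite transform; this sketch simply rejects such scores, which is my own choice and not necessarily how such scores should be treated in a full analysis.

```java
// Log-odds ("logit") transformation of a raw score.
// LogitTransform is a hypothetical helper, not part of the original blog code.
final class LogitTransform {

    // Returns ln(y / (maxScore - y)) for a raw score y strictly between 0 and maxScore.
    static double logit(double rawScore, double maxScore) {
        if (rawScore <= 0 || rawScore >= maxScore) {
            throw new IllegalArgumentException(
                    "score must lie strictly between 0 and maxScore");
        }
        return Math.log(rawScore / (maxScore - rawScore));
    }

    public static void main(String[] args) {
        double maxScore = 64; // the 64 in the formula above
        // A few illustrative raw scores, not values from the simulation.
        for (double y : new double[] {16, 32, 48}) {
            System.out.printf("y=%.0f  y'=%.3f%n", y, logit(y, maxScore));
        }
    }
}
```

A score of exactly half the maximum (32 out of 64) transforms to zero, and scores equally far above and below the midpoint give transformed values of equal size and opposite sign.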

When I generated this data I specified that the ability of the children was neutral, or zero logits on the Rasch scale. This is a probabilistic model, however, so setting a parameter will influence the results without guaranteeing any particular outcome. So let's have a look at the student results generated by the model.

The score expected from the model for each child is between 32 and 33, but the chart above shows the actual results scattered over a range from 24 to 39, with a mean score of 31.
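To put that scatter in context, the expected score and its spread can be computed directly. This is a sketch under my reading of the setup: each child draws one bean independently from each of the 64 boxes, and box k yields a red bean with probability k/64. Under that assumption the score is a sum of independent yes/no trials, so its mean is the sum of the probabilities and its variance is the sum of p(1 - p). The class and method names are hypothetical.

```java
// Expected score and spread for one child, ASSUMING one independent draw
// per box and a red-bean probability of k/64 for box k.
// ExpectedScore is a hypothetical helper, not part of the original blog code.
final class ExpectedScore {
    static final int BOXES = 64;
    static final int BEANS_PER_BOX = 64;

    // Mean score: sum of the per-box probabilities of drawing red.
    static double mean() {
        double sum = 0.0;
        for (int k = 1; k <= BOXES; k++) {
            sum += (double) k / BEANS_PER_BOX;
        }
        return sum;
    }

    // Variance of the score: sum of p * (1 - p) over the boxes.
    static double variance() {
        double sum = 0.0;
        for (int k = 1; k <= BOXES; k++) {
            double p = (double) k / BEANS_PER_BOX;
            sum += p * (1.0 - p);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.printf("expected score = %.2f%n", mean());
        System.out.printf("standard deviation = %.2f%n", Math.sqrt(variance()));
    }
}
```

On these assumptions the mean works out at 32.5 with a standard deviation of a little under 3.3, so scores between 24 and 39 all lie within about two to three standard deviations of the expected value.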

Finally, the chart below shows the transformed scatter. Here again the mean, at -0.057, is below the expected average of zero, and for the record, the standard deviation is 0.18.

In my next blog I shall apply Rasch-based estimation to this data.