Tuesday, August 25, 2009

GUI for thinking

Whatever bad things some people say about Microsoft, in the olden days they brought to market a raft of products that were accessible, easy to use, and useful. MS Access is an example. It may have limitations as a commercial database engine, but as a sketch pad, a tool for collecting one's thoughts, it is, in my opinion, hard to beat.

My current task is to design a set of iterations through scoring rate data to render the scoring rate as an objective measure of student ability and item difficulty. The raw data is set out in a single table, as shown in my last blog. On this I have written two queries:

SELECT [HAItemB].[sessidx],
Avg([HAItemB].[rate]) AS AvgOfrate
FROM HAItemB
GROUP BY [HAItemB].[sessidx];

and

SELECT HAItemB.item,
Avg(HAItemB.rate) AS AvgOfrate
FROM HAItemB
GROUP BY HAItemB.item
ORDER BY HAItemB.item;

These queries calculate the average raw scoring rate for each session and each item. The first few rows returned by the item query look like this:

Item AvgOfrate
1+1 34.000
1+2 30.877
1+3 32.935
1+4 31.286
1+5 38.674

A third query calculates the overall mean scoring rate:

SELECT Avg(HAItemB.rate) AS AvgOfrate
FROM HAItemB;

The average rate happens to be 18.185, taken over a grand total of 14,480 records.

I then joined this query with the two previous queries to calculate the scoring rate quotient (SRQ) for each student session and each item, that is, the session or item average rate divided by the overall average rate. The results for the above items are shown below.

Item ItemRate0 AvRate ItQ1
1+1 34.000 18.185 1.870
1+2 30.877 18.185 1.698
1+3 32.935 18.185 1.811
1+4 31.286 18.185 1.720
1+5 38.674 18.185 2.127
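As a quick cross-check outside Access, the quotient column can be reproduced in a few lines of Python. This is only a sketch: the figures are copied from the tables above, and the names item_rate0, overall_mean and it_q1 are made up for the illustration.

```python
# Raw item averages and the overall mean, copied from the tables above.
item_rate0 = {
    "1+1": 34.000,
    "1+2": 30.877,
    "1+3": 32.935,
    "1+4": 31.286,
    "1+5": 38.674,
}
overall_mean = 18.185

# The quotient is simply the item's average rate over the overall average rate.
it_q1 = {item: rate / overall_mean for item, rate in item_rate0.items()}

for item, quotient in it_q1.items():
    print(f"{item}  {item_rate0[item]:.3f}  {overall_mean:.3f}  {quotient:.3f}")
```

A quotient above 1 means the item is answered faster than average; all five of these single-digit additions are well above 1.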

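I then used the session quotients to recalculate the item rates, and the item quotients to recalculate the student/session rates, as proposed in my last blog but one. Each raw rate is divided by its item quotient (ItQ1) to give an item adjusted session rate (SRateAdj1), and by its session quotient (SQ1) to give a session adjusted item rate (ItRateAdj1). The table/array below shows this being done for five items in the first session: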

Sessidx Item Rate ItQ1 SRateAdj1 SQ1 ItRateAdj1
1 1+2 67 1.698 39.461 1.642 40.805
1 3+1 60 1.784 33.640 1.642 36.541
1 2+3 55 1.552 35.435 1.642 33.496
1 5+2 40 1.481 27.000 1.642 24.361
1 4+4 50 1.938 25.806 1.642 30.451
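The two adjusted columns can be re-derived from the printed figures with a short Python sketch. The variable names here are invented for the illustration, and because the quotients are typed in at three decimal places, the last digit or two can differ slightly from the table, which was presumably computed from unrounded values.

```python
# (item, rate, ItQ1) for session 1, copied from the table above.
rows = [
    ("1+2", 67, 1.698),
    ("3+1", 60, 1.784),
    ("2+3", 55, 1.552),
    ("5+2", 40, 1.481),
    ("4+4", 50, 1.938),
]
sq1 = 1.642  # session quotient for session 1, from the table

adjusted = []
for item, rate, itq1 in rows:
    s_rate_adj = rate / itq1   # rate adjusted for item difficulty
    it_rate_adj = rate / sq1   # rate adjusted for session ability
    adjusted.append((item, s_rate_adj, it_rate_adj))
    print(f"{item}  {rate}  {s_rate_adj:.3f}  {it_rate_adj:.3f}")
```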

And this is where the GUI comes in. I can sit staring at those numbers and thinking about them. At first I could see that a number (Rate) was being divided by two different numbers (ItQ1 and SQ1), and I thought: why not save time, multiply them together, and divide Rate by the resulting product? But, to paraphrase Buffy, that would be wrong.

It is the item adjusted session rates (SRateAdj1) which are grouped to form the first pass adjusted session average rates, and the session adjusted item rates (ItRateAdj1) which are grouped to form the first pass adjusted item average rates. The two adjustments feed two different averages, so they have to be kept apart.
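To make the point concrete, here is a small Python sketch using the first row of the session 1 table (Rate 67, ItQ1 1.698, SQ1 1.642). The variable names are invented for the illustration. Dividing by the product collapses two separate quantities into a single number that belongs to neither average.

```python
# First row of the session 1 table: item "1+2".
rate, it_q1, s_q1 = 67, 1.698, 1.642

s_rate_adj = rate / it_q1          # goes into the session averages
it_rate_adj = rate / s_q1          # goes into the item averages
combined = rate / (it_q1 * s_q1)   # the tempting shortcut

print(f"SRateAdj1  = {s_rate_adj:.3f}")
print(f"ItRateAdj1 = {it_rate_adj:.3f}")
print(f"shortcut   = {combined:.3f}")
```

The shortcut value is far below either adjusted rate, because it removes both the item effect and the session effect at once, leaving nothing to group on.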

The queries are almost the same as before, except that they are written against the table containing the adjusted rates. So for the sessions we have:

SELECT AdjSesstable1.sessidx,
Avg(AdjSesstable1.SRateAdj1) AS AvgOfSRateAdj1
FROM AdjSesstable1
GROUP BY AdjSesstable1.sessidx;

and for items we have:

SELECT AdjSesstable1.item,
Avg(AdjSesstable1.ItRateAdj1) AS AvgOfItRateAdj1
FROM AdjSesstable1
GROUP BY AdjSesstable1.item
ORDER BY AdjSesstable1.item;
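For anyone who would rather follow the whole pass in one place, the sketch below strings the steps together in Python: group averages, quotients, adjusted rates, then the adjusted group averages. It is an illustration under assumptions, not the Access implementation. The six records are invented toy data, and the function names group_mean and one_pass are hypothetical.

```python
from collections import defaultdict


def group_mean(records, key, value):
    """Average records[value] within each distinct records[key]."""
    sums, counts = defaultdict(float), defaultdict(int)
    for rec in records:
        sums[rec[key]] += rec[value]
        counts[rec[key]] += 1
    return {k: sums[k] / counts[k] for k in sums}


def one_pass(records):
    """One adjustment pass: returns (adjusted session means, adjusted item means)."""
    overall = sum(rec["rate"] for rec in records) / len(records)
    item_q = {k: m / overall for k, m in group_mean(records, "item", "rate").items()}
    sess_q = {k: m / overall for k, m in group_mean(records, "sessidx", "rate").items()}
    adjusted = [
        {
            **rec,
            "s_rate_adj": rec["rate"] / item_q[rec["item"]],      # item adjusted
            "it_rate_adj": rec["rate"] / sess_q[rec["sessidx"]],  # session adjusted
        }
        for rec in records
    ]
    return (
        group_mean(adjusted, "sessidx", "s_rate_adj"),
        group_mean(adjusted, "item", "it_rate_adj"),
    )


# Invented toy data, not taken from HAItemB.
toy = [
    {"sessidx": 1, "item": "1+1", "rate": 60},
    {"sessidx": 1, "item": "1+2", "rate": 40},
    {"sessidx": 2, "item": "1+1", "rate": 20},
    {"sessidx": 2, "item": "1+3", "rate": 10},
    {"sessidx": 3, "item": "1+2", "rate": 30},
    {"sessidx": 3, "item": "1+3", "rate": 20},
]

session_means, item_means = one_pass(toy)
print(session_means)
print(item_means)
```

In the toy data, item "1+1" has a raw average of 40, but half of its attempts come from the weak session 2, so its session adjusted average moves to 38 while "1+3" rises from 15 to 22.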

For completeness, I ran a query to compute the overall adjusted average rates, but guess what? They were identical to each other and to the overall raw mean. I guess a true mathematician would have known that, but I was quite surprised. Anyway, from there it was quite easy to compute the second pass quotients. These are shown for items below, side by side with first pass numbers:

Item ItemRate0 ItemRate1 AvRate ItQ1 ItQ2
1+1 34.000 35.691 18.185 1.870 1.963
1+2 30.877 32.057 18.185 1.698 1.763
1+3 32.935 33.249 18.185 1.811 1.828
1+4 31.286 35.697 18.185 1.720 1.963
1+5 38.674 36.070 18.185 2.127 1.983

Although we are only looking at five items here, I find these numbers very encouraging. On the first pass, I asked myself the question: Why is the item "1+5" easier than the item "1+1"? Common sense would suggest this was anomalous, caused by the chance happenstance that, in this sample, more able students addressed the item "1+5". And after the first iteration, when item rates have been adjusted for the ability of the students addressing them, the SRQ of the item "1+1" has risen (from 1.870 to 1.963), so its estimated difficulty (given by the reciprocal of the SRQ) has been reduced, while the SRQ of "1+5" has fallen (from 2.127 to 1.983), so its estimated difficulty has been increased.

I think that's enough for one blog. I'll continue with more iterations tomorrow, and if I like the results, I'll report on them.
