GUI for thinking
Whatever bad things some people say about Microsoft, in the olden days it brought to market a raft of products that were accessible, easy to use, and useful. MS Access is an example. It may have limitations as a commercial database engine, but as a sketch pad, a tool for collecting one's thoughts, it is, in my opinion, hard to beat.
My current task is to design a set of iterations through scoring rate data to render the scoring rate as an objective measure of student ability and item difficulty. The raw data is set out in a single table, as shown in my last blog. On this I have written two queries:
SELECT [HAItemB].[sessidx],
Avg([HAItemB].[rate]) AS AvgOfrate
FROM HAItemB
GROUP BY [HAItemB].[sessidx];
and
SELECT HAItemB.item,
Avg(HAItemB.rate) AS AvgOfrate
FROM HAItemB
GROUP BY HAItemB.item
ORDER BY HAItemB.item;
These queries calculate the average raw scoring rate for each session and each item. The first few rows of the item query's output look like this:
Item | AvgOfrate |
1+1 | 34.000 |
1+2 | 30.877 |
1+3 | 32.935 |
1+4 | 31.286 |
1+5 | 38.674 |
A third query calculates the overall mean scoring rate:
SELECT Avg(HAItemB.rate) AS AvgOfrate FROM HAItemB;
The overall mean rate happens to be 18.185, taken over a grand total of 14,480 records.
I then joined this query with the two previous queries to calculate the scoring rate quotient (SRQ) for each student session and each item, that is, each average rate divided by the overall mean. The results for the above items are shown below.
Item | ItemRate0 | AvRate | ItQ1 |
1+1 | 34.000 | 18.185 | 1.870 |
1+2 | 30.877 | 18.185 | 1.698 |
1+3 | 32.935 | 18.185 | 1.811 |
1+4 | 31.286 | 18.185 | 1.720 |
1+5 | 38.674 | 18.185 | 2.127 |
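The quotient column is just each item's average rate divided by the overall mean. As a minimal check of that arithmetic outside Access, here is a short Python sketch using the figures from the table above. The function name `rate_quotient` is a label chosen for illustration and does not appear in the Access queries.

```python
# Overall mean scoring rate and per-item averages, copied from the tables above.
OVERALL_MEAN = 18.185
item_rate0 = {
    "1+1": 34.000,
    "1+2": 30.877,
    "1+3": 32.935,
    "1+4": 31.286,
    "1+5": 38.674,
}


def rate_quotient(avg_rate, overall_mean=OVERALL_MEAN):
    """Express an average rate as a multiple of the overall mean rate."""
    return avg_rate / overall_mean


# First-pass item quotients, rounded to three places like the table.
itq1 = {item: round(rate_quotient(rate), 3) for item, rate in item_rate0.items()}

for item, quotient in itq1.items():
    print(f"{item} | {quotient:.3f}")
```

The same function applies unchanged to the session averages, since a session quotient is defined in the same way.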
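I then used the session quotients to recalculate the item rates, and the item quotients to recalculate the student/session rates, as proposed in my last blog but one. Each raw Rate is divided by the item quotient (ItQ1) to give SRateAdj1, and by the session quotient (SQ1) to give ItRateAdj1. The table/array below shows this being done for five items in the first session: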
Sessidx | Item | Rate | ItQ1 | SRateAdj1 | SQ1 | ItRateAdj1 |
1 | 1+2 | 67 | 1.698 | 39.461 | 1.642 | 40.805 |
1 | 3+1 | 60 | 1.784 | 33.640 | 1.642 | 36.541 |
1 | 2+3 | 55 | 1.552 | 35.435 | 1.642 | 33.496 |
1 | 5+2 | 40 | 1.481 | 27.000 | 1.642 | 24.361 |
1 | 4+4 | 50 | 1.938 | 25.806 | 1.642 | 30.451 |
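To make the two divisions in this table explicit, here is a small Python sketch that recomputes both adjusted columns from the Rate, ItQ1 and SQ1 values shown. It is only an illustration of the arithmetic, and the helper name `adjust` is my own. Because it works from the quotients as rounded to three places, the results can differ from the table in the second or third decimal place.

```python
# Rows for session 1, copied from the table above: (item, raw rate, ItQ1).
SQ1 = 1.642  # first-pass quotient for session 1
rows = [
    ("1+2", 67, 1.698),
    ("3+1", 60, 1.784),
    ("2+3", 55, 1.552),
    ("5+2", 40, 1.481),
    ("4+4", 50, 1.938),
]


def adjust(rate, quotient):
    """Divide a raw rate by a quotient to remove that factor's influence."""
    return rate / quotient


adjusted = []
for item, rate, itq1 in rows:
    s_rate_adj1 = adjust(rate, itq1)  # session rate adjusted for item difficulty
    it_rate_adj1 = adjust(rate, SQ1)  # item rate adjusted for student ability
    adjusted.append((item, s_rate_adj1, it_rate_adj1))
    print(f"{item} | {s_rate_adj1:.3f} | {it_rate_adj1:.3f}")
```

Keeping the two divisions separate matters because each adjusted column feeds a different average: one is grouped by session, the other by item.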
And this is where the GUI comes in. I can sit staring at those numbers and thinking about them. At first I could see that a number (Rate) was being divided by two different numbers (ItQ1 and SQ1), and I thought: why not save time, multiply them together, and divide Rate by the resulting product? But, to paraphrase Buffy, that would be wrong.
It is the item-adjusted session rates (SRateAdj1) that are grouped to form the first-pass adjusted session average rates, and the session-adjusted item rates (ItRateAdj1) that are grouped to form the first-pass adjusted item average rates.
The queries are almost the same as before, except that they are written against the table containing the adjusted rates. So for the sessions we have:
SELECT AdjSesstable1.sessidx,
Avg(AdjSesstable1.SRateAdj1) AS AvgOfSRateAdj1
FROM AdjSesstable1
GROUP BY AdjSesstable1.sessidx;
and for items we have:
SELECT AdjSesstable1.item,
Avg(AdjSesstable1.ItRateAdj1) AS AvgOfItRateAdj1
FROM AdjSesstable1
GROUP BY AdjSesstable1.item
ORDER BY AdjSesstable1.item;
For completeness, I ran a query to compute the overall adjusted average rates, but guess what? They were identical to each other and to the overall raw mean. I guess a true mathematician would have known that, but I was quite surprised. Anyway, from there it was quite easy to compute the second pass quotients. These are shown for items below, side by side with first pass numbers:
Item | ItemRate0 | ItemRate1 | AvRate | ItQ1 | ItQ2 |
1+1 | 34.000 | 35.691 | 18.185 | 1.870 | 1.963 |
1+2 | 30.877 | 32.057 | 18.185 | 1.698 | 1.763 |
1+3 | 32.935 | 33.249 | 18.185 | 1.811 | 1.828 |
1+4 | 31.286 | 35.697 | 18.185 | 1.720 | 1.963 |
1+5 | 38.674 | 36.070 | 18.185 | 2.127 | 1.983 |
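The fact that the overall adjusted means equal the raw mean is not a coincidence. Dividing every rate for an item by that item's quotient rescales the item's own average to exactly the overall mean, so the average over all records stays where it was. The same holds for sessions. The Python sketch below checks this on a handful of made-up records. The data are hypothetical and are not taken from HAItemB.

```python
from collections import defaultdict

# Hypothetical toy records (session, item, rate); NOT taken from HAItemB.
records = [
    (1, "1+1", 60.0),
    (1, "1+2", 40.0),
    (2, "1+1", 20.0),
    (2, "1+5", 30.0),
    (3, "1+2", 10.0),
    (3, "1+5", 50.0),
    (3, "1+1", 35.0),
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def group_means(key_index):
    """Average rate per session (key_index=0) or per item (key_index=1)."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key_index]].append(rec[2])
    return {key: mean(rates) for key, rates in groups.items()}


overall = mean(rate for _, _, rate in records)

# First-pass quotients: group average divided by the overall mean.
sess_q = {key: m / overall for key, m in group_means(0).items()}
item_q = {key: m / overall for key, m in group_means(1).items()}

# Adjusted rates, one column per kind of adjustment.
s_rate_adj = [rate / item_q[item] for _, item, rate in records]
it_rate_adj = [rate / sess_q[sess] for sess, _, rate in records]

print(overall, mean(s_rate_adj), mean(it_rate_adj))
```

Up to floating-point rounding, the three printed values agree, whatever toy rates are used.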
Although we are only looking at five items here, I find these numbers very encouraging. On the first pass, I asked myself the question: Why is the item "1+5" easier than the item "1+1"? Common sense would suggest this was anomalous, caused by the chance happenstance that, in this sample, more able students addressed the item "1+5". And after the first iteration, when item rates have been adjusted for the ability of the students addressing them, the SRQ of the item "1+1" has increased, so its estimated difficulty (given by the reciprocal of SRQ) has fallen, while the SRQ for "1+5" has fallen, so its estimated difficulty has risen.
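Taking difficulty as the reciprocal of SRQ, the direction of the change for these two items can be read off the ItQ1 and ItQ2 columns. The short Python sketch below does that arithmetic. The function name `difficulty` is just a label for illustration.

```python
# First- and second-pass item quotients (ItQ1, ItQ2), copied from the table above.
quotients = {
    "1+1": (1.870, 1.963),
    "1+5": (2.127, 1.983),
}


def difficulty(srq):
    """Difficulty estimate taken as the reciprocal of the scoring rate quotient."""
    return 1.0 / srq


difficulty_by_pass = {
    item: (difficulty(q1), difficulty(q2)) for item, (q1, q2) in quotients.items()
}

for item, (d1, d2) in difficulty_by_pass.items():
    trend = "down" if d2 < d1 else "up"
    print(f"{item} | pass 1: {d1:.3f} | pass 2: {d2:.3f} | {trend}")
```

A higher quotient means a faster scoring rate and hence an easier item, so the reciprocal moves in the opposite direction to the quotient.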
I think that's enough for one blog. I'll continue with more iterations tomorrow, and if I like the results, I'll report on them.