Friday, January 8, 2010

Parameter Estimation

The Rasch book and the Winsteps documentation, about which I have written at some length, are both essentially about parameter estimation. And one of the objectives I want to achieve with my Java applet is real-time parameter estimation. I revisited the Rasch book, and spent some time trying to follow the argument, when really I should have just gone straight for a mathematics textbook.

Rasch presents his mathematical argument as if it is new or somehow unique to the paradigm he is presenting; perhaps it was at the time, although by his own references to physics in the introduction I doubt it. Perhaps in those days it was conventional to present an argument in full, rather than simply refer the reader to one or more generic methods, and assume the reader has the ability to look them up. Certainly the book could have been a lot shorter, had he done so.

At the end of the day there is no substitute for the Law of Large Numbers, which essentially states that when your sample becomes large enough the experimental frequency of events will approximate to the expected frequency, and the need for esoteric and highfalutin mathematics is dispensed with. That is why, regardless of whether people accept my preference for scoring rates over raw scores, there is a strong argument for establishing a global item bank for frequently used items in the estimation of numeracy and other psychometric parameters.
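As a rough illustration of that convergence, here is a minimal Java sketch that simulates repeated attempts at an item and prints the observed success frequency as the sample grows. The class name, method name, seed, and the 95% success probability are all made up for the example; it is not code from my applet.

```java
import java.util.Random;

public class LawOfLargeNumbers {

    /**
     * Simulates `trials` independent attempts, each succeeding with
     * probability p, and returns the observed success frequency.
     * The seed only makes a run repeatable (an assumption for the demo).
     */
    public static double observedFrequency(double p, int trials, long seed) {
        Random rng = new Random(seed);
        int successes = 0;
        for (int i = 0; i < trials; i++) {
            // nextDouble() is uniform on [0, 1), so this is true with probability p
            if (rng.nextDouble() < p) {
                successes++;
            }
        }
        return (double) successes / trials;
    }

    public static void main(String[] args) {
        double p = 0.95; // hypothetical true probability of success
        for (int n = 10; n <= 1_000_000; n *= 10) {
            double f = observedFrequency(p, n, 42L);
            System.out.printf("n=%d observed=%.4f error=%.4f%n",
                    n, f, Math.abs(f - p));
        }
    }
}
```

The error column should tend to shrink as n grows, which is all the Law of Large Numbers promises; it says nothing about how the 95% divides between the child and the test.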

The one feature of Rasch parameter estimation which perhaps is unique, or at least unusual, is the need for the simultaneous estimation of two parameters. If you toss a coin a hundred times and see roughly 50 heads, you can deduce that the probability of throwing a head is roughly 50%. If a child sits ten tests and scores an average of 95%, you can deduce that the mean probability of success on the test items was 95%, but in Rasch theory, that probability itself derives from two parameters - the ability of the child and the difficulty of the test. These results in isolation do not tell you whether the child is very clever, or the test is very easy. This is the problem which Winsteps seeks to address with its data transformations and iterations, and which I have sought to address with my scoring rate quotient. Unfortunately, although Rasch showed in theory how to isolate each parameter in turn, he offered little advice on practical methodology, so to some extent there is no single authoritative source on the matter.
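To make the two-parameter problem concrete, here is the dichotomous Rasch model in its usual logistic form. The symbols are my choice for this illustration: \beta_n for the ability of child n, \delta_i for the difficulty of item i, and X_{ni} = 1 for a correct response. It is a sketch of the standard textbook form, not of the scoring rate quotient.

```latex
% Probability that child n succeeds on item i
P(X_{ni} = 1) = \frac{e^{\beta_n - \delta_i}}{1 + e^{\beta_n - \delta_i}}

% Only the difference matters: adding any constant c to both
% parameters leaves the probability unchanged
(\beta_n + c) - (\delta_i + c) = \beta_n - \delta_i

% Worked example: a 95% success rate fixes only the difference
\beta_n - \delta_i = \ln\frac{0.95}{0.05} = \ln 19 \approx 2.94
```

So a 95% success rate says the child sits about 2.94 logits above the items, and nothing more. A clever child on hard items and an average child on easy items give the same figure, which is why the two parameters have to be estimated together against some fixed reference point.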

Wednesday, January 6, 2010

Probability Theory

Now that I've published some code, a risk arises that someone might read it, and thereby discover that the code does not really conform with my project description. I should really rewrite it as quickly as possible, to minimise the risk of discovery, but there is a theoretical matter I should like to address first.

After writing my blog of 9 August 2009, I thought I had stumbled across something worthy of more formal publication. So I contacted my doctoral supervisors at UWA, and some people at the Institute for Objective Measurement, but to cut a long story short nothing has happened.

To be honest I was looking for help with the theory as much as for a co-author. The probability theory was a bit out of my depth, so I was looking for someone who knew more about it. But apart from one pragmatic and generally expedient former supervisor, who offered to "help" (which really means proofread), because he felt sorry for me, I drew a blank.

I guess I have this fantasy of communicating with intelligent people of like interest, like you read about in books or see in TV docos about famous scientists; but it never really works out. I am not trying to put myself on a plane with famous scientists, but I am a qualified and published academic in a certain field, and it is disappointing to me that I have failed to track down anyone of like mind, either when I was enrolled at the university or since.

I had two supervisors, one of whom checked for grammatical and spelling mistakes and the other of whom checked for calculation errors. They did a good job, and I passed, but it was intellectually disappointing that they never really understood where I was coming from or what I was driving at.

And during that time I published five papers in refereed journals in three countries, the UK, Australia, and the US, and I had not one item of correspondence arising from any of them. I had emails from people interested in my website, but nothing from the printed journals.

So why do I want to publish in a refereed journal again? Fair question. Partly as a litmus test of approval, and partly because, rightly or wrongly, refereed papers look good on a CV or personal web site. But it is not that important.

What is important, whether I am writing for a journal or a blog, is that I get right what I am trying to say. And having failed to find an academic who is either qualified or interested enough to help me, I shall simply have to gen up on the theory myself.

The good news is that what social scientists and especially educationalists find esoteric and rarefied, scientists in other areas find bog standard. So the stuff I need to learn is available in undergraduate textbooks for mathematics and the natural sciences. I shall use the Probability and Statistics EBook, published by UCLA. I have referred to it before, and probably mentioned that it is quite fun to read, with lots of graphic examples.

Tuesday, January 5, 2010

Posting Source Code

CollabNet Subversion turns out not to be quite so hard to learn as I expected.

I had set out with a fundamental misconception, believing I needed to set up the server on my end as well.

When I realised that Sun had already done all that work for me at their end, and that I simply had to use the client to post code on their server, and that I could edit the local files with whatever I wanted at my end, it all became much clearer to me.

Furthermore, Sun even provided the URL for my project on their server together with the syntax for the checkout command.

It was pretty easy to combine the URL in this example with the import command given in the book to post my source code in the project. I say pretty easy; there was a bit of fiddling around. The checkout command includes a local path to which files should be copied, and obviously that is not needed for the import (which term from my perspective really means export or post), and the -m switch/parameter needed to be filled with a comment.

But at the end of the day, a first draft of my code is posted, and I feel much better.

Monday, January 4, 2010

Going Open Source

I have decided to go open source. It has been my intention to do so for a long time, but I hesitated for several months, because I thought my more commercially minded friends would call me a fool. So I put some feelers out into the commercial world, and they were substantially ignored. With hindsight I should have followed my first instinct without hesitation, because the exercise has been a blow to my confidence. I am now launching myself into the open source community with a very dented ego and low self-esteem.

To make it worse, going open source is not a simple exercise. You don't just say, "Hello world, this is my source code, share it with me." You have to master a technology called version control, and at first glance the learning curve looks as steep as embracing Java itself. I feel as if I've started all over again, and this time, instead of having my head held high, my heart is in my boots.

When I began this blog 18 months ago, I was full of optimism. I was quite excited about learning a new programming language, and I was enthusiastic about rewriting my code to make it better than before. I was very optimistic about using the web as a distribution tool, because it meant the whole world could use my software as soon as it was written, without me using a litre of fuel, or buying a postage stamp.

Now it seems that so many other people have mastered the art of delivering tools on the web that the availability of software, free and otherwise, way exceeds the ability or inclination of people to use it.

So with a heavy heart, I am beginning to read an online book entitled "Version Control with Subversion". It seems well written, but long. Right now I cannot even envisage how my manually edited and managed source code files will fit into its system, so I must resign myself to many hours of dry reading. I'll crack open a box of matches to hold open my eyes.