Thursday, July 23, 2009

Building Rasch Formulae

Towards the end of his introduction, Rasch describes a simple probabilistic model involving mice. I shall not use his exact notation because it is not convenient for me to put bars over characters. So where Rasch describes an outcome A or Ã I shall describe the outcomes as 1 or 0. The probability of outcome 1 is q, and the probability of outcome 0 is 1-q.

The thing about probability is not that it is hard, but that it is tedious. What makes it hard is that people who work with probability all the time have, to reduce the tedium, developed their own notations. They also jump steps. I guess for them the fewer the steps, the lower the tedium, but for people (like me) who don't work with it a lot, it has the effect of combining tedium with headache. I am sure this is one of the reasons why Rasch is not well known by people who walk up and down high streets carrying shopping bags.

Rasch jumps with his mouse model straight into an expression which looks to the layman like gibberish. I shall move in steps small enough for me to understand them. So the probability that the outcome of a single event is 1 is given by:

 p{1} = q        (1)

If there are two events, the probability that both outcomes are 1 will be q^2. If there are three events, the probability that three out of three outcomes will be 1 is q^3. Conversely, the probability that three out of three outcomes will be zero is (1-q)^3. This is pretty easy because I am skirting along the outer edge of the outcome tree.

It gets harder to predict one or two results of 1 because there are multiple paths to get there. Take one result of 1. The result of the first event might be a 1, followed by two zeros. Or the result of the first event might be a zero, followed by either a 1 and a zero, or a zero and a 1. The probability of any one of these paths would be q(1-q)^2, but as there are three such paths, they must be added together to give the overall probability of a single 1 out of three events (or trials). Using similar notation to Rasch:

 p{1|3} = 3 q^1 (1-q)^(3-1)        (2)

This is a nice little formula, as far as it goes, but it is not quite Equation 6.1 given by Rasch on page 12 of his book. Where it falls down is on the number of paths, because to some extent a single occurrence of an event is a special case.
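Equation (2) can at least be checked by brute force. The following is a minimal sketch in Python (my choice of language, not Rasch's) that walks every path through the three-event outcome tree and adds up the paths containing exactly one 1. The helper name prob_of_ones and the value q = 0.3 are assumptions made purely for illustration.

```python
from itertools import product

def prob_of_ones(ones, trials, q):
    """Sum the probabilities of every path with exactly `ones` outcomes of 1."""
    total = 0.0
    for path in product((0, 1), repeat=trials):
        if sum(path) == ones:
            # each 1 contributes a factor q, each 0 a factor (1 - q)
            p = 1.0
            for outcome in path:
                p *= q if outcome == 1 else (1 - q)
            total += p
    return total

q = 0.3  # assumed value, for illustration only
print(prob_of_ones(1, 3, q))          # brute force over the outcome tree
print(3 * q**1 * (1 - q)**(3 - 1))    # equation (2)
```

The two printed numbers should agree to within floating-point rounding, whatever value of q between 0 and 1 is chosen.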

Of course I realise that probability experts covered this in kindergarten, but I am not a probability expert. I did pure and applied math at school: ladders leaning on walls and simple harmonic motion. I left probability to the ubergeeks. Sometimes I regret that decision. We had a very good math master. He was very systematic, very methodical, and he made everything very easy.

Returning to the mice, I think I'll convert them to coins, because for me they are easier to manipulate both conceptually and in practice (they stay still for longer). Following in the tradition of my old math master, I am building up in small steps. I have two coins in front of me. I note there are two ways of displaying a head and a tail.

Now I have three coins in front of me. I am counting the arrangements of 2 heads and 1 tail. Essentially the tail can be in any one of three positions, so there are three possibilities, as shown in equation (2) above. Similarly with four or five coins, if there is only one tail (or one head), the number of combos is the same as the number of coins.

Now I have four coins in front of me and I am counting the arrangements of 2 heads and 2 tails. If I put both the heads on the left, both the tails are on the right and there is no movement; that is one possibility. If I hold one head on the left, the other head is loose in 3 coins, which gives three possibilities, except that the loose head can't sit on the left of the three, because that arrangement has already been counted; that leaves two. If I hold no heads on the left, the first coin is a tail and the other tail is loose in three coins, which gives three more possibilities. The total is 1+2+3=6.
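For anyone without four coins to hand, the same count can be had by listing every possible row and keeping the ones with the right number of heads. This is only a sketch; the function name arrangements is made up for the purpose.

```python
from itertools import product

def arrangements(heads, tails):
    """List every distinct row of coins with the given numbers of heads and tails."""
    n = heads + tails
    return ["".join(row)
            for row in product("HT", repeat=n)   # every possible row of n coins
            if row.count("H") == heads]          # keep rows with the right head count

rows = arrangements(2, 2)
print(rows)       # the individual arrangements
print(len(rows))  # 6, matching the manual count of 1+2+3
```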

Now I have five coins in front of me and I am counting the arrangements of 3 heads and 2 tails. If I put all the heads on the left, all the tails are on the right and there is no movement; that is one possibility. If I hold 2 heads on the left, the third head is loose in 3 coins, which gives three possibilities, of which only two count, as explained above. If I hold one head on the left, at first sight, two heads are loose in four coins, which would add six possibilities, but of course that is not really the case, because only one head is on the left, which means the next coin must be a tail, so in fact you have a single tail loose in 3 coins, which has 3 possibilities. And if I hold one tail on the left, the remaining tail is loose in four coins which adds four more possibilities. Altogether that is 1+2+3+4=10.

My final experiment is with six coins and I shall begin with 4 heads and 2 tails. Holding all the heads on the left, all the tails are on the right and there is no movement; that is one possibility. If I hold 3 heads on the left, the fourth coin has to be a tail, which leaves a head and a tail to alternate positions - two more possibilities. If I hold two heads on the left, forcing the third coin to be a tail, one tail has complete freedom of movement through three coins, adding three possibilities. One head on the left followed by a tail leaves one tail complete freedom of movement through four coins, adding four possibilities. Finally one tail on the left leaves the remaining tail free in five coins which adds another five possibilities. That is 1+2+3+4+5=15.

Finally for 3 heads and 3 tails I begin with 3 and 3, which is one combo. Then I'll hold 2 heads and a tail on the left, leaving the remaining head free in 3 coins, adding 3 combos. Holding 1 head and a tail on the left leaves two heads and two tails free, which was covered three paragraphs above and adds 6 possibilities. And holding a tail on the left leaves 3 heads and 2 tails as discussed two paragraphs above and adds 10 possibilities. That is 1+3+6+10=20.
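The manual method in the paragraphs above leans on one idea: the leftmost coin is either a head or a tail, and what remains is a smaller row that has already been counted. A minimal sketch that mechanises that bookkeeping follows. The name count_rows is hypothetical, and this is one reading of the hand count rather than anything taken from Rasch.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_rows(heads, tails):
    """Count arrangements by fixing the leftmost coin and counting the rest."""
    if heads == 0 or tails == 0:
        return 1  # all remaining coins are alike, so only one arrangement
    # either the leftmost coin is a head, or it is a tail
    return count_rows(heads - 1, tails) + count_rows(heads, tails - 1)

for heads, tails in [(1, 1), (2, 1), (2, 2), (3, 2), (4, 2), (3, 3)]:
    print(heads, "heads,", tails, "tails:", count_rows(heads, tails))
```

Run over the cases worked by hand, it gives 2, 3, 6, 10, 15 and 20.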

I'd love to say I could see a pattern in all of this, but at this stage I really can't. What I can do is cheat a little and look for a similar formula on the web. The notation seems to have changed a little, but making allowance for that, the probability of a specified event occurring a times out of n trials seems to be:

 p{a|n} = (n!/((n-a)! a!)) q^a (1-q)^(n-a)        (3)

I tried it with the last two examples above, and the term with all the factorials in it seems to yield the same number of paths through the outcome tree as my manual method. I'd love to find a step by step derivation of that term*, but I haven't time. In the absence of that, at least I have put some pith on an expression which, on my first reading of the Rasch book, was not very meaningful to me.
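The check described above is easy to repeat. Here is a minimal sketch of equation (3) in Python, with the factorial term pulled out on its own so that it can be compared with the coin counts. The function names paths and p and the value q = 0.3 are assumptions for illustration only.

```python
from math import factorial

def paths(a, n):
    """Factorial term of equation (3): number of paths with a ones in n trials."""
    return factorial(n) // (factorial(n - a) * factorial(a))

def p(a, n, q):
    """Equation (3): probability of exactly a outcomes of 1 in n trials."""
    return paths(a, n) * q**a * (1 - q)**(n - a)

q = 0.3  # assumed value, for illustration only
print(paths(2, 6), paths(3, 6))            # 15 and 20, as in the six-coin counts
print(p(1, 3, q))                          # should match equation (2) for the same q
print(sum(p(a, 3, q) for a in range(4)))   # probabilities over all a; should be 1 up to rounding
```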

*I have subsequently found a good one here.