Rasch was very clear on one thing: all these measures are relative. "Neither δ nor ζ can be determined absolutely" (op. cit. page 16). Rasch emphasises the arbitrary nature of units, even in physical science: "1 Ft = the length of the king's foot" (op. cit. page 16). This answers the question I posed in the fourth paragraph of this blog: if the difficulty parameters for three items were found to be 1, 2 and 5 using one population, would they be exactly the same for another population, or merely in the same proportion? The answer is the latter. It also confirms my own gut feeling that those writers who have developed the habit of talking about "logits", in the context of Rasch parameters, should desist from doing so, because stipulating units flies in the face of the original Rasch argument.
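To make the point concrete, here is a minimal sketch (my own illustration, not anything from Rasch's book) using the multiplicative form of the model, in which the probability of a correct response is ζ/(ζ + δ) for ability ζ and difficulty δ. Multiplying every ζ and every δ by the same arbitrary constant leaves every response probability unchanged, which is exactly why the parameters are determined only up to proportion:

```python
def p_correct(zeta, delta):
    """Probability of a correct response in the multiplicative Rasch model."""
    return zeta / (zeta + delta)

abilities = [2.0, 4.0, 10.0]    # hypothetical person parameters
difficulties = [1.0, 2.0, 5.0]  # the item difficulties from the example above

# Rescale everything by an arbitrary constant: the "unit" is free.
c = 7.3
for zeta in abilities:
    for delta in difficulties:
        assert abs(p_correct(zeta, delta) - p_correct(c * zeta, c * delta)) < 1e-12

print("All response probabilities are unchanged under rescaling.")
```

Since no observable quantity pins down the scale, any choice of unit (logits included) is a convention imposed from outside the model.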
Sometimes I think Rasch is oversold by the enthusiasts. If all you want to do, with, say, three children, Flossy, Gertrude, and Samantha, is estimate that Flossy is twice as clever as Gertrude, who in turn is two and a half times as clever as Samantha, then you really don't need all the highfalutin mathematics. You just throw a few tests at the children and note that Flossy scores around 50 marks, Gertrude around 25 marks, and Samantha around 10 marks, or at least marks in that ratio. There is nothing very probabilistic about this methodology; it is as old as the hills.
I am sure I have read somewhere that Rasch methodology frees you from the tyranny of both peer comparison and arbitrary item selection, but the cruel truth is that it does not. Of course what you can do is compare Flossy, Gertrude, and Samantha with a control group, calibration set, or the "Class of '56", who then become like the "King's foot". There is nothing wrong with this. As long as the same "unit of measure" is used year after year, you can give Flossy, Gertrude, and Samantha a "score", which depends neither on their being tested alongside each other nor on the specific items chosen for their test. But again this is an application of generic measurement theory, and has nothing whatsoever to do with probabilistic modelling.
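The "King's foot" move can be sketched in a few lines (again my own hypothetical illustration, with made-up numbers): fix one calibration item's difficulty at 1 and express everything else relative to it. Two fits of the same items on different arbitrary scales then agree once anchored:

```python
def anchor(difficulties, reference_item):
    """Re-express difficulties relative to a chosen calibration item,
    whose difficulty becomes the unit of measure (the "King's foot")."""
    unit = difficulties[reference_item]
    return [d / unit for d in difficulties]

# Hypothetical estimates for the same three items, on two arbitrary scales.
fit_1956 = [3.0, 6.0, 15.0]
fit_1957 = [0.5, 1.0, 2.5]

# Anchored to item 0, both fits give the same relative difficulties.
print(anchor(fit_1956, 0))  # [1.0, 2.0, 5.0]
print(anchor(fit_1957, 0))  # [1.0, 2.0, 5.0]
```

The anchoring itself is plain arithmetic on ratios, which is the point: the comparability comes from holding the reference fixed, not from anything probabilistic.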
If I might be forgiven for referring once again to my gut, I have a strong (gut) feeling that there is something interesting about Rasch, but that it is less about the metrics which purport to come from an application of his methodology than about an acknowledgement of the randomness of events and the need to quantify levels of confidence when ascribing meaning to test results.