Monday, July 25, 2011

Diagnostics with MySQL

I have changed my mind and decided to record my diagnostic work because it may be of interest to someone else moving from a Derby database to MySQL, and perhaps more honestly, it helps me to think.

The problem I am addressing is twofold. First I am running routines written two years ago, and although they were written by me, with the passage of time they now look like double Dutch. Second I am using an aging dataset, which has become messy with the passage of time. I know the code ran two years ago, and one year ago, so the problem probably relates to something I have done in the last year. The trouble is, I have done a lot, mostly in the name of "cleaning it up" or "filling holes" in the dataset.

Shares, like all real world data, don't behave in a simple or uniform way. Companies change their names, they merge with and get taken over by other companies, and they change the nominal unit value of their shares on a whim. The question is how to deal with this. Do you use "transactions", which didn't take place, but which attempt to adjust to the current situation, or do you break all accounting and audit protocols by overwriting old transactions with updated parameters. Either way, you muck up the data and make it more difficult to analyze in the sort of methodical fashion which lends itself to easy programming.

The error reported was:

SQLException: You have an error in your SQL syntax;
check the manual that corresponds to your MySQL
server version for the right syntax to use
near 'of TrQ' at line 1

Now if the problematic SQL query were hard coded, it would be quite easy to check it against the manual as recommended. But in this case it is not. The query is generated from within the code. Indeed it is generated from within an iterative loop, and at first site, the problem does not seem occur on the first iteration. Stranger still, when I ran it using Derby, the program crashed, but in a different place.

So I need to break up the code such that the loops begin to run, but pass their output to the screen rather than trying to use it run more queries.

My initial screen dump was in the form:

Sidx= 470
Sell insert Query = INSERT INTO TransComp VALUES
(470, '2027-10-20', 'Sell', 'CEY', 6.2, -400)
try insert sell
insert sell succeeds
Buy Query = SELECT * FROM TransRaw where TrType = 'Buy'
and StCode = 'CEY' for update of TrQ
Sidx= 471
Sell insert Query = INSERT INTO TransComp VALUES
(471, '2027-10-20', 'Sell', 'AVE', 1.77, -1000)
try insert sell
insert sell succeeds
Buy Query = SELECT * FROM TransRaw where TrType = 'Buy'
and StCode = 'AVE' for update of TrQ

This confirmed that sales data were being successfully inserted, but the dump was too long, and I lost the top. To save the bother of writing to a text file, I shortened it. The next dump was in the form:

try sell query loop
Sell Query = SELECT * FROM TransRaw where TrType = 'Sell' order by dDate, TRidx
INSERT INTO TransComp VALUES (337, '2001-06-20', 'Sell', 'CRT', 0.45, -6000)
INSERT INTO TransComp VALUES (418, '2001-08-20', 'Sell', 'QCH', 5.9, -300)
INSERT INTO TransComp VALUES (443, '2001-09-20', 'Sell', 'CBA', 51.2, -125)
INSERT INTO TransComp VALUES (404, '2002-08-20', 'Sell', 'CBA', 53.48, -125)
INSERT INTO TransComp VALUES (475, '2002-11-20', 'Sell', 'CBA', 49.99, -100)

This all fitted on one screen, and the first thing I noticed was that the dates are all wrong. I also noticed that CRT is indeed the first item in this jumbled up dataset. I therefore ran the same dump using Derby, and the first five items were now:

try sell query loop
Sell Query = SELECT * FROM TransRaw where TrType = 'Sell' order by dDate, TRidx
INSERT INTO TransComp VALUES (420, '2006-09-12', 'Sell ', 'ZFX', 10.65, -100)
INSERT INTO TransComp VALUES (410, '2006-12-12', 'Sell ', 'BIL', 12.44, -80)
INSERT INTO TransComp VALUES (418, '2007-08-01', 'Sell ', 'QCH', 5.9, -300)
INSERT INTO TransComp VALUES (412, '2007-08-15', 'Sell ', 'CDO', 6.2, -220)
INSERT INTO TransComp VALUES (417, '2007-11-07', 'Sell ', 'LIP', 0.95, -1300)

The dates here are correct, and the order is as I would expect. I must confess here, that while I spent a few days browsing the Derby documentation, I haven't even glanced at that for MySQL yet. I just assumed it would be the same, and thought I could "wing it". I guess that assumption has now been proven false. I really need to have a look at the manual (as the error message suggested), but first let's have a closer look at the lines for CRT:

INSERT INTO TransComp VALUES (337, '2001-06-20', 'Sell', 'CRT', 0.45, -6000)
INSERT INTO TransComp VALUES (337, '2009-06-01', 'Sell ', 'CRT', 0.45, -6000)

The first line above is taken using MySQL, and the second using Derby. And the first observation is that MySQL is using day information for the year and converting the first two digits of the year to the day. That should be easy enough to fix. The second is that MySQL is removing trailing spaces from the transaction type, while Derby is not. I've used a 5 character field: Derby displays it with 5 characters: 'Sell ', while MySQL truncatates it to 'Sell'.

Finally for today, using the first screen dump above, let's look closely at the Buy query, which MySQL rejected:

SELECT * FROM TransRaw where TrType = 'Buy'
and StCode = 'CEY' for update of TrQ

Just now I can't remember what I meant by "for update of TrQ". Perhaps I need to revisit both manuals.

No comments: