Friday, December 21, 2007

SQL QOTW - Solution!

To anyone who has been bashing their head against the desk with the SQL brain-buggerer I posted earlier, here is the solution I came up with. There may well be others, that may be more elegant, but here's the one that occurred to me...

OK, so you want to get all the documents up to a given cumulative total byte size.
A straightforward get-me-all-the-docs query looks like this:
SELECT id, filename, bytesize
FROM documents


We want to calculate the cumulative total size for all the records we've got so far, and then stop once we're about to pass the total. So the first problem is how do we work out the cumulative total?

Well, we're essentially dealing with an ordered list of records. We effectively want to work down the ordered list, calculating the cumulative total as we go.

So first let's impose an order on the list:

SELECT id, filename, bytesize
FROM documents
ORDER BY id

Now for each record, we want to work out the cumulative total. I had many thoughts on how to keep a running total, but I eventually realised that you don't actually need to. Following the old physics student's guiding principle of "if you find yourself with a problem you can't solve, rewrite it into a problem that you can solve"**, here's the trick - we don't actually need to calculate this as "x + (next x)" - as long as we use the same imposed order, we can get this with a rather cunning subquery:

SELECT id, filename, bytesize,
       ( SELECT SUM(bytesize)
         FROM documents d2
         WHERE d2.id <= documents.id
       ) AS cumulative_total
FROM documents
ORDER BY id


You see how it works? If we're working down an ordered list of ids-

1000
1001
1002
1003
1004
...
etc



- then keeping a running total is exactly equivalent to querying for the sum of all rows up-to-and-including the current row. So we can now move the subquery from the SELECT clause into the WHERE clause, and it gives us a sufficient criterion for accepting or rejecting rows:


SELECT id, filename, bytesize
FROM documents
WHERE ( SELECT SUM(bytesize)
        FROM documents d2
        WHERE d2.id <= documents.id
      ) < @max_cumulative_size * 1024 * 1024
ORDER BY id

( where @max_cumulative_size is the maximum total size in megabytes, obviously )

It all hinges on the ordering. The one proviso with this query is that, although you are free to order the results by whatever you choose, the ordering must be the same as the implicit ordering in the subquery.
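If you want to poke at this without an Oracle or MySQL instance to hand, here's a minimal sketch in Python using the built-in sqlite3 module. SQLite is purely my stand-in here (it is not one of the two target databases), the sample rows are made up, the outer table gets an explicit d1 alias for clarity, and the limit is passed in bytes as a bound parameter rather than as @max_cumulative_size megabytes:

```python
import sqlite3


def docs_up_to(conn, max_bytes):
    """Return (id, filename, bytesize) rows whose running total stays under max_bytes."""
    sql = """
        SELECT d1.id, d1.filename, d1.bytesize
        FROM documents d1
        WHERE ( SELECT SUM(d2.bytesize)
                FROM documents d2
                WHERE d2.id <= d1.id      -- same ordering column as the ORDER BY
              ) < ?
        ORDER BY d1.id
    """
    return conn.execute(sql, (max_bytes,)).fetchall()


# Hypothetical sample data: running totals are 400, 700, 1200, 1300 bytes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, filename TEXT, bytesize INTEGER)"
)
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?)",
    [(1000, "a.txt", 400), (1001, "b.txt", 300), (1002, "c.txt", 500), (1003, "d.txt", 100)],
)

batch = docs_up_to(conn, 1000)
print([row[0] for row in batch])
```

With a 1000-byte limit only the first two documents come back, because the third would push the running total past the limit.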

And - shock, horror, step back in amazement - that sql works as-is across MySQL and Oracle!

Yay!

So I don't know about you, but after that I feel the need to go hit the climbing wall and have a recuperative pint or two afterwards.

Have a great Christmas everyone!




**This is a really useful skill to have, and it's not just applicable to Physics exams either. In fact, taken to extremes, it's perfectly possible, with a bit of cunning, to change a question about something you haven't revised, into a question about something you *have* revised....

I first realised the power of this at school, in English -
Q2.4 : Is Hamlet really mad? What is the purpose of his madness in the play and by the play? Discuss in not less than 3,000 words.

Answer : In any discussion concerning Hamlet, it is important to note that had he been Scottish, his situation would have been very similar to that of Macbeth....


:-)

SQL QOTW - SELECT all rows up to a cumulative total

Here's a little SQL test that arose here the other day. Let's say you have a DB table called documents, in which you have the following fields:

id
filename
bytesize
date_created
full_content

full_content is a BLOB, containing all the text extracted from whatever document the record represents.
bytesize is, cunningly enough, the size of the content in bytes.

You want to get all the documents, and do something to them - what you want to do is not important here, just that it involves the full_content. However, a moment of pondering will show the nasty bit of this problem... Let's say you may have tens of millions of documents, and the average bytesize is around a meg. Clearly attempting to read the full table into memory is not a good idea (at least until we get Terabytes of RAM in our servers).

A far better idea is to limit the maximum memory usage to some value, and only process up to X MB of documents in one batch. The next time round, you'll get the next X MB of docs, and so on.

We turned this problem over and over for a while, and we thought of a few ways of doing it, but all involved multiple queries, and/or things like temporary tables... and, dammitall, it SHOULD be do-able in one query, dammit man! (cue much indignant harrumphing and twizzling of large mutton chop side whiskers, in a Victorian man-of-letters kind of way) Oh yeah, and here's the killer - it has to be portable across Oracle 10g and MySQL.

So the problem is this - given this table, can you construct a single query that will read all the documents up to a given maximum number of megabytes, that will run on Oracle 10g AND MySQL?

Well, after pondering this in several isolated attempts for a few days, I went back to it yesterday, and the answer just popped out straight away - and it's actually deceptively simple. I was going to just give you the answer, but I thought it might be a bit more fun to leave it as a challenge, so a big fat Brucie Bonus goes to the first person who gets it - I'll put the answer up here later on today, if no-one gets it in the meantime.

Monday, November 19, 2007

Beware Case-Insensitive Comparisons on Oracle!

Let's say you have a large database with lots of People in it. Each Person can have many EmailAddresses. Let's say that your code needs to be portable across multiple DB systems (specifically, MySQL and Oracle).

Let's also say that you process up to a million emails per day through your system, and for each email you check the sender and recipient addresses against your big DB. So that's potentially an awful lot of reads, and a potential performance bottleneck.

"No Problem!" you say, "I'll just stick an index on the EmailAddress.address field, and I'm sorted"

...but that's not quite it. EmailAddresses are case-insensitive, so that FOO@BAR.COM is the same as foo@bar.com. MySQL does a case-insensitive comparison by default, but Oracle doesn't.

So how do you get Oracle to do a case-insensitive comparison? Well, to cut a long story short, there are two parameters you can set (so long as you're on at least 10gR2):
alter session set NLS_COMP=LINGUISTIC;
alter session set NLS_SORT=BINARY_CI;


Setting NLS_COMP to LINGUISTIC tells Oracle to compare strings using the rules named in NLS_SORT rather than doing a plain binary comparison, and NLS_SORT governs those rules as well as the sort order. The _CI suffix tells it to be case-insensitive.

Lovely jubbly, smashin' sorted and great - well, actually, no. Not sorted and great at all, because there's a lovely implementation gem there - "Setting NLS_SORT to anything other than BINARY causes a sort to use a full table scan, regardless of the path chosen by the optimizer.".

So Oracle's implementation of a case-insensitive comparison is, basically, "I'm going to go through every single row in the table one-by-one, checking to see if it matches".

As Mel Smith used to say - "Aw, it's marvellous, innit?"

So what can you do? Obviously a full table scan is BAAAAD, so we need to find a way to make it use an index. There are two things you can do**:

1) Add an explicit lower-case-address field, that duplicates the address field and explicitly lower-cases the address before writing. You can then add an index on this field, and check against that. This means you duplicate info (which is bad) and use more storage than you need (which is also bad) but it's totally portable (which is good)

2) Add a functional index to the address field:
CREATE INDEX ix_my_index_name ON email_addresses( LOWER(address) )
This means you gain the vastly superior performance of an index range scan as opposed to a full table scan, but guess what - MySQL doesn't have functional indexes yet.

GRAAAAAAAAAAAAAAAAAAAARGH!
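For what it's worth, option 1 (the portable one) can be sketched like this. It's a rough illustration only: SQLite via Python's sqlite3 module is my stand-in database, and the address_lower column, the index name and the two helper functions are names I've made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE email_addresses (
           id            INTEGER PRIMARY KEY,
           address       TEXT NOT NULL,   -- as originally written
           address_lower TEXT NOT NULL    -- duplicated, lower-cased copy
       )"""
)
# An ordinary index on the lower-cased column: no functional index needed.
conn.execute(
    "CREATE INDEX ix_email_addresses_lower ON email_addresses (address_lower)"
)


def add_address(conn, address):
    """Store the address as given, plus a lower-cased copy for lookups."""
    conn.execute(
        "INSERT INTO email_addresses (address, address_lower) VALUES (?, ?)",
        (address, address.lower()),
    )


def find_address(conn, address):
    """Case-insensitive lookup that only ever touches the indexed column."""
    row = conn.execute(
        "SELECT address FROM email_addresses WHERE address_lower = ?",
        (address.lower(),),
    ).fetchone()
    return row[0] if row else None


add_address(conn, "Foo@Bar.com")
print(find_address(conn, "FOO@BAR.COM"))
```

The lower-casing happens in application code on both the write and the read, so the SQL itself stays identical on every database.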


**OK, so there is a third and possibly fourth option in this particular case, to just lower-case the address field anyway, but the point of this post was meant to be general and it does have a downside too.

You potentially lose capitalisation of the personal part of an email address. For instance:
"Alistair Davidson" <apdavidson@some-domain-im-not-going-to-tell-you.com>
would get flattened to "alistair davidson" <apdavidson@some-domain-im-not-going-to-tell-you.com>

So you want to try to keep the capitalisation if possible.

Probably the best thing to do here is actually to split the personal part from the address, and have the personal part stored as-is, with the address part lower-cased.
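A minimal sketch of that split, using Python's standard email.utils.parseaddr (the split_address name and the example.com address are just for illustration):

```python
from email.utils import parseaddr


def split_address(raw):
    """Split a raw address into (personal part as-is, lower-cased address)."""
    personal, address = parseaddr(raw)
    return personal, address.lower()


print(split_address('"Alistair Davidson" <APDavidson@Example.COM>'))
```

The personal part keeps its capitalisation, and only the address part is flattened before it goes anywhere near an index.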

Monday, October 29, 2007

Custom Blog Ads Rates?

In blatant defiance of all known precedent, someone has not only read my climbing/mountaineering/great-outdoors-in-general blog, but actually thinks it's worth advertising on!

They emailed me to ask how much I'd charge per month for text ads - trouble is, I don't really know, I, like 99% of the rest of the blogosphere, just shoved AdSense on there and forgot about it. Like, really forgot about it, I haven't even had my first Google cheque yet.

So, er.... hmmm.... I don't want to undersell myself, but equally I don't want to give them a price that instantly marks me out as a no-hoper. Ponder, ponder, stick finger in the air...

But I'm a physicist by training, dammit, I should be able to work this out, surely. So let's do a quick calculation.....

It's a pretty low-traffic blog (a couple of hundred hits a day), but it's quite targeted, and AdSense shows an effective CPM of around $3 with a clickthrough ratio of just over 1%.

So, if I'm getting H hits per day, and the CPM (cost per thousand, I think?) is C, then the price per month should be (avg number of days in a month) * (H / 1000) * C, right?

That gives me a baseline price of around $30 / month. Obviously I can adjust that depending on page placement, as ads just below the banner get more clicks than in-page, but does that sound about right for a low-traffic, specialist enthusiast blog?
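Here's that formula as a quick Python sketch, so the numbers can be played with. The function name is made up, and I'm assuming an average month of 365.25 / 12 days:

```python
AVG_DAYS_PER_MONTH = 365.25 / 12  # assumption: about 30.44 days


def monthly_ad_price(hits_per_day, cpm_dollars):
    """(avg days in a month) * (H / 1000) * C, as in the formula above."""
    return AVG_DAYS_PER_MONTH * (hits_per_day / 1000) * cpm_dollars


print(round(monthly_ad_price(200, 3.0), 2))
```

Plugging in exactly 200 hits a day at a $3 CPM gives a bit over $18 a month; a $30 baseline corresponds to roughly 330 hits a day at the same CPM.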

I didn't know they were so cheap either!


i'm hurt we're so cheap
Originally uploaded by clurr

Saturday, October 13, 2007

Planning for a 1.5 Terabyte Database

Here's something to make you stroke your chin and stare into the distance for a moment or two -

A customer asked us about crunching a large amount of email. Some rough back-of-an-envelope calculations lead us to expect up to about 150GB of it in one go. Our experiments with the Enron emails on MySQL produced about a 1:10 ratio of data-in to database size, a ratio of roughly 1:1 on data-in to language model size and full-text index, and a roughly linear (not 1:1 though!) increase in processing time per email with elapsed cumulative crunching time.

So that means we can expect a database size of up to 1.5 Terabytes, plus another 300GB of language models and full-text index...

(We'd be using Oracle for the DB, as it comes with some very handy out of the box management and monitoring tools, performance advisor alerts, and all that kaboodle, and it may be that the 1:10 ratio is different on Oracle - we're looking at that at the moment)

So how do you go about planning for a database of that size?

We can't even defer the question and scale-as-we-go, as it's going to be growing to that size very quickly, within days of kicking off. It's got to be right, straight from the word go. I've worked with very large email datasets before, on Smartgroups, but in Freeserve we had a big team of specialist UNIX engineers to manage it.

How do you spec up the disk configuration, knowing that the database files are going to be that huge? We've tended to go for RAID 1 (mirrored) by default, as an until-now acceptable balance between simplicity and resilience. But this means that if you have, say, 2 x 500GB disks, you only actually get 500GB of storage. Dell provide up to 1TB disks on some servers, but damn it's pricey... we'll no doubt end up going for a combination of mirrored, striped configs, but that means more complexity of course.

On that subject, how do you organise the file system? And, crucially, how do you go about arranging back-ups for all that data? The last thing you want is for 1.5TB of data to get lost with no backup, and have to be re-calculated from scratch. The backups have to be stored somewhere, and even just the process of shifting that amount of data from one storage device to another is non-trivial. I mean, shifting 1.5 x 10^13 bits, even over a dedicated, max-ed out Gigabit ethernet is going to take at least 1.5 x 10^4 seconds - or just over 4hrs...
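The back-of-an-envelope transfer time is easy to check in a couple of lines of Python. This is a best-case sketch: it takes the rounded figure of 1.5 x 10^13 bits and assumes the link really does sustain a full 10^9 bits per second, with no protocol overhead at all.

```python
def transfer_seconds(total_bits, bits_per_second):
    """Best-case time to move total_bits over a link of the given speed."""
    return total_bits / bits_per_second


seconds = transfer_seconds(1.5e13, 1e9)  # ~1.5TB over Gigabit ethernet
hours = seconds / 3600
print(seconds, round(hours, 2))
```

Any real backup run will be slower than this, so the 4-and-a-bit hours is a floor, not an estimate.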

Hmm, questions, questions, questions, chin stroke, thousand-yard-stare, tap mouth absently... I'll be pondering this for a while, I think.

Mind you, my old university mate Dan once emailed me back in his PhD days, saying "Hey, you know about databases, don't you? Could you give me a quick intro? I've got to write one to handle data from the SLAC particle accelerator - it's got to handle terabytes of data per day...."
- and that was way back in about 1997 or so, when a 4GB drive would cost you nearly $500. Maybe time to give him a ping on Facebook....

Friday, October 12, 2007

It Lives..!

OK, so posts have been a bit thin on the ground lately, due to being very busy at work but mostly on pre-sales stuff or client-confidential stuff.... but in general, just being so damn busy all the time.

I've been to the States on a client visit, where we also went to see the supremely-smashing Nellie McKay play at The Birchmere. I'd never heard her before, and Peter just said "She's kind of hard to describe..." - I ended up describing the gig as like Frank Zappa and Paul Simon jamming with Suzanne Vega and Phoebe from Friends, and I think that's about as close as I can get.

I've been sport-climbing in the magnificent El Chorro in Andalucia, which I'll post more about on Dynamove (which I've also been neglecting lately)

I've been studying the Social Dynamics of Werewolves and Lynch Mobs...and if you want an atmospheric bar in which to do so, you couldn't get better than Shunt - it was like walking into the bar in An American Werewolf In Paris

I've also been coding more in the last couple of weeks, working on LDAP importers and such, and I've come up against some interesting questions of scale, which I'll post more about later....

Friday, September 07, 2007

Comment Spam Extortion Racket?

This comment on Trampoline Systems' blog made me smile -

hello , my name is Richard and I know you get a lot of spammy comments , I can help you with this problem . I know a lot of spammers and I will ask them not to post on your site. It will reduce the volume of spam by 30-50% .In return Id like to ask you to put a link to my site on the index page of your site. The link will be small and your visitors will hardly notice it , its just done for higher rankings in search engines. Contact me icq 454528835 or write me tedirectory(at)yahoo.com , i will give you my site url and you will give me yours if you are interested. thank you

I know you get a lot of spammy comments , I can help you with this problem . I know a lot of spammers....

Kind of like walking into a newsagent and saying "ya know, I can't help noticing you got a lot of paper in here... that's kind of a fire trap, ya know what I'm sayin?"

If this guy really does know a lot of spammers, I'm sure someone out there knows a lot of Federal Agents who'd like to talk to him - his ICQ number and email address are in the quote. ;-)

Saturday, September 01, 2007

OMG, I'm a PHB....

Last week was something of a milestone for me. While gathering my thoughts for the daily stand-up on Friday, the realisation suddenly dawned - I'd had a really busy week, but I hadn't actually written a single line of code all week. I hadn't even fired up my IDE....

So that's it - I am now officially "Management". How did this happen? Seems like just five minutes ago that we were four guys around a table - one CEO and three techies, building stuff. I'd sometimes spend entire days in the office coding away on my own. Now there's 13 of us, and that's just the permanent staff, not counting contractors and non-exec directors, and I've just spent the whole week running around like the proverbial blue-arsed fly, organising, cabling, battling with MS Project, on the phone with suppliers, dealing with customers, in meetings and conference calls, discussing product strategy, allocating work.... etc etc etc... anything, in fact, except coding.

It's funny how things change. Ten years ago I was the hardcore techie sniggering at the back of the room, thinking that the managers just didn't get it and that I would never be in their place. Now I'm the Pointy-Headed-Boss - and I'm actually OK with that... this stuff needs to be done, and I'd rather I dealt with it than have my team disturbed.

...I mentioned this in the stand-up, and was of course greeted with an appropriate chorus of cat-calls, ironic cheers, and a cry of "MAAAANAGEMEEENT WAAAAAAANNNNKKKKEEEEEERRRR!" (thanks Peter!)
So it's nice to know that although the faces change, the culture doesn't.

Thursday, August 16, 2007

The Perennial RSS Authentication Dilemma

It's a common problem, one that has cropped up many times for me over the last few years. You build a secure system, locked up behind a login so that only authenticated users can access the tightly-controlled data, and everything's fine - and then you come to the RSS feeds.

Simply put, RSS feeds and the corresponding use-case of syndicating data out of the application into another application - be it a desktop RSS reader, a web-based aggregator, or even another context within the same system - is in direct contradiction to your security. You can't have an RSS reader log in to your app using the standard login form, and most readers certainly don't support cookies, so you have to provide a bypass.... but what mechanism?

I've tried numerous approaches - most often HTTP AUTH (which some readers support, but not many) or an encrypted url token.

HTTP AUTH is somewhat like an old faithful, it's been around forever, every web server and browser worth mentioning has supported it for years, and it's simple to implement. But it has the disadvantage that once authenticated, the only way to log out is to close your browser completely. Also, many RSS readers don't support it.

Encrypting the user's security credentials into a token that you can pass on the URL is guaranteed to work on anything that can pass on a URL correctly, but it has the disadvantage that anyone who gets access to that URL is, to all intents and purposes, that user - so you still have to be careful what you expose in the feed itself.

The main thing, as ever, is to establish exactly what behaviour is "intended". If the brief is for the user to be able to copy-and-paste RSS urls into readers / emails / other sites, then make sure everyone is clear on the implications of that - you're essentially allowing people and / or applications to impersonate a user without going through the login process.


Requirements, of course, vary wildly from app to app, but the approach I've tended to settle on is a combination of multiple methods - if there is a cookie identifying the user, then use that to establish ID, else if you have HTTP AUTH credentials, use them, authenticating appropriately as required. But remember that if you are dealing with automatic requests from readers via HTTP AUTH or encrypted tokens, you should ALWAYS clear out any session variables at the end of the request, otherwise you can quickly end up with thousands of persistent sessions for no apparent reason.
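As a rough sketch of that fallback chain in Python: everything here is hypothetical scaffolding rather than any particular framework's API. The three dictionaries stand in for a session store, a user table and a token table, the cookie and query parameter names are invented, and a real system would compare hashed passwords rather than plain text.

```python
import base64

# Hypothetical stand-ins for real storage.
SESSIONS = {"abc123": "alice"}    # session cookie value -> user
PASSWORDS = {"bob": "secret"}     # user -> password (plain text for the sketch only)
FEED_TOKENS = {"t0k3n": "carol"}  # url token -> user


def identify_user(cookies, headers, query):
    """Return (user, is_automated). Tries cookie, then HTTP AUTH, then url token."""
    # 1. A session cookie means a normal logged-in browser.
    session_id = cookies.get("session_id")
    if session_id in SESSIONS:
        return SESSIONS[session_id], False

    # 2. HTTP Basic credentials, as sent by readers that support them.
    auth = headers.get("Authorization", "")
    if auth.startswith("Basic "):
        try:
            decoded = base64.b64decode(auth[len("Basic "):]).decode("utf-8")
        except (ValueError, UnicodeDecodeError):
            decoded = ""
        username, _, password = decoded.partition(":")
        if username and PASSWORDS.get(username) == password:
            return username, True

    # 3. A token on the url, for readers that support nothing else.
    token = query.get("token")
    if token in FEED_TOKENS:
        return FEED_TOKENS[token], True

    return None, False


basic = "Basic " + base64.b64encode(b"bob:secret").decode("ascii")
print(identify_user({}, {"Authorization": basic}, {}))
```

The second value in the result is the bit that matters for housekeeping: when it is True the request came from a reader, so whatever session was created for it should be thrown away at the end of the request.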

Wednesday, August 15, 2007

Astroturfing in action : social media manipulation exchange

"Astroturfing" is the art of faking "grassroots" - social media manipulation, in other words.

And here is where you can buy, sell, or exchange Diggs, Stumbles, Reddits, blog posts, reviews, you name it....

Monday, August 13, 2007

Apache Proxied Rewrite Character Set Gotcha

Character sets, character sets, I'm perennially besieged by character sets.... I just got to the bottom of a particularly strange character set issue with one of our clients' websites.

Their domain points to our apache web server, which rewrites any requests for collaboration engine urls (e.g. somedomain.org/groups/ or somedomain.org/people ) in order to send them to the collab platform, and rewrites any other urls to go directly to a third-party hosted CMS-driven website.

The problem they were having was that when viewing the live, proxied site, strange characters were appearing in the generated content - e.g. "?" and that funny square box that handily tells you "you got the character set wrong, fool" - but when they looked at the site directly on the third-party CMS, it was fine.

Using Firebug to inspect the headers showed that when viewing the third-party CMS, the Content-Type was:

Content-Type text/html

but when proxied via our Apache web server, it was:

Content-Type text/html; charset=UTF-8

The meta tag in the HTML itself was :

< META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" >

- and the ensuing character set mismatch was causing the dodgy character funkiness.

It turns out that buried deep within the bowels of our Apache config was this line:

#
# Specify a default charset for all content served; this enables
# interpretation of all content as UTF-8 by default. To use the
# default browser choice (ISO-8859-1), or to allow the META tags
# in HTML content to override this choice, comment out this
# directive:
#
AddDefaultCharset UTF-8


Which was doing exactly what it said on the tin - forcing UTF-8 interpretation whatever the META tag said. Rather than changing it in the central config, I overrode it in their specific config with this line :

AddDefaultCharset Off

and it seems to work fine now. So that's a big "yay" for Firebug, a dependent "yay" for Firefox's plugin/extension architecture that allows and encourages third parties to develop these funky plugins, and a big slap round the face with a large fish for anyone who's not using UTF-8 in this day and age....
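The mismatch itself is easy to reproduce. Here's a small Python sketch, using a made-up snippet of page content, of what happens when bytes written as ISO-8859-1 get read as UTF-8 because a header said so:

```python
# Hypothetical page content, encoded the way the CMS's META tag declares it.
page_bytes = "caf\u00e9".encode("iso-8859-1")

# Read the way the META tag says: the accented character survives.
as_declared_by_meta = page_bytes.decode("iso-8859-1")

# Read the way the proxy's Content-Type header insisted: the lone 0xE9 byte
# is not valid UTF-8, so it turns into the U+FFFD replacement character.
as_forced_by_header = page_bytes.decode("utf-8", errors="replace")

print(repr(as_declared_by_meta), repr(as_forced_by_header))
```

The replacement character is exactly the sort of thing that browsers show as a question mark or a funny box.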

Tuesday, July 31, 2007

Rails Migration Version Headache - D'oh!

I've been having real headbanging frustrations with Rails ActiveRecord migrations not picking up the correct target version. On the face of it, this should be about as simple as you can get -

- To determine the current version, the migrate task looks in your database for a table called "schema_info", and reads the value of the "version" field from it. If the table is not there, or is empty, then version is nil, and nil.to_i = 0.

- To determine the target version, the task gets the highest-numbered file from the db/migrate directory and parses an integer out of the filename.

OR

- You can override the target version by passing in a command-line parameter like so:
rake db:migrate VERSION=XXX

So, as the famous "outraged of Tunbridge Wells" might say, why oh why oh why was my migration task always reverting back to previous versions, despite having migration files numbered sequentially right the way up to version 19 (and counting) ?

In vain did I bang my head against the desk, curse migrations and all they stood for, implore the heavens to rain down brimstone upon the head of DHH and all his acolytes, and impugn the parentage of my PC, Windows, Oracle and databases in general....

On a trawl through the Railties source code, it turns out there's a third, more surreptitious way to tell the migrate task which version you'd like to migrate to - if there's an environment variable called VERSION, it will take the value from that. And when I looked at my System Variables in XP, what did I find?

VERSION=3.5.0

Ah.


OK.


My bad...
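For the curious, here's a sketch in Python of the lookup order as I understand it. This is my reading of the behaviour, not the actual Railties code: the function names and migration file names are invented, and I'm assuming the value gets turned into an integer Ruby-style, where "3.5.0".to_i is 3.

```python
import re


def ruby_to_i(text):
    """Rough imitation of Ruby's String#to_i: leading integer, or 0 if none."""
    match = re.match(r"\s*([+-]?\d+)", text)
    return int(match.group(1)) if match else 0


def target_version(env, migration_files):
    """An environment VERSION wins; otherwise use the highest-numbered file."""
    if "VERSION" in env:
        return ruby_to_i(env["VERSION"])
    return max((ruby_to_i(name) for name in migration_files), default=0)


files = ["001_create_users.rb", "019_add_links.rb"]  # hypothetical file names
print(target_version({"VERSION": "3.5.0"}, files), target_version({}, files))
```

Under those assumptions a stray VERSION=3.5.0 quietly becomes "please migrate to version 3", which would explain the constant reverting.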

Thursday, June 28, 2007

A Four-Way Grudge Match: Oracle vs. MySQL vs. Views vs. Derived Tables, eek..!

I like Views. SQL Views are great. Unless, that is, you have to support certainly two (MySQL and Oracle) and ideally three (plus SQLServer) of the major DB systems with the same code. Yesterday I was forced into a dastardly hack that made me wince with its ickiness, but was actually the most practical solution given the constraints.

The Situation

We have a view which combines a LEFT OUTER JOIN with an AVG and a GROUP BY, to "collapse" what may be many related records down into one. The AVG gets an average link weighting, to give you an effective measure of how closely two items are related by all the various different linking mechanisms.

This view references a table - item_links - that is expected to grow very rapidly, to the size of many millions of records.

We INNER JOIN onto this view in application code, to get item records related to the "current" item.


The Problem

On Oracle, this view works really well. Nice and fast, at most a tenth of a second or so per query. Doing an EXPLAIN in Oracle's SQL Developer shows that the execution plan uses all the expected indexes. Luvvly jubbly, bosh, sorted, etc.

On MySQL, however, view support is much less mature. The query optimiser can only use the underlying indexes if the view is created with CREATE ALGORITHM=MERGE and the docs say:
"The MERGE algorithm requires a one-to-one relationship between the rows in the view and the rows in the underlying table. If this relationship does not hold, a temporary table must be used instead. Lack of a one-to-one relationship occurs if the view contains any of a number of constructs:
  • Aggregate functions (SUM(), MIN(), MAX(), COUNT(), and so forth)
  • DISTINCT
  • GROUP BY

...

... so instead, what MySQL does is select all rows into a temporary table, which is then used for executing the statement against.

Doing an EXPLAIN in MySQL shows that for the crucial operation of selecting the required rows from this potentially-huge table, there are no indexes used AT ALL. It is in fact selecting EVERY row into the temp table.


All of which leaves me - and, I must say, in blatant defiance of all known precedent - in the truly strange position of cursing MySQL whilst praising Oracle.....

Which worries me - I've never been here before.... :-)


The Solutions That Won't Work

  1. Get rid of the view and just put the sql into each query,
    i.e. change INNER JOIN vw_item_links to INNER JOIN item_links ON (blah) LEFT OUTER JOIN link_prefs ON (blah) etc etc
    Hmmm.... it's possible, but I would say it's a last resort, because that would mean that we'd then have to GROUP BY every single field in every query, not just the fields in the view. This can get majorly cumbersome, because although MySQL allows you to just group by just-enough-fields-to-get-uniqueness (and thus gets a yay for ease-of-use), Oracle insists on grouping by every single field that's not an aggregate (and just about gets a begrudging yay for insisting on standards compliance)

  2. Eliminate the view and the GROUP BY through using subqueries
    i.e. SELECT (item fields), (SELECT AVG(weight)  FROM item_links WHERE ...) AS weight, (SELECT other_item_id)
    This is a bit of a null option, unfortunately, as we'd still have to INNER JOIN onto item_links in the FROM clause to get the related records, so we're not really gaining anything, and in fact we'd be introducing more overhead through a correlated subquery executed for every row.

So what else can we try? Anyone? Ah yes, you boy, there at the back....


The Solution That Did Work

You can, in fact, use subqueries in the FROM clause to create DERIVED TABLES, like so:
SELECT (fields) 
FROM items
INNER JOIN ( your view definition SQL ) inline_vw_item_links
ON inline_vw_item_links.other_item_id = items.id
WHERE inline_vw_item_links.item_id = ...
ORDER BY inline_vw_item_links.weight DESC

in effect, putting your view SQL into the query as if it were a regular table

This does mean that you lose some of the nice encapsulation of the view, and if the view definition is complex it makes your queries look ugly, but at least it works across both DB platforms. You can also get round the ickiness problem by holding the view definition SQL in a variable :

SELECT (fields)
FROM items
INNER JOIN ( #getViewDefinitionSql()# ) inline_vw_item_links
ON inline_vw_item_links.other_item_id = items.id
WHERE inline_vw_item_links.item_id = ...
ORDER BY inline_vw_item_links.weight DESC


But hang on, there's one more subtlety - (and this is the icky bit)

An EXPLAIN on Oracle shows that this query still uses all the appropriate indexes, as it should. Smashin'.

But MySQL, although it's now recognising that there are indexes there to be used, is still selecting ALL rows from item_links. Which, to be fair, is what the query is *strictly* telling it to do.

So here's the icky hack:

We put the item_links WHERE clause restrictions into the inline view definition SQL, so that it looks like this:

SELECT (fields) 
FROM items
INNER JOIN (
SELECT AVG(weight) AS weight, other_item_id, (etc...)
FROM item_links LEFT OUTER JOIN (blah) ON (whatever)
WHERE item_links.item_id = #my_item_id# AND shared=1
) inline_vw_item_links
ON inline_vw_item_links.other_item_id = items.id
WHERE inline_vw_item_links.item_id = ...
ORDER BY inline_vw_item_links.weight DESC


which you can then prettify a bit for encapsulation like this:

SELECT (fields) 
FROM items
INNER JOIN (
#getViewDefinitionSql( "WHERE item_links.item_id = #my_item_id# AND shared=1" )#
) inline_vw_item_links
ON inline_vw_item_links.other_item_id = items.id
WHERE inline_vw_item_links.item_id = ...
ORDER BY inline_vw_item_links.weight DESC


Now this makes me feel icky. It's (effectively) mixing your selection criteria in with your join criteria, which violates my rules for "good" sql, and goes against what I've always thought of as best practice - letting the DB engine's query optimiser work out the execution plan for itself. And it's just...wrong...

...But it works. An EXPLAIN shows that MySQL is now using the indexes that it should be using, and only selecting the rows that it needs from the potentially-many-millions-of-rows item_links table. And the benefit is blatantly obvious in execution times, turning what was a several-second query execution time into a few tenths of a second. And Oracle is still happy. I suspect that it was previously optimising the view SQL into effectively this query anyway, under the hood. So I'm swallowing my SQL purist pomposity, and going with it. But if anyone can think of a more "proper" way of doing this, I'm all ears.
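The final query shape, with the restriction pushed down into the derived table, can be sketched in Python like this. It's an illustration under several assumptions: SQLite (via sqlite3) stands in for MySQL and Oracle, the table layouts and sample rows are invented, the LEFT OUTER JOIN from the real view is left out to keep it short, and the item id goes in as a bound parameter rather than being pasted into the SQL.

```python
import sqlite3


def view_definition_sql(where_clause=""):
    """The 'view' as a SQL fragment, with an optional pushed-down WHERE."""
    return f"""
        SELECT item_id, other_item_id, AVG(weight) AS weight
        FROM item_links
        {where_clause}
        GROUP BY item_id, other_item_id
    """


def related_items(conn, item_id):
    """Items linked to item_id via shared links, closest (highest weight) first."""
    inner = view_definition_sql("WHERE item_links.item_id = ? AND shared = 1")
    sql = f"""
        SELECT items.id, items.name, v.weight
        FROM items
        INNER JOIN ( {inner} ) v
                ON v.other_item_id = items.id
        WHERE v.item_id = ?
        ORDER BY v.weight DESC
    """
    # The id is bound twice: once inside the derived table, once outside.
    return conn.execute(sql, (item_id, item_id)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE item_links (item_id INTEGER, other_item_id INTEGER, weight REAL, shared INTEGER)"
)
conn.executemany("INSERT INTO items VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])
conn.executemany(
    "INSERT INTO item_links VALUES (?, ?, ?, ?)",
    [(1, 2, 0.25, 1), (1, 2, 0.75, 1), (1, 3, 1.0, 1), (1, 3, 0.0, 0), (2, 3, 1.0, 1)],
)

rows = related_items(conn, 1)
print(rows)
```

Keeping the view SQL in one function is the Python equivalent of the getViewDefinitionSql() trick: the ickiness is still there, but at least it lives in one place.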

Anyone fancy a football game TONIGHT? 8pm, Willesden CCA

Due to some last-minute drop-outs, we're two men short for our 11-a-side match tonight, 8pm, at Willesden CCA. We're desperate enough, in fact, for me to test the power of the online social network and the blogosphere blah blah blah.... to the limit by posting this.

We ideally need a right-back, and a goalie, but to be honest we'd settle for any two reasonably competent players in any position, cause we're all fairly flexible.

So, if anyone fancies it, let me know ASAP!

Things to know:

  • We're not exactly stuffed full of Ronaldinhos, but you should at least be able to control and pass a ball, and run around for a full 90 minutes. Or play in goal!
  • It's on astroturf, so there's no studs and very few sliding tackles
  • I can bring spare shorts, shirt, socks, shinpads and a pair of size 11 trainers if needed - obviously it's best if you have your own trainers

Tuesday, June 26, 2007

How to make yourself feel old, instantly

The baby on Nirvana's "Nevermind" album cover -


- now looks like this:



....sheesh....

Friday, June 22, 2007

"Folksonomy" : The Most Irritating Word On Teh Intarweb!

It's official!

At least, according to YouGov that is.

And the second and third most annoying words on the internet?
"Blogosphere", followed by "blog". "Wiki" came in tenth - so evidently wikis still have some way to go before people are sick to death of them. Maybe as they gain more traction in the enterprise, the word will rise through the ranks?

So next time you trumpet the social software manifesto, bear this in mind...

Thursday, June 14, 2007

When Facial Hair Becomes Compulsory


DSC_5998
Originally uploaded by mccraig
It had to happen eventually - and today it did. By directorial decree, strange facial hair moved from a mere convention to a compulsory requirement at Trampoline. No exceptions!

I was aiming for a Zappa look, but unfortunately it turned out to be alarmingly close to George Michael... look out for me slumped at the wheel of a beemer in a Hampstead layby near you...

Wednesday, May 30, 2007

Last.fm - yours for £140m, squire....

Woah - the Web 2.0 buyout bubble comes to Shoreditch, with our near 'n' dear neighbours last.fm announcing today that they've been bought out by none other than CBS for the eye-watering, pant-moistening, gast-flabbering sum of £140 million.

Congratulations guys! And there we were thinking these kind of deals only happened in California... GIT'chaw web 2 businesses 'ere, mate, get'em while they're hot...

Wednesday, May 23, 2007

Anti-Social Social Software - How far is Too Far?

The social software revolution has been, more or less, about openness. Transparency and freedom have been the order of the day. "Information wants to be free!" goes the mantra, and by and large it's hard to argue against it. All kinds of information that has always been publicly available is being gathered together, cross-referenced and made searchable by the new wave of social software mashups and visualisations, and those who expressed doubt or indignation about this approach have been dismissed as relics of a bygone era, who "just don't get it".

(I'm including myself in this, by the way. Trampoline's Enron Explorer is a perfect case in point. The information was already public domain, and we weren't even the first to visualise it - we just put a Web 2.0 face on it, and got lucky with the timing.)

But now and again there comes an application of this philosophy that makes me wonder how far is too far? Enter the charmingly named www.whosarat.com.

According to the New York Times:

The site was started by Sean Bucci in 2004, after he was indicted in federal court in Boston on marijuana charges based on information from an informant.

Initially free, it now charges between $7.99 for a week and $89.99 for life. You don't have to be as cynical as me to take a guess as to the kind of people who will happily pay to see names and mugshots of the 4,300 informers and 400 undercover agents that the site says it has identified.

So is this too far?
Is it possible to define a clear moral line between a "good" use of public-domain data and "bad" ? Even if it is possible, should we?
Should some usages be permitted but not others? How do we tell the difference?

What is it that actually makes me uneasy about this?

The trouble is that any time you actually try to define what's acceptable and what isn't, in clear unambiguous language, you can always find a counter-example that destroys your mental partitioning of the world and means you have to start again.

The only way I can put my finger on what bothers me about the site is to refer to the intent behind it. Where the intent behind a mashup like Chicago Crime could reasonably be argued to be beneficial to society in some way, Who's A Rat is clearly not.

But even that definition is full of ambiguity. Who's A Rat may well be of immense benefit to certain segments of society (Hal Helms' "Vinny" persona, for one) and Chicago Crime could be argued to be detrimental to, say, property owners in high-crime areas who are looking to sell.

I guess the hoary old cliche applies just as much to information as to people : with great freedom comes great responsibility. We can't on the one hand argue that information needs to be free, yet on the other hand argue that it can only be free for usages that we like. Either the information is out there or it isn't, and I guess that if we want to gain the benefits of information freedom, we have to be prepared to tolerate the drawbacks.

Friday, May 11, 2007

Come Back XP, All Is Forgiven

A couple of months ago, after the burglary, I replaced the nicked big-red-beast of a laptop with a shiny new Core 2 Duo / 2GB of RAM laptop, powerful enough to run SONAR, all its dependencies, and a Java IDE at the same time without dying. It's still heavier than your average laptop, but not quite so much as to generate its own gravitational pull, like the old one. (It's so much more calming to be able to sit down and browse t'intarweb without small objects orbiting your laptop, you know? The accretion disk of pens and paperclips always got in my way, and the time distortions played hell with my google calendar...)

Anyway, at the time I bought it, the *ONLY* OS you could get from Dell was Vista. I ummed and aahed, and debated and discussed, and eventually decided to ignore my own standard policy of wait-till-at-least-service-pack-1-before-installing-any-new-MS-OS and take the plunge. It arrived, and I admit I was pleasantly surprised..... at first. Until I tried to put any software on it.

Firefox often (about half the time) won't start, the process just sits there using ~ 4M of RAM and never even getting the splash screen, and the process can't be killed. I have to reboot the laptop and try again.

The Apache 2.2 service won't start.

The Bluedragon 7 JX service will start, but immediately stop.

Windows Firewall drives me so nuts I had to turn it off to prevent an unfortunate episode involving me, a shopping mall, a chainsaw, and visions of GTA3.

User Account Control - yes, yes, I KNOW it's off! I turned it off deliberately, dammit! Stop telling me about it!

Office 2007 - a brave idea, to depart from the established interface so completely, but where the hell is the "Save As..." command? Where's it gone?! The help file tells you how to use it, but doesn't tell you where it is! Can't I load a document, make some changes and then save it to a different filename anymore? I'm downgrading.

There's a pre-installed copy of CyberLink PowerDVD that I've NEVER run, but seems to be permanently resident and using up to 200MB of RAM - is this a DRM thing..?

Roll on service pack 1....

Wednesday, March 21, 2007

New Trampolinees Needed!

We're hiring! Oh yes indeedy - techies are needed at Trampoline, with experience of some or all of these :

Java, Hibernate, Spring framework, JMS, Ruby, Ruby on Rails, Transactions, Asynchronous systems, Transactional messaging, SQL, MySQL, Oracle, Linux sysadmin, Windows sysadmin

But far more important than that is that they just fit. Come help us drag the enterprise market kicking and screaming into the 2.0 / 3.0 / buzzword-of-your-choice world. It's fun! It's hard work, but it's fun.

Friday, March 16, 2007

"Will Code For Equity" - sound like anyone you know?

A friend of mine has an old CF/Fusebox site based around moving house, and finding companies to help you do that. She wants to completely revamp it and bring it into the 2.0 era with user feedback, recommendations and reputations, etc, and she needs a developer.

Trouble is, although her existing site has been supporting her for two years, she doesn't have much money up front for a dev to revamp and relaunch it.

So the deal she's offering is, basically, take charge of the site and get free rein to do whatever you think is a good idea, in return for a decent chunk of equity and a corresponding share of the profits.

I'm not taking on any side work, I've got my hands more than full with Trampoline, but I told her I'd ask around, so if this sounds like anyone you know (and, crucially, you'd be happy to recommend them - she is a friend, after all, and I don't want to see her get ripped off) please give me a shout. Email address is in the "Other Links" section on the right, just under my gurning fizzog**

Sheesh. Coding for equity - does this sound like Bubble 2.0 to anyone else....?

**Translation for non-Northern people: grinning face

Thursday, March 15, 2007

Finally, I Can Say It!

Phew, at last I can say what's been going on, why I haven't been blogging much recently, etc etc - it's because we've been working so hard to close THIS: Trampoline Systems receives £3 million funding. YAAAAAAAAAAAAY!!! We did it!!! Woo and Yay and quite possibly Hoopla as well! It's going to make a huge difference to the business, and let us step up a level or two in scale and approach.

I can also reveal that Dante was wrong. There is a tenth circle of hell - it's called "Due Diligence" and is full of lawyers. Lots of lawyers asking lots of questions that make you feel like you're being gradually eaten away by a swarm of bees, and make you want to stick your face in a deep fat fryer just to be rid of them.

But you'd have to make sure you kept paper copies, in triplicate, of the license agreement from the fryer company that gave you explicit permission to use it for face-boiling purposes, and prove that you complied with every letter of the fryer license in distributing the bits of hot fat around the room...

So here's a bit of advice. If you think that there's even the slightest possibility of your company ever getting outside investment, or being taken over, or floated in an IPO, make DAMN SURE that you :

  • Keep a record of all open-source software you use, whether you redistribute it or not

  • keep copies of the license agreements for ANY open-source software that you use, whether you distribute it or not

  • keep proof that anything you re-distribute has been distributed in accordance with the terms of the license
    - i.e. if it's an MIT license, you have to provide a copy of the license and copyright notices along with any redistribution, so just keep a copy of what you provide to anyone else

  • read and understand the license terms before you redistribute!
    - this is important, really! For instance, if you provide anyone with a copy of, say, the Oracle OCI client driver, pay close attention to the documentation requirements that it puts on you, and the implicit permission that gives Oracle to audit you at any time.



It's quite straightforward to do this whenever you download some open-source stuff. Just keep a directory somewhere that you can dump license agreements and copyright notices, and keep, say, a Google doc spreadsheet that just records the name of the library, the URL to download it, and the type of license it is under.

If you keep these up to date as and when you download OSS, it only takes a few minutes each time. If, like us (and, I'm sure, most people), you didn't think to do it as you went along, and waited until you were asked for all the information on everything, right now, then you're in for an awfully long slog of awfully tedious work. A little bit of diligence now saves a massive amount of retro-diligence in future.
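If a spreadsheet feels like too much ceremony, the same record can live in a CSV file next to your directory of license texts. Here's a minimal sketch in Python, assuming three columns (library name, download URL, license type) as described above. The file name, the function names and the example libraries are all made up for illustration; none of this is what we actually used.

```python
import csv
from pathlib import Path

# Column names are an assumption: one per item the post suggests recording.
FIELDS = ["library", "download_url", "license_type"]


def record_library(inventory: Path, library: str, download_url: str, license_type: str) -> None:
    """Append one library to the inventory CSV, writing the header on first use."""
    is_new = not inventory.exists()
    with inventory.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(
            {
                "library": library,
                "download_url": download_url,
                "license_type": license_type,
            }
        )


def load_inventory(inventory: Path) -> list[dict[str, str]]:
    """Read the whole inventory back as a list of rows, e.g. to hand to the lawyers."""
    with inventory.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Call `record_library(Path("oss_inventory.csv"), "ExampleLib", "https://example.org/examplelib", "MIT")` each time you download something, and drop the license text into the directory at the same time.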

Now, where's those Trampoline-branded space hoppers, baths full of champagne, and toilet roll stock option dispensers ? Boo.com 2.0, here we come.....

Friday, March 02, 2007

Irish Donkey Sex Case Brings Down Galway First

I'm speechless - there are so many things to love about this story I don't even know where to start, right down to the name of the receptionist and the guy's claim that it was a "super-rabbit"... and is there really such a thing as the "Unlawful Accommodation of Donkeys Act"?

Irish Donkey Sex Case Shocks Net

Was there really so much dubious accommodation of donkeys going on in 1837 that they felt the need to pass a specific law against it?

Thursday, March 01, 2007

[OT] [Rant] That's it, that's the last straw

That's it - we can't pretend any more, it's finally undeniable - Hollywood has officially run out of ideas completely.

Honestly, in the last couple of years we've had execrable remakes of classic Asian films like The Ring, Ju-on (aka The Grudge) and The Eye, and now they're even plundering Battle Royale. I'm not even going to mention the appalling crime against humanity that was the remake of The Wicker Man.

I give up. I really do. Satire has become the new truth. You can't even satirize anymore, there's no point - reality has become the ultimate ironic piss-take of itself. Just take a look at this article from The Onion - U.S. Dept. Of Retro Warns: 'We May Be Running Out Of Past' dated 1997, and try to tell me it hasn't actually come to pass.

The saddest thing is that most westerners will only see the remakes, and so they'll associate the titles of these fantastic movies with the watered-down, over-produced, bastardised Hollywood interpretations that are (as far as I can see) mediocre at best, rather than the classic originals.

Honestly, can anyone give me one original idea that Hollywood has produced in the last couple of years? And yet, it's undoubtedly piracy that's causing declining revenues, oh yes - not the fact that virtually everything they put out these days is utter shite, oh no. It's them evil pirates, yes sir...

*sigh* I need coffee.

Friday, February 23, 2007

StatSVN - Ruby vs. Java and "Developer Of The Month"

I'm a big fan of SVN, and I know it's in very common use throughout the development community, so I thought I'd give a shout out to an interesting project that extracts some fascinating SVN stats and draws lots of pretty pictures - StatSVN


We've now added a StatSVN task to our build scripts, and the figures it kicks out are just great - once you start looking, you can lose yourself in there for ages....


To give some random samples:


  • As of three days ago, there were 75130 lines of code in the Sonar trunk, of which Craig has contributed 33.6%, myself 32.1% and Jan 22.2%. As Jan has mostly been working on the Ruby On Rails interface, and Craig and I have mostly been writing the Java backend, what does this say about the relative expressive efficiency and productivity of each language?

  • Craig took "Developer Of The Month" for February, with 5286 lines, taking over from me in January with 7546 lines.

  • Over 80% of my changes are additions, with under 20% being updates. Craig is more or less the same, with a few percent more modifications - probably due to fixing my occasional "EVIL" quick hacks...

  • By far and away the most commits overall get done between 4pm and 5pm, but personally, I do three times as many commits between 2pm and 3pm as in any other hour. Probably because I keep saying to myself "got to finish this before I go to lunch"...

  • Jan has not done a single commit before 12 noon :-) However, he's the only person to have committed between 1am and 2am!

  • I do a huge number of commits on a Monday (400+), followed by successively fewer every day until Friday (~60). Does this mean I'm fresher and more enthusiastic after the weekend, or is it that when I start the week, I go for the easiest tasks first, and ramp up towards the more difficult tasks as I go on? On the other hand, overall the most commits are done on a Friday, followed by Tuesday


You can also measure some metrics around files, rather than developers:


  • Overall, across all file types, we have an average of around 40 lines of code per file

  • 58.7% of the files in the repository are .java, but they contribute 73.3% of the total lines of code

  • On average, each .rhtml template is just 26.6 lines of code

  • The most amended file in the whole repository is the Rakefile!
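The percentages above can be turned back into rough absolute numbers with a bit of arithmetic. This is a back-of-envelope sketch in Python, not anything StatSVN produces itself. It assumes the quoted shares apply to the 75130-line total, and that the "around 40 lines per file" average is close enough to use as-is; the variable names are mine.

```python
# Figures quoted in the post.
TOTAL_LINES = 75130
SHARES_PERCENT = {"Craig": 33.6, "me": 32.1, "Jan": 22.2}


def lines_for(share_percent: float, total: int = TOTAL_LINES) -> int:
    """Approximate line count for a contributor, given their percentage share."""
    return round(total * share_percent / 100)


approx_lines = {name: lines_for(share) for name, share in SHARES_PERCENT.items()}

# Rough average size of a .java file: .java files are 58.7% of the files
# but hold 73.3% of the lines, so each is proportionally bigger than the
# overall average of roughly 40 lines per file (an approximate input).
AVG_LINES_PER_FILE = 40
JAVA_FILE_SHARE = 58.7
JAVA_LINE_SHARE = 73.3
java_avg_lines = AVG_LINES_PER_FILE * JAVA_LINE_SHARE / JAVA_FILE_SHARE

print(approx_lines)
print(f"average .java file: about {java_avg_lines:.0f} lines")
```

So a typical .java file comes out at roughly 50 lines, against 26.6 for an .rhtml template. Make of that what you will.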

I could go on for hours....but it's nearly 2pm and I need to commit some code, dammit! So install StatSVN on your repository and have a play yourself. It's fun.

Wednesday, February 21, 2007

Roy Orbison In Clingfilm In Print!

(ahem) Allow me to explain that title.... A few years back, myself and some of the Freeserve guys stumbled across the cringe-inducingly hysterical Ulli's Roy Orbison In Clingfilm site, a pastiche of Slash Fiction consisting of a selection of stories involving Roy Orbison turning up unexpectedly, and by increasingly unlikely chains of events, ending up wrapped in clingfilm.

If you haven't read it, and you fancy skirting that fine line between weeing yourself laughing and feeling unaccountably disturbed, go have a look now, and then come back... go on, it's ok, I'll wait...

Back? Good. I'll continue.

"Ulrich Haarbürste" is a nom-de-guerre of Michael Kelly, whose Page of Misery is a veritable treasure trove of warped, cynical, misanthropic comedy gold, and has been the source of several viral emails that you might have had at some point ( French Intellectuals In Afghanistan, anyone? ).

Anyway, when we discovered the delights of Roy Orbison In Clingfilm, I felt so inspired that I set the first story to music, using the Microsoft Agent control to read the text in a Stephen Hawking stylee, and using various Cubase plugins (e.g. Voice Designer) to change the pitch and formants of the voices for Roy and Ulli. It was quite a fun experiment, and once I was done, I sent a copy to Michael. He loved it - and so did a growing number of people - if you Google for "Al Davidson Roy Orbison In Cling Film" it turns up in a surprising number of playlists...

So I'm genuinely chuffed to hear that Michael Kelly has produced a Roy Orbison In Clingfilm book! You can buy it HERE - go on, you know you want to, it'll make a delightful coffee table discussion piece, guaranteed to break the ice at parties, and scare off the neighbours' children too!

I should just add that I'm not getting any referral kickbacks off this - I just think the esteemed Mr Kelly's talent deserves all the exposure it can get, and I hope he gets showered with riches and future publication contracts from his book. Tell your friends. And after the week I've had, a bit of warped humour is exactly what I need. Tell the world! Let's see if we can get Roy Orbison In Clingfilm into some best seller lists!

PS - To celebrate the occasion, I moved the song from its previous home onto a more permanent home at last.fm, along with a couple of other self-indulgent noodly guitar instrumentals. I keep meaning to finish off the hundreds of tunes-in-progress I've got kicking around on my hard drive and upload those too, but somehow there's never quite enough time...

Tuesday, February 20, 2007

The Icing On The Cake

Did someone break a mirror or something?

I spent the weekend clearing up after the burglary and running around like a blue-arsed fly, getting locks replaced and notifying the insurance company, getting some alarms and setting them up, trying and failing to get someone to come and replace the back door with its now-gaping-hole-where-the-catflap-used-to-be, and ending up hacking together a temporary repair with two plastic chopping boards and a big tube of superglue (I kid you not!), and was finally starting to feel secure in my own home again.

So I made a conscious decision to try and relax on Sunday evening. I put my feet up, had a couple of beers, settled into the sofa for a wind-down in front of the TV followed by an early night, when about nine o' clock the phone rang. It was the tenant from one of Lisa's parents' flats.

"Can you ask him to call us urgently?" she said. "There's been an explosion!"

Lisa's parents own the whole building, and rent out the bottom floor to a restaurant, and the two flats above it to private tenants. We used to live in the first floor flat.

As it happens, the restaurateurs had been having some renovation work done, and there'd been a gas leak in the kitchen.... and on Sunday night, it went BOOM. Quite loudly, apparently. The police were round, and needed to get the landlord's details. One of the tenants in the first floor flat was hurt and had to go to hospital. The guy who'd been doing the renovation work has disappeared back to Morocco, and can't be contacted.

I'd already had a couple of beers, so I couldn't drive anywhere that night. So I dashed round there 7am the next morning to take a look at the first floor flat - the gas had obviously gone up the middle of the wall between the kitchen and bathroom, and when it ignited, it blew out that wall on both sides. The kitchen units were wrecked, and there's damage to the wall and ceiling.

Do you ever get the feeling that the universe is trying to tell you something?

At least I've still got my guitars - I feel a country song coming on...!

Friday, February 16, 2007

The Curse Continues....

Every time Lise goes away without me, something bad happens - we joke about it, but I'm not really in the mood for laughing right now. She's currently in Malaysia with her family, eating superhuman quantities of food and visiting relatives, and me..? Well, I got back from work yesterday to discover we'd been burgled.

I knew something was strange as soon as I got back from work about 8pm, because I couldn't get in. My key was turning in the Yale lock, but it wasn't opening. I had to knock on the neighbour's door and ask them to let me through into their back garden, so that I could get into our garden and see if I'd left a window unlocked anywhere, although I was sure I hadn't.

At this point I thought I'd just managed to lock myself out somehow, or something had gone wrong with the lock... I fitted it myself, so it's perfectly possible that I messed it up somehow - I inherited lots of things from my Dad, including a tall slim build, a stubbornly-resilient head of hair and an overpowering love of guitars, but sadly his skill with DIY completely passed me by.

So, bumbling about in the back garden with my neighbour, I tried the back door, and it was open. "Yay for my lax security!" I cried, big slightly-embarrassed grin with the neighbour, gaw-blimey-what-a-muppet-I-am-guvnor, forget-me-own-head-if-it-wasnt-screwed-on expression on my face as I went inside. And it was as soon as I walked into the kitchen that I realised what had happened.

It looks like they kicked in the catflap in the back door, then managed to get to the key, which was in the lock - either with a very long thin arm, or some bent coat-hanger kind of contraption, I don't know. But I'm sure that's how they got in. I figure they must have then gone straight through to the front door and flipped the deadlock switch - so they wouldn't be disturbed if I came home early and tried to let myself in through the front door.

The bastards had torn everything off the shelves and emptied every box onto the floor, in every room, evidently looking for money or small items they could fence quickly, like jewellery. They tore all our clothes out of the cupboards and drawers, went through everything, and trampled lots of back-garden mud into our beautiful fluffy white deep-pile bedroom carpets in the process.

Luckily, they didn't steal much. They must have been in and out within a few minutes. They took :
  • a Nikon Coolpix 6MP digital camera
  • possibly an old Sony Ericsson mobile, but most importantly
  • Lisa's Dad's Toshiba laptop.
  • possibly some of Lisa's jewellery.

Anyone who's seen me present at a London CFUG will probably remember the laptop - it's HUGE (17"), shiny and red, and weighs an absolute ton. So if by any chance you see it cropping up on eBay or in your local Cash Converters, email me.

It could have been far worse, of course. They didn't take:
  • any of my guitar collection
  • my main PC
  • the TV, DVD player, X-box, etc
  • my passport - phew! - despite all my paperwork living in orderly, clearly-labelled box files
  • Lisa's treasured Jimmy Choo shoes
  • any of the climbing/mountaineering equipment, despite going through it all
  • my stupidly expensive mountaineering watch, which I bought with my leaving present (gift vouchers) from Headshift

(Sadly they didn't even take the all-in-one printer that I ordered online ages ago, without realising what a huge great monstrous carbuncle it was going to be, and have regretted buying ever since)

And they didn't leave any nasty surprises on the carpet / up the wall / etc. When I got burgled back in Bridlington (by a guy who I thought was my best friend at the time) they left a nice big turd in the middle of the carpet, and killed the cat in the process. When my mum got burgled a few years ago, they smeared shit up the walls and left what I'll just call a "deposit" in her underwear drawer. I'm thankful that they didn't do any of that in our flat.

It must be a sign of the times that I was far more worried about them stealing my identity and information than my possessions. They now know my name, what I look like, where I live and where I work. I'm still worried about what was on the laptop. Did I have any banking information on there? Did Lise? Too late to check now. Hopefully if they didn't steal my passport, they're not sharp enough to realise what they have, and they'll just format the disk before flogging it on - but you just don't know, do you - and that's going to worry me.

Things like this make you realise, in fact, that when someone breaks in and steals your stuff, it's just stuff - and stuff doesn't actually matter that much. Stuff can be replaced. Mess can be tidied up - it took me until 2:30am this morning, but I cleared it all up. What matters is people, and I'm just grateful that Lise wasn't here when they came, and that she didn't get the huge shock I did when I walked in, and that she didn't have to see the mess they made of our beautiful home.

What gets me, though, is that feeling of being violated. "An Englishman's home is his castle" as the saying goes, and my castle feels...soiled. Fuckers.

I've now got a fun-packed day to look forward to today. The police were too busy to come round last night, so they're coming this morning, probably just to take my particulars and give me a crime number for the insurance. I'm sure they've got more urgent priorities, like trying to stop fifteen-year-olds shooting each other, which seems to be the preferred pastime-du-jour. I've got to file the insurance claim, get the locks changed, and try to get the back door replaced, if I can.

And I also have to tell Lise, and her Dad, over a dodgy mobile phone reception and an 8hr time difference, and I just know that it's going to break her heart. It's going to ruin the remaining week of her holiday, and she's going to worry herself sick every day until she comes home.

So let this be a cautionary tale. Never leave your key in the lock! And if you, like me, keep looking at your door or windows and thinking "hmm, really ought to get round to getting that properly fixed-up at some point..." then get round to it TODAY! OK? Ok...

Thursday, February 08, 2007

DevNotes On AJAX

As a long-time user of the FLiP methodology, there have been many times where I've created a prototype and used Hal Helms' DevNotes as a plonk-and-play way of capturing clients' thoughts and feedback.

It's a nifty little tool that does the job it sets out to do - but since it was released way back when, teh intarweb has moved on, and I've had a few niggly things that I had to work around when using it :

  • it's rendered using "old-school" HTML, with things like FONT tags

  • sometimes you can get conflicts between the DevNotes form submissions and parameters that the containing prototype app requires

  • the dependency on cfx_make_tree.dll didn't play nicely with a non-Windows OS


So, while noodling around over the weekend recently, I decided to give it a bit of a facelift. One thing led to another, and it ended up as a pretty major update. So laydeez and gennullmun, allow me to present....(drum roll)

DevNotes On AJAX!


Main Changes:


  • Removed dependence on CFX_Make_Tree.dll,
    replaced with udfMakeTree.cfm from cflib.org, by Michael Dinowitz
  • Ported to be based on Ajax requests,
    de-coupling the DevNotes requests from the Prototype page requests, and allowing people to stay on the same Prototype page while adding notes
  • Added RSS syndication of notes.
  • Prefixed all request variables and JS functions with DEVNOTES
    to prevent any potential conflicts with the containing app
    Required parameters in the containing app are now:

    // these fields required for DevNotes
    request.DEVNOTES.devAppName = "UniqueAppname";
    request.DEVNOTES.attributeToKeyOn = "page";
    request.DEVNOTES.DevNotesDSN = "YourDSNName";
    request.DEVNOTES.devnotesRSSTitle = "Title for the RSS feed";
    request.DEVNOTES.DevNotesRSSDescription = "Description for the RSS feed";

  • Updated HTML to be more semantic and standards-based.
    Admittedly it might not be 100% XHTML-compliant, but I don't really have time to check it on that - be my guest...
  • Introduced CSS styling
    and grayscale mini icons from Brand Spanking New
  • Re-factored the code into a fusebox3-stylee app structure.

Example usage:


Extract the archive into a subfolder called "DevNotes" under the root of your prototype app.

Make sure that the following variables are set up somewhere in your app - fbx_settings.cfm is usually the best place :


// these fields required for DevNotes
request.DEVNOTES.devAppName = "UniqueAppname";
request.DEVNOTES.attributeToKeyOn = "page";
request.DEVNOTES.DevNotesDSN = "YourDSNName";
request.DEVNOTES.devnotesRSSTitle = "Title for the RSS feed";
request.DEVNOTES.DevNotesRSSDescription = "Description for the RSS feed";


Then in your prototype's LAYOUT file, just add
<cfinclude template="../DevNotes/DevNotes.cfm" />
(replace the path as necessary)

NOTE: don't put this in OnRequestEnd.cfm, as it will get triggered for each AJAX request and land you in an infinite recursive loop! The layout file is the best place for it, as it's display-generating code.

Usual usage policy : you can do what you like with this code, so long as
a) you maintain the attribution comments
b) you accept that it's provided entirely as-is, with no warranty of any sort
c) you don't violate any terms in the other code that it includes (see credits within the code)
and most importantly
d) you don't have too much of a go at me for not commenting it particularly well!


Enjoy !

DevNotes On AJAX

We Have A New Paperweight


New Paperweight
Originally uploaded by Dr Snooks.
We scooped a TrampoGong last night - the Oracle Partner Innovation Award for 2006. Yaaaay! Although, after sampling the varied delights of the hospitality bar, it's hard to tell whether the award or the coffee was more appreciated first thing this morning....

Wednesday, February 07, 2007

Oracle, Sonar and Schnow, oh my

'ello, yes, I am still alive... bleh - been busy busy busy recently, it's been quite a week or two....

We had our first major deployment of Sonar to get out, so none of us got home before midnight until Thursday. Eventually got it done, though, to huge relief and slaps on the back all round, thanks to some Herculean efforts from everyone on the team.

After a week like that, I needed to get right away from it all, so a weekend in the mountains of Scotland for some ice climbing was just the job. I learned quite a lot over that weekend -

- supporting my 12-and-a-half-stone on four tiny metal spikes sticking out of my toes gets pretty damn painful on the calves after four hours or so
- just because the weather is gorgeous right now, doesn't mean that it's not going to be bloody grim soon
- and many more

I got back to work yesterday, with just one day to recover before tonight's extravaganza at the Oracle Partner Innovation Awards - we've been nominated for the "Hottest Tech Prospect In The UK" award, so we're off to meet 'n' greet 'n' press the flesh with the industry bigwigs.

...which was nice...

Hopefully some point soon I'll be able to get back to some technical posts on this 'ere blog, as I have a couple of goodies to impart (even one CF-related, shock, horror) and I'll post them as soon as I get chance to package them up.

Monday, January 22, 2007

"Fed-Ex" Social Engineering / ID Theft Scam?

At 7am this morning I was woken up by the phone ringing. I didn't make it to the phone in time, and it rang off. "Who the hell is ringing me at 7am?" I thought, in my usual semi-comatose, semi-neanderthal pre-coffee state, but a 1471 said "we do not have the caller's number". Hmmm....

Twenty minutes later it rang again. This time I got there in time to pick up -

"Hello?" I said.
(Well, admittedly, at that time in the morning it was probably more like "blrglhmph?", but you'll have to perform the necessary transliterations yourself for the rest of this transcript)

A man with a very strong Indian-type accent replied:
"Hello, can I speak to Mr. Alistair Davidson?"

"Speaking..."

"Is your address [my address and postcode] ?"

"Yes...."

"This is Fed-Ex, we have a package to deliver to you this afternoon"

"Oh ok..." ...so why do you need to ring me at 7am about it? I thought

"I need to confirm some security details - can you give me Mr Alistair Davidson's date of birth?"

Now at this point my tinfoil-hat alarm bells started ringing -

  • why would Fed Ex possibly want my date of birth in order to deliver a package?

  • come to think of it, how would they know my telephone number?

  • in fact, how did I know this guy was from Fed Ex at all? He said he was calling from Fed Ex, but I could ring up anyone at random and claim to be Kylie Minogue, it wouldn't make it true...



I decided to play it cautious -

"Why do you need my date of birth?" I asked

"I need to confirm these security details to deliver the package" he said

"Well I'm not going to give that information out," I replied

"But we won't be able to deliver this package" he was starting to get a little tell-tale this-isn't-the-way-it's-supposed-to-go tone of voice

"Ok," I paused, thinking - in fact, how did they get my number? I've never sent anything by Fed-Ex, but I'm fairly sure they won't require any information about the addressee other than the address? Now I was really suspicious - time to challenge back....

"So who's this package from?" I asked

"Er - it's a cash, door-to-door delivery, I can't tell you who it's from" he replied. Well, I'll give him a 5.7 on artistic impression for speedy improvisation, I thought, but there was a definite hesitation there, for which the Russian judges would mark him down on technical merit. Besides, if they'll happily accept packages for delivery without any information about the sender, then they certainly wouldn't need to know anything about the recipient.

"So can you tell me Mr Alistair Davidson's date of birth?" he returned to what I was increasingly convinced was his script.

"No, I'm sorry, I'm not going to give that information out," I said firmly.

...and he hung up on me. A swift 1471 showed that he'd suppressed the number before he called.

OK, so there's a couple of possibilities here -

  1. He really was from Fed Ex, and they really do have a policy of incredibly bizarre security procedures, ringing their addressees up at 7am, and hanging up on their customers. Call me Mr Idealistic, but I think that's unlikely

  2. He, or an associate, got my name and address from somewhere - maybe from something as simple as a spam letter I threw away, or come to think of it, I'm in the phone book - and decided to try some phishing.


If I was a betting man, I'd put money on the second.

Which makes me wonder -

  • Does he / do they normally try this on US numbers? We don't really do Fed-Ex here in blighty, any delivery is far more likely to be via Parcel Force or even UPS.

  • Do they (I'll assume for now that there's more than one of these scammers) deliberately call early in the morning, to get people while they're fuzzy and bleary and not thinking straight?

  • How many other people have they managed to get identity details from like this?


I'm a suspicious sod, and paranoid about my personal information - it's a known side effect of having worked in "teh intarweb" for too long :-) - but I know plenty of people - like my mum - who will be incredibly dubious about giving their credit card details to an online retailer, but will quite happily answer any question asked by any random punter who rings them up and says "hi, this is (say) HSBC, can we check your account number and sort-code please?"

It's a sad fact that the weakest link in any security system is the people involved, and until there's a fundamental shift in human nature, that's unlikely to change any time soon.

Monday, January 15, 2007

In Defence Of Hungarian Notation

I started typing a comment on Pete Bell's post Why Not Hungarian, but it got a bit too long so I've put my two-penn'orth here.

Hungarian notation, for me, can be very useful, but maybe not in the form that's most commonly understood. I've heard the argument that the original intent behind Hungarian notation was to declare what KIND OF THING a variable is - but this got mistranslated into what TYPE a variable is, and the two are not the same -

For instance, in the last big CF project I did, I took to declaring strings that contained markup with a "htm" prefix, and strings that may have character codes - such as from textarea input - with a "raw" prefix. Similarly URL-encoded strings get an "enc" prefix. They're all strings, and traditional Hungarian notation might give them all the same prefix - but if you do it this way, as soon as you start typing a line like:

<cfset htmContentForDisplay = rawInput />

you immediately know that something is wrong. The variable prefixes themselves make you think "Hmm - that's not going to work.... I need to do some intermediate processing to convert the formats there"
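To make that "intermediate processing" concrete, here's a rough sketch of the same naming idea. It's in Python rather than CF, purely so it can be run standalone, and the helper names (to_htm, to_enc) and the sample input are made up for illustration - they're not from the project mentioned above. The point is only that every assignment crossing from one prefix to another goes through a conversion:

```python
import html
from urllib.parse import quote


def to_htm(raw_text):
    """Turn raw user input into text that is safe to drop into markup."""
    return html.escape(raw_text)


def to_enc(raw_text):
    """Turn raw user input into a URL-encoded string (nothing left unescaped)."""
    return quote(raw_text, safe="")


# "raw" prefix: straight from a form field, may contain anything
raw_comment = "<b>Tom & Jerry</b>"

# "htm" prefix: only ever assigned from something that produces markup-safe text
htm_content_for_display = to_htm(raw_comment)

# "enc" prefix: only ever assigned from something that URL-encodes
enc_query_value = to_enc(raw_comment)

print(htm_content_for_display)
print(enc_query_value)
```

With names like these, a line such as htm_content_for_display = raw_comment looks wrong the moment you type it, because the prefixes on either side of the equals sign don't match and there's no conversion in between.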

In a strongly-typed, pre-compiled language like Java or C++, Hungarian notation can just get in the way - you've got the compiler as a type-safety net, and anyone who's ever tried to do anything meaningful with MFC will certainly testify to the horrors of things like "lpszhwndmyWindowHandle". But I think that in loosely-typed languages such as CF, Hungarian notation has its place if you use it properly - remember, "a Hungarian is the only man who can follow you into a revolving door and come out first" :-)