Here's a little SQL test that arose here the other day. Let's say you have a DB table called documents, in which you have the following fields:
full_content is a BLOB, containing all the text extracted from whatever document the record represents.
bytesize is, cunningly enough, the size of the content in bytes.
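To make the setup concrete, here's a minimal stand-in for that table using an in-memory SQLite database. The `id` primary key and the exact column types are my assumptions for illustration; the post only names `full_content` and `bytesize`:

```python
import sqlite3

# In-memory SQLite stand-in for the real documents table.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE documents (
        id           INTEGER PRIMARY KEY,  -- assumed key, not named in the post
        full_content BLOB,                 -- all text extracted from the document
        bytesize     INTEGER               -- size of full_content in bytes
    )
    """
)
```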
You want to get all the documents and do something to them - what exactly is not important here, just that it involves the full_content. However, a moment's pondering reveals the nasty part of this problem... you may have tens of millions of documents, and the average bytesize is around a meg. Clearly, attempting to read the full table into memory is not a good idea (at least until we get terabytes of RAM in our servers).
A far better idea is to cap memory usage at some value, and only process up to X MB of documents in one batch. The next time round, you get the next X MB of docs, and so on.
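The obvious way to batch like that is the client-side, multi-query approach - fetch rows ordered by id, accumulate sizes until you hit the cap, then start the next trip where the last one left off. Here's a sketch of that naive approach in Python against SQLite (the `id` column and the `batches` helper are my inventions for illustration; this is explicitly NOT the single-query answer the puzzle is after):

```python
import sqlite3

def batches(conn, max_bytes):
    """Yield lists of (id, full_content) rows, accumulating at most
    max_bytes of content per batch, tracked on the client side.
    This is the naive multi-query approach, not the puzzle's answer."""
    last_id = 0
    while True:
        batch, used = [], 0
        cur = conn.execute(
            "SELECT id, full_content, bytesize FROM documents"
            " WHERE id > ? ORDER BY id",
            (last_id,),
        )
        for doc_id, content, size in cur:
            # An oversized single document still gets a batch of its own,
            # otherwise the loop would never advance past it.
            if batch and used + size > max_bytes:
                break
            batch.append((doc_id, content))
            used += size
            last_id = doc_id
        if not batch:
            return
        yield batch
```

With a 25-byte cap and five 10-byte documents, this yields batches of two, two, and one. It works, but it issues one query per batch - exactly the sort of thing we were trying to avoid.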
We turned this problem over and over for a while, and we thought of a few ways of doing it, but all involved multiple queries, and/or things like temporary tables... and, dammitall, it SHOULD be doable in one query, dammit man! (cue much indignant harrumphing and twizzling of large mutton-chop side whiskers, in a Victorian man-of-letters kind of way) Oh yeah, and here's the killer - it has to be portable across Oracle 10g and MySQL.
So the problem is this - given this table, can you construct a single query that will read all the documents up to a given maximum total size in megabytes, and that will run on both Oracle 10g AND MySQL?
Well, after pondering this in several isolated attempts over a few days, I went back to it yesterday, and the answer just popped out straight away - and it's actually surprisingly simple. I was going to just give you the answer, but I thought it might be a bit more fun to leave it as a challenge, so a big fat Brucie Bonus goes to the first person who gets it - I'll put the answer up here later on today, if no-one gets it in the meantime.