Tuesday, December 12, 2006

Trusting The Magic Pixies - Hibernate HQL

<rant>
I'm having trouble trusting the Magic Pixies. I admit, it's a common theme of mine, and maybe it's just NIH syndrome-by-proxy, but Magic Pixies - able and willing though they are - can only do your job for you if you ask them in just the right way. And sometimes they seem like they're being deliberately dumb and obstinate little buggers that just won't do as they're told, dammit!

(ahem) Perhaps I should elaborate here...

The Magic Pixies in question here are the ones that do the "hard" work for you in Hibernate, the near-as-dammit de facto standard ORM system for Java. Or Persistence Management. Or whatever term you prefer. Pete Bell has been blogging in great detail about the process of writing his own cf-based ORM framework, so if you're not familiar with ORM systems then you're probably best going to check out his blog, because this is a pure and simple straightforward bit of spleen venting.

The basic idea of Hibernate is that you can write your object model just as you choose, and then map your objects to persistent entities using (surprise surprise) an XML config file - or, if you're using Java 5 and EJB 2 and Hibernate 3, then you can dispense with the XML config file (yay!) and use annotations instead (double yay!) So long as you construct your mappings correctly, then you "don't need to worry about" the SQL - the Magic Pixies of Hibernate will auto-generate your db schema and SQL queries, and magically do your CRUD for you with a sprinkling of their Magic Pixie Dust™.

"But what if I want to do something a bit more complex than basic CRUD?" I cried.

"Like what?" said the Magic Pixies

"Like a left outer join?" I replied

"Oh, you don't need to worry about all that nasty SQL" said the Magic Pixies, " because that would tie your code to your database system, and that's a BAD THING! BAD Al! BAAAAAD!"

"Gosh, sorry Magic Pixies," I said, rather sheepishly, "I promise to use that nice abstracted Criteria API that you so generously provided in future"

"And so you should!" said the Magic Pixies, "Remember, you do the code, we do the data, otherwise we'll have harsh words with our union rep, OK?"

"Ok! Ok!" I said. "Now could you stop hitting me with that rolled-up newspaper please?"

"So long as you promise to be good"

"I do! I do!"

"Ok then"

"But what if I want to do something more complex than that?" I asked.

The Magic Pixies looked a bit puzzled.

"What on Earth could you want to do that's more complex than that?" they replied.

"Well, what if I wanted to find a set of objects of type X that didn't have any corresponding objects of type Y that match certain criteria?"

"Pfffft!" said the Magic Pixies, " that's easy, you just create Criteria along the association paths!"

"Er, huh?" I said.

"You just create a Criteria object for class X, and then create another Criteria object using that Criteria object by passing the name of the property of class X which refers to the encapsulated class Y contained within class X!"

"Huh?" I said.

"Or if you really want to, you can use HQL"

"HQL?"

"Yes, Hibernate Query Language. It's almost like SQL, but not quite. Because we wouldn't want you using SQL - SQL's tied to database platforms, and that's BAAAAAD"

"So how do I do it HQL?"

"You create a query that queries along the association paths and properties of the objects"

"Oh, right, OK" I said. "But what if class X doesn't have class Y as a property?"

"Er..... huh?" said the Magic Pixies.

"Class X has no property that refers to class Y"

"Well then, you won't need to query for it, will you?" said the Magic Pixies, a touch too smugly for my liking.

"But I do!" I insisted.

"Er.... huh?" said the Magic Pixies.

"Well, say if I had a table of ItemLinks...." I began.

"A WHAT of ItemLinks????"

"Sorrysorrysorry! I mean an ItemLink object..."

"That's better!"

"...that represented a weighted link between two Items, such as might be calculated by some very complicated fuzzy logic and Natural Language Processing"

"OK"

"...and a separate LinkPreference object that represented an preference expressed by a Person as to whether their ItemLink to a particular object would be public or not"

"Erm... can you give us an example?"

"Sure - this clever NLP stuff might detect that Bob from SysAdmin has been talking a lot about clustering database servers, and he might want to share that link so that he is known as an expert in that field."

"OK, with you so far..."

"But it might detect that the boss has been talking with his secretary about a dirty weekend in Brighton, and they really wouldn't want that shared at all, would they?"

"Erm, isn't that an outmoded stereotype that just reinforces age-old gender-typecast notions of sycophantic star-crossed secretaries as prey for the equally-stereotypical notion of amoral boss-as-alpha-male-predator?"

"Alright, alright, but you get the idea!" (that Magic Pixie was really starting to get on my titty ends)

"Yes, I follow you"

"So what if I want to query for all ItemLinks that have been created in the last, say, two weeks, and that don't have a corresponding LinkPreference?"

"Well, you could query for ItemLinks that have LinkPreference set to null"

At this point I was starting to snort quite heavily.

"But I told you, ItemLink doesn't have a property that refers to LinkPreference! The two are completely independent!"

"Well then you shouldn't want to query for them" said the Magic Pixies

"But I DO!"

"Well, can't you follow the association path up from ItemLink to Item and then down to LinkPreference?"

"Well, yes, I could, but wouldn't that result in the Items table being read in the query when there's absolutely no need for it?"

The Magic Pixies looked down at their feet

"...might do..."

"And isn't that horribly inefficient?"

They started fiddling with their shorts

"...might be..."

"And it's not that simple anyway, because it's a compond join on TWO properties!"

"...yes..."

"WHAT was that?"

"yes!" said the Magic Pixies, with bottom lip sticking out.

"So can you perform this raw SQL query in your own way?"

SELECT item_links.* 
FROM item_links
LEFT OUTER JOIN link_prefs
ON item_links.item_id = link_prefs.owner_item_id
AND item_links.other_item_id = link_prefs.linked_item_Id
WHERE
item_links.created_at < ?
AND link_prefs.shared IS NULL



"...might do, if you ask us nicely..."

(sigh) "OK, can you pleeeeeeease do it?"

They conferred for a moment in hushed whispers, and then turned back with a very smug-looking smile, and said

"Yes, we can - but we're not going to"

"WHAT?"

"You have to ask us in the right way"

Steam was starting to emerge from my ears

"And what IS the right way to ask you?"

They grinned even wider

"We're not going to tell you!"

And I stormed out of the room.

You see, the trouble I have with ORM systems is that they're all well and good as far as they go, and yes they can save large amounts of "donkey work" But sooner or later you nearly always come up against something that would be almost trivially easy to do with raw SQL, but the nice insulated ORM abstraction just can't deal with. I know that I'm probably looking at this from the "wrong" direction, I'm thinking about the data rather than the objects, but until the Magic Pixies start to play a bit more nicely, I'm always going to be a bit suspicious of them.
</rant>

(deep breaths..... calm.... happy thoughts..... nearly Christmas....)

12 comments:

Anonymous said...

Best lunch time reading in a while :P

'sides I lilke the "donkey work" sometimes you can save it for a rainy day when the "challenging" stuff gets boring

Alistair Davidson said...

:) Me too, I'm feeling a bit of a nostalgic twinge for the days of donkey work - mainly because I worry that if I get out of the habit of doing it, I'll get rusty.

Alistair Davidson said...

Ah-hah!

Cracked it:

String hql = "from ItemLink AS itemLink"
+ " WHERE "
+ " itemLink.createdAt < ?"
+ " AND NOT EXISTS( "
+ " from LinkPref AS pref "
+ " WHERE pref.ownerItem.id = itemLink.item.id "
+ " AND pref.linkedItem.id = itemLink.otherItem.id "
+ " AND pref.shared = true"
+ " )"
;

Query q = getSession().createQuery( hql );

q.setTimestamp(0, maxCreatedDate);

return q.list();

Anonymous said...

As I was reading, I was thinking "just use an HQL subquery", and sure enough...

It's worth mentioning that you can also use SQL criterions in your Criteria queries. Obviously the magic pixies don't like it, but they were nice enough to expose the raw databse for certain edge cases that simply can't be expressed with the built-in constructs. In this case, the HQL is a better solution, because it absolves you from writing the SQL (thereby keeping you in the object model), but not always possible.

Alistair Davidson said...

Yeah, I was trying to avoid using subqueries, as not all DB platforms support them, but in the end it just had to be done.

Raoul Duke said...

Wow. I'm not sure if this makes me feel better or worse about having to learn Hibernate. But, nevertheless, thanks for writing it, it is good to know not everybody is Hibernate-drunk.

blogOmatic said...

HoustonPoolService

Anonymous said...

For information about Street Magic Tricks

Secrets to performing hundreds of street magic tricks are revealed in the Street Magic Revealed ebook. One of the best things about the Street Magic Revealed ebook is...Read the full article at Street Magic Revealed Ebook Review

Unknown said...

Yeah, you can use subqueries to do what you propose in the article (that is, simply to find out if such a link *exists*), but what if I actually want the select list to include properties on the left-outer-join'd table? No-can-do with HQL and especially not with Criteria. This absolutely infuriates me at the moment because it means I have to drop all the way back down to a native query. HQL supports a theta-style join with a WITH clause for specifying arbitrary join conditions on non-associates entities, but that is inherently an inner-join. Why couldn't they have made an outer join that does this?

Unknown said...

... non-associate*d* ... (in case that confuzzled you)

Nikunj said...

What if I want a full join. I have a requirement to join two tables unrelated w.r.t. hibernate. I want something like:

List list=session.createQuery("select a,b from A a full join B b on a.col1=b.col1 where VERY LONG WHERE CONDITION").list();
list is a list of Object[]. Object[1] should be null when it does not have a link and viceversa.

If I user 1 inner join and two joins using "NOT EXIST" clause, I end up executing 3 queries with the same where condition.

Unknown said...

I completely agree with this. I've been banging my head against "Hibernate in Action" for weeks over a similar situation with multijoins coming in from two vectors. Right now, hibernate is on my hit list