Monday, August 17, 2009

Styling the Rails auto_complete plugin

Over the weekend I was having a fiddle around with the layout on Cragwag, as the existing design was a bit, shall we say, emergent. I'd been trying to avoid a LHS column, because everything always seems to end up with one, but in the end I had to give up and just go with it. The auto-tag cloud just makes more sense over there, and there's no point fighting it.

Anyway, the next question was what else to put there? I was looking for an intermediate stop-gap search until I got round to putting a proper Lucene-based search in there, and so the thought struck - I'd been looking for a way to browse tags that weren't necessarily in the top 20 (e.g. show me all the posts about Dave McLeod's 2006 E11 route), so why not try an auto-suggest tag search?

So I found DHH's auto_complete plugin (it used to be part of Rails core until version 2) and got to work. This should be easy, right?

Well...

I'll cut out the frustrations and skip to the solutions. :)

The documentation on this plugin is virtually non-existent, and there are some extra bits you'll need - but I found this post very helpful.

One minor irritation I found was that it writes out a bunch of CSS rules directly into your HTML inside a style tag - including a hard-coded absolute width of 350px (see the auto_complete_stylesheet method).

Argh! said I, as my search needed to fit into a 150px width.

So how can you get round this? Simple - you can override the plugin's embedded CSS from your own stylesheet, provided you use a CSS selector with higher specificity.

Now, CSS specificity can be a fairly complicated topic, but I usually just remember it like this - if you've got two rules that apply to a particular thing, the more specific rule wins.

In this case, the selector that the plugin writes into the page:
div.auto_complete {
  width: 350px;
  background: #fff;
}

gets trumped by Cragwag's more specific selector:
div#content div#lh_sidebar div.auto_complete {
  width: 150px;
  background: #fff;
}


The first applies to any div with class="auto_complete", but the second applies only to divs with class="auto_complete" which are inside a div with id="lh_sidebar" which is inside a div with id="content". So that's a more specific rule, so it wins.
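If you want to see the scoring rather than take my word for it, here's a rough sketch in Ruby. The specificity helper is something I've made up for illustration (it isn't part of the plugin or of Rails), and it only understands ids, classes and element names, which is all these two selectors use. It counts each kind and returns them as [ids, classes, elements]; the selector with the bigger triple wins, comparing ids first, then classes, then elements.

```ruby
# Hypothetical helper for illustration only: a rough specificity score for
# simple selectors made of ids, classes and element names.
def specificity(selector)
  ids     = selector.scan(/#[\w-]+/).size
  classes = selector.scan(/\.[\w-]+/).size
  # Strip the ids and classes, then count what is left as element names.
  elements = selector.gsub(/[#.][\w-]+/, '').scan(/[a-zA-Z][\w-]*/).size
  [ids, classes, elements]
end

plugin_rule  = specificity('div.auto_complete')
cragwag_rule = specificity('div#content div#lh_sidebar div.auto_complete')

# Arrays compare position by position, so ids outrank classes,
# and classes outrank element names.
winner = (cragwag_rule <=> plugin_rule) > 0 ? 'cragwag' : 'plugin'

puts "plugin:  #{plugin_rule.inspect}"   # [0, 1, 1]
puts "cragwag: #{cragwag_rule.inspect}"  # [2, 1, 3]
puts "winner:  #{winner}"                # cragwag
```

If the two scores come out equal, the rule that appears later in the page wins instead.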

Yay!

Wednesday, August 12, 2009

OutOfMemoryError in ActiveRecord-JDBC on INSERT SELECT

During some scale testing the other day, we came across this unusual / mildly amusing error in a database-bound command that just funnels INSERT SELECT statements down the ActiveRecord JDBC driver:


java/util/Arrays.java:2734:in `copyOf': java.lang.OutOfMemoryError: Java heap space (NativeException)
from java/util/ArrayList.java:167:in `ensureCapacity'
from java/util/ArrayList.java:351:in `add'
from com/mysql/jdbc/StatementImpl.java:1863:in `getGeneratedKeysInternal'
from com/mysql/jdbc/StatementImpl.java:1818:in `getGeneratedKeys'
from org/apache/commons/dbcp/DelegatingStatement.java:318:in `getGeneratedKeys'
from jdbc_adapter/JdbcAdapterInternalService.java:668:in `call'
from jdbc_adapter/JdbcAdapterInternalService.java:241:in `withConnectionAndRetry'
from jdbc_adapter/JdbcAdapterInternalService.java:662:in `execute_insert'
... 25 levels...


Now, there's a couple of things here that are worth pointing out.

  1. I REALLY LOVE the fact that it blew heap space in a method called ensureCapacity. That makes me smile.
  2. Why is it calling getGeneratedKeys() for an INSERT SELECT?


The getGeneratedKeys() method retrieves all the primary keys that are generated when you execute an INSERT statement. Fair enough - BUT the issue here is that we'd specifically structured the process and the SQL statements involved so as to be done with INSERT SELECTS, and hence avoid great chunks of data being transferred backwards and forwards between the app and the database.

It turns out that the ActiveRecord JDBC adapter is doing this :
(lib/active_record/connection_adapters/JdbcAdapterInternalService.java)

@JRubyMethod(name = "execute_insert", required = 1)
public static IRubyObject execute_insert(final IRubyObject recv, final IRubyObject sql) throws SQLException {
    return withConnectionAndRetry(recv, new SQLBlock() {
        public IRubyObject call(Connection c) throws SQLException {
            Statement stmt = null;
            try {
                stmt = c.createStatement();
                stmt.executeUpdate(rubyApi.convertToRubyString(sql).getUnicodeValue(), Statement.RETURN_GENERATED_KEYS);
                return unmarshal_id_result(recv.getRuntime(), stmt.getGeneratedKeys());
            } finally {
                if (null != stmt) {
                    try {
                        stmt.close();
                    } catch (Exception e) {
                    }
                }
            }
        }
    });
}


...in other words, explicitly telling the driver to return all the generated keys.
Hmm, OK, can we get round this by NOT calling the execute_insert method, and instead calling a raw execute method that doesn't return all the keys?

Well, no, unfortunately, because it also turns out that the ruby code is doing this:
(activerecord-jdbc-adapter-0.9/lib/active_record/connection_adapters/jdbc_adapter.rb)

# we need to do it this way, to allow Rails stupid tests to always work
# even if we define a new execute method. Instead of mixing in a new
# execute, an _execute should be mixed in.
def _execute(sql, name = nil)
  if JdbcConnection::select?(sql)
    @connection.execute_query(sql)
  elsif JdbcConnection::insert?(sql)
    @connection.execute_insert(sql)
  else
    @connection.execute_update(sql)
  end
end


...and the JdbcConnection::insert? method is detecting if something's an insert by doing this:
(JdbcAdapterInternalService.java again)

@JRubyMethod(name = "insert?", required = 1, meta = true)
public static IRubyObject insert_p(IRubyObject recv, IRubyObject _sql) {
    ByteList bl = rubyApi.convertToRubyString(_sql).getByteList();

    int p = bl.begin;
    int pend = p + bl.realSize;

    p = whitespace(p, pend, bl);

    if(pend - p >= 6) {
        switch(bl.bytes[p++]) {
        case 'i':
        case 'I':
            switch(bl.bytes[p++]) {
            case 'n':
            case 'N':
                switch(bl.bytes[p++]) {
                case 's':
                case 'S':
                    switch(bl.bytes[p++]) {
                    case 'e':
                    case 'E':
                        switch(bl.bytes[p++]) {
                        case 'r':
                        case 'R':
                            switch(bl.bytes[p++]) {
                            case 't':
                            case 'T':
                                return recv.getRuntime().getTrue();
                            }
                        }
                    }
                }
            }
        }
    }
    return recv.getRuntime().getFalse();
}


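...in other words, if the SQL starts with the word INSERT (ignoring leading whitespace and case), then it's an INSERT, and should be executed with an execute_insert call. An INSERT SELECT starts with INSERT too, so it gets exactly the same treatment.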
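To make the routing concrete, here's a minimal Ruby sketch of the same decision. Everything in it is my own stand-in rather than the adapter's code: insert_statement? mimics the byte-by-byte check above with a regex, select_statement? assumes the select? check works the same way (I haven't shown its source), route_for mirrors the shape of _execute, and the table and procedure names are invented.

```ruby
# Hypothetical stand-in for JdbcConnection::insert?: true when the statement
# starts with "insert" (any case) after optional leading whitespace.
def insert_statement?(sql)
  !!(sql =~ /\A\s*insert/i)
end

# ASSUMPTION: select? is taken to be the same kind of prefix check.
def select_statement?(sql)
  !!(sql =~ /\A\s*select/i)
end

# Mirrors the shape of _execute, but returns the name of the connection
# method that would be called instead of calling it.
def route_for(sql)
  if select_statement?(sql)
    :execute_query
  elsif insert_statement?(sql)
    :execute_insert
  else
    :execute_update
  end
end

# Table and procedure names below are invented for the example.
puts route_for("INSERT INTO archive_posts SELECT * FROM posts")  # execute_insert
puts route_for("SELECT * FROM posts")                            # execute_query
puts route_for("CALL copy_posts()")                              # execute_update
```

Under those assumptions, a statement that doesn't begin with INSERT or SELECT, such as a stored procedure call, falls through to execute_update and so never goes near the generated-keys path.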

So, looks like we're a bit knacked here. There are two possible solutions:

  1. The proper solution - fix the AR JDBC adapter (and, arguably, MySQL Connector/J as well, to stop it blowing heap space), submit a patch, wait for it to be accepted and make it into the next release.

  2. Or, the pragmatic solution - rewrite the SQL-heavy command as a stored procedure and just call it with an EXECUTE and sod the DHH dogma.


We went with option 2 :)

Big kudos to D. R. MacIver for tracking down the source of the fail in double-quick time.

Monday, August 03, 2009

Introducing Cragwag.com!

You know those conversations you end up having in the pub, where after a couple of beers you end up saying "you know what there should be? There should be a site that does X" (where X can be anything at all)

I've had so many of those over the years, and never quite managed to work up the free time / motivation to actually get on and put the ideas into practice... (and then what's tended to happen is that a couple of years later, someone else goes and does them and makes a fortune, but that's just sod's law)

Well, a few months ago, I decided enough was enough, and the next time I had one of those ideas, I should just stop talking about it and actually do it. So in my evenings and weekends here and there, I've been noodling away on a couple of ideas, mainly just for my own amusement, and to keep me in the habit of actually following through on things.

So here's one of them - Cragwag - a climbing news aggregator.

Those of you who know me in person know that I'm a keen amateur climber. I'm under no illusions - I'm definitely in the "amateur" category for good reason :) - but it struck me that although there's a "definitive" go-to site for UK-centric climbing news (UK Climbing.com), it's still editorially filtered - an editor keeps himself up to date on everything that's going on in the scene, and then publishes what he thinks is significant.

That's all well and good, but typically what's significant is what's going on at the forefront of ability. I felt there was also a place for the (admittedly by-now-a-little-cliched) long tail of climbing-related blogs - unfiltered, un-edited, everybody's tales. Whether of heart-stopping epics in the Himalaya, or scrabbling up a Stanage slab. If you felt it enough to blog about it, then someone wants to hear about it.

Plus it was an experiment in automatic news tagging and cross-relating of posts based on content, so it was kind of techie fun too. Which is important :) I'd like to do something with a crowd-sourced Google map and iPhone GPS too, but that's much more experimental. Need to learn the iPhone SDK first...

So that's Cragwag - all the climbing news from punters and pros alike. Just for fun, for the sake of an experiment, and - to paraphrase the most famous answer to "why?" of all time - because it wasn't there :) Yay!