Wednesday, June 23, 2004

Collecting Your Garbage in CFMX

I came up with a handy little trick the other day, while working on a Dreaded Import Job. I wouldn't recommend relying on it, but it's a nifty last-ditch "kludge" if the pressure is on and the deadline is tight.

I was looping round a few hundred records from database X to import them into database Y. Each time round the loop, depending on the data, it was grabbing lots of related records from DB X, and performing all sorts of checks on existing records in DB Y via CFC calls.

The issue was that every time it ran, it would slow to a crawl after a couple of hundred records. A little investigation showed that it was hogging server RAM at a quite frightening rate - maybe two or three MB every second - and never releasing it until the job finished. As the available RAM decreased, the rate of processing got slower and slower, and a hasty back-of-an-envelope calculation showed that it was extremely unlikely to finish the job in finite time.

Hmmm....

A bit more digging uncovered that the problem was most probably due to some less-than-optimal memory usage inside the CFCs. For instance, the "view" method, to retrieve a record from the DB and return an instance of the CFC with the properties populated, was following a process something like this -

- Get the record from the DB
- Create a new instance of the CFC
- Populate the fields
- Return it.

Hands up who spots whats wrong with that? Anyone? What happens to the new instance of the CFC? When does it get destroyed? Ah-hah....!

--- ASIDE -----
Variable scoping, memory management and the lifetimes of objects are critical issues in a "proper" object-oriented language like Java or (particularly) C++. In fact, one of the main motivations behind the development of Java in the first place was to provide a way to free devlopers from the headaches of manual memory management in C++.

Having written a few fairly complex apps in both languages, I can testify that having to manually allocate AND free the exact number of bytes that you need, in C/C++ is both a curse AND a blessing - it's very easy to write code that will allocate memory and never release it, and often very difficult to track down the cause of the problem when it happens. The upside of it is that it forces you to be very aware of what the code you write is actually doing on a very low level. As a result, variable-scoping is very well defined and documented in C++ - it HAS to be.

Java, on the other hand, uses a different approach. Because a Java app runs in a virtual machine, it allows the use of a Garbage Collector. Developers can happily create new instances of objects with abandon, knowing that every so often the system's garbage collector will check for objects that have gone out of scope, or have no remaining valid references to them, and destroy those instances and free the memory they were occupying back to the heap. This gives developers a lot more freedom, at the cost of losing the hands-on control that you get with C/C++. And sometimes, when things aren't quite working as they should, you can really miss that low-level control.

As CFMX is now a 100% Java application, and CF code is actually compiled into Java classes for execution, this means that intensive, long-run-time code needs to rely on the garbage collector to free resources, but the lifetime of CFC instances and the point where they go out of scope is extremely difficult to pin down - it doesn't seem to be clearly documented anywhere that I've found in a couple of hours of solid Googling.

--- /ASIDE ---


When the above method is called hundreds of times in a loop, it looks to be creating hundreds of instances of CFCs that never seem to die. As stated above, it's very very difficult to find any documentation on this, so all I have to go on is educated guesswork, and hunches.

My hunch is that CF doesn't explicitly run the garbage collector until the end of a request. It's a fairly reasonable way to have designed it, given that 99% of all CF requests are intended to display stuff to a browser within a short amount of time. It's not a language that's really designed for long back-end tasks.

What the view method above should really have been doing was more like -

- Get the record from the DB
- Populate the properties of THIS INSTANCE
- Return this

It's quite a subtle difference in approach - object-oriented versus procedural - and with CFCs still being a pretty new advance in ColdFusion, it's not something you'd be likely to spot without having developed object-oriented code in other languages, and come across similar issues.

When faced with a situation like this, the "correct" thing to do is to re-develop the CFCs to do it properly. However, that takes a lot of time for re-coding and retesting the whole application. That's the downside of centralised, modular code - when you change a bit of modularised code that gets called from all over the place, you have to re-test all over the place!
The deadline, unfortunately, was very tight, so a quick "kludge" had to be found. And here it is -

We can use the fact that CFMX is 100% Java to our advantage here, by FORCING the garbage collector to run. It's actually quite simple:


<cfscript>
// NOTE: I'm typing this code from memory, on a laptop with
// no access to the code, so apologies for any thing that
// may not be 100% correct. The aim here is to illustrate
// the principle, rather than give a copy-and-paste solution

// Create a java System object
objSystem = CreateObject( "java", "java.lang.System" );

// How often do we want it to run?
if( NOT isDefined( "attributes.forceGCEveryNLoops" ) ){
attributes.forceGCEveryNLoops = 25;
}
</cfscript>

<!--- get import data --->
<cfquery name="qryImportData" datasource="whatever"&gt;

</cfquery>

<cfset intLoopCount = 1>

<!--- start of big loop --->
<cfloop query="qryImportData">

<!--- do lots of complicated stuff here --->

<!--- do we need to run the GC? --->
<cfscript>
if( intLoopCount GT attributes.forceGCEveryNLoops ){
// run the GC
System.gc();
intLoopCount = 1;
} else {
intLoopCount = intLoopCount + 1;
}
</cfscript>
</cfloop>



And that's it! Yes, it's a kludge, it's a cheap-and-nasty workaround to cover over fundamental flaws in the code. But on deadline day, it might just save your skin until you get time to re-do those pesky CFC's properly.