Wednesday, August 30, 2006

RegEx to fully validate RFC822 email addresses

I love Perl. No, I really do - I can sit and look at the really good examples for hours, and still have no clue what they're doing. And here is a fantastic example....

Mail::RFC822::Address - a module that tells you whether the given string is a valid email address or not.

Most email validators only check the absolute basics - e.g. some characters followed by an at-sign, followed by some more characters and at least one dot. But have you ever actually read RFC822, the critical RFC that defines the standard of what's acceptable in an email and what isn't? It's surprisingly loose on what's an acceptable address.

(go on, read chapter 6, and try to summarise it - you know you want to...!)
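For comparison, here is roughly what the "absolute basics" check looks like. This is a minimal sketch in Python, not anything taken from Mail::RFC822::Address; the names `BASIC_EMAIL` and `looks_like_email` are made up for illustration.

```python
import re

# Deliberately naive: some characters, an at-sign, some more characters and
# at least one dot. This is NOT an RFC 822 validator.
BASIC_EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def looks_like_email(candidate: str) -> bool:
    """Return True if the string passes the basic shape check."""
    return BASIC_EMAIL.match(candidate) is not None


# RFC 822 allows a quoted local part containing spaces, which this check rejects.
samples = ["dave@example.com", "no-at-sign.example.com", '"odd name"@example.com']
results = {sample: looks_like_email(sample) for sample in samples}
```

The last sample is the interesting one: a quoted local part is legal under the RFC, yet the basic check throws it out because of the space.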

But no more, thanks to Paul Warren, who has solved all our email-address-validation woes with one almighty mother of a regex - and here it is. You'll like this. No, you will.... here we go:


(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
)+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:
\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(
?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[
\t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\0
31]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+
(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:
(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z
|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n) ?[
\t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\ r\n)?[
\t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[
\t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n) ?[
\t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t]
)*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[
\t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*
)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
)+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)
*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+
|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r \n)?[
\t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?: \r\n)?[
\t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t
]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031
]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](
?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?
:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?
:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?
:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)? [
\t]))*"(?:(?:\r\n)?[ \t])*)*:(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\]
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|
\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>
@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"
(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t]
)*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?
:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[
\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-
\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(
?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;
:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([
^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\" .\[\]
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\ [\]
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\]
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]
|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \0
00-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,
;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?
:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*
(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".
\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[
^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]
]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)(?:,\s*(
?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(
?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[
\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t
])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?
:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|
\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:
[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n) ?[
\t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["
()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n) ?[
\t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>
@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[
\t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,
;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t]
)*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?
(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".
\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:
\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[
"()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])
*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])
+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z
|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(
?:\r\n)?[ \t])*))*)?;\s*)



After a while, all you see is ....blonde......brunette.....


PS - hope Paul doesn't mind me posting it here. It's truly a thing of beauty, but I'll take it down if requested.

URGENT html prototype designer/coder needed today and tomorrow!

Bit of a long shot this one, but Lise has been left high-and-dry by a contract developer taken ill. The job is mocking up flat HTML interfaces, and will probably require a bit of a late night tonight as it must be finished by close-of-play tomorrow for user testing on Friday. Ideally you'd be able to make it to Islington THIS AFTERNOON for a briefing and handover.

Very short notice, but they're a bit stuck - so if anyone finds themselves at a loose end for a couple of days, add a comment on this post and I'll pass your details on.

Tuesday, August 29, 2006

Flickr's Geotagging Let Down By Location Search

Thomas Hawk of Geo-tagging mashup Zooomr does a good job of keeping his objectivity in his review of Flickr's new Geotagging facility.

I just had a play with it myself, and I have to agree with Thomas - the nicest aspect is the way that it's integrated with the Flickr Organiser (I still keep spelling that Organizr...) and the by-now-expected Ajax drag-and-drop wizardry. But the most disappointing aspect of it all is the underlying Yahoo! maps data. It's fine if you want to geotag photos down to city block level in most major US cities, but stray off the beaten path to tag photos of mountaineering treks or rock climbing venues - even world-famous rock climbing venues - and the map detail just isn't there.

To be fair, this isn't just a limitation of Yahoo!'s map data - Google Maps is similarly limited - and I guess it's a reflection of the underlying business drivers behind the map data. Constructing an index of the whole globe is a massive undertaking, and it has to be funded somehow. The obvious channel is advertising, but only businesses are willing to pay for advertising, and most businesses tend to be centred around urban areas, so the map provider's priority is detailed coverage of the places where the advertisers are.

It just seems to take on an added dimension of disappointment when this limitation applies to photos. To be honest, one U.S. city block looks pretty much the same as any other U.S. city block, in the grand scheme of things, and part of the whole joy of Flickr comes from discovering fascinating, beautiful images that you may never otherwise have seen. Almost by its very nature, this is going to involve out-of-the-way places like the Sim Gang Glacier in Pakistan, La Dibona of Les Ecrins in the Alps, or K2, which requires 14 days of hard trekking from the nearest road before you even reach the mountain - often cited as the hardest trek in the world, but surely one of the most beautiful. Try to find any of these places in the Flickr map, and you'll have trouble. Even closer to home, in the rugged mountain landscape of North Wales around Tryfan, the map engine draws a blank.

To take the next step in geotagging, and in mapping as a whole, the search technology that takes a user-supplied string and works out what the hell you actually meant has to take the next step too. This is a mammoth task in itself, considering how many different ways people can refer to a particular longitude and latitude, across languages and character sets, let alone local names vs. standard names. But whoever cracks that problem can look forward to a very bright future indeed, and that's just one reason why Natural Language Processing - clearing away the cobwebs of context and language and working with raw chunks of meaning - is becoming such a hot topic right now. Stay tuned...

Thursday, August 24, 2006

So many parties, so little time...

Well, I had the best intentions of making it to tonight's London CFUG, but once again my path is strewn with cowpats from the Devil's own Satanic herd** as it once again conflicts with something else that I just can't miss. This time, it's the Trampoline Systems Summer Bash, graced by the intriguing Czechoslovakian Alternative Folktronica of the most delectable Miss Eva Eden. Should be a larf - I've never had to mike-up a Bontempi organ before.... I'll whack the more-salacious photos onto Flickr tomorrow, plus any videos I happen to get of any embarrassingly drunk bigwigs saying spectacularly inappropriate things.


So have a pint and a whinge for me at the CFUG tonight, and hopefully I'll make it to the next one. Unless I'm halfway up some mountain somewhere or something.


**series and episode, anyone?

Thursday, August 17, 2006

Who Wants To Buy An Ajax Calendar App?

In a move that's not-at-all-designed-to-create-an-internet-buzz, AJAX calendar app Kiko is for sale on eBay.

Starting price US $49,999.99. No Bids yet.

In the interests of completeness, I should point out that they're not the first, by a long way - which market sector do you think got there first? Of course - porn!

The listing states

We are selling Kiko because we want to have time to work on other projects as a development team.

...and not because they are now in direct competition with Google, or anything...

Best of luck, Kiko guys.

A Pox Upon AppleMail!

A lot of my time since joining Trampoline 6 weeks ago has been spent reacquainting myself with the black art of parsing and dismembering MIME emails with the JavaMail API. There's much I could say about the MIME format and particularly the JavaMail API, but those apoplectic rants deserve to be written up and nailed to church doors all of their own.

This post is about a problem I've been having with a mail generated by Apple Mail, which has been driving me nuts. It's not the first issue I've had with Apple Mail and its funky attachment formatting, and I'm sure it won't be the last. However, it's the most maddening to date! Here's the problem:

In a multipart email, you separate each part with a unique string that must not occur in any of the parts. This is generated by the email client, and declared in the Content-Type header. RFC 2045 states that the boundary declaration is required for any multipart subtype. The header should look like this:


Content-Type: multipart/(whatever); boundary="----=(unique string)"


You will then get a set of Parts, each separated by an occurrence of the boundary string, and each declaring what type of content it is by means of its own Content-Type header -


Content-Type: (type)/(subtype); (any parameters)


An example might be:


Content-Type: multipart/mixed; boundary="----=ABCDEFGHIJKLMNOP"

----=ABCDEFGHIJKLMNOP
Content-Type: text/plain; charset=US-ASCII

Hi Al,

Here's the schematic for the secret base under the island volcano. Note the new layout of the shark pools, and the trapdoor is now triggered from the pressure pad under your desk as requested. Will give the engineers a kick about the frickin' lasers and see what's taking them so long.

Cheers,

Dave

----=ABCDEFGHIJKLMNOP
Content-Type: image/png; name="plans.png"
Content-Transfer-Encoding: base64
Content-Disposition: inline; filename="plans.png"

(lots of data encoded into base 64 so that it can be transferred as text)
----=ABCDEFGHIJKLMNOP


All well and good so far.
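To see a parser agree with that, here is a minimal sketch using Python's standard email package (an assumption on my part, chosen because it needs no extra libraries; it is not the JavaMail code this post is about). One detail the schematic above glosses over: on the wire, each delimiter line is the declared boundary with two extra hyphens in front, and the final delimiter also has two hyphens on the end. The base64 payload is just the 8-byte PNG signature, used as placeholder data.

```python
from email import message_from_string

# Each delimiter is "--" + boundary; the closing one also ends with "--".
RAW = "\n".join([
    "MIME-Version: 1.0",
    'Content-Type: multipart/mixed; boundary="----=ABCDEFGHIJKLMNOP"',
    "",
    "------=ABCDEFGHIJKLMNOP",
    "Content-Type: text/plain; charset=US-ASCII",
    "",
    "Hi Al,",
    "------=ABCDEFGHIJKLMNOP",
    'Content-Type: image/png; name="plans.png"',
    "Content-Transfer-Encoding: base64",
    'Content-Disposition: inline; filename="plans.png"',
    "",
    "iVBORw0KGgo=",  # placeholder: base64 of the PNG file signature only
    "------=ABCDEFGHIJKLMNOP--",
    "",
])

message = message_from_string(RAW)
parts = message.get_payload()  # a list of sub-messages for a multipart
part_types = [part.get_content_type() for part in parts]
image_bytes = parts[1].get_payload(decode=True)  # base64 decoded to bytes
```

Here `part_types` comes out as one text part and one image part, and `parts[1].get_filename()` recovers the attachment name from the Content-Disposition header.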

It gets a bit more complicated when you introduce the fact that any part of the mail body can also be a multipart type, which must declare its own boundary string, but still, it should be parseable into a coherent tree structure, right?

Well yes - as long as you play by the rules.
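A minimal sketch of that tree structure, again assuming Python's standard email package rather than JavaMail. The helper name `describe` is made up for illustration; the message is built in code so that the library generates the nested boundaries itself.

```python
from email.message import EmailMessage, Message
from typing import List


def describe(part: Message, depth: int = 0) -> List[str]:
    """Return one indented line per MIME part, walking the tree depth-first."""
    lines = ["  " * depth + part.get_content_type()]
    if part.is_multipart():
        for child in part.get_payload():
            lines.extend(describe(child, depth + 1))
    return lines


# Build a small nested message: text + HTML alternative, plus one attachment.
message = EmailMessage()
message.set_content("plain body")
message.add_alternative("<p>html body</p>", subtype="html")
message.add_attachment(
    b"\x89PNG",  # placeholder bytes, not a real image
    maintype="image",
    subtype="png",
    filename="plans.png",
)

tree = describe(message)
# tree is:
#   multipart/mixed
#     multipart/alternative
#       text/plain
#       text/html
#     image/png
```

Every multipart node in the tree carries its own boundary declaration, which is exactly what lets the parser tell where one level stops and the next begins.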

Apple Mac files consist of two forks:

1) an Apple-specific part called the RESOURCE fork which contains arbitrary information such as icon bitmaps and file info parameters,
2) a DATA fork which contains the actual file data.

This translates logically into a MIME multipart format - multipart/appledouble - with one part for each fork.

So, if we were to send the example message above from a Mac using AppleMail, you would get something like this:


Content-Type: multipart/mixed; boundary="----=TOPLEVELBOUNDARY"

----=TOPLEVELBOUNDARY
Content-Type: text/plain; charset=US-ASCII

Hi Al,

Here's the schematic for the secret base under the island volcano. Note the new layout of the shark pools, and the trapdoor is now triggered from the pressure pad under your desk as requested. Will give the engineers a kick about the frickin' lasers and see what's taking them so long.

Cheers,

Dave

----=TOPLEVELBOUNDARY
Content-Type: multipart/appledouble; boundary="----=HEYIMTHEAPPLEDOUBLEBOUNDARY"

----=HEYIMTHEAPPLEDOUBLEBOUNDARY
Content-Type: application/applefile; name=plans.png
Content-Disposition: inline; filename="plans.png"

(apple-specific file information encoded into base 64)

----=HEYIMTHEAPPLEDOUBLEBOUNDARY
Content-Type: image/png; name=plans.png
Content-Transfer-Encoding: base64
Content-Disposition: inline; filename="plans.png"

(actual file data encoded into base 64)
----=HEYIMTHEAPPLEDOUBLEBOUNDARY

----=TOPLEVELBOUNDARY


Again, this is all well and good so far - apart from one or two minor irritations like the lack of quotes around the name of the file in the Content-Type header - which can cause some grief if the filename has spaces in it.... but that can be got round without much trouble using a bit of regex in pre-processing.
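That bit of regex pre-processing might look something like the following. It's a sketch in Python rather than the Java I actually used, the names `UNQUOTED_NAME` and `quote_name_params` are made up for illustration, and it assumes the filename itself contains no semicolon.

```python
import re

# Matches name= or filename= followed by an unquoted value that runs to the
# next ';' or the end of the line. Values already in quotes are left alone.
UNQUOTED_NAME = re.compile(r'\b((?:file)?name)=(?!")([^;\r\n]+)', re.IGNORECASE)


def quote_name_params(header_line: str) -> str:
    """Wrap unquoted name/filename parameter values in double quotes."""
    return UNQUOTED_NAME.sub(
        lambda m: f'{m.group(1)}="{m.group(2).strip()}"',
        header_line,
    )


fixed = quote_name_params("Content-Type: image/png; name=secret plans.png")
untouched = quote_name_params('Content-Disposition: inline; filename="plans.png"')
```

Run over each header line before handing the message to the parser, this turns `name=secret plans.png` into `name="secret plans.png"` and leaves already-quoted values as they were.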

The problem comes when you have multiple appledouble-encoded attachments. What you would expect is something like this:


Content-Type: multipart/mixed; boundary="----=TOPLEVELBOUNDARY"

----=TOPLEVELBOUNDARY
Content-Type: text/plain; charset=US-ASCII

blah - message text

ATTACHMENT 1:

----=TOPLEVELBOUNDARY
Content-Type: multipart/appledouble; boundary="----=HEYIMTHEAPPLEDOUBLEBOUNDARY"

----=HEYIMTHEAPPLEDOUBLEBOUNDARY
Content-Type: application/applefile; name=plans.png
Content-Disposition: inline; filename="plans.png"

(apple-specific file information encoded into base 64)

----=HEYIMTHEAPPLEDOUBLEBOUNDARY
Content-Type: image/png; name=plans.png
Content-Transfer-Encoding: base64
Content-Disposition: inline; filename="plans.png"

(actual file data encoded into base 64)
----=HEYIMTHEAPPLEDOUBLEBOUNDARY

ATTACHMENT 2:

----=TOPLEVELBOUNDARY
Content-Type: multipart/appledouble; boundary="----=DIFFERENTAPPLEDOUBLEBOUNDARY"

----=DIFFERENTAPPLEDOUBLEBOUNDARY
Content-Type: application/applefile; name=plans.png
Content-Disposition: inline; filename="plans.png"

(apple-specific file information encoded into base 64)

----=DIFFERENTAPPLEDOUBLEBOUNDARY
Content-Type: image/png; name=plans.png
Content-Transfer-Encoding: base64
Content-Disposition: inline; filename="plans.png"

(actual file data encoded into base 64)
----=DIFFERENTAPPLEDOUBLEBOUNDARY

----=TOPLEVELBOUNDARY


But what's actually happening is that in the second attachment, the all-important Content-Type: declaration -

Content-Type: multipart/appledouble; boundary="----=DIFFERENTAPPLEDOUBLEBOUNDARY"

- is missing!

This line is absolutely vital, as it not only declares that this part is in appledouble format, but more fundamentally it declares that this part is itself a multipart and is split with THIS boundary marker rather than any other.

If this line is missing, then ONLY the FIRST attachment gets recognised. Any subsequent attachments which don't get the content-type header are then considered to be text/plain by default, so you get an email which has the first attachment properly parsed as an image, but everything after that appears inline as text. So anyone reading the email gets a big long string of base64 encoded image data. Not nice.
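The text/plain fallback is easy to reproduce. The sketch below uses Python's standard email package (an assumption; the behaviour I actually hit was in JavaMail) and assumes the second attachment arrives with no headers at all, which is my simplification of the broken mail. Boundary names are shortened and the payloads are placeholders.

```python
from email import message_from_string

RAW = "\n".join([
    "MIME-Version: 1.0",
    'Content-Type: multipart/mixed; boundary="TOP"',
    "",
    "--TOP",
    'Content-Type: multipart/appledouble; boundary="FIRST"',
    "",
    "--FIRST",
    "Content-Type: application/applefile; name=plans.png",
    "",
    "(resource fork placeholder)",
    "--FIRST",
    "Content-Type: image/png; name=plans.png",
    "Content-Transfer-Encoding: base64",
    "",
    "iVBORw0KGgo=",
    "--FIRST--",
    "--TOP",
    "",  # assumed: the second part's Content-Type header is simply absent
    "--SECOND",
    "Content-Type: image/png; name=more-plans.png",
    "Content-Transfer-Encoding: base64",
    "",
    "iVBORw0KGgo=",
    "--SECOND--",
    "--TOP--",
    "",
])

top = message_from_string(RAW)
first, second = top.get_payload()

first_type = first.get_content_type()    # declared, so parsed as a multipart
second_type = second.get_content_type()  # undeclared, so the default applies
second_body = second.get_payload()       # inner delimiters and base64, as text
```

The first attachment parses into its two forks, while the second is reported as text/plain and its body is one string that still contains the `--SECOND` delimiter lines and the raw base64.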

Wednesday, August 16, 2006

Slovenia - The Game!

At last, the Flash app you've all been waiting for - Slovenia, The Game! It's a very cute, virtual community based around various beauty spots in Slovenia. We went there last summer for a week, and completely fell in love with the place. It's just stupidly beautiful - everywhere you look, your jaw drops, and this quirky little game just brought it all back.... I'm sitting here with a silly grin on my face and a wistful look in my eye. Ah, memories...

by the way - you seem to get called "pedr" a lot - pedr loosely translates as "fag"

Monday, August 14, 2006

How to get a job in Silicon Valley

Good old Guy Kawasaki has an endearingly cynical but still-so-true guide to How To Get Hired In Silicon Valley. Well worth a read, and just as applicable outside California. Not that I've ever applied for a job in Silicon Valley (yet), but it certainly brought a wry smile to my face.

I've read many CVs and conducted many interviews, and it's amazing just how easy some people make it for you to put their CV on the "no" pile. I know we're now living in the txt-spk generation, but seriously - if you can't spell or punctuate on your CV when you're promoting yourself, how well are you going to promote the company? Interviewers are busy people, and will be looking for any reason to say no - spell well, communicate well, keep it short, and if you get an interview, read and learn from Mr Kawasaki.

Friday, August 11, 2006

What do atoms look like?

....they look like THIS - Physics News Graphics have published a field-ion microscope image of the sharpest man-made object ever produced - a needle with a tip that's just a single tungsten atom. The thing I love about this image is the way some of the atoms look like blobs of mercury - because they moved during the 1-second-long imaging process.

Friday, August 04, 2006

Java Programmers are the Erotic Furries of Programming

Nice chart of smugness here on Some Guy Ranting

....but where would CF programmers sit on this chart?

Thursday, August 03, 2006

Demos is live

Nice to see that Headshift have got the new Demos site live. This was the last site I worked on at Headshift, and there are quite a few aspects to it of which I - and m'former colleagues Neil Roberts and Andy Birchall - feel quite proud.

It's architected with the "Everything-is-a-Item" approach that I learned from my Headshift predecessor, erstwhile colleague and all-round-good-egg Matt Perdeaux, which opens a lot of doors in terms of code reuse and making the most of your content.

Put simply, everything on the site - blog posts, comments, people, users, CMS pages - has the basic common data such as title, summary, and full description abstracted to an Items table, to which you can join as required in your SQL. All links between items and other entities (e.g. tags) are done at the Items level. This makes things like a global search across all content trivially easy (at least in theory - doesn't always work out that way of course... ) and means that once you've implemented tagging for, say, blog posts, you can have tagging on everything else as well with virtually zero extra effort - because everything is-a Item.
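To make that concrete, here is a minimal sketch of the idea using Python's built-in sqlite3 module. It is not the Demos schema: every table name, column name and row below is invented for illustration, and the real site runs on a different stack.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Common data shared by every kind of content.
    CREATE TABLE items (
        item_id     INTEGER PRIMARY KEY,
        item_type   TEXT NOT NULL,
        title       TEXT NOT NULL,
        summary     TEXT,
        description TEXT
    );
    -- Type-specific tables join back to items as required.
    CREATE TABLE blog_posts (
        item_id      INTEGER PRIMARY KEY REFERENCES items(item_id),
        published_on TEXT
    );
    CREATE TABLE people (
        item_id INTEGER PRIMARY KEY REFERENCES items(item_id),
        email   TEXT
    );
    -- Tags link at the items level, so every type gets tagging at once.
    CREATE TABLE item_tags (
        item_id INTEGER REFERENCES items(item_id),
        tag     TEXT NOT NULL
    );
""")

conn.executemany("INSERT INTO items VALUES (?, ?, ?, ?, ?)", [
    (1, "blog_post", "Artist and audience", "A post", "Full text of the post"),
    (2, "person", "Example Person", "A person", "Short biography"),
])
conn.execute("INSERT INTO blog_posts VALUES (1, '2006-08-03')")
conn.execute("INSERT INTO people VALUES (2, 'person@example.com')")
conn.executemany("INSERT INTO item_tags VALUES (?, ?)", [(1, "demos"), (2, "demos")])


def search(term):
    """Global search: one query over items, whatever the content type."""
    pattern = f"%{term}%"
    return conn.execute(
        "SELECT item_type, title FROM items "
        "WHERE title LIKE ? OR summary LIKE ? OR description LIKE ? "
        "ORDER BY item_id",
        (pattern, pattern, pattern),
    ).fetchall()


def tagged(tag):
    """Everything carrying a tag, regardless of type."""
    return conn.execute(
        "SELECT i.item_type, i.title FROM items i "
        "JOIN item_tags t ON t.item_id = i.item_id "
        "WHERE t.tag = ? ORDER BY i.item_id",
        (tag,),
    ).fetchall()
```

Neither `search` nor `tagged` knows or cares which type-specific table an item lives in, which is the whole point of linking at the items level.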

As with a lot of systems I've written over the years, it gets a little frustrating that some of the nicest bits are hidden away from public view in the admin interfaces. But you'll just have to take my word for it that there are lots of self-contained AJAX-driven custom tags that do things like tag administration and item-to-item linkages, which only had to be written once, and could be brought into any form for any kind of item and would work straight away - because everything is-a Item.

Other aspects of the site which give me that warm fluffy feeling inside include the friendly url scheme. The site is written with a Fusebox front-end, but there's a very thin layer above that which translates human-friendly urls into fusebox-friendly urls. This is done with Apache mod_rewrite, translating a url like

http://www.demos.co.uk/projects/demoswebsite/blog/artistaudience

into:

http://www.demos.co.uk/projects.cfm?sParams=/demoswebsite/blog/artistaudience

projects.cfm then parses the sParams string into its expected parameters, retrieves any referenced items into request scope, checks permissions, works out the appropriate fuseaction, and passes on to the FB framework by including index.cfm.
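The translation and the first step of the parsing can be sketched like this. It's Python standing in for both the mod_rewrite rule and the ColdFusion in projects.cfm, neither of which is reproduced here; the function names are made up, and what each path segment actually means to the site is not something this sketch tries to guess.

```python
import re
from typing import List
from urllib.parse import parse_qs, urlsplit

# Stand-in for the rewrite rule: /projects/<rest> -> /projects.cfm?sParams=/<rest>
FRIENDLY = re.compile(r"^/projects(/.*)$")


def to_fusebox_url(friendly_url: str) -> str:
    """Translate a human-friendly URL into the fusebox-friendly form."""
    parts = urlsplit(friendly_url)
    match = FRIENDLY.match(parts.path)
    if match is None:
        return friendly_url  # not a /projects/ URL, leave it alone
    return f"{parts.scheme}://{parts.netloc}/projects.cfm?sParams={match.group(1)}"


def split_params(fusebox_url: str) -> List[str]:
    """Pull sParams out of the query string and split it into path segments."""
    query = parse_qs(urlsplit(fusebox_url).query)
    s_params = query.get("sParams", [""])[0]
    return [segment for segment in s_params.split("/") if segment]


rewritten = to_fusebox_url(
    "http://www.demos.co.uk/projects/demoswebsite/blog/artistaudience"
)
segments = split_params(rewritten)
```

With the example URL above, `rewritten` is the projects.cfm form shown earlier and `segments` is the list of three path segments ready to be mapped onto parameters.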

It would make a good CFUG case study, I reckon. (Hey Matt - fancy another vaudevillian double-act? Gaw blimey, roll out the barrel guvnor...) Mind you, I'm not sure if it's appropriate for me to reveal the technical innards of a system when I don't work for the company anymore... hmm... over to you then Neil ;)