Tuesday, September 08, 2009

Moving a site to The Cloud

Last week I did a lot of reading and research into cloud hosting. The "Cloud" has been a buzzword for a while now, often bandied about by those who know no better as a simple sprinkle-on solution to all of your scaling problems - in much the same way that AJAX was touted a few years ago as a magic solution to all of your interface problems.

The perception can sometimes seem to be "Hey, if we just shift X to The Cloud, we can scale it infinitely!" The reality, of course, is something rather more qualified. Yes, in theory, the cloud has all the capacity you're likely to need, unless you're going to be bigger than, say, Amazon (are you? Are you reeeeeeeally? C'mon, be honest...) - provided, and it's a big proviso, that you architect for it correctly. You can't just take an existing application, dump it into the cloud and expect never to have a transaction deadlock again, for instance. That's an application usage-pattern issue that needs to be dealt with in your application, and no amount of hardware, physical or virtual, will solve it.

There are also some constraints that you'll need to work around, that may seem a little confusing at first. But once I got it, the light went on, and I became increasingly of the opinion that the cloud architecture is just sheer bloody genius.

What kind of constraints are they? Well, let's focus on Amazon's EC2, as it's the best known...

  • Your cloud-hosted servers are instances of an image
    They're not physical machines - you can think of them as copies of a template Virtual Machine, if you like. Like a VMware image. OK, that one's fairly straightforward. Got it? Good. Next:

  • Instances are transient - they do not live for ever
    Building on the last point: you create and destroy instances as you need them. The flipside is that there's no guarantee that the instance you created yesterday will still be there today. It should be, but it might not be. EC2 instances do die, and when they do, they can't be brought back - you have to create a new one (there's a launch sketch just after this list). This is by design. Honestly!

  • Anything you write to an instance's disk after creation is non-persistent
    Now we're getting down to it. This means that if you create an instance of, say, a bare-bones Linux install, then install some more software onto it and set up a website, and then the instance dies - everything you've written to that instance's disk is GONE. There are good strategies for dealing with this, which we'll come onto (the launch sketch after this list shows one: reinstalling at boot time), but this is also by design. Yes, it is...

  • You can attach EBS persistent storage volumes to an instance - but only to one instance per volume
    This one is maybe the most obscure constraint, but it's quite significant. Take a common architecture of two load-balanced web servers with a separate database server. It's obvious that the database needs to live on a persistent EBS volume - but what if the site involves users uploading files? Where do those live? A common pattern would be a shared file-storage area mounted onto both web servers - but if an EBS volume can only be attached to one instance, you can't do that (the volume sketch after this list shows the attach call in question).
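To make the first three constraints concrete, here's roughly what creating an instance looks like using boto, the Python library for Amazon's APIs (Python 2, as was the way of things). The AMI ID, key pair name and bootstrap script are placeholders of mine, not anything real - and the user-data trick assumes an image that executes its user data as a shell script at boot, which many public images do. The key idea: anything beyond the base image gets reinstalled at launch time, because the instance's own disk won't outlive the instance.

    # A minimal sketch using boto. The AMI ID, key pair and bootstrap
    # script are all placeholders, not real values.
    import boto

    conn = boto.connect_ec2()  # picks up AWS credentials from the environment

    # Everything the instance needs beyond the base image gets put back
    # at boot via a user-data script, because writes to the instance's
    # disk won't survive the instance.
    bootstrap = """#!/bin/bash
    apt-get update && apt-get install -y apache2
    # ...fetch the site's code and config from somewhere persistent...
    """

    reservation = conn.run_instances(
        'ami-12345678',           # hypothetical template image
        key_name='my-keypair',    # hypothetical SSH key pair
        instance_type='m1.small',
        user_data=bootstrap,
    )
    instance = reservation.instances[0]
    print instance.id, instance.state   # e.g. i-0abc1234 pending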
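And the fourth constraint in the raw: creating a persistent EBS volume and attaching it. Again a sketch with placeholder values - the size, zone, instance ID and device name are made up. The thing to notice is that attach_volume takes exactly one instance ID; there is simply no call that hands the same volume to two instances at once.

    # Sketch: create a persistent EBS volume and attach it to ONE instance.
    # Size, zone, instance ID and device name are placeholders.
    import boto

    conn = boto.connect_ec2()

    # The volume has to live in the same availability zone as the
    # instance it will be attached to.
    volume = conn.create_volume(50, 'us-east-1a')   # 50 GB in us-east-1a

    # One volume, one instance - this is the constraint in API form.
    conn.attach_volume(volume.id, 'i-0abc1234', '/dev/sdf')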

Think about those constraints for a few seconds - they have some pretty serious implications for the architecture of a cloud-hosted site. BUT - and here's the sheer bloody genius - these are the kind of things you'd have to deal with when scaling out a site on physical servers anyway. Physical hardware, and especially disks, are not infallible and shouldn't be relied on. Servers can and do go down. Disks conk out. Scaling out horizontally needs up-front thought put into the architecture. The cloud constraints simply force you to accept that, and to deal with it by designing your applications with horizontal scaling in mind from the start. And, coincidentally, the platform provides some kick-ass tools to help you do that.

Take, for example, the last bullet point above - that EBS volumes can only be attached to one instance. So how do you share file storage between N load-balanced web servers? Well, the logical thing to do is to have a separate instance with a big persistent EBS volume attached to it, and have the web servers access it through some defined API - WebDAV, say, or something more application-specific.
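Here's a minimal sketch of what that looks like from the web servers' side, assuming a hypothetical internal file-server instance speaking plain HTTP PUT/GET, WebDAV-style - the hostname, port and URL layout are all invented for illustration:

    # Sketch: web servers store and fetch uploads over HTTP from a
    # dedicated file-server instance, instead of a shared local mount.
    # The host 'files.internal' and the URL layout are hypothetical.
    import urllib2

    FILE_SERVER = 'http://files.internal:8080'

    def put_upload(path, data):
        """Store an uploaded file on the file-server instance (WebDAV-style PUT)."""
        request = urllib2.Request(FILE_SERVER + path, data=data)
        request.get_method = lambda: 'PUT'   # urllib2 defaults to POST when data is set
        return urllib2.urlopen(request)

    def get_upload(path):
        """Fetch a stored file back from the file-server instance."""
        return urllib2.urlopen(FILE_SERVER + path).read()

    # e.g. put_upload('/uploads/avatar.png', open('avatar.png', 'rb').read())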

But hang on... isn't that what you should be doing anyway? Isn't that a more scalable model? So that when your file-server load gets big, you could, say, create more instances to service your file requests, and maybe load-balance those, and...

See? It forces you to do the right thing - or, at least, put in the thought up front as to how you'll handle it. And if you then decide to stubbornly go ahead and do the wrong thing, then that's up to you... :)

So, anyway, I wanted to get my head round it all, and thought I'd start by shifting Cragwag and Sybilline onto Amazon's EC2 cloud hosting service. I did this over a two-day period - most of which, it has to be said, was spent setting up Linux the way I was used to rather than on the cloud config - and I'll be blogging a few small, self-contained articles with handy tips I've learned along the way. Stay tuned...
