I'm looking for a cost effective tool for managing an web app on Ec2. Rightscale seems to the big dog and charges for it. Scalr looks like a more cost effective solution but it's hard to find out any real customer experiences..
The key aspects I'm looking for is a load balancer (http and https) and a way to automatically bring online additional web servers capacity as load increases as well as terminate the instances when load falls off.
From what I can tell, lots of people are rolling their own stuff here. We're trying to release an app and don't really want to have to fight too many heavy sys admin battles. Given the importance of performance etc I'd be grateful to hear advise and experiences from the field on this.
I am a Scalr user, a Scalr.net subscriber, and have become a Scalr enthusiast. I cannot possibly afford Rightscale.
Scalr can do what you ask.
Scalr has three images (each with 32/64 bit versions), plus a base (generic) image:
1) A load balancer image, running nginx. A highly available setup requires two of these. Scalr will manage your nameservice, and round robin between them. If one goes down, Scalr will remove it from DNS and bring up another instance. It is possible to run other load balancers, but nginx is the default.
2) Several application server images are available, running Apache/Tomcat/Rails. You setup your application here, be it PHP/Perl/Python/Java/Ruby/whatever. nginx routes requests between these instances grouped by unique user (based on IP + browser). Scalr monitors these for upness too, and replaces broken instances.
3) A MySQL database image, with automatic master/slave replication. Just deploy your schema, and Scalr handles replication and replaces defunct servers. It will also backup your data periodically. Scalr's DNS provides master and slave hostnames, so you can have your app read from the slaves and write to the master.
All of these instance types will auto-scale based on load. You start with the base image closest to what you're doing, and then you customize them for your application. For instance, we deploy our Perl/Catalyst app on the apache server instances but we serve static content from the nginx front-end servers. We had to modify our application slightly to use read/write database handles.
All in all, it took about three weeks of working through bugs in Scalr to get our application to a reliable state where I am confident that it IS highly available with Scalr. Their support was phenomenal, so the bugs didn't bother me too much, and the system is really coming along. It is approaching serious reliability.
As a side note, the best feature of Scalr is the 'Synchronize to All' feature, which auto-bundles your AMI and re-deploys it on a new instance - all without a service interruption. This saves you the time of going through the lengthy EC2 image/AMI creation process, which can otherwise make very simple admin tasks take 20 minutes. You can use this whether you are scaling your server farm or not - it would be very handy even on a single instance.
I pay Scalr.net $50 a month to host the service for me because I think it saves me time and money. The bottom line so far is this: at my last gig, we had a systems guy working on our highly available Linux DB + app server setup for a year... and he failed to achieve the kind of reliability that I achieved in three weeks. The savings by using Scalr as compared to rolling my own are extreme.
All that being said, if I could afford Rightscale, I would be using Rightscale. But the up-front fee and $500 a month make that impossible. There has been talk of waving the up-front fee in exchange for waving the consulting that it includes, but the monthly service fee isn't going anywhere.
I should mention that at the moment, sclar.net's website is down, so if I wanted to manage any of my server farms (don't have them up atm), I simply couldn't right now. It is not clear whether scaling is working for scalr.net subscribers right now, or not. Which is to say... this is perhaps not a mature solution yet. This doesn't happen often, before tonight the only downtime I've experienced were in periods of a few minutes at a time. But yeah... its down RIGHT NOW, so I must mention it :)
I would suggest a thorough reading of the support group at http://groups.google.com/group/scalr-discuss before making your decision. If you pick Scalr, be prepared to test your setup and work through any issues you have on the google group.