Rails doesn’t scale. Amazon’s Elastic Compute Cloud does. But wait, you need Scalr to actually scale.
So… What is ‘scaling’?
Everyone’s an expert on it these days, it seems. It’s talked about in terms of languages, cloud computing grids, databases, and more.
Just today, Ola Bini took a humorous swing at how it’s talked about in terms of languages. Recently, John Willis had a nice overview in terms of the ‘cloud’.
I’ve helped scale networks, applications, systems, and databases. The one thing I’m certain about scaling is that it is not a one-size-fits-all solution. There is no silver bullet.
Unfortunately, the fact that there is no silver bullet hasn’t stopped people from talking about it like there is or assigning blame when ‘scaling’ fails.
It would be great if the conversation could evolve a bit. These are age-old issues that have a new shine because hardware, bandwidth, and software have all rapidly commoditized. In the world of systems management and architecture design there are new tools like cloud computing grids and virtualization, but the actual problems haven’t changed.
Scaling a system, application, or database is a non-trivial task that requires specific knowledge of the problem domain. No amount of hand-waving will create automagical scaling. We can only create better tools that help us build better infrastructure.
May 4th, 2008
I missed the announcement of Google’s App Engine[1] by a few hours, but it was quickly brought to my attention. Initially, it looked like a direct competitor to technology we have been working on. Fortunately, it’s not.
CloudScale made a decision in January to move away from building a commodity product that serviced the mass of web developers, for a number of reasons. First, we had some inkling that Google was headed in this direction. Second, after we saw the Heroku application, it seemed apparent to us that they had solved this particular problem very elegantly.
Now that Google has launched App Engine, we can see that it is very similar to Heroku. Both have outstanding technology and are putting a stake in the ground that matters. It goes something like this:
If you are a small developer or want to vet an initial application idea, you can do it without having to run or maintain your own infrastructure.
How many of you wind up spinning your wheels configuring databases, operating systems, application servers, etc. instead of working on your core application? We’ve all been there.
So how does this affect CloudScale? We think that generally it helps the CloudScale project.
None of our current ALPHA customer use cases can run under Google’s App Engine. The leverage we help our CloudScale-related customers derive from EC2 is largely related to helping them rapidly deploy and manage the life-cycle of very complex heterogeneous web applications.
This is value that Amazon, Google, and Heroku can’t provide. Some of our other competitors can, but not as well. Most of the others are weighed down by needing to support commodity web developers for free or for very little.
At the end of the day, it’s the app that matters. All applications go through an evolution from a proof-of-concept to a scalable web app to a very large globally distributed web app. In the initial phases, the app is the whole of the app. In the later stages, the scalability of the app is a critical component of the application itself.
CloudScale is firmly targeted at the phase after the proof of concept, to help you build a fully scalable web application. Most importantly, it’s designed to scale up to a large, globally distributed web application.
[1] There is more background on App Engine on Niall Kennedy’s blog.
April 11th, 2008
Greg Borenstein, principal behind Music for Dozens and out-loud thinker, sums up the potential long-term impact of Amazon’s successful cloud computing model. It’s an insightful article and, I think, worthy of a close read, including the comments.
First, Greg correctly sums up the obvious:
With the announcement of Amazon’s much-anticipated SimpleDB service this week, we now officially live in a world where the kind of enormous systems run by Google, Yahoo, Ebay, et al — systems that power huge portions of the web (where 500+ million users is totally mundane) — are available on demand in small doses and at reasonable prices to anyone who needs them.
And then talks about the impact on typical web developers:
On this infrastructure, the only real difference between running a small application (a custom CMS for a medium-sized non-profit, for example) and a large one (say, Digg) is the size of your monthly bill.
…
Within a few years, a scale of computation that is currently only available to a handful of multi-billion dollar companies will be available to any pair of dorm room-bound hacker kids with $30/mo. and a pair of MacBooks.
Unfortunately, Greg misses the mark here. While Amazon Web Services (AWS) are a strong evolutionary step forward, it’s clear that by themselves, despite Amazon’s excellent marketing, they do not create ‘web-scale’ (we prefer ‘cloud-scale’) systems. They only enable developing web-scale systems without deploying your own infrastructure.
Even today’s giants, like Amazon, rely heavily on know-how and expertise that is not readily available to the masses. There are no systematic approaches to building scalable, distributed, secure, and reliable systems. Well, not yet anyway . . . (more below)
Greg makes some additional comparisons to watershed technology moments:
For a few clues as to what happens when the current, if obscure, state of the art becomes an industry-standard lowest common denominator, it helps to look at some history. This has happened at least twice in the last thirty years: once when industry standardization around x86 hardware lead to the collapse in prices for DOS- and then Windows-compatible PCs that made them ubiquitous around the world; and again when these same PCs reached a level of power and the open source software written to run on them reached a level of affordability and reliability that, together, they displaced the expensive and proprietary server systems and radically lowered the barrier to entry for web-development leading, as Tim O’Reilly has clearly outlined, to Web 2.0.
I think these are apt comparisons, but they are not taken to their logical conclusion. It’s truly the synergy of commodity hardware and commodity (open source) software that allowed cheap x86 systems to displace Big Iron. The software, in this case, provides the systematic mechanisms for repeatably leveraging cheap hardware. Without it, people would have to individually cook their own operating system and the applications on top of it or pay for expensive enterprise software.
You can think of open source software in this context as the institutionalized know-how for running cheap commodity hardware.
There is currently no parallel for web-based cloud computing services.
Tools like Amazon’s S3, EC2, SimpleDB, and the others are simply one-off point services. They are more akin, respectively, to a disk drive, a CPU, and a rolodex. By themselves they have only minimal use. It is only when they are combined well (and with other services?) by an operations or development architect with know-how that you get a synergistic effect that allows the creation of cloud-scale computing infrastructure.
Greg’s first commenter hits the nail on the head:
The mere existence of these services is far from breaking down the barrier necessary for *anyone* to scale to amazon/ebay/google size services. There are at least, as I see it, 2 things still missing that won’t be “commoditized.”
First, is the know how. Building a scalable application (really, application infrastructure) is not the same as building a basic app. And if anything, the general trends have shown that there is a great wide void of knowledge in the market about how to do this successfully.
Second, is the management of this architecture. Ask any company running a large number of EC2 instances how they are managing it. Most likely, they are using a number of custom built tools. If they’re lucky, they managed to shoehorn some of the existing systems management tools into doing what they need. Either way, there’s a lot that goes into making this work.
This is accurate, and it is part of why CloudScale Networks exists. It is possible to write a new generation of software that institutionalizes the running of Internet infrastructure. Possible and inevitable.
Greg follows through to the crux of the matter by highlighting how the new economics suddenly make it feasible for folks to pursue long-tail plays or other business models that were previously not possible.
Cloud computing is here to stay and it is absolutely a game-changing evolution of Internet infrastructure. We need to embrace and hasten the arrival of this model.
December 18th, 2007