What Is Enterprise?
You may have heard that big Internet sites like Amazon, eBay, or Google have thousands--sometimes tens of thousands, or more--of servers powering their websites. If you're reading this book, you've probably already built at least one web application of your own, and it probably had only a handful of machines behind it, perhaps even just one application server and one database. In fact, maybe you had shared hosting and only had a fraction of a full server at your disposal.
If you had a great idea for an online business and were given 1,000 servers, what would you do with them? How would you make the most of them? What operational goals would you define for reliability and speed, and how would you leverage all of that hardware to achieve those goals?
Before diving into the pieces of an enterprise system, or discussing how to build one, a good starting point is to simply define enterprise.
Unfortunately, that is not an easy task. There is no particular set of tools that, if used, will make your architecture qualify as enterprise, even if the word "enterprise" is in the product names of the tools you use. The big companies mentioned earlier have built many of their own tools to support their software stack, but they are definitely still "enterprise." Similarly, there is no single configuration of pieces that together spell enterprise. If you looked at the configuration of Google's servers and compared it to Amazon's, the two would look quite different. But they are both enterprises nonetheless--it just happens that the two enterprises have different goals, and therefore need different architectures to reach those goals.
In some sense, a site is enterprise when it feels like it is. All of the Internet behemoths once started out with a single application server and single database, just like you. It's anyone's best guess when they crossed the blurry line into "enterprise."
That said, there are certainly some criteria that, when satisfied, do make a site feel like it's enterprise. These criteria are topics of this book, and will be referred to again and again:
- It's fast. You can define a service level agreement (SLA) for how long it takes each component to do its job, which in turn allows you to define an SLA for end-to-end load times of any given web page.
- It's always available. You can define an SLA for your minimum uptimes for all critical components and aim for "four nines"--99.99% uptime.
- It scales linearly. You can scale to hundreds of thousands or even millions of users by adding hardware, with capacity growing in proportion to the machines you add.
- It's fault-tolerant. If noncritical components go down, the majority of functionality stays intact, and your users don't know the difference.
- All source code is in a source control repository.
- All new code goes through a QA cycle before it is deployed.
- There is a deployment procedure, and failed deployments can be rolled back.
- Errors are logged in a central location, and the appropriate personnel are notified in real-time.
- Logfiles and databases are backed up in a central location.
- Statistics about the website's operation can be collected and analyzed to determine which areas need attention.
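The speed and availability criteria above lend themselves to simple arithmetic. The following Ruby sketch is only an illustration: the method names and the component timings are hypothetical, and it assumes a page's components run one after another, so their individual SLAs add up to the end-to-end figure.

```ruby
# Illustrative SLA arithmetic. Method names and sample numbers are
# assumptions made for this sketch, not values from any real system.

MINUTES_PER_YEAR = 365 * 24 * 60 # 525,600 in a non-leap year

# Minutes of downtime per year permitted by an uptime target given in percent.
def downtime_budget_minutes(uptime_percent)
  (MINUTES_PER_YEAR * (100 - uptime_percent) / 100.0).round(2)
end

# Worst-case page load time when components run sequentially:
# the end-to-end SLA is the sum of the per-component SLAs.
def end_to_end_sla_ms(component_slas_ms)
  component_slas_ms.values.sum
end

# Hypothetical per-component budgets, in milliseconds.
page_components = { database: 50, application: 120, rendering: 30 }

puts downtime_budget_minutes(99.99)
puts end_to_end_sla_ms(page_components)
```

Under these assumptions, "four nines" leaves roughly 52.56 minutes of downtime per year, and the three sample components yield a 200 ms end-to-end budget.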
Implicit in the preceding list are a number of job functions and departments other than software development. Reading between the lines, you find:
- A database administrator (DBA) who sets up failover databases and ensures backups are available. A DBA can also tune database configuration parameters and control the physical mapping of data to disks to improve performance. Many also consult on schema design to ensure optimal performance and data integrity.
- A quality assurance engineer (QA) who tests release candidates before they are put into production and tracks issues to be fixed by developers.
- An operations or release engineer who manages the releases, creates deployment plans, and rolls out your new software in the wee hours of the night.
- An information technology engineer (IT) who maintains internal machines that house backups, logfiles, etc.
Having these people in your organization will push your systems architecture toward "enterprise." Similarly, designing your system to be enterprise creates the need for all of these individuals. In some sense, when your company itself feels like an enterprise, your software is probably getting to be enterprise, too. When the two are out of step, you will know it because either half of the engineers will have nothing to do or everyone will be stepping on each other's toes.
Every website begins its life with a single developer and a single line of code. Figure 1-1 shows a simple configuration of a Rails application connected to a database. You will likely spend quite a bit of time developing your application on this setup before it's ready for its first user.
When it's time to launch, there are some issues that ought to be considered. Figure 1-2 shows the same configuration, but with redundancy at the application level, and failover at the database level.
There are two copies of the application so that in the event one machine fails, there is still another that can handle incoming traffic. Similarly, in the event of a hardware failure on the database machine, a copy that is a transaction or two behind can be brought online quickly.
Even if you are barely using any of the resources on either the application or database machine, redundancy and failover are a very good idea. At this point, neither of these considerations is aimed at managing load--that comes later. Rather, both are intended to ensure the availability of your web application. Reprovisioning a machine, configuring it, and loading all your software and data from backups can cause quite a bit of downtime. During that time, your customers can find your competitors' sites, and they are likely to form negative opinions about your site's reliability as well.
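To see why a second machine helps availability even when load is light, consider a rough calculation. This Ruby sketch assumes that machines fail independently of one another, which is rarely completely true in practice (shared power, network, or software bugs can take out both at once), and the method name and the 99% figure are hypothetical.

```ruby
# Probability that at least one of several redundant machines is up,
# assuming independent failures (an idealization).
def redundant_availability(single_availability, copies)
  1.0 - (1.0 - single_availability)**copies
end

# Hypothetical example: each application server is up 99% of the time.
one_server  = redundant_availability(0.99, 1)
two_servers = redundant_availability(0.99, 2)

puts format("one server:  %.4f", one_server)
puts format("two servers: %.4f", two_servers)
```

With these assumed numbers, the chance that both machines are down at once is 0.01 × 0.01 = 0.0001, so the pair is available about 99.99% of the time.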
With this configuration, and perhaps even a good deployment strategy, there is plenty of work within the application and data layers that can be done before you need to add complexity to your system in the form of encapsulated services or asynchronous processes. Depending on the feature set of your web application, this may even be as far as you need to go. You are already satisfying a number of the criteria that define the elusive concept of "enterprise."
There is, within an enterprise, the need to scale horizontally as well. Only so many engineers can work in one codebase before no one can work at all. Even if there is only one chef in the kitchen, there is still only so much space for the sous-chefs.
A common way to deal with this human scaling problem is to break up a large application into smaller pieces, or services, each responsible for a specific function within the enterprise. It's no surprise that the software splits often follow organization boundaries so that individual teams can take on full ownership of their pieces of the overall system.
Each service has its own full stack that mirrors the stack of the traditional website from Figure 1-2. The difference is that a service is responsible for a small fraction of the duties that make up the entire website, usually one specific, specialized group of related functionality. It's possible--and sometimes preferable--to abstract all database access behind services. The front-end website then becomes a consumer of these services and has no need for a database of its own, as shown in Figure 1-3.
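The idea of a front end that consumes services instead of owning a database can be sketched in a few lines of Ruby. Everything named here is hypothetical: `CatalogServiceClient`, `InMemoryTransport`, and the `/products/...` paths are invented for illustration, and the in-memory transport stands in for whatever network protocol a real service would use.

```ruby
# Hypothetical service client: the front end asks a service for data
# rather than querying a database itself.
class CatalogServiceClient
  # transport is any object that responds to #get(path) and returns a Hash.
  def initialize(transport)
    @transport = transport
  end

  def product(id)
    @transport.get("/products/#{id}")
  end
end

# Stand-in transport so the sketch runs without a network.
# A real one would make a remote call to the service.
class InMemoryTransport
  def initialize(responses)
    @responses = responses
  end

  def get(path)
    @responses.fetch(path) { raise KeyError, "no such resource: #{path}" }
  end
end

transport = InMemoryTransport.new({ "/products/42" => { id: 42, name: "Widget" } })
catalog   = CatalogServiceClient.new(transport)

puts catalog.product(42)[:name]
```

Because the front end only sees the client's methods, the team that owns the service is free to change its schema or storage without touching front-end code, as long as the interface stays the same.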
When you add services into the mix, it's hard to argue your system is not enterprise.
There are a number of other components commonly found in an enterprise setup. Figure 1-4 shows a generic enterprise configuration. Powering the front-end website are a number of services. There is also a collection of asynchronous processes that receive information from services via a messaging queue. In addition to the front-end website, there is a web services layer aimed at providing external clients with a subset of the functionality available inside the firewall. There is also redundancy and failover in all critical places. Finally, each service database feeds a data warehouse, which powers site reporting and decision support.
Note, of course, that simply replicating this configuration is not enough. Each piece of the system is an independent, isolated, and encapsulated system in its own right and deserves thorough and thoughtful design. What goes where and how to implement each individual unit is as much an art as it is a science.
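As one small illustration of such a unit, the hand-off between a service and an asynchronous process can be sketched with Ruby's built-in thread-safe `Queue`. This is a sketch only: an in-process queue stands in for a real message broker, and the method name, the `:done` sentinel, and the order events are all hypothetical.

```ruby
# An in-process Queue stands in for a messaging system; a background
# thread stands in for an asynchronous process.
def process_asynchronously(events)
  queue     = Queue.new
  processed = []

  # The asynchronous process: consumes messages until it sees :done.
  worker = Thread.new do
    while (message = queue.pop) != :done
      processed << "sent receipt for order #{message[:order_id]}"
    end
  end

  # The service publishes each event and moves on without waiting
  # for the worker to handle it.
  events.each { |event| queue << event }
  queue << :done

  worker.join # wait here only so the sketch can show the results
  processed
end

puts process_asynchronously([{ order_id: 1 }, { order_id: 2 }])
```

The design point is that the publisher never calls the worker directly; it only knows about the queue, so slow or failed background work does not hold up the service that produced the event.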
If you enjoyed this excerpt, buy a copy of Enterprise Rails.