Capacity and performance management have always been important for application planners. The basic mechanism of both is well understood: you set goals, then establish resources and resource relationships that meet those goals at various levels of demand. Like so many things, the cloud changes capacity and performance planning, often profoundly. To address capacity and performance management in the cloud, planners need to understand their demand, identify the constraint points, assess performance in the cloud, and assess how resources and configurations affect it.
Application demand can be visualized as a curve that shows performance (quality of experience, or QoE) under various load levels. In most cases, load or demand is measured in "transactions," meaning discrete units of work presented by users and processed by applications. An application update is a transaction, but an email can also be one. The basis for capacity and performance planning is to establish the range of demand expected and the tolerable QoE within that range.
Start your planning with the users
The best place to start this process is with the users. Normally, work is presented to applications in request-and-response form, and QoE is measured by tracking how long it takes a user to complete that cycle. Demand measurements have to define this processing interval and also count the number of transactions. To get truly reliable data, you should measure through a complete business cycle -- at least a quarter or even a full year -- but where that's not possible, gather at least a full month's data and use application logs to correlate it with annual activity rates.
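The request-response measurement described above can be sketched in a few lines. This is an illustrative example only: the log format, with paired request and response timestamps, is an assumption, not a specific tool's output.

```python
# Hypothetical sketch: derive transaction counts and response times from
# paired request/response timestamps. The ISO-8601 pair format is assumed.
from datetime import datetime
from statistics import mean

def summarize(transactions):
    """transactions: list of (request_ts, response_ts) ISO-8601 string pairs.
    Returns the transaction count plus mean and worst response time."""
    latencies = [
        (datetime.fromisoformat(done) - datetime.fromisoformat(start)).total_seconds()
        for start, done in transactions
    ]
    return {"count": len(latencies), "mean_s": mean(latencies), "max_s": max(latencies)}

stats = summarize([
    ("2024-03-01T10:00:00", "2024-03-01T10:00:01"),
    ("2024-03-01T10:00:05", "2024-03-01T10:00:08"),
])
```

Run against a full month of logs, the same summary per hour or per day gives the demand curve the plan is built on.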
The biggest difference between the cloud and traditional IT is that the cloud is normally designed to scale resources under load. Many planners envision an application whose capacity expands and contracts at will, but although this would be wonderful in theory, in practice there are usually application components that aren't easily scaled. Database updates and accesses often form a bottleneck, for example. You'll need to measure not only the number of database updates and accesses but also the associated delay.
Network delay is also likely to be greater in the cloud and, consequently, harder to predict. That's particularly true if you're using the Internet for cloud access, or if your cloud provider is likely to distribute application copies geographically. It's often difficult to break out network delay in an application because it's hard to time-stamp all the steps. A satisfactory alternative is an "echo" transaction that sends a time-stamped message and receives an immediate response. By testing this across the range of cloud options and locations available, you can measure the variability of response.
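An echo probe of this kind can be sketched as below. The probe is deliberately generic: `send_echo` stands in for whatever round trip you actually test (an HTTP GET, a UDP echo, a ping to a cloud region), and the lambda used in the example merely simulates one.

```python
# Sketch of an "echo" latency probe. send_echo is any callable that
# performs one round trip to the target cloud location.
import time

def echo_latency(send_echo, samples=5):
    """Run the echo round trip `samples` times and return the
    measured round-trip times in seconds."""
    rtts = []
    for _ in range(samples):
        start = time.monotonic()
        send_echo()  # e.g., an HTTP GET against the candidate region
        rtts.append(time.monotonic() - start)
    return rtts

# Illustration only: a stand-in "network" that takes ~10 ms per round trip.
rtts = echo_latency(lambda: time.sleep(0.01), samples=3)
```

Running the same probe against each candidate region, and at different times of day, exposes both the average network delay and its variability.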
Establish 'performance zones'
The goal is to establish "performance zones," including the network, front-end processing and database activities. An application's performance will be the sum of the delays experienced in these zones, and the capacity plans will have to augment zone performance as needed to pull performance levels into QoE boundaries set by business operations.
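The zone model reduces to simple arithmetic: total response time is the sum of the per-zone delays, compared against the QoE budget. The zone figures and budget below are illustrative placeholders, not measurements.

```python
# Minimal sketch of zone accounting. All numbers are illustrative.
zones = {"network": 0.08, "front_end": 0.12, "database": 0.35}  # seconds
qoe_budget_s = 0.5  # QoE boundary set by business operations

total = sum(zones.values())
over_budget = total - qoe_budget_s          # positive means QoE is missed
worst_zone = max(zones, key=zones.get)      # first candidate for augmentation
```

Here the 0.55 s total exceeds the 0.5 s budget, and the database zone dominates, so that's where capacity augmentation would be examined first.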
Using zones will help identify delay choke points, but it also divides application performance according to the type of "capacity augmentation" needed to remedy issues, and the difficulty or expense in applying them. Generally, the easiest zone to target to improve performance through capacity augmentation is the "front-end" zone responsible for structuring information for the user and checking the updates provided before committing them to a database. You should target this zone first.
Zone-based performance and capacity correlation is the first practical step to doing performance and capacity planning for the cloud. A given zone can be optimized and its response time changed, and that will change the overall application performance level. Thinking by zone ensures you recognize that most capacity changes will affect only a limited part of the application performance picture; don't expect to speed everything up proportionally.
Don't make too many changes at one time
In fact, adding capacity in a single zone may have little effect on performance. If that zone contributes little delay, then little will be gained by adding capacity there. Because elastic scaling of cloud components can add overhead through load balancing, you may actually wind up stepping backward. To make sure your capacity changes deliver positive results, you may have to test the effect of adding capacity. Be sure to do this by zone; don't change too many things or you won't be able to correlate your actions with results.
Generally, cloud scaling delivers the greatest changes early in the scaling, simply because the percentage of change in capacity is greatest there. Going from one application instance to two doubles the resources, but going from 10 to 11 adds only 10%. If you are planning to scale an application, it's wise to test performance with a single instance (but with load balancing enabled), then add instances one at a time and observe the performance curve. This will help identify the point at which additional capacity doesn't pay sufficient dividends to justify the cost.
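The diminishing-returns arithmetic is worth making explicit: the fractional capacity gained by adding one instance is the reciprocal of the current instance count.

```python
# Illustrative arithmetic: the marginal capacity gain of adding one
# instance shrinks as the instance count grows.
def marginal_gain(instances):
    """Fractional capacity increase from adding one more instance."""
    return 1 / instances

assert marginal_gain(1) == 1.0   # 1 -> 2 instances doubles capacity (+100%)
assert marginal_gain(10) == 0.1  # 10 -> 11 instances adds only 10%
```

Plotting measured response time against instance count alongside this curve shows where the performance payback flattens out.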
In nearly all cases, the application's "back end," where database activity is concentrated, will be the primary choke point. These limitations are best addressed with careful planning of the database and storage hierarchy. Storage on SSDs, for example, can significantly improve access and update performance, but unit storage cost is higher, so it's important to measure the response time improvement and work through the cost-benefit tradeoff.
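One way to frame that tradeoff is cost per millisecond of latency saved. The sketch below is a hedged illustration; the access times, per-GB prices and capacity figure are placeholders to be replaced with your own measurements and provider pricing.

```python
# Hedged sketch: weigh a storage upgrade's response-time improvement
# against its higher unit cost. All figures are placeholders.
def cost_per_saved_ms(slow_ms, fast_ms, slow_cost_gb, fast_cost_gb, gb_needed):
    """Extra storage cost per millisecond of access latency saved."""
    saved_ms = slow_ms - fast_ms
    extra_cost = (fast_cost_gb - slow_cost_gb) * gb_needed
    return extra_cost / saved_ms

# e.g., 12 ms -> 2 ms access time, $0.02 vs. $0.10 per GB, 500 GB needed:
ratio = cost_per_saved_ms(12, 2, 0.02, 0.10, 500)
```

With these placeholder numbers, the upgrade costs $4 of additional storage spend per millisecond saved; whether that's justified depends on how far the database zone is outside the QoE budget.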
Be careful to ensure you have accurate data
The total response time and application QoE are determined by the sum of the zone response times -- network, front end and back-end database. It's critical to keep this in mind, both to ensure you have accurate data at the boundary points and to understand the overall performance change that zone-specific capacity augmentation will deliver.
Application capacity and performance management is in many ways like application deployment management; it's best addressed in parallel with development rather than considered later, when problems emerge. Make performance management part of the application development process and you'll find it far easier to accomplish.