For decades, buyers of network services dealt with issues of service assurance through a service-level agreement (SLA) from their provider. Because Software as a Service (SaaS) is network-delivered, both buyers
Essentially, all “stock” SaaS SLAs are the same. They offer a refund for failure to deliver service in a given period. Yet a SaaS user’s goal is to get service, not a refund. Unfortunately, the options for financial recovery in traditional SaaS SLAs are minimal. There’s essentially no chance a provider would agree to an SLA that compensates a user for the cost of downtime or lost business, what lawyers call “consequential damages,” so what an SLA will generally offer is simply a refund of the service price for the period of the outage. Many cloud buyers don’t see that as enough incentive for their provider to keep them up and running.
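To see why a refund is a weak incentive, it helps to run the arithmetic. The sketch below assumes a simple pro-rated refund (fee multiplied by the fraction of the billing period lost); the function name, the 720-hour month, and the dollar figures are illustrative assumptions, not terms from any particular provider's SLA.

```python
def outage_refund(period_fee: float, outage_hours: float,
                  hours_in_period: float = 720.0) -> float:
    """Pro-rated refund for an outage, capped at the full period fee.

    Assumption: the SLA refunds the service price in proportion to the
    time lost. Real contracts may use credit tiers or thresholds instead.
    """
    if hours_in_period <= 0:
        raise ValueError("hours_in_period must be positive")
    # Clamp the lost fraction of the period to the range 0..1.
    fraction = min(max(outage_hours, 0.0) / hours_in_period, 1.0)
    return round(period_fee * fraction, 2)


# Hypothetical numbers: a $3,000-per-month service that is down for six hours.
print(outage_refund(period_fee=3000.0, outage_hours=6.0))  # 25.0
```

Under these assumptions, most of a lost business day is worth $25 to the provider, which is why the contract terms discussed next matter more than the refund itself.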
The good news is that you can get closer to your uptime goal by paying special attention to two things rarely discussed in SaaS service contracts: proactivity and escalation.
Proactivity is a cloud provider fault management system that focuses on telling the buyer/user of SaaS about a problem or potential problem when the condition is detected in the cloud and not when it’s already impacting the user’s service. Proactive fault notification can be critical for SaaS applications that are used irregularly but that are important when run (many users cite business analytics as an example). In these cases, it can let the SaaS user reschedule activities or adopt work-around practices rather than running into the failure head-on.
Most cloud providers will have some service status available to customers, but most will require the customer to log on to obtain it, or in some cases even to try to run the service. A better approach is to have email or IM alerts sent to a list of your support professionals and to both line and IT management. These alerts can then trigger an internal discussion on response, which could range from testing the service performance in-house, to changing operating procedures to reduce the impact of an outage, to filing a customer service complaint with the provider. In most cases, the first step will be to check the details of the SaaS provider’s service status and make decisions based on the specifics.
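The alert fan-out described above can be sketched in a few lines. This is only an illustration of the idea: the event fields, the severity labels, the group names and the example.com addresses are all hypothetical, and actually sending the email or IM is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatusEvent:
    service: str
    severity: str  # hypothetical labels: "potential" or "impacting"
    detail: str


# Hypothetical distribution list: support staff plus line and IT management.
ALERT_LIST = {
    "support": ["support-oncall@example.com"],
    "line_management": ["ops-manager@example.com"],
    "it_management": ["it-director@example.com"],
}


def build_alerts(event, alert_list=ALERT_LIST):
    """Return one (recipient, subject) pair per person on the list."""
    subject = f"[{event.severity.upper()}] {event.service}: {event.detail}"
    return [(addr, subject)
            for group in alert_list.values()
            for addr in group]


event = StatusEvent("analytics", "potential", "storage latency rising")
for recipient, subject in build_alerts(event):
    print(recipient, "<-", subject)
```

The point of the design is that everyone on the list hears about a "potential" condition at the same time, before anyone has to log on and discover it.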
Escalation is the strategy that enterprises report has the greatest positive effect on their service levels. An escalation term in a SaaS contract requires the provider to involve steadily higher-level people in the resolution of the service problem as the outage continues. For example, after one hour of outage, the provider’s local tech manager must call the buyer’s designated tech contact. After two hours, the regional tech must be involved, and after three, the regional manager must contact the buyer’s contract officer. Technical escalation prevents a problem from being stalled by the absence of local resources to remedy it, and escalation into the management team ensures that key people are aware of buyer problems, and perhaps embarrassed by them.
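An escalation term of this kind is essentially a schedule, and writing it down as data makes it easy to check what the provider owes you at any point in an outage. The sketch below encodes the one-, two- and three-hour example from the paragraph above; the class and function names are hypothetical, and a real contract would define its own thresholds and roles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationStep:
    after_hours: float   # outage duration that triggers this step
    provider_role: str   # who on the provider side must act
    action: str          # what they are required to do


# The example schedule from the text; thresholds are illustrative.
SCHEDULE = [
    EscalationStep(1, "local tech manager", "call the buyer's designated tech contact"),
    EscalationStep(2, "regional tech", "join the resolution effort"),
    EscalationStep(3, "regional manager", "contact the buyer's contract officer"),
]


def steps_due(outage_hours, schedule=SCHEDULE):
    """Return every escalation step that should already have happened."""
    return [step for step in schedule if outage_hours >= step.after_hours]


for step in steps_due(2.5):
    print(f"after {step.after_hours}h: {step.provider_role} must {step.action}")
```

Two and a half hours into an outage, this schedule says the local tech manager should have called and the regional tech should be engaged; if neither has happened, the buyer has a concrete contract term to point to.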
That two-layer goal of technical and management escalation is critical in crafting the agreement. The technical escalation track should start shortly after the outage becomes annoying to users of the SaaS offering, and should continue until the provider’s central technical experts are involved. Each escalation step should be accompanied by a report of who is notifying the next link in the support chain and what the new contact is being told about the problem. For management escalation, each newly engaged SaaS application manager (or executive) should be expected to provide a report of where the technical processes are, what billing adjustments are being entered and when a milestone in problem determination or service restoration is expected.
Proactivity and escalation can be costly to the provider, which means that they may involve additional charges for the buyer. There’s always a temptation to demand every possible concession in a deal, but it’s neither realistic nor cost-effective to do that with cloud services. Knowing the real tolerance of application users for degraded service or service failures, and matching the proactive notification and escalation terms to that reality, will ensure that cloud services are optimally available and optimally priced.
This was first published in May 2012