High Availability – Planned and Unplanned Downtime

Calculating the degree of availability is tricky: you have to break the system into components, calculate the availability of each component, and then consolidate those figures into the availability of the entire system. Here are the four steps to calculate how available your system is.

**Step 1:** Decide the level of Availability you need

Downtime can be categorized into planned (or scheduled) and unplanned (or unscheduled) downtime. Maintenance tasks such as installing updates or making configuration changes usually result in planned downtime. Unplanned downtime is caused by events that are unknown until they occur, such as a hardware failure or a network outage.

Because planned downtime is announced well in advance and its impact on users can be mitigated with workarounds, it is sometimes excluded when calculating availability. Whether to include planned downtime is at your discretion. Depending on what you count, there are three availability levels:

- Highly Available: the system is available during specified operating hours, with no unplanned outages
- Continuous Operations: the system is available 24 x 7, with no planned outages
- Continuous Availability: the system is available 24 x 7, with no planned or unplanned outages
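The effect of including or excluding planned downtime can be sketched in a few lines of Python. This is only an illustration: the function name `measured_availability`, the convention of removing planned windows from the reporting period when they are excluded, and the sample hours are all assumptions, not something prescribed above.

```python
def measured_availability(period_hours, unplanned_down_hours,
                          planned_down_hours=0.0, count_planned=True):
    """Return the fraction of time the system was available.

    Hypothetical helper: if count_planned is False, planned maintenance
    windows are removed from the reporting period instead of being
    counted as downtime (one possible convention, assumed here).
    """
    if count_planned:
        scheduled = period_hours
        downtime = unplanned_down_hours + planned_down_hours
    else:
        scheduled = period_hours - planned_down_hours
        downtime = unplanned_down_hours
    if scheduled <= 0:
        raise ValueError("reporting period must be longer than planned downtime")
    return (scheduled - downtime) / scheduled


# Sample values (assumed): a 30-day month with 4 h planned and 2 h unplanned downtime.
with_planned = measured_availability(720, 2, 4, count_planned=True)
without_planned = measured_availability(720, 2, 4, count_planned=False)
print(f"Counting planned downtime:  {with_planned:.3%}")
print(f"Excluding planned downtime: {without_planned:.3%}")
```

The same month looks noticeably better when planned downtime is excluded, which is why the choice should be made explicitly before any figures are compared.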

**Step 2:** Break the System into components

A software system is built by integrating various software and hardware subsystems (components), and the downtime or failure of any subsystem results in partial or full unavailability of the system. To determine the availability of the target system, you therefore need to break it into components and identify the availability of each one. Each component should be significant enough that its failure, on its own, can bring down the system or some other component.

**Step 3:** Measure the Availability of each component

To measure the availability of a component, you need to know its Mean Time Between Failures (MTBF) and Mean Time To Recover (MTTR). Once you have this information, use the formula Availability = MTBF / (MTBF + MTTR) to get the availability of the component.
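As a minimal sketch, the formula translates directly into Python. The function name `availability` and the sample MTBF and MTTR figures are illustrative assumptions; both values must be in the same time unit.

```python
def availability(mtbf, mttr):
    """Availability = MTBF / (MTBF + MTTR), with both values in the same unit."""
    if mtbf <= 0 or mttr < 0:
        raise ValueError("MTBF must be positive and MTTR must not be negative")
    return mtbf / (mtbf + mttr)


# Sample values (assumed): a component that fails on average every 990 hours
# and takes 10 hours to recover.
print(f"{availability(990, 10):.2%}")  # prints 99.00%
```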

You can get availability data from the vendors who provide your infrastructure or software.

**Step 4:** Consolidate the availability of the components

The components of a subsystem are said to operate in series if the failure of any one component causes the subsystem to fail. In that case, multiply the availabilities (A) of the components to find the availability of the subsystem: A_{subsystem} = A_{component_1} x A_{component_2} x … x A_{component_n}
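The series rule can be sketched as a one-line product. The function name `series_availability` and the sample values are assumptions made for illustration.

```python
import math


def series_availability(availabilities):
    """Availability of components in series: the product of their availabilities."""
    return math.prod(availabilities)


# Sample values (assumed): two components at 99.00% and 99.99%.
print(f"{series_availability([0.99, 0.9999]):.4%}")
```

Because every factor is at most 1, a series subsystem is never more available than its weakest component.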

The components of a subsystem are said to operate in parallel if the subsystem fails only when ALL of its components fail. If one component fails, the others take over. In that case, multiply the unavailabilities (UA) of the components and subtract the product from 1 to find the availability of the subsystem.

A_{subsystem} = 1 - (UA_{component_1} x UA_{component_2} x … x UA_{component_n})

where UA_{component} = 1 - A_{component}
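The parallel rule can be sketched the same way: convert each availability to an unavailability, multiply, and subtract from 1. The function name `parallel_availability` and the sample values are assumptions made for illustration.

```python
import math


def parallel_availability(availabilities):
    """Availability of redundant components: 1 minus the product of unavailabilities."""
    return 1.0 - math.prod(1.0 - a for a in availabilities)


# Sample values (assumed): two redundant components at 99.00% and 99.99%.
print(f"{parallel_availability([0.99, 0.9999]):.4%}")
```

A parallel subsystem is at least as available as its best component, which is the whole point of redundancy.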

Consider a system with three subsystems (components) A, B, and C, where component B is itself a combination of components B1 and B2. Here A, B, and C are in series, and B1 and B2 are in parallel.

[Figure: Calculating Degree of High Availability]

To calculate the availability of the sample system above, with sample values A_{A} = 99.00%, A_{B1} = 99.00%, A_{B2} = 99.99%, and A_{C} = 99.99%, the steps are as follows.

Availability = A_{A} x A_{B} x A_{C}

= A_{A} x { 1 - (1 - A_{B1}) x (1 - A_{B2}) } x A_{C}

= 99.00% x { 1 - (1 - 99.00%) x (1 - 99.99%) } x 99.99%

= 99.00% x 99.9999% x 99.99%

≈ 98.99%
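The whole calculation can be checked with a short script. This is a sketch only: the helper names `series` and `parallel` are assumptions, and the availability figures are the sample values used in the calculation above.

```python
import math


def series(*availabilities):
    """Components in series: multiply the availabilities."""
    return math.prod(availabilities)


def parallel(*availabilities):
    """Components in parallel: 1 minus the product of the unavailabilities."""
    return 1.0 - math.prod(1.0 - a for a in availabilities)


# Sample values from the worked example.
a_a, a_b1, a_b2, a_c = 0.99, 0.99, 0.9999, 0.9999

a_b = parallel(a_b1, a_b2)      # B1 and B2 are redundant
system = series(a_a, a_b, a_c)  # A, B, and C are in series

print(f"A_B = {a_b:.4%}")                         # prints A_B = 99.9999%
print(f"System availability = {system:.2%}")      # prints System availability = 98.99%
```

Note that the redundant pair B contributes almost nothing to the loss; the result is dominated by component A, the least available component in the series chain.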

Do you know a better way to calculate availability? Leave your thoughts in the comments box.