ATIS - 0100030

Mean Time Between Outages – A Metric for Assessing Production Failure Rates in IP Routers

Organization: ATIS
Publication Date: 1 December 2010
Status: inactive
Page Count: 11

SCOPE & PURPOSE

Internet Service Providers (ISPs) face the challenge of continuously upgrading the network and growing its capacity while providing a service that meets stringent customer reliability expectations. Telecommunications companies have significant experience providing reliable telephone service, but the challenge for an ISP is more difficult because changes in Internet technology, particularly router software, are significantly more frequent and less rigorously tested than was the case in circuit-switched telephone networks. ISPs cannot wait until router technology matures: a large ISP has to meet high reliability requirements for critical applications such as financial transactions, Voice over IP (VoIP), and IPTV using commercially available technology.

The need to use less mature technology has resulted in a variety of redundancy solutions at the edge of the network, and in well-thought-out designs for a resilient core network that is shared by traffic from all applications. The field reliability of modern provider edge routers, which have a large variety of interface cards, cannot be accurately characterized by a single downtime or reliability metric, because such a metric averages the contributions of the various router components and may hide the poor reliability of some of them. This challenge is addressed by introducing granular metrics for quantifying the reliability of IP routers. The goal is to provide a practical way of applying the traditional reliability metric Mean Time Between Failures (MTBF) to a large network of edge routers.

This is done by considering a set of identical Customer Facing Line Cards and counting the failures of these cards caused by hardware and software, including entire router failures, whereas the traditional MTBF metric counts only line card failures that lead to card replacement. This motivates the introduction of a new metric, Mean Time Between Outages (MTBO). In contrast with MTBF, MTBO provides the frequency of all router failures attributed to the vendor. The MTBO metric has been accepted as a key industry metric by the QuEST Forum/TL9000 organization.
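The counting rule described above can be illustrated with a minimal sketch. This listing does not reproduce the standard's formula, so the calculation here (in-service card-hours divided by the number of counted events) is an assumption based on the usual form of a mean-time-between metric. The record type `CardOutage`, its fields, and the sample figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CardOutage:
    """One outage of a customer-facing line card (hypothetical record layout)."""
    cause: str                  # e.g. "hardware", "software", "router"
    vendor_attributable: bool   # counted toward MTBO when True
    card_replaced: bool         # counted toward traditional MTBF when True


def _mean_time(card_count, observation_hours, event_count):
    # Assumed formula: total in-service card-hours per counted event.
    if event_count == 0:
        return None  # no events observed, so the mean is undefined
    return card_count * observation_hours / event_count


def mtbo_hours(card_count, observation_hours, outages):
    """MTBO: count every vendor-attributable outage, including whole-router failures."""
    counted = sum(1 for o in outages if o.vendor_attributable)
    return _mean_time(card_count, observation_hours, counted)


def mtbf_hours(card_count, observation_hours, outages):
    """Traditional MTBF: count only failures that led to card replacement."""
    counted = sum(1 for o in outages if o.card_replaced)
    return _mean_time(card_count, observation_hours, counted)


# Made-up sample: 1000 identical cards observed for 720 hours.
sample = [
    CardOutage("hardware", vendor_attributable=True, card_replaced=True),
    CardOutage("software", vendor_attributable=True, card_replaced=False),
    CardOutage("router", vendor_attributable=True, card_replaced=False),
    CardOutage("operator error", vendor_attributable=False, card_replaced=False),
]

print(mtbo_hours(1000, 720, sample))  # 240000.0
print(mtbf_hours(1000, 720, sample))  # 720000.0
```

In this made-up sample MTBO is lower than MTBF because the software and whole-router outages are counted even though no card was replaced, which is the contrast the scope statement draws.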

This document formalizes the metric definition as an industry standard and provides an outline for metric assessment.

Document History

August 1, 2012
Mean Time Between Outages – A Generalized Metric for Assessing Production Failure Rates in Telecommunications Network Elements
Telecommunications Service Providers (SPs) face the challenge of needing to continuously upgrade the network and grow network capacity, while providing a service that meets stringent customer...
August 1, 2012
Mean Time Between Outages – A Generalized Metric for Assessing Production Failure Rates in Telecommunications Network Elements
Scope & Purpose Telecommunications Service Providers (SPs) face the challenge of needing to continuously upgrade the network and grow network capacity, while providing a service that meets stringent...
0100030
December 1, 2010
Mean Time Between Outages – A Metric for Assessing Production Failure Rates in IP Routers
SCOPE & PURPOSE Internet Service Providers (ISP) face the challenge of needing to continuously upgrade the network and grow network capacity, while providing a service that meets stringent customer...
