Lessons from the Derecho: When Industry Self-Regulation Is Not Enough


The FCC released a fairly thorough report on the widespread 9-1-1 failure that followed the June 2012 “derecho” windstorm. For those who don’t remember, the derecho differs from most weather events by coming up almost without warning. According to the report, carriers had approximately two hours of warning from the time the derecho started in the Ohio Valley to when it hit the D.C. Metro region.

As a consequence of the damage done by the derecho, Northern Virginia experienced a massive failure of its 9-1-1 network, leaving over 1 million people with working phones (at least in some places) but no access to 9-1-1. West Virginia experienced systemic problems as well, as did a scattering of locations in other states impacted by the derecho. Verizon maintains the network in Northern Virginia, while Frontier manages the network in West Virginia.

The report concluded that both Verizon and Frontier failed to follow industry best practices or their own internal procedures. To be clear, this was not a massive dereliction of duty. But the accumulation of some corner cutting over here, some poor practice over there, meant that when the unpredicted crisis hit, the system suffered critical failures precisely when most needed. Unlike just about every other part of the network, where providers balance the cost of hardening a network against potential events with a number of other factors, the core 9-1-1 system is explicitly supposed to remain operational in even the most extraordinary circumstances. It is the foundation of public access to emergency services. As long as I can contact the phone network, I should be able to get 9-1-1 service. Public safety responders rely on the public reporting emergencies so that they can efficiently deploy resources as much as the public depends on its ability to contact emergency services through 9-1-1.

It is particularly shocking to see this happen with Verizon. As I’ve observed before, Verizon generally prides itself on maintaining a high-quality network. Customer satisfaction surveys generally agree with this self-assessment. It is part of Verizon’s overall business strategy: build the best network, attract the customers willing to pay a premium for it. Yes, part of that strategy means selling off or generally neglecting the copper footprint outside the FIOS territories. But the 9-1-1 failure occurred in Northern Virginia, precisely the kind of high rate-of-return territory Verizon targets for top-of-the-line service. So what happened?

What happened in the derecho illustrates the problem of relying on general incentives and industry best practices instead of regulation for critical safety and reliability issues. In understanding this lesson, we need to understand that it is not that Verizon (or Frontier, or other companies) are evil or greedy or callous or any of the other delightful adjectives people like to use to personalize this. They are not. Verizon and Frontier understand the importance of keeping 9-1-1 going just fine. They are all for it. They are part of the group that develops the industry best practices, which they supplement with their own internal procedures.

Nevertheless, in economic terms, maintaining 9-1-1 is a deadweight cost of doing business. Expenditures on 9-1-1 do not contribute to the bottom line in the way fiber deployment or spectrum acquisitions do. Maintaining the 9-1-1 system is essentially a “public safety tax.” Yes, they get to recover the expenses in a general way through fees tacked on to the phone bill. But when deciding whether to spend a dollar more on 9-1-1 or how carefully to follow industry best practices rather than cut corners, the rational network operator has plenty of countervailing incentive at the margin to divert that dollar from 9-1-1 to something that contributes to the bottom line, or to push off some recommended maintenance just a little bit, or otherwise cut a few corners here and there.

The other important lesson is that, when dealing with emergency services in extreme conditions, it takes relatively few errors to have potentially far-reaching impact. This makes 9-1-1 even more expensive to maintain. What can pass as a reasonable practice in managing a giant enterprise like Verizon’s telephone system is unacceptable in 9-1-1. Since all humans make errors from time to time, this means additional checks and redundancies (and workers). Again, carriers have plenty of generic incentive to make sure their 9-1-1 systems remain intact, but for any specific decision about investing resources, the line worker or manager will invariably weigh the cost against much more attractive alternatives.

 

Regulation Comes In Where Industry Standards Are Not Enough

Regulation and enforcement change the incentive equation at the margin. First, in a behavioral way, making a rule mandatory and putting in an enforcement and monitoring structure generally make people take the requirements more seriously. To take an everyday example, we all have incentive to drive carefully. We still have posted speed limits, because that sets an expectation about what it actually means to drive carefully. We have enforcement and penalties, because otherwise people decide to let their own judgment about what’s safe be their guide.

The other, rather direct, way regulation alters incentive is that it increases the direct cost of not making the investment. To use the speed limit analogy, I may still decide to go 50 in a 30 MPH zone because I’m in a rush and decide to risk it. But I’m a lot less likely to do it in a photo-enforcement zone where the fine is doubled for going more than 15 MPH above the speed limit.

In looking at the lessons from the derecho, we need to recognize that “industry has incentive to do X” is often not enough. In particular, in the case of emergency and reliability requirements for networks, we need to recognize that the general incentive of carriers to make sure their networks stay up in a crisis doesn’t guarantee the industry will do what it needs to do, because industry participants also have incentive to spend that money on things that more directly contribute to the bottom line.

That has several immediate implications. First, the FCC needs to make sure that all carriers are up to standard on 9-1-1, especially those carriers that actually maintain the 9-1-1 network rather than simply route 9-1-1 calls to the network. As I said above, Verizon and Frontier are not uniquely bad actors. Odds are good that every carrier, while generally compliant and taking its 9-1-1 obligations seriously, has similarly cut a few corners or neglected a few procedures here and there. Why? Because they all have the same incentives. A few spot inspections to make sure every carrier is on top of things will help prevent any similar 9-1-1 failures elsewhere.

Longer term, as the FCC considers both the response to Hurricane Sandy and the future of the phone network, we need to carefully consider where we need to rely on regulation rather than on industry incentives and self-regulation. That certainly applies to 9-1-1, but it also applies to network reliability generally. No network operator wants downtime for any reason. But that does not guarantee that carriers will find the right trade-off between cost and benefit that we as a nation need in order to have a reliable infrastructure. When our broadband networks go down, it is more than an inconvenience. It disrupts business and disrupts people's lives. Carriers have incentive to minimize downtime, but they have other incentives as well. We need federal and state agencies to be willing to push carriers to spend that extra dollar when it's needed, even if they would rather spend it on something else.

In short, Commissioner Rosenworcel is spot on in her statement when she says “it is time for an honest accounting of the resiliency of our Nation’s network infrastructure in the digital and wireless age.” That starts with recognizing that actual, enforceable rules with real teeth are necessary to make sure that core emergency functions work properly. General incentives are all well and good, but for critical emergency functions, they are simply not good enough. 
