The-Day-After Networks: A First-Response Edge-Network Architecture for Disaster Relief

Recent disasters, including natural phenomena (e.g., Hurricane Katrina) and terrorism (e.g., 9/11), have exposed the fragility of our national communication infrastructure. Many familiar services that are relied on for everyday communication (e.g., ubiquitous cell phone connectivity and always-on Internet connections) immediately become nonfunctional in emergency situations because the supporting infrastructure fails through both system damage and system overuse. Unfortunately, these are precisely the situations in which demand for communication increases and effective communication support is most needed. Given the limitations of current network architectures, the goal of this proposal is to develop a new network architecture, including supporting communication mechanisms, to enable survivable communication and networking in disaster scenarios. The novelty of our architecture for the day after networks stems from the fact that the protocols and services will be designed from the bottom up to support the specific communication demands of disaster management and recovery.

The main goal in supporting disaster rescue and recovery efforts is enabling effective communication among the diverse rescue workers, as well as providing connectivity to survivors. The day after networks (DAN) need to provide communication to support recovery and relief efforts. However, during a disaster, the standard communication infrastructure, including switching stations, underground fibers and cell phone towers, may be damaged, destroyed or without power. Consequently, landline phones, Internet modems, and television sets will not be operational. While emergency response personnel may have some backup communication support, current provisioning for these personnel relies on infrastructure support, such as cell towers, that may not be operational during a major disaster. Since we cannot simply throw more traditional resources at DAN environments, it is important to first understand the actual communication requirements of these environments.

There are two main differences between traditional Internet communication and communication during disaster and recovery. First, the communication paradigm is inherently different from the standard host-to-host communication seen in the Internet today. Consider the site of a natural disaster. Survivors must be located, information regarding relief resources must be efficiently disseminated to rescue workers, survivors may want to connect with rescue workers or with their families, rescue workers may need fast communication channels with their superiors, and emergency organizers need to delegate tasks to volunteers. While the goal of the Internet has been to maintain end-to-end connectivity for hosts, the main goal of a DAN is to support the services needed by the users and thus to ensure host-to-service communication. Since the people involved in disaster recovery play particular roles to provide these services (e.g., police officers, firefighters, rescue workers, volunteers, survivors), we believe that communication in disaster recovery networks must be role-based.

Second, repair of the existing infrastructure may be slow and the current provisioning for emergency response is insufficient, so the need for more immediate connectivity demands the inclusion of all available communication resources. The result is a network of heterogeneous communication technologies, each of which may on its own have only intermittent connectivity to any remaining infrastructure. Unfortunately, the fundamental Internet design tenets make it ineffective to adopt the Internet architecture for communication and networking in disaster scenarios, even with incremental updates (e.g., adding redundancy and enriching the topological connectivity). First and foremost, the Internet was designed for fail-stop robustness. That is, the failure of some part of the Internet will be locally confined and will not propagate or impact the proper functioning of other surviving parts of the Internet. This fail-stop robustness was appropriate sixty years ago when the nation was under the threat of massive nuclear attacks. In the new century, where the primary concerns have moved to natural disasters and terrorism, it is unacceptable to have a network in which parts simply stop working in a crisis. Instead, the network must survive modern disasters, even with degraded performance. Second, the Internet was designed with the main goal of supporting scalable unicast routing. For modern disaster scenarios that center on providing services, this emphasis on unicast communication is clearly off-target. Third, the Internet paradigm assumes connectivity at all times, and also assumes that all roads lead to the Internet. In DAN environments, nodes are expected to have intermittent connectivity and may be able to reach a different subset of nodes through each of their interfaces. Finally, the inherent need for prioritization and security of traffic demands that both be natively supported components of the network architecture.

While support for communication in disconnected networks has been proposed using delay tolerant networking, the role-based communication needed for disaster recovery does not fit the current model of delay tolerant networks. Although there will always be a need to support some host-to-host communication, the host-to-service model of communication demands different underlying communication paradigms at all layers of the protocol stack. At the routing layer, we believe that such communication can best be supported through native anycast routing, enabled with a combination of hop-by-hop reliability and intelligent flooding. At the transport layer, since the underlying communication paradigm is no longer host-to-host communication, the definition of end-to-end is no longer clear. Therefore, we believe that the transport layer will be a best-effort entity that manages the expected reliability of communication through redundancy and caching.
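
As a rough illustration of the host-to-service model at the routing layer, the following Python sketch delivers a request to the nearest node holding a given role rather than to a named host. It is not the proposed protocol: the function name anycast_flood, the adjacency-list topology, the role table, and the hop limit are all hypothetical, a simple hop-limited flood with duplicate suppression stands in for the "intelligent flooding" mentioned above, and hop-by-hop reliability is not modeled.

```python
from collections import deque


def anycast_flood(links, roles, source, target_role, ttl):
    """Find the nearest node that holds target_role.

    links:  dict mapping node -> iterable of neighbor nodes (hypothetical topology)
    roles:  dict mapping node -> set of role names held by that node
    ttl:    maximum number of hops the request may travel

    Returns (node, hops) for the closest role holder, or None if no
    holder is reachable within ttl hops.
    """
    seen = {source}                 # duplicate suppression: forward to each node once
    frontier = deque([(source, 0)])
    while frontier:
        node, hops = frontier.popleft()
        if target_role in roles.get(node, ()):
            return node, hops       # any holder of the role satisfies the request
        if hops == ttl:
            continue                # hop limit reached: do not forward further
        for neighbor in links.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, hops + 1))
    return None


# Hypothetical disaster-site topology: a survivor's device reaches
# role holders only through two ad hoc relays.
links = {
    "survivor": ["relay1"],
    "relay1": ["survivor", "relay2", "volunteer"],
    "relay2": ["relay1", "medic"],
    "volunteer": ["relay1"],
    "medic": ["relay2"],
}
roles = {"volunteer": {"volunteer"}, "medic": {"medic"}}

print(anycast_flood(links, roles, "survivor", "medic", ttl=3))
print(anycast_flood(links, roles, "survivor", "medic", ttl=2))
```

In this sketch the sender never names a destination host; it names only the role it needs, and the hop limit bounds how far the request spreads. With a limit of three hops the request reaches the medic, while with a limit of two it does not.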

We present Phoenix, a complete network architecture for the day after networks, including a suite of enabling mechanisms and algorithms for survivable communication and networking in disasters. Phoenix differs from the Internet in almost all aspects, including its infrastructure construction, naming, native communication paradigm, protocol stack structure, and built-in services. In the following sections, we identify the challenges and opportunities for Phoenix, present our design of the Phoenix architecture that is both technically feasible and practically usable, and propose controls and optimizations that will likely be enforced in future DANs.

If successful, this proposed work will lead to more robust national communication capabilities that will significantly improve disaster management. By providing an information backplane when it is most needed, Phoenix will improve the efficiency of recovery efforts, significantly reducing the human and resource cost of catastrophic events. This work may also play a role in setting new standards for communication providers (e.g., cell phone manufacturers and service providers) and the automotive industry such that future generations of communication devices (including ones embedded in vehicles) are compliant with post-disaster requirements. Consequently, better safety can be provided in a future where we become more dependent on technological artifacts and so are increasingly vulnerable to attacks that subvert them.