Joint Progressive Network and Datacenter Recovery after Large-Scale Disasters
- Publication Year :
- 2020
Abstract
- Large-scale disasters affecting both network and datacenter (DC) infrastructures can cause severe disruptions in cloud-based services. During post-disaster recovery, repairs are usually carried out progressively, in stages, because repair resources are limited. The order in which network elements and DCs are repaired can significantly affect users’ ability to reach important content/services. We investigate joint progressive network and DC recovery, in which network recovery and DC recovery are coordinated so that users have access to the maximum possible amount of content/services at each repair stage. We first solve the optimization problem of joint progressive recovery to find the optimal sequence of network element and DC repairs, with the objective of maximizing cumulative weighted content reachability in the network. We then propose a scalable heuristic for scheduling the sequential repair of network nodes/links and DCs. Our model assumes that, at each repair stage, one network node with its adjacent links and one DC can be fully repaired; however, full recovery may not be guaranteed due to limited resource availability. Hence, we also propose a “resource-aware” approach (with two resource-allocation strategies, namely “selective allocation” and “adaptive allocation”), which considers both full and partial recovery of elements based on the resources available at each stage. We show that, compared to a disjoint progressive recovery approach, in which the network recovery and DC recovery plans are independent, our joint progressive recovery approach provides significantly higher per-stage content reachability in the network.
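To make the staged model in the abstract concrete, the following is a minimal illustrative sketch of a greedy joint schedule. It is not the authors' optimization model or their heuristic. It assumes a toy setting in which each stage repairs one node (restoring its adjacent links) and one DC, a DC is identified by the node hosting it, and reachability is the summed weight of distinct content that each user node can reach over working nodes. All names (`reachability`, `greedy_joint_recovery`, the topology and weights) are hypothetical.

```python
from itertools import product


def reachability(adj, up_nodes, up_dcs, dc_contents, weights, users):
    """Sum, over user nodes, of the weights of contents reachable via working nodes."""
    total = 0.0
    for u in users:
        if u not in up_nodes:
            continue
        # Depth-first search restricted to working nodes.
        seen, stack = {u}, [u]
        while stack:
            n = stack.pop()
            for m in adj[n]:
                if m in up_nodes and m not in seen:
                    seen.add(m)
                    stack.append(m)
        # A DC counts only if it is repaired AND its host node is reachable.
        reachable = set()
        for dc in up_dcs & seen:
            reachable |= dc_contents[dc]
        total += sum(weights[c] for c in reachable)
    return total


def greedy_joint_recovery(adj, dc_contents, weights, users, damaged_nodes, damaged_dcs):
    """Per stage, pick the (node, DC) pair that maximizes that stage's reachability."""
    up_nodes = set(adj) - set(damaged_nodes)
    up_dcs = set(dc_contents) - set(damaged_dcs)
    todo_nodes, todo_dcs = set(damaged_nodes), set(damaged_dcs)
    schedule, cumulative = [], 0.0
    while todo_nodes or todo_dcs:
        best = None
        # If one list is exhausted, keep repairing the other (None = no repair).
        for n, d in product(sorted(todo_nodes) or [None], sorted(todo_dcs) or [None]):
            nodes = up_nodes | ({n} if n is not None else set())
            dcs = up_dcs | ({d} if d is not None else set())
            score = reachability(adj, nodes, dcs, dc_contents, weights, users)
            if best is None or score > best[0]:
                best = (score, n, d)
        score, n, d = best
        if n is not None:
            up_nodes.add(n)
            todo_nodes.discard(n)
        if d is not None:
            up_dcs.add(d)
            todo_dcs.discard(d)
        schedule.append((n, d, score))
        cumulative += score
    return schedule, cumulative


# Hypothetical topology: user node "u" reaches two DC sites through relays "a" and "b".
adj = {"u": ["a", "b"], "a": ["u", "dc1"], "b": ["u", "dc2"], "dc1": ["a"], "dc2": ["b"]}
dc_contents = {"dc1": {"video"}, "dc2": {"news"}}
weights = {"video": 5, "news": 1}
schedule, cumulative = greedy_joint_recovery(
    adj, dc_contents, weights, users=["u"],
    damaged_nodes=["a", "b"], damaged_dcs=["dc1", "dc2"],
)
print(schedule, cumulative)
```

In this toy case the greedy choice repairs relay `a` together with the DC at `dc1` first, because that pair alone restores the highest-weight content. A disjoint plan that repaired `a` and the DC at `dc2` in the same stage would restore nothing at that stage. A greedy per-stage choice is not guaranteed to maximize the cumulative objective in general.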
- Subjects :
- Optimization problem
Computer Networks and Communications
business.industry
Computer science
resource allocation
020206 networking & telecommunications
Cloud computing
02 engineering and technology
Maintenance engineering
Cloud network
Scheduling (computing)
progressive disaster recovery
Network element
Reachability
Scalability
0202 electrical engineering, electronic engineering, information engineering
Electrical and Electronic Engineering
business
Recovery approach
content reachability
Computer network
- Language :
- English
- Database :
- OpenAIRE
- Accession number :
- edsair.doi.dedup.....86082fc9bc6afdc62182c920914c361f