Differences

This shows you the differences between two versions of the page.

--- collaborative_development_projects:rescuer [2015/04/10 04:15]
Zhipeng (Howard) Huang [Project description:]
+++ collaborative_development_projects:rescuer [2015/04/10 07:18]
Zhipeng (Howard) Huang
@@ Line 7: / Line 7: @@
 ==== Project description: ====
-Disaster Recovery (DR) is a very important issue in NFV, as when a VIM instance or even a complete site goes down, we need a strong DR scheme to keep the service continuity to meet the requirement defined by the terms like RPO or RTO.
+Disaster Recovery (DR) is a very important issue in NFV. For example, during peak hours on holidays, the shutdown or malfunction of a VIM instance, or even of a complete site, may cause severe service interruption or complete service termination if there is no strong infrastructure-level disaster recovery support. This project is therefore proposed to develop use cases and requirements, as well as upstream project blueprints, focusing on how to make the infrastructure DR-capable so that service continuity is maintained and the requirements expressed in terms such as RPO (Recovery Point Objective) and RTO (Recovery Time Objective) are met when an extreme scenario strikes.
 === ETSI NFV Requirements ===
@@ Line 23: / Line 23: @@
   * 	After the disaster situation recedes, Network Operators should restore the impacted NFVI-PoP back to its original state as swiftly as possible, or deploy a new NFVI-PoP to replace the impacted NFVI-PoP based on the comprehensive assessment of the situation. All on-site Service Chains must be reconfigured by instantiating fresh VNFs at the original location. All redundant VNFs activated at the designated Disaster Recovery site to support the disaster condition must be de-linked from the on-site Service Chains by draining and re-directing traffic as needed to maintain service continuity. The redundant VNFs are then placed on standby mode per disaster recovery policy.
+=== DR In OpenStack ===
+Disaster Recovery (DR) for OpenStack is an umbrella topic that describes what needs to be done for applications and services (generally referred to as workload) running in an OpenStack cloud to survive a large scale disaster. Providing DR for a workload is a complex task involving infrastructure, software and an understanding of the workload. To enable recovery following a disaster, the administrator needs to execute a complex set of provisioning operations that will mimic the day-to-day setup in a different environment. Enabling DR for OpenStack hosted workloads requires enablement (APIs) in OpenStack components (e.g., Cinder) and tools which may be outside of OpenStack (e.g., scripts) to invoke, orchestrate and leverage the component specific APIs.
+{{:collaborative_development_projects:dr.png?300|}}
+Disaster Recovery should include support for:
+  * Capturing the metadata of the cloud management stack, relevant for the protected workloads/resources: either as point-in-time snapshots of the metadata, or as continuous replication of the metadata.
+  * Making available the VM images needed to run the hosted workload on the target cloud.
+  * Replication of the workload data using storage replication, application level replication, or backup/restore.
+We note that metadata changes are less frequent than application data changes, and that different mechanisms can handle replication of different portions of the metadata and data (volumes, images, etc.).
+The approach is built around:
+  * Identifying required enablement and missing features in OpenStack projects
+  * Creating enablement in specific OpenStack projects
+  * Creating orchestration scripts to demonstrate DR
+When resources to be protected are logically associated with a workload (or a set of inter-related workloads), both the replication and the recovery processes should be able to incorporate hooks to ensure consistency of the replicated data and metadata, as well as to enable customization (automated or manual) of the individual workload components at the recovery site. Heat can be used to represent such workloads, as well as to automate the above processes (when applicable).
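The hook mechanism described above can be sketched in Python. This is only a minimal illustration under stated assumptions: `ProtectedWorkload`, `replicate`, `recover` and the three hook lists are hypothetical names invented here, not part of OpenStack, Heat, or this project. The copy and restore steps are passed in as plain callables so that the sketch stays independent of any real replication mechanism.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# A hook is any callable that receives the workload name.
Hook = Callable[[str], None]


@dataclass
class ProtectedWorkload:
    """Hypothetical record of a workload and the hooks attached to it."""
    name: str
    pre_replicate: List[Hook] = field(default_factory=list)   # e.g. quiesce the application
    post_replicate: List[Hook] = field(default_factory=list)  # e.g. resume the application
    post_recover: List[Hook] = field(default_factory=list)    # e.g. customize at the recovery site


def replicate(workload: ProtectedWorkload,
              copy_data: Hook,
              copy_metadata: Hook) -> None:
    """Run the pre-hooks, copy data and metadata, then always run the post-hooks."""
    for hook in workload.pre_replicate:
        hook(workload.name)
    try:
        copy_data(workload.name)
        copy_metadata(workload.name)
    finally:
        # Post-hooks run even if a copy step fails, so the workload is not left quiesced.
        for hook in workload.post_replicate:
            hook(workload.name)


def recover(workload: ProtectedWorkload, restore: Hook) -> None:
    """Restore the workload, then apply recovery-site customization hooks."""
    restore(workload.name)
    for hook in workload.post_recover:
        hook(workload.name)
```

Running the pre-replication hooks before both copy steps is what lets the data and metadata copies describe the same point in time; the post-recovery hooks are where automated or manual customization of individual components would plug in.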
 ==== Scope: ====
   * ''Describe the problem being solved by project''
-The project aims to develop the requirements for NFVI and VIM on supporting Telco grade DR implementation which covers :
+The project aims to develop the requirements and use cases for NFVI and VIM to support Telco-grade DR implementation:
-  * Requirements for VIM and NFVI to support Single Site DR, including both active-active and active-standby design
-  * Requirements for VIM and NFVI to support Multisite DR, including both active-active and active-standby design
+  * Requirements for VIM and NFVI to support Multisite DR, including:
+    a. Active-active, active-hot standby, and active-cold standby designs
+    b. Replication of all configuration and metadata required by an application - Neutron, Cinder, Nova, etc.
+    c. Ability to ensure consistency of the replicated data & metadata
+    d. Support for a wide range of data replication methods: storage-system-based replication, hypervisor-assisted replication (possibly between heterogeneous storage systems, for example using DRBD or QEMU-based replication), backup and restore methods, and pluggable application-level replication methods
+  * DR use cases from which further requirements can be derived.
+  * Formulate blueprints (BPs) that reflect the requirements, and implement those BPs in the upstream community.
   * ..
-The requirements would cover DR for compute, storage and network.
   * ''Specify any interface/API specification proposed''
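As an illustration of requirement d above (pluggable replication methods), here is a minimal Python sketch of a plugin registry. It is not an interface proposed by this project: `Replicator`, `register`, `get_replicator` and the method names `"storage"` and `"backup-restore"` are all hypothetical, and the plugin bodies are placeholders that only return a description instead of driving a real backend.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Type


class Replicator(ABC):
    """Hypothetical plugin interface for one data replication method."""

    @abstractmethod
    def replicate(self, resource_id: str, target_site: str) -> str:
        """Copy one resource to the target site and return a status string."""


# Maps a method name to the plugin class that implements it.
_REGISTRY: Dict[str, Type[Replicator]] = {}


def register(method: str) -> Callable[[Type[Replicator]], Type[Replicator]]:
    """Class decorator that records a plugin under the given method name."""
    def decorator(cls: Type[Replicator]) -> Type[Replicator]:
        _REGISTRY[method] = cls
        return cls
    return decorator


@register("storage")
class StorageReplicator(Replicator):
    def replicate(self, resource_id: str, target_site: str) -> str:
        # Placeholder: a real plugin would drive the storage system here.
        return f"storage replication of {resource_id} to {target_site}"


@register("backup-restore")
class BackupRestoreReplicator(Replicator):
    def replicate(self, resource_id: str, target_site: str) -> str:
        # Placeholder: a real plugin would back up and restore the resource here.
        return f"backup/restore of {resource_id} to {target_site}"


def get_replicator(method: str) -> Replicator:
    """Instantiate the plugin registered for a method, or fail with a clear error."""
    try:
        return _REGISTRY[method]()
    except KeyError:
        raise ValueError(f"no replication plugin registered for {method!r}") from None
```

With such a registry, a per-resource policy could select `get_replicator("storage")` for volumes and `get_replicator("backup-restore")` for images, and a new method could be added without changing the calling code.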
@@ Line 53: / Line 80: @@
 This project is extendable for future functions.
 ==== Testability: ''(optional, Project Categories: Integration & Testing)'' ====
@@ Line 70: / Line 96: @@
   * ''Identify similar projects underway or being proposed in OPNFV or upstream projects''
-OPNFV Multisite.
+OPNFV: Multisite, HA For VNF, Doctor.
   * ''Identify any open source upstream projects and release timeline.''
