requirements_projects:failure_prediction

requirements_projects:failure_prediction [2015/01/27 06:49]
Hai Liu removed
— (current)
Line 1: Line 1:
-==== OPNFV failure prediction: ==== 
- 
-  * Proposed name for the project: ''​data collection of failure prediction''​ 
-  * Proposed name for the repository: ''​opnfv-fp-dc''​ 
-  * Project Categories: ''​(Requirements)''​ 
- 
-==== Project description:​ ==== 
- 
-Failure prediction is an important step in a failure prevention system, which can be deployed to help an NFV system avoid operational failures in a real-time environment.
- 
-The core technology of failure prediction is big data; it requires as many different kinds of data as possible. The data come from log files, real-time hardware and software parameters, environment parameters, etc.
- 
-For failure prediction, the first step should focus on data collection. The failure predictor can get data from several software components: OpenStack Ceilometer, the OSS, the VNFM and others. For the first OPNFV release, we need to identify which kinds of data are needed and analyze the gaps in Ceilometer.
- 
-Failure prediction has been studied in the ETSI NFV ISG, which has developed some general requirements that should be the initial input for this topic in OPNFV.
- 
-For example, in the draft ETSI GS NFV-REL 001 (v1.0.0, 2014-11), several requirements for failure prediction have been captured, as listed below:
-  * The real-time resource usage such as the disk usage, CPU Load, memory usage, network IO and virtual IO usage and their loss rate, and available vCPUs, virtual memory, etc. shall be provided to VIM at configurable intervals by entities of infrastructure. It should also be possible to configure the thresholds or watermarks for the event notification instead of configuring the reporting interval. ​ 
-  * Each entity of infrastructure shall provide the open interfaces for communicating the performance and the consumption resource and for allowing polling of its working state and resource usage. 
-  * The failure prediction framework should include the functionality of the false alarm filtering to avoid triggering unnecessary prevention procedure for anomaly alarming. ​ 
-  * The failure prediction framework should include trend identification,​ period or seasonal variations, and randomness analysis of the collected data on resource usage (e.g., memory, file descriptors,​ sockets, database connections) to predict the progression of the operated NFV system to an unhealthy state, e.g., resource exhausting. 
-  * The failure prediction framework should be able to diagnose or verify which entity is suspected to be progressing towards a failure and which VNFs might be affected due to the predicted anomaly. 
-  * The entities of VNF and its supported infrastructure should have their own self-diagnostic functionality in order to provide their health information to the VIM. 
-  * The log report associated with a NFVI resource failure or an error detected by a hardware component, software module, hypervisor, VM, or the network should include the error severity. 
-  * The log report should include an indication of the failure cause. 
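Two of the requirements above, watermark-based event notification and trend identification on resource usage, can be illustrated with a minimal Python sketch. It is not taken from the ETSI document or from Ceilometer: the names `Watermark`, `watermark_events`, `linear_trend` and `intervals_until` are hypothetical, the samples are assumed to arrive at a fixed interval, and a plain least-squares line stands in for whatever trend model a real predictor would use.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Watermark:
    """Hypothetical high/low watermark pair for one usage metric."""
    high: float
    low: float


def watermark_events(samples: List[float], mark: Watermark) -> List[Tuple[int, str]]:
    """Return (sample index, event) pairs.

    An alarm is raised when usage reaches `high` and is cleared only once
    usage falls back to `low`, so small oscillations do not flood the VIM.
    """
    events = []
    alarmed = False
    for i, value in enumerate(samples):
        if not alarmed and value >= mark.high:
            alarmed = True
            events.append((i, "raise"))
        elif alarmed and value <= mark.low:
            alarmed = False
            events.append((i, "clear"))
    return events


def linear_trend(samples: List[float]) -> float:
    """Least-squares slope of usage per sampling interval."""
    n = len(samples)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def intervals_until(samples: List[float], limit: float) -> Optional[float]:
    """Estimate how many intervals remain before usage reaches `limit`.

    Returns None when usage is flat or falling (no exhaustion predicted).
    """
    slope = linear_trend(samples)
    if slope <= 0:
        return None
    return max(0.0, (limit - samples[-1]) / slope)
```

For instance, `intervals_until(memory_samples, 100.0)` gives a rough lead time that a predictor could compare against the time needed to trigger a prevention procedure. Seasonal variation and randomness analysis, also named in the requirements, are outside this sketch.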
- 
-Fault management will also be one of the important issues in ETSI NFV Phase 2, so these fault prediction requirements may be updated or enriched during this OPNFV project. If that happens, the updated parts will be captured in this project as well.
- 
-Some upstream projects already work in this area, e.g.:
-  * the Ceilometer project in OpenStack, for system resource monitoring;
-  * the new project proposal “Time Series Data Repository (TSDR)” in OpenDaylight, for ODL system monitoring.
- 
-However, they do not cover all the specific requirements of the OPNFV environment. Our first task is therefore to investigate the gaps between those upstream projects, other OpenStack components, and the OPNFV requirements. After that, we plan to deliver documents on the VIM northbound API, the implementation architecture, and the implementation plan. Finally, we need to realize the failure prediction framework.
- 
-==== Architecture:​ ==== 
-  
-The failure prediction system comprises a training system, a predictor, and a failure management module, as described in the following figure.
- 
- 
-The training system generates rules and feeds them to the predictor. The online data sources for the predictor are Ceilometer, the VNFM, the OSS and others. The predictor predicts failures and reports them to the Doctor module, which should handle them.
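The data flow in this paragraph (training system, then predictor, then Doctor module) can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's design: `Rule`, `Prediction`, `train` and `Predictor` are hypothetical names, a rule is reduced to a single per-metric threshold, and the Doctor module is represented by any callable that accepts a prediction.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: predict a failure when `metric` reaches `threshold`."""
    metric: str
    threshold: float


@dataclass(frozen=True)
class Prediction:
    """Hypothetical message passed from the predictor to the failure handler."""
    entity: str
    metric: str
    value: float


def train(pre_failure_values: Dict[str, List[float]]) -> List[Rule]:
    """Toy training system.

    `pre_failure_values` maps a metric name to the values observed shortly
    before past failures; the lowest such value becomes the threshold.
    """
    return [
        Rule(metric, min(values))
        for metric, values in sorted(pre_failure_values.items())
        if values
    ]


class Predictor:
    """Applies trained rules to online samples and forwards predictions."""

    def __init__(self, rules: List[Rule], on_prediction: Callable[[Prediction], None]):
        self._thresholds = {rule.metric: rule.threshold for rule in rules}
        # Stand-in for the interface to the failure management (Doctor) module.
        self._on_prediction = on_prediction

    def feed(self, entity: str, metric: str, value: float) -> bool:
        """Process one sample (e.g. polled from a monitoring source).

        Returns True if a failure was predicted and forwarded.
        """
        threshold = self._thresholds.get(metric)
        if threshold is not None and value >= threshold:
            self._on_prediction(Prediction(entity, metric, value))
            return True
        return False
```

Passing the handler as a callable keeps the predictor independent of how failures are handled, which mirrors the separation between predictor and failure management module described above.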
- 
-In OPNFV Release 1, this project limits its scope to the interface between Ceilometer and the predictor.
- 
-==== Scope: ==== 
- 
-__Describe the problem being solved by project:__ 
-As a requirements-category project, it plans to address the following problem:
-  * The current OpenStack release, Juno, cannot fully realize the data collection needed for failure prediction. This project addresses the problem by analyzing the gaps between the OPNFV failure prediction requirements and the ETSI NFV REL GS, the OpenStack Ceilometer project, and the OpenDaylight Time Series Data Repository project.
-  
-__Specify any interface/​API specification proposed:__ 
-Additional interface specifications:​ 
-  * VI-Ha 
-  * Vf-Vi 
-  * Vi-Vnfm 
-  * Or-Vi 
-  * Other interfaces that may be identified during the project
- 
-__Specify testing and integration:​__ 
-  * Debugging and Tracing ​ 
-  * Unit/​Integration Test plans 
-  * Client tools for displaying status, etc.
- 
-__Identify a list of features and functionality that will be developed:__
-  * Additional features of Ceilometer and TSDR to support OPNFV failure prediction. ​ 
- 
-__Identify what is in or out of scope, so that discussion during the development phase is reduced:__
-In scope:
-  * The ETSI NFV REL GS version 1.0.0 (2014-11) as one of the inputs. Conclusions on failure prediction reached in ETSI NFV REL phase 2.0 before the project’s deadline will also be considered.
-  * Ongoing discussions and conclusions of the upstream projects (i.e. Ceilometer, TSDR) reached before this project’s deadline shall be considered part of the input.
-  * VIM northbound interfaces 
-  * The user stories of failure prediction 
-  * The monitoring functionality of hypervisors (e.g. KVM, Xen)
-Out of scope:
-  * Discussions on failure prediction that take place after this project’s deadline, e.g. in ETSI NFV REL phase 2.0 or the related upstream projects, will not be captured.
-  * General software design framework for this project.
-  * An engine for collecting real-time information, analyzing data, and predicting failures.
- 
-__Describe how the project is extensible in future:__ 
-The results of this project will be used as input for the next stages, e.g. Integration & Testing and Collaborative Development.
- 
-==== Testability: ''(optional, Project Categories: Integration & Testing)'' ====
-Specify testing and integration such as interoperability, scalability, and high availability
-  * N/A 
- 
-==== Documentation: ''(optional, Project Categories: Documentation)'' ====
-  * N/A  
- 
-==== Dependencies:​ ==== 
-Identify similar projects that are underway or being proposed in OPNFV or in upstream projects.
-  * The “Doctor (Fault Management and Maintenance)” project.
-  * The “High availability for OPNFV” project.
- 
-Identify any open source upstream projects and release timeline. ​ 
-  * The Ceilometer project (https://wiki.openstack.org/wiki/Ceilometer) is an upstream project of this project. It will be aligned with the OpenStack release schedule and the OPNFV Release 1 schedule.
-  * The TSDR project (https://wiki.opendaylight.org/view/Project_Proposals:Time_Series_Data_Repository) is an upstream project of this project. It will be aligned with the OpenDaylight release schedule and the OPNFV Release 1 schedule.
-  * OpenStack Juno Release 
-  * OpenDaylight Helium Release 
- 
-Identify any specific development that must be staged with respect to the upstream projects and releases.
-  * none 
- 
-Are there any external fora or standards development organization dependencies? If possible, list informative and normative reference specifications.
-  * ETSI NFV draft REL GS (v1.0.0, 2014-11) 
-  * ETSI GS NFV REL004 
- 
-==== Committers and Contributors:​ ==== 
-Names and affiliations of the committers: ​ 
-  * Hai Liu,  hai.liu@huawei.com 
-  * Yijun Yu, yuyijun@huawei.com ​ 
-  * Jun Li, matthew.lijun@huawei.com ​ 
-  * Yifei Xue, xueyifei@huawei.com ​ 
-  * Linghui Zeng, linghui.zeng@huawei.com 
-  * Lanchao Zheng, zhenglanchao@huawei.com 
-  * Qiao Fu, fuqiao@chinamobile.com 
-Any other contributors:​ TBD 
- 
-==== Planned deliverables ==== 
-Describe the project release package as OPNFV or open source upstream projects.
-  * Gap analysis of data collection for OPNFV failure prediction (e.g. against the ETSI NFV draft GS, Ceilometer, TSDR);
-  * Fault Prediction API specification for the interfaces, e.g. VI-Ha, Vf-Vi, Vi-Vnfm, Or-Vi, and other interfaces potentially impacted during the project.
- 
-If project deliverables have multiple dependencies across other project categories, describe the linkage of the deliverables.
-  * N/A 
- 
-==== Proposed Release Schedule: ==== 
-When is the first release planned? 
-  * March, 2015. 
- 
-Will this align with the current release cadence?
-  * Yes 
  