1. Predicting Transient Downtime in Virtual Server Systems: An Efficient Sample Path Randomization Approach.
- Author
- Du, Anna Ye; Smith, Sanjukta Das; Yang, Zhouhan; Qiao, Chunming; and Ramesh, Ram
- Subjects
- *CLIENT/SERVER computing, *PREDICTION models, *CLOUD computing, *SAMPLE path analysis, *RANDOMIZATION (Statistics), *MARKOV processes
- Abstract
- A central challenge in developing cloud datacenters Service Level Agreements is the estimation of downtime distribution of a set of provisioned servers over a service window, which is compounded by three facts. First, while steady-state probabilities have been derived for birth-death processes involving server failures and repairs, they could be highly inaccurate under transience. Furthermore, steady-state cannot be assured under typical service windows. Therefore, estimation of transient distributions is essential. Second, the processes of failures and repairs may follow any distribution and hence need to be extracted using system log data and modeled using appropriate general distributions. Third, downtime distributions over service windows depend on the number of servers and their deployment structure for a contract. We develop an efficient and generalized sample path randomization approach to precisely estimate transient probabilities under three different checkpointing strategies and three flexible failure distribution models. The estimators are unbiased, consistent, efficient and sufficient. Their asymptotic convergence is established. The estimation algorithms are computationally efficient in solving practical problems and yield rich information on transient system behaviors. The methodology is general and extensible to various server failure and repair processes characterized using birth-death modeling. [ABSTRACT FROM PUBLISHER]
- Published
- 2015