Introduction
Cloud computing is a revolutionary technology that reduces the cost of computing. Although it has gained widespread popularity in industry, load balancing, resource management, security, workflow, scheduling and fault tolerance (FT) remain significant challenges. Among these, virtual machine (VM) management is the most considerable challenge, because faults can arise in the dynamic cloud environment and lead to unreliable outcomes. Several types of fault may occur in cloud-based computing resources, including hardware, software, timing, value, permanent, transient, network, processor, interaction, process and omission faults (Salfner et al., 2010). The categorization of faults is shown in Figure 1. These faults can result in different types of failure in the cloud infrastructure, such as network, hardware, software, database, overflow and time-out failures (Agarwal & Sharma, 2016; Mariani, 2003), as shown in Figure 2. Failures can also occur due to network congestion, server overload, malicious attacks, human factors and various unknown errors (Oliner & Stearley, 2007; Oppenheimer & Patterson, 2002). Various causes of failure have been reported in a series of publications (Schroeder & Gibson, 2007, 2010).