Assessing the Impact of Virtual Standby Systems in Failure Propagation for Complex Wastewater Treatment Processes

This article proposes an original probabilistic modelling methodology named Virtual Standby (VSB), which enables practical simulation, analysis, and evaluation of the impact that potential buffering policies have on the availability and reliability of complex production systems. VSB corresponds to a design and operational characteristic whereby some machines, under a failure scenario, are capable of providing continuity to the downstream subsystems for a limited time before those subsystems suffer a delay, an effect that is currently not considered when assessing availability. This feature plays a relevant role in the propagation of the effect of a failure; indeed, it could prevent the propagation by guaranteeing the isolation time needed to recover from the failure, controlling and reducing the production losses downstream. A case study of the preliminary treatment process of a wastewater treatment facility (WWTF) is developed bearing in mind the systemic behaviour in the event of a failure and the specific features of each piece of equipment. VSB is a significant advantage for the representation of these complex processes because, among other things, it considers the impact of buffering policies on the perceived availability of the system. This model allows different production levels to be determined, with a better and easier fitting of the reliability, availability, and production forecast of the process. Finally, the comparison between the VSB simulation results and traditional procedures that do not consider operational continuity under a failure scenario confirms the strength and precision of the proposal for complex systems.


Introduction
The performance of a system is the result of the synergic work of individual machines and sets of machines, each adding to the overall performance. Each machine or set of machines is bounded by its own inherent constraints. Some of these are maintainability and maintenance requirements, reliability, nominal capacity, maintenance plan, operational limitations, layout of the system, and degree of complexity.
The combination of all these aspects may create production bottlenecks [1,2] and delays; hence, they must be corrected in a manner that is effective and accurate [3,4]. Therefore, a combined analysis of reliability and productivity must be performed to allow optimal use of resources and achieve the required production goals [5,6]. The traditional reliability analysis of complex systems is usually based on a logical and probabilistic modelling approach, which contributes to improving the key performance indicators (KPIs) of production systems [7,8]. Nowadays, many alternatives for the reliability analysis of complex systems can be found in the literature [9,10]. Systematic studies are usually developed using techniques and methodologies such as Reliability Block Diagrams (RBDs) [11,12], Fault Trees (FTs) [13], Reliability Graphics (RGs) [14], Petri Nets (PNs) [15], and Monte Carlo simulation [16][17][18], among others. More recently, other techniques have emerged, such as Multistate Systems [19], Graph Topology [20], and fuzzy approaches [3], which have revealed underlying connections arising from the process dynamics. Another approach is to implement specially designed algorithms to assess availability and reliability, such as computing the Equivalent Availability (EA) index, which exploits load sharing between pieces of equipment working below their nominal capacity, allowing different combinations of equipment to be used to achieve the availability goal [21]. In different scenarios, these techniques must be adapted or extended to account for the particularities of the system, especially for large, complex, and dynamic systems. Such is the case for the classic RBD, which must be adapted in order to measure the effect of work-in-process (WIP) or inventory buffering on the performance and availability of the system [4] (other techniques exist, for example, to adapt these types of analysis to demand fluctuations [22]).
This is where the methodology developed in this paper fits.
Buffering policies allow machines, under any failure scenario, to provide continuity for a limited time to the production subsystem downstream [23,24]. Protection against the effect and propagation of planned or unplanned stoppages and delays could be totally or partially guaranteed, controlling and reducing the production losses depending on the time needed to recover proper operating conditions (time to repair) and the required capacity to avoid material starvation.
The primary concern of the methodology proposed in this paper (VSB) is to ease the process of building probabilistic models to simulate and analyse real production scenarios (a wastewater treatment process in this case) involving different buffering policy opportunities [25,26]. An initial approach to this method has already been developed in a case study of a mining process, which proved the potential for further research [27].
VSB is used within existing Monte Carlo simulation models, which will be implemented in an environment especially designed for the case study. This environment can estimate a set of expected performance indicators for a complex system and its equipment, together with their statistical variability.
The alternatives to the VSB methodology for modelling the reliability of a complex system that currently exist in the literature are as follows:
(i) Traditional RBD Methodology [11,12]. This is a very useful and well-known method; nevertheless, it cannot include the differential time effect because its elements have only two states, and thus failure propagation is immediate.
(ii) Markov Chain [5]. In this case, it is only possible to model using a constant or discrete-time evolution failure rate, which restricts the assessment of the operational reality and complicates production and availability analyses. In general, this procedure does not reach enough detail in the results.
(iii) Traditional RBD Methodology Using the Universal Generating Function for Data Analysis [19]. This methodology combines classical RBD with a more accurate data analysis, which translates into better data fitting for failure rate and density functions because it considers the differences arising from the operation of multifunctional systems.
(iv) Finally, the operational continuity could be evaluated through the analysis or simulation of a buffer configuration [4]; however, given the characteristics of this methodology, it would be necessary to incorporate and evaluate new variables not currently contained in the problem under study, such as isolation time, upstream and downstream capacity, availability, nominal throughput, and physical buffer capacity. Moreover, the model becomes more complex if the operational continuity is provided by more than one element, which implies generating n buffers for each case and incorporating the buffer model variables [4], at the cost of inefficient resource utilization and a possible loss of study focus.
This research claims that developing VSB as a very specific methodology to model these buffering situations in production systems, along with the use of Monte Carlo simulation, provides an excellent and very practical tool to measure and assess the impact of buffering options on both the reliability and availability of complex production systems. These tools may help the analyst to focus on the study of specific modelling variables and therefore help solve problems in an effective and efficient way. Table 1 shows a comparative analysis of the abovementioned methodologies with regard to their capacity to model operational continuity after a failure event or delay.
This table shows the differentiating strengths of VSB over the rest of the methodologies. It is necessary to emphasize the capacity of VSB to obtain valid results using relatively little information and with a moderate analysis effort.
Wastewater treatment is one of the several contexts in which the limitations of industrial processes play a critical role, because of the high impact of failure consequences, not just for the process but for human health and the environment also.
Water is essential for life on Earth and is one of the most important resources, if not the most important, for any human settlement in the world. In a 2010 press release in which the UN coined the term "sick water" [28], it addressed the need to transform wastewater from a real hazard to health and the environment into a quality, useful resource. This is a must for the 21st century, in which the water crisis is a fact, as it is for Africa, where it is forecasted that around 3 billion people will live in areas with water scarcity. In this context, the release states that "improved sanitation and wastewater management are central to poverty reduction and improved human health" [28].
Since the sick water crisis is clearly a critical obstacle to guaranteeing people's access to clean water, the aim of this paper is to improve the assessment of availability and reliability in wastewater treatment processes through a novel method for modelling complex production lines using Virtual Standby.

Objective
The main goal of this research is to propose a novel modelling procedure that accounts for failure propagation in industrial processes where WIP buffering is possible, using probabilistic simulations of Virtual Standby backups for units that perform specific tasks, in order to minimize workflow interruptions.
In line with this goal, the article is organized as follows: first, the problem statement and the application of the proposed methodology are presented in detail. Next, the analysis process is developed and summarized following the proposed methodology, and the data from the reliability and maintainability analyses are assessed. Finally, a case study is developed, modelled, and solved, concluding with some important remarks.

Problem Statement and VSB Proposal Methodology.
As expressed in the Introduction section, in a manufacturing industrial process under specific conditions, the failure of one or more elements might not stop the system immediately; this capability depends on the system's ability to sustain production during a limited time interval after the failure, as may be the case for downstream work in process, for example. This effect could be considered as a buffer [4], but the main variables of each situation are very different. In buffer modelling, the throughput capacity is a key variable to calculate what the starvation level should be for the proper isolation time. The buffer is a physical asset with a specific capacity and, of course, a required investment and maintenance cost, and as such it should be considered when assessing availability. This is where VSB becomes relevant, because it will potentially improve the overall availability of a process, reflecting the importance of buffering policies when analysing availability and reliability. In the VSB model, capacity is explained by two factors: a random variable (after-failure capacity) and its relationship with the repair time (repair function). There is no relation with bottlenecks (upstream or downstream) or the starvation level. The main principles of the VSB methodology are as follows:
(i) To model and represent the VSB scenario, a "virtual" backup must be created, bounded by specific parameters for modelling failure and repair times, which starts working at the time of failure of the primary equipment. Both the primary and the "virtual" backup equipment are necessary to model the VSB scenario.
(ii) The VSB scenario must be applied only to machines where the operational continuity effect explained above exists. It is a very specific condition, so a deep process analysis is necessary to validate the inclusion of the VSB scenario.
(iii) As a preliminary criterion when modelling, the operating time of the "virtual" backup equipment i ($OT^{B}_{i,j}$) should start at time $t = TTF_{i}$, along with intervention j. The subsequent time to repair of the virtual backup i ($TTR^{B}_{i,j}$), which is also the effective time to repair perceived by the system, must be equal to the time to repair of the primary equipment at intervention j ($TTR_{i,j}$) less the operating time of the "virtual" backup equipment i ($OT^{B}_{i,j}$). The rules for the algorithm are expressed in the following equations:

$$OT^{B}_{i,j} \sim f^{vsb}_{i,j}(t),$$

$$TTR^{B}_{i,j} = TTR_{i,j} - OT^{B}_{i,j},$$

where $OT^{B}_{i,j}$ is the operational backup time of equipment i during intervention j; $f^{vsb}_{i,j}(t)$ is the distribution function of the autonomy time of equipment i at intervention j; $TTR_{i,j}$ is the time to repair of equipment i at intervention j; and $TTR^{B}_{i,j}$ is the time to repair of the virtual backup equipment i at intervention j.
It is a conservative scenario because this condition ensures that, after any intervention on the primary system, both assets are restored at the same time in perfect condition (perfect renewal). This criterion is explained graphically and numerically next. Figure 1 represents both cases, with and without the VSB scenario.
The "Not VSB scenario" shows that any intervention on any single piece of equipment i directly affects the operational time. In the second case, VSB can be modelled as a standby system that includes the "virtual" backup equipment i. The timeline for each piece of equipment and for the system is depicted in Figure 2; it is possible to observe the effect of VSB, which raises the real operating time ($OT_{i,j}$) to the effective operating time of the equipment ($OT^{E}_{i,j}$) and reduces the real time to repair ($TTR_{i,j}$) to the effective time to repair ($TTR^{E}_{i,j}$). Each operating time increase for the system (Equipment i + Backup i) is equal to the operating time defined for the backup equipment ($OT^{B}_{i,j}$). This logic also applies to the time to repair of each piece of equipment, which is equal to the real time to repair of the primary equipment less the operating time of the "virtual" backup equipment. In terms of formulation, it is expressed through the following equations:

$$OT^{E}_{i,j} = OT_{i,j} + OT^{B}_{i,j},$$

$$TTR^{E}_{i,j} = TTR_{i,j} - OT^{B}_{i,j},$$

where the sum of each effective operating time of equipment i at the time of intervention j ($OT^{E}_{i,j}$) defines the effective operating time of the system ($OT^{E}$), i.e.,

$$OT^{E} = \sum_{i}\sum_{j} OT^{E}_{i,j}.$$

Likewise, the effective time to repair of the system ($TTR^{E}$) is defined as follows:

$$TTR^{E} = \sum_{i}\sum_{j} TTR^{E}_{i,j}.$$

Thus, to introduce the VSB impact on production performance evaluation, the simulation model must account for two scenarios: first, a scenario that measures the immediate effect of a failure or stoppage, and second, a scenario in which a VSB is incorporated. This approach allows for a more accurate analysis.
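The effective-time relations described above can be illustrated with a single intervention. The figures used here (40 h of real operating time, 2 h of real repair time, and 0.5 h of backup operating time) are hypothetical values chosen only for the arithmetic and are not taken from the case study data.

```latex
\begin{aligned}
OT^{E}_{i,j}  &= OT_{i,j} + OT^{B}_{i,j}  = 40 + 0.5 = 40.5\ \text{h},\\
TTR^{E}_{i,j} &= TTR_{i,j} - OT^{B}_{i,j} = 2 - 0.5 = 1.5\ \text{h},\\
OT^{E}_{i,j} + TTR^{E}_{i,j} &= OT_{i,j} + TTR_{i,j} = 42\ \text{h}.
\end{aligned}
```

The last line shows that, under this reading, the virtual backup does not change the length of the cycle; it only moves 0.5 h from the repair side to the operating side as perceived by the system.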
As indicated at the beginning of this article, the motivation for this study is to develop an integral, flexible, and probabilistic methodology to model the behaviour and impact of buffering policies in complex systems; the following analysis studies historical statistical data regarding time to repair (TTR) and operating time (OT), and their relation with reliability and delays due to maintainability. Figure 3 describes the main stages of this proposal. Later on, the methodology is explained step by step through a case study to ease understanding.
As shown in Figure 3, the VSB methodology is a framework that involves modelling the whole system from the beginning, recognizing the effects of failure on the whole process and the existence of VSB-type buffer conditions. Fault Tree diagrams can be built to understand the operating logic of the system. The next step is the parameterization of the operation and maintenance data of the equipment involved, in order to perform the simulations using graphical models that follow the VSB logic (considering a virtual machine in standby). Finally, the simulation results are interpreted in terms of reliability and maintainability indicators.

Case Study
As mentioned, wastewater treatment, or sick water treatment, is a critical problem to be addressed by every human settlement; therefore, it is important to find new and better ways to optimize this process. Because the process inherently accumulates WIP along the workflow, WIP buffering is available at several stages of the wastewater treatment process, which is often not considered when assessing operational continuity; for this reason, using VSB will potentially improve the availability and reliability analysis.
Most WWTF workflows consist of two stages, a primary and a secondary stage, and there are many different settings for these two stages. For the purpose of this paper, a primary stage is considered in which wastewater collected from the city through the sewage system flows into the facility and is immediately screened, usually using metal screens, to remove large elements that the wastewater may contain; it then flows through a grit chamber to remove medium-size elements, and finally it goes into a primary settling tank for clarification, where suspended solids are collected through settling; this collected material is called "primary sludge." Secondary treatment starts with aeration using blowers connected to aeration basins; the wastewater then flows into a secondary clarifier where the sludge is collected again, this time called "activated sludge" because of the previous aeration process; finally, before the treated water is released to the environment, it undergoes a decontamination process using UV light in modern processes or chlorine in older ones.
As with most industrial processes, failure is a constant threat that can arise at random, and the wastewater treatment process is no exception. On the contrary, since this process works with residues of human activity, its raw materials vary widely, meaning that the operator cannot control which residues arrive at the plant. In this context, all systems of this process are exposed to different and unpredictable types of material that damage the equipment, producing failures along the process and deeply affecting reliability levels. More specifically, when equipment fails because of the aforementioned hazardous materials, the process downstream will normally continue for a measurable period of time. This time frame is not considered in the classical analysis and therefore is not included when assessing availability or reliability in most (if not all) cases. This paper presents and analyses a case study developed in the preliminary treatment stage (Figure 4) of a wastewater treatment facility (WWTF). The main goal of this stage is to protect the facility from clogs, jams, or materials that may cause excessive wear of the machinery [29]. These are the first stages of most, if not all, wastewater treatment processes, and their importance lies in the capability to remove undesired objects from the raw wastewater that, apart from being dangerous for the machinery, take up valuable space in the process.
A brief description of the process is as follows. An average of 8 MGD of wastewater flows to the plant; this influent first undergoes a fine screening process using metal bar screens, after which the wastewater is stored in two 2,400 ft³ tanks; grit and scum are then collected using a 2-grit teacup system of 8 MGD capacity (each). The wastewater is then collected in a 3,955 ft³ tank, from which it is pumped through a 150 hp, 10 MGD 4-pump system for preliminary treatment, which occurs in two 50 ft × 50 ft clarifiers.

Complexity
The most important features of the preliminary treatment process shown in Figure 4 are listed in Table 2.

Modelling the System
The relationship between the operation of subsystems under the same process (functional dependency) arises when asking "what if . . .?" This translates into a need to track any effect produced by a planned or random state change of a subsystem or piece of equipment embedded in the system. Then, the effect on the functioning and workload capacity of the system and its components must be studied and analysed. Usually, and for the purpose of this proposal, two possible states are considered: degradation (normal established functioning) and nondegradation (failure state, preventive intervention, or operational detention) [30].
For the case study, four machines from a subprocess of the WWTF are arranged in series. Therefore, if one of the pieces of equipment fails, the whole system fails. Accordingly, it was identified that considering a VSB process for the bar screens or the grit chamber when they fail should be most beneficial for the expected results. In the case of failure of one or both of these pieces of equipment, the process downstream will continue to work properly for a limited time. This feature is comparable with machines that have the capacity to accumulate WIP during regular operation. This capacity is estimated to supply around 30 minutes of downstream operation. Based on the historical data analysis of this supplying capability, the simulation model will consider a discrete uniform distribution ($f^{vsb}_{i,j}(t)$) between 26 and 30 minutes.
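As a minimal sketch of how the autonomy time described above could be sampled, the snippet below draws whole-minute values from a discrete uniform distribution on 26 to 30 minutes. The helper name `sample_autonomy_minutes` and the use of Python's standard `random` module are illustrative assumptions; they are not part of the simulation environment used in the case study.

```python
import random

# Bounds of the discrete uniform autonomy distribution, in minutes (from the case study).
AUTONOMY_MIN = 26
AUTONOMY_MAX = 30


def sample_autonomy_minutes(rng: random.Random) -> int:
    """Draw one autonomy time (hypothetical helper): a whole number of minutes in [26, 30]."""
    # randint includes both endpoints, which matches a discrete uniform on 26..30.
    return rng.randint(AUTONOMY_MIN, AUTONOMY_MAX)


rng = random.Random(42)  # fixed seed only to make the sketch reproducible
samples = [sample_autonomy_minutes(rng) for _ in range(10_000)]
mean_autonomy = sum(samples) / len(samples)  # expected to be close to 28 minutes
```

With five equally likely values, the theoretical mean is 28 minutes, slightly below the "around 30 minutes" estimate, which keeps the sampled autonomy on the conservative side.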
As an approximation, the VSB scenario is equivalent to adding a standby system [31]; standby is a redundancy method that involves having one system as a backup for another identical primary system. The standby system is required only upon failure of the primary system. This configuration is constrained by random variables, perfect repair, and instantaneous and perfect switching, bearing in mind that the lifetime of the backup is equal to the time defined for the VSB.
Continuing with the wastewater treatment process, the following Fault Tree diagrams were developed (Figures 5 and 6) to support the understanding and representation of the VSB logic.
To reduce the amount of analysis without sacrificing the quality of the outcome, the simulation will not account for operational or planned stoppages, as graphically represented by the FT diagrams (Figures 5 and 6).

Data Parameterization.
Usually, when describing failure behaviour or repair processes, it is necessary to define a probability distribution to model said features. Hence, several statistical distributions have been assessed, and their parameters have been estimated using an environment specially designed for the analysis. Table 3 shows the most important parameters and KPIs regarding reliability and maintainability.

Simulation Methodology Application.
As mentioned, before modelling the system and performing a simulation, the model has to consider all specific features of the system regarding operational conditions and all constraints that may exist due to the real physical relation between components. These selected features are listed in Table 2. Details of the constraints regarding logical and functional dependency can be found in the section Modelling the System. The simulation must include an average production rate, which is the same for all equipment given the serial relationship presented. Each piece of equipment must be able to produce at the rate required by the process, either totally or partially, as demanded by the system.

For the case study, the production rate considered is 8 MGD, and it is assumed that the influent is equivalent to the daily output demanded by the process, that is, the daily rate of effluent. This means that in a classical analysis, the whole system will stop for lack of influent or for capacity problems when critical equipment fails upstream. The graphical models developed (the basis for the simulation) are presented and analysed next.

Considerations for the Simulation Model.
Processing systems depend in part on the established operating logic. In general, continuous simulators, or discrete simulators that include continuous control and monitoring variables, estimate indicators and identify states by monitoring at certain time intervals. In most cases, said procedure is slightly more efficient than methods that focus on the state changes of the components in the system, where monitoring and consultation are performed whenever something in the system changes state, whether through a random or a planned condition. For this proposal, a continuous evaluation of the state of each element of the system is not needed, since the interest lies in analysing the impact on operational time, availability, and reliability by comparing the behaviour of the system with and without VSB. Hence, for simulation purposes, the statistical environment designed in this paper is based on discrete-time event occurrence data, allowing the impact of functional dependencies to be visible.
It is possible to establish the principal components needed to develop a modelling task, such as the tree of components, which represents the hierarchical structure of the system, and the flow chart.

Implementing VSB Simulation Methodology and Analysis.
As described, this proposal contrasts a traditional scenario with the VSB scenario (i.e., immediate effect scenario vs. VSB scenario), as can be observed in Figures 7 and 8.
For both scenarios, input data on the characteristics of each piece of equipment considered in the simulation are required (see Table 2). Furthermore, for the VSB scenario, it is considered that repair interventions are independent and that the standby equipment (VSB) starts working at the exact moment of failure of the primary equipment (bar screens and grit chamber in this case).
This is usually known as standby [31].
As was explained before, the parameters of life degradation for the analysed equipment are modelled through a discrete uniform distribution. In the simulation model, the VSB machine must provide downstream systems with the same autonomy level provided by the primary machine to survive after a failure. Thus, the expected operating time of the virtual machine cannot exceed, in equivalent terms, the autonomy level of the primary machine. Accordingly, the operating time of the virtual machine in the case study will be modelled by a uniform distribution ($f^{vsb}_{i,j}(t)$) over [27–31] min. For the virtual machine approximation (standby), the virtual machine must be in perfectly reliable condition every time the primary machine starts operating (after an intervention). Then, the time to repair (TTR) of the virtual machine must be less than or equal to the difference between the TTR of the primary machine and the equivalent autonomy time of the virtual machine. With this, the virtual machine operates and is then maintained while the primary machine is being restored. In the best case, when the primary machine is repaired in less time than the equivalent autonomy time, the system assumes a TTR equal to 0.
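The rule for a single intervention, including the best case in which the repair fits entirely within the autonomy window, can be sketched as follows. The function name `virtual_backup_times` is hypothetical, and limiting the backup operating time to the repair time is an assumption of this sketch rather than a rule stated in the text.

```python
def virtual_backup_times(ttr_primary: float, autonomy: float) -> tuple[float, float]:
    """Return (backup operating time, effective time to repair) for one intervention.

    ttr_primary: time to repair of the primary equipment.
    autonomy: sampled autonomy time of the virtual machine, in the same unit.
    """
    if ttr_primary < 0 or autonomy < 0:
        raise ValueError("times must be non-negative")
    # Assumption: the backup only needs to run until the primary is restored.
    backup_operating = min(autonomy, ttr_primary)
    # Downtime perceived by the system; zero when the repair fits in the autonomy window.
    effective_ttr = max(ttr_primary - autonomy, 0.0)
    return backup_operating, effective_ttr


# A 2 h repair with 0.5 h of autonomy leaves 1.5 h of perceived downtime.
long_repair = virtual_backup_times(2.0, 0.5)
# A 0.25 h repair is fully absorbed by 0.5 h of autonomy.
short_repair = virtual_backup_times(0.25, 0.5)
```

Clamping at zero keeps the perceived repair time from becoming negative, which a plain subtraction would allow whenever the repair is shorter than the autonomy time.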
It is important to highlight that the mean time to failure (MTTF) for the virtual machine will be directly dependent on the uniform distribution considered. If the MTTF of the virtual machine is compared with the MTTR of the virtual machine, most of the time the MTTF will be shorter than MTTR.

Simulation Results
Considering the elements that compose the system and the redundancy configurations, a horizon of 365 days of operation was selected (approximately 8,760 hours under normal conditions), and 100,000 replications of said horizon were run. This is mostly to obtain a representative sample with which to generate more accurate indicators and histograms. It is also important to highlight that some machines have very short autonomy times (e.g., Primary Clarifier); therefore, when analysing the time horizon in cases where the system is unable to provide influent to the aforementioned pieces of equipment, this autonomy time will become significant.
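The replication scheme can be sketched with a deliberately simplified model. This is not the simulation environment of the case study: it treats the whole line as one failing unit, assumes exponentially distributed times to failure and to repair with illustrative placeholder means (`mttf_h`, `mttr_h`), grants autonomy on every failure, and runs 200 replications instead of 100,000 to keep the example fast. All function names are hypothetical.

```python
import random

HORIZON_H = 8760.0  # 365 days of operation, in hours


def simulate_horizon(rng, mttf_h, mttr_h, use_vsb):
    """Simulate one horizon and return availability (uptime / horizon).

    Exponential failure and repair times are an assumption of this sketch.
    """
    clock = 0.0
    uptime = 0.0
    while clock < HORIZON_H:
        ttf = rng.expovariate(1.0 / mttf_h)  # operating time until the next failure
        ttr = rng.expovariate(1.0 / mttr_h)  # real time to repair
        # Autonomy is always drawn so both scenarios consume the same random stream.
        autonomy_h = rng.randint(26, 30) / 60.0
        covered = min(autonomy_h, ttr) if use_vsb else 0.0
        up = ttf + covered  # effective operating time perceived by the system
        uptime += min(up, HORIZON_H - clock)  # truncate the last cycle at the horizon
        clock += ttf + ttr  # the cycle length is the same with or without VSB
    return uptime / HORIZON_H


def mean_availability(replications, seed, use_vsb, mttf_h=10.31, mttr_h=1.67):
    """Average availability over independent replications of the horizon."""
    rng = random.Random(seed)
    runs = [simulate_horizon(rng, mttf_h, mttr_h, use_vsb) for _ in range(replications)]
    return sum(runs) / len(runs)


# Same seed in both scenarios, so the only difference is the virtual backup.
baseline = mean_availability(200, seed=1, use_vsb=False)
with_vsb = mean_availability(200, seed=1, use_vsb=True)
```

Because both scenarios share the same failure and repair draws, the difference `with_vsb - baseline` isolates the contribution of the autonomy window instead of mixing it with sampling noise.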

Analysis and Results 1: Immediate Effect Scenario.
The performance indicators to measure are availability, operation time, mean time to failure (MTTF), mean time to repair (MTTR), and the total effluent produced by the system. The outcome for the immediate effect scenario is compared with the traditional statistical analysis (RBD). The expected indicators of the simulation approach and of RBD are quite different because in the simulation model the failure propagation is direct, whereas in the RBD approach the assets are assumed to behave independently. The results are presented in Table 4.
According to the simulation results and the requirements for VSB, BS_001 and GRIT_001 would be the incumbent equipment based on the availability indicator (96.68% and 96.19%, respectively) and their buffering capabilities. In this scenario, the expected mean percentage availability of the system is 85.91%, and during this available time, 88.98% corresponds to the system actually working. Since the logical configuration is in series, any change of state of a machine or set of machines will induce a state change in the overall system.
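For reference, a classical RBD with independent blocks in series computes system availability as the product of the block availabilities. The sketch below applies that textbook rule to the two availabilities quoted above; the function name `series_availability` is hypothetical, and the remaining blocks of the line are left out because their values are not quoted in the text.

```python
from math import prod


def series_availability(availabilities):
    """Availability of a series RBD with independent blocks: product of block availabilities."""
    if not all(0.0 <= a <= 1.0 for a in availabilities):
        raise ValueError("availabilities must lie in [0, 1]")
    return prod(availabilities)


# Only the two values reported in the text; further blocks would multiply in the same way.
bs_and_grit = series_availability([0.9668, 0.9619])
```

Each additional block in series can only lower this product, which is why the least available pieces of equipment dominate the availability of a series configuration.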
For that reason, it is important to identify which pieces of equipment require reliability improvements in order to decrease the frequency of system failures; in this case, both the bar screen (BS_001) and the grit chamber (GRIT_001) were identified. This means increasing their mean times to failure, 45.88 and 68.86, respectively. When comparing the simulation and RBD results, a specific deviation can be verified. As commented before, the difference originates from the assumption of independent machine behaviour in the RBD model, whereas the simulation model incorporates the effect of individual failure propagation. Indeed, in the simulation model, whenever a failure occurs, the operating time of all working machines stops (wear stops because of failure propagation, and of course the reliability is not improved). During that time, the machine that failed is maintained; furthermore, for maintenance actions shorter than the buffer span, the failure will not spread through the system, which will continue working normally. This feature is essential to conclude that the problem is incompatible with RBD modelling and that the VSB proposal is an interesting alternative to address it.

Analysis and Results 2: VSB Scenario.
When introducing the Virtual Standby effect, a more realistic availability estimation is obtained. Table 5 shows the main results for availability, operation time, expected production, and maintainability.
Again, bar screens and grit chamber are the incumbent equipment because of their availability and buffering capabilities.
The pump system and clarifier have increased their operating time thanks to the VSB because, as mentioned before, it treats buffers as actual working pieces of equipment and not just as an accumulation of inventory, as they have been considered until now. For the VSB scenario, the mean percentage availability of the system is 86.62%, and during this available time, 89.72% corresponds to the system actually working. This last indicator is the most important, since it allows the scenarios to be compared. It is also possible to observe that the mean time to repair (1.72 hours) and the frequency of failure (11.13 hours of functioning) have improved, which is reflected in the fact that the produced effluent increased (+0.13 million gallons) along with availability (+0.71%). As a particular case, it was considered that the amount of time taken to repair the primary equipment is the same as the time horizon used to calculate the mean percentage availability of the virtual equipment (BS_002 and GRIT_002). In other words, said percentage indicates the relative amount of time during which the virtual equipment operates in support of the primary equipment, which in this case is 26.97% for the bar screen and 21.15% for the grit chamber.
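The reported availability gain can be checked directly from the figures of both scenarios. The block below is only this subtraction, using the immediate effect values (85.91% and 88.98%) reported in the previous subsection; the symbols $A$ and $W$ are introduced here for illustration, and the differences are expressed in percentage points.

```latex
\begin{aligned}
\Delta A &= A_{\text{VSB}} - A_{\text{immediate}} = 86.62\% - 85.91\% = 0.71\ \text{percentage points},\\
\Delta W &= W_{\text{VSB}} - W_{\text{immediate}} = 89.72\% - 88.98\% = 0.74\ \text{percentage points},
\end{aligned}
```

where $A$ denotes the mean availability of the system and $W$ the share of the available time in which the system is actually working.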

Discussing Simulation Results and VSB Methodology Advantages
Comparing the results presented in Tables 4 and 5, it is possible to determine the effect of the incumbent machines which, under a failure scenario, are capable of providing continuity to the subsystem downstream by themselves and for a limited time.
To understand the differences between the simulation models (with and without VSB) is relevant to analyse key indicators such as the outcome of the VSB simulation for MTTR (1.72 hours) and frequency of failure (11.13 hours of uninterrupted work) that are higher than the values obtained in the immediate effect scenario (10.31 and 1.67, respectively), supporting the increased production (+0.13 million gallons) and availability (+0.71%) results. e VSB simulation model generates improvement in the reliability of the process (some specific detentions have no effect on the overall system). From the maintainability  Complexity point of view, the real downtime will be reduced or compensated according to the isolation time generated by the upstream system. erefore, it is important to study in detail each process to understand and find improvement opportunities.
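The size of these differences can be made explicit with a few lines of Python. The sketch assumes that, in the immediate effect scenario, 1.67 hours is the MTTR and 10.31 hours is the uninterrupted working time between failures; the dictionary keys are hypothetical names introduced for illustration.

```python
# Indicators for each scenario (hours).
# Assumption: 1.67 is the MTTR and 10.31 the time between failures
# in the immediate effect scenario.
immediate = {"mttr_h": 1.67, "time_between_failures_h": 10.31}
vsb = {"mttr_h": 1.72, "time_between_failures_h": 11.13}

# Relative change of each indicator when moving to the VSB scenario.
relative_change = {
    key: (vsb[key] - immediate[key]) / immediate[key] for key in immediate
}

for key, change in relative_change.items():
    print(f"{key}: {change:+.1%}")
```

The uninterrupted working time grows by about 8%, while the MTTR grows by about 3%. One possible reading, in line with the text, is that short stoppages absorbed by the buffer no longer count as system failures.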
When analysing the simulation results, it is possible to understand the strength of the VSB proposal for complex systems. First, when a simulation model is built, the RBD model is set aside because its assumption of machine independence prevents the inclusion of the operational continuity effect. When the immediate effect scenario is compared with the results of the VSB simulation, the positive impact of the proposal is evident, considering the precision achieved for the reliability, maintainability, and availability indicators. In summary, the VSB model offers the following advantages:
(i) It incorporates dependencies between the machines of a process
(ii) It evaluates the effect of machines that, under a failure scenario or subject to delays, are capable of providing continuity to the subsystem downstream
(iii) It adjusts the operational capacity to a specific process condition
(iv) It has the flexibility to include buffering effects or self-autonomy without complex modelling
(v) In processes with a low reliability level, the VSB model will have a high impact because virtual machines will be required more frequently and the operational continuity downstream will be activated each time a failure occurs
(vi) In processes where the operational continuity effect is present in many machines, the VSB model will be more efficient than other methodologies, considering the representation limitations of RBD and Markov chains and the complexity of traditional buffer inventory level modelling, as explained in the Introduction section
(vii) When the VSB proposal is included, the model will be more accurate in reliability and maintainability assessment, mainly due to the representation of more realistic operational conditions associated with operational continuity and repair processes
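The comparison between the immediate effect scenario and the VSB scenario can be sketched with a small Monte Carlo simulation in Python. This is not the simulation model used in the study: it considers a single upstream machine, assumes exponentially distributed times to failure and repair, and treats the virtual equipment as a fixed buffer span. All parameter values and names are hypothetical.

```python
import random


def simulate_downstream_availability(
    mttf_h: float,
    mttr_h: float,
    buffer_span_h: float,
    horizon_h: float,
    seed: int = 0,
) -> float:
    """Fraction of the horizon in which the downstream subsystem is fed.

    Assumptions: exponential times to failure and repair; the buffer
    covers the first `buffer_span_h` hours of every repair.
    """
    rng = random.Random(seed)
    clock = 0.0
    downstream_down = 0.0
    while clock < horizon_h:
        # Operate until the next failure of the upstream machine.
        clock += rng.expovariate(1.0 / mttf_h)
        if clock >= horizon_h:
            break
        # Repair the machine, truncating at the end of the horizon.
        repair = min(rng.expovariate(1.0 / mttr_h), horizon_h - clock)
        # Only the part of the repair beyond the buffer span propagates.
        downstream_down += max(0.0, repair - buffer_span_h)
        clock += repair
    return 1.0 - downstream_down / horizon_h


# Hypothetical parameters (hours); the same seed gives the same failure history.
params = dict(mttf_h=50.0, mttr_h=2.0, horizon_h=10_000.0, seed=42)
availability_without_vsb = simulate_downstream_availability(buffer_span_h=0.0, **params)
availability_with_vsb = simulate_downstream_availability(buffer_span_h=1.0, **params)

print(f"Immediate effect scenario: {availability_without_vsb:.4f}")
print(f"VSB scenario:              {availability_with_vsb:.4f}")
```

Because both runs share the same random failure history, the difference between the two availabilities isolates the effect of the buffer span, which mirrors the scenario comparison discussed above.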

Conclusions
Performance analysis must be an integral part of engineering, reliability assessment, and operational management, whether controlling operating plants or evaluating newly designed projects, especially for complex systems. Simulation is a widely used method to estimate indicators such as performance in the early stages of development, especially when features such as physical dependency, maintainability, and reliability, among others, can be embedded in the model. Evidently, the most important outcome of this paper is the validation of the proposed methodology, which introduces the VSB effect to improve accuracy when modelling an industrial facility, and the development of a case study of a wastewater treatment process (primary treatment). The obtained indicators show that when using VSB, the computed availability increases by 0.71% and, consequently, so does the effluent produced by the plant. They also reveal critical equipment or possible bottlenecks due to maintainability and reliability issues. These results are detailed in the Simulation Results section.

Two new subsystems (standby configuration) are recognized, representing the integration of the main and virtual equipment. The creation of these new subsystems is needed to evaluate the operational impact of resilience on the indicators of interest. This virtual equipment approximates the impact of the resilience condition on the main system and subsystems.
As a summary, the results of the modelling allow the following:
(i) Forecast the performance of each piece of equipment, each subsystem, and the overall wastewater treatment system
(ii) Identify the equipment with the poorest performance
(iii) Track the relevant incumbents in the performance outcome, especially for reliability and maintainability
(iv) Acknowledge the risk level (probability) for decision-making processes
(v) Evaluate the results for the scenarios, and determine the expected effect of the VSB operational restriction

In conclusion, this proposal has developed an innovative probabilistic methodology to simulate, analyse, and quantitatively evaluate the impact of Virtual Standby (VSB) on production performance. A case study of a wastewater treatment line was developed, and the model has made it possible to determine different production levels based on the VSB impact. The results also encourage the use of this model in the early stages of any project (design stage) to promote highly efficient investments and future productivity.

Data Availability
The data used to support the findings of this study are provided in the Supplementary Materials.

Disclosure
The research work was partially performed within the context of the PhD research work of Pablo Viveros and Fredy Kristjanpoller at the University of Seville. The research work was performed within the context of UTFSM Project-PI_LIR_2020_5.