Dynamic Probability Analysis for Construction Schedule Using Subset Simulation

Schedule management is an essential part of construction project management. In practical management affairs, many uncertainties may lead to potential project delays and make the schedule risky. To quantify such risk, the Probabilistic Critical Path Method (PCPM) is used to compute the overdue probability. Survey shows it could help project managers understand the schedule better. However, two critical factors limited the application of PCPM: computational efficiency and timeliness. To solve these constraints, we combined subset simulation and statistical learning to build a computationally efficient and dynamic simulation system. Numerical experiment shows that this method can effectively improve the computation efficiency without losing any accuracy and outperforms the other approaches with the same assumptions. Besides, we proposed a machine learningbased way to estimate task duration distributions in PCPM automatically. It collects real-time progress data through user interactions and learns the best PERT-Beta parameters based on these historical data. Our estimator provides our simulation system the ability to handle dynamic assessment without laborious human work. -ese improvements reduce the limitations of PCPM, making the application of PCPM in practical management affairs possible.


Introduction
Schedule management is a vital part of construction project management. Overdue projects will bring potential economic losses to the project. But, on the other hand, groundless period reductions will inevitably bring risks to the project. So, applying scientific management methods is particularly important to evaluate the project schedule and control the risk of delay.
Critical Path Method (CPM) is a widely accepted and well-used tool in planning, controlling, and scheduling construction projects. It can help the project manager to identify the critical path in the schedule and determine the project duration when all tasks' durations are determined [1]. However, in practice, there exists uncertainty and volatility in projects. ese will undermine classical deterministic CPM. To deal with this situation, lots of improvements were put forward to extend CPM to a nondeterministic environment. Unlike classical CPM, extended CPM computes the critical path on assumption that the task duration is nondeterministic, and it has to estimate the probability of project delay. is feature provides the project manager a quantitative way to evaluate the risk in project progress and makes extended CPM more appealing. Generally, major existing methodologies used in extended CPM could be classified into three types. e "most critical path" method was firstly proposed by Soroush in 1994 [2]. is method focuses on the "most critical path" problem and provides a heuristic to identify the near-optimal path to such a problem. Since this method only uses the "most critical path," rather than the entire schedule network, to compute the delay probability, it runs very fast and is more accurate than the classical CPM approach. is method provides a good perspective to understand critical path [3] as well as an extensible solution framework. Some researchers combined it with artificial intelligence to find the optimal critical path [4,5], and some extended it to fuzzy domain [6].
Unlike classical CPM, Fuzzy CPM (FCPM) represents an activity duration by replacing a real number with a fuzzy number and uses fuzzy operations to calculate the fuzzy critical path and the fuzzy project duration. By combining different kinds of fuzzy numbers and fuzzy operations, researchers put forward many constructive and useful methods. ese contributions resolved some critical defects of FCPM, like negative time and invalid fuzzy subtraction, and provide an efficient way to resolve the fuzzy critical path problem for a project network. Nowadays, many researchers are still working on its variants and verifying its effectiveness and practicability in practice [7][8][9][10].
Similar to Fuzzy CPM, Probabilistic CPM (PCPM) represents an activity duration by a random variable. is makes the completion time also a random variable. e ultimate goal of PCPM is to find out the probability where the completion time is higher than the given deadline and identify the critical path in a probabilistic context. Researches showed that because of its simple probabilistic interpretation, this methodology could be easily understood by project managers and help them to manage a project better as well as minimizing delay risks [11][12][13][14][15].
It is worth noticing that, though formulated in a simple way, PCPM is in fact hard to solve and is still under study.
Four key factors might explain such difficulty. First, PCPM is a high-dimensional integral of the probability density function on the failure domain defined by an implicit limit state function. Second, this limit state function is composed of several mutual-referenced and nested computation blocks, making it behave like a black box system. ird, since there are lots of nonlinear logical operands, like max and min, in nested blocks, the limit state function is also highly nonlinear. is fact makes some popular delegate models, like First-Order Reliability Method and Response Surface, fail to approximate the target function. Fourth, Direct Monte Carlo Simulation (DMCS) is time-consuming to reach an unbiased numerical estimation, which might be seemingly unpromising in practice. Considering all the difficulties described above, we suggest adopting a well-designed black box system reliability algorithm, which could solve a reliability assessment problem without caring about the details of our target system. Subset simulation is such an algorithm. It is a numerical method widely used in structural engineering [16][17][18] and soil engineering [19,20]. Existing research studies showed that subset simulation is computationally efficient in estimating small failure probabilities in highdimensional black box reliability problems. But this method is rarely used in project management. Given the similarities in implicity and nonlinearity between our system and existing researches, we think subset simulation will also work on the PCPM problem.
Apart from the computation efficiency problem, there is another important issue to address. Most researches mentioned above did not consider the timeliness of assessment. As a project goes on, the task duration estimations in the schedule will change. A feasible solution is to reestimate them manually, but it will bring lots of extra workloads, which could be avoided. To realize an automated real-time reliability assessment, we introduce statistical learning into PCPM. is method can handle the dynamic evaluation in the construction life cycle without much extra work. Based on historical progress data collected from management software, the duration distribution of a task could be estimated automatically using machine learning. is feature would make PCPM more useful in practical management affairs.
In this study, we will provide a new method, combining subset simulation and statistical learning to resolve the project delay problem efficiently and dynamically. e three primary research objectives are described as follows: (1) Introduce subset simulation-based PCPM and demonstrate its working process as well as implementation details (2) Illustrate how statistical learning can be plugged into PCPM, propose the cost function of this learning system, and derive the formula of task duration distribution given collected data (3) Verify the effectiveness of the proposed method and compare it with other methods e organization of the rest of this paper is summarized as follows. Section 2 provides a short introduction to Critical Path Method (CPM) as well as a well-designed algorithm solving Critical Path Problem and then formulates the Schedule Reliability Problem. is problem will be solved in Section 3, using subset simulation. is method is a computation-efficient algorithm to estimate small failure probability. We will demonstrate how to use this method to solve the Schedule Reliability Problem. After that, we will explain how we can apply statistical learning in our PCPM system in Section 4. By combining these parts, Section 5 will demonstrate our method as a whole and display its design and implementation. en, we used a hypothetical test case to compare our proposed method with other models in Section 6. Finally, Section 7 gives our concluding remarks.

Preliminaries
In this section, we will first formulate the Schedule Reliability Problem we want to resolve. en, we will introduce a well-designed algorithm, solving the critical path problem. is algorithm will work as an implicit subexpression of the Schedule Reliability Problem. Basics of Critical Path Method and Schedule Network are omitted for conciseness.

Schedule Reliability Problem.
A schedule does not always work as project managers expect. Sometimes it might fail. In other words, the real completion time of a project would exceed the time limit given by the contract. is is a disastrous thing we need to avoid. To quantify delay risk, we assume that the completion time of a project is a random variable. Given a deadline, the delay probability could be formulated as follows: where D � (D 1 , D 2 , . . . , D n ) T is a vector collecting the duration variables D k of every task; T c (D) is an implicit function transforming durations of tasks to the completion time; I(x) is an indicator function which takes 1 if x is positive; otherwise, it takes 0; p(D) is the joint probability density function of the duration variables. us, our goal is to compute the delay probability and ensure it is less than a given threshold, for example, 5%.

Modified Dijkstra Algorithm.
To solve equation (1), we need to solve the Critical Path Problem first. Since the critical path is the longest in a schedule network, we could compute the completion time T c and get the Critical Path using the Modified Dijkstra Algorithm (MDA). is algorithm is proposed by Ravi Shankar and Sireesha to find the Critical Path in Schedule Network [21].
MDA uses Activity-On-Node (AoN) representation and represents a schedule by a graph G, denoted as a tuple G � (V, E). V are the vertices in the graph, representing the tasks in the schedule. For each task v k , a vertex attribute d k is assigned, defining task's duration. E are the directed edges in the graph, representing the work dependencies.
MDA improves the classic Dijkstra Algorithm by adding a topological sorting [22] before computing the critical path.
is improvement could improve efficiency significantly, especially when repeat sampling is needed. After toposorting, the forward pass is applied to get the earliest start (ES) and earliest finish (EF) of each task. e completion time is the latest EF among all the tasks. Once completion time is computed, the backward pass is applied to get the latest start (LS) and latest finish (LF) of each task. Total Float (TF) is the difference between the Earliest Start (ES) and the Latest Start (LS) of an activity. A task is a critical task when its total float is zero. e pseudocode of MDA is given as follows, and note that the nodes in MDA are presorted in topological order, which satisfies two indexing constraints i < j∀i ∈ Node[j] · predecessors and i < j∀j ∈ Node[i] · successors (Algorithms 1 and 2).

Subset Simulation
is a nested and implicit function, it is hard to solve equation (1) analytically for most schedules. Instead, Monte Carlo Simulation is used to get a numerical solution, formulated as follows: However, Direct Monte Carlo Simulation (DMCS) is known to be computationally inefficient. To get a stable estimation, lots of samples are required, making the computation process time-consuming. To solve equation (2) efficiently, subset simulation is applied in our study.
Subset simulation was first proposed by S. K. Au to estimate the small failure probability of a structural system in high dimensions. Because of its efficiency, this method was widely used in structural engineering and soil engineering. e basic idea of subset simulation is to express a small probability as a product of a series of larger conditional probabilities by introducing intermediate failure events. Since intermediate failure events are much easier to solve, this decomposition could reduce unnecessary sampling and accelerate the computation. is process could be conceptually expressed as follows: where T (i) d is the generated time limit in ith iteration and there is T (n) d � T d in final iteration. According to the definition above, two critical questions should be answered: (1) How to choose the intermediate failure event In the following parts, these questions will be answered by Au's standard subset simulation algorithm. And we will illustrate how it could solve Schedule Reliability Problem.

Standard Subset Simulation Algorithm.
For simplicity, our study used S. K. Au's standard algorithm. In this algorithm, samples in each iteration are generated by Markov Chain Monte Carlo (MCMC) using seeds, and intermediate time limits and seed samples in iteration are chosen by a fixed threshold percentile [16].
ere are only two parameters used in standard algorithm: batch size N b which is the size of samples in each iteration and seed ratio p s which is the ratio between seeds and samples in each iteration.

Intermediate Event Selection. Suppose a batch of samples
using Modified Dijkstra Algorithm. en, we sort these times in a decreasing order. By selecting the samples with the top [p s N b ] longest completion time, we can ensure that the intermediate events where r � [p s N b ] and T (k) cr is the rth largest completion time among samples.

Conditional Batch Sampling.
e simplest sampling policy is the Metropolis-Hastings sampling, which is based on a symmetric proposal with q(D 0 )π(D 0 , D 1 ) � q(D 1 )π (D 1 , D 0 ). In our study, transition π(D 0 , D 1 ) ∼ N(0, σ) is used. We accept the move on Markov chain with an acceptance probability of α � min 1, Advances in Civil Engineering 3 where p(d) is PERT-Beta Distribution described in Section 4.1 and F is the intermediate failure event

Procedures. e pseudocode of standard algorithm is shown below (Algorithm 3).
is process is illustrated in Figure 1.

Dynamic Distribution Estimation
In this section, a historic data-based duration distribution estimation method is proposed. is method uses the PERT-Beta distribution as the probabilistic model. And it takes three assumptions to make the problem easier to solve. Finally, it uses gradient descending, a widely used optimization tool in machine learning, to estimate observational duration distribution.

PERT-Beta Distribution.
To make PCPM work, the duration of each task should be modelled properly. An empirical distribution is usually used when there is a lack of observational data. Some survey shows beta distribution is a suitable one [23][24][25]. is distribution requires three empirical parameters: most likely duration b, optimistic duration a, and pessimistic duration c of the activities. e process of defining these subjective values is called the PERT three-point estimation method. e density function of PERT-Beta Distribution is given by equation (6) and illustrated in Figure 2.
where parameters in beta function are determined by However, estimation is not a static thing. With the project going on, time distribution should be corrected and reestimated according to the status quo. e project manager could do these works but will spare lots of effort. It would be time-saving if we could estimate the distribution automatically.

Assumptions.
Statistical learning could be applied in our work to estimate distribution automatically. However, to build such a learning system, we need some basic assumptions to construct the cost function of this system. In our study, we make three assumptions about task progress and time distribution.
(1) Constant Working Rate. In project management, the relationship between time and remaining works, Advances in Civil Engineering burndown chart, a curve depicting is usually an S shape curve. But to simplify the computation, we assume that this curve is a straight line. is assumption works on optimistic, average, and pessimistic working rates r a , r b , and r c . Given any remaining progress p, the parameters could be computed using linear interpolation as follows: (2) Fixed Risk Preference. Parameters α and β control the shape of Standard Beta distribution, deciding its skewness, or in another perspective, a likelihood of a task finished in a short time. If a task is more likely to be completed in a shorter time, α is greater and β is (1) generate initial samples: f < p s (6) select seeds and intermediate failure event T (k) d using method in Section 3.2.1 (7) generate next batch of samples D (k+1) using method in Section 3.2.2 (8) else (9) break (10)    Advances in Civil Engineering smaller. To some degree, they reflect an implicit risk preference of a project manager. So, in our study, once α and β are chosen, they do not change anymore.
(3) Two-Step Estimation. ere is a growing uncertainty when we want to predict the future, while with progress going on, this uncertainty could be diminished. To depict this fact, we take a two-step method to estimate observational distribution. We first assume all the observations come from the "future" and use them to estimate the working rate at different conditions. en, these working rates are used to construct the final distribution using the current state.
ese assumptions are illustrated in Figure 3.

Statistical Learning.
Based on the assumptions before, we could derive the cost function of our learning system. According to equation (9), the likelihood could be expressed in Since α and β are assumed to be fixed and the working rate is constant, by taking the negative log-likelihood of observations, cost function L(T) could be derived as follows: us, our purpose is to optimize the loss function min r a ,r c L(T) and estimate three working rates r a , r b , and r c . Taking the partial derivative of loss function, we will have Using gradient descending, we could have numerical optima r * a , r * c using the update rule as follows: After determining r * a and r * c , r b could be computed using the definition of α; see equation (7).
With all three working rates r * a , r * b , and r * c computed, the final estimation could be determined. Suppose the current remaining progress is p c ; then the updated range (a 0 , b 0 , c 0 ) could be given as follows:

Design of Simulation System.
A detailed design of our system and the relationship between major classes are described in the class diagram in Figure 4. Some important members and methods of major classes are presented, but some trivial members or methods are omitted for lack of space.
(1) e class App is the entry class that handles all the interactions from users: we could configure the simulation using class Setting. e Setting class defines the time limit t d , batch size N b , and candidate ratio p s . And it affects the result and performance of subset simulation. We use class RecordManager to add a progress record to the database and estimate the duration distribution of a task. We use class SubsetSimulation to evaluate our schedule and get some useful indicators.
(2) e class RecordManager is the manager class that takes in the record inputs and stores them. Besides, it also provides the function to estimate the distribution of a task using historical data. is process is described in Section 4. And when a simulation begins, the estimated beta distribution parameters would be provided to samplers to construct a batch of samples. (3) e class SubsetSimulation is the entry class where our simulation happens. is class computes the failure probability and sets the average total float of tasks respectively. We described this process thoroughly in Section 3. SubsetSimulation will use MDA class to compute the Critical Path and completion time of the schedule. You can find the theory about MDA in preliminaries. (4) e class Graph is a helper class that represents a schedule network using an adjacency list. It contains the topological structure of the schedule network and could be used to find the successors or predecessors of any given task. e structure is stored in database separately, using class Task and Dependency. ese two classes correspond to two tables, respectively, using Object Relation Mapping (ORM). ese classes will generate three kinds of data:

Advances in Civil Engineering
(1) Persistent constant: these data cannot be changed and stored persistently in memory, like graph structure and history records (2) User-defined variable: these data are generated by user interactions, like configuration parameters of simulation (3) Intermediate variable: these data are generated by program and will only be used once, like beta distribution parameters and sample e data flow diagram explaining how data are passed between classes is shown in Figure 5.

Process of Simulation System.
e following process describes all the steps in our simulation system, including the creation of the schedule network, distribution estimation, and estimation for failure probability.
(1) Schedule Network Construction. Construct the schedule network using an adjacency list, and presort the network using Khan's algorithm to get the topological order [22]. In this step, the network could be validated; for example, circular working dependency could be detected. Besides, precomputed topoorder could save unnecessary computing time in subset simulation sampling. (2) Task Duration Distribution Estimation. Use empirical parameters and observational data to estimate the time distribution for each task. is step is thoroughly described in the previous section.
(3) Failure Probability Estimation. Estimate the failure probability of a given schedule network using subset simulation and check whether the answer is acceptable. Usually, a criterion is chosen (e.g., 1% or 5%). If the estimation is higher than this criterion, we have to consider adjusting our plan. (4) Schedule Adjustment. If failure probability is unacceptable, we have to find out the critical activities in the schedule like in classical CPM. By averaging the total float in samples, we could find those critical activities with small floats. We could allocate more resources to these activities to reduce the completion time and further reduce the failure probability.

Features and Limitations.
ere are three practical features in our proposed simulation system: (1) Schedule Evaluation. Using the algorithm in Section 3, we could use a failure probability to quantize the delay risk. If this probability is too high, a project manager needs to change the schedule or extend the deadline. To modify the plan, we have to find critical activities and lower their duration. To prolong the deadline, we can use the empirical distribution of completion time to select a proper deadline. (2) Critical Activities Detection. When there is a high possibility for a schedule to be late, critical activities could be located using a threshold of total float. If the total float of a task is lower than this threshold, it is a critical activity. is threshold describes the flexibility we can manage. A lower threshold will sift out fewer activities and usually requires a better management capability.
(3) Task Duration Prediction. When the records are stored, the estimation of task duration will also tell us the most likely finish date. We could compare this date to the planned finish date. If there is an unacceptable lag, this activity would be marked. And the system will alert the project manager. is process is automatic and data-driven and could improve project manager's efficiency and provide more information.
All these three features are implemented through two web pages, one for collecting progress data and one for visualizing and analyzing project progress; see Figure 6.
ough there are some practical features in our method, there are still some limitations we are striving to solve. First, the threshold may not be a proper way of finding critical activities. Since there is a possibility that an improper threshold may select all or none of the tasks, such a threshold would be unhelpful. is fact may make threshold selection a potential problem. Besides, the total float threshold only captures the average behaviour of a task. It could help reduce the risk of schedule while reducing the variance of task duration will also work. Currently, there is no indicator describing the relationship between the variance of task durations and the risk of schedule in our method.

Illustrative Examples and Computational Results
is section will verify the performance of our proposed method and compare it with DMCS, Soroush's LUBE [2], and Chen's FCPM [10] in terms of precision, efficiency, and stability.

Soroush's Benchmark Example.
In this part, we will use an illustrative example from Soroush's work.
is hypothetical test case consists of 21 activities shown in Figure 7. Our experiment also takes the assumptions in Soroush's work. e activity times are assumed to be beta distributed, and events v 1 and v 14 are the starting and terminal events, respectively. e optimistic time a k , most likely time m k , and pessimistic time b k for each activity are given in the parentheses beside that activity. To compare our work to Fuzzy CPM (FCPM), each activity's fuzzy duration is formulated in a triangular fuzzy number d k � (a k , m k , b k ) using the same parameters in PCPM.
For each method, we repeat simulations 20 times to get the mean estimation p f and its average running time t. And to compare the stability of numerical results, variation coefficients c v � σ p f /p f are also computed. As for the setting of sampling-based methods, 50,000 samples were used in DMCS, and a batch size of N b � 5000 and selection ratio p s � 0.1 were used to configure the Subset Simulation. Our  Table 1. Since Direct MCS and our method are based on sampling, generally they are slower than the one-pass method, like LUBE and FCPM.
is is a common shorthand of sampling-based PCPM, but subset simulation accelerates this process while nearly retaining the same precision and stability. is makes the application of PCPM possible. On the other hand, compared to the one-pass method, subset simulation-based PCPM reaches the lowest error compared to Direct MCS. is result shows that subset simulation could provide a more reliable solution. In all, subset simulation combines computation performance and efficiency. ese make it a probably good way to solve the probabilistic CPM problem.
By drawing the relationship between completion time limit and failure probability in Figure 8, we also find that the curve given by subset simulation is the closest one to the DMCS one, while there exists a systematic error in LUBE and FCPM. is observation could be explained by the fact that LUBE only focuses on one most critical path, while other paths may also lead to failure, leading to a higher failure probability. Fuzzy numbers and fuzzy operations cannot capture all the natural complexity of probabilistic modelling.
us, deviation will occur. In contrast, subset simulation could depict this relationship very well. is would provide the project manager a reliable decision basis.
Another important indicator, total float, is also computed. A precise estimation of float would help the project manager identifying important tasks and allocate resources to them. Figure 9 shows that the total float estimation given by FCPM is lower than the ones given by PCPM, which may lead to more resource demand and hurried plans, while subset simulation gives a similar result to DMCS, and numerical errors are acceptable.

A Schedule Network in Real Project.
In order to verify the performance of our algorithm in real practice, a schedule network from a real project containing 173 activities and 195 work dependencies is used. is project is a multifunctional building, including three parts: the main tower for office usage, the auxiliary building for business, and the basement for parking. e construction work is broken down in the following way: the basement is divided into 24 individual working segments each floor and auxiliary building is divided into 3 individual working segments each floor. Each segment is constructed floor by floor, which creates working dependencies. e division of our project and schedule network are illustrated in Figure 10.      e activity times of each working segment are given in Table 2.
e hardware and software environments of this experiment are consistent with those of the previous one. As we can see in Table 3, comparing with other methods, our estimation is the most closet to the DMCS result. And LUBE is still lower than ours, which will be explained in the following discussion. Another point worth noting is that with the growth of the schedule network's size, our method's computational efficiency advantage becomes more obvious compared with DMCS. And its running time is acceptable in real practice. Project managers can get more accurate estimates while waiting less time.
By drawing the relationship between completion time limit and failure probability in Figure 11, we also find that curve given by our method is the closest one to the DMCS one. Still, LUBE curve is under DMCS curve, which means it underestimates the overdue risk. is fact is caused by the inherent flaw of LUBE and would mislead project managers to make an excessively optimistic estimation, which may cause potential overdue risks. But our method solved this problem well: since our method will traverse the whole schedule network, rather than one path, our PCPM will give a comprehensive consideration which depicts the risk of the network as a whole, rather than only one "most critical" path.
is mechanism makes our method work well for the full range of time limit.

Discussions.
From the experiments above, we can see that although those one-pass algorithms could give results quickly, due to the inherent deficiencies of algorithm design, there will be systematic errors. For example, LUBE fails when the failure probability is relatively high. is could be explained easily by a very simple example. Supposing there is    a project containing N identical paths. And each path has an independent failure probability q f . us, we can easily get the failure probability of this project using some basic probability theory. e right answer should be p f � 1 − (1 − q f ) N . However, according to LUBE, the failure probability is defined by the "most critical" path. Since all the paths are identical, LUBE estimation is q f . Notice that So, LUBE estimation is always lower than the failure probability of the whole schedule. is difference is especially obvious when the time limit is relatively tight. In this situation, the failure probability of each path will increase. For 0 < a < 1, a N ≈ a only works when a ≈ 1, which means q f shall be very small. However, with a tight time limit, q f cannot be small, which makes LUBE fail to approximate. Besides if there is more than one critical path, according to equation (15), the difference between p f and q f will become obvious. But this situation is common in those projects where parallel construction is needed. Besides, in most cases, paths are inherently correlated. If a node is overdue, all the paths containing this node will be affected. ese observations tell us when dealing with the schedule network, it will be better to treat the schedule as a whole, rather than a group of single paths. ough we have to admit sampling-based PCPM takes time during sampling, it is still worth doing it, since it can provide a more accurate estimation. From the experiments above, we can find that subset simulation can accelerate the sampling process greatly, making the running time of PCPM acceptable.

Conclusions
is paper proposed a new PCPM based on a data-driven subset simulation to solve the Schedule Network Failure Problem in a dynamic way. is method could compute the failure probability efficiently and effectively without loss of any result accuracy and outperforms the other approaches with the same assumptions. Besides, another important contribution of our study is to plug a data-based task duration distribution estimation into PCPM.
is finding provides a more objective way to estimate task duration distribution, reducing the variance in project managers' experience and improving the knowledge sharing between teams. ese key features would provide a good foundation for applying PCPM in practical management affairs.

Data Availability
e data used to support the findings of this study are included within the article.