Subway Tunnel Disease Associations Mining Based on Fault Tree Analysis Algorithm

Monitoring and control of subway tunnel diseases throughout operation determine whether subway operation is safe. To ensure operation safety, in-depth analysis of tunnel disease risks must be conducted. We first constructed a fault tree based on the tunnel diseases of Shanghai Subway. Using the subway tunnel maintenance work data, we calculated the occurrence probability of the elementary events of the fault tree, conducted quantitative calculation and analysis of the tunnel diseases, and identified the major diseases of the tunnels and their causes from the calculation results. The fault tree analysis (FTA) indicated that the main causes of common tunnel diseases include large passenger flow, shortage of maintenance personnel, maintenance error, personal carelessness, hot weather, and poor lighting. The probability importance of the elementary events of the tunnel diseases was analyzed as well. Finally, we proposed a tunnel disease association rule mining algorithm based on the support degree. By calculating the association among major diseases, we explored the detailed association mechanism of the diseases. In-depth mining of the association mechanism can provide theoretical and decision support for the prevention and comprehensive control of tunnel diseases and lay a solid foundation of practical guidance for subway operation safety in megacities.


Introduction
Against the backdrop of the rapid development of rail transit construction in China, subway tunnel diseases are getting worse and are commonly seen in operating tunnels. Analysis of the association mechanism of the diseases is conducive to the maintenance of tunnels and the improvement of subway operation safety. Tunnel diseases develop gradually. Detecting them before they cause serious consequences and taking timely measures can greatly improve the operation safety of subway tunnels. So far, the subway management department has measured tunnel safety indicators every year and collected a huge amount of data. However, these data are largely unused and have yet to play an effective role in the prediction and control of the diseases. Traditional management systems only run simple queries and statistics on these data and have not dug into the disease association mechanism hidden in them. Although the management department invests a great deal of manpower, material resources, and funds in the maintenance and control of tunnel diseases every year, the condition of the tunnels has not improved greatly, and sometimes even fatal accidents occur. This has drawn the attention of the relevant departments. It is therefore necessary to conduct in-depth analysis of subway tunnel diseases and their association mechanism.

Literature Review
Zhang [1] analyzed tunnel diseases by means of FTA, identified the factors affecting tunnel operation safety as the bottom events, and worked out the importance of the relevant cut sets of each bottom event in the fault tree system. By means of grey relational analysis (GRA), he derived the association degree of each minimum cut set, identified the bottom events in need of attention according to the association degree, and gave specific advice for management improvement. This method designed the FTA method from the perspective of reliability and is of great significance for practical guidance when applied to the analysis of tunnel diseases. Ling [2] proposed a mixed analysis method for the dynamic fault tree, modularized it, and separated dynamic and static modules according to the categories of gates. He also improved the existing method of converting a static fault tree into a binary decision diagram (BDD) and directly inferred the minimum cut sets of the static tree module. For the categories of dynamic gates, the cut set sequence is defined, and the dynamic fault tree module is solved according to the cut set sequence. The results of these two parts are combined to obtain the simplest cut set sequence of the whole fault tree. This algorithm can quickly and effectively complete the qualitative analysis of the dynamic fault tree. Luo and Gao [3] proposed a disease grading method and evaluation system for existing-line subway tunnels based on the "power exponent" method. They calculated the impact of a single disease on the tunnel structure in different development dimensions by means of different power exponents, graded the single tunnel disease, and realized comprehensive analysis from single factor to multiple factors via the power exponent operation, which provides the basis for integrative grading of tunnel diseases and evaluation of the overall health state. Gao and Zhang [4] analyzed the diseases of existing-line subway tunnels and proposed corresponding treatment methods according to the disease causation and stage. They studied the treatment measures and supporting equipment for several types of common diseases and, on this basis, proposed a corresponding supporting equipment system for the treatment measures of these tunnel diseases. Wei and Su [5] proposed a high-performance association rule mining algorithm, FP-Growth, based on the FP-tree. They examined data that are high in volume but sparse in data items and applied the new algorithm to mining the association of subway tunnel diseases. Wei and Su conducted relational analysis on the disease data of 343 key tunnels out of 2787 tunnels administered by Chengdu Railway Bureau and uncovered the hidden association mechanism of the tunnel diseases. Ding [6] designed a dispatching fault log management and analysis database system (DFLMIS), which contains almost all kinds of accidents that have occurred in the operation of the Metro. The research and design of DFLMIS would be of great help to Metro operators in identifying risk and raising the safety management level.
Many studies have been done on tunnel diseases and safety-related problems at home and abroad. Foreign scholars mainly emphasize the periodic maintenance and repair of tunnels, while scholars in China lay particular stress on prevention. Most studies are confined to a specific type of disease and a certain engineering condition and seldom go deep into the tunnel disease mechanism. As a result, the reliability and operability of the prevention measures are deficient. In addition, for lack of reference to historical disease data, these studies are not systematic, scientific, or in-depth enough, so the disease treatment process remains comparatively underdeveloped.
To date, subways have been through a stage of rapid development. Tunnels in operation have a variety of diseases of different types and degrees, which pose great risk to safe operation. With the progress of technology, descriptive analysis of tunnel diseases that depends only on traditional qualitative indexes cannot support the determination of maintenance threshold values. A maintenance standard has to be formulated so as to provide theoretical and practical references for guiding the maintenance of subway tunnel diseases. This paper explored a large amount of existing tunnel defect data by systematically analyzing the subway tunnel diseases and uncovered the association mechanism hidden in the tunnel disease data. This can provide a scientific basis and strong decision support for daily maintenance, disease detection, and disease treatment of subway tunnels.

Construction and Quantitative Calculation of Fault Tree of Subway Tunnel Diseases
By investigating operation and maintenance data of Shanghai Subway tunnels and surveying frontline technicians, we drew an accident tree of the tunnel disease fault mode and conducted corresponding quantitative calculation according to the causation logic relation of the subway tunnel diseases.

Design of Tunnel Disease Accident Tree
We reviewed the literature [7, 8] on tunnel diseases in the operation and maintenance period and on disease treatment methods, and inspected and investigated 32 tunnels on 14 lines of Shanghai Subway; the resulting data distribution of part of the disease causations is shown in Figure 1.
Via basic operations such as preprocessing the fault log data of subway tunnel operation and eliminating redundancy, we obtained the major tunnel disease data and calculated the respective occurrence frequencies, as shown in Figure 1.
After consulting the fault transmission mechanisms of the relevant tunnel diseases, frontline technicians analyzed the logical relationships among the elementary events of the tunnel diseases. On this basis, the fault tree model of the tunnel diseases in operation was constructed, as shown in Figure 2.

Quantitative Calculation of Shanghai Subway Tunnel Diseases Based on Fault Tree

Top-Event Occurrence Probability of Tunnel Diseases.
For subway tunnels under operation, the top events of the accident tree are the subway operation accidents caused by tunnel damage due to tunnel diseases, including plugged-in blackout, ATC dysfunction, large-scale train delay, and jamming caused by a sudden increase of passenger flow. We analyzed the logical relationships in the fault tree of the tunnel diseases and applied the fault tree simplification laws of Boolean logic. The top-event occurrence probability of the accident tree is calculated from the fault tree shown in Figure 2, and the values of the elementary events are shown in Table 1.

𝑇 = 𝑀
The analysis and simplified calculation of the accident tree indicate that, after subway tunnels have been in operation for some years, the occurrence probability of faults becomes high, and the tunnels need to be monitored periodically. In addition, the internal association and the impact mechanism among the causations of the tunnel diseases are in urgent need of in-depth mining and study; otherwise the prevention of tunnel diseases cannot be implemented rapidly and accurately.
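As an illustration of how a top-event probability follows from Boolean gate logic, the following minimal sketch evaluates a small fault tree bottom-up. The tree, the event names `x1` to `x3`, and the probabilities are hypothetical and are not taken from Figure 2 or Table 1. The sketch also assumes that the elementary events are independent and that each appears only once in the tree; a tree with repeated events would first need the Boolean simplification described above.

```python
from math import prod

# Hypothetical tree, NOT the tree of Figure 2.
# A node is an elementary-event name or a tuple (gate, children).
TREE = ("OR", [("AND", ["x1", "x2"]), "x3"])
PROBS = {"x1": 0.1, "x2": 0.2, "x3": 0.05}  # illustrative values only


def top_probability(node, probs):
    """Occurrence probability of `node`, assuming independent,
    non-repeated elementary events."""
    if isinstance(node, str):
        return probs[node]
    gate, children = node
    child_probs = [top_probability(child, probs) for child in children]
    if gate == "AND":
        # All children must occur.
        return prod(child_probs)
    if gate == "OR":
        # At least one child occurs: 1 minus "none occurs".
        return 1.0 - prod(1.0 - p for p in child_probs)
    raise ValueError(f"unknown gate: {gate}")


p_top = top_probability(TREE, PROBS)
print(round(p_top, 6))
```

For this toy tree the result is 1 - (1 - 0.1 × 0.2)(1 - 0.05) = 0.069.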

Minimum Cut Set of Tunnel Diseases.
A cut set is a set of elementary events whose joint occurrence gives rise to the top event. A minimum cut set is a cut set that no longer causes the top event if any one of its events is removed. Acquiring the minimum cut sets reveals the diverse causation combinations of accidents and the risk level of the system as well. Each minimum cut set is a possible mode of occurrence of the top events, so the number of minimum cut sets determines the number of ways in which the top events can occur. The more minimum cut sets there are, the more risk the system carries. The minimum cut sets show directly which events are the most serious, which are less serious, and which are negligible, and they indicate how to take measures to reduce the occurrence probability of accidents as soon as possible.
Calculating the minimum cut sets of the accident tree of tunnel diseases in Figure 2 yields the major fault modes of the tunnel diseases. Each minimum cut set is a minimum causation combination for the occurrence of the top events of the tunnel diseases: a top event on the accident tree occurs only when all the elementary events in some minimum cut set occur concurrently. Once an accident takes place, all possible ways that could have caused it can be located immediately, and the major cause can be found as soon as possible. However, for the prevention and control of tunnel diseases, the causation combinations alone are not enough to control the risk well; the importance of the elementary events must also be analyzed, and more specific prevention measures should be taken.
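To make the notion of a minimum cut set concrete, the sketch below expands a small fault tree into cut sets and then removes every cut set that strictly contains another one (Boolean absorption). The tree and the event names are hypothetical, and the top-down expansion is a generic textbook approach rather than the specific procedure used for Figure 2.

```python
from itertools import product

# Hypothetical tree for illustration only: T = x1 OR (x1 AND x2) OR (x2 AND x3)
TREE = ("OR", ["x1", ("AND", ["x1", "x2"]), ("AND", ["x2", "x3"])])


def cut_sets(node):
    """All cut sets of `node` as a set of frozensets (not yet minimal)."""
    if isinstance(node, str):
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        # Any cut set of any child triggers an OR gate.
        return set().union(*child_sets)
    if gate == "AND":
        # An AND gate needs one cut set from every child at once.
        return {frozenset().union(*combo) for combo in product(*child_sets)}
    raise ValueError(f"unknown gate: {gate}")


def minimal_cut_sets(node):
    """Keep only cut sets that do not strictly contain another cut set."""
    sets = cut_sets(node)
    return {s for s in sets if not any(other < s for other in sets)}


mcs = minimal_cut_sets(TREE)
for cut in sorted(mcs, key=sorted):
    print(sorted(cut))
```

Here the cut set {x1, x2} is absorbed by {x1}, which leaves {x1} and {x2, x3} as the two minimum cut sets.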

Importance Analysis of Elementary Events of Tunnel Diseases.
The importance analysis of the accident tree mainly covers structural importance, critical importance, and probability importance [9]. Given the analysis characteristics of the subway tunnel diseases, we use the probability importance, defined in formula (2), which directly reflects the importance of the elementary events of the tunnel diseases. The formula is expressed in terms of the top-event occurrence probability.
Substituting the elementary events in Table 1 and their occurrence probabilities into formula (2) yields the probability importance of the elementary events, formula (3). The calculation results of the probability importance of the elementary events are shown in Table 2.
Referring to the relevant documents and to the practice of prevention and control of tunnel diseases, this paper took the elementary events whose probability importance exceeds 0.4 as the mining objects for the association mechanism of the tunnel diseases: event 17, high temperature; event 2, unallocated personnel; event 1, personnel slack; event 12, bad weather; event 13, lamp fault; and event 4, maintenance error. Through calculation of the occurrence probability of the tunnel diseases and importance analysis of the elementary events, we obtained the major objects for mining the association mechanism of the subway tunnel diseases.
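The selection step can be illustrated with a short sketch. Probability importance is commonly defined as the partial derivative of the top-event probability with respect to the probability of one elementary event (the Birnbaum measure). For independent events this derivative equals the top-event probability with the event certain minus the top-event probability with the event impossible, which is what the code computes. The tree, event names, and probabilities are hypothetical; only the 0.4 threshold comes from the text.

```python
from math import prod

# Hypothetical tree and probabilities, not those of Figure 2 / Table 1.
TREE = ("OR", [("AND", ["x1", "x2"]), "x3"])
PROBS = {"x1": 0.1, "x2": 0.2, "x3": 0.05}
THRESHOLD = 0.4  # selection threshold quoted in the text


def top_probability(node, probs):
    """Top-event probability for independent, non-repeated events."""
    if isinstance(node, str):
        return probs[node]
    gate, children = node
    child_probs = [top_probability(child, probs) for child in children]
    if gate == "AND":
        return prod(child_probs)
    return 1.0 - prod(1.0 - p for p in child_probs)  # OR gate


def probability_importance(tree, probs):
    """Birnbaum importance: P(top | event certain) - P(top | event impossible)."""
    result = {}
    for name in probs:
        certain = top_probability(tree, {**probs, name: 1.0})
        impossible = top_probability(tree, {**probs, name: 0.0})
        result[name] = certain - impossible
    return result


importance = probability_importance(TREE, PROBS)
selected = sorted(name for name, value in importance.items() if value > THRESHOLD)
print(importance, selected)
```

In this toy tree only `x3` passes the threshold, because it reaches the top event through a single OR gate, whereas `x1` and `x2` must occur together.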

Association Mechanism Mining of Subway Tunnel Diseases
In Section 3, we have calculated the minimum cut set and the probability importance of the elementary events for the tunnel disease data and obtained 6 elementary fault causation events, which indicated that the association mechanism mining of the tunnel diseases is of great importance.

Association Support of Causation Events of Subway Tunnel Diseases.
First, we conducted fuzzy discretization of the elementary events based on the fault tree to obtain the elementary event sequence $S = (x_1, x_2, x_3, \ldots, x_n)$. The intermediate events of width $w$ disintegrated from the top events are then $s_i = (x_i, x_{i+1}, x_{i+2}, \ldots, x_{i+w-1})$, and a series of subsequences $s_1, s_2, s_3, \ldots, s_{n-w+1}$ of width $w$ is formed by single-step sliding, as shown in the following formula:

$$W(S, w) = \{ s_i \mid i = 1, 2, \ldots, n-w+1 \}. \quad (4)$$

$W(S, w)$ is taken as $n-w+1$ points in the $w$-dimensional Euclidean space $R^w$ and randomly classified into $k$ categories, and the center $c_j$ of each category is calculated as the representative of that category. For the elements $s_i$, $i = 1, 2, \ldots, n-w+1$, of the set $W(S, w)$, the membership function $\mu_j(s_i)$ with respect to the representative of the $j$th category is as shown in the following formula:

$$\mu_j(s_i) = \frac{\left(1 / \|s_i - c_j\|^2\right)^{1/(b-1)}}{\sum_{l=1}^{k} \left(1 / \|s_i - c_l\|^2\right)^{1/(b-1)}}, \quad (5)$$

where $b > 1$ is the constant that controls the fuzzy degree of the elementary events and $\|s_i - c_j\|^2$ is the square of the distance from each point to the representative point of the $j$th category.
When calculating the support of the elementary events for fault occurrence of the top events, let the rule $A \to T$, with $A \in \{x_1, x_2, \ldots, x_n\}$, indicate that when the elementary event $A$ takes place, so does the top event $T$. The occurrence frequency of the elementary event $A$ is given by formula (6), where $\mu_A(s_i)$ is the membership of the point $s_i$ for the elementary event $A$.
The support of the association rule $A \to T$ is given by formula (7), where $P(T \mid A)$ indicates the occurrence probability of the top event $T$ under the normal operation condition after the elementary event $A$ occurs. The information measure of the rule $A \to T$ is given by formula (8), where $P(A)$ denotes the occurrence probability of the elementary event $A$; the right side of formula (8) presents the information transmission and evolution process from the a priori probability $P(T)$ to the a posteriori probability $P(T \mid A)$ [10].
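The windowing and fuzzy membership steps can be sketched as follows. This is a minimal illustration under stated assumptions: the sequence values, the window width, and the two category centers are hypothetical, the centers are fixed by hand rather than obtained by random classification, and the membership follows the usual fuzzy c-means form with an exponent of 1/(b - 1), where `b` is the name chosen here for the fuzziness constant. The frequency, support, and rule-measure formulas of the paper are not reproduced.

```python
def sliding_windows(seq, w):
    """All contiguous subsequences of width w, sliding one step at a time."""
    return [tuple(seq[i:i + w]) for i in range(len(seq) - w + 1)]


def sq_dist(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def memberships(point, centers, b=2.0):
    """Fuzzy c-means style membership of `point` in each category center."""
    dists = [sq_dist(point, c) for c in centers]
    # A point that coincides with a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    weights = [(1.0 / d) ** (1.0 / (b - 1.0)) for d in dists]
    total = sum(weights)
    return [wgt / total for wgt in weights]


seq = [0.0, 1.0, 0.0, 1.0, 1.0]        # hypothetical discretized sequence
windows = sliding_windows(seq, 2)      # n - w + 1 = 4 windows
centers = [(0.0, 1.0), (1.0, 0.0)]     # hypothetical, fixed category centers
mu = [memberships(win, centers) for win in windows]
for win, m in zip(windows, mu):
    print(win, m)
```

Each row of `mu` sums to 1, so a window that lies between two centers, such as (1.0, 1.0) here, is shared between the categories instead of being forced into one of them.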

Association Mechanism Mining Algorithm of Subway Tunnel Diseases.
Over years of operation, a large amount of tunnel disease prevention data has been accumulated, and the data volume in the Oracle database has exceeded 1 million records. Based on the quantitative calculation of the minimum cut sets of the elementary events of tunnel diseases in Section 3.2.2 and the probability importance of the elementary events in Section 3.2.3, we obtained 6 elementary events that cause diseases and established the tunnel disease algorithm to mine the association mechanism among the elementary events. The aim is to acquire the causation mechanism of the tunnel diseases, control the potential risk of diseases in advance, and effectively reduce the loss caused by tunnel safety accidents. The tunnel disease algorithm is as follows.
Step 1. Screen the tunnel disease and maintenance record database with SQL statements to obtain the sets of disease elementary events that reach a certain threshold value.
Step 2. Conduct the relevant quantitative calculation of the fault tree based on the tunnel diseases to obtain the minimum cut sets of faults and the probability importance of the elementary events, and determine the elementary event combination whose importance is larger than the threshold.
Step 4. Initialize the tunnel disease fault passenger flow database according to the mining condition, scan the event database to find all item sets of length $k = 1$ and form the candidate 1-item set $C_1$, substitute formulas (6) and (8), calculate the support of each item, compare it with the minimum support, and form the frequent $k$-item set $L_k$ whose support is larger than minsupport.
Step 5. Generate the candidate $(k+1)$-item set $C_{k+1}$ from the frequent $k$-item set $L_k$ and enter the next step if $C_{k+1} \neq \emptyset$; otherwise stop the cycle.
Step 6. Scan the database to calculate and determine the support of each candidate item set.
Step 7. Delete the candidate items whose support is smaller than minsupport to form the frequent $(k+1)$-item set $L_{k+1}$.
Step 9. Acquire the association characterized data of the elementary events of the tunnel diseases.
Step 10. Convert the association characterized data into practical operation rules so as to guide the prevention and maintenance of the subway tunnels.
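The level-wise loop in the steps above (form candidates, count support, discard infrequent sets, extend by one item) has the shape of the classic Apriori procedure, and can be sketched as follows. This is an illustration under assumptions, not the paper's implementation: support here is the plain fraction of records containing an item set rather than the fuzzy support of formulas (6) to (8), item sets are kept when support is at least the minimum, and the four records are invented. Only the event names are borrowed from the text.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of records that contain every item of `itemset`.
        return sum(itemset <= t for t in transactions) / n

    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]  # candidate 1-item sets
    frequent = {}
    k = 1
    while candidates:
        counted = {c: support(c) for c in candidates}
        level = {c: s for c, s in counted.items() if s >= min_support}
        frequent.update(level)
        # Join: unions of frequent k-item sets that have exactly k + 1 items.
        joined = {a | b for a, b in combinations(level, 2) if len(a | b) == k + 1}
        # Prune: every k-item subset of a candidate must itself be frequent.
        candidates = [
            c for c in joined
            if all(frozenset(sub) in level for sub in combinations(c, k))
        ]
        k += 1
    return frequent


records = [  # invented records for illustration
    {"high temperature", "lamp fault"},
    {"high temperature", "lamp fault", "maintenance error"},
    {"high temperature", "maintenance error"},
    {"personnel slack", "maintenance error"},
]
frequent = apriori(records, min_support=0.5)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

With these records the loop stops after the 2-item level: the only 3-item candidate is pruned because one of its 2-item subsets, lamp fault with maintenance error, is not frequent.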

Association Mechanism Mining of Shanghai Subway Tunnel Diseases.
By investigating the original disease data of 32 tunnels on 14 lines of Shanghai Subway, we found that the same tunnel may have more than one disease type. Therefore, we first selected the tunnels whose number of disease types is larger than a certain threshold and, through practical verification, set the threshold at 3, which is reflected in the SQL statement. Before data entry, we designed the tunnel disease storage database. The structure of the database is shown in Tables 3 and 4.

Table 3: Original information tunnel-item-x of tunnels.

Database field    Field meaning
tunnelname        The name of tunnel
crackitemid       The id of tunnel
line              The line of the tunnel

Table 4: Data records tunnel-item-y of tunnel diseases.

Database field    Field meaning
tunnelname        The name of tunnel
crackitemid       The id of tunnel
line              The line of the tunnel
crackgrade        The grade of tunnel disease
checker           The name who checked the tunnel
checkyear         The year when the tunnel was checked
In the Oracle database, the data are screened and preprocessed using SQL statements. Through this processing, we obtained 1310 tunnel disease records in total, read the cause names of the elementary events, the tunnel lines, and the segment numbers of the 1310 records, and set the data mining codes, as shown in Table 5.
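The screening step can be sketched with a small query. The original SQL statements are not available, so everything here is an assumption made for illustration: the sketch uses an in-memory SQLite database instead of Oracle, names the table `tunnel_item_y` (with underscores) after the record table described in Table 4, treats each distinct `crackitemid` within a tunnel as one disease type, and uses invented rows. Only the field names and the threshold of 3 come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tunnel_item_y ("
    "tunnelname TEXT, crackitemid INTEGER, line TEXT, "
    "crackgrade INTEGER, checker TEXT, checkyear INTEGER)"
)

# Invented rows: tunnel "T-A" has four disease items, tunnel "T-B" has two.
rows = [
    ("T-A", 1, "Line 1", 2, "checker-1", 2012),
    ("T-A", 2, "Line 1", 1, "checker-1", 2012),
    ("T-A", 3, "Line 1", 3, "checker-2", 2013),
    ("T-A", 4, "Line 1", 1, "checker-2", 2013),
    ("T-B", 1, "Line 2", 2, "checker-1", 2012),
    ("T-B", 2, "Line 2", 2, "checker-3", 2013),
]
conn.executemany("INSERT INTO tunnel_item_y VALUES (?, ?, ?, ?, ?, ?)", rows)

THRESHOLD = 3  # threshold quoted in the text

# Keep tunnels whose number of distinct disease items exceeds the threshold.
query = """
    SELECT tunnelname, COUNT(DISTINCT crackitemid) AS n_types
    FROM tunnel_item_y
    GROUP BY tunnelname
    HAVING COUNT(DISTINCT crackitemid) > ?
    ORDER BY tunnelname
"""
screened = conn.execute(query, (THRESHOLD,)).fetchall()
print(screened)
conn.close()
```

Passing the threshold as a bound parameter keeps it adjustable without editing the statement, which suits a value that the text says was tuned through practical verification.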
The association mechanism mining can reflect the action mechanism among the elementary causation events better than the minimum cut set of the accident tree.From the association mechanism mining result of the tunnel diseases, it can be seen that most subway tunnel diseases are due to diversified causation combinations.One of the diseases always gives rise to occurrence of another or several diseases, which presents as an endless chain.Only when some important links (points of the elementary events with high probability importance) in the endless chain are controlled in time can worsening of the accidents be well controlled.

Discussion and Conclusions
Disease risk management in the tunnel operation period is the key factor in guaranteeing the subway operation safety of megacities. At present, most subway tunnel disease treatment methods are initiated according to the loss level of the diseases, and the diseases are not treated comprehensively and cyclically on the basis of the association among them. The treatment effect and efficiency are therefore unsatisfactory. If concurrent treatment in line with the association mechanism of the diseases is applied, treatment efficiency can be expected to improve. In this paper, we constructed the fault tree analysis model of the tunnel diseases, conducted quantitative calculation of the minimum cut sets and the importance of the elementary events on the fault tree, designed the association mechanism mining algorithm of the tunnel diseases, and found the association between the diseases. All these can provide better theoretical and decision support for the prevention and overall treatment of tunnel diseases.

Figure 1: Data distribution of part of tunnel diseases.

Table 1: List of elementary events.

Table 2: Probability importance of elementary events.

Table 3: Original information tunnel-item-x of tunnels.

Table 5: Mapping table of tunnel disease codes.