System Reliability Evaluation of Data Transmission in Commercial Banks with Multiple Branches

The main purpose of this paper is to assess the system reliability of electronic transaction data transmissions made by commercial banks in terms of a stochastic flow network. System reliability is defined as the probability of demand satisfaction, and it can be used to measure quality of service. In this paper, we study the system reliability of data transmission from the headquarters of a commercial bank to its multiple branches. The network structure of the bank and the probability of successful data transmission are obtained through the collection of real data. The system reliability, calculated using the minimal path method and the recursive sum of disjoint products algorithm, gives banking managers a view of the current state of the entire system. In addition, the system reliability can be used not only as a measurement of quality of service but also, through sensitivity analysis, as a reference for improving the system.


Introduction
"The safest building of 99.999% reliability is ruined by a 0.001% small fire - True reason for breakdown of Facebook and Yahoo on February 25." This is the cover title of the 1320th issue of Taiwan Business Week [1]. At 10:00 am on February 25, 2013, a fire broke out and power was cut on account of equipment repair. The emergency diesel power supply failed to start due to the safety issue at hand as well as wear and tear. The building where this incident occurred was an important generator room for connecting Taiwan to 80% of its external bandwidth. Although the network-switching center had passed TruSecure certification, a small fire caused failures in the power supply, internet, safety, fire control, and air-conditioning areas of the building. The breakdown lasted 17 hours, and several businesses, including commercial banks performing network transactions, were seriously affected. This accident reminds administrators that stringent safety is critical to data transmission.
Business enterprises, especially commercial banks, are well aware that accidents like small fires can result in system breakdowns. As the case mentioned above shows, even if a node (e.g., an Internet data center (IDC) or a bank branch) is extremely reliable, its failure can affect the normal operation of the whole system. System quality is becoming increasingly important to both enterprises and Internet service providers (ISPs). System supervisors strive for safe and stable network transmissions [2][3][4]. Modern commercial banks provide diversified services, which focus on electronic transaction platforms that are safe, quick, convenient, and simple and are not constrained by time and distance. These electronic transaction platforms provide automated teller machine, personal financial, consumer financial, and business financial services, as well as various other electronic financial commodity services provided by network banks or mobile banks. The real-time services provided by these diversified financial commodities depend on the bandwidth available for the electronic transaction data.
When the headquarters (HQs) of a commercial bank transmits transaction data electronically to its distant branches, its system should maintain a constant bandwidth to meet the demand of $d$ units for each time unit (e.g., bits per second; bps). The ability to transmit data refers to the probability of data being successfully transmitted from a source node to a sink node. The probability that the system can transmit a specified demand is defined as system reliability (SR), which is a performance indicator used for measuring data transmission in a service level agreement (SLA) [3, 5-7]. Supervisors of a system often formulate their policies on demand based on past bandwidth use and on expected future growth in order to determine the expected capacity (i.e., capacity is always greater than throughput). Hence, before evaluating the SR, it is necessary to first ascertain the state of the system.
Practical systems, such as internet, traffic, and power systems, can be modeled as a network consisting of nodes and edges. In these situations, each edge and node in the network may have a certain transmission capacity, and the network is typically referred to as a data-flow network. Most data-flow networks consist of components (edges and nodes) with multiple states. Each edge or node may have different capacity states, ranging from fully valid through partially valid to invalid. Such a network is called a stochastic flow network (SFN) [5, 8-15]. Modeling the system as an SFN makes it possible to gain an accurate view of the network and to evaluate the quality of data transmission. Transmission edges may have several states in a system environment, and the probability distribution over the capacity states of a transmission edge varies due to equipment loss or environmental factors. For instance, in a computer network, an edge consists of several physical lines and therefore has several capacity levels, ranging from 0 (a complete malfunction) to its maximal capacity (the highest level of operation). Therefore, a computer network characterized by such edges and nodes also has stochastic capacities, making it a typical SFN.
The minimal path (MP) method is widely applied to evaluate the SR of an SFN [5, 16-20]. By using the MP concept, we can generate all the minimal capacity vectors ($d$-MPs) that satisfy the given demand $d$ [8,17,19,21]. The probability distribution of the different states of each edge is analyzed and summarized, and the SR is subsequently calculated by using the recursive sum of disjoint products (RSDP) algorithm in terms of the $d$-MPs [8,19]. Figure 1 summarizes the SR evaluation procedure.
In addition, the number of sink nodes (i.e., branches) has a major influence on SR. In this paper, the SR of an SFN with a single source and multiple sinks is discussed in an actual system environment [15,22,23]. We mainly address the performance of electronic transaction data transmissions from the HQs of a commercial bank to its multiple branches.

Acronyms, Notations, Problem Description, and Assumptions
Acronyms and notations used in this study are introduced as follows.

Problem Description.
The main purpose of this paper is to evaluate the SR of a commercial bank in Taiwan. The bank rents telecommunication lines from an ISP to transmit data based on its current business demands and future financial service developments. The system supervisor is able to determine the amount of line capacity to rent based on the demand for data transmission.
Due to the aforementioned factors, the capacity of each edge is stochastic according to a given probability distribution observed from statistical data. Thus, the bank network consisting of such edges and nodes is also stochastic and can be modeled as an SFN. To obtain the probability distribution of the edges, we can monitor the failures of each edge over a specified period, such as one day, one month, or one year. Over this time frame, the records of transmission disconnections should be gathered. In this study, we conducted a long-term observation of the system through one year of data collection. We then summarized the capacity probability distribution of the system connections in each state of each edge. The successful data transmission probability refers to the ratio of data successfully transmitted between the two connected nodes of each edge in a specified observation period. It is one of the most important factors affecting SR.

Assumptions.
The structure of a minimal path is a sequence of nodes and edges from source $s$ to sink $t_j$. For ease of expression, we label all edges as $a_1, a_2, \ldots, a_n$, where $n$ is the total number of edges. Let $G = (A, N, M)$ be a stochastic flow network, where $A = \{a_i \mid 1 \le i \le n\}$ is the set of edges, $N$ is the set of nodes, and $M = (M_1, M_2, \ldots, M_n)$ with $M_i$ (an integer) being the maximum capacity of each component $a_i$. The (current) state $x_i$ of each edge $a_i$ takes possible values $0 = b_{i1} < b_{i2} < \cdots < b_{ir_i}$ according to a given probability distribution obtained from the historical record. In particular, the value of $x_i$ is determined by the number of transmission lines that operate successfully. In such a $G$, data is transmitted through only one MP; no data flow will disappear or be created during transmission via such an MP. In the physical network, the capacities of the edges $a_i$ do not influence each other because they are physically different transmission lines. Hence, the capacities of different edges (transmission lines) are statistically independent. Such a $G$ is assumed to further satisfy the following assumptions.
(i) The capacity of each edge $a_i$ is an integer-valued random variable which takes values $0 = b_{i1} < b_{i2} < \cdots < b_{ir_i}$ according to a given probability distribution.
(ii) Data flow in the system must satisfy the flow-conservation law [24].
(iii) The capacities of different edges are statistically independent.

Network Model Construction
Based on the SFN theory proposed by Lin [5], a network model is constructed to describe the relationship between data flow, capacity, demand, and SR formulation.

Data-Flow
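Suppose that the MPs from the source $s$ to sink $t_j$ are $P_{j,1}, P_{j,2}, \ldots, P_{j,m_j}$, and let $f_{j,k}$ denote the data flow through $P_{j,k}$. The data-flow vector $F$ collects all $f_{j,k}$ and is said to be feasible under $M$ if and only if it satisfies the following constraint:

$$\sum_{j=1}^{q} \sum_{k:\, a_i \in P_{j,k}} f_{j,k} \le M_i \quad \text{for } i = 1, 2, \ldots, n, \tag{1}$$

where $q$ is the number of sinks and $\sum_{j=1}^{q} \sum_{k:\, a_i \in P_{j,k}} f_{j,k}$ is the total data flow through edge $a_i$. Constraint (1) signifies that the total data flow through $a_i$ cannot exceed the maximal capacity $M_i$ of $a_i$ for $i = 1, 2, \ldots, n$. For convenience, let $\mathbf{F}_M$ be the set of $F$ feasible under $M$. Similarly, any $F$ is said to be feasible under $X = (x_1, x_2, \ldots, x_n)$ if and only if it satisfies the following constraint:

$$\sum_{j=1}^{q} \sum_{k:\, a_i \in P_{j,k}} f_{j,k} \le x_i \quad \text{for } i = 1, 2, \ldots, n. \tag{2}$$

Hence, the notation $\mathbf{F}_X$ denotes the set of $F$ feasible under $X$.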
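The feasibility test behind constraints (1) and (2) can be illustrated with a minimal Python sketch. The toy paths, flows, and capacities below are hypothetical and are not the bank's data; the names `edge_loads` and `is_feasible` are introduced here only for illustration.

```python
from typing import Dict, List, Sequence, Tuple

# Key (j, k) identifies the k-th minimal path to sink j (hypothetical labels).
PathKey = Tuple[int, int]

def edge_loads(flows: Dict[PathKey, int],
               paths: Dict[PathKey, Sequence[int]],
               n_edges: int) -> List[int]:
    """Total flow through each edge: sum of the path flows that use the edge."""
    loads = [0] * n_edges
    for key, f in flows.items():
        for i in paths[key]:
            loads[i] += f
    return loads

def is_feasible(flows: Dict[PathKey, int],
                paths: Dict[PathKey, Sequence[int]],
                capacity: Sequence[int]) -> bool:
    """True if no edge carries more flow than its capacity."""
    loads = edge_loads(flows, paths, len(capacity))
    return all(load <= cap for load, cap in zip(loads, capacity))

# Hypothetical example: edge 0 is shared by every path.
paths = {(1, 1): [0, 1], (1, 2): [0, 2], (2, 1): [0, 3]}
flows = {(1, 1): 2, (1, 2): 2, (2, 1): 4}

print(edge_loads(flows, paths, 4))               # [8, 2, 2, 4]
print(is_feasible(flows, paths, [8, 4, 4, 4]))   # True
print(is_feasible(flows, paths, [6, 4, 4, 4]))   # False
```

The same function covers both constraints: passing the maximal capacities checks feasibility under the maximal capacity vector, while passing a current state checks feasibility under a capacity vector.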

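System Reliability Formulation for Demand.
Any $F \in \mathbf{F}_M$ fulfills the exact demand vector $d = (d_1, d_2, \ldots, d_q)$ if it satisfies the following constraint:

$$\sum_{k=1}^{m_j} f_{j,k} = d_j \quad \text{for } j = 1, 2, \ldots, q, \tag{3}$$

where $\sum_{k=1}^{m_j} f_{j,k}$, taken over the $m_j$ MPs to sink $t_j$, is the total data flow from HQs to branch $t_j$. Constraint (3) signifies that the data-flow vector must satisfy the demand $d$.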
A capacity vector $X$ is said to meet the demand $d$ if and only if there exists at least one $F \in \mathbf{F}_X$ that satisfies the exact $d$. Let $\Psi$ be the set of such $X$. The SR, denoted by $R_d$, is defined as the probability that the system can successfully transmit $d_j$ units of data from the source $s$ to each sink $t_j$. That is, $R_d = \sum_{X \in \Psi} \Pr(X)$.
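To make the definition concrete, the following minimal Python sketch evaluates the SR by direct enumeration on a small, hypothetical single-sink network (three edges, two minimal paths). The capacities and probabilities are invented for illustration, and the brute-force search is only practical for very small networks.

```python
from itertools import product

# Hypothetical toy network (not the bank's): path 0 uses edges 0 and 1,
# path 1 uses edge 2. states[i] maps a capacity of edge i to its probability.
paths = [(0, 1), (2,)]
states = [
    {0: 0.05, 1: 0.15, 2: 0.80},
    {0: 0.10, 2: 0.90},
    {0: 0.10, 1: 0.30, 2: 0.60},
]
demand = 2

def meets_demand(x, paths, demand):
    """True if some flow assignment over the paths carries exactly `demand`
    without exceeding the capacity vector x on any edge."""
    for flow in product(range(demand + 1), repeat=len(paths)):
        if sum(flow) != demand:
            continue
        load = [0] * len(x)
        for f, path in zip(flow, paths):
            for i in path:
                load[i] += f
        if all(l <= c for l, c in zip(load, x)):
            return True
    return False

def reliability(states, paths, demand):
    """Sum Pr(X) over every capacity vector X that meets the demand,
    assuming statistically independent edges."""
    total = 0.0
    for x in product(*(s.keys() for s in states)):
        if meets_demand(x, paths, demand):
            p = 1.0
            for i, c in enumerate(x):
                p *= states[i][c]
            total += p
    return total

print(round(reliability(states, paths, demand), 4))  # 0.9285
```

The number of capacity vectors grows as the product of the numbers of states of all edges, which is why enumeration of this kind becomes impractical for larger networks.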

Theory of Minimal Capacity Vector for Demand
Enumerating all $X \in \Psi$ and then summing their probabilities to obtain the SR is inefficient [5]. Instead, this paper adopts the concept of a minimal capacity vector for the demand vector $d$ to improve the computational efficiency of the SR evaluation. For convenience, let $d$-MP denote such a minimal capacity vector for $d$.

Given all $d$-MPs, the SR can be computed by several methods, such as the inclusion-exclusion method [5,9,10], the disjoint subset method [9,18], the state-space decomposition method [17,24], or RSDP. The RSDP algorithm proposed by Zuo et al. [19] has been proven to be superior to the inclusion-exclusion, disjoint subset, and state-space decomposition rules. RSDP is an efficient recursive algorithm for probability evaluation based on the sum of disjoint products (SDP), and it is more efficient than the existing algorithms when the number of components of a system is large. It provides an efficient, systematic, and simple approach for evaluating SFN system reliability [8,19,20,23,25]. Thus, RSDP is used to calculate the SR in this paper.

The Property of the $d$-MP
Lemma. Let $X$ be a $d$-MP and let $F \in \Phi$ be feasible under $X$. Then there is no edge $a_i$ with a capacity value $y_i$ such that $x_i > y_i \ge \sum_{j=1}^{q} \sum_{k:\, a_i \in P_{j,k}} f_{j,k}$.

Proof. Let $X$ be a $d$-MP and let $F \in \Phi$ be feasible under $X$. Suppose to the contrary that there exists an edge $a_i$ such that $x_i > y_i \ge \sum_{j=1}^{q} \sum_{k:\, a_i \in P_{j,k}} f_{j,k}$. Set $Y = (y_1, y_2, \ldots, y_n)$, where $y_i$ is the value above and $y_l = x_l$ for all $l \ne i$. That is, $Y < X$. Because $\sum_{j=1}^{q} \sum_{k:\, a_l \in P_{j,k}} f_{j,k} \le y_l$ for each $a_l$, $F$ is feasible under $Y$. This signifies that $Y \in \Psi$ and $Y < X$, which contradicts the fact that $X$ is a $d$-MP. The proof is completed.

Based on the lemma, each $X$ transformed from an $F \in \Phi$ is treated as a $d$-MP candidate. Hence, each $d$-MP candidate must be checked to determine whether it is a $d$-MP or not.
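The candidate check can be sketched in Python as a simple dominance filter: a candidate is kept only if no other candidate is componentwise less than or equal to it. This is an illustrative sketch with hypothetical candidate vectors, not the paper's exact pairwise comparison procedure.

```python
def leq(a, b):
    """Componentwise a <= b."""
    return all(x <= y for x, y in zip(a, b))

def minimal_vectors(candidates):
    """Keep the candidates that are not dominated by a different candidate."""
    # Remove exact duplicates while preserving the original order.
    unique = list(dict.fromkeys(tuple(c) for c in candidates))
    return [x for x in unique
            if not any(y != x and leq(y, x) for y in unique)]

# Hypothetical candidates: (2, 2, 1) is dominated by (2, 2, 0) and (1, 1, 1).
candidates = [(2, 2, 0), (1, 1, 1), (2, 2, 1), (0, 0, 2), (1, 1, 1)]
print(minimal_vectors(candidates))  # [(2, 2, 0), (1, 1, 1), (0, 0, 2)]
```

The filter compares every pair of candidates, so its cost is quadratic in the number of candidates, which is usually acceptable because the candidate set is far smaller than the full state space.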

System Reliability Algorithm Development
According to the above-proposed model and theory, the following algorithm is developed to evaluate the SR.
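Step 1. Find all data-flow vectors $F$ satisfying the following constraints:

$$\sum_{k=1}^{m_j} f_{j,k} = d_j \quad \text{for } j = 1, 2, \ldots, q, \tag{4}$$

$$\sum_{j=1}^{q} \sum_{k:\, a_i \in P_{j,k}} f_{j,k} \le M_i \quad \text{for } i = 1, 2, \ldots, n. \tag{5}$$

If there is no $F$ satisfying the constraints, then $R_d = 0$ and quit the algorithm.

Step 2. Transform each feasible $F$ from Step 1 into the corresponding capacity vector $X = (x_1, x_2, \ldots, x_n)$ via the following rule: each $x_i$ is the smallest possible capacity of $a_i$ that is able to carry the total data flow $\sum_{j=1}^{q} \sum_{k:\, a_i \in P_{j,k}} f_{j,k}$ through $a_i$.

Step 3. Check each candidate $X$ from Step 2 by pairwise comparison and remove those that are not $d$-MPs, where $I$ stores the indices of the removed candidates. In particular, substep (3.4) reads as follows: if $X_j < X_i$, then $X_i$ is not a $d$-MP, $I = I \cup \{i\}$, and go to step (3.7); else if $X_j \ge X_i$, then $X_j$ is not a $d$-MP, and $I = I \cup \{j\}$.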
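Once the $d$-MPs are available, the SR is the probability that the capacity vector dominates at least one of them. The Python sketch below computes this union probability with a recursive sum of disjoint products. It follows the general SDP recursion (each term subtracts the overlap with all earlier vectors) and is not claimed to reproduce the implementation of Zuo et al. [19]; the $d$-MPs and edge distributions are hypothetical.

```python
def prob_geq(v, dists):
    """Pr(X >= v) for statistically independent edges.
    dists[i] maps each capacity of edge i to its probability."""
    p = 1.0
    for vi, dist in zip(v, dists):
        p *= sum(q for cap, q in dist.items() if cap >= vi)
    return p

def rsdp(vectors, dists):
    """Pr(X >= v for at least one v in vectors), computed recursively as
    sum_i [ Pr(X >= v_i) - Pr(X >= max(v_j, v_i) for some j < i) ]."""
    total = 0.0
    for i, v in enumerate(vectors):
        # {X >= v_j} intersected with {X >= v_i} equals {X >= max(v_j, v_i)}.
        overlap = [tuple(max(a, b) for a, b in zip(u, v)) for u in vectors[:i]]
        total += prob_geq(v, dists) - rsdp(overlap, dists)
    return total

# Hypothetical three-edge example with demand 2 (not the bank's data).
dists = [
    {0: 0.05, 1: 0.15, 2: 0.80},
    {0: 0.10, 2: 0.90},
    {0: 0.10, 1: 0.30, 2: 0.60},
]
d_mps = [(0, 0, 2), (1, 2, 1), (2, 2, 0)]

print(round(rsdp(d_mps, dists), 4))  # 0.9285
```

The recursion terminates because each recursive call receives strictly fewer vectors than its caller, and an empty list contributes probability zero.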

Case Study and Analyses
Developments in information technology, the growing presence of global communication systems, and improvements in information security have driven frequent and regular electronic transactions by banks. To satisfy the demands of diversified financial commodities, banks have to depend on additional hardware and software. The bank's core systems, branch terminal systems, and mobile systems, as well as financial services such as account opening, account transfer, remittance, inquiries, loans, securities, insurance, and financial management, can all be accessed by bank staff and customers all over the world.
In this paper, one commercial bank in Taiwan is studied in order to investigate the quality of its data transmission in the absence of time and distance constraints, as modern commercial banks provide diversified, round-the-clock financial services. Because the banking industry holds "information security" in the highest regard, connection failure and data-flow monitoring records related to the study are highly confidential and are controlled by the company. We first focus on collecting real data related to the capacity probability distribution of each edge. The data were collected from January 1, 2012 to December 31, 2012, a time span that is valuable for this paper and for further research related to this topic. Then, a network structure diagram of the bank is constructed. The probability distributions of each edge capacity are subsequently defined according to the collected data. Finally, we use these materials to assess the performance of electronic transaction data transmissions from the HQs of the commercial bank to its multiple branches.

Network Structure of the Bank in the Case Study.
The bank studied in this paper has been established for more than 20 years and has over 100 branches. It was chosen over other banks because it rents multiple types of bandwidth from an ISP. The administrator of the information department takes information security, business requirements, cost considerations, bandwidth management, and backup systems seriously. As long as the banking system remains connected, banking staff and customers are able to carry out electronic transaction services at any time and from anywhere. Based on the information collected on the studied bank, we describe its network structure with concept graphs (Figure 2 shows an illustrative structure). The bank employs three distinct methods for transmitting data: point-to-point leased line (LL), point-to-multipoint virtual private network (VPN), and asymmetric digital subscriber line (ADSL).
The LL is mainly used to provide symmetric bandwidth between two important nodes. The ISP provides an optical backbone bandwidth to administer a safe, stable, and high-quality private system. The bank uses LLs to link HQs to the information & communication center (ICC), HQs to JN, and the ICC to NK or the regional IDC to branch $t_j$. This transmission design offers a bandwidth of 80 Mbps, with two states (success or failure).
The second design for transmitting data is a VPN. The bank rents an Ethernet system for use in its intranet. For example, data transmission between local branches and NK passes through the regional IDC of the ISP: the regional IDC and NK are linked through a VPN at 20 Mbps to 40 Mbps, and local branches and the regional IDC are linked by 4 Mbps * 2 lines. VPN connections have three separate states: complete success, partial success, and complete failure.
The final connection design employed by the bank is ADSL, which is used to link JN to all branches. ADSL offers a guaranteed 4 Mbps bandwidth to connect JN with all the branches of the bank.
The network structure concept of the bank is outlined in Figure 2. However, in order to obtain the SR from a single source to multiple sinks, we consider the case in which data from HQs to five branches are transmitted simultaneously. To aid in the understanding of the SR and to simplify its calculation, we design a new network structure chart (Figure 3) showing the case with a single source and multiple sinks.

Capacity and Probability Distribution.
Edge capacity refers to the data transmission capacity between the two nodes of each edge. Different states of data transmission capacity can exist on each edge. For example, the studied bank has a typical SFN, and each edge may have several states. How is the probability distribution of each edge obtained for its different states? We first calculate the probability of each invalid (degraded or failed) state of each edge by using the bank's disconnection and data-flow monitoring records from 2012 to obtain the total number of hours that each edge spent disconnected in each state. These values are then divided by the total number of hours in one year (8,760). The probability of the valid state is finally obtained from the same statistics. We call the result the probability distribution of the edge capacity. Based on the previous descriptions, we suppose for our study that the bank HQs wants to transmit data to five critical branches. The probability of a successful transmission by each edge is summarized in Table 1, which shows the probability distribution of each edge capacity.
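The estimation described above can be sketched in Python as follows. The downtime figures are hypothetical placeholders, since the bank's monitoring records are confidential; the function name `capacity_distribution` is introduced only for this illustration.

```python
HOURS_PER_YEAR = 8760  # total hours in the one-year observation period

def capacity_distribution(degraded_hours, full_capacity):
    """Estimate an edge's capacity distribution from monitoring records.

    degraded_hours maps each reduced capacity (Mbps) to the number of hours
    the edge spent at that capacity; the remaining time is assigned to
    full_capacity.
    """
    dist = {cap: hours / HOURS_PER_YEAR for cap, hours in degraded_hours.items()}
    dist[full_capacity] = 1.0 - sum(dist.values())
    return dist

# Hypothetical VPN edge: 2 hours of complete failure (0 Mbps) and
# 6 hours of partial success (20 Mbps) during the year.
vpn = capacity_distribution({0: 2.0, 20: 6.0}, full_capacity=40)
for cap in sorted(vpn):
    print(cap, round(vpn[cap], 6))
```

With these placeholder figures, the edge operates at full capacity for 8,752 of the 8,760 hours, so the estimated probability of the valid state is 8752/8760.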

Analysis of the Case Study.
In this study, we focus on five critical branches; the other branches are omitted for now because their capacity is low relative to these five. We compare the SR of each individual branch, obtained when HQs transmits data to one branch at a time, with the overall SR obtained when HQs transmits data to the five branches simultaneously. Two measurements, single source to single sink and single source to multiple sinks, are both addressed here to evaluate the SR of the bank.

Single Source to Single Sink.
We first consider transmitting data from the bank's HQs to a single branch (e.g., $t_1$). The single-source-to-single-sink case helps administrators understand the transmission quality provided by the HQs to each branch. A representation of the data-flow network is shown in Figure 4. To satisfy the guaranteed bandwidth of 4 Mbps for each branch, the demand is denoted as $d = 4$.
In this example, three 4-MPs, say $X_1 = (0, 0, 0, 1, 0, 0, 1)$, $X_2 = (0, 1, 1, 0, 1, 1, 0)$, and $X_3 = (1, 0, 1, 0, 1, 1, 0)$, are generated by the proposed algorithm. Each derived 4-MP provides a minimal capacity that meets the required demand. The SR is calculated to be 0.9999697106 in terms of the 4-MPs by using the RSDP algorithm. In the same way, we calculate the corresponding SR from HQs to each of the five critical branches. Although the demand of each branch is 4 Mbps, the SR of the data transmitted to each branch is not the same because the probability distribution of each edge is different. The results of data transmission can be seen in Figure 5. When data is transmitted from the HQs to each branch, the SR is affected by the probability of successful transmission of each edge. Generally, the higher the successful transmission probability of an edge, the higher the SR of the entire system. For example, the SR from HQs to $t_2$ is 0.9999767555 and the SR from HQs to $t_5$ is 0.9999967998.

Single Source to Multiple Sinks.
The previous analysis gives a clear picture of the data transmission from the HQs to each branch. The SR of single source to single sink, however, does not provide sufficient information to understand the transmission quality of the entire system. It is therefore necessary to evaluate the SR of single source to multiple sinks. Given that HQs needs to transmit data to five branches simultaneously and must satisfy the guaranteed bandwidth of 4 Mbps for all branches, the demand vector is denoted as $d = (d_1, d_2, d_3, d_4, d_5) = (4, 4, 4, 4, 4)$. The SR of the entire system is calculated to be 0.9999424761 by utilizing the proposed algorithm. Intuitively, the SR of data transmitted from HQs to its five branches simultaneously might be expected to equal the product of the SRs of the transmissions from HQs to each of the five branches. However, multiplying the SRs of the individual branches yields an overall SR of 0.9999346872, which differs from the SR calculated for simultaneous transmission from HQs to all five branches. This is mainly because each branch has a different transmission line, but the main lines ($a_1$ to $a_4$) are shared. When HQs simultaneously transmits data to its five branches, the bandwidth required on the main lines ($a_1$ to $a_4$) is 20 Mbps, which is clearly higher than the demand (4 Mbps) placed on the main lines when HQs transmits data to each branch separately. Clearly, the SR is affected. As a result, we chose the single-source-to-multiple-sinks mode because it reflects the true SR of the entire system. The resulting SR not only allows us to ascertain the transmission quality of the entire system, but can also be compared with the SLA level that the ISP stipulates in its contracts (e.g., golden level: transmit rate ≥99.9%).
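One contributing effect, the shared main lines, can be illustrated with a deliberately simplified Python sketch. It assumes a single shared two-state main edge feeding two two-state branch edges, and it ignores capacity levels and the higher 20 Mbps demand on the main lines, so it is not the bank's network; all probabilities are hypothetical.

```python
from math import prod

# Hypothetical success probabilities (two-state edges).
p_main = 0.9999                 # shared main edge used by every branch
p_branch = [0.99995, 0.99997]   # dedicated edge of each branch

# Single source to single sink: each branch needs the main edge and its own edge.
single_sr = [p_main * q for q in p_branch]

# Multiplying the single-sink SRs counts the shared main edge once per branch.
product_of_singles = prod(single_sr)

# Single source to multiple sinks: the shared main edge is counted only once.
joint_sr = p_main * prod(p_branch)

print(joint_sr > product_of_singles)   # True
print(joint_sr - product_of_singles)   # positive gap caused by the shared edge
```

In this simplified setting the product of the individual SRs underestimates the joint SR, because the reliability of the shared edge enters the product twice. In the actual case study the larger demand on the main lines also plays a role, so the two effects have to be evaluated together by the proposed algorithm.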

Discussion of the Case Study.
This paper presents a case study of a commercial bank. The results of this real example validate that the proposed model can be applied to other commercial banks that transmit large volumes of data among branches. In particular, we extend the original single-sink model to multiple sinks. The method is also suitable for other enterprises with similar networks or multiple branches, such as stock exchange networks, electronic ticket networks, and e-business networks. It applies not only to the internal network of an enterprise but also to B2B, B2C, and C2C network environments. With the method proposed in this paper, the system reliability can be calculated quickly.

Conclusion
This paper examines a case study of a commercial bank in Taiwan to evaluate its data transmission performance. The corresponding SFN is constructed to calculate the SR in order to assess the demand satisfaction of the data transmitted by the bank. Based on the resulting assessment and analysis of the SR, we not only understand the existing system environment but can also use the results as an important reference indicator for future system structure planning and for signing SLAs. Obtaining a comprehensive view of the bank's entire system is crucial in this case study. If we judged only the transmission demands of specific branches, the result could be an infrastructure that meets the demands of a specific branch but neglects the transmitting capacity of the entire system. By evaluating the SR for the entire system, we gain a clear and full comprehension of the system's structural transmitting capacity and can use it as a key indicator for making management decisions. With the proposed method, calculating the SR when HQs transmits data to each branch is relatively simple. That is, the proposed SR calculation method for commercial banks is workable and effective. Business organizations modeled by an SFN can also utilize the proposed method to quickly calculate their SR and use it as a reference to help improve their system environment and aid in future planning. This study opens up several avenues for future research, for example, examining cases where the IDC is subject to failure, studying the effects on the entire system environment when a backup line is added, or adding a sensitivity analysis to observe changes in the SR. The sensitivity analysis is mainly based on assuming that the capacity of one edge at a time is perfectly reliable (i.e., with probability 1.0) and then recalculating the system reliability. From the recalculated system reliability, we can find the most important edges, that is, those that improve the system reliability most. The sensitivity analysis can thus serve as a reference for decision makers. These varying conditions can all be used as references for further research, as it would be interesting to see whether they directly affect the SR or not.
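The sensitivity analysis outlined above can be sketched in Python as follows. The sketch assumes that the $d$-MPs are already known and uses a hypothetical three-edge example rather than the bank's data; the helper `union_prob` is a plain sum-of-disjoint-products recursion included only to keep the example self-contained.

```python
def prob_geq(v, dists):
    """Pr(X >= v) for independent edges; dists[i] maps capacity -> probability."""
    p = 1.0
    for vi, dist in zip(v, dists):
        p *= sum(q for cap, q in dist.items() if cap >= vi)
    return p

def union_prob(vectors, dists):
    """Pr(X dominates at least one vector), by a sum of disjoint products."""
    total = 0.0
    for i, v in enumerate(vectors):
        overlap = [tuple(max(a, b) for a, b in zip(u, v)) for u in vectors[:i]]
        total += prob_geq(v, dists) - union_prob(overlap, dists)
    return total

def sensitivity(d_mps, dists):
    """Make one edge perfectly reliable at a time and record the SR gain."""
    base = union_prob(d_mps, dists)
    gains = {}
    for i, dist in enumerate(dists):
        perfect = list(dists)
        perfect[i] = {max(dist): 1.0}   # edge i always at its highest capacity
        gains[i] = union_prob(d_mps, perfect) - base
    return base, gains

# Hypothetical data (not the bank's).
dists = [
    {0: 0.05, 1: 0.15, 2: 0.80},
    {0: 0.10, 2: 0.90},
    {0: 0.10, 1: 0.30, 2: 0.60},
]
d_mps = [(0, 0, 2), (1, 2, 1), (2, 2, 0)]

base, gains = sensitivity(d_mps, dists)
ranking = sorted(gains, key=gains.get, reverse=True)
print(round(base, 4), ranking)  # 0.9285 [2, 1, 0]
```

In this toy example, making edge 2 perfectly reliable raises the SR the most, so edge 2 would be the first candidate for improvement; the same ranking idea applies to any network whose $d$-MPs and edge distributions are available.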

Figure 2: An illustrative network structure of the studied bank.

Figure 5: The SR of data transmission from HQs to different branches. Computer model: Lenovo ThinkPad Edge notebook. Configuration: Intel(R) Core i5 CPU U470 @ 1.33 GHz * 4 cores, 8.0 GB RAM.

Table 1: The probability distributions of each edge capacity.