A Slice-Based Change Impact Analysis for Regression Test Case Prioritization of Object-Oriented Programs

Test case prioritization focuses on finding a suitable order of execution of the test cases in a test suite to meet some performance goals like detecting faults early. It is likely that some test cases execute the program parts that are more prone to errors and will detect more errors if executed early during the testing process. Finding an optimal order of execution for the selected regression test cases saves time and cost of retesting. This paper presents a static approach to prioritizing the test cases by computing the affected component coupling (ACC) of the affected parts of object-oriented programs. We construct a graph named affected slice graph (ASG) to represent these affected program parts. We determine the fault-proneness of the nodes of ASG by computing their respective ACC values. We assign higher priority to those test cases that cover the nodes with higher ACC values. Our analysis with mutation faults shows that the test cases executing the fault-prone program parts have a higher chance to reveal faults earlier than other test cases in the test suite. The result obtained from seven case studies justifies that our approach is feasible and gives acceptable performance in comparison to some existing techniques.


Introduction
In the software development life cycle, regression testing is considered an important part. This is because it is essential to validate the modification and to ensure that no other parts of the program have been affected by the change [1]. Regression testing [2][3][4][5][6] is defined as the selective retesting of a system or component to verify that modifications have not caused unintended effects and that the system or component still complies with its specified requirements [2]. Therefore, this paper follows a selective approach [3][4][5] to identify and retest only those parts of the program that are affected by the change. Thus, it is even more important to improve the effectiveness of regression testing and reduce the cost of test case execution. Therefore, in this paper, we focus on test case prioritization (TCP) of a given test suite.

The test case prioritization problem is best described using the example in Table 1. If the test cases are executed in the order {1, 2, 3, 4, 5, 6}, then we achieve 100% coverage of faults only after the sixth test case is executed, whereas if the ordering of the test case execution is changed to {6, 5, 4, 1, 3, 2}, then we achieve 100% coverage after the execution of the fourth test case. Therefore, finding the order of test case execution is essential to detect the faults [7][8][9] early during regression testing. All the existing approaches target finding an optimal ordering of the test cases based on the rate of fault detection or the rate of satisfiability of the coverage criterion under consideration. However, these existing techniques [3,6,10-13] were primarily developed to target procedural programs. Very few existing works [14][15][16][17][18] focus on test case prioritization for object-oriented programs.

Table 1: % of faults detected by two sample test case orderings.
Test case ordering     Number of test cases executed
                       1      2      3      4      5      6
1, 2, 3, 4, 5, 6       37.5   50     75.0   75.0   87.5   100
6, 5, 4, 1, 3, 2       62.5   75     87.5   100    100    100

This paper presents a static approach to prioritizing the test cases of object-oriented programs. It is reported that a module having a high coupling value is more error-prone than other modules [19,20]. As a result, a test case that executes a module with a high coupling value is expected to reveal more faults than other test cases. Many techniques and metrics [21] exist to measure the coupling value of the program segments [19] and establish these values as an indicator of fault-proneness [20]. None of the prioritization techniques available in the literature have reported the use of a coupling measure to prioritize the test cases. Thus, this paper uses the coupling value of the affected program parts for prioritizing the selected test cases for regression testing.

Based on the above motivations, we propose an approach to prioritize a selected test suite of an object-oriented program using the coupling value of the affected program parts covered by the test cases. For experimentation, we have taken a sample Java program shown in Algorithm 1. A total of twenty test cases (1-20) were taken along with their node coverage information. All those test cases that covered the affected nodes (with respect to a modification point) are selected hierarchically. Finally, five test cases (6-10) are selected for prioritization. For details of hierarchical regression test case selection, interested readers are referred to [19]. In this approach, we propose a technique to prioritize the selected test cases (6-10). Thus, we fix our research objectives as follows:
(i) To identify and represent the affected program parts and compute the coupling value of these affected program parts.
(ii) To cluster the coupling values [22] into groups and assign a weight to each group based on their criticality.
(iii) To prioritize the test cases by sorting them in the decreasing order of their computed weights.
So, the contributions of this paper lie in the following:
(i) Proposing an algorithm for prioritizing the selected test cases.
(ii) Implementing the proposed algorithm for the fifteen experimental programs.
(iii) Carrying out mutation analysis.
(iv) Comparing the performance of our approach with an existing work.
The rest of the paper is organized as follows: Section 2 introduces the technique used in this paper for prioritizing the test cases. We describe our proposed process of prioritization in Section 3. We also discuss the working and complexity analysis of our algorithm in this section. The details of our implementation and experimental studies are given in Section 4. Here, we present the experimental study settings, describe the characteristics of the program samples taken for our experimentation and mutation analysis, and analyze the results. In Section 5, we discuss and compare our work with some related work. We also highlight some of the limitations of our approach in this section. We conclude the paper in Section 6 with some insights into probable extensions to our work.

Preliminary Study
In this section, we discuss the techniques that are used in this work to accomplish our research objectives.

Program Slicing.
This paper uses program slicing to identify the affected program parts for change impact analysis.
Program slicing was originally introduced by Weiser [23] as a method for automatically decomposing programs by analyzing their data flow and control flow starting from a subset of a program's behavior. Slicing reduces the program to a minimal form that still produces the same behavior as the original program. Program slicing is a method of separating out the relevant parts of a program with respect to a particular computation [24][25][26][27]. The input that the slicing algorithm takes is usually an intermediate representation of the program under consideration [28][29][30][31]. The first step in slicing a program involves specifying a point of interest, which is called the slicing criterion and is expressed as a tuple $(s, V)$, where $s$ is the statement number and $V$ is the variable that is being used or defined at $s$.

Li et al. [25] proposed the concept of hierarchical slicing that computes the slices at package, class, method, and statement levels. Here, we adopt an approach of slicing that is different from that given in [1,25]. We name this slicing approach hierarchical decomposition (HD) slicing. We first construct a single intermediate graph of the program taking into account the possible dependences among different program elements. Then, we perform HD slicing to obtain the affected program parts with respect to the change made to the program. The slice thus obtained is graphically represented and named the affected slice graph (ASG). The steps of HD slicing are given in [32]. A comparison of hierarchical slicing [1] versus HD slicing in terms of the number of nodes selected and the computation time is shown in Table 2. At each level, we obtain more accuracy in test case selection from a coarse-grain level to a finer-grain level by discarding the test cases that are not relevant.
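To make the idea of slicing with respect to a point of interest concrete, the following is a minimal Python sketch of forward and backward reachability on a dependence graph. It is not the HD slicing algorithm of [32]: the graph `deps`, the helper names `reachable` and `reverse`, and the simplification of the slicing criterion to a single statement number (ignoring the variable set) are all assumptions made for illustration.

```python
from collections import deque

def reachable(graph, start):
    """Return every node reachable from `start` by following edges in `graph`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for succ in graph.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seen

def reverse(graph):
    """Return a copy of `graph` with every edge direction flipped."""
    rev = {}
    for src, dsts in graph.items():
        rev.setdefault(src, set())
        for dst in dsts:
            rev.setdefault(dst, set()).add(src)
    return rev

# Hypothetical dependence graph: an edge a -> b means "b depends on a".
deps = {1: {2, 3}, 2: {4}, 3: {4}, 4: set(), 5: {4}}

# Statements that may be affected by a change at statement 2.
forward_slice = reachable(deps, 2)
# Statements that may affect statement 4.
backward_slice = reachable(reverse(deps), 4)
```

In this toy graph, a change at statement 2 can only propagate to statement 4, while statement 4 can be influenced by every other statement.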

Coupling in Object-Oriented Programming.
Coupling is defined as the degree of interdependence between two modules. However, in an object-oriented programming environment, coupling can exist not only at the level of methods but also at the class level and package level. Therefore, coupling represents the degree of interdependence between methods, between classes, between packages, and so forth. Many researchers have proposed different slicing based mechanisms [19,20,32] to measure coupling in an object-oriented framework. There are many factors, such as information hiding, encapsulation, inheritance, message passing, ... Similarly, the container packages of the coupled classes are also said to be coupled. The coupling mechanism adopted in this paper is given in Section 3.

Regression Test Case Prioritization.
Testing is an important phase in the software life cycle. This phase incurs approximately 60% of the total cost of the software. Therefore, it becomes highly essential to devise proper testing techniques in order to design test cases that test the software to detect bugs early. It becomes a big challenge to manage the retesting process with respect to time and cost, especially when the test suite becomes too large. Therefore, the selective retest technique attempts to identify those test cases that can exercise the modified parts of the program and the parts that are affected by the modification to reduce the cost of testing. However, test case prioritization can complement the selective retest technique, and faults can be detected early by prioritizing these selected test cases. Thus, the test case prioritization (TCP) problem, stated by Rothermel et al. [6], is as follows. Given: $T$ is a test suite; $PT$ is the set of permutations of $T$; $f$ is a function from $PT$ to the real numbers.

Problem. Find $T' \in PT$ such that
$$(\forall T'')\,(T'' \in PT)\,(T'' \neq T')\,\left[f(T') \geq f(T'')\right], \tag{1}$$
where $PT$ is the set of all possible orderings of the test cases in $T$ and $f$ is a function that maps an ordering to an award value. This prioritization approach can be used with the selective retest technique to obtain a version specific prioritized test suite [2].

Rothermel et al. [6] proposed a metric to assess the effectiveness of any of the existing prioritization techniques. This metric is named Average Percentage of Faults Detected (APFD) and is used by many researchers to evaluate the effectiveness of their proposed techniques. The APFD measure is calculated by taking the weighted average of the number of faults detected during execution of a program with respect to the percentage of test cases executed. Let $T$ be a test suite and let $T'$ be a permutation of $T$. The APFD for $T'$ is defined as follows:
$$\mathrm{APFD} = 1 - \frac{TF_1 + TF_2 + \cdots + TF_m}{nm} + \frac{1}{2n}. \tag{2}$$
Here, $n$ is the number of test cases in $T$, $m$ is the total number of faults, and $TF_i$ is the position in $T'$ of the first test case that reveals fault $i$. The value of APFD can range from 0 to 100 (in percentage). The higher the APFD value for an ordering of the test cases in the test suite, the higher the rate at which software faults are discovered.
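As an illustration of the APFD metric, the sketch below computes it for two orderings of a small test suite. The fault matrix `detects` and the test and fault names are hypothetical and do not come from the experiments in this paper; the function assumes that every fault is revealed by at least one test case, and it returns a fraction that can be multiplied by 100 to obtain a percentage.

```python
def apfd(order, detects):
    """APFD of `order`, given `detects`: test id -> set of faults it reveals.

    Assumes every fault of interest is revealed by at least one test in `order`.
    """
    n = len(order)
    all_faults = set().union(*detects.values())
    m = len(all_faults)
    first_pos = {}
    for pos, test in enumerate(order, start=1):
        for fault in detects.get(test, ()):
            # Keep only the position of the first test that reveals the fault.
            first_pos.setdefault(fault, pos)
    return 1 - sum(first_pos[f] for f in all_faults) / (n * m) + 1 / (2 * n)

# Hypothetical fault matrix for three test cases and four faults.
detects = {
    "A": {"f1"},
    "B": {"f2", "f3", "f4"},
    "C": {"f1", "f2"},
}

apfd_original = apfd(["A", "B", "C"], detects)     # about 0.583
apfd_prioritized = apfd(["B", "A", "C"], detects)  # 0.75
```

Running the test case that reveals three of the four faults first raises the APFD from about 58% to 75%, which is the effect a prioritization technique aims for.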

Our Proposed Approach
In this section, we discuss our proposed approach to prioritize a given test suite based on the test cases selected for regression testing. We consider the example Java program shown in Algorithm 1 to discuss our proposed approach. This program defines a class named Shape which is inherited by the classes Rectangle and Triangle. The class TestShape contains the main method and computes the area of a rectangle and triangle, respectively, based on the user inputs given through the console, and displays the result. Though this program is very small in size, it represents all the important features of a Java program and is helpful in explaining the working of this approach. The prioritization steps are summarized as follows.
Step 1. Construct the ASG and compute the coupling value of each node of the ASG.
Step 2. Cluster the coupling values and assign weight to the nodes of ASG.
Step 3. Compute the weights of test cases and prioritize.

ASG and Computation of Coupling Values.
ASG is the graphical representation of the slice that is computed with respect to some change made to the program. The point of change is taken as the slicing criterion to compute the slice. The slicing algorithm comprises both forward and backward traversal to discover the affected program parts. The forward traversal discovers the program parts affected by the change, and the backward traversal discovers those parts that affect the parts discovered in the forward traversal. The steps of hierarchical decomposition (HD) slicing to compute the slice and construct the ASG are given as follows:
(i) Traverse the EOOSDG in the forward direction, starting from the point of modification (that is, the slicing criterion), without following method overridden edges.
(ii) Mark and add each node of the EOOSDG that is reached by the forward traversal to a worklist, $W_1$.
(iii) Perform a two-pass backward traversal with each $n \in W_1$ as the starting point.
(1) Pass-1:

This set of affected nodes and their dependences is then modeled graphically to form the affected slice graph (ASG). To avoid repetition of the concepts, details are not reproduced here; interested readers are referred to [32] for details. Algorithm 2 takes the ASG as input and calculates the ACC of each node. We discuss the working of the proposed Algorithm 2 in Section 3.4. In this approach, we use the concept of information inflow and outflow for coupling measurement. The ASG represents all forms of information flow between any two nodes in the form of edges. Thus, our proposed affected component coupling (ACC) for a given node $n$ is computed as the normalized ratio of the sum of the inflow and outflow of $n$ to the total number of nodes in the ASG. The two directions of coupling between any two nodes are given equal weight and are not considered separately. This goes with the hypothesis that a ripple change can propagate in any direction along a coupling dimension. Below, we define the terms related to the computation of affected component coupling (ACC) values.

Definition 1. To measure the coupling, we define a set Inflow($n$) that comprises all those nodes on which $n$ depends. For any node $n$ in the ASG $G_A = (N_A, E_A)$,
$$\mathrm{Inflow}(n) = \{\, x \in N_A : n \text{ depends on } x \,\}. \tag{3}$$
The outflow of $n$ in the ASG is defined as the set comprising all those nodes that depend on $n$:
$$\mathrm{Outflow}(n) = \{\, x \in N_A : x \text{ depends on } n \,\}. \tag{4}$$
Thus, the dependence set $D(n)$ of each node is defined as the union of Inflow($n$) and Outflow($n$):
$$D(n) = \mathrm{Inflow}(n) \cup \mathrm{Outflow}(n). \tag{5}$$

Definition 2. The affected coupling of a given node $n$ is computed as the normalized ratio of the dependence of $n$, $|D(n)|$, to the total number of affected nodes in the ASG, $|N_A| - 1$, as the node under consideration is excluded. This coupling is measured with respect to the change made to the program that was taken as the slicing criterion to generate the ASG. This coupling measure is given as
$$\mathrm{ACC}(n) = \frac{|D(n)|}{|N_A| - 1}. \tag{6}$$

Definition 3. The updated coupling of a method node $m$ in ASG $G_A = (N_A, E_A)$ is defined as the average of the coupling values of all its elements (parameters and statements) along with its own coupling measure. Let a method node $m$ have $k$ elements, that is, $e_1, e_2, \ldots, e_k$. Thus, the coupling of the method node $m$ is given as
$$\mathrm{ACC}_{\mathrm{upd}}(m) = \frac{\mathrm{ACC}(m) + \sum_{i=1}^{k} \mathrm{ACC}(e_i)}{k + 1}. \tag{7}$$

Definition 4. The updated coupling of a class node $c$ in ASG $G_A = (N_A, E_A)$ is defined as the average of the coupling values of all its elements (attributes and methods) along with its own coupling measure. Let a class node $c$ have $k$ elements, that is, $e_1, e_2, \ldots, e_k$. Thus, the coupling of the class node $c$ is given as
$$\mathrm{ACC}_{\mathrm{upd}}(c) = \frac{\mathrm{ACC}(c) + \sum_{i=1}^{k} \mathrm{ACC}(e_i)}{k + 1}. \tag{8}$$

Definition 5. The updated coupling of a package node $p$ in ASG $G_A = (N_A, E_A)$ is defined as the average of the coupling values of all its elements (classes and subpackages) along with its own coupling measure. Let a package node $p$ have $k$ elements, that is, $e_1, e_2, \ldots, e_k$. Thus, the coupling of the package node $p$ is given as
$$\mathrm{ACC}_{\mathrm{upd}}(p) = \frac{\mathrm{ACC}(p) + \sum_{i=1}^{k} \mathrm{ACC}(e_i)}{k + 1}. \tag{9}$$

The detailed computation of the ACC value for some of the nodes is shown in Section 3.4. The reason for taking coupling into consideration is that a node having a higher ACC value is likely to be more error-prone [20]. This is because a higher ACC value of a node indicates more dependence of other nodes on this source of information.
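The inflow and outflow bookkeeping behind the ACC measure can be sketched in a few lines of Python. This is an illustrative reading of the definitions, not the authors' Algorithm 2: the edge list, the node names, and the helpers `acc_values` and `updated_acc` are hypothetical, a self-dependence is not counted, and the graph is assumed to contain at least two nodes.

```python
def acc_values(edges):
    """Compute the ACC of every node from (source, target) dependence edges.

    An edge (source, target) is read as "target depends on source".
    Assumes the graph has at least two nodes.
    """
    nodes, inflow, outflow = set(), {}, {}
    for src, dst in edges:
        nodes.update((src, dst))
        inflow.setdefault(dst, set()).add(src)   # dst depends on src
        outflow.setdefault(src, set()).add(dst)  # dst is a dependent of src
    total = len(nodes) - 1  # the node under consideration is excluded
    acc = {}
    for n in nodes:
        dependence = (inflow.get(n, set()) | outflow.get(n, set())) - {n}
        acc[n] = len(dependence) / total
    return acc

def updated_acc(acc, owner, members):
    """Average the owner's ACC with the ACC values of its member nodes."""
    values = [acc[owner]] + [acc[m] for m in members]
    return sum(values) / len(values)

# Hypothetical four-node slice graph.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
acc = acc_values(edges)
# Treat "c" as a container node (for example, a method) that owns "d".
acc_c_updated = updated_acc(acc, "c", ["d"])
```

Here node `c` is connected to all three other nodes, so its ACC is 1.0 before the update, and averaging it with the ACC of its member `d` (1/3) lowers it to about 0.67.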

Clustering and Assigning Weights.
Once the ACC values of all the nodes have been computed, the values are clustered. Clustering [33] of the nodes is based on the notion that not all the nodes of the ASG are equally error-prone. Some nodes are more error-prone than others. So, we need to identify the sets of nodes that can be categorized into different levels of fault-proneness. The $k$-means clustering technique [22,34] is used here to cluster the ACC values. $k$-means clustering is a technique for automatically partitioning a given data set into $k$ groups. The $k$ cluster centres are initially chosen randomly from the data set. The value of $k$ for our approach is 3, as we divide the coupling values into three categories as shown in Figure 3. These three categories of fault association are critical fault association, moderate fault association, and weak fault association. Each computed ACC value belongs to one of these three categories.

We propose an algorithm named find Weighted Affected Component Coupling (findWACC). It takes the ASG and its total number of nodes as input. It uses the formula given in (6) to compute the ACC value of each node in the ASG. It computes the outflow of a node at Line (3) and the inflow of a node at Line (4). Algorithm 2 computes the ACC value of each node and then updates these values for some specific nodes such as package nodes, class nodes, method nodes, and method call nodes. It then assigns weights to the nodes of the ASG. Any weight values can be chosen to signify the fault-proneness of one set of nodes compared to the other sets. However, in this paper, we use the following weights: if the ACC value of a node $n$ belongs to the category of critical fault association, that is, 0.7 ⩽ ACC($n$) < 1.0, then $n$ is assigned a weight 3. Similarly, if the ACC value of a node $n$ belongs to the category of moderate fault association, that is, 0.6 ⩽ ACC($n$) < 0.7, then $n$ is assigned a weight 2. Otherwise, $n$ belongs to the category of weak fault association and is assigned a weight 1. The ACC value of each node of the ASG and the corresponding weights assigned to them are shown in Figure 2.
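The weight assignment can be written as a small threshold function. The sketch below hard-codes the category boundaries quoted above (0.6 and 0.7) instead of deriving them with a clustering step; the function name `node_weight` is hypothetical, and treating an ACC value of exactly 1.0 as critical is an assumption, since the stated ranges leave that boundary open.

```python
def node_weight(acc):
    """Map an ACC value to a weight: 3 = critical, 2 = moderate, 1 = weak."""
    if acc >= 0.7:
        # Assumption: ACC == 1.0 is also treated as critical.
        return 3
    if acc >= 0.6:
        return 2
    return 1

# ACC of the Triangle class node reported in the worked example (0.68866).
triangle_weight = node_weight(0.68866)
```

With these thresholds, the Triangle class node from the worked example falls into the moderate category and receives a weight of 2.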

Computation of Test Case Weights and Prioritization.
The program under consideration is executed with each selected test case in a given test suite to find the coverage information as shown in Table 3. The weight of a test case depends upon the weights of the nodes that it covers. All the critical and moderate nodes (nodes with weights 3 and 2, resp.) are shown in bold in the nodes covered column of Table 3. We propose Algorithm 3, named Hierarchical Prioritization of Test Cases using Affected Component Coupling (H-PTCACC), to compute the weights and prioritize the given test suite. Algorithm 3 takes the selected test cases along with their coverage information and the ACC values of each node in the ASG as its input. The output of the algorithm is a prioritized set of test cases. For any test case $t_i \in T$, Algorithm 3 first computes its critical weight (Wtc), that is, the sum of the weights of all the critical fault-prone nodes covered by $t_i$. Similarly, Algorithm 3 computes the moderate weight (Wtm), that is, the sum of the weights of all the moderate fault-prone nodes covered by $t_i$. In the same way, Algorithm 3 computes the weak weight (Wtw), that is, the sum of the weights of all weak fault-prone nodes for each test case. Thus, the weight of a test case is given as the sum of its critical weight (Wtc), moderate weight (Wtm), and weak weight (Wtw). Table 4 shows the different weights computed for each of the test cases 6, 7, 8, 9, and 10.

Algorithm 3 assigns priority to the test cases based on their different computed weights. The test case having a higher total weight is given higher priority in the test suite. If any two test cases have the same total weight, then their priority is decided based on their critical weight. The test case with the higher critical weight is given higher priority. Similarly, if the critical weights are also equal, the priority is decided by the moderate weight and then by the weak weight. Table 4 shows the final prioritized sequence of the selected test cases.

Working of the Algorithm.
In this subsection, we discuss the working of our proposed algorithms. Algorithm 2 uses the formula given in (6) to compute the ACC value of each node in the ASG. For example, we show the ACC calculation for the class Triangle represented as node 24 in Figure 1.

Initially, the ACC value of node 24 is computed from its inflow and outflow sets using (6). Figure 4 shows the sets associated with the computation of the ACC of node 24. The inflow set for node 24 is shown in Figure 4(a) and the outflow set is shown in Figure 4(b). Figure 4(c) shows the set in which node 24 is a member. Once the computation of the ACC of all member nodes of node 24, shown in Figure 4(d), is complete, ACC(24) is updated. Similarly, the ACC values of all the nodes associated with node 24 (25, 26, 27, f3, f4, 29, 30, f27 1 out, f27 2 out, 33, f3 out, and 34), as shown in Figure 1, are computed.

Then, Algorithm 2 updates the ACC value of each node of the ASG. The reason behind this update is that, for any node that represents a method, the statements contained inside that method also contribute to the ACC of the method. Even if a method does not have any statement inside it, it will still have some ACC value, as some other method may be overriding it. Therefore, we take the average of the ACC values of all the statements and the ACC value of the method under consideration to compute the updated ACC value of the method. For example, the ACC values of nodes {24, 27, 33} are updated. The average of the ACC value of node 27 and the ACC values of all its member nodes {f3, f4, 29, 30, f27 1 out, f27 2 out} is computed and assigned to node 27. After the corresponding update, the ACC value of class Triangle in Algorithm 1, represented as node 24 in Figure 1, is found to be 0.68866. A similar procedure is followed to update the ACC values of all the nodes representing the classes and packages in the ASG.

Algorithm 3 computes the critical fault-prone weight Wtc($t_i$), moderate fault-prone weight Wtm($t_i$), weak fault-prone weight Wtw($t_i$), and the total weight Wt($t_i$) for each test case $t_i \in T$. For example, the weights of test case 8 are determined by the nodes it covers, as given in the nodes covered column of Table 3. Then, the algorithm sorts the test cases in the decreasing order of their total weights Wt($t_i$). If there exist some test cases $t_i$, $t_j$ such that Wt($t_i$) = Wt($t_j$), then the algorithm sorts $t_i$ and $t_j$ based on their critical fault-prone weights Wtc($t_i$) and Wtc($t_j$). If for some test cases Wtc($t_i$) = Wtc($t_j$), then $t_i$ and $t_j$ are sorted based on their moderate fault-prone weights Wtm($t_i$) and Wtm($t_j$). If, again, Wtm($t_i$) = Wtm($t_j$), then the test cases are sorted by their weak fault-prone weights Wtw($t_i$) and Wtw($t_j$). In the very unlikely case that the weak fault-prone weights are still identical, that is, Wtw($t_i$) = Wtw($t_j$), the test cases are given equal priority. The prioritized order of the test cases 6-10 based on their respective weights is obtained as {7, 8, 9, 10, 6}.

Complexity Analysis of the Algorithms.
The complexity analysis of the proposed algorithms is given as follows.

Space Complexity.
Let the computed slice represented as the ASG have $n$ nodes. Each node in the ASG corresponds to a statement of the computed slice, along with the actual and formal arguments present. Hence, the space requirement for the nodes is $O(n)$. Each node may have dependences on other nodes. These dependences on other nodes are represented as edges. Since each node can be dependent on at most $(n - 1)$ other nodes, the space requirement for the edges is $O(n^2)$. Hence, the total space requirement for the algorithm is $O(n^2 + n) \equiv O(n^2)$.

Time Complexity.
Let $N$ be the set of nodes in the ASG, with $n = |N|$. To compute the inflow to the input node, each node is traversed only once, so the time complexity is $O(n)$. If the time spent in each recursive call is ignored, then each node $v$ can be processed in $O(1 + |\mathrm{pred}[v]|)$, where $\mathrm{pred}[v]$ represents the set of predecessor nodes of $v$. If each node has every other node in the graph as its predecessor node, then each node has $(n - 1)$ predecessor nodes. So, the time complexity to process each node is $O(1 + (n - 1)) \approx O(n)$. Similarly, to compute the outflow from the input node, the time complexity is calculated as $O(n)$. Then, the total time required to compute the coupling values of all the nodes is calculated as $O(n^2)$.
Let $m$, $c$, and $p$ be the numbers of method nodes, class nodes, and package nodes, respectively, whose ACC values need to be updated. Updating a method node requires averaging the ACC values of its member nodes, so the time required to update the $m$ method nodes grows with $m$ and with the number of member nodes per method. Since both quantities are small bounded positive integers that cannot exceed $n$, this time is bounded by $O(n^2)$. Similarly, since the numbers of class nodes and package nodes and the numbers of their member nodes are small bounded positive integers, the times required to update the $c$ class nodes and the $p$ package nodes are bounded by $O(n^2)$ and $O(n^2)$, respectively. As there are $n$ nodes with $n$ ACC values, the time required to assign a weight to each of the $n$ nodes depending on its ACC value is $O(n)$. Therefore, the worst-case run-time of the findWACC algorithm is calculated as $O(n^2 + n^2 + n^2 + n^2 + n) \equiv O(n^2)$.
Let $t$ be the number of test cases to be prioritized in the given test suite $T$. Suppose a test case covers at most $n$ nodes. Let $n_c$, $n_m$, and $n_w$ be the numbers of critical, moderate, and weak fault-prone nodes, respectively, covered by a test case, such that $n = n_c + n_m + n_w$. So, the time complexity to compute the weight of each test case is calculated as $O(n_c + n_m + n_w) \equiv O(n)$. As a result, the total time complexity to compute the weights of the $t$ test cases in the given test suite $T$ is $O(tn)$. Assuming $t \equiv n$, the time complexity to compute the weights is calculated as $O(n^2)$. The time complexity to sort the $t \equiv n$ test cases is calculated as $O(n^2)$. Therefore, the worst-case run-time of the H-PTCACC algorithm is calculated as $O(n^2 + n^2) \equiv O(n^2)$.

Implementation
In this section, we briefly describe the implementation of our work. We implemented our code and all the algorithms using Java and the Eclipse v3.4 IDE on a standard Windows 7 desktop. The proposed approach of change impact analysis is completely based on the intermediate graph of the modified program. The identification of the dependences to construct the intermediate graph follows the build-on-build approach; that is, we use the existing APIs and tools to build the graph instead of developing the source code parser from scratch. Source code instrumentation and generation of the intermediate graph are implemented by using an XPath parser on the srcML (SouRce Code Markup Language) representation of the input Java program. srcML is the XML (eXtensible Markup Language) representation of the input Java program. The input program is converted to srcML using the src2srcml tool. This srcML representation is then used to extract the dependences between program parts by using the XPath parser. The details of the program conversion and fact extraction process can be found in [26,35].

Many other APIs and tools (such as Document Object Model (DOM) and Simple API for XML (SAX)) can be used to extract facts from the srcML representation. In this paper, the fact and dependence extraction is done using XPath. XPath is a language used by XSLT (Extensible Stylesheet Language Transformations) processors [36] to address specific part(s) of an XML document. XPath is chosen for its simplicity and because it extracts information by tracing directly to its location. It also works on both the visioXML and srcML formats of XML. The XPath expression //function[name="getArea"] directly traces to the function definition with the name "getArea."

The source code is first instrumented, and then the dependences in the program are identified and extracted into the program dictionary to construct the intermediate graph. The modified statement (instrumented number) is taken as input along with the intermediate graph to slice the affected nodes. Most of the dependences at package level, class level, and method level are extracted from the Imagix4D XML data. Imagix4D is a static analysis tool that gives the graphical representation of most of these dependences. The statement level dependences such as control dependence and data dependence [35] are extracted from the srcML representation of the program. The program dictionary stores the following information: A change set is maintained that refers to the set of concurrent changes carried out on the program. The $k$-means algorithm is implemented in Matlab for clustering the coupling values.
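As a rough illustration of XPath-based fact extraction, the sketch below runs a name query over a simplified srcML-like fragment using Python's standard `xml.etree.ElementTree` module, which supports a limited XPath subset. The XML fragment is hand-written for this example and omits the XML namespace that real srcML output declares, so a query against actual srcML would need namespace-qualified element names; this is not the authors' Java implementation.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified srcML-like fragment (no namespace; illustrative only).
SRCML_LIKE = """
<unit>
  <class><name>Rectangle</name>
    <function><type>double</type><name>getArea</name><block>return l * b;</block></function>
    <function><type>void</type><name>show</name><block/></function>
  </class>
</unit>
"""

root = ET.fromstring(SRCML_LIKE)

# Select every <function> element that has a <name> child with text "getArea".
matches = root.findall(".//function[name='getArea']")
names = [f.find("name").text for f in matches]
return_types = [f.find("type").text for f in matches]
```

The query goes straight to the single matching function definition without walking the tree by hand, which is the convenience the text attributes to XPath.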

Experimental Program Structure.
To implement our technique and show its effectiveness, we have taken a total of fifteen programs of different specifications as shown in Table 5. Out of these fifteen programs, ten benchmark programs (Stack, Sorting, BST, CrC, DLL, Elevator spl, Email spl, GPL spl, Jtopas, and Nanoxml) are taken from the Software-Artifact Infrastructure Repository (SIR) [37] and the other five programs were developed as academic assignments. These smaller programs are chosen to ascertain the correctness and accuracy of the approach, keeping in mind that they represent a variety of Java features and applications, that the test cases are available or otherwise easily developed, and that coverage information can be computed.

The smallest program has 54 LOC, and the largest program has 7646 LOC. The total LOC for all the fifteen programs is 19369 and the average LOC per program is 1291. The total number of classes in all the fifteen programs is 185 with an average of 12 classes per program. Our example program in Algorithm 1 has the smallest number of classes and GPL spl has the highest number of classes, 111. The total number of methods in all the programs is 2048 with an average of 137 methods per program. We have constructed a total of 150 ASGs for all the programs. The smallest ASG has 33 nodes, and the largest has 5233 nodes. The total number of affected nodes in all the fifteen programs is 28452, and the average number of nodes affected per change made to the programs is 152. Similarly, the total number of test cases considered for all the programs is 295 with a mean of 20 test cases per program. Only those test cases that had a coverage value of more than 90% were chosen for each of the experimental programs. The coverage of the test cases was found using JaBUTi, a coverage analysis tool for Java programs. The total number of test cases selected for prioritization using our approach for all the fifteen programs is 131. The smallest number of selected test cases for prioritization is 5 for our example program in Algorithm 1 and the highest is 14 for the GPL spl program.

Mutation Analysis.
To generate the mutants for the input program, we used MuClipse [38], an Eclipse plugin for MuJava. Fault mutants are considered to be good representatives of real faults [37,39,40]. MuClipse supports both the traditional and object-oriented operators for mutation analysis. Table 6 gives an overview of the mutation operators considered in the experimental study, with a brief description of each operator. The first five operators are the traditional operators. The remaining 23 operators relate to the faults in object-oriented programs. Of these, JTD, JSC, JID, and JDC are specific to Java features that are not available in all object-oriented languages. Apart from these, there are some other operators, such as EOA, EOC, EAM, and EMM, that reflect typical coding mistakes made during the development of object-oriented software. The mutant generator generates the mutants for the sliced program (representing the affected program parts) according to the operators selected by the testers. A very large number of mutants is generated. The location of these mutants in the source code is visualized through the mutant viewer, which allows a tester to select an appropriate number of mutants and design test cases to kill them. As the number of generated mutants is too large, we randomly selected a smaller number of mutants for our experimental programs. This process was repeated 10 times, and the rate of fault detection for the prioritized test suite was computed. The average number of mutants selected for every program is shown in Table 5. The test cases are written in a specific format such that each test case invokes a method in the class under test. The test method has no parameters and returns the result in the form of a string. A mutant is said to be killed if the obtained output does not match the output of the original program. The test cases for the input program are generated using the JUnit Eclipse plugin, as JUnit test cases closely match the required format. The total number of fault mutants for all the fifteen programs is 514, and the average number of mutants per program is 34.
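The kill criterion described above can be illustrated with a minimal sketch. It assumes that each test method returns a string and that the outputs of the original program and of each mutant have already been collected in the same test order. The class and method names (MutationScoreSketch, isKilled, percentKilled) are hypothetical and are not part of MuClipse or MuJava.

```java
import java.util.List;

/** Illustrative sketch only; names are hypothetical, not MuClipse or MuJava APIs. */
final class MutationScoreSketch {

    /**
     * A mutant is killed when at least one test produces an output string
     * that differs from the output of the original program.
     * Both lists are assumed to hold one output per test, in the same order.
     */
    static boolean isKilled(List<String> originalOutputs, List<String> mutantOutputs) {
        for (int i = 0; i < originalOutputs.size(); i++) {
            if (!originalOutputs.get(i).equals(mutantOutputs.get(i))) {
                return true;
            }
        }
        return false;
    }

    /** Percentage of mutants killed by the test suite (0 when there are no mutants). */
    static double percentKilled(List<String> originalOutputs, List<List<String>> outputsPerMutant) {
        if (outputsPerMutant.isEmpty()) {
            return 0.0;
        }
        int killed = 0;
        for (List<String> mutantOutputs : outputsPerMutant) {
            if (isKilled(originalOutputs, mutantOutputs)) {
                killed++;
            }
        }
        return 100.0 * killed / outputsPerMutant.size();
    }

    public static void main(String[] args) {
        // Made-up outputs of three tests on the original program and on four mutants.
        List<String> original = List.of("3", "true", "abc");
        List<List<String>> mutants = List.of(
                List.of("3", "true", "abc"),    // same outputs: mutant survives
                List.of("4", "true", "abc"),    // first test differs: killed
                List.of("3", "false", "abc"),   // second test differs: killed
                List.of("3", "true", "abd"));   // third test differs: killed
        System.out.println(percentKilled(original, mutants));
    }
}
```

The data in main is invented for illustration and does not correspond to any of the experimental programs.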

Results.
Figure 5 shows the boxplots of the results of our mutation analysis for all the experimental programs. Figure 5(a) shows the percentage of mutants present in the affected parts of the programs. This percentage ranges from a minimum of 12% (DLL program) to a maximum of 94% (Sorting program). The affected program parts in five programs contain more than 90% of the mutants, and those in four programs contain little more than 10% of the mutants. The result shows that, on average, 47% of the mutants are located in the affected program parts of the sample programs. Figure 5(b) shows the percentage of mutants killed in each of the experimental programs. The percentage of mutants killed by the prioritized test cases varies from 70% to 95%, with an average of 85%. This shows that our prioritized test cases are effective in revealing the faults.
The average percentage of affected nodes covered by the prioritized test cases using the approach of Panigrahi and Mall and our approach is shown in Figures 6 and 7, respectively, for the experimental program given in Algorithm 1. From Figures 6 and 7, it may be observed that the average percentage of nodes covered (APNC) using the approach of Panigrahi and Mall [18] is 77.2%, whereas the APNC value using our approach is 80.6%. Thus, our approach increases the APNC measure by 3.4 percentage points. Hence, our approach detects faults better than the approach of Panigrahi and Mall [18], as it covers a larger number of fault-prone nodes. We evaluated the effectiveness of our approach using the APFD metric. We refer to the approach of Panigrahi and Mall [18] as Affected Node Coverage (ANC) and to our approach as Fault-Prone Affected Node Coverage (FPANC) in Figure 8. The comparison of APFD values for the fifteen programs obtained using the ANC and FPANC approaches is shown in Figure 8. The results show that our FPANC approach achieves an increase of approximately 8% in the APFD metric value over the ANC approach.
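For readers unfamiliar with the APFD metric, the following sketch computes it using the commonly used definition APFD = 1 - (TF1 + ... + TFm)/(n m) + 1/(2n), where n is the number of test cases, m is the number of faults, and TFi is the position of the first test case in the ordering that detects fault i. The class name ApfdSketch and the input encoding (one set of detected fault identifiers per position in the ordering) are assumptions made for illustration and do not describe the tooling used in the experiments.

```java
import java.util.List;
import java.util.Set;

/** Illustrative sketch only; the input encoding is an assumption. */
final class ApfdSketch {

    /**
     * @param detectedByPosition element k holds the ids (0..faultCount-1) of the
     *                           faults detected by the k-th test in the ordering
     * @param faultCount         total number of faults m; every fault must be
     *                           detected by at least one test
     */
    static double apfd(List<Set<Integer>> detectedByPosition, int faultCount) {
        int n = detectedByPosition.size();
        if (n == 0 || faultCount <= 0) {
            throw new IllegalArgumentException("need at least one test and one fault");
        }
        long sumOfFirstPositions = 0;
        for (int fault = 0; fault < faultCount; fault++) {
            int first = -1;
            for (int pos = 0; pos < n; pos++) {
                if (detectedByPosition.get(pos).contains(fault)) {
                    first = pos + 1;   // positions are 1-based in the APFD formula
                    break;
                }
            }
            if (first < 0) {
                throw new IllegalArgumentException("fault " + fault + " is never detected");
            }
            sumOfFirstPositions += first;
        }
        return 1.0 - (double) sumOfFirstPositions / ((double) n * faultCount) + 1.0 / (2.0 * n);
    }

    public static void main(String[] args) {
        // Made-up example with three tests and two faults (ids 0 and 1).
        List<Set<Integer>> lateDetection = List.of(Set.of(0), Set.<Integer>of(), Set.of(1));
        List<Set<Integer>> earlyDetection = List.of(Set.of(1), Set.of(0), Set.<Integer>of());
        System.out.println(apfd(lateDetection, 2));
        System.out.println(apfd(earlyDetection, 2));
    }
}
```

In the made-up example, moving the test that detects fault 1 to the front reduces the sum of first-detection positions from 4 to 3, which raises the APFD value.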
The experimental results show that the performance of our approach varies significantly with program attributes, change attributes, test suite characteristics, and their interaction. To assume that a higher APFD implies a better technique, independent of cost factors, is an oversimplification that may lead to inaccurate choices among prioritization techniques. For a given testing scenario, cost models for prioritization can be used to determine the amount of difference in APFD that may yield desirable practical benefits, by associating APFD differences with measurable attributes such as prioritization time. A prioritization technique is acceptable provided the time taken is within acceptable limits, which also reflects the cost of retesting. Korel et al. [41] have also focused on reducing execution time to decrease the overhead of the prioritization process. However, the acceptable time limit greatly depends upon the testing time available to the tester. An empirical analysis of the prioritization time is outside the scope of this paper and is left for future work. We have reported the prioritization time of our approach to indicate the time taken to prioritize the test cases when the precomputed test coverage information and the ASG are available to the tester. The last column of Table 5 shows the time taken for prioritizing the selected test cases. The prioritization time varies from a minimum of 1.3 seconds to a maximum of 3.87 seconds for the experimental programs. The total time taken to prioritize the test cases of all the programs is 35.48 seconds, and the average time for prioritizing the test cases is 2.4 seconds. The prioritization time includes the time for computing the weights of the test cases and the time taken to order the test cases in decreasing order of their weights.
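The two steps that make up the prioritization time (computing test case weights and ordering the test cases by decreasing weight) can be sketched as follows. The sketch assumes that the coverage information maps each test case name to the ASG node identifiers it covers and that each node already carries a weight derived from its ACC value. The class name, the map-based encoding, and the handling of nodes without a weight are illustrative assumptions, not the authors' implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; data structures and names are assumptions. */
final class WeightedPrioritizationSketch {

    /** Weight of a test case: sum of the weights of the nodes it covers. */
    static double testWeight(Set<Integer> coveredNodes, Map<Integer, Double> nodeWeights) {
        double sum = 0.0;
        for (int node : coveredNodes) {
            // Assumption: a node without a recorded weight contributes nothing.
            sum += nodeWeights.getOrDefault(node, 0.0);
        }
        return sum;
    }

    /** Returns the test case names in decreasing order of weight. */
    static List<String> prioritize(Map<String, Set<Integer>> coverage,
                                   Map<Integer, Double> nodeWeights) {
        Map<String, Double> weights = new LinkedHashMap<>();
        for (Map.Entry<String, Set<Integer>> entry : coverage.entrySet()) {
            weights.put(entry.getKey(), testWeight(entry.getValue(), nodeWeights));
        }
        List<String> order = new ArrayList<>(weights.keySet());
        // List.sort is stable, so equal-weight tests keep their input order.
        order.sort(Comparator.comparingDouble((String t) -> weights.get(t)).reversed());
        return order;
    }

    public static void main(String[] args) {
        // Made-up node weights and coverage, for illustration only.
        Map<Integer, Double> nodeWeights = Map.of(1, 0.2, 2, 0.5, 3, 0.9);
        Map<String, Set<Integer>> coverage = new LinkedHashMap<>();
        coverage.put("t1", Set.of(1, 2));
        coverage.put("t2", Set.of(3));
        coverage.put("t3", Set.of(2, 3));
        System.out.println(prioritize(coverage, nodeWeights));
    }
}
```

With coverage information and node weights precomputed, both steps are a single pass over the coverage map followed by a sort, which is in line with the short prioritization times reported in Table 5.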

Comparison with Related Work
In this section, we give a comparative analysis of our work with some other related works.
Elbaum et al. [42] performed an empirical investigation to identify the testing scenarios in which a particular prioritization approach proves to be efficient. They analyzed the rate of fault detection that resulted from several prioritization techniques such as total function coverage, additional function coverage, total binary-diff function coverage, and additional binary-diff function coverage. The authors considered eight C programs for their experimentation. They used the documentation on the programs and the parameters and special effects that they determined to construct a suite of test cases that exercise each parameter, special effect, and erroneous condition affecting program behavior. Then they augmented those test suites with additional test cases to increase code coverage at the statement level. The regression fault analysis was done on faults inserted by graduate and undergraduate students with more than two years of coding experience. The experimental results show that the performance of test case prioritization techniques varies significantly with program attributes, change attributes, test suite characteristics, and their interaction. Our results confirm similar findings. However, our approach concerns Java programs. We have considered the dependencies caused by the object-oriented features in our proposed intermediate graph. Our approach targets the coverage of those affected nodes that have a high potential of exposing faults; hence, it is more change-based than the approach in [42].
Korel et al. [41] proposed a model-based test prioritization approach. The approach is based on the assumption that execution of the model is inexpensive compared to execution of the system; therefore, the overhead associated with test prioritization is relatively small. This approach is based on EFSM system models. The original EFSM model is compared with the modified EFSM model to identify the changes. After the changes are identified, the EFSM model is executed with the test cases to collect information that is used for prioritization. The authors propose two types of prioritization: selective test prioritization and model dependence-based test prioritization. The selective test prioritization approach assigns higher priority to the test cases that execute the modified transitions. The model dependence-based test prioritization mechanism carries out dependency analysis between the modified transitions and other parts of the model and uses this information to assign higher priorities to the test cases. EFSM models consist of two types of dependences: control and data dependences. The results show that model dependence-based test prioritization (considering only two types of dependences) improves the effectiveness of test prioritization. The corresponding system for each model was implemented in the C language. In another work, Korel et al. [43] compared the effectiveness of different prioritization heuristics. The results show that model-based prioritization along with heuristic 5 gave the best performance. Heuristic 5 states that each modified transition should have the same opportunity of being executed by the test cases. Korel et al. [44] proposed another prioritization approach using the heuristics discussed in [43]. In this new approach, they considered the changes made in the source code and identified the elements of the model that are related to these changes to prioritize the test cases. In our approach, we have considered object-oriented programs in Java. The program is represented by our proposed intermediate graph. The graph is constructed by considering many more dependences that exist among the program parts in addition to control and data dependences, giving a clear visualization of the dependences. Then, we identify the effect of modifications and represent the affected program parts in another graph. Our representation is more adaptable to frequent changes of the software, and our approach relies on the execution of these affected program parts. Thus, our prioritization approach is based on both the coverage of the affected program parts and the fault exposing potential of the test cases.
Jeffrey and Gupta [45] proposed a prioritization approach using relevant slices. They also aimed for early detection of faults during the regression testing process. This approach considers the execution of the modified statements for prioritizing the test cases. The assumption is that if any modification results in some faulty output for a test case, then it must affect some computation in the relevant slice of that test case. Therefore, the test case having a higher number of statements is given higher priority, assuming that it has a better potential to expose the faults. However, intuitively, not all statements depending upon some modification will have the same level of fault-proneness. It may so happen that a test case executing fewer statements detects more faults than another test case that executes more statements. The level of fault-proneness of the statements executed by a test case affects the fault exposing potential of that test case. Therefore, in our approach, we computed the coupling values of the affected program parts to identify the probable fault-proneness of these program parts. Our approach assigns a higher priority to the test case that executes the maximum number of highly fault-prone statements. Further, unlike our hierarchical decomposition slicing approach, relevant slicing depends upon the execution trace of the test cases and is proposed to work on C programs. Even though execution trace based slicing would result in slices of smaller sizes, the computational overhead is very high. The efficiency of our slicing approach is shown in Table 2. We have also shown the time requirement of our prioritization approach in Table 5.
The performance goal of the prioritization approach proposed by Kayes [11] is based on how quickly the dependences among the faults are identified in the regression testing process. An early detection of the fault dependences would enable faster debugging of the faults. That paper assumes that the knowledge of the fault presence is extracted from the previous executions of the test cases. A fault dependence graph is constructed using this information. However, one major limitation of this approach is that regression testing aims at discovering new faults introduced by the changes made to the software, whereas the prioritization approach proposed in [11] only enhances the chances of finding the faults that have already been revealed and are present in the fault dependence graph. New faults, if any, cannot be discovered. Further, this approach does not take into account the fault-proneness of the statements. In contrast, our approach relies on the dependences of the affected program parts represented as the affected slice graph (ASG), so that error propagation because of the change is better visualized and analyzed. We compute the fault-proneness of the statements by computing their coupling values, as coupling measures have been reported to be good indicators of fault-proneness. Thus, our approach has a higher probability of exposing new faults, if any, in the software.
Mei et al. [15] proposed a static prioritization technique to prioritize JUnit test cases. This prioritization technique is independent of the coverage information of the test cases. It analyzes the static call graphs of the JUnit test cases and the program under test to estimate the ability of each test case to achieve code coverage. The test cases are scheduled based on these estimates. The experiments are carried out on 19 versions of four Java programs of considerable size, considering their method-level and class-level JUnit test cases. The heuristic to prioritize the test cases in this approach is to cover system components (in terms of total components covered or components newly covered). The coverage of the system components acts as a proxy for evaluating a test case's true potential of exposing faults. If any two test cases carry the same heuristic value, then the approach randomly decides which test case is given higher priority. Though this is a scalable approach, as it works at a coarse granularity level and incurs less computational cost, it suffers from many limitations. The prioritization techniques that work at a finer granularity level give better performance (in terms of fault exposing potential) than the techniques that work at a coarse granularity level [42]. This approach ignores the faults caused by many object-oriented features such as inheritance, polymorphism, and dynamic binding and focuses only on the static call relationships of the methods in the form of a call graph. Static call relationships are more relevant to procedure-oriented programs. Interaction and communication between methods in the form of message passing are highly important in object-oriented programs. A single method is invoked by different objects, and the behavior of the method differs accordingly. Any prioritization technique is efficient if it is based on the characteristics of the program to be tested. Therefore, considering the object-oriented features is essential.
Java supports encapsulation and provides four access levels (private, public, protected, and default) to access the data members and member methods. Any misinterpretation of these access levels forms a rich source of faults. Java provides the keyword "super" to access the base class constructor from the derived class constructor. This additional dependence between the constructors of the derived class and the base class needs the attention of the testers. Method overriding allows a method in the derived class to have the same signature as the method in its parent. If an invocation of such a method is not resolved correctly, then it can cause serious faults. Another powerful feature and a potential source of faults is variable hiding. It allows a variable to be declared in the derived class with the same name and type as a variable in the base class, so that both variables reside in the derived class. A problem arises when the incorrect variable is accessed. Inheritance is a powerful feature, but unintentional misuse of this feature can sometimes result in serious faults. Polymorphism in Java exists for both attributes and methods and both use dynamic binding. An object of its class type can access an attribute or method of its subclass type. The subclass object can also access the same attributes and methods. These attributes and methods behave differently depending upon the kind of object that refers to them. Such polymorphic dependences, if not resolved, can cause faults. Interested readers may refer to [46][47][48][49] for more faults introduced by the misuse of the object-oriented features. Therefore, any prioritization technique with a performance goal of revealing more faults must consider the object-oriented features, as they can induce many kinds of faults in the system. Our approach considers all the object-oriented features in the form of an intermediate graph. This intermediate graph is constructed by identifying the dependences that can exist among various program parts, which are given in [32]. Our approach works at a finer granularity level and therefore may not be as scalable as [15], but it has better fault exposing potential.
Fang et al. [14] have proposed a similarity-based prioritization technique. The authors have taken five Java programs from the Software-artifact Infrastructure Repository (SIR) [37] to validate their approach. The prioritization process is based on the ordered sequence of the program entities. They propose two algorithms: farthest-first ordered sequence (FOS) and greed-aided clustering ordered sequence (GOS). The FOS approach first selects the test case having the largest statement coverage. The next test case selected is the one that is farthest in distance from the already selected test cases. It computes two types of distances: a pairwise distance between the test cases and a distance between a candidate test case and the already selected ones. The GOS approach consists of clusters of test cases, in which initially each cluster consists of only one test case. Then the clusters are merged depending upon the minimum distance between any two clusters. This process of merging the clusters is repeated until the size of the cluster set is less than a given threshold. Then, the algorithm iteratively chooses one test case from each cluster and adds it to the prioritized test suite until all the clusters are empty. The experimental results in this study show that statement coverage is the most efficient and preferred criterion for prioritization. When the size of the test suite is large, additional measures are taken to reduce the cost of prioritization. This approach gives equal importance to all the test cases, assuming that all the test cases have equal potential of exposing the faults. Intuitively, a test case executing fewer statements can expose more faults provided that the covered statements are highly fault-prone. It also does not consider the object-oriented features and the faults generated by these features. Unlike Fang et al. [14], we consider the fault inducing capability of the object-oriented features, based on which we detect the affected program parts. We propose to prioritize a set of change-based selected test cases that are relevant to validate the change under regression testing. We compute the fault-proneness of the affected statements and then prioritize the test cases based on the coverage of these highly fault-prone statements (represented as nodes in our proposed graph).
Lou et al. [16] proposed a mutation-based prioritization technique. In this approach, they compared two versions of the same software to find the modification. Then, they generated mutants only for the modified code. They selected for prioritization only those test cases of the original version that worked on the new version of the software. The test case that killed more mutants was given higher priority. The authors used a mutation generation tool named Javalanche. Unlike our approach, Lou et al. [16] do not take into consideration the object-oriented features and the faults likely to occur because of these features. Their work is also silent on the types of mutation operators (faults) considered for their experimentation. Like Lou et al. [16], we generate mutants only for the sliced program (representing the affected program parts). However, we used MuClipse (an Eclipse plugin for MuJava) to generate the mutation faults. We use the coupling measure of the affected program parts as a surrogate for fault-proneness. Our hypothesis is that the test cases that execute the nodes with high coupling values have a higher chance of detecting faults early during regression testing. We used mutation analysis only to validate our hypothesis.
The detailed survey conducted on available coverage-based prioritization techniques [11,[14][15][16][41][42][43][44][45] reveals that these techniques have not considered the object-oriented features. The presence of many faults arising due to different object-oriented features is inherent to object-oriented programs and hence must be considered. Therefore, we find that the approaches contributed by Panigrahi and Mall [17,18] relate most closely to our approach for an experimental comparison. Panigrahi and Mall proposed a version-specific prioritization technique [17] to prioritize the test cases of object-oriented programs. Their technique prioritizes the selected regression test cases. The test cases are prioritized based on the coverage of affected nodes of an intermediate graph model of the program under consideration. The affected nodes are determined from the dependences arising on account of the object relations in addition to the data and control dependences. The effectiveness of their approach is shown in the form of an improved APFD measure achieved for the test cases. In another work, Panigrahi and Mall [18] have improved their earlier work [17] by achieving a better APFD value. In this technique, the affected nodes are initially assigned a weight of 1. The weight is decreased by 0.5 whenever that node is covered by a previous execution of the test cases. In both approaches [17,18], they have assumed that all the test cases have equal cost and all faults have the same severity. The assumption is also that all the affected nodes have a uniform distribution of faults. As a result, a test case executing a larger number of affected nodes will detect more faults and, therefore, has a higher priority. The average percentage of affected nodes covered by this approach is shown in Figure 6. Unlike the approach in [18], which is based on node coverage only, our proposed approach is based on the observation that some nodes are more fault-prone than other nodes. To compute the fault-proneness of the nodes, we used an intermediate graph that represents only those nodes that are affected by the modification made to the program. The coupling factor of each node in the ASG is computed to predict its level of fault-proneness. The test cases are then prioritized based on the fault-prone nodes that they execute. Unlike [18], a test case executing a larger number of fault-prone nodes has a higher computed weight and gets a higher priority in our approach.
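To make the contrast with the node-coverage scheme concrete, the following sketch shows one possible reading of the weight-decay idea described above. It assumes that every affected node starts with weight 1, that test cases are selected greedily by the sum of the current weights of the nodes they cover, and that the weight of a node drops by 0.5 (never below 0) each time a selected test case covers it. The greedy selection, the floor at zero, and the tie-breaking by input order are assumptions of this sketch; the exact procedure is specified in [18] and may differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; one possible reading of the scheme, not the algorithm of [18]. */
final class DecayingNodeWeightSketch {

    static List<String> prioritize(Map<String, Set<Integer>> coverage) {
        // Every affected node starts with weight 1.
        Map<Integer, Double> weight = new HashMap<>();
        for (Set<Integer> nodes : coverage.values()) {
            for (int node : nodes) {
                weight.put(node, 1.0);
            }
        }
        Map<String, Set<Integer>> remaining = new LinkedHashMap<>(coverage);
        List<String> order = new ArrayList<>();
        while (!remaining.isEmpty()) {
            // Assumption: pick the test with the largest sum of current node weights;
            // ties go to the test that appears first in the input.
            String best = null;
            double bestScore = -1.0;
            for (Map.Entry<String, Set<Integer>> entry : remaining.entrySet()) {
                double score = 0.0;
                for (int node : entry.getValue()) {
                    score += weight.get(node);
                }
                if (score > bestScore) {
                    bestScore = score;
                    best = entry.getKey();
                }
            }
            order.add(best);
            // Assumption: each covered node loses 0.5, with a floor at zero.
            for (int node : remaining.remove(best)) {
                weight.put(node, Math.max(0.0, weight.get(node) - 0.5));
            }
        }
        return order;
    }

    public static void main(String[] args) {
        // Made-up coverage of affected nodes by three test cases.
        Map<String, Set<Integer>> coverage = new LinkedHashMap<>();
        coverage.put("t1", Set.of(1, 2, 3));
        coverage.put("t2", Set.of(1, 2));
        coverage.put("t3", Set.of(4, 5));
        System.out.println(prioritize(coverage));
    }
}
```

Under these assumptions every node is treated as equally fault-prone, so only the number of not-yet-covered affected nodes drives the ordering. The approach proposed in this paper differs in that node weights come from ACC values rather than from a uniform starting weight.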

Threats to Validity.
Any newly proposed work is subject to some threats to its validity, and this work is no exception. Our approach is capable of measuring the coupling value of a class in the presence of many object-oriented features such as inheritance, interfaces, polymorphism, and templates. The coupling of classes in a subclass-superclass relationship can have a different impact on software maintainability and fault-proneness compared to the coupling of classes that are not in such a relationship. Therefore, it is essential to make a distinction between coupling within an inheritance hierarchy and coupling across inheritance hierarchies. Similarly, whether the presence of Java interfaces (which usually do not contain actual implementations) contributes to the coupling measurement is a matter of study that is not included in this paper. The impact of the inclusion or exclusion of any of the object-oriented features on the coupling measurement has not been empirically investigated in this paper. We believe that detailed empirical research on such relationships and their impact on the proposed coupling analysis is essential, and it is left for future study. Another threat to the validity of this work is that fault prediction can be improved when both coupling and cohesion metrics are considered together [20], but this approach focuses only on the coupling measure. Slicing techniques based on intermediate graphs are always limited by the scalability of the graph for larger programs. This approach has been tested to work well with programs having nearly 100,000 (1 lakh) lines of code. For larger programs, it may raise some memory issues. However, it will work for bigger programs if the graph is restricted to method-level analysis only. The limited size and complexity of the experimental programs are considered a threat to the validity of this approach. Our approach considers only the primary mutants. It does not consider the secondary mutants, which are also important. Our approach of mutation analysis may be extended to handle secondary mutants. The use of mutation analysis for the fault manipulation of these programs may not represent the actual fault occurrence in complex industrial programs and hence is considered a threat to this approach. Though the proposed prioritization approach is efficient in detecting the faults, it may not be so in terms of time requirement. However, the time requirement is within acceptable limits if the approach is applied to the test cases selected for regression testing and the coverage information is available. An empirical study of the impact of prioritization time on the choice of prioritization techniques would be interesting and may be carried out in the future.

Conclusion and Future Work
In this paper, we proposed a coupling-metric-based technique to improve the effectiveness of test case prioritization in regression testing. Our analysis shows that the prioritized test cases are more effective in exposing the faults early in the regression test cycle. We performed hierarchical decomposition slicing on the intermediate graph of the input program. The affected component coupling (ACC) value of each node of the ASG is calculated as a measure to predict its fault-proneness. In this technique, a weight is assigned to each node of the ASG based on its ACC value. The weight of a test case in a given test suite is then calculated by adding the weights of all the nodes covered by it. The test cases are prioritized based on their coverage of fault-prone affected nodes. Thus, the test case with a higher weight is given higher priority in the test suite. The results show that our FPANC approach achieves an increase of approximately 8% in the APFD metric value over the ANC approach. In the future, we aim to prioritize the test cases for more complex object-oriented (OO) programs such as concurrent and distributed OO programs. We would also like to incorporate other coupling measures and metrics to predict the fault-proneness of modules and prioritize the test cases based on their coverage weights. We also aim to compute the cohesion values of the program elements and use them along with their coupling values for a better fault prediction analysis and prioritization.

Advances in Software Engineering

Figure 1: Affected slice graph (ASG) of the example Java program given in Algorithm 1.

Figure 2: The calculated ACC values of different nodes of the ASG in Figure 1 and their weights.

Figure 4: ACC computation of nodes of ASG in Figure 1.
(i) Set of all packages in the program.
(ii) Set of all classes in the program.
(iii) Set of all methods in the program.
(iv) Set of all statements in the program.
(v) Sets of each dependence type.

Figure 6: Average percentage of affected nodes covered by the prioritized test cases using the approach of Panigrahi and Mall [18].

Figure 7: Average percentage of fault-prone affected nodes covered by the prioritized test cases using our approach.

Figure 8: Comparison of APFD values for different programs.

Table 1: A sample test case distribution and the faults detected by them.
program, coupling can exist between any two components due to message passing, polymorphism, and inheritance mechanisms of object-oriented programs. These components include packages, classes, methods, and statements. Two statements 1 and 2 are said to be coupled if 1 has some dependence (control, data, or type dependence) on 2. Methods in an object-oriented program belong to the constituent classes. It implies that a method is coupled either with a method in the same class or with another method in a different class. If the methods of any two classes are coupled, then the corresponding classes are said to be coupled.

Table 3: Test case coverage of fault-prone affected nodes.

Table 4: Distribution of test case weights on the basis of fault-prone impact.
are also the same, then the moderate weights are taken into consideration for prioritization. If the moderate weights of the test cases are again the same, then the weak weights are considered for prioritization. The last column in Table 4 shows the final case, that is, if the weak weights are still the same for any two test cases, then both of the test cases are given equal priority.
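The tie-breaking order described in this passage can be expressed as a chained comparator. The sketch assumes that each test case carries three weights and that the first level compared (named "strong" here) precedes the moderate and weak weights. The name of the first level is an assumption, since the beginning of the passage is not available, and the class and field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative sketch only; the name of the first weight level is an assumption. */
final class TieBreakSketch {

    static final class TestWeights {
        final String name;
        final double strong;    // assumed name of the first level compared
        final double moderate;
        final double weak;

        TestWeights(String name, double strong, double moderate, double weak) {
            this.name = name;
            this.strong = strong;
            this.moderate = moderate;
            this.weak = weak;
        }
    }

    /** Higher weights first; moderate breaks ties on strong, weak breaks ties on moderate. */
    static final Comparator<TestWeights> BY_PRIORITY =
            Comparator.comparingDouble((TestWeights t) -> t.strong)
                    .thenComparingDouble(t -> t.moderate)
                    .thenComparingDouble(t -> t.weak)
                    .reversed();

    /** Returns test names by decreasing priority; fully tied tests keep their input order. */
    static List<String> prioritize(List<TestWeights> tests) {
        List<TestWeights> sorted = new ArrayList<>(tests);
        sorted.sort(BY_PRIORITY);
        List<String> names = new ArrayList<>();
        for (TestWeights t : sorted) {
            names.add(t.name);
        }
        return names;
    }

    public static void main(String[] args) {
        // Made-up weights for four test cases.
        List<TestWeights> tests = List.of(
                new TestWeights("a", 2.0, 1.0, 0.5),
                new TestWeights("b", 2.0, 1.5, 0.0),
                new TestWeights("c", 3.0, 0.0, 0.0),
                new TestWeights("d", 2.0, 1.0, 0.5));
        System.out.println(prioritize(tests));
    }
}
```

When the comparator returns 0 for two test cases, they have equal priority, and the stable sort simply leaves them in their original relative order.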

Table 5: Result obtained for regression testing of different programs.

Table 6: Overview of mutation operators.