Decision Graphs and Their Application to Software Testing

Control flow graphs are a well-known graphical representation of programs that capture the control flow but abstract from program details. In this paper, we derive decision graphs that reduce control flow graphs but preserve the branching structure of programs. As an application to software engineering, we use decision graphs to compare and clarify different definitions of branch covering in software testing.


Introduction
Graphs that represent the control flow of programs have been studied for many years and are known as control flow graphs or program graphs. There are mainly two types of such graphs. The first associates one node with each statement in a program; see, for example, [1], where control flow graphs are applied to optimization, or [2] for an application in software engineering. The second replaces maximal sets of consecutive nodes with a single entry and a single exit, called blocks or segments, by single nodes, for example, [3,4]. Blocks can be derived from control flow graphs of the first type or constructed directly from the programs. Both types capture the control flow by abstracting from program details.
Since the control flow through a program is determined by its decisions (for example, if-then-else constructs), based on the data and the conditions in such constructs, it is promising to keep in a graphical representation only the decisions and the control flow between them, thus defining a reduction of control flow graphs that preserves the branching structure.
In this paper, we will study control flow graphs (of the first type) and derive decision graphs [5,6] that represent the branching structure of programs based on the definition of program graphs reduced to DD-paths by Paige [7].
Statement coverage and branch coverage are widely used in software testing. The first property can be checked with control flow graphs since each node represents a statement (or block of statements). For branch coverage, the analogous question arises: in which graph type does each edge represent a branch?
More generally, decision graphs can be derived not only from control flow graphs but also from arbitrary directed graphs and thus represent the branching structure of the graphs. We show that the branches in a graph correspond to the edges in the derived decision graph.
As an application, we compare different definitions of branch covering in software testing that already existed but were specified in different ways in order to find where they differ from each other and thus get new results on branch coverage that clarify the definitions found in the related literature.
The contribution of this paper is to solve the problem of finding a graph type that has one edge for each branch, analogously to control flow graphs that have one node for each statement. Furthermore, we apply decision graphs to software engineering and clarify the different notions of branch covering in software testing, one of them based on decision graphs, in order to avoid confusion when using them in practice. Independently of the application to software testing, we propose decision graphs as a means to abstract from details and focus on the decision structure.

ISRN Software Engineering
The remainder of this paper is organized as follows. We start with the necessary definitions and results about directed graphs, control flow graphs, and decision graphs. In Section 3, we show the correspondence of branches in a graph and the edges in the derived decision graph. As an application, we define three different notions of branch coverage and compare them in Section 4. Related work and other applications of control flow graphs are discussed in Section 5. Section 6 concludes the paper.

Basic Definitions and Results
This section presents definitions and results, partially taken from [5,6], that are necessary for the following.

Directed Graphs
Definition 1. A directed graph with multiple edges is a pair $G = (N, E)$ consisting of a finite set $N$ of nodes and a finite set $E$ of edges with $N \cap E = \emptyset$, together with functions $\mathrm{start} : E \to N$ and $\mathrm{end} : E \to N$ that associate a start node and an end node, respectively, with each edge.
A path $p$ is a nonempty, finite sequence of edges $e_1 e_2 \cdots e_m$ such that $\mathrm{end}(e_i) = \mathrm{start}(e_{i+1})$ for $i = 1, \ldots, m - 1$. The start node of the first edge $e_1$ is called the start node of $p$, and the end node of the last edge $e_m$ is called the end node of $p$. The length of a path $p$ is the number $m$ of its edges. The nodes $\mathrm{start}(e_2), \ldots, \mathrm{start}(e_m)$ are called inner nodes of $p$. A path $p$ which contains one node twice and all other nodes only once is called a loop. A loop $p$ is called an unconditional loop if the inner nodes and the end node have outdegree 1. A special case of an unconditional loop is an isolated loop, that is, a loop that has only nodes with indegree 1 and outdegree 1. [...] If $p'$ is a prefix of $p$ and $p' \ne p$, we write $p' < p$. For a path $p$, the set $\mathrm{prefix}(p) = \{p' \mid p' \le p\}$ and, for a set $P$ of paths, the set $\mathrm{prefix}(P) = \bigcup_{p \in P} \mathrm{prefix}(p)$ are the sets of prefixes of $p$ and $P$, respectively. A path starting with an entry node is called an S-path. An S-path that ends in an exit node is called a complete path. By $N(p)$ and $E(p)$, we denote the set of nodes and edges, respectively, that are contained in the path $p$. A node $n'$ is reachable from a node $n$ if $n = n'$ or if a path $p$ with start node $n$ and end node $n'$ exists.
If it is not clear from the context which graph is meant, we add a subscript to the functions start, end, pre, post, indegree, outdegree, and prefix, for example, $\mathrm{pre}_G(n)$.
A detailed introduction to graphs can be found, for example, in [8,9].
In many cases, we do not need multiple edges between nodes. If for all $e_1, e_2 \in E$: $\mathrm{start}(e_1) = \mathrm{start}(e_2) \wedge \mathrm{end}(e_1) = \mathrm{end}(e_2) \Rightarrow e_1 = e_2$, the graph is simply called a directed graph. In such a graph the notation can be simplified and an edge $e$ can be written as a pair $(\mathrm{start}(e), \mathrm{end}(e)) \in N \times N$ of nodes. A path can then briefly be denoted by the sequence of the contained nodes $n_1 n_2 \cdots n_{m+1}$, where $m \ge 1$. Since there are no multiple edges in such a graph, the numbers of incoming and outgoing edges of a node $n$ are equal to the numbers of elements of the preset and postset of $n$, respectively, that is, $\mathrm{indegree}(n) = |\mathrm{pre}(n)|$ and $\mathrm{outdegree}(n) = |\mathrm{post}(n)|$.
A simple fact that will be used later is that in a directed graph with multiple edges, the number of edges is equal to the sum of the numbers of incoming edges of all nodes and also equal to the sum of the numbers of outgoing edges of all nodes: $|E| = \sum_{n \in N} \mathrm{indegree}(n) = \sum_{n \in N} \mathrm{outdegree}(n)$. The same holds for an unconnected component of a graph. An isolated loop forms a component of the graph, consisting of sequential nodes, that is unconnected to the rest of the graph. Therefore, the above equation is also valid for isolated loops.
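The degree-sum identity can be checked on a small example. The following Python sketch uses a hypothetical graph with four edges (the node and edge names are illustrative and not taken from the paper) and stores the functions start and end as dictionaries, so that parallel edges stay distinct. The presets and postsets are assumed here to be the sets of predecessor and successor nodes.

```python
from collections import Counter

# Hypothetical directed graph with multiple edges: e1 and e2 are parallel.
nodes = {"a", "b", "c"}
start = {"e1": "a", "e2": "a", "e3": "b", "e4": "b"}
end = {"e1": "b", "e2": "b", "e3": "c", "e4": "a"}

# indegree and outdegree count edges, so parallel edges are counted separately.
indegree = Counter(end.values())
outdegree = Counter(start.values())

# Presets and postsets are sets of nodes, so parallel edges collapse.
pre = {n: {start[e] for e in end if end[e] == n} for n in nodes}
post = {n: {end[e] for e in start if start[e] == n} for n in nodes}

edge_count = len(start)
sum_in = sum(indegree[n] for n in nodes)
sum_out = sum(outdegree[n] for n in nodes)
print(edge_count, sum_in, sum_out)
print(indegree["b"], len(pre["b"]))
```

In this example the node b has indegree 2 but a preset with a single element, which illustrates why the equality of indegree and preset size is only stated for graphs without multiple edges.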

Control Flow Graphs.
The control flow in a function written in a programming language can be modeled by a directed graph called control flow graph, which contains one node for each statement in the function and edges that represent the control flow between statements. We add an entry node and an exit node as unique entry and exit points of the function. When a function is called within a function, the control flow also leaves the function and enters it again after the execution of the called function. But since we discuss only single functions, we do not interpret function calls as exit and entry points of the function. Statements in the programming language C are function calls, assignments, and other expressions with semicolon, return-, break-, continue-, goto-, if-, switch-, do-while-, for-, and while-statements and the null statement. Syntactically, a block is also a statement, but since it consists of statements, only the statements in the block are nodes in the control flow graph, not the block itself. Declarations and definitions are not statements and are also not included in the control flow graph.
Definition 2. The control flow graph $G_f = (N, E)$ of a function $f$ is a directed graph that consists of the set of nodes $N = \{n_s \mid s \text{ is a statement in } f\} \cup \{n_{in}, n_{out}\}$ and the set $E$ of edges. This set contains an edge $(n_s, n_{s'})$ if the statement $s'$ is executed immediately after the statement $s$. For the first statement $s_1$ in the function, we introduce an edge $(n_{in}, n_{s_1})$. Furthermore, we add edges $(n_s, n_{out})$ for each node $n_s$ that is associated to a statement $s$ after which the control flow leaves the function because of a return-statement or the right brace that terminates the function. The control flow graph of an empty function, that is, a function without any statements, consists of $N = \{n_{in}, n_{out}\}$ and $E = \{(n_{in}, n_{out})\}$.
From the definition, it follows that in the control flow graph $G_f$, each node, with the exception of $n_{in}$ and $n_{out}$, corresponds to a unique statement in the function $f$. The function Search, whose control flow graph is shown in Figure 1, checks whether the array parameter values contains the integer parameter key or -key. In these cases, it sets the output parameter found to 1 or to −1 and sets the parameter index to the found index. The length of the array is given by the constant N.
The nodes of the control flow graph are labeled with "=", "while", and so forth to show which kind of statement is represented, or with "in" and "out" for better readability. The preset of the exit node consists of two nodes, one representing the return-statement and the other the while-statement after which the function reaches the terminating brace.

Decision Graphs.
As in [5,6], we reduce directed graphs and keep in decision graphs only entry and exit nodes and such nodes that represent decisions, that is, nodes with postsets that have two or more elements, called D-nodes [7]. The following definitions and results are independent from the modeling of software by control flow graphs and can be applied to all directed graphs. A node is not a D-node if its indegree is at least 1 and its outdegree is exactly 1. The D-nodes of the control flow graph of the function Search (Figure 1) are the entry and exit nodes $n_1$, $n_{16}$ and the while- and if-nodes $n_3$, $n_4$, $n_8$, $n_{11}$.
If an inner node $n_2, \ldots, n_m$ occurred twice in a DD-path, no D-node could be reachable from it. Therefore, all inner nodes are different [5]. Furthermore, there are at most $\mathrm{outdegree}(n)$ different DD-paths that start in a D-node $n$, since there is no branching possible after leaving $n$.
In order to reduce graphs but to preserve the branching structure, we follow the idea of Paige [7] and replace DD-paths with single edges. There may be more than one DD-path between two D-nodes, for example, between the second and the third if-node in the control flow graph of Search (Figure 1), and therefore multiple edges are necessary to avoid the merging of branches in the decision graph.
Definition 4. Let $G = (N, E)$ be a directed graph. The decision graph $\hat{G} = (\hat{N}, \hat{E})$ of $G$ is a directed graph with multiple edges that consists of the set of nodes $\hat{N} = \{n \in N \mid n \text{ is a D-node in } G\}$ and the set of edges $\hat{E}$ that contains an edge with start node $n$ and end node $n'$ for each DD-path in $G$ that starts in $n$ and ends in $n'$.
The number of edges in $\hat{G}$ between two nodes $n$ and $n'$ is equal to the number of different DD-paths in $G$ that start in $n$ and end in $n'$, and $\mathrm{outdegree}_{\hat{G}}(n)$ is equal to the (finite) number of different DD-paths in $G$ that start in $n$. The assignment of edges in $\hat{G}$ to DD-paths in $G$ is a bijective function that allows us to identify the edge in the decision graph that corresponds to a given DD-path in the graph and vice versa. This will be necessary to distinguish multiple edges when we compare different coverage notions. In Figure 2, the decision graph of the control flow graph of the function Search is depicted.
Figure 3 shows the decision graph of the function f1. This example shows that DD-paths need not be disjoint ($n_2 n_3 n_4 n_5 n_6$ and $n_2 n_4 n_5 n_6$).
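The reduction to a decision graph can be sketched in a few lines of Python. The sketch below treats a node as a D-node unless its indegree is at least 1 and its outdegree is exactly 1, and follows the unique successors of non-D-nodes to collect DD-paths. The function names and the six-node example graph are assumptions made for illustration: the graph is only a guess at the shape of the control flow graph of f1, chosen so that it contains the two DD-paths quoted above.

```python
from collections import Counter

def d_nodes(nodes, edges):
    """D-nodes: every node except those with indegree >= 1 and outdegree == 1."""
    indeg = Counter(dst for _, dst in edges)
    outdeg = Counter(src for src, _ in edges)
    return {n for n in nodes if not (indeg[n] >= 1 and outdeg[n] == 1)}

def dd_paths(nodes, edges):
    """One DD-path (as a node tuple) per outgoing edge of a D-node,
    unless that edge leads into an unconditional loop."""
    dn = d_nodes(nodes, edges)
    # Every non-D-node has exactly one outgoing edge, hence a unique successor.
    succ = {src: dst for src, dst in edges if src not in dn}
    paths = []
    for src, dst in edges:
        if src not in dn:
            continue
        path = [src, dst]
        while path[-1] not in dn:
            nxt = succ[path[-1]]
            if nxt in path[1:]:  # a non-D-node repeats: unconditional loop
                path = None
                break
            path.append(nxt)
        if path is not None:
            paths.append(tuple(path))
    return paths

def decision_graph(nodes, edges):
    """Return the D-nodes and the list of decision-graph edges.
    The edge list keeps multiplicity, so parallel edges are preserved."""
    return d_nodes(nodes, edges), [(p[0], p[-1]) for p in dd_paths(nodes, edges)]

# Assumed shape of the control flow graph of f1 (node i stands for n_i).
f1_nodes = {1, 2, 3, 4, 5, 6}
f1_edges = [(1, 2), (2, 3), (2, 4), (3, 4), (4, 5), (5, 6)]
f1_paths = dd_paths(f1_nodes, f1_edges)
f1_dnodes, f1_decision_edges = decision_graph(f1_nodes, f1_edges)
print(f1_paths)
print(sorted(f1_dnodes), f1_decision_edges)
```

For this assumed graph the two DD-paths that start in node 2 share the nodes 4, 5, and 6, and the decision graph needs two parallel edges from node 2 to node 6 to keep them apart.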
Not only can graphs be reduced to decision graphs, but paths can also be reduced to decision paths.
Definition 5. Let $G = (N, E)$ be a directed graph and $p = n_1 n_2 \ldots n_{m+1}$ a path in $G$ that starts with a D-node and contains at least a second D-node. Let $n_{i_1}, n_{i_2}, \ldots, n_{i_k}$ be the D-nodes in $p$ with $i_1 < i_2 < \cdots < i_k$. Then $p_j = n_{i_j} n_{i_j + 1} \cdots n_{i_{j+1}}$ are DD-paths for $j = 1, \ldots, k - 1$. From Definition 4, it follows that there are edges $\hat{e}_j$ with start node $n_{i_j}$ and end node $n_{i_{j+1}}$ in the decision graph $\hat{G}$ associated to the DD-paths $p_j$. The decision path $\hat{d}$ of $p$ is defined as $\hat{d} = \hat{e}_1 \hat{e}_2 \cdots \hat{e}_{k-1}$. The nodes $n_{i_k + 1} \cdots n_{m+1}$ following the last D-node $n_{i_k}$ are not a complete DD-path and are therefore clipped.

Branches in Directed Graphs
In this section, we will discuss branches in directed graphs and their relationship to edges in the derived decision graphs. Let us examine the graph shown in Figure 4. This graph can be a control flow graph, for example, of the following function f2:
[...] label: goto label; }
Since the goto-node will not appear in the decision graph, this example shows that unconditional loops are not represented in decision graphs. Therefore, the number of DD-paths that start in a D-node $n$ where an unconditional loop branches off is lower than $\mathrm{outdegree}(n)$. In the example, the if-node has two outgoing edges and a postset with two elements, but only one DD-path that starts in the if-node.
In programming languages like C, the only possibility to create unconditional or isolated loops is using goto-statements.
Lemma 6. Let $G = (N, E)$ be a directed graph. Then the following holds:
(1) each node $n \in N$ that is not contained in an isolated loop is reachable from a D-node,
(2) each edge $e \in E$ that is not contained in an isolated loop is contained (as last edge) in a path that starts with a D-node and whose inner nodes are not D-nodes,
(3) each edge $e \in E$ that is not contained in an unconditional loop is contained in a DD-path [6].
Proof. (1) Let $N_n$ be the set of nodes from which $n$ is reachable and assume that $N_n$ does not contain a D-node. Let $e$ be an edge that starts in $N_n \setminus \{n\}$. If $\mathrm{end}(e) \notin N_n$, the node $n$ cannot be reached from $\mathrm{end}(e)$. But $n$ can be reached from $\mathrm{start}(e)$. Therefore, there must be a second edge that starts in $\mathrm{start}(e)$, and $\mathrm{start}(e)$ would be a D-node, which contradicts the assumption. This means that $\mathrm{end}(e) \in N_n$. Let $e$ be an edge that ends in $N_n$. Then $n$ is also reachable from $\mathrm{start}(e)$ and $\mathrm{start}(e) \in N_n$. Together this means that, if there is a connection of $N_n$ with $N \setminus N_n$ by an edge, this connection can only be the outgoing edge of $n$ (which exists since $n$ is not a D-node).
In the case that this connection does not exist and the end node of the outgoing edge of $n$ is in $N_n$, the nodes $N_n$ form an unconnected component and $\sum_{n' \in N_n} \mathrm{indegree}(n') = \sum_{n' \in N_n} \mathrm{outdegree}(n') = |N_n|$ (because of the assumption that $N_n$ does not contain D-nodes, $\mathrm{outdegree}(n') = 1$ for all $n' \in N_n$). Since a node with indegree 0 would be a D-node, every node in $N_n$ has indegree at least 1, and therefore all nodes in $N_n$ must have indegree 1. Let us denote the successor of $n$ by $n_2$, the successor of $n_2$ by $n_3$, and so on. Finally, we reach a node $n_i$ (which must be $n$) a second time. This means that we found an isolated loop that contains $n$, which is forbidden.
In the case that the end node of the outgoing edge of $n$ is not in $N_n$, it follows that $\sum_{n' \in N_n} \mathrm{indegree}(n') = \sum_{n' \in N_n} \mathrm{outdegree}(n') - 1 = |N_n| - 1$, so at least one node in $N_n$ must have indegree 0 and is a D-node, which contradicts the assumption.
(2) If the node $\mathrm{start}(e)$ is a D-node, $e$ is contained in the path $p = e$. If $\mathrm{start}(e)$ is not a D-node, it follows from part (1) of this lemma that there is a path that starts in a D-node and ends in $\mathrm{start}(e)$. We prolong this path by the node $\mathrm{end}(e)$ and get the path $p$ that contains $e$ as last edge. If an inner node in the path $p$ is a D-node, we shorten the path from the start node to the last inner D-node (let us denote it by $n_1$) in the path. The resulting path, denoted also by $p$, starts with the D-node $n_1$, which could be $\mathrm{start}(e)$, and ends with the edge $e$. The inner nodes of $p$ are not D-nodes (Figure 5).
Part (1) of this lemma can be applied since $e$ would be contained in the same isolated loop if $\mathrm{start}(e)$ were contained in an isolated loop.
(3) If the last node in the path $p$ (this is $\mathrm{end}(e)$) according to part (2) of this lemma is not a D-node, we follow the edges until we find a D-node $n'$ (an exit node is a D-node) or detect after at most $|N| - 1$ steps an unconditional loop that contains $e$, which is forbidden by the assumption. Thus, we constructed a DD-path $p_e$ that contains the edge $e$.
If we exclude unconditional loops, we can show that a decision graph has exactly one edge for each branch in the directed graph and thus abstracts from branch details. We identify branches by their first edges, which are outgoing edges of D-nodes. Such an edge leads from an entry node into the graph or selects a branch in a node with a postset of two or more elements.
Theorem 7. Let $G$ be a directed graph without unconditional loops. Then, the branches in $G$, that is, outgoing edges of D-nodes, correspond bijectively to the edges in the decision graph of $G$.
Proof. Let $G = (N, E)$ be a directed graph. The set of outgoing edges of D-nodes will be denoted by $B$. Let $e \in B$. From part (3) of Lemma 6, it follows that there exists a DD-path $p$ that contains $e$. Since $\mathrm{start}(e)$ is a D-node, $e$ must be the first edge in $p$. It is not possible that two different DD-paths start with the same first edge, and therefore the association $\alpha : e \mapsto p$ is well-defined. Furthermore, $\alpha(e_1) \ne \alpha(e_2)$ for two different edges $e_1, e_2 \in B$ since $e_1$, $e_2$ are the first edges in $\alpha(e_1)$ and $\alpha(e_2)$, respectively. Of course, every DD-path starts with an edge in $B$. Together, that means that $\alpha$ is a bijective association from $B$ to the set of all DD-paths in $G$. In Definition 4, a bijective association was defined (let us call it $\beta$) that associates an edge in the decision graph $\hat{G} = (\hat{N}, \hat{E})$ of $G$ with each DD-path in $G$. This means that $\beta \circ \alpha : B \to \hat{E}$ is bijective.
Note that this result holds for arbitrary directed graphs, not only for control flow graphs, and thus is independent from the modeling of software by graphs.
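The correspondence between branches and DD-paths, and the way an unconditional loop breaks it, can be illustrated with a short Python sketch. The helper name branch_to_dd_path is hypothetical. The first example graph is an assumption about the graph described for f2 (entry node, if-node, a goto-node that jumps to itself, exit node); the second is a small loop-free graph chosen for illustration.

```python
from collections import Counter

def branch_to_dd_path(edges):
    """Map each outgoing edge of a D-node (by its index in `edges`) to the
    DD-path it starts, or to None if it runs into an unconditional loop."""
    nodes = {n for e in edges for n in e}
    indeg = Counter(dst for _, dst in edges)
    outdeg = Counter(src for src, _ in edges)
    is_d = {n: not (indeg[n] >= 1 and outdeg[n] == 1) for n in nodes}
    succ = {src: dst for src, dst in edges if not is_d[src]}
    result = {}
    for i, (src, dst) in enumerate(edges):
        if not is_d[src]:
            continue
        path, seen = [src, dst], set()
        while path is not None and not is_d[path[-1]]:
            if path[-1] in seen:  # the same non-D-node again: no D-node ahead
                path = None
            else:
                seen.add(path[-1])
                path.append(succ[path[-1]])
        result[i] = tuple(path) if path is not None else None
    return result

# Assumed graph for f2: the goto-node forms an unconditional loop.
f2_edges = [("in", "if"), ("if", "goto"), ("goto", "goto"), ("if", "out")]
f2_map = branch_to_dd_path(f2_edges)

# Hypothetical loop-free graph: one if-node (2) whose branches rejoin at node 4.
g_edges = [(1, 2), (2, 3), (2, 4), (3, 4), (4, 5)]
g_map = branch_to_dd_path(g_edges)

print(f2_map)
print(g_map)
```

In the f2 graph the if-node has two branches but only one of them starts a DD-path, so the decision graph has fewer edges than the graph has branches. In the loop-free graph every branch is mapped to its own DD-path.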

Branch Coverage
In this section, we apply decision graphs to software testing and compare different definitions of branch coverage.

Test Cases.
When a test case for a function is executed, it runs through the function and also induces a path in the control flow graph. This path always starts with the edge $(n_{in}, n_{s_1})$, where $s_1$ is the first executable statement, or with $(n_{in}, n_{out})$ if the function does not contain any statements, and therefore is an S-path. In most cases, the execution reaches the end of the function, and the induced path is complete. But there are also cases where the exit node is not reached, for example, when a function is called that does not terminate or a division by 0 is encountered. In both cases, we observe a finite but not a complete path in the control flow graph of the function. The execution could also encounter an infinite loop in the function. Then, we observe in theory an infinite path. In practice, we have to stop the execution of the test case after some time and also get a finite path. This means that we always observe a finite path while executing a test case for a finite observation time. Mostly, we can distinguish these cases in practice while debugging the application. Clearly, this is impossible in general.
If a path $p$ is in $D(f, t)$, any prefix $p'$ can also be observed with shorter observation time. This means that $D(f, t)$ is prefix closed.
The execution of the test case $t$ of the function Search with the array 1, 2, 3, 4 for the parameter values and −2 for the second parameter key induces a complete path $p$ [...]. The execution of the function f2 (Figure 4) with parameter 1 results in an infinite execution that is represented by an infinite set of paths [...].

Coverage.
The basic coverage notion in software testing is statement coverage, which is obtained if in a test of a function the test cases in a set $T$ execute all statements in the function. For the control flow graph $G_f = (N, E)$ of the function, it follows that all nodes are covered by the paths that are induced by the test cases, that is, $\forall n \in N \ \exists p \in D(f, T) : n \in N(p)$. For this reason, this coverage criterion is also called all-nodes criterion [4,10].
In software testing, the notion of branch coverage, where the test cases should cover all branches of the software, is very popular because it is stronger than statement coverage but easier to obtain than more sophisticated coverage definitions like those that consider not only decisions but also the Boolean conditions that occur in the decisions, or like dataflow-oriented coverage notions, for example, [4,10], which can give better results [11]. In the following, we will give three definitions of branch coverage and investigate the relationship and differences between them. The first one captures the notion that in decisions like if-statements the conditions should be at least once true and once false during testing and thus all branches should be taken, the second one is edge covering of the control flow graph, and the last one is edge covering of the decision graph.
(iii) A set $T$ of test cases satisfies branch coverage [5] if and only if $\forall \hat{e} \in \hat{E} \ \exists \hat{d} \in \hat{D}(f, T) : \hat{e} \in E(\hat{d})$.
Edge covering (of the control flow graph) is called all-edges criterion or branch coverage by many authors, for example, [4,10,12], whereas our notion of branch coverage is defined as edge covering of the decision graph. Hierons et al. [13] define branch coverage based on outgoing edges of D-nodes, similar to our definition (i). Frankl and Weyuker [11] do not distinguish between branch testing and decision coverage based on the Boolean conditions. Further definitions of branch coverage arise when other graph reductions are used. For example, Bertolino and Marré [14] define branches that start or end in D-nodes, junction nodes (with indegree ≥ 2), the entry node, or the exit node. In the reduced graphs, called ddgraphs, branches are replaced by edges. We do not consider these definitions of branches because we concentrate on the reduction to decision graphs.
Note that in the rest of the paper we mean by branch coverage the edge coverage of the decision graph, as in Definition 10(iii), unless otherwise stated.
The set $T$ of test cases for the function Search does not satisfy any of these coverage notions since the edge between the while-node and the exit node is covered neither in the control flow graph nor in the decision graph. We need a third test case with parameters values = 1, 2, 3, 4 and key = 0 to cover all edges in both graphs.
It is obvious that a set of test cases that satisfies edge coverage for a given function also satisfies decision coverage for that function. Such a relation between coverage criteria is often called subsumption: a coverage $C_1$ subsumes a coverage $C_2$ if, for all functions $f$ (or for all programs $P$) and all specifications $S$, all sets $T$ of test cases that satisfy $C_1$ also satisfy $C_2$ [4,11,15]. The specification of the program is not used in our coverage definitions and is therefore left out in this paper.

Comparison of Coverage Definitions.
The example f2 (Figure 4) shows that in the case of unconditional loops all DD-paths and thus all edges in the decision graph can be covered (e.g., by the test case with parameter 0) but not all edges in the control flow graph are executed. This leads to the following lemma.
Lemma 11. Let $f$ be a function such that the control flow graph $G_f = (N, E)$ of $f$ does not contain unconditional loops, and let $T$ be a set of test cases that satisfies branch coverage. Then, $T$ also satisfies edge coverage and decision coverage.
Proof. Let $e \in E$ be an edge in the control flow graph. From part (3) of Lemma 6, it follows that there exists a DD-path $p_e$ that contains $e$. This DD-path induces an edge $\hat{e} \in \hat{E}$ in the decision graph $\hat{G} = (\hat{N}, \hat{E})$ of $G_f$ (Definition 4). Since the set $T$ of test cases satisfies branch coverage, a path $\hat{d} \in \hat{D}(f, T)$ with $\hat{e} \in E(\hat{d})$ exists (Definition 10), and furthermore there is a path $p \in D(f, T)$ such that $\hat{d}$ is the decision path of $p$ (Definition 9). Since the edge $\hat{e}$ occurs in the decision path $\hat{d}$, we know from Definition 5 that $p_e$ is part of the path $p$ and thus $e \in E(p)$.
Unconditional loops can be allowed in cases where all test case sets that satisfy branch coverage also run through all unconditional loops. Figure 6 shows the control flow graph of the function f3. In this example, we need at least two test cases, one with a positive and one with a nonpositive parameter, in order to satisfy branch coverage. The test case with the positive parameter also executes the unconditional loop.
Nonterminating function calls result in incomplete test case paths. For example, when we assume that the function g called in f1 (Figure 3) does not terminate if it is called with the parameter 0, the path induced when the function f1 is called with the parameter 0 does not lead to the exit node. The consequence is that all edges are covered when the function f1 is called with parameters 0 and 1, but the DD-path $n_2 n_4 n_5 n_6$ and thus the corresponding edge in the decision graph is not executed.
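The effect described for f1 can be reproduced with a small Python sketch of the three coverage notions. Everything concrete in it is an assumption made for illustration: the six-node graph is a guess at the control flow graph of f1, the call of g is placed at node 4, paths are written as node sequences (which presumes a graph without multiple edges), and the function names are hypothetical. Only the maximal observed path of each test case is listed, which does not change the result because every prefix covers a subset of what the full path covers.

```python
def edges_of(path):
    """Set of edges (as node pairs) contained in a path given as node sequence."""
    return set(zip(path, path[1:]))

def contains(path, sub):
    """True if `sub` occurs as a contiguous part of `path`."""
    k = len(sub)
    return any(tuple(path[i:i + k]) == tuple(sub) for i in range(len(path) - k + 1))

def edge_coverage(edges, paths):
    covered = set().union(*(edges_of(p) for p in paths))
    return set(edges) <= covered

def decision_coverage(edges, d_nodes, paths):
    # Every outgoing edge of a D-node must be taken by some path.
    covered = set().union(*(edges_of(p) for p in paths))
    return {e for e in edges if e[0] in d_nodes} <= covered

def branch_coverage(dd_paths, paths):
    # Every DD-path must be run through completely by some path.
    return all(any(contains(p, dd) for p in paths) for dd in dd_paths)

# Assumed control flow graph of f1 (node i stands for n_i).
edges = [(1, 2), (2, 3), (2, 4), (3, 4), (4, 5), (5, 6)]
d_nodes = {1, 2, 6}
dd_paths = [(1, 2), (2, 3, 4, 5, 6), (2, 4, 5, 6)]

# g(0) does not terminate: the path for parameter 0 stops at node 4 (assumed).
paths = [(1, 2, 4), (1, 2, 3, 4, 5, 6)]
result = (
    edge_coverage(edges, paths),
    decision_coverage(edges, d_nodes, paths),
    branch_coverage(dd_paths, paths),
)
print(result)
```

Under these assumptions the two test cases reach edge coverage and decision coverage, but the DD-path through nodes 2, 4, 5, and 6 is never executed as a whole, so branch coverage fails.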
Definition 12. Let $G = (N, E)$ be a directed graph. A set $P$ of paths is called complete if for each path $p' \in P$ that does not end in an exit node, there exists a path $p \in P$ with $p' < p$.
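For a finite set of paths, the condition of Definition 12 can be checked directly. The following Python sketch is only an illustration under stated assumptions: paths are written as node sequences, the helper names are hypothetical, and the example uses an assumed graph with exit node 6. A finite set cannot represent the infinite sequences of paths produced by infinite loops, so the check only distinguishes paths that reach an exit node from paths that stop early.

```python
def prefixes(path):
    """All prefixes of a path given as node sequence (at least one edge each)."""
    return [tuple(path[:k]) for k in range(2, len(path) + 1)]

def is_proper_prefix(q, p):
    return len(q) < len(p) and tuple(p[:len(q)]) == tuple(q)

def is_complete(paths, exit_nodes):
    """Each path either ends in an exit node or is properly extended in the set."""
    paths = [tuple(p) for p in paths]
    return all(
        p[-1] in exit_nodes or any(is_proper_prefix(p, q) for q in paths)
        for p in paths
    )

exit_nodes = {6}
stopped = prefixes((1, 2, 4))            # execution that stops at node 4
finished = prefixes((1, 2, 3, 4, 5, 6))  # execution that reaches the exit node

print(is_complete(stopped, exit_nodes))
print(is_complete(finished, exit_nodes))
print(is_complete(stopped + finished, exit_nodes))
```

The combined set in the last line is still not complete, because the path that stops at node 4 has no proper extension in the set.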
Infinite sequences $p_1 < p_2 < p_3 < \cdots$ of paths (which are induced in the control flow graph of a function by infinite loops) are allowed in complete sets of paths, but not paths that stop before reaching an exit node. The set $D(\mathrm{f1}, t)$, where $t$ is the test case with parameter 0, is not a complete set since [...]
Lemma 13. Let $G = (N, E)$ be a directed graph, $P$ a complete set of paths, and $p'' = n_1 \ldots n_{m+1}$ a path with inner nodes that are not D-nodes such that there exists a path $p' \in P$ with $(n_1, n_2) \in E(p')$. Then, there is a path $p \in P$ that contains $p''$.
Proof. If $m = 1$, we have $p'' = n_1 n_2$ and $p''$ is contained in $p'$ since $(n_1, n_2) \in E(p')$. Assume that the proposition holds for all paths $p''$ of length $m$ with inner nodes that are not D-nodes. Let $p'' = n_1 \cdots n_{m+2}$ be a path of length $m + 1$, where $n_2, \ldots, n_{m+1}$ are not D-nodes, such that a path $p' \in P$ with $(n_1, n_2) \in E(p')$ exists. From the assumption, it follows that there is a path $p \in P$ that contains $n_1 \cdots n_{m+1}$. If $n_{m+1}$ is the last node in $p$, we can prolong $p$ uniquely by $n_{m+2}$ since $n_{m+1}$ is not a D-node. This new path contains $p''$ and is in $P$ since $P$ is complete and $n_{m+1}$ is not an exit node (Definition 12). If $n_{m+1}$ is not the last node in $p$, the node following $n_{m+1}$ in $p$ must be $n_{m+2}$, since $n_{m+1}$ is not a D-node, and therefore $p$ contains $p''$.
When we apply this lemma to DD-paths, we can prove the following proposition.
Lemma 14. Let $f$ be a function and $T$ a set of test cases that satisfies edge coverage or decision coverage such that $D(f, T)$ is complete. Then, $T$ also satisfies branch coverage.
Proof. Let $\hat{e}$ be an edge in the decision graph $\hat{G} = (\hat{N}, \hat{E})$ of the control flow graph $G_f$ of $f$. Then, there exists a DD-path $p'' = n_1 \cdots n_{m+1}$ in $G_f$ such that $\hat{e}$ is the associated edge in $\hat{E}$ (Definition 4). Since the set $T$ of test cases satisfies edge coverage of $G_f$ or decision coverage, a path $p' \in D(f, T)$ with $(n_1, n_2) \in E(p')$ exists (Definition 10). The set $D(f, T)$ is complete, and therefore there is a path $p \in D(f, T)$ that contains $p''$ (Lemma 13). It follows from Definition 5 that the decision path $\hat{d}$ of $p$ contains the edge $\hat{e}$. This means that $\hat{e} \in E(\hat{d})$ holds for the path $\hat{d} \in \hat{D}(f, T)$.
Since branch coverage does not cover branches with unconditional loops, we could weaken the condition for complete sets and allow that the executions of unconditional loops are not fully contained in the path sets, that is, a path $p' \in P$ is allowed to end in a node that is contained in an unconditional loop without the existence of a path $p \in P$ with $p' < p$.
In the case that a control flow graph has an isolated loop, it is impossible to get edge coverage but decision coverage possibly can be achieved.Of course, such a loop can only appear in functions with unreachable code.This is not the only case in which edge coverage does not follow from decision coverage.Another case arises if the set of paths induced by the test cases is not complete.When, for example, the function f1 in Figure 3 is called with 0 and 1, we get decision coverage, but not edge coverage, if we assume that g never terminates.
Lemma 15. Let $f$ be a function such that the control flow graph $G_f = (N, E)$ of $f$ does not contain isolated loops, and let $T$ be a set of test cases that satisfies decision coverage such that $D(f, T)$ is complete. Then, $T$ also satisfies edge coverage.
Proof. Let $e$ be an edge in the control flow graph. Let $p'' = n_1 n_2 \cdots n_{m+1}$ be the path according to part (2) of Lemma 6 that contains $e$ as last edge. Since the set $T$ of test cases satisfies decision coverage, a path $p' \in D(f, T)$ with $(n_1, n_2) \in E(p')$ exists (Definition 10). From Lemma 13, it follows that there is a path $p \in D(f, T)$ that contains $p''$. Therefore, $e \in E(p)$.
If we exclude unconditional and thus isolated loops, for example, by not allowing gotos, which is a simple syntactical criterion, we can summarize the results as follows.
Theorem 16.In the set of all functions with control flow graphs without unconditional loops, branch coverage subsumes edge coverage and edge coverage subsumes decision coverage.
For functions $f$ (with control flow graphs without unconditional loops) and test case sets $T$ that induce complete sets $D(f, T)$ of paths, the reverse directions also hold, that is, edge coverage follows from decision coverage and branch coverage follows from edge coverage.

Related Work and Applications of Control Flow Graphs
Control flow graphs can be used for white box testing to support test data selection and coverage notions, as shown in [5] for statement, segment, and branch coverage or as discussed by Laski and Korel [16] and Rapps and Weyuker [12] for data flow oriented testing. In the latter papers, only complete paths, that is, paths that start in the entry node and end in the exit node, are considered, whereas in [5] paths that do not end in the exit node are also allowed in order to capture infinite loops, which can occur in practical applications that run until switched off, for example, in embedded control systems. Jalote [10] and Zhu et al. [4] base the definitions of statement and branch coverage and of data flow coverage notions on control flow graphs where the nodes represent blocks of statements. Different coverage definitions are compared in [4,11]. In [11], the authors argue that a coverage that subsumes another coverage does not necessarily give better results with respect to the detection of faults and introduce a relation called "properly covers" with which they prove that decision coverage is weaker than condition based and data flow oriented coverages. White [17] models the structure of programs with control flow graphs in order to discuss different aspects of testing. Program transformation techniques also use control flow graphs to represent the program structure, for example, as shown by Hierons et al. [13] with the aim to apply automated test data generation to transformed unstructured programs. An approach to generate test data that uses control flow graphs to describe all paths that lead from the entry node to the branch which should be tested is shown in [18]. Bertolino and Marré [14] propose an algorithm to generate path covers for branch testing which is based on ddgraphs that reduce graphs to D-nodes and junction nodes and the paths between them. The difference between ddgraphs and our decision graphs is the inclusion of the junction nodes in ddgraphs.
Another principal usage of control flow graphs is control flow analysis in compiler construction and optimization [19]. Aho et al. [3] use control graphs to represent intermediate code in the form of three address statements for code generation during the compilation of programs. These statements have the form x = y op z or are unconditional goto-statements goto label or conditional goto-statements if (condition) goto label. A conditional goto is treated as one statement. Nodes represent basic blocks of sequential statements, which can be entered only by the first statement in the block and left by the last statement. Entry and exit nodes are separate nodes and not part of blocks. Ferrante et al. [1] derive program dependence graphs from control flow graphs that describe the data and control dependences in the program and use them for transformation and optimization of programs.
Kosaraju [20] defines flow charts recursively using different types of basic constructs and compares them to study the computational power of the underlying constructs. Analysis of programs by partitioning using segments, DD-paths, and other approaches is discussed by Paige [7].
A further application is to support the definition and evaluation of source-code-based metrics. For example, cyclomatic complexity can be defined, based on the cyclomatic number in graph theory [21], by counting the linearly independent circuits in the graphs [10,22]. Sommerville [2] combines cyclomatic complexity and independent paths to design test cases in the white box test. Cyclomatic complexity for sets of functions can be defined in several ways. In the original paper, McCabe [22] defines the complexity of $p$ components by $V = e - n + 2p$, where $e$ is the number of edges and $n$ the number of nodes in the components. Henderson-Sellers and Tegarden [23] argue that if the components represent calling and called functions, the control flow graphs of the called functions can be expanded in the control flow graph of the caller. Function call nodes are split into two nodes, a call node and a return-from-node, with additional edges from the call node to the entry node of the called function and from the exit node of the called function to the return-from-node. Thus, $2(p - 1)$ edges and $p - 1$ nodes are added if the $p$ components consist of one calling and $p - 1$ called functions. The expanded graph consists of one component and defines an alternative cyclomatic complexity of a set of functions: $V_{LI} = (e + 2(p - 1)) - (n + p - 1) + 2 = e - n + p + 1$. In [6], we compare the cyclomatic complexity of a control flow graph $G$ and that of its decision graph $\hat{G}$ and prove that $V(G) = V(\hat{G}) + 1$.
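The two complexity formulas above can be checked with a few lines of Python. This is only a minimal sketch: the function names and the sample numbers are hypothetical, and it assumes, as in the text, that each of the called functions is expanded at exactly one call site.

```python
def cyclomatic_complexity(num_edges: int, num_nodes: int, num_components: int = 1) -> int:
    """McCabe's V = e - n + 2p for a graph with p components."""
    return num_edges - num_nodes + 2 * num_components


def expanded_complexity(num_edges: int, num_nodes: int, num_components: int) -> int:
    """Complexity of the expanded graph (one caller, p - 1 called functions).

    Assumption: every called function is expanded once, which adds
    2(p - 1) edges and p - 1 nodes and leaves a single component.
    """
    p = num_components
    expanded_edges = num_edges + 2 * (p - 1)
    expanded_nodes = num_nodes + (p - 1)
    return cyclomatic_complexity(expanded_edges, expanded_nodes, 1)


# Hypothetical totals for three functions: 20 edges and 16 nodes overall.
v_original = cyclomatic_complexity(20, 16, 3)
v_expanded = expanded_complexity(20, 16, 3)
```

With these sample numbers, the expanded value agrees with the closed form e - n + p + 1 derived in the text.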
In order to do interprocedural analysis, Reps et al. [24] define a framework that consists of the set of control flow graphs for all functions in a program, using the technique of node-splitting and expansion described above. For data-flow analysis in particular, the interprocedural approach gives much better results than intraprocedural analysis [25]. Kapfhammer [15] defines test coverage notions based on interprocedural control flow graphs. When classes are considered, interprocedural control flow graphs can be restricted to the methods of single classes. With this approach, Harrold and Rothermel [26] give a framework for data flow oriented testing of classes. One difference between procedural and object-oriented programming languages is polymorphism. In another paper, Harrold and Rothermel [27] handle this by introducing polymorphic call and return nodes.
All approaches mentioned so far model the control flow at the level of high-level programming languages or at an intermediate level. It is also possible, however, to analyze the control flow of machine-level programs.
One application of interprocedural control flow graphs on the lower level is the detection of self-mutating malware. Bruschi et al. [28] try to find the control flow graph of the malicious code in question as a subgraph in the control flow graphs of the program and thus identify malware.
Abadi et al. [29] use the control flow graph of machine-level programs to detect deviations from the control flow caused by attacks on the program. Computed jumps, in particular, have to be secured against destination addresses forged by attackers.
Usually, the control flow graphs are known when software is analyzed. If only the executable code is available, the control flow graphs have to be extracted before the analysis can take place. Various problems make the construction of the graphs imprecise, for example, when jump tables with data dependent target addresses are used. Theiling [30] describes a software framework that extracts the control flow graphs such that they can be used in a safe analysis of the worst-case execution time (WCET). An approach to construct the control flow graphs based on XML representations of the executable code in assembly form is proposed by Wenjian et al. [31].
Several tools exist that visualize control flow graphs. Figure 7 shows the control flow graph of the function Search generated by Crystal FLOW from SGV Software Automation Research Corporation (http://www.sgvsarc.com/). Such tools usually show more information in the graphs in order to support the understanding of the code or to be used in code reviews or documentation. A tool that uses control flow graphs to show the results of program analysis, in this case the WCET, is aiT by AbsInt (http://www.absint.com/ait/). With the visualization tool aiSee, the user can explore the graphs and thus inspect the WCET analysis results [32].
Control flow graphs can also be applied to the testing of hardware descriptions in VHDL. Zhang and Harris [33] introduce timing nodes to represent the timing information in VHDL descriptions and define data flow oriented du pairs coverage for hardware descriptions.
Flow graphs are also useful in business process modeling. Sadiq and Orlowska [34] define workflow graphs where nodes represent tasks and edges represent the workflow between tasks. In that paper, workflow graphs are defined as acyclic graphs. A special iteration task is used to express the repetition of tasks. Workflow graphs are checked for deadlocks and lack of synchronization by graph reduction. Workflow charts that model human-computer interactions that support business processes and allow loops are studied in [35]. There, computer screens, forms, and links are modeled by nodes in the graphs. One difference between these graphs and control flow graphs is that concurrency is necessary in order to model the workflow.
In the testing of graphical user interfaces, event flow graphs model the events that occur as a reaction to the interaction of the user with the interface [36,37]. Coverage criteria are based on paths in the graphs modeling event sequences. The analogue to edge coverage is called event selection coverage by Memon et al. [38] because an edge models the selection of one event after another event has occurred. Like the decomposition of programs into functions, graphical user interfaces are built up from components, and intercomponent criteria can be defined.
Decision trees, introduced by Raiffa and Schlaifer [39], are well known in probability theory. A decision tree consists of decision nodes where decisions are taken and of chance nodes where unknown states are modeled by different successors with assigned probabilities. The leaves of decision trees are utility nodes that specify the outcome of the decisions. Each path from the root to a leaf thus models a sequence of decisions and state assumptions that lead to the outcome under assigned probabilities. The drawback that decision trees grow exponentially with the number of decisions can be addressed by more general structures such as influence diagrams [40]. The main difference between these graph types and our decision graphs is that decision trees and influence diagrams are acyclic, and thus the length of decision sequences always has a fixed upper bound. Oliver [41] defines decision graphs as generalizations of decision trees where duplicated subtrees are joined and applies them to construct decision procedures from sets of examples. A practical application of decision trees is shown, for example, in [42], where decision trees for the prediction of the diagnosis and the outcome of Dengue illness are constructed from simple clinical and haematological data of 1200 patients using a decision tree classifier software tool. The authors of this study conclude that their algorithms are expected to help disease management.

Conclusion
In this paper, we derived decision graphs from directed graphs such that the branching structure is preserved. It can be shown that the branches in a graph without unconditional loops correspond to the edges in the decision graph. One useful application is the modeling of programs with control flow graphs. Decision graphs form an abstraction from control flow graphs that displays to the programmer only the decisions, for example, if-then-else-constructs, and the paths between decisions. With this approach, we compared different existing definitions of branch covering in software testing and showed the differences. When we exclude unconditional loops, branch coverage based on the edges in decision graphs subsumes edge coverage of the control flow graph and decision coverage. Control flow graphs are popular not only in software modeling but also in various other fields. Therefore, it seems promising to apply decision graphs to other domains and exploit their advantages.

Figure 1 shows as an example the control flow graph of the following function Search:

Figure 1: The control flow graph of the function Search.

Figure 2: Decision graph derived from the control flow graph of the function Search.

Figure 3: Control flow graph of function f1 and the derived decision graph.

Figure 5: A path and a DD-path that both contain a given edge.

Definition 8. Let $t$ be a test case of a function $f$. If $\tau$ is the observation time, we denote by $d_\tau(f, t)$ the observed finite S-path in the control flow graph of the function that is induced by the execution of the test case $t$ for time $\tau$. The set $D(f, t) = \{ p \mid \text{there exists an observation with } p = d_\tau(f, t) \}$ (1) is the set of all observed paths. For a set $T$ of test cases, we write $D(f, T) = \bigcup_{t \in T} D(f, t)$.
The set of test cases $T = \{t, t_1\}$ with $t$ as above and $t_1$ with parameters values = 1, 2, 3, 4 and key = 2, which induces the complete path $p_1 = n_1 n_2 n_3 n_4 n_7 n_8 n_{11} n_{12} n_3 n_4 n_5 n_6 n_{11} n_{13} n_{14} n_{15} n_{16}$, satisfies statement coverage for the function Search since each node in the control flow graph of the function is covered by the paths in $D(\mathrm{Search}, T)$. A test case of a function induces paths in the control flow graph and thus also paths in the decision graph of the control flow graph if it runs through at least a second D-node.
Definition 9. Let $t$ be a test case of a function $f$. We define $\hat{D}(f, t) = \{ \hat{p} \mid p \in D(f, t) \text{ and } p \text{ contains at least two D-nodes} \}$ and, for a set $T$ of test cases, $\hat{D}(f, T) = \bigcup_{t \in T} \hat{D}(f, t)$.

Definition 10. Let $f$ be a function, $G_f = (N, E)$ the control flow graph of $f$, and $\hat{G} = (\hat{N}, \hat{E})$ the decision graph of the control flow graph. (i) A set $T$ of test cases satisfies decision coverage if and only if $\forall e \in E$ such that $\mathrm{start}(e)$ is a D-node $\exists p \in D(f, T) : e \in E(p)$. (4) (ii) A set $T$ of test cases satisfies edge coverage (of the control flow graph) if and only if $\forall e \in E\ \exists p \in D(f, T) : e \in E(p)$.
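The two conditions in Definition 10 can be illustrated on a small example. The following is a minimal sketch, not the paper's tooling: it reads item (i) as a requirement on those control flow graph edges that start in a D-node, represents an observed path as a list of node names, and uses a hypothetical if-then-else graph with hypothetical helper names.

```python
from typing import Iterable, List, Set, Tuple

Edge = Tuple[str, str]


def covered_edges(paths: Iterable[List[str]]) -> Set[Edge]:
    """All edges that occur on at least one observed path."""
    covered: Set[Edge] = set()
    for path in paths:
        covered |= set(zip(path, path[1:]))
    return covered


def satisfies_edge_coverage(edges: Iterable[Edge], paths: Iterable[List[str]]) -> bool:
    """Every edge of the control flow graph lies on some observed path."""
    return set(edges) <= covered_edges(paths)


def satisfies_decision_coverage(edges: Iterable[Edge], d_nodes: Set[str],
                                paths: Iterable[List[str]]) -> bool:
    """Every edge that starts in a D-node lies on some observed path."""
    decision_edges = {(u, v) for (u, v) in edges if u in d_nodes}
    return decision_edges <= covered_edges(paths)


# Hypothetical control flow graph of a single if-then-else construct.
cfg_edges = [("entry", "cond"), ("cond", "then"), ("cond", "else"),
             ("then", "join"), ("else", "join"), ("join", "exit")]
cfg_d_nodes = {"entry", "cond", "exit"}

then_path = ["entry", "cond", "then", "join", "exit"]
else_path = ["entry", "cond", "else", "join", "exit"]
```

In this example a single observed path misses one outcome of the decision, so neither condition holds, whereas the two paths together satisfy both.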

Figure 6: Another control flow graph with unconditional loop.

Figure 7: The control flow graph of the function Search generated by Crystal FLOW.
Definition 3. Let $G = (N, E)$ be a directed graph. A node $n \in N$ is called D-node if it is an entry node or an exit node or if $\mathrm{outdegree}(n) \geq 2$. A DD-path is a path $n_1 n_2 \cdots n_{k+1}$ where the start and end nodes $n_1, n_{k+1}$ are D-nodes and the other nodes $n_2, \ldots, n_k$ are not D-nodes.
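Definition 3 translates fairly directly into code. The sketch below is one possible reading under stated assumptions: the graph is given as a hypothetical successor dictionary with one entry node and one exit node, and walks that run into a dead end or into a cycle without D-nodes (an unconditional loop) are simply skipped.

```python
from typing import Dict, List, Set

Graph = Dict[str, List[str]]  # node -> list of successor nodes


def find_d_nodes(succ: Graph, entry: str, exit_node: str) -> Set[str]:
    """Entry node, exit node, and every node with outdegree >= 2."""
    return {n for n in succ if n in (entry, exit_node) or len(succ[n]) >= 2}


def find_dd_paths(succ: Graph, entry: str, exit_node: str) -> List[List[str]]:
    """Follow every edge leaving a D-node until the next D-node is reached."""
    d_nodes = find_d_nodes(succ, entry, exit_node)
    paths = []
    for start in sorted(d_nodes):
        for first in succ[start]:
            path = [start, first]
            visited = {first}
            while path[-1] not in d_nodes:
                successors = succ[path[-1]]
                # A non-D-node has at most one successor. Give up on dead
                # ends and on cycles that never reach a D-node.
                if not successors or successors[0] in visited:
                    path = None
                    break
                visited.add(successors[0])
                path.append(successors[0])
            if path is not None:
                paths.append(path)
    return paths


# Hypothetical control flow graph of a single if-then-else construct.
example_graph: Graph = {
    "entry": ["cond"],
    "cond": ["then", "else"],
    "then": ["join"],
    "else": ["join"],
    "join": ["exit"],
    "exit": [],
}
example_d_nodes = find_d_nodes(example_graph, "entry", "exit")
example_dd_paths = find_dd_paths(example_graph, "entry", "exit")
```

In the example, the junction node join is not a D-node, so both DD-paths that leave cond pass through it and end in the exit node.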