An Approach to Convert XMI Representation of UML 2.x Interaction Diagram into Control Flow Graph

For automation of many software engineering tasks such as program analysis, testing, and coverage analysis, it is necessary to construct a control flow graph. With the advancement of UML, software practitioners advocate constructing the control flow graph from some of the UML design artifacts. UML 2.x supports the modeling of control flow information in an interaction diagram by means of message sequences and different types of fragments like alt, opt, break, loop, and so forth. Leading UML modeling tools, namely MagicDraw, IBM's Rational, and so forth, export models in XMI format. Construction of a control flow graph from the XMI representation of an interaction diagram is not straightforward, as model elements of the interaction diagram are captured in XMI by means of values of attributes of multiple tagged elements, and correlations among these tagged elements are not explicitly specified. This paper proposes an approach for construction of a control flow graph from the XMI representation of a UML 2.x interaction diagram. A prototype tool based on our approach has been developed, which can be plugged into any computer-aided software engineering tool.


Introduction
UML has now become a standard modeling language in software industries [1]. Currently, UML models are being used in many software engineering tasks such as program analysis, testing, and coverage analysis [2][3][4][5][6][7][8][9][10][11]. For automation of these tasks, it is necessary to construct a control flow graph from UML models. UML 2.x supports the modeling of control flow information in an interaction diagram by means of message sequences and different types of fragments like alt, opt, break, loop, and so forth [12][13][14]. UML models are usually exported from UML modeling tools in XMI representation [15]. Unfortunately, construction of a control flow graph from the XMI representation of an interaction diagram is not straightforward. The factors contributing to the complexity of this construction are: (i) model elements of interaction diagrams are captured in XMI by means of values of attributes of multiple tagged elements, (ii) correlations among tagged elements are not explicitly specified [16], and (iii) control flow information is modeled by means of different types of fragments that can be nested to arbitrary depth.
Existing approaches [2,4,11,17,18] consider a set of mapping rules to construct a control flow graph from interaction diagrams [12]. The mapping rules are in essence as follows. (1) A message, the start of a fragment, and the end of a fragment correspond to nodes of the control flow graph. (2) An edge is constructed between two nodes depending on various conditions such as the type of nodes (message, start/end of a fragment) and the type of fragments (alt, opt, loop, etc.). For applying these mapping rules to the XMI representation of an interaction diagram, we need to do the following: (i) process the information associated with multiple tagged elements that correspond to a node in the control flow graph, (ii) track the tagged elements stored in the XMI representation between the start and end of each operand of a fragment as well as between the start and end of each fragment, and (iii) consider the interpretations of different fragments to map edges. The situation becomes complicated when an operand of a fragment contains a nested fragment of arbitrary nesting depth. In fact, the existing mapping rules are difficult to apply to the XMI representation of an interaction diagram. This issue is addressed in this paper. We propose an approach to parse the XMI representation of an interaction diagram in a simple way and store the model information in an intermediate data structure. This intermediate data structure enables us to apply the mapping rules to construct the control flow graph. Apart from this, information retrieved from the XMI representation can be embedded into the control flow graph for many automated software engineering tasks such as test case generation and coverage analysis. Based on our approach, a prototype tool, XMI2CFG, has been developed, which can take any interaction diagram in XMI representation, parse it, construct its control flow graph, and visualize the graph with its details to the users.
The rest of the paper is organized as follows. A few basic definitions, terminologies, and the control primitives for different types of fragments of an interaction diagram are discussed in Section 2. In Section 3, we discuss some issues of dealing with the XMI representation for construction of a control flow graph. Section 4 presents the proposed conversion approach from the XMI representation of an interaction diagram into a control flow graph. We discuss our prototype tool in Section 5. Section 6 discusses related work and compares our work with the existing work. Finally, Section 7 concludes the paper.

Preliminaries
In this section, we first present a few definitions and terminologies that are used in the later sections. Then, we discuss control flow primitives for the most commonly used fragments of an interaction diagram, such as alt, opt, break, and loop.

Definitions and Terminologies
XMI. XMI stands for XML metadata interchange. It is a standard representation used for exchanging metadata information by means of extensible markup language (XML) [19].
Interaction Diagram. An interaction diagram is a UML behavioral diagram of a use case [12,13]. This diagram models control flow by means of a sequence of messages among objects of a system. The objects are shown as rectangular boxes arranged horizontally. The vertical lines falling from the rectangular boxes represent lifelines. A message is represented by means of an arrow connecting the two lifelines attached to the sender and receiver objects of the message. Time is represented in the vertical direction, showing the sequence of messages.
Message. A message refers to an instruction sent to an object. A sender object sends a message to a receiver object to invoke a method defined for that receiver object. The difference between a message and a method is that a message is a request for a receiver object to perform some task(s), and the message consists of a message name and a list of (zero or more) arguments. On the other hand, when a message is sent to a receiver object, a method with the same name and argument list as the message gets executed [20].
Fragment. A fragment groups a set of messages to show conditional flow in an interaction diagram. A fragment is also termed an interaction operator that operates on a group of operands, and each operand represents a set of messages that occur in a sequence under some guard condition. The UML 2.x specification supports different types of fragments like alt (alternative), opt (optional), loop, break, and so forth [1]. A fragment specifies how execution of its operands is to be interpreted. For example, the alt fragment specifies that only the one operand whose guard condition is satisfied would be executed.
Model Element. Model element refers to the elements in a UML interaction diagram, such as a message, the start of a fragment, and the end of a fragment.

Control Flow Graph. The control flow graph for an interaction diagram, say ItrDgm, is a directed graph $CFG = \langle N, E \rangle$, where $N = \{n_1, n_2, \ldots, n_i, \ldots, n_q\}$ is a set of nodes and $E = \{\langle n_i, n_j \rangle\}$ is a set of edges. A node $n_i \in N$ represents a model element of ItrDgm, and an edge $\langle n_i, n_j \rangle \in E$ represents a control flow between two model elements, say $m_i$ (corresponding to $n_i$) and $m_j$ (corresponding to $n_j$).
Nodes of the control flow graph are of two types: message node and fragment node, representing a message and a fragment boundary, respectively. A message or fragment node is uniquely represented by a tuple $\langle T, ID, B, RO, M, PR, RVar \rangle$, where (i) T is the type of the fragment or message. A fragment is of a type like alt, opt, loop, and break, whereas a message is of type synchronous, reply, and so forth. (ii) ID is the identification number of the message/fragment. [...]
If $I_x$ is the null interaction, then $I_x \prec I_y$ is called the left null precedence relation and written as $\Lambda \prec I_y$. On the other hand, if $I_y$ is the null interaction, then $I_x \prec I_y$ is called the right null precedence relation and written as $I_x \prec \Lambda$.

Control Flow Primitives.
We now consider control flow primitives for most commonly used fragments alt, opt, break, and loop.
Alt Fragment. The alt fragment is used to capture alternative flows by means of multiple operands [12]. In Figure 1(a), an interaction diagram is shown with an alt fragment. The alt fragment has two operands containing the messages m2 and m3, respectively.
As observed in Figure 1(a), after receiving the message m1 from ObjectA, ObjectB sends the message m2 to itself if C1 becomes true; otherwise, ObjectB sends the message m3 to ObjectC. Finally, reply message m4 is sent from ObjectB to ObjectA.
Opt Fragment. The opt fragment has only one operand that is executed optionally [12]. In Figure 1(c), an interaction diagram contains an opt fragment with two messages m2 and m3. These two messages are executed if the guard condition C1 becomes true. Figure 1(d) depicts the graph representation for the interaction diagram in Figure 1(c). Note that when the guard condition C1 becomes false, the control flow is transferred from the start of the opt fragment to its end. This control flow is implicitly captured in the interaction diagram and so in its graph representation. However, this control flow is modeled explicitly in the control flow graph as an edge from the node $opt_1^S$ (corresponding to the start of the opt fragment) to the node $opt_1^E$ (corresponding to the end of the opt fragment), as shown in Figure 1(e).
Break Fragment. The break fragment is used to capture an exit point of the system [12]. In Figure 1(f), we see that if the guard condition C1 becomes true, then the system exits after execution of the two messages m2 and m3, following which no further message (i.e., m4 and m5) would be executed. Note that the control flow which is transferred from the start of the break fragment to the message m4 when the guard condition C1 becomes false is implicitly captured in both the interaction diagram and its graph representation (see Figure 1(g)). The control flow for the break fragment is modeled explicitly by reducing the node $break_1^E$ (corresponding to the end of the break fragment) to a sink node and introducing an edge from the node $break_1^S$ to the node $m_4$ in the control flow graph (see Figure 1(h)).
Loop Fragment. The loop fragment is used to model repetitive interactions [12]. In Figure 1(i), we see that message m2 of the loop fragment gets executed repeatedly until the guard condition C1 associated with the loop fragment becomes false. When C1 becomes false, the loop terminates and execution of the message m3 commences (see Figure 1). There are many more fragments, such as par and ref, defined in UML 2.x [12], whose graph representations and control flow graphs are not discussed in this paper due to space limitation; however, they can be treated likewise.

Issues with Construction of Control Flow Graph
In this section, we discuss the issues that arise in construction of a control flow graph from the XMI representation of an interaction diagram using mapping rules as followed in the existing approaches.
Let us consider a simple interaction diagram as shown in Figure 2(a). The interaction diagram in Figure 2(a) has five messages m1, m2, m3, m4, and m5 and two fragments $opt_1$ and $alt_1$. Figure 2(b) shows the graph representation with the usual bearings as discussed in the previous section. The XMI representation of the interaction diagram as exported from the MagicDraw 16.0 tool [21] appears as shown in Figure 3. All tagged elements in the XMI representation are referred to by the line numbers printed on the left side (see Figure 3).
Let us consider the mapping rules used for construction of a graph representation from an interaction diagram [2,4,11,17,18]. As per the mapping rules: (1) a message, the start of a fragment, and the end of a fragment are mapped into nodes in the graph representation; (2) an edge is considered between two nodes representing (a) two messages where one message follows another, or (b) one message and the start of a fragment where the fragment follows the message, or (c) the end of a fragment and one message where the message follows the fragment, or (d) the end of a fragment and the start of another fragment where the second fragment follows the first one; (3) for each fragment, edges are drawn (a) from the node representing the start of a fragment to the node representing the first element (message, fragment) in each of its operands and (b) from the node corresponding to the last element of each operand of the fragment to the node corresponding to the end of that fragment. In order to apply the mapping rules as stated above, the primary task is to find the detailed information of model elements, such as a message and the start and end of a fragment in the interaction diagram, from its XMI representation.
Let us see how information such as the sender and receiver objects of a message (say, m1) and their classes can be obtained from the XMI representation. For this, we need to identify a tagged element that specifies the method whose name is the same as the message m1. In the XMI representation, we observe that the name attribute of the tagged element "ownedOperation" (representing an operation) at line 4 contains the same value as the name of the message m1. The corresponding method is defined as call event "443" in the XMI representation (line 66). The call event "443" actually corresponds to the receive event "441" of the message "439" (lines 66 and 25). Note that the send and receive events of the message "439" correspond to the tagged elements containing the attribute value "xmi:type = MessageOccurrenceSpecification" (lines 24 and 25). These send and receive events occur at the object lifelines referred to by the attributes "covered = 376" and "covered = 393" (lines 24 and 25), respectively. The tagged elements corresponding to the lifelines ("xmi:id = 376", "xmi:id = 393") refer to the associated objects through the attributes "represents = 377" and "represents = 394" (lines 20 and 21). The tagged elements with the xmi:id values "377" and "394" contain the names of the objects: ObjectA and ObjectB (lines 16 and 17). Their class types are obtained as ClassA and ClassB from the tagged elements at lines 2 and 3. All this information is for the message m1 and implies that ObjectA of ClassA sends a message m1 to ObjectB of ClassB. In other words, for the node $m_1$ (corresponding to the message m1) in the graph representation, we have to retrieve the information encapsulated in the tagged elements at lines 4, 66, 25, 24, 62, 20, 16, 2, 21, 17, and 3 of the XMI representation.
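The three mapping rules can be sketched as a recursive walk over a nested description of the diagram. This is only an illustration of the rules, not code from the existing approaches: the encoding of a fragment as a pair (fragment Id, list of operands), the function name build_graph, and the node names ending in _S and _E are assumptions, and every operand is assumed to be non-empty.

```python
def build_graph(sequence):
    """Return (nodes, edges) for a sequence of messages and fragments.

    A message is a string; a fragment is a pair (fragment_id, operands),
    where each operand is itself a sequence. This encoding is hypothetical.
    """
    nodes, edges = [], []

    def entry(item):  # node at which control enters an item
        return item if isinstance(item, str) else item[0] + "_S"

    def exit_(item):  # node at which control leaves an item
        return item if isinstance(item, str) else item[0] + "_E"

    def walk(seq):
        for item in seq:
            if isinstance(item, str):
                nodes.append(item)                      # rule 1: message node
            else:
                frag_id, operands = item
                start, end = frag_id + "_S", frag_id + "_E"
                nodes.append(start)                     # rule 1: start-of-fragment node
                for operand in operands:
                    walk(operand)
                    edges.append((start, entry(operand[0])))   # rule 3(a)
                    edges.append((exit_(operand[-1]), end))    # rule 3(b)
                nodes.append(end)                       # rule 1: end-of-fragment node
        for a, b in zip(seq, seq[1:]):                  # rule 2: consecutive elements
            edges.append((exit_(a), entry(b)))

    walk(sequence)
    return nodes, edges


# m1, then an opt fragment with one operand (m2, m3), then m4
diagram = ["m1", ("opt1", [["m2", "m3"]]), "m4"]
nodes, edges = build_graph(diagram)
```

The sketch works only because the nesting is already explicit in its input; the difficulty discussed in this section is that the XMI representation does not expose the nesting in this form.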
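The chain of references followed for the message m1 can be sketched as a series of lookups keyed by xmi:id. The ids 439, 441, 376, 393, 377, and 394 follow the walkthrough above; the send-event id "440", the class ids "c1" and "c2", and the simplified field names are placeholders assumed for this illustration and do not reproduce the actual MagicDraw export.

```python
# Records keyed by xmi:id. Ids "440", "c1" and "c2" and the field names
# "kind" and "type" are made-up placeholders for this sketch.
elements = {
    "439": {"kind": "message", "name": "m1",
            "sendEvent": "440", "receiveEvent": "441"},
    "440": {"kind": "event", "covered": "376"},
    "441": {"kind": "event", "covered": "393"},
    "376": {"kind": "lifeline", "represents": "377"},
    "393": {"kind": "lifeline", "represents": "394"},
    "377": {"kind": "object", "name": "ObjectA", "type": "c1"},
    "394": {"kind": "object", "name": "ObjectB", "type": "c2"},
    "c1": {"kind": "class", "name": "ClassA"},
    "c2": {"kind": "class", "name": "ClassB"},
}


def endpoint(event_id):
    """Follow event -> lifeline -> object -> class; return (object, class)."""
    lifeline = elements[elements[event_id]["covered"]]
    obj = elements[lifeline["represents"]]
    return obj["name"], elements[obj["type"]]["name"]


def describe(message_id):
    """Collect sender and receiver details of one message."""
    msg = elements[message_id]
    return {
        "message": msg["name"],
        "sender": endpoint(msg["sendEvent"]),
        "receiver": endpoint(msg["receiveEvent"]),
    }


info = describe("439")
```

Each lookup corresponds to one hop between tagged elements, which is why the information for a single node ends up spread over many lines of the XMI file.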

Table 1 shows the association of each node in the graph representation with a group of tagged elements in the XMI representation. It is evident from Table 1 that the information of each node in the graph representation is not only associated with multiple tagged elements but also spread over different places of the XMI representation. This implies that the mapping of model elements from the XMI representation of an interaction diagram to the nodes of its graph representation is not straightforward. In fact, the conversion process becomes complex when we try to apply the mapping rules to construct edges in the graph representation for the fragments of an interaction diagram. This is due to the association of multiple tagged elements in the XMI representation with each model element in the interaction diagram and the need to track the tagged elements between the start and end of each operand of a fragment as well as between the start and end of each fragment. This is difficult because tagged elements for a fragment are stored in an unstructured way, that is, intermingled with tagged elements of other fragments. The situation becomes more complicated when an operand of a fragment contains a nested fragment of arbitrary nesting depth. That is why the mapping rules are too difficult to apply directly to the XMI representation.

Proposed Approach
To overcome the difficulties pointed out in the previous section, we propose the concept of interaction sequence. Here, the term interaction signifies either a single message or a set of messages of a fragment. We propose a solution to extract the interaction sequences for each fragment precisely and then map them to the control flow graph following the mapping rules as discussed in Section 2.2. The major steps in our approach are shown in Figure 4. The first step in our approach is to synthesize metainformation from the XMI representation of a given UML interaction diagram. This metainformation is then processed to identify the fragment set and message set. These sets are used to determine the nodes of the graph representation in the second step. The fragment structure is obtained in the next step. The fourth step determines the edges among nodes of the graph. The edges are labeled in the fifth step. This completes the graph representation of the input interaction diagram. The last step applies a set of control flow rules for different types of fragments to obtain the control flow graph. We now discuss the various steps in detail.
Step 1 (identifying the fragments and their message sets). The first step of our conversion approach is to identify a set of fragments and their message sets from values of the attributes of the tagged elements in the XMI representation. For this, we use a standard SAX (Simple API for XML) parser [22]. Note that SAX is an event-based parser. As the name implies, a SAX parser generates events while reading an XML document. The events are related to element opening tags, element closing tags, content of elements, and so forth in the XML document. These events notify an application by calling appropriate event handlers implemented by the application. For example, the two event handlers startElement() and endElement() are invoked when the parser reads the opening tag and closing tag of an element, respectively. During invocation of event handlers, attributes of the tagged elements are passed as a list of parameters. The processing of values of the attributes of the elements in the XMI representation is necessary to identify the fragments and their message sets. For this, the following steps need to be carried out.
(a) Storing Values from Tagged Elements. We store values of the attributes of tagged elements as objects of the following classes: EMessage, EMessageEvent, ECallEvent, EOperation, EFragment, EOperand, EObject, EClass, and ELifeline, as referred to in Table 2. The association of a tagged element and the value of its attribute "xmi:type" (see Figure 3) with an event is shown in Table 2. This table also depicts which class of object is used to store the values of attributes of a tagged element. Attributes of the classes are shown in Figure 5. The values of most of the attributes for the objects of these classes are obtained directly from the values of the attributes of the corresponding tagged elements. The values of some attributes of objects, however, are either obtained from other objects or set with a specific value. For example, the value of the attribute "type" of an object, say aMEvent (of EMessageEvent class), corresponding to a send event is set as aMEvent.type = "sendEvent", whereas for a receive event, it is set as aMEvent.type = "receiveEvent".
If an operand contains an inner fragment with its Id as $f_i$ and aOperand (of EOperand class) is the object corresponding to the operand, then $f_i$ would be stored in "MessageList" of aOperand along with the Ids of other messages in the same sequence as they appear in the operand. In a nutshell, after storing the values of attributes of the tagged elements, we obtain a set of lists of objects of the classes listed above (see Table 2).
(b) Synthesizing Details of a Message. We need to synthesize detailed information about a message, such as the name of the object method that gets executed when the message is sent, the sender object and its class, the receiver object and its class, the parameters of the message, and the return variable (if any). In other words, we are to find the values of the attributes: MethodName, [...]. Note that one message corresponds to two message events, namely, send and receive events, which may or may not correspond to a call event. This is because only the receive event of a message (other than a reply message) and the send event of a reply message correspond to call events. Further, a call event (except for a reply message) corresponds to an operation. All this information is represented in the class diagram (see Figure 5). To synthesize values of attributes for an object aMessage of EMessage class using associations (as shown in Figure 5), we identify two objects sMEvent and rMEvent from the list of EMessageEvent objects such that (i) sMEvent and rMEvent correspond to the send and receive events of the message corresponding to aMessage, respectively, and (ii) if aMessage represents a reply message, then sMEvent should correspond to a call event (represented by an object, say aCallEvent, in the list of ECallEvent objects); otherwise, rMEvent would correspond to a call event (corresponding to aCallEvent) that is associated with an operation (represented by an object, say aOperation, in the list of EOperation objects). In other words, the conditions are to be satisfied as follows.
[...] We may note that for the nested fragment $f_i$, [...]
Step 2 (determining the nodes). [...]
Step 3 (determining fragment structure). In order to find the edges among nodes in $N$ of the CFG, we need to determine the hierarchy structure of the fragments formed by the set of messages of the interaction diagram ($M_{seq}$). In other words, we are to find the outermost fragments (the fragments that are not contained in another fragment) built by $M_{seq}$ and then determine the inner fragments contained in each fragment.
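The event-driven style of SAX can be illustrated with Python's standard xml.sax module, whose ContentHandler exposes the same startElement() and endElement() callbacks. This is a minimal sketch and not the XMI2CFG implementation: the class name XmiHandler, the record layout, and the two-element sample document are assumptions made for the example.

```python
import xml.sax


class XmiHandler(xml.sax.ContentHandler):
    """Collects (tag, xmi:type, xmi:id) for every element carrying an xmi:id."""

    def __init__(self):
        super().__init__()
        self.records = []
        self.depth = 0  # current nesting depth of open elements

    def startElement(self, name, attrs):
        # Called when the parser reads an opening tag; attrs holds its attributes.
        self.depth += 1
        if "xmi:id" in attrs:
            self.records.append((name, attrs.get("xmi:type"), attrs["xmi:id"]))

    def endElement(self, name):
        # Called when the parser reads the matching closing tag.
        self.depth -= 1


# Hypothetical, heavily reduced XMI fragment used only for this sketch.
SAMPLE = b"""<xmi:XMI xmlns:xmi="http://schema.omg.org/spec/XMI/2.1">
  <lifeline xmi:type="uml:Lifeline" xmi:id="376" represents="377"/>
  <message xmi:type="uml:Message" xmi:id="439" name="m1"/>
</xmi:XMI>"""

handler = XmiHandler()
xml.sax.parseString(SAMPLE, handler)
```

Because the handler sees one element at a time, any correlation between elements (for example, between a message and its lifelines) has to be resolved afterwards from the stored records.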
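How an inner fragment Id sits in the MessageList of its enclosing operand can be sketched with two small record types. Guard and MessageList are the attribute names used in the text; the fields Id, Type, and Operands, the helper message_set, and the sample data are assumptions, since the full attribute lists of Figure 5 are not reproduced here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EOperand:
    Guard: str
    # Ids of messages and inner fragments, in the order they appear
    MessageList: List[str] = field(default_factory=list)


@dataclass
class EFragment:
    Id: str
    Type: str  # e.g. "alt", "opt", "loop", "break"
    Operands: List[EOperand] = field(default_factory=list)


def message_set(fragment_id, fragments):
    """Flatten a fragment into the Ids of all messages it contains, at any depth."""
    result = []
    for operand in fragments[fragment_id].Operands:
        for item in operand.MessageList:
            if item in fragments:          # item is the Id of an inner fragment
                result.extend(message_set(item, fragments))
            else:                          # item is the Id of a message
                result.append(item)
    return result


fragments = {
    "alt1": EFragment("alt1", "alt", [EOperand("C2", ["m3"]), EOperand("!C2", ["m4"])]),
    "opt1": EFragment("opt1", "opt", [EOperand("C1", ["m2", "alt1"])]),
}
opt_messages = message_set("opt1", fragments)
```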
(a) To find the outermost fragments formed by the messages in $M_{seq}$, we need to identify the minimal set of fragments $F \subseteq F_{seq}$ that together correspond to the largest subset of $M_{seq}$. For this, we follow two steps.
(i) Initially, set $F = F_{seq}$ (the set of all fragments in the interaction diagram). (ii) Exclude the fragment $f_i$ from $F$ if there exists another fragment $f_j \neq f_i$, $f_j \in F$, such that $M_{f_i} \subset M_{f_j}$, where $M_{f_i}$ and $M_{f_j}$ are the sets of messages of the fragments $f_i$ and $f_j$, respectively. Repeat this step until no such fragment $f_i$ in $F$ can be excluded. The resultant set $F$ is the minimum set of fragments that can replace the largest subset of $M_{seq}$.
We then replace the subset of $M_{seq}$ corresponding to $M_{f_i}$ for each fragment $f_i \in F$ by the ID of the fragment $f_i$. This implies that the message set $M_{seq}$ covers all fragments in $F$ as outermost fragments.
(b) To determine the inner fragments contained in each fragment $f_i$ in $F_{seq}$, we need to find the minimal set of fragments $F \subset F_{seq}$ that together correspond to the largest subset of $M_{f_i}$ (the messages of the fragment $f_i$).
For this, we follow the two steps given below.
(i) Find the set of fragments $F \subset F_{seq}$ such that the set of messages, say $M_{f_k}$, of a fragment $f_k \in F$ corresponds to a subset of $M_{f_i}$, where $i \neq k$. (ii) Exclude the fragment $f_k$ from $F$ if there exists another fragment $f_j \neq f_k$, $f_j \in F$, such that $M_{f_k} \subset M_{f_j}$, where $M_{f_k}$ and $M_{f_j}$ are the sets of messages of the fragments $f_k$ and $f_j$, respectively. Repeat this step until no such fragment $f_k$ in $F$ can be excluded. The resultant set $F$ is the minimum set of fragments that can replace the largest subset of $M_{f_i}$.
We then replace that largest subset of $M_{f_i}$ by the IDs of the fragments in $F$.
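Both parts of Step 3 amount to keeping the fragments whose message sets are not properly contained in another candidate's message set. The sketch below assumes each fragment's message set is available as a Python set; the function names and sample data are hypothetical. Two simplifications are assumed: the repeated exclusion is done in a single pass, which gives the same result because proper set inclusion is transitive, and an inner fragment is required to have a proper subset of the enclosing fragment's messages.

```python
def maximal(candidates, message_sets):
    """Keep fragments whose message set is not a proper subset of another candidate's."""
    return {
        f for f in candidates
        if not any(message_sets[f] < message_sets[g] for g in candidates if g != f)
    }


def outermost_fragments(message_sets):
    """Part (a): fragments not contained in any other fragment."""
    return maximal(set(message_sets), message_sets)


def inner_fragments(f_i, message_sets):
    """Part (b): fragments directly contained in f_i."""
    inside = {k for k in message_sets
              if k != f_i and message_sets[k] < message_sets[f_i]}
    return maximal(inside, message_sets)


def collapse(sequence, fragments, message_sets):
    """Replace the messages of each fragment in `fragments` by the fragment Id."""
    owner = {m: f for f in fragments for m in message_sets[f]}
    collapsed = []
    for m in sequence:
        item = owner.get(m, m)
        if item not in collapsed:   # emit each fragment Id only once
            collapsed.append(item)
    return collapsed


message_sets = {"opt1": {"m2", "m3", "m4"}, "alt1": {"m3", "m4"}}
m_seq = ["m1", "m2", "m3", "m4", "m5"]
top = outermost_fragments(message_sets)
```

Collapsing m_seq with the outermost fragments and then collapsing each fragment's own message list with its inner fragments yields the hierarchy used when determining edges.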
Step 4 (determining the edges). After determining the hierarchy structure of the fragments formed by the set of messages $M_{seq}$ of the interaction diagram, we find the edges among nodes in $N$ of the CFG using the following steps.
(a) We apply the precedence relation $\prec$ on $M_{seq}$ and obtain a set of precedence relations, say $P(M_{seq})$. For this, we consider the sequence numbers of all messages (i.e., SeqNumber of the corresponding EMessage objects) in $M_{seq}$, and in case $M_{seq}$ contains an interaction (i.e., a fragment $f_i$), we use the sequence number of a message in $M_{f_i}$ (the message set of $f_i$). If $P(M_{seq})$ includes a precedence relation $(I_x \prec I_y)$ such that $I_x \in F_{seq}$ or $I_y \in F_{seq}$, then $P(M_{seq}) = P(M_{seq}) \cup P(M_{f_k})$, where $P(M_{f_k})$ is the set of precedence relations on the set of messages of the fragment $f_k$, and $f_k = I_x$ or $I_y$. The unions are repeated for all fragment operands present in the precedence relations of $P(M_{seq})$. The reason for the union operation in the computation of $P(M_{seq})$ is the presence of some fragment operand $f_k$ in some precedence relation, which implies that precedence relations among the messages in the fragment $f_k$ also need to be considered. Note that we compute $P(M_{f_k})$ using the sequence of message Id(s) in the MessageList of the EOperand objects associated with the EFragment object corresponding to the fragment $f_k$.
(b) Considering a precedence relation $(I_x \prec I_y)$ in $P(M_{seq})$, we draw an edge following the set of rules mentioned below.
(i) If $I_x, I_y \in M_{seq}$, then we draw an edge from the message node (corresponding to $I_x$) to the message node (corresponding to $I_y$). This edge implies that the message $I_x$ occurs immediately before the message $I_y$. (ii) If $I_x \in M_{seq}$ and $I_y \in F_{seq}$, then an edge is drawn from the message node (corresponding to the message $I_x$) to the fragment node (corresponding to the start of the fragment $I_y$). The significance of this edge is that the message $I_x$ occurs immediately before the start of the interaction that corresponds to the fragment $I_y$. (iii) If $I_x \in F_{seq}$ and $I_y \in M_{seq}$, then we draw an edge from the fragment node (corresponding to the end of the fragment $I_x$) to the message node (corresponding to the message $I_y$). This edge implies that the message $I_y$ occurs immediately after the end of the interaction that corresponds to the fragment $I_x$.
(iv) If $I_x, I_y \in F_{seq}$, then an edge is drawn from the fragment node (corresponding to the end of the fragment $I_x$) to the fragment node (corresponding to the start of the fragment $I_y$). The significance of this edge is that the end of the interaction which corresponds to the fragment $I_x$ occurs immediately before the start of the interaction that corresponds to the fragment $I_y$.
(c) Next, we draw an edge corresponding to each left null precedence relation $(\Lambda \prec I) \in P(M_{seq})$. Note that $(\Lambda \prec I)$ implies that $I$ is the first interaction in a fragment $f_k \in F_{seq}$ and that the interaction $I$ is either a message or a set of messages of an inner fragment $f_i$ of $f_k$, where $f_i, f_k \in F_{seq}$ and $f_i \neq f_k$. Thus, we identify the $M_{f_k}$ of a fragment $f_k \in F_{seq}$ such that $M_{f_k}$ contains $I$. It may be noted that $M_{f_k}$ may contain fragments in addition to messages because a subset of $M_{f_k}$ has been replaced by fragments in the preceding step (determining fragment structure). If $I \in F_{seq}$, then we draw an edge from the fragment node (corresponding to the start of the fragment $f_k$) to the fragment node (representing the start of the fragment $I$); otherwise, we draw an edge from the fragment node (corresponding to the start of the fragment $f_k$) to the message node (corresponding to the message $I$). The edge thus obtained for a left null precedence relation is either between the start boundaries of two fragments $f_k, I \in F_{seq}$ or between the start boundary of the fragment $f_k$ and the message that occurs first in $f_k$.
(d) Similarly, for each right null precedence relation $(I \prec \Lambda) \in P(M_{seq})$, we draw an edge in the control flow graph. Note that $(I \prec \Lambda)$ implies that $I$ is the last interaction in a fragment $f_k \in F_{seq}$ and that the interaction $I$ is either a message or a set of messages of an inner fragment $f_i$ of $f_k$, where $f_i, f_k \in F_{seq}$ and $f_i \neq f_k$. Thus, we identify the $M_{f_k}$ of a fragment $f_k \in F_{seq}$ such that $M_{f_k}$ contains $I$. If $I \in F_{seq}$, then we draw an edge from the fragment node (representing the end of the fragment $I$) to the fragment node (representing the end of the fragment $f_k$); otherwise, we draw an edge from the message node (corresponding to the message $I$) to the fragment node (corresponding to the end of the fragment $f_k$). The edge thus obtained for a right null precedence relation is either between the end boundaries of two fragments $I, f_k \in F_{seq}$ or between the message that occurs last in the fragment $f_k$ and the end boundary of $f_k$.
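Parts (b) to (d) of Step 4 can be condensed into one small function that turns a precedence relation into an edge. This is a sketch under assumed conventions: None stands for the null interaction, node names for fragment boundaries carry the suffixes _S and _E, and the argument enclosing names the fragment that contains the interaction in a null precedence relation.

```python
NULL = None  # stands for the null interaction


def edge_for(i_x, i_y, fragment_ids, enclosing=None):
    """Translate one precedence relation (i_x before i_y) into a CFG edge.

    `enclosing` is the fragment f_k that contains the interaction; it is
    needed only for the left and right null precedence relations.
    """
    def start(i):  # node at which control enters interaction i
        return i + "_S" if i in fragment_ids else i

    def end(i):    # node at which control leaves interaction i
        return i + "_E" if i in fragment_ids else i

    if i_x is NULL:                        # part (c): left null precedence
        return (enclosing + "_S", start(i_y))
    if i_y is NULL:                        # part (d): right null precedence
        return (end(i_x), enclosing + "_E")
    return (end(i_x), start(i_y))          # part (b), rules (i) to (iv)


fragment_ids = {"opt1", "alt1"}
relations = [("m1", "opt1"), ("opt1", "alt1"), ("alt1", "m5"), ("m2", "m3")]
edges = [edge_for(x, y, fragment_ids) for x, y in relations]
first = edge_for(NULL, "m2", fragment_ids, enclosing="opt1")
last = edge_for("alt1", NULL, fragment_ids, enclosing="opt1")
```

Treating "leave at the end boundary, enter at the start boundary" as the single underlying rule covers all four cases of part (b) without enumerating them.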
Step 5 (identifying the labels of edges). Once the edges of the control flow graph (CFG) are determined, we assign the guard conditions associated with operands of each fragment of $F_{seq}$ to the edges in the CFG. For this, we consider the edge corresponding to each left null precedence relation $(\Lambda \prec I) \in P(M_{seq})$ and label the edge with the guard condition associated with the operand of the fragment that contains $I$ (combined fragment or message). To obtain the guard condition, we use the value of the instance variable Guard of the EOperand object whose MessageList contains the Id of $I$. The interpretation of this label assignment is that if the guard condition associated with the edge is satisfied, then all messages in the operand of the fragment would be executed. All other edges have no label.
Table 3: Notations.
$f_i^S$, $f_i^E$: two fragment nodes representing the start and end of the fragment $f_i$
$m_j$, $m_k$: two message nodes representing two messages
$f_i^S \rightarrow m_j$: an edge from the fragment node representing the start of fragment $f_i$ to message node $m_j$
$f_i^S \xrightarrow{c} m_j$: an edge from $f_i^S$ to $m_j$ labeled with the condition $c$
$f_i^S$ [...] $m_j$: add an edge from $f_i^S$ to $m_j$
$f_i^S$ [...] $!c$ $m_j$: add an edge from $f_i^S$ to $m_j$ labeled with the condition $!c$
Step 6 (applying control flow rules). Once the preceding five steps are over, construction of the graph representation for the interaction diagram is complete. Note that the graph representation of an interaction diagram captures control flow for the fragments loop, opt, break, ref, and so forth implicitly (see the discussions in Section 2.2). To reduce this graph representation into a control flow graph, we propose a set of rules with the help of the notations given in Table 3.
(a) Loop Fragment. The first six rules (R1-R6) are for the loop fragment. Rule R1 says that if there is an edge labeled with $c$ from the fragment node representing the start of the loop fragment $loop_i$ to message node $m_j$ and another edge from the fragment node representing the end of the loop fragment $loop_i$ to message node $m_k$, then do the following: (i) add a back edge from the end of the loop fragment $loop_i$ to the start of that loop fragment, (ii) add a loop exit edge from the start of the loop fragment $loop_i^S$ to the message node $m_k$ with the label $!c$, and (iii) delete the edge from the end of the loop fragment $loop_i$ to the message $m_k$. Rules R2 to R6 are similar to rule R1; the only difference is the context in which each rule is applied. For example, when the loop fragment $loop_i$ has an inner fragment $f_j$ as the first interaction, rule R2 is applied. Rule R3 is applied when $loop_i$ is followed by another fragment $f_k$. Rule R4 is applied when $loop_i$ is contained in some other fragment $f_k$ as the last interaction. Rule R5 is applied if $loop_i$ has an inner fragment $f_j$ as the first interaction and $loop_i$ is followed by the fragment $f_k$. Rule R6 is applied when the loop fragment $loop_i$ has an inner fragment $f_j$ as the first interaction and $loop_i$ is contained in some other fragment $f_k$ as the last interaction.
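Rule R1 can be read as a rewrite on a labeled edge set. The sketch below assumes edges are stored in a dictionary mapping (source, target) to a guard label or None, and that boundary nodes are named with the suffixes _S and _E; the function name apply_r1 and the sample graph (a loop containing m2, preceded by m1 and followed by m3) are made up for the illustration.

```python
def apply_r1(edges, loop_id):
    """Apply rule R1 to a loop whose first interaction and successor are messages.

    `edges` maps (source, target) -> guard label or None.
    """
    start, end = loop_id + "_S", loop_id + "_E"
    result = dict(edges)
    # Guard c on the edge leaving the start of the loop
    guard = next(label for (s, _), label in edges.items() if s == start and label)
    for (s, t) in list(edges):
        if s == end:
            del result[(s, t)]                 # (iii) delete end -> m_k
            result[(start, t)] = "!" + guard   # (ii) loop exit edge start -> m_k
    result[(end, start)] = None                # (i) back edge end -> start
    return result


graph = {
    ("m1", "loop1_S"): None,
    ("loop1_S", "m2"): "C1",
    ("m2", "loop1_E"): None,
    ("loop1_E", "m3"): None,
}
cfg = apply_r1(graph, "loop1")
```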
(b) Opt Fragment. Two rules (R7 and R8) are for the opt fragment. According to rule R7, if there is an edge labeled with $c$ from the fragment node representing the start of the opt fragment $opt_i$ to some message node $m_j$, then add an edge with the label $!c$ from the fragment node representing the start of the opt fragment $opt_i$ to the end of that opt fragment. Rule R8 is similar to rule R7; the difference is that rule R8 is applied only when the opt fragment contains an inner fragment $f_j$ as the first interaction.
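Rule R7 only adds the bypass edge that the graph representation leaves implicit. A minimal sketch, again assuming edges are held in a dictionary from (source, target) to a guard label or None and that boundary nodes use the suffixes _S and _E; the function name and sample graph are hypothetical.

```python
def apply_r7(edges, opt_id):
    """Add the edge start -> end labeled with the negated guard (rule R7)."""
    start, end = opt_id + "_S", opt_id + "_E"
    result = dict(edges)
    for (s, _), label in edges.items():
        if s == start and label:            # the guarded edge into the operand
            result[(start, end)] = "!" + label
    return result


graph = {
    ("m1", "opt1_S"): None,
    ("opt1_S", "m2"): "C1",
    ("m2", "m3"): None,
    ("m3", "opt1_E"): None,
    ("opt1_E", "m4"): None,
}
cfg = apply_r7(graph, "opt1")
```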
(c) Ref Fragment. Only one rule (R9) is for the ref fragment. Rule R9 says that if the ref fragment refers to the interaction diagram ID whose graph representation is $ID(S_{ID}, E_{ID})$ with the start node $S_{ID}$ and end node $E_{ID}$, respectively, then do the following: (i) add an edge from the start of the ref fragment to $S_{ID}$, (ii) add an edge from $E_{ID}$ to the end of the ref fragment, and (iii) delete the edge from the start of the ref fragment to the end of the ref fragment.
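Rule R9 can be sketched in the same edge-dictionary style. The rule itself lists only the three edge operations; copying the edges of the referenced diagram into the result is an additional assumption of this sketch, as are the function name, the node names S_ID and E_ID as plain strings, and the sample data.

```python
def apply_r9(edges, ref_id, sub_edges, sub_start, sub_end):
    """Splice the graph of a referenced diagram into a ref fragment (rule R9)."""
    start, end = ref_id + "_S", ref_id + "_E"
    result = dict(edges)
    result.pop((start, end), None)      # (iii) delete start -> end of the ref fragment
    result.update(sub_edges)            # assumption: bring in the referenced graph
    result[(start, sub_start)] = None   # (i) start of ref fragment -> S_ID
    result[(sub_end, end)] = None       # (ii) E_ID -> end of ref fragment
    return result


graph = {
    ("m1", "ref1_S"): None,
    ("ref1_S", "ref1_E"): None,
    ("ref1_E", "m2"): None,
}
referenced = {("S_ID", "a1"): None, ("a1", "E_ID"): None}
cfg = apply_r9(graph, "ref1", referenced, "S_ID", "E_ID")
```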
(d) Break Fragment.Next six rules (R10-R15) are for the break fragment.Rules R10-R15 are similar to the rules R1-R6, but only the difference is that back edge is not there in case of break fragment, which is added for loop fragment (see the rules from R1 to R6 and from R10 to R15).
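The effect of rules R1 and R7 can be illustrated with a small sketch. It is written in Python for brevity (the tool itself is implemented in Java) and is not the authors' implementation. The edge representation (a dictionary mapping a (source, target) pair to its label) and the function names are assumptions made for this illustration only.

```python
def guard_of(edges, start):
    """Return the guard label c on the edge leaving the fragment start node."""
    return next(label for (src, _), label in edges.items() if src == start and label)


def apply_loop_rule(edges, start, end):
    """R1-style rewrite for a loop fragment with start/end fragment nodes."""
    c = guard_of(edges, start)
    # Nodes m_k currently reached from the end of the loop fragment.
    exits = [dst for (src, dst) in edges if src == end and dst != start]
    out = dict(edges)                 # leave the input graph untouched
    out[(end, start)] = ""            # (i) back edge from loop end to loop start
    for m_k in exits:
        out[(start, m_k)] = "!" + c   # (ii) loop exit edge labeled !c
        del out[(end, m_k)]           # (iii) drop the edge from loop end to m_k
    return out


def apply_opt_rule(edges, start, end):
    """R7-style rewrite: add the bypass edge labeled !c from opt start to opt end."""
    out = dict(edges)
    out[(start, end)] = "!" + guard_of(edges, start)
    return out


# Hypothetical graph: loopS -c-> m1 -> loopE -> m2
loop_graph = {("loopS", "m1"): "c", ("m1", "loopE"): "", ("loopE", "m2"): ""}
rewritten = apply_loop_rule(loop_graph, "loopS", "loopE")
```

In the rewritten graph the only way to reach m2 is the exit edge from the loop start, which is taken when the guard c is false. Rules R2-R6 and R8 would differ only in whether the neighbouring nodes are message nodes or fragment nodes.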
Illustration of Our Approach. We illustrate our approach for converting the XMI representation of an interaction diagram into an equivalent control flow graph with the help of a case study pertaining to a Restaurant Automation System (RAS). The RAS automates various functionalities of a restaurant such as Make Order, Process Order, and Generate Bill. Here, we focus only on one use case, namely, Generate Bill. In the Generate Bill use case, the manager of the restaurant inputs the Order Number of the order whose Bill is to be generated. Depending on the current status of the order (which may not even be processed or delivered) and on whether a Bill has already been generated for this Order, many scenarios can occur; these are modeled in the interaction diagram shown in Figure 6. All messages and fragments in the interaction diagram are referred to by their message numbers and fragment labels, respectively (see Figure 6).
Identifying the Fragments and Their Message Sets. Taking the XMI representation of the interaction diagram as input (see Figure 6), we first parse it using the SAX parser and then obtain a set of lists of objects of the following classes: (a) EMessage, (b) EMessageEvent, (c) ECallEvent, (d) EOperation, (e) EFragment, (f) EOperand, (g) EObject, (h) EClass, and (i) ELifeline (see Table 2). The values of the instance variables of an aMessage object in the list of EMessage objects are set with the values from the objects of the other classes, considering the class relationships as discussed in Step 1. We then obtain the following.
Determining the Nodes. We determine the nodes of the control flow graph from $M_{seq}$ and $F_{seq}$ as follows.
(a) We add two fragment nodes into the set of nodes $N$ of the control flow graph for each fragment $f_i \in F_{seq}$ and thus obtain $N = \{\ldots, break_2^E\}$ (see Figure 7). Here, the filled nodes represent fragment nodes, and the empty nodes represent message nodes (see the control flow graph in Figure 7).
(c) We store the values of the tuple $\langle T, ID, B, RO, M, PR, RVar \rangle$ for all nodes in $N$ of $CFG\langle N, E \rangle$ in a table named the Node table (see Table 4). For a message node $m_i \in N$, we use the values of the instance variables of the EMessage object corresponding to the message $m_i \in M_{seq}$. We obtain the information for the fragment nodes from the EFragment objects (see Table 4). Note that for a message node corresponding to a reply message, PR is the same as the return value, if any, and both M and RVar are empty.
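The node tuples of this step can be sketched as follows. This is a Python illustration rather than the tool's Java code; the dictionary keys used for the parsed messages and fragments ("type", "id", "seq", and so on) are hypothetical names chosen for the example.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """One row of the Node table: the tuple <T, ID, B, RO, M, PR, RVar>."""
    T: str
    ID: str
    B: Optional[str]
    RO: Optional[str]
    M: Optional[str]
    PR: Optional[list]
    RVar: Optional[str]


def build_node_table(messages, fragments):
    table = []
    # Two fragment nodes (start S and end E) per fragment; message fields are null.
    for frag in fragments:
        for boundary in ("S", "E"):
            table.append(Node(frag["type"], frag["id"], boundary, None, None, None, None))
    # One message node per message, ordered by sequence number; B is null.
    for msg in sorted(messages, key=lambda m: m["seq"]):
        if msg["type"] == "reply":
            # For a reply message PR carries the return value; M and RVar stay empty.
            table.append(Node("reply", msg["id"], None, msg.get("receiver"),
                              None, msg.get("params"), None))
        else:
            table.append(Node(msg["type"], msg["id"], None, msg.get("receiver"),
                              msg.get("method"), msg.get("params"), msg.get("return_var")))
    return table


# Hypothetical input loosely modeled on the Generate Bill example.
example_messages = [
    {"type": "synchCall", "id": "m2", "seq": 2, "receiver": "aOrder",
     "method": "getStatus", "params": [], "return_var": "Status"},
    {"type": "synchCall", "id": "m1", "seq": 1, "receiver": "orderList",
     "method": "find", "params": ["OrderNumber"]},
]
example_fragments = [{"type": "opt", "id": "f1"}]
node_table = build_node_table(example_messages, example_fragments)
```

A named tuple keeps the seven attributes in the order used in the text, so a row can be read directly against the Node table.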
Determining Fragment Structure. In this step, we first find the outermost fragments formed by the set of messages $M_{seq}$. We then determine the inner fragments contained in each fragment to find the hierarchical structure of the fragments.
(a) We need to determine the minimum number of fragments $F$ that together correspond to the largest subset of $M_{seq}$. The set $F$ is initialized as the set of all fragments of the interaction diagram. After applying the control flow rules, we obtain the final control flow graph as shown in Figure 7(b).

XMI2CFG: A Prototype Tool
We have developed a prototype tool named XMI2CFG (XMI of interaction diagram to control flow graph) following our approach. We have implemented XMI2CFG in Java (Java 2) using NetBeans IDE 6. The dependencies among these classes are also depicted in Figure 8.
The MyParser class implements the event handlers startElement(), endElement(), characters(), and endDocument() to interface with the SAX parser. For this, we have used the Apache Xerces library available in the web portal [22]. In the event handler startElement(), we process the tagged elements named "ownedAttribute", "lifeline", "fragment", "operand", "guard", "specification", "argument", "body", "ownedParameter", "message", "packagedElement", "ownedOperation", and "ownedBehavior". Depending on the type of the tagged elements, we categorize them as "MessageEvent", "Fragment", "CallEvent", "Object", "Class", "Lifeline", "Operand", "Message", "Operation", "Parameter", "SequenceDiagram", and "Guard". When multiple tagged elements have the same name, we consider the value of the attribute "xmi:type". For example, the tagged elements specifying a class name and a call event both have the name "packagedElement". To distinguish them, we check whether the attribute "xmi:type" has the value "uml:CallEvent" or "uml:Class". For each processed tagged element, we retrieve the associated metainformation of the interaction diagram from the SAX parser and store it in the instance variables Id, Name, ClassId, ObjectId, FragmentType, Guard, SendEventId, ReceiveEventId, CallEventId, MessageType, OperationId, MessageId, and LifelineId of the MyParser class. Note that after a tagged element is processed, only the relevant variables have meaningful values, and the rest have the null value.
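The classification step can be sketched with Python's xml.sax module, whose ContentHandler offers the same startElement() callback as the Java SAX interface. This is an illustration only and not the MyParser class itself. The tag-to-category mapping, the namespace URIs, and the sample XMI fragment are assumptions made for the example.

```python
import xml.sax

# Assumed mapping for tags whose name alone determines the category.
CATEGORY_BY_TAG = {
    "lifeline": "Lifeline",
    "operand": "Operand",
    "message": "Message",
    "ownedOperation": "Operation",
    "guard": "Guard",
}
# "packagedElement" is ambiguous, so the xmi:type attribute decides.
CATEGORY_BY_XMI_TYPE = {"uml:Class": "Class", "uml:CallEvent": "CallEvent"}


class ClassifyingHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.seen = []  # (category, xmi:id) pairs in document order

    def startElement(self, name, attrs):
        if name == "packagedElement":
            category = CATEGORY_BY_XMI_TYPE.get(attrs.get("xmi:type"))
        else:
            category = CATEGORY_BY_TAG.get(name)
        if category is not None:
            self.seen.append((category, attrs.get("xmi:id")))


def classify(xml_bytes):
    handler = ClassifyingHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.seen


# Hypothetical, heavily simplified XMI fragment (placeholder namespace URIs).
SAMPLE_XMI = b"""<xmi:XMI xmlns:xmi="http://example.org/xmi" xmlns:uml="http://example.org/uml">
  <packagedElement xmi:type="uml:Class" xmi:id="c1" name="Order"/>
  <packagedElement xmi:type="uml:CallEvent" xmi:id="e1"/>
  <lifeline xmi:id="l1"/>
  <message xmi:id="m1"/>
</xmi:XMI>"""
```

Because the parser runs without namespace processing by default, the attribute is looked up under its prefixed name "xmi:type", which matches how the text refers to it.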
Once the metainformation of the interaction diagram is available through the instance variables of the MyParser object, we instantiate a metaobject of the particular type given in the metaclass diagram of the interaction diagram (see Figure 5). We then pass the metaobject to the CFGConstructor object via the parameter of its method RegisterSAXEvent(). After all necessary metaobjects have been passed, the CFGConstructor object holds a set of arraylists of objects: EMessageList, EFragmentList, EOperandList, EOperationList, ECallEventList, EMessageEventList, ELifelineList, EObjectList, and EClassList. These arraylists store metaobjects of types EMessage, EFragment, EOperand, EOperation, ECallEvent, EMessageEvent, ELifeline, EObject, and EClass, respectively. To keep track of the hierarchical structure of fragments as well as the operands of a fragment, we use two arraylists: FragmentIDList and OperandIDList. Note that FragmentIDList and OperandIDList keep the IDs of the fragments and operands whose processing is not yet complete. To update these two arraylists, we add the ID to the corresponding arraylist within startElement() when the SAX parser notifies the opening of a fragment/operand tag. Similarly, we remove the ID of the fragment/operand from the corresponding list within endElement() when the SAX parser notifies the closing of the fragment/operand tag. We also use flags such as OperationParameterFlag and GuardFlag to track the end of processing of the parameters of the most recent operation and of the guard associated with a fragment operand.
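The bookkeeping with the two ID lists can be sketched as below, again in Python with xml.sax and not as the tool's own code. The sketch assumes that combined fragments are "fragment" elements whose xmi:type is "uml:CombinedFragment" and that nested fragments appear inside "operand" elements. The dictionaries parent_fragment and owner_fragment and the private stack _open are hypothetical helpers added for the illustration.

```python
import xml.sax


class NestingHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.fragment_id_list = []   # IDs of fragments still being processed
        self.operand_id_list = []    # IDs of operands still being processed
        self.parent_fragment = {}    # fragment ID -> enclosing fragment ID (or None)
        self.owner_fragment = {}     # operand ID -> fragment ID that owns it
        self._open = []              # per open element: the list it pushed to, or None

    def _innermost(self):
        return self.fragment_id_list[-1] if self.fragment_id_list else None

    def startElement(self, name, attrs):
        pushed = None
        if name == "fragment" and attrs.get("xmi:type") == "uml:CombinedFragment":
            fid = attrs.get("xmi:id")
            self.parent_fragment[fid] = self._innermost()
            self.fragment_id_list.append(fid)
            pushed = self.fragment_id_list
        elif name == "operand":
            oid = attrs.get("xmi:id")
            self.owner_fragment[oid] = self._innermost()
            self.operand_id_list.append(oid)
            pushed = self.operand_id_list
        self._open.append(pushed)

    def endElement(self, name):
        # endElement() receives no attributes, so the stack remembers what to undo.
        pushed = self._open.pop()
        if pushed is not None:
            pushed.pop()


def fragment_hierarchy(xml_bytes):
    handler = NestingHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler


# Hypothetical sample: fragment f2 is nested in operand o1 of fragment f1.
NESTED_XMI = b"""<root xmlns:xmi="http://example.org/xmi">
  <fragment xmi:type="uml:CombinedFragment" xmi:id="f1">
    <operand xmi:id="o1">
      <fragment xmi:type="uml:MessageOccurrenceSpecification" xmi:id="x1"/>
      <fragment xmi:type="uml:CombinedFragment" xmi:id="f2">
        <operand xmi:id="o2"/>
      </fragment>
    </operand>
  </fragment>
</root>"""
```

At any point during parsing, the last entry of fragment_id_list is the innermost fragment that is still open, which is what allows the nesting hierarchy to be recorded in a single pass.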
The conversion logic is encapsulated in the methods ComputeMessageSet(), ComputeFragmentSet(), DetermineFragmentStructure(), DetermineCFGNodes(), ComputePrecedenceRelations(), DetermineCFGEdges(), DetermineEdgeLabel(), and ApplyControlFlowRules() of the CFGConstructor class. The two methods ComputeMessageSet() and ComputeFragmentSet() perform the first step of our conversion approach, that is, computing the message set and the fragment set. The two arraylists EMessageList and EFragmentList are used for this purpose. DetermineCFGNodes() performs the second step of the conversion. Two Node objects for each fragment (specifying the start and end of the fragment) and one Node object for each message are instantiated. Node information (i.e., the values of the instance variables of a Node object) is obtained from the arraylists EMessageList and EFragmentList. Once a Node object is instantiated, it is added to the arraylist named CFGNodeList. As per our conversion approach, DetermineFragmentStructure() first identifies the minimum number of outermost fragments that can replace the largest subset of the message set of the interaction diagram and replaces that subset by the IDs of the outermost fragments. After that, for the message set of each fragment, a set of inner fragment IDs is identified, and the corresponding subset of the message set of the fragment is replaced by the IDs of the inner fragments (see details in the third step of our conversion procedure). ComputePrecedenceRelations() computes the precedence relations using the SeqNo of each interaction (message/fragment) in the message set of the interaction diagram. If a fragment is found as an operand in some precedence relation, then the precedence relations among the messages of that fragment also need to be computed. Based on the type of the operands in a precedence relation, DetermineCFGEdges() determines the edges between message and message, message and fragment start, fragment end and message, and fragment end and fragment start. In addition, DetermineCFGEdges() also finds the edges corresponding to null precedence relations. These edges are between fragment start and message, fragment start and fragment start, message and fragment end, and fragment end and fragment end. For each edge, we identify a pair of Node objects which correspond to the end nodes of the edge.
XMI2CFG supports different menu options for selecting an XMI file, displaying the XMI file, parsing the selected XMI file, starting the conversion, and displaying the control flow graph in both DOT language and image form. One typical usage scenario of XMI2CFG is depicted in Figure 9. In this usage scenario, we first select the XMI file and display it in the upper left panel. We then select the option to start parsing and subsequently construct the control flow graph. For this, we invoke the CFGConstructionUnit component. Once the control flow graph construction is complete, we visualize it in both DOT language format and image form in the lower left and right panels, respectively.
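The DOT output mentioned above can be sketched with a few lines of Python. This is an illustration and not the output format of XMI2CFG itself; the function name to_dot and the input shapes are assumptions. Fragment nodes are drawn filled and message nodes unfilled, following the convention used in the figures.

```python
def to_dot(nodes, edges):
    """nodes: (name, is_fragment) pairs; edges: (source, target, label) triples."""
    lines = ["digraph CFG {"]
    for name, is_fragment in nodes:
        style = "filled" if is_fragment else "solid"
        lines.append(f'  "{name}" [style={style}];')
    for src, dst, label in edges:
        text = label.replace('"', '\\"')  # guards may contain quoted strings
        attr = f' [label="{text}"]' if label else ""
        lines.append(f'  "{src}" -> "{dst}"{attr};')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical opt fragment with one message and its bypass edge.
dot_text = to_dot(
    [("optS", True), ("m1", False), ("optE", True)],
    [("optS", "m1", "c"), ("m1", "optE", ""), ("optS", "optE", "!c")],
)
```

The resulting text can be passed to the Graphviz dot program to obtain the image form.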

Comparison with Related Work
Control flow analysis has been investigated widely in the context of program analysis and compiler design [26].
is processed, if automatically. We offer a way around this hindrance. The prototype tool XMI2CFG based on the proposed approach has been tested with a large number of interaction diagrams from real-life applications. A thorough investigation of these applications substantiates the correctness of the proposed approach. Indeed, the proposed approach bridges the gap between the theory and practice of converting (the XMI representation of) a UML 2.x interaction diagram into a control flow graph. Moreover, the methodology can be extended to other UML diagrams such as the activity diagram and the interaction overview diagram with a minor enhancement in each case.
Figure 1(b) depicts the corresponding graph representation for the interaction diagram containing an alt fragment. Note that the four nodes $m_1$, $m_2$, $m_3$, and $m_4$ represent the four messages m1, m2, m3, and m4, respectively, and the two nodes $alt_1^S$ and $alt_1^E$ correspond to the start and end of the alt fragment, respectively. The two alternative flows are modeled by two outgoing edges from $alt_1^S$ and two incoming edges at the node $alt_1^E$ in the graph representation. Note that this graph representation contains the same information as the interaction diagram. In this case, control flow is explicitly modeled for the alt fragment in the interaction diagram, and hence the control flow graph is the same as the graph representation (see Figure 1(b)).
Figure 1(j) shows the graph representation for the interaction diagram in Figure 1(i). The control flow for the loop fragment is explicitly modeled in the control flow graph by a back edge from the node $loop_1^E$ to the node $loop_1^S$ and another loop exit edge from $loop_1^E$ to $m_3$ (see Figure 1(k)).

Figure 1: The control primitives for the fragments alt, opt, break, and loop of interaction diagram.

Figure 2: An example interaction diagram and its graph representation. (a) An interaction diagram. (b) Graph representation for Figure 2(a).

Figure 4: Block diagram of conversion procedure from XMI of interaction diagram to control flow graph.

Figure 5: Class diagram for metadata of interaction diagram in XMI.
SenderObject, SenderClass, ReceiverObject, ReceiverClass, SeqNumber, ParameterList, and ReturnVar of the object aMessage (of EMessage class) corresponding to a message from the data stored in other objects. To do this, it is necessary to find the relationships among the classes of these objects. For this, we consider the inherent structure of the XMI representation as well as the metamodel of the interaction diagram as given in the UML superstructure specification [1]. The class diagram representing the relationships among the classes used to store tagged elements is shown in Figure 5.

(d) Finding Message Sets of Fragments. To determine the set of messages of a fragment $f_i \in F_{seq}$, we identify the aFragment object (corresponding to the fragment $f_i$) from the list of EFragment objects and then the set of EOperand objects $S^{f_i}_{EOprds}$ such that an object $aOperand \in S^{f_i}_{EOprds}$ is associated with the aFragment object, that is, aFragment.OperandList contains aOperand.Id. In other words,

$$S^{f_i}_{EOprds} = \{\, aOperand \mid aOperand \text{ exists in the list of EOperand objects} \;\&\; aFragment.OperandList \text{ contains } aOperand.Id \;\&\; f_i \rightarrow aFragment \,\}. \tag{2}$$

We then find the set of EMessage objects $S^{f_i}_{EMsgs}$ for the fragment $f_i$ such that $aMessage \in S^{f_i}_{EMsgs}$ is associated with an object $aOperand \in S^{f_i}_{EOprds}$, that is, aOperand.MessageList contains aMessage.Id. In other words,

$$S^{f_i}_{EMsgs} = \{\, aMessage \mid aMessage \text{ exists in the list of EMessage objects} \;\&\; aOperand.MessageList \text{ contains } aMessage.Id \;\&\; aOperand \in S^{f_i}_{EOprds} \,\}. \tag{3}$$

$S^{f_i}_{EMsgs}$ would contain an object aFragment in the list of EFragment objects and therefore, for each aFragment (corresponding to a fragment $f_j$) in $S$ set of EMessage objects for the fragment $f_j$. Once the update of $S^{f_i}_{EMsgs}$ is complete, we obtain the set of messages $M_{f_i}$ of the fragment $f_i$ from $S^{f_i}_{EMsgs}$ as $M_{f_i} = \{\, aMessage.Id \mid aMessage \in S^{f_i}_{EMsgs} \,\}$.
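The computation of the message set of a fragment can be sketched in Python as follows. The sketch assumes, as one reading of the step above, that when an operand lists a nested fragment, that fragment is replaced by its own message set. The two dictionaries standing in for the EFragment and EOperand lists are hypothetical simplifications.

```python
def message_set(fragment_id, operand_lists, member_lists):
    """Return the message IDs of a fragment, expanding nested fragments.

    operand_lists: fragment ID -> operand IDs (stands in for aFragment.OperandList)
    member_lists:  operand ID -> message or nested fragment IDs
                   (stands in for aOperand.MessageList)
    """
    result = []
    for operand_id in operand_lists[fragment_id]:
        for member in member_lists[operand_id]:
            if member in operand_lists:
                # The member is itself a fragment f_j: substitute its message set.
                result.extend(message_set(member, operand_lists, member_lists))
            else:
                result.append(member)
    return result


# Hypothetical structure: fragment f1 has two operands; f2 is nested in operand o1.
operand_lists = {"f1": ["o1", "o2"], "f2": ["o3"]}
member_lists = {"o1": ["m1", "f2"], "o2": ["m4"], "o3": ["m2", "m3"]}
```

Here message_set("f1", operand_lists, member_lists) yields m1, m2, m3, m4, and the inner fragment f2 on its own yields m2, m3.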

Figure 6: Interaction diagram of Generate Bill Use Case.
(b) We then identify the set of messages for a fragment $f_i \in F_{seq}$ by considering the EFragment object corresponding to $f_i$ as well as the set of EOperand objects associated with that EFragment object.
Guard conditions appearing in the figure: Status = "Delivered", aOrder = Null, i < OrderList.size() & found = false, found = true, aOrder != Null, Status !=

Figure 7: Graph representation and control flow graph of interaction diagram for Generate Bill use case.
1 [23]. The input of XMI2CFG is the XMI representation of a UML 2.x interaction diagram. We have used MagicDraw 16.0 [21] to draw the interaction diagram and subsequently exported this diagram in the form of an XMI representation. XMI2CFG visualizes the control flow graph as its output. XMI2CFG consists of two main components: CFGConstructionUnit and VisualizationUnit. CFGConstructionUnit first parses the XMI of the interaction diagram and then converts it into a control flow graph. Taking the control flow graph as input, VisualizationUnit converts it into the DOT language format [24] and produces an image to visualize the control flow graph. The two components, CFGConstructionUnit and VisualizationUnit, are described below.
CFGConstructionUnit. This component parses the XMI representation of the interaction diagram using the SAX parser. The class diagram of this component is shown in Figure 8. This component comprises two main classes, MyParser and CFGConstructor, and the auxiliary classes EMessage, ELifeline, EClass, ECallEvent, EMessageEvent, EObject, Fragment, EOperand, EOperation, EFragment, Node, and Edge.

Figure 9: Screen shot of XMI2CFG showing conversion of XMI representation of interaction diagram into control flow graph.
Each message/fragment is identified by a unique identification number.
(iii) B is the start or end boundary of a fragment. The value of B is set as S and E for the start and end boundary of the fragment, respectively. It is null for a message node.
(iv) RO is a reference to the receiver object of the message corresponding to a message node. It has the value null for a fragment node.
(v) M is the method that gets executed when RO receives the message corresponding to a message node, and is null for a fragment node.
(vi) PR is the set of parameters of the method M, and is null for a fragment node.
(vii) RVar is a return variable that keeps the value returned by the method M. It is null for a fragment node.
Interaction. An interaction $I$ is a set of zero or more message occurrences in an interaction diagram [1]. That is, $I = \{m_1, m_2, \ldots, m_n \mid n \geq 0\}$, where $m_1, m_2, \ldots, m_n$ are the $n$ messages in the interaction diagram. If $n = 0$, then $I$ is referred to as a null interaction, and if $n = 1$, then $I$ is referred to as a message. On the other hand, if $n > 1$, then $I$ is referred to as a set of messages of a fragment.
Precedence Relation. For any two interactions $I_x$ and $I_y$ in a set of interactions $M_I$, we say there is a precedence relation $I_x \prec I_y$ if $I_x \in M_I$ occurs immediately before $I_y \in M_I$ according to the timing order in the interaction diagram. It implies that if there exists a precedence relation between $I_x$ and $I_y$ in $M_I$, then there is no $I_z \in M_I$ such that $I_x \prec I_z$ and $I_z \prec I_y$. This relation satisfies the following properties.
(a) If $I_x \prec I_y$, then $I_y \nprec I_x$, where $I_x, I_y \in M_I$ (asymmetric).
(b) If $I_x \prec I_y$ and $I_y \prec I_z$, then $I_x \nprec I_z$, where $I_x, I_y, I_z \in M_I$ (non-transitive).
(c) $I_x \nprec I_x$, where $I_x \in M_I$ (non-reflexive).
Null Precedence Relation. Let $P(M_I)$ be a set of precedence relations on $M_I$. Note that if $M_I$ is the set of messages of a fragment $f_i$ and an operand of the fragment $f_i$ contains a single message $m'$, then there is no precedence relation containing the message $m'$ in $P(M_I)$. To ensure the existence of some precedence relation in $P(M_I)$ that contains $m'$ as an operand, a null interaction may be assumed to have occurred before or after $m'$. The precedence relation between $m'$ and the null interaction is called a null precedence relation, which is defined below. A precedence relation $(I_x \prec I_y)$ of $P(M_I)$ is called a null precedence relation if either $I_x$ or $I_y$ is a null interaction.
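The precedence relation and its null variant can be sketched in Python as follows. The input is assumed to be a list of interaction IDs already sorted by sequence number, and the null interaction is modeled by None. Emitting both a leading and a trailing null relation for a lone message is a choice made for this sketch; the definition only requires that one of them exists.

```python
NULL = None  # stands for the null interaction


def precedence_relations(interactions):
    """Return pairs (I_x, I_y) meaning that I_x occurs immediately before I_y."""
    if not interactions:
        return []
    if len(interactions) == 1:
        only = interactions[0]
        # A lone message gets null precedence relations on both sides (assumption).
        return [(NULL, only), (only, NULL)]
    return list(zip(interactions, interactions[1:]))


def is_null_relation(relation):
    return NULL in relation


# Hypothetical message set in which fragment f1 has replaced its messages.
pairs = precedence_relations(["m1", "f1", "m5"])
single = precedence_relations(["m4"])
```

Only immediate successors are paired, so the relation is asymmetric and non-transitive by construction: the pair (m1, m5) never appears when f1 lies between the two.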

Table 1: Association of nodes in Figure 2(b) with tagged elements in XMI representation of Figure 3.

Table 2: Classes to store information of elements in XMI representation.
(a) sMEvent.type = "sendEvent" & sMEvent.Id = aMessage.SendEventId.
(b) rMEvent.type = "receiveEvent" & rMEvent.Id = aMessage.ReceiveEventId.
(c) (aCallEvent.Id = rMEvent.CallEventId & aOperation.Id = aCallEvent.OperationId) OR (aCallEvent.Id = sMEvent.CallEventId & aMessage.messageType = "reply").
These send and receive events (represented by sMEvent and rMEvent) must occur at two lifelines (represented by two objects, say sLifeline and rLifeline, in the list of ELifeline objects). These two lifelines (sLifeline and rLifeline) are associated with two objects sObject and rObject in the list of EObject objects, respectively. All these imply the satisfiability of the following conditions.
(d) sLifeline.Id = sMEvent.LifelineId & rLifeline.Id = rMEvent.LifelineId.
(e) sObject.Id = sLifeline.ObjectId & rObject.Id = rLifeline.ObjectId.
(f) sClass.Id = sObject.ClassId & rClass.Id = rObject.ClassId.
aMessage.SenderObject = sObject.Name.
aMessage.ReceiverObject = rObject.Name.
aMessage.SenderClass = sClass.Name.
aMessage.ReceiverClass = rClass.Name.
aMessage.SeqNumber = aCallEvent.SeqNo.
aMessage.parameterList = aOperation.parameterList.
Next, we find the return variable associated with the message, say $m_a$, corresponding to the object aMessage. Note that the return variable of $m_a$ is represented as a reply message and is specified immediately after $m_a$ in the XMI representation. For this, we identify an object bMessage from the list of EMessage objects such that the following conditions hold.
(i) bCallEvent.SeqNo - aCallEvent.SeqNo = 1.
(j) bMessage.messageType = "reply".
(k) aMessage.messageType ≠ "reply".
Here, bCallEvent refers to the call event for the message corresponding to bMessage. If such a bMessage object exists for aMessage, then we set ReturnVar of the aMessage object to the return value stored in argumentValueList[0] of bMessage (because a reply message cannot have more than one return value). In other words,

$$aMessage.ReturnVar = bMessage.argumentValueList[0] \tag{1}$$

Once the value of aMessage.ReturnVar is synthesized from the bMessage object, bMessage becomes redundant and should be removed from the list of EMessage objects.
(c) Finding Message Sets and Set of Fragments. After synthesizing the values of the attributes of each object aMessage in the list of EMessage objects, we need to find the set of messages $M_{seq}$ and the set of reply messages $R_{seq}$ (using the value of the attribute "messageType") from that list. A set of fragments $F_{seq}$ is also to be obtained from the list of EFragment objects. $M_{seq}$, $R_{seq}$, and $F_{seq}$ can be determined as follows.
$M_{seq} = \{\, aMessage.Id \mid aMessage \text{ exists in the list of EMessage objects} \,\}$.
$R_{seq} = \{\, aMessage.Id \mid aMessage.Id \in M_{seq} \;\&\; aMessage.messageType = \text{"reply"} \,\}$.
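The lookups that fill in the sender and receiver attributes of a message, and the folding of a reply message into ReturnVar, can be sketched in Python as follows. The sketch uses plain dictionaries keyed by Id in place of the lists of E-class objects; that layout and the function names are assumptions made for the illustration.

```python
def resolve_endpoints(msg, events, lifelines, objects, classes):
    """Follow event -> lifeline -> object -> class links for both message ends."""
    s_event = events[msg["SendEventId"]]
    r_event = events[msg["ReceiveEventId"]]
    s_object = objects[lifelines[s_event["LifelineId"]]["ObjectId"]]
    r_object = objects[lifelines[r_event["LifelineId"]]["ObjectId"]]
    return {
        **msg,
        "SenderObject": s_object["Name"],
        "ReceiverObject": r_object["Name"],
        "SenderClass": classes[s_object["ClassId"]]["Name"],
        "ReceiverClass": classes[r_object["ClassId"]]["Name"],
    }


def fold_replies(messages):
    """messages: dicts ordered by SeqNumber. A reply that immediately follows a
    non-reply message supplies that message's ReturnVar and is then dropped."""
    result = []
    for msg in messages:
        prev = result[-1] if result else None
        if (prev is not None
                and msg["messageType"] == "reply"
                and prev["messageType"] != "reply"
                and msg["SeqNumber"] - prev["SeqNumber"] == 1
                and msg["argumentValueList"]):
            prev["ReturnVar"] = msg["argumentValueList"][0]
        else:
            result.append(dict(msg))  # copy so the input list is not modified
    return result


# Hypothetical data for one call from aManager to aOrder followed by its reply.
events = {"se1": {"LifelineId": "l1"}, "re1": {"LifelineId": "l2"}}
lifelines = {"l1": {"ObjectId": "o1"}, "l2": {"ObjectId": "o2"}}
objects = {"o1": {"Name": "aManager", "ClassId": "c1"},
           "o2": {"Name": "aOrder", "ClassId": "c2"}}
classes = {"c1": {"Name": "Manager"}, "c2": {"Name": "Order"}}
call = {"Id": "m1", "SendEventId": "se1", "ReceiveEventId": "re1",
        "messageType": "synchCall", "SeqNumber": 1, "argumentValueList": []}
reply = {"Id": "m2", "messageType": "reply", "SeqNumber": 2,
         "argumentValueList": ["Status"]}
later = {"Id": "m3", "messageType": "synchCall", "SeqNumber": 3,
         "argumentValueList": []}
resolved = resolve_endpoints(call, events, lifelines, objects, classes)
folded = fold_replies([call, reply, later])
```

Replies that do not directly follow a call are kept in the list, so a set of remaining reply messages can still be collected afterwards.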
Initially, the control flow graph $CFG\langle N, E \rangle$ is empty. That is, the set of nodes $N$ and the set of edges $E$ are both null. In this step, we determine the message nodes and fragment nodes (i.e., $N$) of the CFG. For this, we use the set of messages $M_{seq}$ and the set of fragments $F_{seq}$ as obtained in Step 1.
(a) For each fragment $f_i \in F_{seq}$, we add two fragment nodes into $N$, which represent the start and end of the fragment $f_i$, respectively. For each fragment node, we consider (i) the values of the instance variables type (T) and ID of the EFragment object corresponding to the fragment $f_i$ and (ii) the boundary (B) of the fragment $f_i$ (i.e., the start boundary or end boundary represented by the fragment node). The value of the attribute B for the start boundary and end boundary is S and E, respectively. For a fragment node, the values of the attributes RO, M, PR, and RVar of the corresponding tuple $\langle T, ID, B, RO, M, PR, RVar \rangle$ are null.
(b) For each $m_i \in M_{seq}$, we add a message node $m_i$ into $N$. To obtain the values of the associated attributes T, ID, RO, M, PR, and RVar of the tuple for the message node $m_i$, we consider the values of the instance variables messageType (T), Id, ReceiverObject (RO), MethodName (M), ParameterList and ArgumentValueList (set of parameters and their values, PR), and ReturnVar (RVar) (if any) of the EMessage object corresponding to the message $m_i$. For a message node, the value of the attribute B is null.
(c) For each node in $N$ of $CFG\langle N, E \rangle$, we store the corresponding tuple $\langle T, ID, B, RO, M, PR, RVar \rangle$ in a table, called the Node table. The entries corresponding to message nodes in the Node table follow the order of the sequence numbers (SeqNumber) of the corresponding EMessage objects.

Table 3: Set of rules.