Frequent Statement and Dereference Elimination for Imperative and Object-Oriented Distributed Programs

This paper introduces new approaches for the analysis of frequent statement and dereference elimination for imperative and object-oriented distributed programs running on parallel machines equipped with hierarchical memories. The paper uses languages whose address spaces are globally partitioned. Distributed programs allow defining data layout and allow threads to write to and read from the memories of other threads. Three type systems (for imperative distributed programs) are the tools of the proposed techniques. The first type system defines for every program point a set of calculated (ready) statements and memory accesses. The second type system uses an enriched version of the types of the first type system and determines which of the ready statements and memory accesses are used later in the program. The third type system uses the information gathered so far to eliminate unnecessary statement computations and memory accesses (the analysis of frequent statement and dereference elimination). Extensions to these type systems are also presented to cover object-oriented distributed programs. Two advantages of our work over related work are the following. The memory model used in this paper is similar to the hierarchical style of concurrent parallel computers. In our approach, each analysis result is assigned a type derivation, which serves as a correctness proof.


Introduction
Distributed programming is about building software in which concurrent processes cooperate to achieve some task. For a problem specification, the type, number, and way of interaction of the processes needed to solve the problem are decided beforehand. A supercomputer can then be computationally simulated by a group of workstations that carry out the different processes. A group of supercomputers can in turn be combined to provide computing power greater than that provided by any single machine. This enormous computing power provided by distributed systems is why the distributed programming style [1][2][3] is important and attractive. Examples of distributed programming languages (DPLs) that target machines with multicore processors and use the partitioned global address space model include Unified Parallel C (UPC), Chapel, Titanium (which is based on Java), and X10.
Among the advantages of object-oriented programming (OOP) is that it combines other styles such as imperative, functional, and relational programming. The concepts of class, procedure, and inheritance are basic to OOP. These concepts result in dynamic behavior in various implementations of object-oriented programming languages.
Recomputing a nontrivial statement and reaccessing a memory location are a waste of time and power if the value of the statement and the content of the location have not changed. The purpose of frequent statement and dereference elimination analysis is to save this wasted power and time. This is an interesting analysis because it involves connecting statement and dereference calculations to program points where the calculated values may be reused. The analysis also requires changing the program points at the ends of these connections. Such changes have to be made carefully so that they do not destroy compositionality. Our approach to this analysis is based on type systems [4,5].
[Figure 1: A motivating example: the original program (left) and its optimized version (right).]
For different programming languages, in previous work [4,5], we have shown that the type-systems style is an adaptable approach for achieving many static analyses. This paper shows that this style is also useful for the involved and important problem of frequent statement and dereference elimination of imperative and object-oriented distributed programs.
This paper introduces new techniques for frequent statement and dereference elimination for imperative and object-oriented distributed programs running on hierarchical memories. Simply structured type systems are the main tools of this paper's techniques, which are presented using the languages ℎ of Figure 2 and OODP of Figure 3. These languages are equipped with basic commands for distributed execution of programs and for pointer manipulations. The single program multiple data (SPMD) model is the execution model used in this paper: the same program runs on different data on different machines. The analysis of frequent statement and dereference elimination for distributed programs is achieved in three steps, each of which is done using a type system. The first step achieves ready statement and memory access analysis. The second step deals with semiexpectation analysis and builds on the type system of the first step. The third step takes care of the analysis of frequent statement and dereference elimination and builds on the type system of the second step. The paper also illustrates how these type systems can be generalized to cover object-oriented distributed languages. This paper is an extended and revised version of [6], which treats imperative distributed programs. The work of [6] is generalized in Section 5 of the current paper to cover object-oriented distributed programs. The soundness theorems of the current paper are stated using the memory model and operational semantics in the appendix of [6].
Motivation. The left-hand side of Figure 1 presents a motivating example of our work. We note that lines 4 and 6 dereference *(a * b), which has already been dereferenced in line 2, with no changes to the values of a and b on the path from 2 to 6. This is a waste of computational power and time (accessing secondary storage). One objective of the research in this paper is to avoid such waste by transforming the program into that on the right-hand side of the figure. This is not all; we need to do that in a way that provides a correctness proof for each transformation. We adopt a style (type systems) that provides these proofs (type derivations).
Contributions. Contributions of this paper are new techniques, in the form of type systems, for achieving the following analyses for imperative and object-oriented distributed programs.
(1) The analysis of ready statement and memory access.
(2) The analysis of semiexpectation.
(3) The analysis of frequent statement and dereference elimination.
Organization. The rest of the paper is organized as follows. Section 2 presents the type system achieving the analysis of ready statement and memory access for imperative distributed programs. The analysis of semiexpectation as an enrichment of the type system presented in Section 2 is outlined in Section 3. The main type system carrying the analysis of frequent statement and dereference elimination is contained in Section 4. Type systems of Sections 2, 3, and 4 are generalized in Section 5 to cover object-oriented distributed programs. Related and future works are discussed in Section 6.

Ready Statement and Memory Access Analysis of ℎ
If the value of a statement and the content of a memory location have not changed, then the compiler should not recompute the statement or reaccess the location. The purpose of frequent statement and dereference elimination is to save the power and time wasted in these repeated computations. This is not a trivial task; compared to other program analyses, it is somewhat complex, and it is done in stages. The first stage is to analyze the given program to recognize ready statements and memory locations. The analysis of ready statements and memory locations calculates for every program point the set of statements and memory locations that are ready at that point in the sense of Definition 1. This section presents a type system (the ready type system) to achieve this analysis for imperative distributed programs.
Figure 2 (syntax of the language ℎ):
Program ::= Defs : S.
S ∈ Stmts ::= n | true | false | x | S1 iop S2 | S1 bop S2 | *S | skip | name | x := S | S1 ← S2 | S1; S2 | if S then St else Sf | while S do St | λx · S | S1 S2 | letrec x = S in S | new l | convert(S, n) | transmit S1 from S2.
Defs ::= (name = S); Defs | ε.
Definition 1. (2) At a program point p, a memory location l is ready if each computational path to p (a) reads l at some point (say p′) and (b) does not modify the content of l between p′ and p.
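To make the syntax above more concrete, the following minimal sketch encodes a small fragment of it as Python data classes and collects the nontrivial substatements of a statement. The class names (`Num`, `Var`, `Op`, `Deref`, `Assign`) and the helper `nontrivial` are illustrative assumptions and are not part of the formal development; only arithmetic, dereference, and direct assignment are covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Op:
    symbol: str      # an arithmetic or Boolean operator such as "+" or "*"
    left: object
    right: object


@dataclass(frozen=True)
class Deref:
    target: object   # the statement whose value is dereferenced


@dataclass(frozen=True)
class Assign:
    var: str
    rhs: object


def nontrivial(s):
    """Return the set of nontrivial substatements of s.

    Assumption of this sketch: operator applications and dereferences
    are nontrivial; numbers and variables are trivial.
    """
    if isinstance(s, Op):
        return {s} | nontrivial(s.left) | nontrivial(s.right)
    if isinstance(s, Deref):
        return {s} | nontrivial(s.target)
    if isinstance(s, Assign):
        return nontrivial(s.rhs)
    return set()


# x := *(a * b) + 2
product = Op("*", Var("a"), Var("b"))
example = Assign("x", Op("+", Deref(product), Num(2)))
print(len(nontrivial(example)))
```

The frozen data classes are hashable, so statements can be stored directly in the sets that the analyses of the following sections manipulate.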
The ready analysis is a forward analysis that takes as input a set of statements and memory locations (the ready set of the first program point). It is sensible to let this set be the empty set. A type of our ready type system pairs a points-to type with a ready set, that is, a set of statements and global addresses. The set of global addresses is defined precisely in the appendix of [6]. Points-to types typically have the form of maps from the union of variables and global addresses to the power set of global addresses [4,7].
The subtyping relation has the form ≤ × ⊇, where ≤ is the order relation on the points-to types and ⊇ is the order relation on ready sets. A state on an execution path is of type rs, where rs is a ready set, if all elements of rs are ready at this state according to Definition 1. A judgment of the ready type system relates a statement S, a points-to type, and a ready pretype to a set of addresses, a points-to type, and a ready posttype. The two points-to types describe the states before and after executing S, and the set of addresses is the set of addresses that S may evaluate to. We assume that all such pointer information is given along with the statement S. Techniques like [4,7] are available to compute the pointer information. For a given statement S along with pointer information and a ready pretype rs, we present a type system to calculate a ready posttype rs′ for S. The type derivation of this typing process is a proof for the correctness of the ready information. The meaning of the judgment is that if the elements of rs are ready before executing S, then the elements of rs′ are ready after executing S. The inference rules of the ready type system are presented in Algorithm 1. Some comments on the inference rules are in order. We note that numbers, variables, and the allocating statement (new) do not affect the ready pretype. In line with the corresponding semantic rules of [6], nontrivial arithmetic and Boolean statements and their nontrivial substatements are made ready. The direct assignment rule (:=) expresses that after executing the assignment x := S the substatements of the right-hand side become ready and that all statements involving x become unready, as the value of x may have changed. The rule (*) reflects the fact that the statement *S becomes ready after executing the dereference. Moreover, if S evaluates to a single address according to the underlying pointer analysis, then this address becomes ready as well.
However, if S evaluates to more than one address, then we cannot tell which of these addresses is the one concerned and hence cannot conclude any readiness information about addresses. The rule (←) adds the substatements of S1 and S2 to the ready pretype. Since the content of the address referenced by S1 may have changed after executing the statement, all statements involving dereferencing this address are removed from the set of ready items. The remaining rules are self-explanatory. The Boolean constants true and false have inference rules similar to that of n.
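The behaviour of the rules just discussed can be sketched as a forward transfer function on ready sets. This is only an illustration under stated assumptions, not the rules of Algorithm 1: statements are encoded as nested tuples, the dictionary `pts` stands in for the externally supplied pointer information (it maps a statement to the set of addresses it may evaluate to), and the names `subs`, `mentions`, `derefs_any`, and `ready_after` are hypothetical.

```python
def subs(s):
    """Nontrivial substatements (operator and dereference nodes) of s."""
    kind = s[0]
    if kind == "op":        # ("op", symbol, left, right)
        return {s} | subs(s[2]) | subs(s[3])
    if kind == "deref":     # ("deref", target)
        return {s} | subs(s[1])
    return set()            # numbers, variables, and addresses are trivial


def mentions(s, x):
    """True if the variable x occurs in s."""
    if s[0] == "var":
        return s[1] == x
    return any(mentions(c, x) for c in s[1:] if isinstance(c, tuple))


def derefs_any(s, addrs, pts):
    """True if s dereferences a statement that may evaluate to an address in addrs."""
    return any(
        t[0] == "deref" and pts.get(t[1], frozenset()) & addrs
        for t in subs(s)
    )


def ready_after(stmt, rs, pts):
    """Ready set after executing stmt, given the ready set rs before it."""
    kind = stmt[0]
    if kind == "seq":       # ("seq", s1, s2, ...)
        for s in stmt[1:]:
            rs = ready_after(s, rs, pts)
        return frozenset(rs)
    if kind == "assign":    # ("assign", x, rhs): substatements of rhs become
        _, x, rhs = stmt    # ready, then everything involving x is dropped
        rs = ready_after(rhs, rs, pts)
        return frozenset(s for s in rs if not mentions(s, x))
    if kind == "store":     # ("store", target, value), i.e. target <- value
        _, target, value = stmt
        rs = ready_after(value, ready_after(target, rs, pts), pts)
        hit = pts.get(target, frozenset())
        killed = {("addr", a) for a in hit}
        return frozenset(
            s for s in rs if s not in killed and not derefs_any(s, hit, pts)
        )
    # Plain expression: its nontrivial substatements become ready.
    out = set(rs) | subs(stmt)
    for t in subs(stmt):
        if t[0] == "deref":
            addrs = pts.get(t[1], frozenset())
            if len(addrs) == 1:   # a single possible address becomes ready too
                out.add(("addr", next(iter(addrs))))
    return frozenset(out)
```

For example, after `x := a * b` the statement `a * b` is in the result, and a subsequent `a := 1` removes it again, mirroring the informal reading of the rule (:=) above.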
All in all, the information provided by type derivations obtained using this and the following type system is of two sorts. The first sort concerns the program point at which a particular statement becomes ready. The second sort concerns the program point at which a ready statement can be replaced with its precomputed value. Now we recall the assumption that our distributed system consists of a finite set of machines. For a given statement and a given machine, the type system of Algorithm 1 calculates ready information for each program point of the statement. The rule (main-rs) supposes a suitable notion for the join of pointer types. The soundness of the ready type system is stated as follows.

Semiexpectation Analysis of ℎ
The aim of frequent statement elimination is to introduce new variables to hold the values of frequent statements and to reuse these values rather than recomputing the statements. Analogously, the aim of frequent dereference elimination is to introduce new variables to hold the values of frequent dereferences and to reuse these values rather than reaccessing the memory. The information gathered so far by the ready type system introduced in the previous section is not enough to achieve frequent statement and dereference elimination. We need to enrich the ready information, assigned to each program point, with new information called semiexpectable information.
Definition 3. (2) At a program point p, a memory location l is semiexpectable if each computational path to p (a) reads l at some point (say p′) where l is ready at p′, and (b) does not read l between p′ and p.
The semiexpectation analysis is a backward analysis that takes as an input a set of statements and memory locations (the semiexpectable set of the last program point). It is sensible to let this set be the empty set. The following example gives an intuition for the previous definition: if (⋅ ⋅ ⋅ ) , then := + else := * ; := ( + ) * .
Neither the sum nor the product computed in the branches of the if statement is ready after the if statement, because neither is computed in all branches. Hence it is not valid to replace these statements with variables in order to optimize the last statement of the example. The job of the type system presented in this section is to provide us with this sort of information. More precisely, as the two statements are not ready after the if statement, the second statement of the example does not make them semiexpectable.
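The reasoning about the conditional can be sketched in a few lines. The sketch assumes that an item is ready after a conditional only if it is ready at the end of both branches, which is how we read the requirement that every computational path computes it. The function names and the statement strings `"y + z"`, `"y * z"`, and `"u - v"` are hypothetical, and kills caused by the assignments themselves are ignored for brevity.

```python
def ready_after_branch(rs_before, computed):
    """Ready set at the end of one branch that computes the given statements."""
    return frozenset(rs_before) | frozenset(computed)


def ready_after_if(rs_before, computed_then, computed_else):
    """Ready set after the conditional: only items ready on both branches survive."""
    then_rs = ready_after_branch(rs_before, computed_then)
    else_rs = ready_after_branch(rs_before, computed_else)
    return then_rs & else_rs


# One branch computes a sum, the other a product: neither is ready afterwards.
print(ready_after_if(frozenset(), {"y + z"}, {"y * z"}))
```

Because the result is empty, a later statement that uses the sum or the product cannot reuse a stored value, and a backward pass has no ready occurrence to record as semiexpectable.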
The semiexpectation analysis assigns to each program point the set of items that are semiexpectable. The analysis is based on the readiness analysis and is backward. A type of the semiexpectation type system extends a type of the ready type system with a set of semiexpectable items, and the subtyping relation has the form ≤ × ⊇ × ⊇. A state on an execution path is of a given semiexpectation type if all elements of that type are semiexpectable according to Definition 3. A judgment of the semiexpectation type system relates a statement, a points-to type, a ready type, and a semiexpectation type to a set of addresses, a points-to type, a ready type, and a semiexpectation type. For a given statement along with pointer information, readiness information, and a semiexpectation posttype, we present a type system to calculate a semiexpectation pretype for the statement. The type derivation of this typing process is a proof for the correctness of the semiexpectable information. The meaning of the judgment is that if the elements of the posttype are semiexpectable after executing the statement, then the elements of the pretype must have been semiexpectable before executing it.
The inference rules of the semiexpectation type system are shown in Algorithm 2. Some comments on the inference rules are in order. In the rule for arithmetic statements, given the posttype, we calculate the pretype for the statement S2. The resulting pretype is then used as a posttype for the statement S1 to calculate the overall pretype. In line with Definition 3, the arithmetic statement S1 iop S2 is added to the pretype only if it belongs to the ready type. Similar explanations apply to the rule (*). The remaining rules mimic the rules of the ready type system. Now we recall the assumption that our distributed system consists of a finite set of machines. For a given statement and a given machine, the type system given above calculates for each program point of the statement the set of semiexpectable items. A further rule can then be used to combine the information calculated for each machine to get new semiexpectable information for each program point. The new semiexpectable information is valid on any of the machines. The difference between the way this rule treats the semiexpectable information and the way the ready information is treated is explained by the fact that the ready analysis is forward while the semiexpectation analysis is backward. It is not hard to prove the soundness of the above type system.
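The backward flow described above can be sketched as follows. This is a simplified illustration, not the rules of Algorithm 2: statements are nested tuples, each step of a straight-line program is paired with the ready set assumed to hold before it (as produced by a separate ready analysis), and the treatment of assignment (dropping items that involve the assigned variable) is our assumption by analogy with the ready rules. The names `mentions`, `semi_before`, and `semi_backward` are hypothetical.

```python
def mentions(s, x):
    """True if the variable x occurs in s."""
    if s[0] == "var":
        return s[1] == x
    return any(mentions(c, x) for c in s[1:] if isinstance(c, tuple))


def semi_before(stmt, ready, se_post):
    """Semiexpectable pretype of stmt, given the ready set before stmt
    and the semiexpectable posttype se_post."""
    kind = stmt[0]
    if kind == "assign":                 # ("assign", x, rhs)
        _, x, rhs = stmt
        kept = frozenset(s for s in se_post if not mentions(s, x))
        return semi_before(rhs, ready, kept)
    if kind == "op":                     # ("op", symbol, left, right)
        _, _, left, right = stmt
        se = semi_before(right, ready, se_post)   # pretype of the right operand...
        se = semi_before(left, ready, se)         # ...is the posttype of the left one
        return (se | {stmt}) if stmt in ready else se
    if kind == "deref":                  # ("deref", target)
        se = semi_before(stmt[1], ready, se_post)
        return (se | {stmt}) if stmt in ready else se
    return frozenset(se_post)            # numbers and variables change nothing


def semi_backward(steps, se_last=frozenset()):
    """steps: list of (statement, ready_before) pairs in program order.
    Returns the semiexpectable pretype of every step."""
    se, pretypes = frozenset(se_last), []
    for stmt, ready in reversed(steps):
        se = semi_before(stmt, ready, se)
        pretypes.append(se)
    return list(reversed(pretypes))
```

With `x := a * b; y := a * b`, where `a * b` is ready before the second assignment, the product is reported as semiexpectable before both steps; if an assignment to `a` intervenes, the product is no longer ready at its second use and nothing is reported.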

Frequent Statement and Dereference Elimination of ℎ
This section presents a type system that is an enrichment of the type system presented in the previous section. Its judgments use the ready and semiexpectation types calculated by the previous type system and, for a given statement, produce the optimization of the statement together with a sequence of assignments that links optimized statements with the names of their unoptimized versions. Algorithms 3 and 4 present the inference rules for frequent statement and dereference elimination. We note the following on the inference rules. A great deal of optimization is achieved by the three rules for *S. These rules are (*1), (*2), and (*3). The rule (*1) takes care of the case where *S is ready and is replaceable by its name under the naming function.
The rule (*2) treats the case where *S is semiexpectable and is not ready before calculating the statement. In this case, a statement name of *S is used. The rule (*3) considers the case where *S is neither semiexpectable at the program point after execution nor ready before calculating the statement. In this case, the statement *S is not changed. Similarly, three rules treat the corresponding cases for arithmetic statements. The Boolean statements are treated with rules quite similar to those of arithmetic statements. The rule ( ℎ ) reuses frequent substatements of the guard. This is done by adding assignments in the positions indicated in the rule. The remaining rules of the system are self-explanatory.
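The three cases distinguished by the rules above can be sketched as a small rewriting function. This is an illustration under stated assumptions rather than the rules of Algorithms 3 and 4: statements are nested tuples, `ready` is the ready set before the statement, `se_after` is the semiexpectable set after it, and the dictionary `names` is a hypothetical stand-in for the naming function; it must provide a name for every statement that is ready or semiexpectable.

```python
def optimize(s, ready, se_after, names):
    """Return (optimized statement, linking assignments) for the expression s."""
    if s[0] not in ("op", "deref"):
        return s, []                          # trivial statements are unchanged
    if s in ready:                            # case 1: reuse the stored value
        return ("var", names[s]), []
    # Otherwise optimize the substatements first.
    if s[0] == "op":                          # ("op", symbol, left, right)
        left, left_links = optimize(s[2], ready, se_after, names)
        right, right_links = optimize(s[3], ready, se_after, names)
        new, links = ("op", s[1], left, right), left_links + right_links
    else:                                     # ("deref", target)
        target, links = optimize(s[1], ready, se_after, names)
        new = ("deref", target)
    if s in se_after:                         # case 2: compute once and name it
        return ("var", names[s]), links + [("assign", names[s], new)]
    return new, links                         # case 3: leave the statement as is


p = ("var", "p")
deref_p = ("deref", p)
names = {deref_p: "t1"}
print(optimize(deref_p, frozenset({deref_p}), frozenset(), names))
```

In the second case the returned assignment is meant to be emitted before the statement that uses the name, so that later ready occurrences can read the variable instead of accessing memory again.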
To express soundness, we introduce the following definition.

Frequent Statement and Dereference Elimination of OODP Programs
This section generalizes the type systems of previous sections to cover object-oriented distributed programs. Hence, a new model for object-oriented distributed programs and necessary changes to proposed type systems for the analysis of frequent statement and dereference elimination are presented in this section. Object-oriented concepts such as subtyping and inheritance are included in the model language (dubbed OODP) whose syntax is shown in Figure 3.
In line with OOP concepts, local variables are contained in functions and live while their functions are live. While the parameters of a function are represented using local variables, a class's internal state is contained in its instance variables. A class is a container for a set of function definitions. Each function has a parameter, a main statement, and a statement representing the value returned by the function. Hence an OODP program is a set of classes followed by a "main" function. Figure 4 presents semantic spaces and naming conventions used in the rest of the paper.
As shown in the previous sections, the analysis of frequent statement and dereference elimination for imperative distributed programs is achieved in three steps. In the following, we show necessary changes to the three type systems presented so far to cover object-oriented distributed programs.
For each program point, ready statements and memory locations (Definition 1) are computed by the analysis of ready statements and memory locations. Adding the rules of Algorithm 5 to those of Algorithm 1 results in a type system that carries out this analysis for the object-oriented distributed programs of Figure 3. Using the semantic notions of Figure 4, Definitions 1, 3, and 5 are applicable and convenient for the analyses of the language OODP in this section. Some comments on the inference rules are in order. The rules of Algorithm 5 suppose the existence of a class analysis that calculates the set of classes that a statement may reference. The intuition behind the judgments of this class analysis is that the pointer information is used to calculate this set of classes. In the rule (:= ⋅V ), the ready substatements of S1 and S2 are added to the ready pretype. Then for any class that S1 may reference, statements involving the assigned instance variable are removed from the resulting set. In the rule (:= ⋅ ), the class analysis gives the classes that S2 may reference. For all functions with the called name in these classes, the body and return statements are enumerated. The ready substatements of these statements are added one after another to the ready pretype. Then all statements involving S1 are removed from the resulting set. Using the semantic notations of Figure 4, soundness of the type system of Algorithm 5 is stated as follows. The goal of the main analysis of this section for OODP is as follows.
Introducing new variables to maintain values of frequent statements and dereferences and then reusing these values instead of recomputing the statements and reaccessing the memory.
To achieve this goal, the ready information needs to be enriched with semiexpectable information.
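Before moving on to semiexpectation, the ready rule for assignments to instance variables discussed earlier in this section can be sketched as follows. The sketch is only our reading of that rule: statements are nested tuples, a field access is encoded as `("field", object_statement, field_name)`, and the dictionary `class_of` is a hypothetical stand-in for the assumed class analysis, mapping a statement to the set of classes it may reference.

```python
def subs(s):
    """Nontrivial substatements (operator and field-access nodes) of s."""
    kind = s[0]
    if kind == "op":        # ("op", symbol, left, right)
        return {s} | subs(s[2]) | subs(s[3])
    if kind == "field":     # ("field", object_statement, field_name)
        return {s} | subs(s[1])
    return set()


def reads_field(s, field, classes, class_of):
    """True if s reads `field` of an object that may belong to one of `classes`."""
    return any(
        t[0] == "field" and t[2] == field
        and class_of.get(t[1], frozenset()) & classes
        for t in subs(s)
    )


def ready_after_field_assign(obj, field, value, rs, class_of):
    """Ready set after `obj.field := value`, given the ready set rs before it."""
    # Substatements of both sides become ready ...
    extended = set(rs) | subs(obj) | subs(value)
    # ... then every item that may read the assigned instance variable is dropped.
    hit = class_of.get(obj, frozenset())
    return frozenset(
        s for s in extended if not reads_field(s, field, hit, class_of)
    )
```

The class sets make the kill conservative: an access `q.v` is removed whenever `q` may reference an object of a class that the assigned object may also belong to, while accesses to other fields or to unrelated classes stay ready.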
Adding the rules of Algorithm 6 to those of Algorithm 2 results in a type system that carries out the analysis of semiexpectation for the object-oriented distributed programs of Figure 3. Some comments on the inference rules of Algorithm 6 are in order. In the rule (:= ⋅V ), starting with the posttype, the pretype is calculated for the statement S2. This pretype is then used as a posttype for S1 to get the main pretype. Similarly to its counterpart in the ready type system, the rule (:= ⋅ ) enumerates the body and return statements of the relevant functions. The pretype is then calculated sequentially, starting from the posttype. The remaining rules mimic the rules of the ready type system.
Using the semantic notations of Figure 4, soundness of the type system of Algorithm 6 is stated as follows. Adding the rules of Algorithm 7 to those of Algorithm 3 results in the main type system achieving the analysis of frequent statement and dereference elimination for the object-oriented distributed programs of Figure 3. We note the following on the inference rules. Optimization is based on three rules, numbered 1 to 3, for the same statement form. The first rule treats the case where the statement is ready and is replaceable by its name under the naming function. The second treats the case where the statement is semiexpectable but not ready before calculating it. The third takes care of the case where the statement is neither ready before the calculation nor semiexpectable after execution. The following definition generalizes Definition 5 and is necessary to express soundness. Using the semantic notations of Figure 4, soundness of the type system of Algorithm 7 is stated as follows.

Related Work
The techniques of common subexpression elimination (CSE) [8,9] are close to our work. In [10], a type system for CSE of the while language is introduced. The work presented in our paper can be seen as a generalization of that presented in [10]. The generality of our work is evident in our language models, which are much richer, with distributed, pointer, and object-oriented commands. Consequently, the operational semantics against which we measure the soundness of our system is much more involved than that used in [10].
Using new opportunities appearing while scheduling control-intensive designs, the work in [11] introduces a technique that dynamically eliminates common subexpressions. To optimize polynomial expressions (important for application domains like computer graphics and signal processing), the paper [12] generalizes algebraic techniques originally designed for multilevel logic synthesis. The generalization in [12] uses factoring to eliminate common subexpressions of polynomial expressions. There are many analyses for optimizing object-oriented programs. In [13], evolutionary multiobjective optimization methods are used to present a Class-Based Elitist Genetic Algorithm (CBEGA) for testing OOP. A new method to optimize field access in concurrent object-oriented programs is presented in [14]. This work utilizes the correctness concept that concurrency control must be used by programmers. A new concurrency abstraction model is presented in [15]. This model has the advantage of separating the specification of the synchronization code from the method bodies.
The association of a correctness proof with each result of the static analysis is important and needed by applications like proof-carrying code and certified code. The work presented in this paper has the advantage over most related work of constructing these proofs. Adding to the value of using type systems, the proofs constructed in our proposed approach have the form of type derivations. The work in [4,16,17] presents many examples of other static analyses that are in the form of type systems.
In [18], a technique for flow-insensitive pointer analysis of programs that run on parallel and hierarchical machines and that share memory is introduced. Via a two-level hierarchy, [19,20] present constraint-based approaches to evaluate locality information and sharing attributes of references. Our language model is a generalization of models presented in [18,19].

Much research [18,21] has been devoted to analyzing distributed programs. This is motivated by the importance of distributed programming as a mainstream style of programming today. Examining and capturing causal and concurrent relationships are among the important issues for many distributed systems applications. In [22], an analysis that examines the source code of each process constructs an inclusive graph, POG, of the possible behaviors of systems. Data race bugs [23] can be a side effect of the parallel access of the cores of a multicore processor to a physically distributed memory. In [23], a technique called DRARS is proposed for the avoidance and replay of such data races. Parallel programs on DSM or multicore systems can be debugged using DRARS. The classical problems of satisfiability decidability and algorithmic decidability are approached in [24] on the message-sending model of distributed programs. In this work, distributed programs are represented as communicating via buffers.

Conclusion
This paper introduces new techniques for the analysis of frequent statement and dereference elimination for imperative and object-oriented distributed programs running on parallel machines equipped with hierarchical memories. Type systems are the tools of the techniques presented in this paper. The first sort of proposed type systems defines for program points of a distributed program sets of calculated (ready) statements and memory accesses. The second sort determines which of the ready statements and memory accesses are used later in the program. The final sort eliminates unnecessary statement computations and memory accesses.

Disclosure
This is an extended and revised version of [6].