Saturday, March 21, 2015

Today we continue to discuss the WRL research report on the Swift Java compiler. We were discussing the pattern-based peephole optimizer. The graph rewriting order is as follows: an SSA value is processed before all of its inputs, except when there are cycles in the value graph. For each value, rules are applied in the order specified by the rules file until one matches and mutates the value; the rewriting then proceeds to the next value. Each pass can change values, so the passes are iterated over the entire SSA graph until there are no more changes. The ordering of rules is important in ensuring that optimizations are applied in the most effective order.
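The iterate-until-no-change strategy above can be sketched as follows. The `Node` class and the algebraic rules here are illustrative stand-ins, not Swift's actual IR or rule set: inputs are rewritten before the value itself, rules fire in order until one matches and mutates the value, and whole passes repeat until a fixpoint is reached.

```java
// Minimal sketch of a rule-ordered, fixpoint-iterated rewriter.
class Rewriter {
    static class Node {
        String op;        // "const", "add", or "mul"
        int value;        // payload when op == "const"
        Node left, right; // inputs
        Node(int v) { op = "const"; value = v; }
        Node(String op, Node l, Node r) { this.op = op; left = l; right = r; }
    }

    static boolean isConst(Node n, int v) { return n != null && n.op.equals("const") && n.value == v; }

    // Overwrite n in place with another node's contents (mutating "self").
    static void become(Node n, Node other) {
        n.op = other.op; n.value = other.value; n.left = other.left; n.right = other.right;
    }

    // Rules are tried in order; the first match mutates the value and wins.
    static boolean applyRules(Node n) {
        if (n.op.equals("add") && isConst(n.right, 0)) { become(n, n.left); return true; }   // x + 0 -> x
        if (n.op.equals("mul") && isConst(n.right, 1)) { become(n, n.left); return true; }   // x * 1 -> x
        if (n.op.equals("mul") && isConst(n.right, 0)) { become(n, new Node(0)); return true; } // x * 0 -> 0
        return false;
    }

    // Inputs are rewritten before the value itself (post-order).
    static boolean rewritePass(Node n) {
        if (n == null) return false;
        boolean changed = rewritePass(n.left) | rewritePass(n.right);
        return applyRules(n) | changed;
    }

    // Iterate whole passes until a pass makes no change.
    static Node simplify(Node root) {
        while (rewritePass(root)) { }
        return root;
    }
}
```

For example, the expression (x * 1) + 0 collapses to x: the first pass rewrites the multiply, and the add folds away in the same or a following pass.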
Many of the machine-independent peephole optimizations apply simple algebraic or boolean identities. For example, taking the length of a newly allocated array is replaced by the allocated size, and a get_field operation can be replaced by the value stored by a previous put_field operation to the same field. Other optimizations include dead code elimination by way of repeated passes, done via mark and sweep: the live values in a Swift SSA graph are the control values and all inputs of live values, and all unmarked values are removed as dead code. Swift also does a pass for conditional constant propagation and simultaneous elimination of impossible control flow.
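The mark-and-sweep liveness rule just described can be sketched over a toy value graph. The `Value` class and the names below are illustrative, not Swift's data structures: control values seed the worklist, every input of a live value becomes live, and anything unmarked is dead.

```java
import java.util.*;

// Sketch of mark-and-sweep dead code elimination over a toy value graph.
class DeadCodeElim {
    static class Value {
        final String name;
        final List<Value> inputs = new ArrayList<>();
        Value(String name, Value... ins) { this.name = name; inputs.addAll(Arrays.asList(ins)); }
    }

    // Mark: control values are the roots; every input of a live value is live.
    // Sweep: anything left unmarked is dead and would be removed.
    static Set<String> liveValues(List<Value> controlValues, List<Value> allValues) {
        Set<Value> live = new HashSet<>();
        Deque<Value> work = new ArrayDeque<>(controlValues);
        while (!work.isEmpty()) {
            Value v = work.pop();
            if (live.add(v)) work.addAll(v.inputs); // newly marked: enqueue its inputs
        }
        Set<String> names = new TreeSet<>();
        for (Value v : allValues) if (live.contains(v)) names.add(v.name);
        return names;
    }
}
```

A value reachable only from another dead value never gets marked, so chains of dead computations disappear in one mark phase.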

Friday, March 20, 2015

Today we continue to discuss the WRL research report on the Swift Java compiler. We were discussing the loop peeling method of optimization, where part or all of the first iteration is peeled into code preceding the loop. It involves a search phase that starts from the most highly nested loops and evaluates the faulting operations in each loop to determine the candidates. There are a few other requirements as well. The procedure of loop peeling is straightforward: the blocks are copied to form the peel, control flow edges entering the loop are redirected to the peel copy, and phi nodes are created in the new loop header, with edges joined from the peel copy to the original. If the addition of phi nodes cascades, the peeling is aborted. An alternative is to estimate the number of blocks to add beforehand by running the phi node placement algorithm.
We now discuss the pattern-based peephole optimizer. Swift contains several passes that do pattern-based transformations of the SSA graph. A pattern-matcher generator takes a set of graph rewriting rules as input and produces a compiler pass that transforms the SSA graph according to those rules. Each rule consists of a pattern that can match a piece of an SSA graph and some Java code which is executed when the pattern matches. The language used to describe the patterns allows matching of node operations, auxiliary operations, and types, and allows nested matching on the inputs to a node. Matching of a particular operation, a list of operations, or any operation not in a list is allowed; similar matching capabilities exist for types. Binding of a variable to a particular value in a pattern, such that it can be used in the action part of the rule, is also permitted. The top-level value matched by the pattern is available in the action via the keyword self, and the action can mutate the self value or any of its inputs.
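What the generated pass does with one such rule can be sketched in plain Java: a pattern predicate over a node, plus an action run with the matched value bound to self. The `Node` shape and the example rule (double negation) are illustrative; they are not Swift's actual rule language.

```java
import java.util.*;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Sketch of a pattern-based rewrite rule: pattern + Java action on "self".
class PatternPass {
    static class Node {
        String op;
        int constValue;
        List<Node> inputs;
        Node(String op, int c, Node... ins) {
            this.op = op; constValue = c; inputs = new ArrayList<>(Arrays.asList(ins));
        }
    }

    static class Rule {
        final Predicate<Node> pattern; // matches a piece of the SSA graph
        final Consumer<Node> action;   // Java code executed on a match
        Rule(Predicate<Node> p, Consumer<Node> a) { pattern = p; action = a; }
    }

    // Rules are applied in order until one matches and mutates the value.
    static boolean rewrite(Node self, List<Rule> rules) {
        for (Rule r : rules) {
            if (r.pattern.test(self)) { r.action.accept(self); return true; }
        }
        return false;
    }

    // Example rule: neg(neg(x)) -> x, via nested matching on the inputs.
    static final Rule DOUBLE_NEGATION = new Rule(
        self -> self.op.equals("neg") && self.inputs.get(0).op.equals("neg"),
        self -> {
            Node x = self.inputs.get(0).inputs.get(0);
            self.op = x.op; self.constValue = x.constValue; self.inputs = x.inputs;
        });
}
```

The action mutates self in place, which mirrors the report's description of actions mutating the matched value or its inputs.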
#coding exercise
double GetAlternateOddNumberRangeProductCubeRootEighthPower (Double [] A)
{
    if (A == null) return 0;
    return A.AlternateOddNumberRangeProductCubeRootEighthPower();
}

#coding exercise
double GetAlternateOddNumberRangeProductCubeRootSixthPower (Double [] A)
{
    if (A == null) return 0;
    return A.AlternateOddNumberRangeProductCubeRootSixthPower();
}

Thursday, March 19, 2015

Today we continue to discuss the WRL research report on Swift Java Compiler. We were discussing the loop peeling method of optimization where the idea of treating initial iteration checks separately was borrowed from scientific programming. Loop peeling refers to the peeling of one or more iterations of the loop into the straight line code preceding the loop. When Swift peels the loop, Global CSE automatically eliminates the runtime checks still in the loop.  The end result may look very similar to that of code motion for simple checks in loops.
The loop peeling pass begins with a search phase that looks for loops that should be peeled. The search proceeds outward through all loops, starting from the most highly nested. If the search encounters a loop-invariant faulting operation, that loop is considered a candidate for peeling. By this we mean that the operation must be part of each iteration; otherwise it might well be a check on an exit condition. If the set of blocks dominated by the loop header and by the faulting operation is large, then the peel is large and results in duplicated code. Therefore there is also a threshold, specified as the maximum number of blocks to peel. There are a few other requirements as well, such as that a peel should not contain a nested loop.
Although the requirements sound complex, the steps are straightforward in the sense that we determine a set of blocks to peel and change the control flow to go through the peel first. When the blocks are copied from the original to form the peel, the values in the blocks are also copied. During this copying, each use of a phi node at the top of the peel is converted to the associated phi input for entry to the loop. The control flow edges entering the loop header are then redirected to the peel copy. At this point some phi nodes in the original loop header may become redundant and are eliminated, and the block that is the successor of the peel copy in the original becomes the new loop header. To complete the move, phi nodes are created in the new loop header for all the values that were copied, and control flow edges are joined from the peel copy to the original. If the addition of these phi nodes cascades into the addition of more phi nodes, the loop peeling is aborted because the loop is too complex. It is also possible to estimate the number of blocks to add beforehand by re-running the phi node placement algorithm.

Wednesday, March 18, 2015

Today we continue to read the WRL research report on the Swift Java compiler. We were discussing runtime check elimination, including array store checks. We now discuss loop peeling. Loop peeling is an optimization that moves one or more of the initial iterations into straight-line code preceding the loop. This helps in removing runtime checks associated with loops. The problem we wish to solve is that there are often loop-invariant runtime checks in a loop. For example, there may be a required null check for an array that is first accessed inside the loop. The trouble with this check is that we want to move it out of the loop, but it is pinned and can throw an exception, and we don't want it to throw if the loop executes no iterations. Moreover, if the check has succeeded in one iteration, it is redundant and safe to remove in all subsequent iterations. Loop peeling comes in useful to separate out the first iteration; the remaining checks can then be removed automatically by global CSE. It may be interesting to note that loop peeling and global CSE of simple loops and checks often leave a result similar to that of code motion. The loop peeling pass begins with a search phase that looks for loops that should be peeled. The search includes all loops, starting with the most highly nested loops first. A loop is a candidate if it contains a loop-invariant faulting operation, and that operation must be dominating, meaning it is part of each iteration. The set of blocks dominated by the loop header and the loop-invariant faulting operation becomes the loop peel. We don't want that peel to be too large, since that duplicates code, so the largest useful peel that does not exceed a maximum number of blocks is chosen.
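The net effect can be sketched at the source level. `Holder` and the field name `data` below are illustrative; the explicit local load stands in for the runtime null check in Swift's IR. Peeling the first iteration makes the check run once, only when the loop actually executes.

```java
// Source-level sketch of loop peeling and the check it removes.
class LoopPeelSketch {
    static class Holder { int[] data; }

    // Original: every iteration conceptually re-checks that h.data is non-null.
    static int sumOriginal(Holder h, int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += h.data[i]; // implicit null check of h.data in each iteration
        }
        return sum;
    }

    // Peeled: the first iteration (and its check) precedes the loop; if the
    // loop runs zero times, no check executes and no exception can be thrown.
    static int sumPeeled(Holder h, int n) {
        int sum = 0;
        if (n > 0) {
            int[] d = h.data; // the one remaining null check
            sum += d[0];      // peeled first iteration
            for (int i = 1; i < n; i++) {
                sum += d[i];  // later iterations are check-free
            }
        }
        return sum;
    }
}
```

Note that the peeled form preserves the exception behavior: with zero iterations, neither version touches h.data.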
#coding exercise
double GetAlternateOddNumberRangeProductCubeRootTenthPower (Double [] A)
{
    if (A == null) return 0;
    return A.AlternateOddNumberRangeProductCubeRootTenthPower();
}

Tuesday, March 17, 2015

Today we continue to read the WRL research report on the Swift Java compiler. We were discussing runtime check elimination. We said that Swift makes simple determinations based on the users of a value, not by proving any general hypothesis or theorem. These checks or type tests can be decided from the properties of the values and the control flow of the method. We saw this for both null checks and type tests, and the redundancies were removed. The bounds check was a bit more involved: Swift applies condition gathering recursively down to a fixed depth, and applies transitivity between three variables; some simple algebraic rules, such as ones involving the sum of two values, are also applied. This approach is effective because Swift's IR provides immediate access to all the users of a value v, and therefore makes it easy to find program expressions that help prove properties about v. In addition, Swift does a round of global CSE before doing the check removal, which lets Swift combine values equivalent to v and collect all their uses together.
We discuss array store checks as another runtime check for Swift to optimize. In Java, this check ensures that the object being stored into an array is compatible with the base type of the array. Swift attempts to eliminate array store checks via several methods. To eliminate any of these checks, Swift has to resolve the base type of the array. It will eliminate an array store check if the base type of the array is known and can hold the known type of the stored value. If the value is null or was loaded from another element of the same array, the check is eliminated too. Swift also checks whether the class of the base type of the array has no subclasses; if that type is final, the store check cannot fail and is eliminated.
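These conditions can be collected into one predicate. The sketch below uses `java.lang.Class` as a stand-in for Swift's type information, and assumes `baseType` is the array's exactly known element type; all names and flags are illustrative, not the compiler's API.

```java
// Sketch of the conditions under which an array store check is removable.
class StoreCheckElim {
    static boolean storeCheckRemovable(Class<?> baseType, Class<?> valueType,
                                       boolean valueIsNull,
                                       boolean valueLoadedFromSameArray,
                                       boolean baseTypeIsFinal) {
        if (valueIsNull) return true;               // null can be stored in any reference array
        if (valueLoadedFromSameArray) return true;  // the value already passed this array's check
        if (valueType != null && baseType.isAssignableFrom(valueType))
            return true;                            // known base type can hold the known value type
        return baseTypeIsFinal;                     // no subclasses exist, so the check cannot fail
    }
}
```

The final-class case works because, with no subclasses, the array's dynamic element type and the value's type can only be the final class itself.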
#coding exercise
double GetAlternateOddNumberRangeSumCubeRootTenthPower (Double [] A)
{
    if (A == null) return 0;
    return A.AlternateOddNumberRangeSumCubeRootTenthPower();
}

Monday, March 16, 2015

Today we continue to read the WRL research report on the Swift Java compiler. We were discussing branch removal and the two methods that Swift uses. The first method removes branches for conditions that evaluate to 0 or 1 by removing a phi node, which also removes an edge from the control flow graph. The second method translates logical comparisons to bitwise comparisons. We now study runtime check elimination. Some checks or type tests can be decided from the properties of values combined with the control flow of a method. For example, if a statement checks whether a value is non-null, then null checks can be eliminated in one branch of this if-statement. Generalizing this, Swift scans the current SSA graph for runtime checks that have not been eliminated. In the example above, if the successor block along one of the edges of the IF contains null checks on the same value, they can be eliminated. However, values that depend on the null check should not float above the successor block. To prevent this, Swift places a pin value and changes all users of the null check to use the pin value instead. Pin values do not generate any machine code, but they are pinned to their original block. These measures guarantee that the values depending on the null check stay within the correct branch of the IF statement. The same discussion applies to cast checks.
For example, for a cast check of value v, Swift scans the users of v for instanceof checks on v that control an IF. Similar to the null checks, redundant IFs are eliminated by searching for a dominating IF with the same controlling condition.
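The branch-sensitive elimination for null checks and cast checks alike can be reduced to a small sketch: inside the branch where the controlling condition is known to hold, any check implied by it is redundant. The string-encoded facts below are purely illustrative, and the pin-value mechanics that keep dependent operations in the right block are not modeled.

```java
import java.util.*;

// Sketch: drop checks already proven by the controlling condition of this path.
class PathCheckElim {
    static List<String> eliminateChecks(List<String> checksInBlock, Set<String> factsOnPath) {
        List<String> remaining = new ArrayList<>();
        for (String check : checksInBlock) {
            if (!factsOnPath.contains(check)) remaining.add(check); // only unproven checks survive
        }
        return remaining;
    }
}
```

On the other branch of the IF, the fact set is empty (or negated), so the same checks survive there.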
For a bounds check, however, Swift does somewhat more work. If Swift tries to eliminate a bounds check whose index is value v, it first scans the users of v to derive conditions on v. If it finds a comparison on v that controls an IF, then it may know that the comparison or its negation is true. Likewise, if v is an induction variable of a loop that is monotonically increasing or decreasing, then v is guaranteed to be greater or less than its starting value, respectively. In the example of a for loop, the termination condition specifies that the index is always less than the termination value, which is usually an array length; similarly, for a zero-based loop with a monotonically increasing index, the index is always greater than or equal to zero. Together these two conditions guarantee that the bounds check associated with accessing the array element at index i will always succeed, and so Swift removes it.
The key idea behind runtime check elimination is that Swift does not try to solve a system of inequalities involving i in order to prove properties of i. Instead, it looks at all the users of the value representing i and determines whether any direct comparisons of i can be shown to be true.
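That idea fits in a few lines: gather the facts that i's users directly establish (a comparison controlling an IF, or induction-variable monotonicity) and test whether they cover both bounds. The string encoding of facts below is an illustrative stand-in for Swift's condition gathering.

```java
import java.util.*;

// Sketch: no inequality solver, just a lookup over directly known comparisons.
class BoundsCheckElim {
    static boolean boundsCheckRemovable(Set<String> directFactsAboutI, String arrayLengthName) {
        // The check on a[i] is removable iff 0 <= i and i < a.length are both
        // directly known; nothing is derived beyond these comparisons.
        return directFactsAboutI.contains("i >= 0")
            && directFactsAboutI.contains("i < " + arrayLengthName);
    }
}
```

A zero-based, monotonically increasing for-loop supplies exactly these two facts, which is why its bounds checks disappear.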
#coding exercise
double GetAlternateOddNumberRangeSumCubeRootEigthPower (Double [] A)
{
    if (A == null) return 0;
    return A.AlternateOddNumberRangeSumCubeRootEigthPower();
}

Sunday, March 15, 2015

Today we continue to read the WRL research report on the Swift Java compiler. We saw that global common subexpression elimination and global code motion have significant optimization benefits, and we reviewed global code motion. We now look at branch removal. Swift has a phase that uses two methods to attempt to remove branches. The first method removes branches that are used to compute a value that is either 0 or 1, assigning the value directly instead. The JVM has no bytecodes that produce a boolean value directly from a comparison, so a branch is involved, but Swift tries to optimize it away. It simply searches through all the phi nodes in the method whose two inputs are the constants 0 and 1. If the phi corresponds to an if-then-else, Swift can replace uses of the phi node with the controlling boolean of the IF node. When the phi node is optimized away, the if-then-else control flow structure is also removed. Swift also tries to remove branches by converting logical and/or expressions to bitwise and/or expressions. For example, the code for a conditional with two conditions joined by a logical AND requires two branches, since the second condition is not evaluated if the first one fails. However, if the conditions are fairly simple and have no side effects, they can be combined with a bitwise AND followed by a single conditional branch. This reduction in the number of branches is important because extra branches can reduce the rate of instruction fetching and may even result in speculation down the wrong path.
This optimization is fairly easy to implement in the Swift IR. Swift simply looks through all the blocks in the CFG for a block that is controlled by an IF node and one of whose successors is also controlled by an IF node. If such a block is found, Swift checks the other criteria, namely no side effects: there must be no memory write operations, method calls, or other exception-causing operations. If these criteria are satisfied, the conditions can be combined and a branch eliminated.
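The logical-to-bitwise rewrite can be shown directly in Java source. Both conditions below are simple and side-effect free, so the short-circuit form (two conditional branches) and the bitwise form (one branch on the combined value) compute the same boolean; the method names are illustrative.

```java
// Sketch of the logical-AND to bitwise-AND branch-removal transformation.
class BranchCombine {
    static boolean inRangeLogical(int x, int lo, int hi) {
        return x >= lo && x < hi;    // "&&": the second test is branched around
    }
    static boolean inRangeBitwise(int x, int lo, int hi) {
        return (x >= lo) & (x < hi); // "&": both tests evaluate, single branch
    }
}
```

The transformation is only valid because evaluating the second condition unconditionally cannot fault or write memory; a condition like `a[i] > 0` would not qualify.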