Cluster computing

Tuesday, March 31, 2015

Today we continue reading the WRL research report on the Swift Java compiler. We were discussing register allocation and solving it by means of graph coloring. Next we discuss the results of the study. The Swift Java compiler was measured on an Alpha workstation with one 667MHz processor, a 64KB on-chip data cache and a 4MB board cache. The generated code was installed into a high performance JVM. This was necessary so that the results could be evaluated under controlled conditions. Only when the baseline is performant can we take the results to be representative of the variables we control. A poor choice of baseline may hide gains from some of the variables or skew the results because of running time variations. In this case, the JVM chosen was already performing some form of class hierarchy analysis (CHA), which helps us evaluate the gains from the passes more appropriately. The heap size used was 100 MB. Although the hardware is less powerful than recent processors, the configuration was decent at that time. Moreover, with the JVM baseline established, the trends could be expected to be the same on a different choice of system.
The tests were performed on a number of applications from a variety of domains and of varying program size. The initial set of results was taken with all optimizations enabled. Then they were taken without CHA. This showed that the use of CHA greatly improves the overall performance. The overall speedup of the Swift generated code without CHA over the fast JVM is marginal because the JVM is already using some form of CHA to resolve method calls. The results were also compared for simple-CHA versus full CHA, and the former turned out to be only somewhat less performant than the latter, which makes it a useful strategy when dynamic loading is present.
Swift compilation could proceed at a rate of about 2000 lines of code per second with all optimizations except escape analysis turned on. Escape analysis slowed down the compilation by about 20-40%.

Monday, March 30, 2015

Today we continue reading the WRL research report on the Swift Java compiler. We were discussing register allocation and solving it by means of graph coloring. To summarize, the steps involved were:
Insert copies
Precolor
Construct the bias graph
Construct the interference graph
Compute the coloring order
Color values
If some values failed to be colored
- spill uncolored values to the stack
- repeat from constructing the interference graph
Clean up

We saw how each of these steps matters in solving register allocation. Specifically, we saw how the copies help when a value has to be in more than one register, and how precoloring helps with the register assignments of method parameters and return values. The bias graph establishes edges between values that we would like to color the same. The interference graph establishes edges between nodes which cannot be colored the same. In doing so, it encapsulates all the possible coloring assignments to the values. We saw how to apply a coloring heuristic where the hard nodes are colored first and the easy nodes last. The difficulty corresponds to the degree of the nodes in the interference graph. The nodes are then colored in the order computed. The bias graph is used to make an intelligent choice of a color from the set of legal colorings allowed by the interference graph. If the coloring does not succeed, we spill each uncolored value by inserting a spill value just after its definition and a restore value before each use. This makes the node easier to color on the next pass. Finally, when the coloring has succeeded, data flow is used to eliminate unnecessary copies.
We next look at code generation. Swift's code generation pass translates SSA operations into machine code. The operations remaining in the SSA graph at this point each correspond to zero or one Alpha instructions. Code generation involves computing the stack frame size, emitting the prolog code, emitting code for each block in the order given by the scheduling pass, emitting a branch when the successor is not the immediately following block, emitting the epilog code, and emitting auxiliary information including a list of relocation entries, associated constants, an exception table, and a byte code map. The necessary branches are found and the final code block for each branch is determined.
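The rule about emitting a branch only when the successor is not the immediately following block can be sketched as below. This is a minimal illustration, not Swift's code generator: the function name `emit_blocks`, the block names, the single-successor assumption and the placeholder strings (including the `br` text) are all assumptions made for the example.

```python
def emit_blocks(layout, successor):
    """Emit placeholder code for blocks in layout order.

    layout:    block names in the order chosen by scheduling/layout (assumed input)
    successor: block name -> its single successor, or None (a simplification)
    """
    code = []
    for i, block in enumerate(layout):
        code.append(f"{block}:")
        code.append(f"    <code for {block}>")
        target = successor.get(block)
        next_block = layout[i + 1] if i + 1 < len(layout) else None
        # Fall through when the successor is laid out right after this block;
        # otherwise an explicit branch is needed.
        if target is not None and target != next_block:
            code.append(f"    br {target}")
    return code
```

With the layout B0, B2, B1, where both B0 and B2 continue to B1, only B0 needs a branch because B2 falls through into B1.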

Sunday, March 29, 2015

Today we continue reading the WRL research report on the Swift Java compiler. We were discussing register allocation and solving it by means of graph coloring. Today we continue with the order of coloring. The bias graph is used to make an intelligent choice of a color from the set of legal colorings allowed by the interference graph. An uncolored node is given the same color as an already colored node in the bias graph only if the interim nodes between them can also take that color. If the coloring does not succeed, then we spill values to the stack. The value corresponding to each node that was not colored is spilled onto the stack by inserting a spill value just after its definition and a restore value before each use. This lets the original value and the newly added restore values be in a register over a shorter range, so they will hopefully be easier to color on the next pass.
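As a rough sketch of the spilling step, the snippet below rewrites a straight-line list of instructions so that a chosen value is spilled right after its definition and restored before each use. The tuple format `(dest, op, args)`, the op names `spill` and `restore`, and the generated names like `a_r1` are assumptions for illustration only; Swift works on an SSA graph rather than a flat list.

```python
def spill_value(instrs, victim):
    """Insert a spill after the definition of `victim` and a restore before each use.

    instrs: list of (dest, op, args) tuples in program order (assumed format).
    Returns a new instruction list; the input is not modified.
    """
    out = []
    restores = 0
    for dest, op, args in instrs:
        new_args = []
        for arg in args:
            if arg == victim:
                # Each use gets its own short-lived restored value.
                restores += 1
                tmp = f"{victim}_r{restores}"
                out.append((tmp, "restore", (victim,)))
                new_args.append(tmp)
            else:
                new_args.append(arg)
        out.append((dest, op, tuple(new_args)))
        if dest == victim:
            # Store the value to its stack slot immediately after it is defined.
            out.append((None, "spill", (victim,)))
    return out
```

After the rewrite, the original value is live only from its definition to the spill, and each restored value is live only up to the instruction that uses it.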
A final cleanup pass is necessary after the coloring succeeds, to remove copies that have the same source and destination and to remove unnecessary restore operations. This pass does a data flow computation to determine what value each register holds after each instruction. This enables optimizations such as replacing the input value of an instruction with the oldest copy that is still in a register.
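The simplest part of the cleanup, dropping copies whose source and destination ended up in the same register, might look like the sketch below. The instruction tuples and the `reg` mapping are assumed representations, and the data flow computation that tracks what each register holds is deliberately left out.

```python
def remove_trivial_copies(instrs, reg):
    """Drop copies whose source and destination were assigned the same register.

    instrs: list of (dest, op, args) tuples (assumed format)
    reg:    value name -> register chosen by the coloring (assumed input)
    """
    kept = []
    for dest, op, args in instrs:
        if op == "copy" and reg[dest] == reg[args[0]]:
            continue  # the copy would move a register onto itself
        kept.append((dest, op, args))
    return kept
```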
#codingexercise
double GetAllNumberRangeProductCubeRootPowerSeven (Double [] A)
{
// AllNumberRangeProductCubeRootPowerSeven is assumed to be a helper defined elsewhere
if (A == null) return 0;
return A.AllNumberRangeProductCubeRootPowerSeven();
}
#codingexercise
double GetAllNumberRangeProductCubeRootPowerNine (Double [] A)
{
// AllNumberRangeProductCubeRootPowerNine is assumed to be a helper defined elsewhere
if (A == null) return 0;
return A.AllNumberRangeProductCubeRootPowerNine();
}

Saturday, March 28, 2015

Today we continue our study of the WRL research report on the Swift Java compiler. We were discussing register allocation. We mentioned the construction of the bias graph and the interference graph. Today we discuss the next step, which is the coloring order. We saw that the algorithm colors the hard nodes first and the easy nodes last. To compute the order, the node with the minimum degree is repeatedly removed from the interference graph. Each removal changes the degrees of the remaining nodes, so we always look for the minimum remaining degree. The order of coloring is the reverse of this removal order.
To color all the nodes in the order computed, we color them one by one by finding the set of possible colorings for each node. The colors of the adjacent nodes in the interference graph are excluded from this set. Any color remaining in the set is valid. If no color is possible for some nodes, the uncolored values are spilled to the stack and the interference graph and coloring order are recomputed.
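A minimal sketch of this coloring loop follows. It takes the first legal register rather than using the bias graph, and it only reports the nodes that could not be colored instead of spilling them; the function name and the dictionary-based graph are assumptions for the example.

```python
def color_in_order(order, interference, registers, precolored=None):
    """Color nodes one by one in the given order.

    order:        nodes in coloring order (hard nodes first)
    interference: node -> set of neighbors that must get a different color
    registers:    list of available colors
    precolored:   optional node -> color fixed before coloring starts
    Returns (colors, uncolored); uncolored nodes are spill candidates.
    """
    colors = dict(precolored or {})
    uncolored = []
    for node in order:
        if node in colors:
            continue
        # Exclude the colors already taken by interfering neighbors.
        taken = {colors[n] for n in interference.get(node, ()) if n in colors}
        legal = [r for r in registers if r not in taken]
        if legal:
            colors[node] = legal[0]
        else:
            uncolored.append(node)
    return colors, uncolored
```

For three mutually interfering values and only two registers, the third value comes back in the uncolored list, which is the situation that triggers a spill and a rebuild of the interference graph.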
The bias graph is used to make an intelligent choice of a color from the set of legal colorings. If we represent the edges from the interference graph with solid lines and those from the bias graph with dotted lines, then to color a particular node, we do a breadth first search of the bias graph starting at that node. If we find a node that is already colored, we give the original node the same color as long as that color is also allowed for the interim nodes. The interim nodes must be able to take that color too, otherwise sharing a color between this node and the colored node would not remove the copies between them. If none of the nodes found have a color that can be used for the node we want to color, then we do another BFS on the uncolored nodes in the bias graph. At each node encountered, we intersect the set of possible colors for the node we want to color with the set of colors allowed for the encountered uncolored node. If we are left with a non-empty set, a color is chosen from it for the node we want to color. This method allows the maximum number of nodes in the bias graph connected to the node we want to color to match the color we picked.
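The first search can be sketched as follows. This is a simplified illustration under stated assumptions: `legal` is assumed to already hold, for every uncolored node, the colors permitted by the interference graph, and the second BFS over uncolored nodes is replaced by simply falling back to any legal color.

```python
from collections import deque


def pick_biased_color(node, legal, bias, colors):
    """Pick a color for `node`, preferring one already used nearby in the bias graph.

    legal:  uncolored node -> set of colors allowed by the interference graph
    bias:   node -> iterable of bias-graph neighbors (undirected, assumed symmetric)
    colors: node -> color for nodes that are already colored
    """
    parent = {node: None}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for neighbor in bias.get(current, ()):
            if neighbor in parent:
                continue
            parent[neighbor] = current
            if neighbor not in colors:
                queue.append(neighbor)  # keep searching through uncolored nodes
                continue
            color = colors[neighbor]
            # The color must be legal for `node` and for every interim node
            # on the path back from the colored node.
            step, usable = current, True
            while step is not None:
                if color not in legal[step]:
                    usable = False
                    break
                step = parent[step]
            if usable:
                return color
    # Simplified fallback: any legal color, or None if the node must be spilled.
    return min(legal[node]) if legal[node] else None
```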

Friday, March 27, 2015

Today we continue our study of the WRL research report on the Swift Java compiler. We were discussing register allocation. We saw that the first step in this algorithm was to insert copies and the second step was to precolor values. We now discuss the remaining steps. Next we construct the bias graph. This is an undirected graph that has values as nodes and edges between nodes which we want to color with the same color. The nodes that we want to color the same are the inputs and outputs of a copy. Coloring them the same therefore makes some of the copies inserted in step 1 unnecessary. Next we construct the interference graph. The interference graph has values as nodes and edges between nodes that cannot be assigned the same color because their live ranges overlap. This is the step where we determine all the possible valid assignments of colors to values. Hence with this step, we convert the problem to a graph coloring problem. Graph coloring attempts to color the nodes such that no two nodes that are adjacent in the interference graph have the same color. The interference graph completely encodes the possible legal assignments of colors because all the restrictions are represented as edges. That said, graph coloring is NP-hard in general, so heuristics are involved.
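To make the interference graph concrete, here is a small sketch that builds one from live ranges. Treating each live range as a single half-open interval `(start, end)` of instruction positions is a simplifying assumption for the example; real live ranges can have holes.

```python
from itertools import combinations


def build_interference(live_ranges):
    """Build an undirected interference graph from live ranges.

    live_ranges: value -> (start, end), half-open interval (assumed representation)
    Returns value -> set of values whose live ranges overlap with it.
    """
    graph = {value: set() for value in live_ranges}
    for (a, (a_start, a_end)), (b, (b_start, b_end)) in combinations(live_ranges.items(), 2):
        # Two half-open intervals overlap when each starts before the other ends.
        if a_start < b_end and b_start < a_end:
            graph[a].add(b)
            graph[b].add(a)
    return graph
```

For example, with x live over (0, 4), y over (2, 6) and z over (5, 8), x interferes with y and y with z, but x and z can share a register.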
In the next step, we find the coloring order of all the nodes. A coloring order is selected such that the most connected nodes of the interference graph are colored first. This is referred to as coloring the hard nodes first and then the easy nodes. The difficulty corresponds to the degree of the nodes in the interference graph. The algorithm proceeds by repeatedly removing a node with the minimum degree from the interference graph. On the removal of a node, the corresponding edges are also deleted. The algorithm terminates when all the nodes have been removed. The degree of the nodes changes with the removal of edges. Hence, the algorithm selects the node with the smallest remaining degree among all the remaining nodes. Moreover, the order of coloring is the reverse of the order of removal. This ensures that the nodes with low degree are colored after the nodes with higher degree.
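The ordering described above can be sketched in a few lines. The function name and the dictionary representation are assumptions; ties between nodes of equal degree are broken by name only to keep the result deterministic.

```python
def coloring_order(interference):
    """Return nodes in coloring order: reverse of min-remaining-degree removal.

    interference: node -> set of neighbors (undirected, assumed symmetric)
    """
    remaining = {node: set(adj) for node, adj in interference.items()}
    removal = []
    while remaining:
        # Pick the node with the smallest remaining degree.
        node = min(remaining, key=lambda n: (len(remaining[n]), str(n)))
        # Removing the node also deletes its edges.
        for neighbor in remaining.pop(node):
            remaining[neighbor].discard(node)
        removal.append(node)
    return removal[::-1]
```

In a graph where c interferes with a, b and d, and a also interferes with b, the removal order is d, a, b, c, so c is colored first and d last.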
The coloring of each of the nodes in the order computed is a separate step. Here we enumerate all the possible legal colorings of that node. This could be, for example, all the registers that could hold that value, excluding the colors of any neighboring colored nodes in the original interference graph. If a node cannot be colored, its value is spilled to the stack and the interference graph is reconstructed. The algorithm exits when there are no more values left to be colored.

Thursday, March 26, 2015

Today we continue our study of the WRL research report on the Swift Java compiler. We were discussing trace scheduling and trace layout algorithms. Both of them are greedy algorithms. We next discuss register allocation. Swift uses a modified Briggs-style coloring allocator. Instead of coalescing, Swift's allocator adds a special data structure, the bias graph, to direct coloring and limit the number of copies introduced. Register allocation proceeds by assigning each value a color which represents a particular register assignment. The problem is thereby converted to a graph coloring problem, for which coloring heuristics are available. The coloring problem is defined by the restrictions introduced.
Register allocation proceeds with the following algorithm:
1) Insert copies
2) Precolor
3) Construct bias graph
4) Construct interference graph
5) Compute coloring order
6) Color values
7) If some values failed to be colored
   a) Spill uncolored values to the stack
   b) Go to step 4)
8) Clean up
We will discuss each of these steps. A coloring allocator assumes that each value is allocated to exactly one register for its lifetime. Copies are inserted when a value is required to be in more than one register, such as when a value needs to move from one register to another to serve as a return value or a method call parameter. Copies are also required for a phi node, because the input values to a phi node may not be assigned to the same register as the phi node. In addition, Swift uses the LIMIT algorithm to split the live ranges of values around loops in which they are not referenced. This means that the live range of a value is split into copies before and after the loop. This makes it easier to spill such a value in favor of the values that are actually used within the loop.
The next phase is value precoloring, which acts as an initialization step. The compiler determines which values need to be assigned to certain registers and fixes their color assignment. Values which have fixed register assignments include method arguments and return values.
The next stage involves creating the bias graph. This is the data structure introduced by Swift, and it is an undirected graph that has values as nodes and edges between nodes which are to be colored the same. This data structure was introduced to undo as many of the copy operations from the first step as possible, by trying to color the input and output of a copy the same color.
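A minimal sketch of building such a bias graph from the inserted copies is shown below. The instruction tuples `(dest, op, args)` and the op name `copy` are assumed representations chosen for the example, not Swift's actual data structures.

```python
def build_bias_graph(instrs):
    """Connect the input and output of every copy with an undirected bias edge.

    instrs: list of (dest, op, args) tuples (assumed format)
    Returns value -> set of values we would like to color the same.
    """
    bias = {}
    for dest, op, args in instrs:
        if op == "copy":
            src = args[0]
            bias.setdefault(dest, set()).add(src)
            bias.setdefault(src, set()).add(dest)
    return bias
```

If the input and output of a copy end up with the same color, the copy becomes a move from a register to itself and can be deleted later.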
#codingexercise
double GetAllNumberRangeProductCubeRootPowerSix (Double [] A)
{
// AllNumberRangeProductCubeRootPowerSix is assumed to be a helper defined elsewhere
if (A == null) return 0;
return A.AllNumberRangeProductCubeRootPowerSix();
}

Wednesday, March 25, 2015

Today we continue to discuss the WRL research report on the Swift Java compiler. We were discussing trace scheduling and we saw that it involves a greedy algorithm. We now continue with block layout and how Swift uses profile information to determine a good layout for the traces. Swift uses a simple version of Pettis and Hansen's code layout algorithm. This is also a greedy algorithm. It gradually merges blocks/traces into sequences, and always merges the two sequences that have the heaviest weight edge between an element of one and an element of the other. The end result of this algorithm is a single merged sequence, which is the desired layout.
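The greedy merging can be sketched as follows. This is a deliberately simplified illustration: it concatenates the two sequences joined by the heaviest remaining edge without checking whether the edge connects the end of one sequence to the start of the other. The function name and the edge-weight dictionary are assumptions for the example.

```python
def layout_blocks(blocks, edges):
    """Greedily merge blocks into one sequence, heaviest edges first.

    blocks: list of block names
    edges:  (src, dst) -> weight from profile information (assumed input)
    """
    seq_of = {block: [block] for block in blocks}
    for (src, dst), _weight in sorted(edges.items(), key=lambda kv: -kv[1]):
        first, second = seq_of[src], seq_of[dst]
        if first is second:
            continue  # already in the same sequence
        first.extend(second)
        for block in second:
            seq_of[block] = first
    # Concatenate whatever sequences remain (more than one if the graph is disconnected).
    result, seen = [], set()
    for block in blocks:
        seq = seq_of[block]
        if id(seq) not in seen:
            seen.add(id(seq))
            result.extend(seq)
    return result
```

With edge weights A->C = 90, C->D = 80, A->B = 10 and B->D = 5, the hot path A, C, D is laid out contiguously and B is placed after it.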
The changes that Swift makes ensure that a loop exit block is placed at the end of the loop. Swift modifies the dynamic or static profile information, for example by reducing the weight of the other outgoing edges of a block which has an edge that exits a loop. By keeping the exit at the end, Swift guarantees only one branch per loop iteration. Branches that exit in the middle are given lower priority since these edges are already determined to be less important than the remaining edges in a tree.
#codingexercise
double GetAllNumberRangeProductCubeRootPowerFour (Double [] A)
{
// AllNumberRangeProductCubeRootPowerFour is assumed to be a helper defined elsewhere
if (A == null) return 0;
return A.AllNumberRangeProductCubeRootPowerFour();
}