Today we continue to discuss the WRL research report on Swift Java Compiler. We were discussing the loop peeling method of optimization where the idea of treating initial iteration checks separately was borrowed from scientific programming. Loop peeling refers to the peeling of one or more iterations of the loop into the straight line code preceding the loop. When Swift peels the loop, Global CSE automatically eliminates the runtime checks still in the loop. The end result may look very similar to that of code motion for simple checks in loops.
The loop peeling pass begins with a search phase that looks for loops that should be peeled. The search begins outwards for all loops starting from the most highly nested loop. If the search encounters a loop faulting operation. that loop is considered a candidate for peeling. By this we mean that the operation must be a part of each loop otherwise it might very well be an exit condition time check. If the set of blocks that are dominated by the loop header and those that are dominated by the faulting operation are large then this peel of blocks is large and results in duplication of code. Therefore, there's also a threshold specified by the maximum number of blocks to peel. There also a few other requirements such as a peel should not have a nested loop.
Although the requirements sound complex, the steps are straightforward in the sense that we determine a set of blocks to peel and change the control flow to go through the peel first. When the blocks are copied from the original to the duplicates to form the peel, the values in the blocks are also copied. During this copying, the uses of the phi nodes at the top of the peel is converted to the associate phi input when entering the loop. The control flow edges entering the loop header are then redirected to the peel copy. At this time some phi nodes in the original loop header may be considered redundant and eliminated and the block that is the successor of the peel copy in the original becomes the new loop header or start.To complete the move, all the values that were copied, phi nodes are created in the new loop header and control flow edges joined from the peel copy to the peel original. If the addition of these phi nodes cascade into the addition of some more phi nodes, the loop peeling is aborted because the loop is then complex. Its also possible to estimate the number of blocks to add by re-running the phi node placement algorithm.
The loop peeling pass begins with a search phase that looks for loops that should be peeled. The search begins outwards for all loops starting from the most highly nested loop. If the search encounters a loop faulting operation. that loop is considered a candidate for peeling. By this we mean that the operation must be a part of each loop otherwise it might very well be an exit condition time check. If the set of blocks that are dominated by the loop header and those that are dominated by the faulting operation are large then this peel of blocks is large and results in duplication of code. Therefore, there's also a threshold specified by the maximum number of blocks to peel. There also a few other requirements such as a peel should not have a nested loop.
Although the requirements sound complex, the steps are straightforward in the sense that we determine a set of blocks to peel and change the control flow to go through the peel first. When the blocks are copied from the original to the duplicates to form the peel, the values in the blocks are also copied. During this copying, the uses of the phi nodes at the top of the peel is converted to the associate phi input when entering the loop. The control flow edges entering the loop header are then redirected to the peel copy. At this time some phi nodes in the original loop header may be considered redundant and eliminated and the block that is the successor of the peel copy in the original becomes the new loop header or start.To complete the move, all the values that were copied, phi nodes are created in the new loop header and control flow edges joined from the peel copy to the peel original. If the addition of these phi nodes cascade into the addition of some more phi nodes, the loop peeling is aborted because the loop is then complex. Its also possible to estimate the number of blocks to add by re-running the phi node placement algorithm.
No comments:
Post a Comment