Cluster computing

Tuesday, November 26, 2013

Let us take a closer look at the insertion and deletion of nodes in B-Trees. To ensure that there are very few disk access to locate a key, we have to maintain a balanced tree during insertion and deletion. During insertion, we check if the child node is able to hold an additional node. If not, then a new sibling node is added to the tree and the child's key are redistributed to make room for the new node. During redistribution, the middle key of the overfilled childs node could be sent to the parent and split at that key. If we descend the tree during insertion and the root is full, then some of the keys are given to the new child. This change in the level of the tree maintains a balanced tree. During deletion we check to see if the root can absorb the child nodes i.e the key to be removed is replaced with the largest key of the left subtree or the smallest key in the right subtree. This replacement is guaranteed to come from a leaf. If a node has very few keys, say half full, on deletion, it looks for keys from sibling. If such a sibling exists, the node takes a key from the parent and the parent takes the extra keys from the sibling. If the sibling also has fewer keys, then the two nodes are joined to form one full node.
B* trees are similar. The only difference is that the nodes are key 2/3 full. This results in better utilization of space in the tree and slightly better performance.
B+ trees store all the keys at the leaf level, with their associated data values. The parent nodes keep the duplicates of the keys to guide the search. During insertion and deletion, the parent nodes have to be updated correctly. For example, when modifying the first key in a leaf, the last right pointer found while descending the nodes will require modification to reflect the new key value. Also, since all the keys are in the leaf level, they may be linked for sequential access.
In the B++ tree, the keys are again all in the leaves. Each node can hold k keys and the root node can hold 3k keys. During insertion, we check to see if a child node is full before we descend the level in the tree. If it is, we take the child node and the two adjacent nodes, and merge and redistribute them. If the two adjacent nodes are also full, then another node is added resulting in four nodes each three-quarters full. Similarly during deletion, we check to see if the child node is half full before we descend the level. If it is, the keys in the child nodes and the two adjacent nodes are all merged and redistributed. If the two adjacent nodes are also half-full, then they are merged into two nodes each three-quarters full.
We increase the level of the tree when the root is full during insertion in which case we distribute the keys to four new nodes, each three-quarters full. During deletion, we check the child nodes and if there are only three that are all half-full, then they are merged into the root and reduce the level of the tree.

Cluster computing

Tuesday, November 26, 2013

No comments:

Post a Comment