Given a distance function between two terms that measures the similarity between the two terms, we build a tree of clusters which we traverse to insert the term in the cluster with the nearest center. For that cluster we recompute the center as if the record r is inserted into it. If the cluster threshold is exceeded, we can proceed to the next record. If the tree grows beyond a maximum number of clusters because we want to keep only a few clusters, then we can increase the threshold so that the clusters can be merged or accomodate more records
B+ Tree:
   
B+ Tree:
class Node:
     def __init__(self, data, l = None,  r = None, center = None):
          self.l = l
   self.next = None
   l.next = r
          self.r = r
          self.center = center
def value(self): 
         return self.center
No comments:
Post a Comment