Given a distance function between two terms that measures the similarity between the two terms, we build a tree of clusters which we traverse to insert the term in the cluster with the nearest center. For that cluster we recompute the center as if the record r is inserted into it. If the cluster threshold is exceeded, we can proceed to the next record. If the tree grows beyond a maximum number of clusters because we want to keep only a few clusters, then we can increase the threshold so that the clusters can be merged or accomodate more records
B+ Tree:
B+ Tree:
class Node:
def __init__(self, data, l = None, r = None, center = None):
self.l = l
self.next = None
l.next = r
self.r = r
self.center = center
def value(self):
return self.center
No comments:
Post a Comment