In the previous post we discussed out-of-the-box support for data mining in server products, and before that we discussed text mining methods that involve clustering, including the choice of clustering method. We favored clustering because it let us evaluate topics and keywords using similarity measures, and because we could not determine predictive parameters for keyword extraction.
Suppose instead that keywords carry a predictive parameter in and of themselves as they appear in the input text. Then we may be able to optimize keyword extraction significantly and take a simpler approach. The parameter could be derived from a large trained data set or from exploring graphs in a word thesaurus or ontology. That said, if we want to find words similar to those that occur in the input text, we still resort to clustering.
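To make the idea of a per-word predictive parameter concrete, here is a minimal sketch. It assumes the parameter is a smoothed inverse document frequency learned from a small background corpus, which is only one possible choice and not something this post has settled on. The names `train_weights` and `extract_keywords` are hypothetical, as is the toy corpus.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train_weights(corpus):
    """Learn one weight per word from a background corpus.

    Assumption: the weight is a smoothed inverse document frequency,
    so words that appear in fewer documents get a higher weight.
    Returns the weight table and a default weight for unseen words.
    """
    n_docs = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(tokenize(doc)))
    weights = {
        word: math.log((1 + n_docs) / (1 + df)) + 1.0
        for word, df in doc_freq.items()
    }
    default = math.log(1 + n_docs) + 1.0  # document frequency of zero
    return weights, default


def extract_keywords(text, weights, default, top_k=3):
    """Score each word as (count in text) * (learned weight), keep the top_k."""
    counts = Counter(tokenize(text))
    scores = {w: c * weights.get(w, default) for w, c in counts.items()}
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


if __name__ == "__main__":
    corpus = [
        "the cat sat on the mat",
        "the dog sat on the log",
        "the bird flew over the house",
    ]
    weights, default = train_weights(corpus)
    print(extract_keywords("the cat chased the cat toy", weights, default))
```

Each word is scored on its own from a lookup table, so no pairwise similarity computation is needed for extraction. Finding words that are similar to the extracted ones is a separate step, and that is where clustering would still come in.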