Suffix tree clustering
Suffix tree clustering, often abbreviated as STC, is an approach to clustering that uses suffix trees.[1] The suffix tree keeps track of every n-gram, of any length, that occurs in the word strings inserted into it, and it allows further strings to be inserted incrementally, one after another. This has the advantage that a large collection can be processed sequentially. A potential disadvantage is that the number of candidate documents that must be examined also grows when handling large data sets. Suffix tree clustering can be either decompositional or agglomerative in nature, depending on the type of data being handled.[2]
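The idea of recording every word n-gram while documents arrive one at a time can be illustrated with a minimal sketch. It is not taken from the cited sources: it uses a naive, uncompressed word-level suffix trie rather than a true linear-time suffix tree, and the names `Node`, `SuffixTrie`, `insert`, `shared_phrases`, and `min_docs` are hypothetical. Each phrase shared by at least two documents is treated as a candidate cluster.

```python
from collections import defaultdict


class Node:
    """One node of a word-level suffix trie (uncompressed, for clarity only)."""

    def __init__(self):
        self.children = defaultdict(Node)  # next word -> child node
        self.docs = set()                  # ids of documents containing this phrase


class SuffixTrie:
    """Illustrative stand-in for a suffix tree; insertion here is quadratic per document."""

    def __init__(self):
        self.root = Node()

    def insert(self, doc_id, text):
        """Add one document incrementally by walking in every suffix of its words."""
        words = text.lower().split()
        for start in range(len(words)):
            node = self.root
            for word in words[start:]:
                node = node.children[word]
                node.docs.add(doc_id)

    def shared_phrases(self, min_docs=2):
        """Return {phrase: doc ids} for phrases found in at least min_docs documents."""
        found = {}
        stack = [((), self.root)]
        while stack:
            phrase, node = stack.pop()
            for word, child in node.children.items():
                # A longer phrase can never occur in more documents than its
                # prefix, so branches below the threshold are skipped entirely.
                if len(child.docs) >= min_docs:
                    longer = phrase + (word,)
                    found[" ".join(longer)] = set(child.docs)
                    stack.append((longer, child))
        return found


trie = SuffixTrie()
trie.insert(0, "cat ate cheese")
trie.insert(1, "mouse ate cheese too")
trie.insert(2, "cat ate mouse too")
clusters = trie.shared_phrases()
```

Here `clusters` maps phrases such as "ate cheese" to the set of documents containing them, and a fourth document could be added later with another call to `insert` without rebuilding the structure. A production implementation would use a compressed suffix tree to keep construction time linear.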
References
- Branson, Steve; Greenberg, Ari. "Clustering Web Search Results Using Suffix Tree Methods, CS276A Final Project" (PDF). www.stanford.edu. Stanford University. Retrieved 2 January 2015.
- Davis, Ernest. "Lecture 4: Clustering". www.cs.nyu.edu. New York University. Retrieved 2 January 2015.
This article is issued from Wikipedia, version of 7/3/2015. The text is available under the Creative Commons Attribution/Share Alike license, but additional terms may apply to the media files.