Clustering Categorical Data Using Hierarchies (CLUCDUH)


SİLAHTAROĞLU G.

World Academy of Science, Engineering and Technology, vol.56, pp.334-339, 2009 (Scopus)

  • Publication Type: Article / Full Article
  • Volume: 56
  • Publication Date: 2009
  • Journal Name: World Academy of Science, Engineering and Technology
  • Journal Indexes: Scopus
  • Page Numbers: pp.334-339
  • Keywords: Clustering, Entropy, Gini, Pruning, Split, Tree
  • Affiliated with İstanbul Medipol Üniversitesi: Yes

Abstract

Clustering large populations is an important problem when the data contain noise and clusters of different shapes. A good clustering algorithm should be sensitive enough to detect such clusters while remaining efficient. Besides space complexity, time complexity becomes increasingly important as the data size grows. Using hierarchies, we developed a new algorithm that splits on attributes according to the values they take, choosing the splitting dimension so as to divide the database into parts that are as nearly equal as possible. At each node we calculate certain descriptive statistics of the data residing there, and by pruning we generate the natural clusters with a complexity of O(n).
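
The splitting idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not state the exact split criterion, so the sketch assumes a normalized Shannon entropy of each attribute's value distribution as the measure of how evenly an attribute divides the records. The names `balance_score`, `choose_split_attribute`, `build_tree`, and the `min_size` parameter are hypothetical, and the pruning step and per-node statistics are omitted apart from a record count.

```python
from collections import Counter, defaultdict
from math import log2


def balance_score(records, attr):
    """Normalized entropy of an attribute's values: 1.0 means a perfectly even split.

    Assumption: this stands in for the paper's (unstated) split criterion.
    """
    counts = Counter(r[attr] for r in records)
    k = len(counts)
    if k < 2:
        return 0.0  # a single value cannot split the records
    n = len(records)
    entropy = -sum((c / n) * log2(c / n) for c in counts.values())
    return entropy / log2(k)


def choose_split_attribute(records, attrs):
    """Pick the attribute whose values divide the records most evenly."""
    return max(attrs, key=lambda a: balance_score(records, a))


def build_tree(records, attrs, min_size=2):
    """Recursively split records on categorical attributes, one attribute per level."""
    node = {"size": len(records)}  # placeholder for per-node statistics
    if len(records) < min_size or not attrs:
        return node
    best = choose_split_attribute(records, attrs)
    if balance_score(records, best) == 0.0:
        return node  # no remaining attribute separates these records
    groups = defaultdict(list)
    for r in records:
        groups[r[best]].append(r)
    remaining = [a for a in attrs if a != best]
    node["attr"] = best
    node["children"] = {
        value: build_tree(group, remaining, min_size)
        for value, group in groups.items()
    }
    return node


records = [
    {"color": "red", "size": "S"},
    {"color": "red", "size": "S"},
    {"color": "blue", "size": "S"},
    {"color": "blue", "size": "L"},
]
tree = build_tree(records, ["color", "size"])
```

In this toy data `color` divides the four records two and two, while `size` divides them three and one, so the root splits on `color`. A pruning pass over the resulting tree, which the abstract describes as the step that yields the natural clusters, would follow and is not shown.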