BazEkon - Biblioteka Główna Uniwersytetu Ekonomicznego w Krakowie

Author
Korzeniewski Jerzy (University of Lodz, Poland)
Title
New Method of Variable Selection for Binary Data Cluster Analysis
Source
Statistics in Transition, 2016, vol. 17, no. 2, pp. 295-304, tables, bibliography pp. 303-304
Keywords
Cluster analysis, Market segmentation, Variables selection
Notes
Includes a summary.
Proceedings of the Multivariate Statistical Analysis 2015 conference, Łódź.
Abstract
Cluster analysis of binary data is relatively poorly developed compared with cluster analysis of data measured on stronger scales. For example, at the variable selection stage one can use many methods designed for arbitrary measurement scales, but the results are usually of poor quality. In practice, the only methods dedicated to variable selection for binary data are those proposed by Brusco (2004), Dash et al. (2000) and Talavera (2000). In this paper the efficiency of these methods is discussed with reference to the marketing-type data of Dimitriadou et al. (2002). The primary objective, however, is to propose a new variable selection method that combines filtering of the input set of all variables with grouping of sets of variables that yield similar groupings of objects. The new method is an attempt to link the good features of two entirely different approaches to variable selection in cluster analysis, i.e. filtering methods and wrapper methods. It returns the best results when the classical k-means method of grouping objects is slightly modified. (original abstract)
Available at
Biblioteka Główna Uniwersytetu Ekonomicznego w Krakowie
Biblioteka SGH im. Profesora Andrzeja Grodka
Biblioteka Główna Uniwersytetu Ekonomicznego w Katowicach
Biblioteka Główna Uniwersytetu Ekonomicznego we Wrocławiu
Bibliography
  1. CARMONE, F., KARA, A., MAXWELL, S., (1999). HINoV: A New Model to Improve Market Segment Definition by Identifying Noisy Variables, Journal of Marketing Research, Vol. 36, No. 4, 501-510.
  2. DASH, M., LIU, H., (2000). Feature selection for clustering, Proceedings of Fourth Pacific-Asia Conference on Knowledge Discovery and Data Mining, (PAKDD), 110-121.
  3. DEVANEY, M., RAM, A., (1997). Efficient feature selection in conceptual clustering, Proceedings of the Fourteenth International Conference on Machine Learning, Nashville, 92-97.
  4. DIMITRIADOU, E., DOLNICAR, S., WEINGESSEL, A., (2002). An Examination of Indexes for Determining the Number of Clusters in Binary Data Sets, Psychometrika 67(1), 137-160.
  5. HUBERT, L., ARABIE, P., (1985). Comparing Partitions, Journal of Classification 2.
  6. LEISCH, F., WEINGESSEL, A., HORNIK, K., (2015). Bindata package manual.
  7. KORZENIEWSKI, J., (2012). Selekcja zmiennych w analizie skupień. Nowe procedury [The selection of variables in cluster analysis. New procedures], Wydawnictwo Uniwersytetu Łódzkiego, Łódź.
  8. RAFTERY, A. E., DEAN, N., (2006). Variable selection for model-based clustering, Journal of the American Statistical Association 101(473), 168-178.
  9. STEINLEY, D., BRUSCO, M., (2007). Initializing k-means batch clustering: A critical evaluation of several techniques, Journal of Classification 24, 99-121.
  10. STEINLEY, D., BRUSCO, M., (2008). Selection of Variables in Cluster Analysis: An Empirical Comparison of Eight Procedures, Psychometrika 73, 125-144.
  11. TALAVERA, L., (2000). Dependency-Based Feature Selection for Clustering Symbolic Data, Intelligent Data Analysis 4, 19-28.
ISSN
1234-7655
Language
eng