BazEkon - The Main Library of the Cracow University of Economics

Woźniak Rafał (Lodz University of Technology), Ożdżyński Piotr (Lodz University of Technology), Zakrzewska Danuta (Lodz University of Technology)
Cluster Analysis of Medical Text Documents by Using Semi-Clustering Approach Based on Graph Representation
Information Systems in Management, 2018, vol. 7, no. 3, pp. 213-224, figs., tabs., 14 refs.
Systemy Informatyczne w Zarządzaniu
Cluster analysis, Text mining
The development of the Internet has resulted in an increasing number of online text repositories. In many cases, documents are assigned to more than one class, so automatic multi-label classification needs to be used. When the number of labels exceeds the number of documents, effective label space dimension reduction may significantly improve classification accuracy, which is a major priority in the medical field. In the paper, we propose document clustering for label selection. We use a semi-clustering method based on a graph representation, in which documents are represented by vertices and edge weights are calculated according to the documents' mutual similarity. Assigning documents to semi-clusters helps to reduce the number of labels that are then used in the multi-label classification process. The performance of the method is examined in experiments conducted on real medical datasets. (original abstract)
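The graph representation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name a similarity measure or a cluster score, so the cosine similarity over raw term counts, the score formula (the one given for semi-clustering in the Pregel paper, reference 11), and the names `build_graph`, `semi_cluster_score`, and `boundary_factor` are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_graph(docs, threshold=0.0):
    """Documents become vertices 0..n-1; an edge (i, j) is kept when the
    similarity of the two documents exceeds `threshold` (assumed cut-off)."""
    vectors = [Counter(d.lower().split()) for d in docs]
    edges = {}
    for i, j in combinations(range(len(docs)), 2):
        weight = cosine(vectors[i], vectors[j])
        if weight > threshold:
            edges[(i, j)] = weight
    return edges


def semi_cluster_score(cluster, edges, boundary_factor=0.5):
    """Pregel-style score: (I - f_B * B) / (V * (V - 1) / 2), where I sums the
    weights of edges inside the cluster and B sums the weights of edges that
    leave it. Returning 0.0 for fewer than two vertices is a convention
    chosen here, not something stated in the source."""
    members = set(cluster)
    size = len(members)
    if size < 2:
        return 0.0
    inner = sum(w for (i, j), w in edges.items() if i in members and j in members)
    boundary = sum(w for (i, j), w in edges.items() if (i in members) != (j in members))
    return (inner - boundary_factor * boundary) / (size * (size - 1) / 2)


docs = ["heart attack treatment", "heart attack risk", "skin rash cream"]
edges = build_graph(docs)
related = semi_cluster_score([0, 1], edges)
unrelated = semi_cluster_score([0, 2], edges)
```

In this toy input the first two documents share vocabulary, so the candidate semi-cluster `[0, 1]` scores higher than `[0, 2]`. A full semi-clustering procedure would additionally grow and rank candidate clusters per vertex, which this sketch omits.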
  1. Tsoumakas G., Katakis I., Vlahavas I. (2008) Effective and Efficient Multilabel Classification in Domains with Large Number of Labels, Proceedings of ECML/PKDD Workshop on Mining Multidimensional Data, MMD'08, 30-44.
  2. Balasubramanian K., Lebanon G. (2012) The Landmark Selection Method for Multiple Output Prediction, Proceedings of the 29th International Conference on Machine Learning, Edinburgh, Scotland, UK, 983-990.
  3. Read J., Pfahringer B., Holmes G. (2008) Multi-label Classification Using Ensembles of Pruned Sets, Proceedings of 8th IEEE International Conference on Data Mining, 995-1000.
  4. Bi W., Kwok J. (2013) Efficient Multi-label Classification with Many Labels, Proceedings of the 30th International Conference on Machine Learning 28, Atlanta, Georgia, USA, III-405-III-413.
  5. Hsu D., Kakade S.M., Langford J., Zhang T. (2009) Multi-label Prediction via Compressed Sensing, Bengio Y., Schuurmans D., Lafferty J.D., Williams C.K.I., Culotta A. [eds]: Advances in Neural Information Processing Systems 22, Curran Associates Inc., 772-780.
  6. Lin Z., Ding G., Hu M., Wang J. (2014) Multi-label Classification via Feature-aware Implicit Label Space Encoding, Proceedings of the 31st International Conference on Machine Learning 32, Beijing, China, II-325-II-333.
  7. Chen Y.-N., Lin H.-T. (2012) Feature-aware Label Space Dimension Reduction for Multi-label Classification, Proceedings of the 25th International Conference on Neural Information Processing Systems 1, Nevada, USA, 1529-1537.
  8. Herrera F., Charte F., Rivera A.J., del Jesus M.J. (2016) Multilabel Classification. Problem Analysis, Metrics and Techniques, Springer Switzerland.
  9. Hangal S., MacLean D., Lam M.S., Heer J. (2010) All Friends are Not Equal: Using Weights in Social Graphs to Improve Search, Proceedings of the 4th ACM Workshop on Social Network Mining and Analysis, Washington, USA, 1-7.
  10. Andersen J.S., Zukunft O. (2016) Semi-Clustering that Scales: An Empirical Evaluation of GraphX, Proceedings of the 2016 IEEE International Congress on Big Data, San Francisco, USA, 333-336.
  11. Malewicz G., Austern M.H., Bik A.J.C., Dehnert J.C., Horn I., Leiser N., Czajkowski G. (2010) Pregel: A System for Large-Scale Graph Processing, Proceedings of the 2010 International Conference on Management of Data, New York, USA, 135-146.
  12. (accessed November 20, 2017).
  13. (accessed November 20, 2017).
  14. Boring C.C., Squires T.S., Tong T. (1991) Cancer statistics, 1991, CA: A Cancer Journal for Clinicians, 41(6), 19-36.