BazEkon - Main Library of the Krakow University of Economics

Roy Chiranjiv (Technology Services, GSD CSC Bangalore, India), Moitra Sourov (Technology Services, GSD CSC Bangalore, India), Malhotra Rashika (Technology Services, GSD CSC Bangalore, India), Srinivasan Subramaniyan (Technology Services, GSD CSC Bangalore, India), Das Mainak (Technology Services, GSD CSC Bangalore, India)
IT Infrastructure Downtime Preemption using Hybrid Machine Learning and NLP
Annals of Computer Science and Information Systems, 2015, vol. 6, pp. 39-44, figures, tables, 28 references.
Keywords
Machine learning, Natural Language Processing (NLP), Algorithms
IT Infrastructure Management and server downtime have been an area of exploration by researchers and industry experts for over a decade. Despite the research on web server downtime, system failure, fault prediction, etc., there is a void in the field of IT Infrastructure Downtime Management. Downtime in an IT infrastructure can cause enormous financial, reputational and relationship losses for both customer and vendor. Our attempt is to address this gap by developing an innovative architecture which predicts IT infrastructure failure. We have used a hybrid approach of human-machine interaction through Big Data, Machine Learning, NLP and IR. We sourced real-time machine, operating system and application logs and unstructured case notes into an algorithm for multi-dimensional symptom mining, using iterative deepening depth-first search traversal to create transactions for Sequential Pattern Mining of symptoms to events. The resulting patterns went through multiple statistical tests and review by technology experts to create and update a dynamic Pattern Dictionary. This dictionary is used for training unsupervised and supervised classification models of machine learning, namely SVM and Random Forest, to score and predict new logs in real time. The approach is also dynamic: it uses unsupervised clustering methods to give directions to the technicians on future or unknown patterns of errors or faults, to constantly update the Pattern Dictionary and improve classification for new IT products. (original abstract)
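The abstract names iterative deepening depth-first search as the traversal used to build transactions of symptoms leading to an event, but this record does not give the data structure or the algorithm's details. The following is only a minimal sketch of the general technique, assuming a hypothetical directed graph (`SYMPTOM_GRAPH`) in which each symptom points to symptoms that were logged after it. All symptom names, the graph, and the function names are illustrative assumptions, not the authors' implementation.

```python
from typing import Dict, List, Optional

# Hypothetical symptom graph (an assumption for illustration only):
# each logged symptom maps to the symptoms observed after it.
SYMPTOM_GRAPH: Dict[str, List[str]] = {
    "disk_latency_high": ["io_wait_spike", "fs_remount_readonly"],
    "io_wait_spike": ["app_timeout"],
    "fs_remount_readonly": ["db_write_error"],
    "app_timeout": [],
    "db_write_error": ["server_down"],
    "server_down": [],
}


def depth_limited(graph: Dict[str, List[str]], node: str, goal: str,
                  limit: int, path: List[str]) -> Optional[List[str]]:
    """Depth-first search that explores at most `limit` edges below `node`."""
    path = path + [node]
    if node == goal:
        return path
    if limit == 0:
        return None
    for nxt in graph.get(node, []):
        if nxt in path:  # avoid revisiting a symptom already on this path
            continue
        found = depth_limited(graph, nxt, goal, limit - 1, path)
        if found is not None:
            return found
    return None


def iddfs(graph: Dict[str, List[str]], start: str, goal: str,
          max_depth: int) -> Optional[List[str]]:
    """Run depth-limited searches with limits 0, 1, ..., max_depth."""
    for limit in range(max_depth + 1):
        found = depth_limited(graph, start, goal, limit, [])
        if found is not None:
            return found
    return None


# One "transaction": the ordered symptoms from a first warning to the event.
transaction = iddfs(SYMPTOM_GRAPH, "disk_latency_high", "server_down", max_depth=5)
print(transaction)
```

Because the depth limit grows one step at a time, the first path returned uses the fewest edges among those reachable within `max_depth`, while memory use stays proportional to the path length. In a pipeline like the one the abstract describes, such ordered symptom lists could then serve as input transactions for sequential pattern mining.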
  1. Aggarwal, Charu C., Yu, Philip S. 2001. Outlier Detection for High Dimensional Data, ACM SIGMOD.
  2. Ghose, Udayan, Rai, C.S., Singh, Yogesh. 2010. On Multiplicative Entropy and Information Gain in Large Data Sets, International Journal of Engineering Science and Technology, 187-193.
  3. Han, Jiawei, Kamber, Micheline, Pei, Jian. 2011. Data Mining: Concepts and Techniques, 561-562, Morgan Kaufmann.
  4. Hodge, Victoria J., Austin, Jim. 2004. A Survey of Outlier Detection Methodologies, In: Artificial Intelligence Review, 85-126, Kluwer Academic Publishers, Netherlands.
  5. Knorr, Edwin M., Ng, Raymond T. 1998. Algorithms for Mining Distance-Based Outliers in Large Datasets, VLDB Conference.
  6. Minka, Thomas P. 2003. A comparison of numerical optimizers for logistic regression.
  7. Pawling, Alec, Chawla, Nitesh V., Chaudhary, Amitabh. 2005. Computing Information Gain in Data Streams, Temporal Data Mining Workshop.
  8. Pliner, Vadim. 2004. A SAS® Macro for Naïve Bayes Classification.
  9. Pokrajac, Dragoljub, Lazarevic, Aleksandar, Latecki, Longin Jan. 2007. Incremental Local Outlier Detection for Data Streams, IEEE Symposium on Computational Intelligence and Data Mining (CIDM).
  10. Rokach, Lior, Maimon, Oded. 2010. Decision Trees. In: Data Mining and Knowledge Discovery Handbook, 165-192, Springer.
  11. Sahami, Mehran. 1996. Learning Limited Dependence Bayesian Classifiers.
  12. Tan, Pang-Ning, Steinbach, Michael, Kumar, Vipin. 2007. Introduction to Data Mining, 139-20, Pearson.
  13. Agrawal, R., Imielinski, T., and Swami, A. (1993). Mining association rules between sets of items in large databases. In Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, pp. 207-216, Washington, DC, May 26-28.
  14. Agrawal, R. and Srikant, R. (1994). Fast algorithms for mining association rules. Proceedings of the 20th International Conference on Very Large Data Bases, pp. 487-499.
  15. Antonie, M., Zaïane, O. R., Coman, A. (2003). Associative Classifiers for Medical Images. Lecture Notes in Artificial Intelligence 2797, Mining Multimedia and Complex Data, pp 68-83, Springer-Verlag.
  16. Blackmore, K. and Bossomaier, T. J. (2003). Comparison of See5 and J48.PART Algorithms for Missing Persons Profiling. Technical report. Charles Sturt University, Australia.
  17. Brin, S., Motwani, R., Ullman, J., Tsur, S. (1997). Dynamic Itemset Counting and Implication Rules for Market Basket Data. Proceedings of the 1997 ACM SIGMOD International Conference on Management of Data.
  18. Cendrowska, J. (1987). PRISM: An algorithm for inducing modular rules. International Journal of Man-Machine Studies, Vol. 27, No. 4, pp. 349-370.
  19. Cohen, W. W. (1995). Fast effective rule induction. In Proceedings of the 12th International Conference on Machine Learning, Morgan Kaufmann, San Francisco, pp. 115-123.
  20. Cohen, W. W. (1993). Efficient pruning methods for separate-and-conquer rule learning systems. In Proceedings of the 13th International Joint Conference on AI, Chambéry, France.
  21. Cowling, P. and Chakhlevitch, K. (2003). Hyperheuristics for Managing a Large Collection of Low Level Heuristics to Schedule Personnel. Proceedings of the 2003 IEEE conference on Evolutionary Computation, Canberra, Australia, 8-12 Dec 2003.
  22. Dong, G., Li, J. (1999). Efficient mining of frequent patterns: Discovering trends and differences. In Proceedings of SIGKDD 1999, San Diego, California.
  23. Chris Buckley and Alan F. Lewit, Optimizations of inverted vector searches, SIGIR '85, Pages 97-110, 1985.
  24. Fayyad, U. M.; Piatetsky-Shapiro, G.; Smyth, P. (1996). Advances in knowledge discovery and data mining, MIT Press.
  25. Zaki, M. J., Parthasarathy, S., Ogihara, M., and Li, W. (1997). New algorithms for fast discovery of association rules. 3rd KDD Conference, pp. 283-286, August 1997.
  26. Charu C. Aggarwal, Stephen C. Gates and Philip S. Yu, On the merits of building categorization systems by supervised clustering, Proceedings of the fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Pages 352 - 356, 1999.
  27. Paul Bradley and Usama Fayyad, Refining Initial Points for K-Means Clustering, Proceedings of the Fifteenth International Conference on Machine Learning ICML98, Pages 91-99. Morgan Kaufmann, San Francisco, 1998
  28. Alvarez, Sergio A. Chi-squared computation for association rules: preliminary results. Technical Report BC-CS-2003-01, July 2003.