BazEkon - Main Library of the Krakow University of Economics

Author
Wójcik Filip (Wroclaw University of Economics and Business, Poland)
Title
Utilization of Deep Reinforcement Learning for Discrete Resource Allocation Problem in Project Management - a Simulation Experiment
Wykorzystanie uczenia ze wzmocnieniem w problemach dyskretnej alokacji zasobów w zarządzaniu projektami - eksperyment symulacyjny
Source
Informatyka Ekonomiczna / Uniwersytet Ekonomiczny we Wrocławiu, 2022, no. 1 (63), pp. 56-74, figures, tables, charts, bibliography: 47 items
Business Informatics / Uniwersytet Ekonomiczny we Wrocławiu
Keywords
Operations research, Management, Optimization
Notes
JEL classification: C44, C45, C61
Abstracts in Polish and English.
Abstract
This paper tests the applicability of deep reinforcement learning (DRL) algorithms to simulated problems of constrained discrete and online resource allocation in project management. DRL is an extensively researched method in various domains, although no similar case study was found when writing this paper. The hypothesis was that a carefully tuned RL agent could outperform an optimisation-based solution. The RL agents: VPG, AC, and PPO, were compared against a classic constrained optimisation algorithm in trials: "easy"/"moderate"/"hard" (70/50/30% average project success rate). Each trial consisted of 500 independent, stochastic simulations. The significance of the differences was checked using a Welch ANOVA on significance level alpha = 0.01, followed by post hoc comparisons for false-discovery control. The experiment revealed that the PPO agent performed significantly better in moderate and hard simulations than the optimisation approach and other RL methods. (original abstract)
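The abstract names a Welch ANOVA as the significance test. As a rough illustration only, the statistic can be sketched in plain Python as below. This is not the author's code: the function name `welch_anova`, the method labels, and every number are assumptions made for the example, and the scores are synthetic draws rather than results from the paper.

```python
import random
from statistics import mean, variance


def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA (unequal variances)."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Weight each group by n_i / s_i^2, so high-variance groups count less.
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_w = sum(weights)
    grand_mean = sum(w * m for w, m in zip(weights, means)) / total_w

    # Weighted between-group variation.
    between = sum(w * (m - grand_mean) ** 2
                  for w, m in zip(weights, means)) / (k - 1)
    # Correction term that also determines the denominator degrees of freedom.
    lam = sum((1 - w / total_w) ** 2 / (n - 1)
              for w, n in zip(weights, sizes))
    f_stat = between / (1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, df1, df2


# Synthetic placeholder scores, NOT results from the paper.
# 500 draws per method mirrors the 500 simulations per trial in the abstract.
rng = random.Random(0)
scores = {
    "optimisation": [rng.gauss(0.50, 0.10) for _ in range(500)],
    "VPG": [rng.gauss(0.48, 0.15) for _ in range(500)],
    "AC": [rng.gauss(0.50, 0.12) for _ in range(500)],
    "PPO": [rng.gauss(0.55, 0.08) for _ in range(500)],
}

f_stat, df1, df2 = welch_anova(list(scores.values()))
print(f"Welch F = {f_stat:.3f}, df1 = {df1}, df2 = {df2:.1f}")
# The p-value is the upper tail of the F distribution with (df1, df2) degrees
# of freedom, e.g. scipy.stats.f.sf(f_stat, df1, df2) if SciPy is installed;
# it would then be compared with alpha = 0.01.
```

With two groups the statistic reduces to the square of Welch's t statistic, which is a convenient sanity check. The post hoc comparisons mentioned in the abstract are not covered by this sketch.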
Available at
Biblioteka SGH im. Profesora Andrzeja Grodka
Biblioteka Główna Uniwersytetu Ekonomicznego we Wrocławiu
ISSN
1507-3858
Language
eng
URI / DOI
http://dx.doi.org/10.15611/ie.2022.1.05