ADAPTIVE MANAGEMENT OF RESOURCES OF A COMPLEX INFORMATION PROTECTION SYSTEM BASED ON THE SYNTHESIS OF GAMES THEORY AND REINFORCED LEARNING

Authors

DOI:

https://doi.org/10.31891/2307-5732-2026-361-15

Keywords:

dynamic Bayesian games, reinforcement learning (Q-learning), adaptive cyber defense management, game theory, information asymmetry, comprehensive information protection system (CIPS), access control system (ACS), resource optimization

Abstract

The article develops and theoretically substantiates a method for adaptive management of cyber defense resources that combines dynamic Bayesian games with reinforcement learning. The method models the confrontation between a rational defender and an attacker when the defender has incomplete information about the adversary. Uncertainty about the attacker’s skill level or motivation is formalized as a prior probability distribution over the attacker’s hidden type.

The proposed method operates not as a one-time calculation but as a continuous, iterative cycle of monitoring, adaptation, and decision-making phases. Its key element is a dynamic adaptation mechanism that applies Bayes’ rule to update the probabilistic beliefs about the adversary’s type each time a specific attack action is observed, so the system refines its picture of the threat landscape in real time. Solving such dynamic games analytically is computationally prohibitive. Therefore a reinforcement learning algorithm, Q-learning, is used to approximate the Bayesian Nash equilibrium and, with it, the defender’s optimal long-term strategy, that is, the strategy that minimizes the cumulative cost of both security measures and potential damage. The defense agent learns this policy through simulated interactions, balancing immediate defense costs against future risks.
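The cycle described above (observe an attack action, update the belief about the attacker's type, choose a defense level) can be illustrated with a minimal sketch. It is not the authors' implementation: the two attacker types, the action likelihoods, the costs, the breach probabilities, and the belief discretization are all hypothetical values chosen for illustration. The attacker here follows a fixed stochastic policy per type rather than best-responding, so the sketch shows Bayesian belief updating combined with tabular Q-learning, not an equilibrium computation.

```python
import random

# All numbers below are hypothetical, chosen only to illustrate the idea.
PRIOR = {"novice": 0.5, "expert": 0.5}            # prior over hidden attacker type
LIKELIHOOD = {                                    # P(observed attack action | type)
    "novice": {"scan": 0.8, "exploit": 0.2},
    "expert": {"scan": 0.2, "exploit": 0.8},
}
DEFENCE_COST = {"low": 1.0, "high": 3.0}          # cost of each defense level
BREACH_PROB = {                                   # P(breach | defense, attack action)
    "low": {"scan": 0.0, "exploit": 0.5},
    "high": {"scan": 0.0, "exploit": 0.1},
}
DAMAGE = 10.0                                     # loss incurred by a breach
N_BUCKETS = 5                                     # belief discretization for the Q-table


def bayes_update(belief, observed):
    """Posterior over attacker types after observing one attack action."""
    unnorm = {t: belief[t] * LIKELIHOOD[t][observed] for t in belief}
    total = sum(unnorm.values())
    return {t: p / total for t, p in unnorm.items()}


def bucket(belief):
    """Map P(expert) to a discrete state index in [0, N_BUCKETS - 1]."""
    return min(int(belief["expert"] * N_BUCKETS), N_BUCKETS - 1)


def train(episodes=5000, horizon=10, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning over (belief bucket, defense level) pairs."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_BUCKETS) for a in DEFENCE_COST}
    for _ in range(episodes):
        # The true type is drawn from the prior and stays hidden from the defender.
        attacker = "expert" if rng.random() < PRIOR["expert"] else "novice"
        belief = dict(PRIOR)
        state = bucket(belief)
        for _ in range(horizon):
            # Epsilon-greedy choice of defense level.
            if rng.random() < eps:
                action = rng.choice(list(DEFENCE_COST))
            else:
                action = max(DEFENCE_COST, key=lambda a: q[(state, a)])
            # Attacker acts according to its (simplified, non-strategic) type policy.
            exploit = rng.random() < LIKELIHOOD[attacker]["exploit"]
            attack = "exploit" if exploit else "scan"
            breached = rng.random() < BREACH_PROB[action][attack]
            # Reward is the negative of defense cost plus any damage.
            reward = -(DEFENCE_COST[action] + (DAMAGE if breached else 0.0))
            # Monitoring and adaptation: update the belief, then move to the new state.
            belief = bayes_update(belief, attack)
            nxt = bucket(belief)
            best_next = max(q[(nxt, a)] for a in DEFENCE_COST)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
    return q
```

Calling `train()` returns a Q-table from which a policy can be read by taking, for each belief bucket, the defense level with the highest value. In this toy setup the defender's action does not influence what the attacker does next, so the example mainly shows how the belief state feeds the learner; a faithful model of the game would also need a strategic attacker.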

The article proves theoretically that this dynamic, proactive approach is more effective than traditional static or purely reactive methods: by anticipating rational attacker behavior and adapting to the attacker's type, it minimizes expected costs over the entire duration of the conflict rather than at each step in isolation. The method has practical significance for the development of next-generation intelligent decision support systems (DSS) for Security Operations Centers (SOCs). Because the solution is algorithmic, it can be readily integrated into existing security frameworks, such as Comprehensive Information Protection Systems (CIPS) and Access Control Systems (ACS), giving them an intelligent core for automated resource allocation and strategic defense.

Published

2026-01-29

How to Cite

DZHULIY, V., MULIAR, I., RATUSHNYAK, M., & CHESHUN, V. (2026). ADAPTIVE MANAGEMENT OF RESOURCES OF A COMPLEX INFORMATION PROTECTION SYSTEM BASED ON THE SYNTHESIS OF GAMES THEORY AND REINFORCED LEARNING. Herald of Khmelnytskyi National University. Technical Sciences, 361(1), 120-126. https://doi.org/10.31891/2307-5732-2026-361-15