COMPARATIVE ANALYSIS OF RANDOM FOREST AND XGBOOST MACHINE LEARNING MODELS IN THE TASK OF SECURITY INCIDENT CLASSIFICATION

MYKOLA KONOTOPETS; OLEKSANDR TUROVSKY; OLEKSANDR AKSAMYTNYI; ALEXANDRA MATIYKO

doi:10.31891/2307-5732-2025-357-29

Authors

MYKOLA KONOTOPETS, Institute of Special Communications and Information Protection, National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", https://orcid.org/0000-0002-6963-1877
OLEKSANDR TUROVSKY, State University of Information and Communication Technologies, https://orcid.org/0000-0002-4961-0876
OLEKSANDR AKSAMYTNYI, Institute of Special Communications and Information Protection, National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", https://orcid.org/0009-0004-4439-6286
ALEXANDRA MATIYKO, Institute of Special Communications and Information Protection, National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", https://orcid.org/0000-0002-6947-5958

DOI: https://doi.org/10.31891/2307-5732-2025-357-29

Keywords:

incident classification, information security, machine learning, Random Forest, XGBoost, false positives

Abstract

The article presents a comparative study of the effectiveness of the Random Forest and XGBoost machine learning models for multi-class classification of security incidents in information systems. Two incident classification models were built in the course of the study, one based on each algorithm.
In addition to an empirical assessment of model quality, the paper describes the principles of operation of both algorithms and provides a mathematical justification. For Random Forest, a probabilistic model of the tree ensemble was formulated using indicator functions, and the regression function and the generalization error of the model were analyzed. For XGBoost, the paper considers in detail the construction of a tree at each iteration, the optimization of the loss function with a regularization term, the use of second-order gradient information, and the tree growth criterion. This formalization gives a deeper theoretical understanding of how the algorithms work and explains their behavior in real conditions.
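For orientation, the quantities named in this paragraph are commonly written as shown below in the standard literature on Random Forest and XGBoost. This is a sketch in textbook notation and not necessarily the notation used in the paper; the symbols $\phi_b$ (the $b$-th tree), $B$ (number of trees), $g_i$ and $h_i$ (first and second derivatives of the loss for sample $i$), $T$ (number of leaves), $w_j$ (leaf weights), and the regularization parameters $\gamma$, $\lambda$ are assumptions introduced here.

```latex
% Random Forest: majority vote over B trees, written with indicator functions
\hat{y}(x) = \arg\max_{k} \sum_{b=1}^{B} \mathbb{1}\left[\phi_b(x) = k\right]

% XGBoost: second-order approximation of the regularized objective at iteration t
\mathcal{L}^{(t)} \approx \sum_{i=1}^{n} \left[ g_i f_t(x_i) + \tfrac{1}{2} h_i f_t(x_i)^2 \right] + \Omega(f_t),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2} \lambda \sum_{j=1}^{T} w_j^2

% Optimal leaf weight and split gain; G and H are the sums of g_i and h_i over a node
w_j^{*} = -\frac{G_j}{H_j + \lambda},
\qquad
\text{Gain} = \frac{1}{2}\left[ \frac{G_L^2}{H_L + \lambda} + \frac{G_R^2}{H_R + \lambda} - \frac{(G_L + G_R)^2}{H_L + H_R + \lambda} \right] - \gamma
```

In this formulation a split is kept only when its gain is positive, which is one common reading of a "growth criterion"; the paper may define it differently.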
The models were compared on key classification metrics: accuracy, recall, precision, and F1 score. The Random Forest model showed a slightly higher overall accuracy (91.97%) and a better ability to identify false positive incidents, which is an important advantage when a SOC is overloaded with a large number of alerts. The XGBoost model, in turn, demonstrated stable classification of true threats (TruePositive), which is critical for rapid response in information security systems.
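The four metrics named above can be computed per class from the true and predicted labels. The following is a minimal sketch in plain Python, not the authors' evaluation code; the function name `per_class_metrics`, the toy label lists, and the `BenignPositive` class label are hypothetical and chosen only for illustration.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return overall accuracy and per-class precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    # Count each (true label, predicted label) pair; missing pairs count as 0.
    pairs = Counter(zip(y_true, y_pred))
    accuracy = sum(pairs[(c, c)] for c in labels) / len(y_true)

    report = {}
    for c in labels:
        tp = pairs[(c, c)]                               # correctly predicted as c
        predicted = sum(pairs[(t, c)] for t in labels)   # everything predicted as c
        actual = sum(pairs[(c, p)] for p in labels)      # everything truly c
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[c] = {"precision": precision, "recall": recall, "f1": f1}
    return accuracy, report


# Toy data (hypothetical labels, not results from the study).
y_true = ["TruePositive", "TruePositive", "FalsePositive",
          "FalsePositive", "BenignPositive", "BenignPositive"]
y_pred = ["TruePositive", "FalsePositive", "FalsePositive",
          "FalsePositive", "BenignPositive", "TruePositive"]

accuracy, report = per_class_metrics(y_true, y_pred)
for label, scores in report.items():
    print(label, scores)
print("accuracy:", accuracy)
```

Reporting the metrics per class, and not only overall accuracy, is what makes it possible to see that one model is stronger on false positive incidents while another is more stable on true threats.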
The results of the study can be used when integrating the developed models into SIEM, SOAR, and other information security platforms for automated preliminary classification of events.

COMPARATIVE ANALYSIS OF RANDOM FOREST AND XGBOOST MACHINE LEARNING MODELS IN THE TASK OF SECURITY INCIDENT CLASSIFICATION
