A MODEL OF A DATA LEAKAGE PREVENTION SYSTEM WITH EVOLUTIONARY ADAPTATION BASED ON GENETIC ALGORITHMS
DOI:
https://doi.org/10.31891/2307-5732-2025-357-56
Keywords:
data leakage prevention, genetic algorithms, evolutionary adaptation, concept drift, behavioral profiling, document classification
Abstract
The paper proposes a data leakage prevention (DLP) system model with evolutionary adaptation that integrates three components within a four-tier architecture: (1) document classification by a genetic algorithm that evolves production IF-THEN rules; (2) a dual-window concept drift detector that combines model uncertainty monitoring with the Kolmogorov–Smirnov test and is supported by a warm-start policy archive; and (3) user behavioral profiling with exponential forgetting. For content analysis, the genetic algorithm synthesizes a set of IF-THEN rules. Each rule places conditions on specific document features and evaluates the degree to which the document belongs to a given confidentiality class. Chromosomal encoding of the classification rules keeps the classifier fully interpretable, while a multi-objective fitness function balances accuracy against rule set complexity. Experimental evaluation on the SMS Spam Collection (5574 messages) and Synthetic PII (2000 documents) datasets shows that the genetic classifier achieves an F1-score of 0.985 on the personal data detection task, which is 1.5 percentage points below ensemble methods (F1 = 1.000). The evolutionary adaptation mechanism provides a 13.9% improvement in prequential accuracy compared to a static model, and the drift detector correctly identifies all abrupt drift points (Recall = 1.00). The proposed model is suitable for deployment in regulated industries where GDPR and HIPAA compliance requires that decisions to block information flows be justified. The approach can be extended by integrating transformer models for feature extraction, adapting it to multi-class classification, and deploying it in distributed corporate environments using federated learning.
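As an illustration of the rule-based classifier and complexity-aware fitness described in the abstract, the following is a minimal sketch and not the authors' implementation. It makes several assumptions that the abstract does not state: rules are crisp (they fire or not, instead of returning a degree of membership), the first firing rule decides the class, and the multi-objective fitness is collapsed into a weighted sum with a hypothetical penalty weight `alpha`. The names `Condition`, `Rule`, `classify`, and `fitness`, as well as the two toy features, are invented for the example.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Condition:
    """One gene: a threshold test on a single document feature."""
    feature: int      # index into the feature vector
    op: str           # ">=" or "<"
    threshold: float

    def holds(self, x: Sequence[float]) -> bool:
        value = x[self.feature]
        return value >= self.threshold if self.op == ">=" else value < self.threshold


@dataclass(frozen=True)
class Rule:
    """IF all conditions hold THEN assign `label`."""
    conditions: Tuple[Condition, ...]
    label: int

    def fires(self, x: Sequence[float]) -> bool:
        return all(c.holds(x) for c in self.conditions)


def classify(rules: Sequence[Rule], x: Sequence[float], default: int = 0) -> int:
    """Return the label of the first rule that fires, else the default class."""
    for rule in rules:
        if rule.fires(x):
            return rule.label
    return default


def fitness(rules: Sequence[Rule], X, y, alpha: float = 0.01) -> float:
    """Accuracy minus a complexity penalty (assumed weighted-sum scalarization)."""
    correct = sum(classify(rules, x) == target for x, target in zip(X, y))
    accuracy = correct / len(X)
    complexity = sum(len(r.conditions) for r in rules)  # total number of conditions
    return accuracy - alpha * complexity


# Toy data with two hypothetical features: [digit_ratio, contains_at_sign].
X = [[0.6, 0.0], [0.1, 1.0], [0.05, 0.0], [0.0, 0.0]]
y = [1, 1, 0, 0]  # 1 = confidential, 0 = public

# A candidate chromosome: two single-condition rules.
chromosome = (
    Rule((Condition(0, ">=", 0.5),), label=1),
    Rule((Condition(1, ">=", 0.5),), label=1),
)

score = fitness(chromosome, X, y)
```

In a full genetic algorithm, `fitness` would rank chromosomes during selection, and crossover and mutation would operate on the `Condition` genes. Because every decision traces back to a readable rule, a blocked document can be explained by listing the conditions that fired.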
License
Copyright (c) 2025 ПЕТРО ВІЖЕВСЬКИЙ, ЮРІЙ КРАВЧИК (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.