METHOD FOR TEXTS CLASSIFICATION BY PROPAGANDA CONTENT USING DEEP LEARNING NEURAL NETWORK MODELS
DOI:
https://doi.org/10.31891/2307-5732-2024-341-5-51
Keywords:
deep learning neural networks, transformer neural networks, propaganda, augmentation
Abstract
A method is proposed for classifying texts by propaganda content using deep learning neural network models. It combines traditional recurrent neural networks with long short-term memory (LSTM) and transformer neural networks, which can provide a deeper understanding of sequence and context in text content. The distinctive feature of the proposed method is that this combination allows it to detect both explicit and hidden propaganda messages, while a mechanism for augmenting the training text data expands the number of training samples.
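The abstract does not state which augmentation technique is used. As an illustration only, the sketch below assumes a simple word-level scheme (one random swap plus random deletion) to show how augmentation expands a labeled training set while preserving labels. The function names `augment` and `expand_dataset` and the parameters `n_variants` and `p_delete` are hypothetical and are not taken from the paper.

```python
import random


def augment(text, n_variants=2, p_delete=0.1, seed=0):
    """Return n_variants perturbed copies of text.

    Assumed scheme (not from the paper): swap two random words,
    then drop each word with probability p_delete.
    """
    rng = random.Random(seed)
    words = text.split()
    variants = []
    for _ in range(n_variants):
        new = words[:]
        if len(new) >= 2:
            # Swap two distinct positions.
            i, j = rng.sample(range(len(new)), 2)
            new[i], new[j] = new[j], new[i]
        # Randomly delete words; fall back to the swapped text if all are dropped.
        kept = [w for w in new if rng.random() > p_delete]
        variants.append(" ".join(kept if kept else new))
    return variants


def expand_dataset(samples, n_variants=2, seed=0):
    """Append augmented copies of every (text, label) pair, keeping the label."""
    rng = random.Random(seed)
    expanded = list(samples)
    for text, label in samples:
        for variant in augment(text, n_variants, seed=rng.randrange(10**9)):
            expanded.append((variant, label))
    return expanded


if __name__ == "__main__":
    data = [("the enemy is everywhere and must be stopped", 1),
            ("the council meets on tuesday evening", 0)]
    for text, label in expand_dataset(data):
        print(label, text)
```

With `n_variants=2` every original record yields two extra records, so the training set triples. A larger set of this kind is consistent with the abstract's observation that more epochs are needed when augmentation is used.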
To evaluate the effectiveness of the developed method, a software implementation was created. It consists of notebooks implemented in the cloud service "Google Colab" and an application for user interaction with the model. The notebooks are used to train the hybrid-architecture neural network model and to expand the obtained dataset by text augmentation. The graphical user interface application was developed in Python using the PyCharm development environment.
A dataset of more than 25,000 records was created to train the neural network, and the corresponding software was developed to investigate the effectiveness of the method. It was established that with augmentation the best performance is achieved at a larger number of epochs, which is explained by the expanded training sample requiring more epochs. With augmentation an accuracy of 97.83% was achieved, whereas without augmentation the maximum accuracy was 96.94%. The results demonstrate that the proposed method can effectively classify texts by propaganda content using deep learning neural network models. The additional category "suspicious text" made it possible to raise the Precision and Recall indicators, which in turn makes it possible to automate the moderation of texts for propaganda with an error of no more than 1.83% for false propaganda detection.
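The abstract does not describe how the "suspicious text" category is assigned. One possible reading, sketched below as an assumption, is that texts whose predicted propaganda probability falls between two thresholds are set aside rather than forced into a class, so that Precision and Recall are computed only over confident decisions. The thresholds `low=0.4` and `high=0.6`, the label strings, and the function names are hypothetical.

```python
PROPAGANDA = "propaganda"
CLEAN = "not propaganda"
SUSPICIOUS = "suspicious text"


def classify(p_propaganda, low=0.4, high=0.6):
    """Map a predicted propaganda probability to one of three categories.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    if p_propaganda >= high:
        return PROPAGANDA
    if p_propaganda <= low:
        return CLEAN
    return SUSPICIOUS


def precision_recall(y_true, y_pred):
    """Precision and recall for the propaganda class over decided texts only.

    y_true holds booleans (True = propaganda); texts predicted as
    SUSPICIOUS are excluded, since they would go to manual review.
    """
    decided = [(t, p) for t, p in zip(y_true, y_pred) if p != SUSPICIOUS]
    tp = sum(1 for t, p in decided if t and p == PROPAGANDA)
    fp = sum(1 for t, p in decided if not t and p == PROPAGANDA)
    fn = sum(1 for t, p in decided if t and p == CLEAN)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    scores = [0.95, 0.55, 0.10, 0.45, 0.80]
    truth = [True, False, False, True, True]
    preds = [classify(s) for s in scores]
    print(preds)
    print(precision_recall(truth, preds))
```

In this toy data the two borderline scores (0.55 and 0.45) would both be misclassified by a single 0.5 cut-off. Routing them to the intermediate category removes one false positive and one false negative from the automated decisions, at the cost of leaving those texts for manual review.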