Method for neural network detection of cyberbullying using cloud services and an object-oriented model
DOI: https://doi.org/10.31891/2307-5732-2024-333-2
Keywords: cyberbullying detection, neural network, object-oriented model, cyberbullying, cloud services, Google Colab, BiLSTM, tweet
Abstract
The article proposes a method for the automated detection of cyberbullying in text messages published on social networks, based on a neural network approach. The current state of automated cyberbullying detection is reviewed, and on that basis a method of neural network detection of cyberbullying using cloud services and an object-oriented model is proposed. To investigate the effectiveness of the proposed method, a software implementation was written in Python in the PyCharm development environment, together with a notebook for training the neural network in the Google Colab cloud service. The English-language "Cyberbullying Classification" dataset, consisting of 39,747 samples, was used as research data; from it, 8,000 samples containing cyberbullying and 7,945 samples without cyberbullying were selected.
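The abstract does not say how the two subsets were drawn from the full dataset. As a minimal sketch, assuming a simple random draw with a per-class cap, the selection step could look like the following. The function name `select_subset`, the `(text, label)` tuple layout, the label encoding, and the fixed seed are illustrative assumptions, not the authors' implementation.

```python
import random
from typing import List, Tuple

# (text, label); label 1 = cyberbullying, 0 = no cyberbullying (assumed encoding)
Sample = Tuple[str, int]


def select_subset(samples: List[Sample], per_class_cap: int, seed: int = 42) -> List[Sample]:
    """Randomly draw at most `per_class_cap` samples from each class."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    selected: List[Sample] = []
    for label in (1, 0):
        pool = [s for s in samples if s[1] == label]
        k = min(per_class_cap, len(pool))  # a class may hold fewer samples than the cap
        selected.extend(rng.sample(pool, k))
    rng.shuffle(selected)  # mix the classes before training
    return selected


# Toy data standing in for the real dataset (hypothetical content).
toy = [(f"bullying text {i}", 1) for i in range(10)] + [(f"neutral text {i}", 0) for i in range(4)]
subset = select_subset(toy, per_class_cap=5)
```

With a cap applied per class, a class that has fewer records than the cap contributes all of its records, which is one way a nearly balanced but unequal split can arise.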
The proposed approach, based on the object-oriented model, contributes to the creation of flexible, adaptive and easily extensible systems. Using cloud services to train neural networks means the work is not limited to the resources of a physical personal computer: specialized computing resources such as graphics processing units (GPUs) or tensor processing units (TPUs) can be used, which significantly accelerate neural network training. The main limitation of the proposed approach is the maximum length of the input sequence, which is 500 characters. Given that the average length of a tweet is about 200 characters, this limitation is reasonable. To improve the results, the training dataset can be supplemented and restrictions on the minimum and maximum record length can be specified.
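The length restriction mentioned above can be illustrated with a short sketch. Only the 500-character maximum comes from the text; the 10-character minimum, the choice to truncate rather than discard over-long records, and the names `MAX_LEN`, `MIN_LEN`, and `prepare_records` are assumptions made for illustration.

```python
from typing import Iterable, List

MAX_LEN = 500  # maximum input length in characters, as stated in the text
MIN_LEN = 10   # hypothetical lower bound; the text does not give a value


def prepare_records(texts: Iterable[str], min_len: int = MIN_LEN, max_len: int = MAX_LEN) -> List[str]:
    """Drop records shorter than `min_len` and truncate the rest to `max_len` characters."""
    prepared: List[str] = []
    for text in texts:
        cleaned = text.strip()
        if len(cleaned) < min_len:
            continue  # too short to carry a useful signal (assumed policy)
        prepared.append(cleaned[:max_len])
    return prepared


records = ["ok", "a tweet of ordinary length", "x" * 800]
prepared = prepare_records(records)
```

Whether to truncate or drop over-long records is a design choice: truncation keeps more training data, while dropping avoids feeding the network sentences that were cut off mid-thought.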