NAIVE RULE-BASED METHOD IN SENTIMENT ANALYSIS OF UKRAINIAN-LANGUAGE CONTENT
DOI:
https://doi.org/10.31891/2307-5732-2024-343-6-22
Keywords:
sentiment analysis, Naive Rule-Based approach, Ukrainian language, text preprocessing, emotion detection, sentiment lexicon, natural language processing (NLP), data analysis
Abstract
This paper examines the challenges of applying Naive Rule-Based algorithms to sentiment analysis of Ukrainian-language content. Text sentiment analysis is useful for content such as customer feedback, brand tracking, political stances and psychological analyses. The work describes a sequence of steps, beginning with text preprocessing, in which elements such as emojis, links, hashtags and other special characters are removed. The next steps are tokenization, where the text is broken into small units or words, and lemmatization, where words are reduced to their base form for analysis. Once these steps are complete, a sentiment lexicon is used to classify the tone of the text as positive, negative or neutral. Although very basic, the Naive Rule-Based method is a simple and effective way to carry out sentiment analysis, well suited to scenarios in which more advanced machine learning techniques are not feasible because of data, computing power or time limitations. The technique is easy to customize and can accommodate domain-specific language simply by expanding the sentiment lexicon or changing the rules. Nevertheless, the method falls short in some respects. It can struggle with more complex properties of language, such as sarcasm, context-dependent meaning and deeply nested sentence structure, and this limits its accuracy. This study proposes recommendations to address these limitations, including extending the existing emotion lexicon so that more emotions can be recognized and implementing context-aware methods. Further, hybrid methods that combine rules with machine learning approaches are identified as a prospect for improving effectiveness.
A case study demonstrates the effectiveness of the Naive Rule-Based approach as applied to a Ukrainian-language dataset. The results show that the method can provide clear sentiment scoring and emotion classification. Although this approach does not reach the level of machine learning models, it remains efficient and feasible for specific use cases, especially where speed and clarity take precedence. The findings emphasize that, in resource-poor settings, sentiment analysis can be carried out with this technique using simple data processing tools, despite its limited precision and sensitivity to context, and the study outlines ways to improve both accuracy and context handling.
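
The pipeline outlined in the abstract (preprocessing, tokenization, lemmatization, lexicon lookup) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The lexicon entries, the scores, the lemma table and all function names below are hypothetical and chosen only for illustration; a practical system would use a full Ukrainian sentiment lexicon and a real lemmatizer in place of the toy lookup table.

```python
import re

# Hypothetical toy lexicon: lemma -> polarity score (illustrative values only).
LEXICON = {
    "чудовий": 2,
    "добрий": 1,
    "любити": 1,
    "поганий": -1,
    "жахливий": -2,
}

# Hypothetical lemma table standing in for a real Ukrainian lemmatizer.
LEMMAS = {
    "чудова": "чудовий",
    "добра": "добрий",
    "люблю": "любити",
    "погана": "поганий",
    "жахливе": "жахливий",
}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
TAG_RE = re.compile(r"[#@]\w+")
# Runs of letters, optionally joined by an apostrophe (as in "м'який").
TOKEN_RE = re.compile(r"[^\W\d_]+(?:['ʼ’][^\W\d_]+)*")


def clean(text: str) -> str:
    """Drop links, hashtags and mentions, then lowercase the text."""
    text = URL_RE.sub(" ", text)
    text = TAG_RE.sub(" ", text)
    return text.lower()


def tokenize(text: str) -> list[str]:
    """Keep only word tokens; emojis, digits and punctuation are discarded."""
    return TOKEN_RE.findall(text)


def lemmatize(tokens: list[str]) -> list[str]:
    """Map each token to its base form, leaving unknown tokens unchanged."""
    return [LEMMAS.get(token, token) for token in tokens]


def score(text: str) -> int:
    """Sum the lexicon scores of all lemmas; words not in the lexicon count as 0."""
    return sum(LEXICON.get(lemma, 0) for lemma in lemmatize(tokenize(clean(text))))


def classify(text: str) -> str:
    """Label the text by the sign of its total score."""
    total = score(text)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for sample in [
        "Це чудова кава! 😍 https://example.com #кава",
        "Жахливе обслуговування",
        "Сьогодні вівторок",
    ]:
        print(sample, "->", classify(sample))
```

Because the classification depends only on the two dictionaries, adapting the sketch to a new domain means adding lexicon entries or adjusting the scoring rule. The same simplicity explains the limitations noted in the abstract: a word-by-word sum has no way to account for sarcasm or surrounding context.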