METHOD OF FORMALIZED PROCEDURE FOR SYNTHESIS AND COMPUTATION OF FEATURES FOR FAKE NEWS DETECTION
DOI:
https://doi.org/10.31891/2307-5732-2025-355-102
Keywords:
fake news detection, large language models, feature computation procedure, natural language processing, text classification
Abstract
The pervasive and evolving nature of digital disinformation calls for detection systems that are accurate, transparent, and adaptable to novel deceptive strategies. Large Language Models (LLMs) are effective at discerning nuanced textual patterns, but applying them to fake news detection often yields “black-box” systems, which limits trust and hinders the response to emergent manipulative techniques. This paper introduces a method designed to bridge this gap. We present a structured procedure for systematically synthesizing suspicious textual attributes, guided by LLM-driven insights, and transforming them into a set of quantifiable, interpretable numerical features. These features, which cover aspects such as paraphrase intensity, sentiment polarity, stylistic anomalies, and fact-checking congruity, are then combined with the deep contextual embeddings generated by LLMs. The method was validated experimentally on English (FakeNewsNet) and Ukrainian (Ukrainian news) datasets. It outperformed established baseline approaches, with accuracy reaching up to 89.6% for English and 88.3% for Ukrainian texts. Explicitly incorporating these engineered numeric indicators significantly enhances recall for deceptive articles, a critical factor in mitigating the societal impact of misinformation. Furthermore, the method’s modularity fosters adaptability: newly identified deceptive patterns can be incorporated as additional numeric features without completely retraining the foundational LLM. The study underscores the value of systematically engineered, interpretable numeric features as a complement to the powerful, yet often opaque, embeddings of LLMs.
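The integration step described in the abstract can be illustrated with a minimal sketch. It assumes the simplest possible fusion, plain concatenation of an LLM embedding with the four named numeric features; the abstract does not state how the paper actually combines them. The class name `SuspicionFeatures`, the function `fuse`, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, astuple


@dataclass
class SuspicionFeatures:
    """Hypothetical container for the interpretable features named in the abstract."""
    paraphrase_intensity: float
    sentiment_polarity: float
    stylistic_anomaly: float
    fact_check_congruity: float


def fuse(embedding, features):
    """Concatenate an LLM embedding with the numeric features (assumed fusion scheme)."""
    return list(embedding) + list(astuple(features))


# Toy values: a 3-dimensional stand-in for a real LLM embedding.
embedding = [0.12, -0.40, 0.33]
features = SuspicionFeatures(
    paraphrase_intensity=0.7,
    sentiment_polarity=-0.2,
    stylistic_anomaly=0.5,
    fact_check_congruity=0.1,
)

# The fused vector would be the input to a downstream classifier.
fused = fuse(embedding, features)
```

Under this assumption, supporting a newly identified deceptive pattern means adding one field to the feature container and retraining only the downstream classifier, which is consistent with the modularity claim above.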
License
Copyright (c) 2025 АНДРІЙ ШУПТА (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.