CENTRALIZED INFRASTRUCTURE MONITORING USING THE THANOS SYSTEM: PROSPECTS AND CHALLENGES

Authors

DOI:

https://doi.org/10.31891/2307-5732-2025-347-57

Keywords:

Thanos, Prometheus, centralized monitoring, IT infrastructure, scalability, long-term data storage, distributed systems

Abstract

The article presents a comprehensive study of the Thanos system for centralized monitoring of IT infrastructure amid the growing complexity of modern information systems. The research introduces a systematic approach to building scalable monitoring solutions, with a focus on high availability and efficient resource utilization. A methodology for implementing and optimizing the monitoring system has been developed, covering all stages from data collection and processing to analysis and visualization. The study proposes and validates a mathematical model that evaluates system efficiency from three key parameters: data compression coefficient (C), query processing performance (P), and system reliability (R). The model yields an overall efficiency score of 0.9227.

The experimental validation was conducted across three data centers (Kyiv, Lviv, Kharkiv) with 1,248 monitoring servers processing 127,492 metrics per second. The implementation scales well, with scalability coefficients S(3) = 0.95 and S(5) = 0.94, while keeping response times below 47 ms under loads exceeding 900,000 queries per second. The system achieved significant improvements in key operational metrics: incident response time fell from 15.7 to 9.4 minutes, the volume of stored metrics grew from 2.3 to 23.7 petabytes, and storage costs decreased from $0.08 to $0.056 per gigabyte.

The research analyzes practical applications across several sectors, including banking, telecommunications, online retail, cloud computing, and healthcare, and details the specific monitoring requirements and implementation strategies for each domain. The study also introduces adaptive load balancing (LoadFactor = 0.6375) and intelligent caching (CacheEfficiency = 0.046). The paper concludes by outlining future development directions, focusing on machine learning integration for predictive analytics and enhanced automated incident response mechanisms.
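
To put the before-and-after figures quoted in the abstract on a common scale, the short sketch below converts each pair into a relative change. It is only an illustration of the arithmetic: the helper name `relative_change` and the choice of relative change as the comparison are assumptions made here, and this is not the efficiency model based on C, P, and R that the paper describes.

```python
def relative_change(before: float, after: float) -> float:
    """Return the signed relative change from `before` to `after` as a fraction."""
    return (after - before) / before


# (before, after) pairs taken from the figures quoted in the abstract.
reported = {
    "incident response time (min)": (15.7, 9.4),
    "stored metrics volume (PB)": (2.3, 23.7),
    "storage cost (USD per GB)": (0.08, 0.056),
}

for label, (before, after) in reported.items():
    # Negative values are reductions, positive values are increases.
    print(f"{label}: {relative_change(before, after):+.1%}")
```

Read this way, the reported figures correspond to a reduction of about 40% in incident response time, a reduction of 30% in storage cost per gigabyte, and a roughly tenfold increase in stored volume.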

Published

2025-01-30

How to Cite

STREMBITSKYI, P., YUKHYMCHUK, M., LESKO, V., & PEREPELYTSIA, S. (2025). CENTRALIZED INFRASTRUCTURE MONITORING USING THE THANOS SYSTEM: PROSPECTS AND CHALLENGES. Herald of Khmelnytskyi National University. Technical Sciences, 347(1), 417-422. https://doi.org/10.31891/2307-5732-2025-347-57