MULTI-AGENT DEEP REINFORCEMENT LEARNING FRAMEWORK DESIGN FOR EFFICIENT SINGLE-INTERSECTION TRAFFIC LIGHT CONTROL

Authors

DOI:

https://doi.org/10.31891/2307-5732-2026-363-59

Keywords:

cooperative reinforcement learning, partial observability, uncertainty, decentralized training and execution, traffic light control

Abstract

This paper reformulates single-intersection traffic light control as a cooperative Decentralized Partially Observable Markov Decision Process (Dec-POMDP), treating it as a minimal testbed for studying decentralized coordination under uncertainty rather than as a standalone optimization task. Multiple agents control disjoint signal groups using fine-grained primitive actions, emphasizing modularity, robustness to sensing limitations, and compatibility with legacy stage-based control systems. To enable coordination without explicit communication, we propose an extended observation space that includes both dynamic traffic features and structural intersection information, allowing passive coordination through shared physical signals. Building on this formulation, we introduce a decentralized multi-agent deep reinforcement learning framework that integrates recurrent value estimation to mitigate partial observability, distributional reinforcement learning to preserve multi-modal return structures arising from competing coordination equilibria, and hysteretic updates to stabilize decentralized learning dynamics. Primitive-action traffic signal control induces chain-like decision processes with stochastic outcomes, where naive exploration and mean-based value estimates often lead to premature convergence to suboptimal coordination strategies. The proposed uncertainty-aware framework explicitly addresses this challenge. Preliminary simulation experiments are used to analyze learning dynamics, equilibrium sensitivity, and coordination behavior. Rather than emphasizing performance superiority, the results illustrate the behavioral implications of the proposed reformulation and learning design. This work provides a principled framework for decentralized, uncertainty-aware traffic signal control and establishes a foundation for future extensions to scalable multi-intersection coordination.
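As a rough illustration of the hysteretic updates mentioned in the abstract, the sketch below shows the standard hysteretic Q-learning rule in scalar form: positive temporal-difference errors are applied with a larger learning rate than negative ones, so an agent is slower to lower its estimate when a teammate's exploration causes a poor outcome. This is a minimal sketch under stated assumptions and not the authors' implementation. The function name `hysteretic_update`, the parameter names, and all default values are hypothetical, and the paper's framework combines this idea with recurrent and distributional value estimation, which the sketch omits.

```python
def hysteretic_update(q, reward, next_max_q, alpha=0.1, beta=0.01, gamma=0.99):
    """Return the updated Q-value for one transition (illustrative only).

    q           -- current estimate Q(s, a)
    reward      -- reward observed for the transition
    next_max_q  -- max over a' of Q(s', a')
    alpha, beta -- learning rates for positive and negative TD errors
                   (assumed values, with beta < alpha)
    gamma       -- discount factor (assumed value)
    """
    td_error = reward + gamma * next_max_q - q
    # Optimistic asymmetry: increases are learned quickly, decreases slowly.
    rate = alpha if td_error >= 0 else beta
    return q + rate * td_error


if __name__ == "__main__":
    # A better-than-expected outcome moves the estimate by alpha * error.
    print(hysteretic_update(q=0.0, reward=1.0, next_max_q=0.0))
    # A worse-than-expected outcome moves it only by beta * error.
    print(hysteretic_update(q=1.0, reward=0.0, next_max_q=0.0))
```

Setting `beta` equal to `alpha` recovers the ordinary Q-learning update, while `beta = 0` ignores all negative errors; intermediate values trade optimism against robustness to stochastic rewards.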

Published

2026-03-26

How to Cite

LYTVYNENKO, M., & REBEZYUK, L. (2026). MULTI-AGENT DEEP REINFORCEMENT LEARNING FRAMEWORK DESIGN FOR EFFICIENT SINGLE-INTERSECTION TRAFFIC LIGHT CONTROL. Herald of Khmelnytskyi National University. Technical Sciences, 363(2), 446-453. https://doi.org/10.31891/2307-5732-2026-363-59