TY - GEN
T1 - Reinforcement Learning with Explainability for Traffic Signal Control
AU - Rizzo, Stefano Giovanni
AU - Vantini, Giovanna
AU - Chawla, Sanjay
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/10
Y1 - 2019/10
AB - Deep reinforcement learning has recently provided promising results on the traffic light control optimization problem, by training neural network agents to select the traffic light phase. These agents learn complex models by optimizing a simple objective, such as the average traffic speed, but are considered opaque when it comes to explaining their decisions. Nevertheless, explanations are required to transfer this technology to the real world, especially in complex scenarios with nontrivial phases, such as signalized roundabouts with entry and circulatory traffic lights. In this paper, after training a Policy Gradient agent on a signalized roundabout with 11 phases and real traffic data, we analyze the relation between the agent's phase preferences and the actual traffic, and we assess the agent's ability to react to the current detector state. Then, we estimate the effect of the road detector state on the agent's selected phases, through the SHAP model-agnostic technique, using Shapley values recovered from a linear explanation model. The results show that it is possible to extract meaningful explanations of the decisions taken by a complex policy, in relation to both the traffic volumes and the lane occupancy.
UR - http://www.scopus.com/inward/record.url?scp=85076811271&partnerID=8YFLogxK
U2 - 10.1109/ITSC.2019.8917519
DO - 10.1109/ITSC.2019.8917519
M3 - Conference contribution
AN - SCOPUS:85076811271
T3 - 2019 IEEE Intelligent Transportation Systems Conference, ITSC 2019
SP - 3567
EP - 3572
BT - 2019 IEEE Intelligent Transportation Systems Conference, ITSC 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 IEEE Intelligent Transportation Systems Conference, ITSC 2019
Y2 - 27 October 2019 through 30 October 2019
ER -