Exploring the effects of different reward functions on the TSCA learning process

While reinforcement learning has been studied extensively as a way to train a traffic signal control agent (TSCA), there has been little comparison between the reward functions used by different approaches. In this paper, we explore six reward functions, each evaluated in two forms: once using the value at the current time step, and once as the difference from the value at the previous time step. All are compared under the same model, action space, and state space.
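
To make the two forms concrete, the following is a minimal sketch of how a single traffic metric could be turned into either an absolute or a difference-based reward. The metric (queue length), the function names, and the sign convention (lower metric is better, so the reward is its negative) are assumptions made for illustration only; they are not taken from the paper's implementation.

```python
from typing import List, Optional


def absolute_reward(metric_now: float) -> float:
    """Reward from the current time step only.

    Assumes the metric is a cost (e.g. queue length), so it is negated.
    """
    return -metric_now


def difference_reward(metric_now: float, metric_prev: Optional[float]) -> float:
    """Reward as the change in the metric since the previous time step.

    Positive when the cost metric decreased. Returning 0.0 on the first
    step, where no previous value exists, is an assumed convention.
    """
    if metric_prev is None:
        return 0.0
    return metric_prev - metric_now


# Hypothetical queue lengths observed at three consecutive time steps.
queue_lengths: List[float] = [4.0, 6.0, 3.0]

absolute = [absolute_reward(q) for q in queue_lengths]

difference: List[float] = []
prev: Optional[float] = None
for q in queue_lengths:
    difference.append(difference_reward(q, prev))
    prev = q

print(absolute)    # [-4.0, -6.0, -3.0]
print(difference)  # [0.0, -2.0, 3.0]
```

Under these assumptions, the absolute form penalizes the agent for any congestion that is present, while the difference form rewards or penalizes only the change between consecutive steps.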