Figure 4From: A projected primal-dual gradient optimal control method for deep reinforcement learningThe summed up reward during the trainingBack to article page