Figure 5
From: A projected primal-dual gradient optimal control method for deep reinforcement learning
(a) Human arm model. (b) Trajectory after 1 time step. (c) Trajectory after 5 time steps. (d) Trajectory after 8 time steps. (e) Trajectory after 15 time steps.