This repository was archived by the owner on Oct 7, 2024. It is now read-only.


MichelangeloConserva

The _total_steps counter should be incremented after the agent takes an action, as in the JAX implementation, or during the update function call, as for DQN. If the increment happens only after the gradient is computed, as in the current implementation, the agent is not trained for values of sgd_period higher than one.

The _total_steps counter should be incremented after the agent takes an action, as in the JAX implementation of this agent. For values of sgd_period higher than one, this bug prevents the agent from training.
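To illustrate the failure mode, here is a minimal sketch in plain Python. It is not the agent's actual code: `ToyAgent`, `increment_in_update`, `num_sgd_steps`, and `count_sgd_steps` are hypothetical names, and the sketch assumes the gradient step is gated by a check of the form `total_steps % sgd_period == 0`. Under that assumption, a counter that only advances inside the gradient step gets stuck at 1 after the first update whenever sgd_period is greater than one.

```python
class ToyAgent:
    """Hypothetical stand-in for the agent; only the step-counter logic is modeled."""

    def __init__(self, sgd_period, increment_in_update):
        self.sgd_period = sgd_period
        # True: counter advances on every update call (proposed behavior).
        # False: counter advances only inside the gradient step (reported bug).
        self.increment_in_update = increment_in_update
        self.total_steps = 0
        self.num_sgd_steps = 0

    def _sgd_step(self):
        # Placeholder for the real gradient computation.
        self.num_sgd_steps += 1
        if not self.increment_in_update:
            self.total_steps += 1

    def update(self):
        if self.increment_in_update:
            self.total_steps += 1
        # Assumed gating condition: train once every sgd_period steps.
        if self.total_steps % self.sgd_period == 0:
            self._sgd_step()


def count_sgd_steps(sgd_period, increment_in_update, num_env_steps=100):
    """Run the toy agent for num_env_steps updates and count gradient steps."""
    agent = ToyAgent(sgd_period, increment_in_update)
    for _ in range(num_env_steps):
        agent.update()
    return agent.num_sgd_steps


if __name__ == "__main__":
    print("counter in gradient step:", count_sgd_steps(4, increment_in_update=False))
    print("counter in update call:  ", count_sgd_steps(4, increment_in_update=True))
```

In this toy model, with sgd_period set to 4 and 100 update calls, the first variant performs a single gradient step and then stalls, while the second performs one gradient step every fourth call. With sgd_period equal to one the two variants behave the same, which is consistent with the bug only showing up for larger values.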