units/en/unit3/additional-readings.mdx (3 changes: 2 additions & 1 deletion)
@@ -4,6 +4,7 @@ These are **optional readings** if you want to go deeper.
 
 - [Foundations of Deep RL Series, L2 Deep Q-Learning by Pieter Abbeel](https://youtu.be/Psrhxy88zww)
 - [Playing Atari with Deep Reinforcement Learning](https://arxiv.org/abs/1312.5602)
-- [Double Deep Q-Learning](https://papers.nips.cc/paper/2010/hash/091d584fced301b442654dd8c23b3fc9-Abstract.html)
+- [Double Q-Learning](https://papers.nips.cc/paper/3964-double-q-learning)
+- [Double Deep Q-Learning](https://arxiv.org/abs/1509.06461)
 - [Prioritized Experience Replay](https://arxiv.org/abs/1511.05952)
 - [Dueling Deep Q-Learning](https://arxiv.org/abs/1511.06581)
units/en/unit3/deep-q-algorithm.mdx (2 changes: 1 addition & 1 deletion)
@@ -84,7 +84,7 @@ Instead, what we see in the pseudo-code is that we:
 
 ## Double DQN [[double-dqn]]
 
-Double DQNs, or Double Deep Q-Learning neural networks, were introduced [by Hado van Hasselt](https://papers.nips.cc/paper/3964-double-q-learning). This method **handles the problem of the overestimation of Q-values.**
+Double DQNs, or [Double Deep Q-Learning neural networks](https://arxiv.org/abs/1509.06461), extend the [Double Q-Learning algorithm](https://papers.nips.cc/paper/3964-double-q-learning), introduced by Hado van Hasselt. This method **handles the problem of the overestimation of Q-values.**
 
 To understand this problem, remember how we calculate the TD Target:
 
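As context for the paragraph changed above: the core of Double DQN is that the online network *selects* the next action while the target network *evaluates* it, rather than the target network doing both as in vanilla DQN. Below is a minimal PyTorch sketch of that target computation, assuming `online_net` and `target_net` map a batch of states to per-action Q-values; all names here are illustrative, not taken from the course code.

```python
import torch

def double_dqn_target(online_net, target_net, rewards, next_states, dones, gamma=0.99):
    """Sketch of the Double DQN TD target:
    y = r + gamma * Q_target(s', argmax_a Q_online(s', a)), with 0 bootstrap on terminal s'.

    Assumes: online_net(s) and target_net(s) return tensors of shape [batch, n_actions];
    rewards and dones are float tensors of shape [batch].
    """
    with torch.no_grad():
        # 1) SELECT the greedy next action with the online network.
        next_actions = online_net(next_states).argmax(dim=1, keepdim=True)  # [batch, 1]
        # 2) EVALUATE that action with the target network.
        next_q = target_net(next_states).gather(1, next_actions).squeeze(1)  # [batch]
        # Terminal transitions bootstrap from zero.
        return rewards + gamma * (1.0 - dones) * next_q
```

Decoupling selection from evaluation matters because taking a max over noisy Q-estimates is biased upward; letting a second network score the chosen action dampens that overestimation.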