RNN word language model example #4

Merged (7 commits) on Oct 24, 2016
Conversation

adamlerer
Contributor

@apaszke for your perusal.

If you uncomment the profiling code, you can then run:
CUDA_LAUNCH_BLOCKING=1 python main.py -model LSTM -cuda -nlayers 2

self.decoder.bias.data.fill_(0)
self.decoder.weight.data.uniform_(-initrange, initrange)

def __call__(self, hidden, input):

This comment was marked as off-topic.

return loss / data.size(0)

# simple gradient clipping, using the total norm of the gradient
def clipGradient(model, clip):
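
The body of `clipGradient` is not shown in this excerpt. As a rough, framework-free sketch of what "gradient clipping using the total norm" usually means: compute one L2 norm over all gradients together, then scale every gradient by the same factor if that norm exceeds the threshold. The name `clip_by_total_norm` and the list-of-lists gradient format are assumptions made for illustration, not the PR's code.

```python
import math

def clip_by_total_norm(grads, clip):
    """Rescale all gradients so their combined L2 norm is at most `clip`.

    `grads` is a list of flat lists of floats (hypothetical stand-in for
    the per-parameter gradient tensors of a model).
    """
    # One norm over every gradient entry, not one norm per parameter.
    total_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # Shrink only when the norm exceeds the threshold; never scale up.
    scale = min(1.0, clip / (total_norm + 1e-6))
    clipped = [[g * scale for g in grad] for grad in grads]
    return clipped, total_norm

clipped, norm = clip_by_total_norm([[3.0, 0.0], [0.0, 4.0]], clip=1.0)
```

Because a single scale factor is applied to all parameters, the direction of the overall update is preserved and only its magnitude is bounded.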

This comment was marked as off-topic.

@adamlerer changed the title from "RNN word language model example" to "[wip] RNN word language model example" on Oct 5, 2016
@adamlerer
Contributor Author

FYI, the perplexity for this model isn't as good as for an "equivalent" rnnlib3 model. I'm looking into why now; it probably has to do with different initialization.

@adamlerer
Contributor Author

My latest commit addresses @colesbury's and @apaszke's comments, and fixes some other bugs. I also fixed a few discrepancies with the torch model (removed biases, fixed lr), so it now reaches the same perplexity as the torch model (~114).

Requires pytorch/pytorch#106
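
For readers less familiar with the metric quoted in this thread, perplexity is conventionally the exponential of the average per-token negative log-likelihood. The sketch below assumes losses measured in nats (natural log); the function name `perplexity` and its input format are hypothetical and not taken from the PR.

```python
import math

def perplexity(token_nlls):
    """Perplexity from a list of per-token negative log-likelihoods (nats)."""
    mean_nll = sum(token_nlls) / len(token_nlls)
    return math.exp(mean_nll)

# A model that is uniformly unsure among 114 words at every step has
# a per-token loss of log(114), i.e. a perplexity of about 114.
ppl = perplexity([math.log(114)] * 10)
```

Read this way, a perplexity of ~114 means the model is, on average, about as uncertain as a uniform choice among 114 words.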

@adamlerer
Contributor Author

I updated this to use the torch RNN library (the monolithic one, that uses cudnn). It's within 5% of the speed of the lua-torch version under the standard parameters. Ready to merge.

P.S. Should I also include the version that builds an RNN from scratch? It might be instructive for people who want to do something different than what's supported by the monolith.

@adamlerer changed the title from "[wip] RNN word language model example" to "RNN word language model example" on Oct 17, 2016
@adamlerer
Contributor Author

Depends on pytorch/pytorch#129
