
Wider support for gelu. Remove unused activation. Use same torch layer for silu and swish #3302


Closed

Conversation

@hypnopump

Adds support for GELU in modules where only silu/swish and mish were supported, and uses the same torch layer for both silu and swish instead of a lambda definition for swish alongside the nn.SiLU layer for silu.
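
As context, a minimal sketch of the equivalence this change relies on (the variable names are illustrative, not the actual diffusers code):

```python
import torch
import torch.nn as nn

x = torch.randn(4, 8)

# Before: swish as an ad-hoc lambda, silu as the torch layer
swish = lambda t: t * torch.sigmoid(t)
silu = nn.SiLU()

# Both compute x * sigmoid(x), so a single nn.SiLU layer can serve both names
assert torch.allclose(swish(x), silu(x))
```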

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@hypnopump changed the title from "Wider support for gelu. use same torch layer for silu and swish" to "Wider support for gelu. Remove unused activation. Use same torch layer for silu and swish" on May 1, 2023
@patrickvonplaten
Contributor

I think we should create a simple activations.py file as is done here: https://github.com/huggingface/transformers/blob/main/src/transformers/activations.py and import from there
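
For reference, a minimal sketch of what such a shared activations.py could look like; this is a hypothetical simplification modeled on the transformers file linked above, not the eventual diffusers implementation:

```python
import torch.nn as nn

# Map activation names to torch layer classes; "silu" and "swish"
# intentionally resolve to the same nn.SiLU layer, as this PR proposes.
ACT2CLS = {
    "gelu": nn.GELU,
    "mish": nn.Mish,
    "silu": nn.SiLU,
    "swish": nn.SiLU,
}

def get_activation(act_fn: str) -> nn.Module:
    """Return a fresh activation layer, failing fast on unknown names."""
    if act_fn not in ACT2CLS:
        raise ValueError(f"Unsupported activation: {act_fn!r}. Known: {sorted(ACT2CLS)}")
    return ACT2CLS[act_fn]()
```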

@hypnopump
Author

Probably, @patrickvonplaten, but that would be more of a design choice, whereas this is a path to fix a current bug.

@hypnopump
Author

Specifically, a UNet1D model only works (forward + backward pass) if an activation is specified, yet no failure is raised at module creation, only at runtime. There is also an inconsistency between the timestep embedding module and the rest: the timestep embedding does not accept gelu, while other modules do.

As you say, it would probably be better in the future to have all activations grouped in a single place. That would help with maintainability and avoid code duplication.
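
To illustrate the failure mode described above, here is a toy module (hypothetical, not the actual UNet1D code) where an unsupported activation only surfaces at runtime:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Block(nn.Module):
    def __init__(self, act_fn=None):
        super().__init__()
        # No validation here: an unsupported name slips through silently
        self.act_fn = act_fn

    def forward(self, x):
        if self.act_fn == "silu":
            return F.silu(x)
        # The error only appears once data flows through the module
        raise ValueError(f"Unsupported activation: {self.act_fn!r}")

block = Block(act_fn="gelu")  # constructs without complaint...
# block(torch.randn(2, 4))    # ...but would raise ValueError here, at runtime
```

Validating act_fn in __init__ (or routing through a shared get_activation helper) would move that error to construction time.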

@hypnopump requested a review from @patrickvonplaten on May 4, 2023 at 14:58
@patrickvonplaten
Contributor

cc @williamberman, let's move all those activation functions into their own file

@github-actions bot added the stale label (Issues that haven't received updates) on May 31, 2023
@huggingface deleted a comment from the github-actions bot on Jun 2, 2023
@patrickvonplaten
Contributor

@williamberman could you try to help move all the activation functions to their own files?

@williamberman
Contributor

Hey @hypnopump, sorry for the delay! I put up a PR here, since this branch is a bit behind tip and has some failing CI. Happy to merge this PR as well if you want to rebase and cherry-pick the commit from my branch, or whatnot. Up to you! :)

#3656

@hypnopump
Author

Makes sense. I'm closing this then, as the functionality is implemented in your newer PR (left a review btw :)).

@hypnopump closed this on Jun 3, 2023