@DevinTDHa DevinTDHa commented Aug 8, 2023

Description

This PR contains a new annotator for WhisperForCTC.

Note: This currently works only with greedy decoding. I will enable additional decoding strategies once we have refactored parts of our generation framework.
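
For context, greedy decoding appends the highest-scoring token at each step and stops at an end-of-sequence token or a length limit. Below is a minimal Python sketch of that idea, not the annotator's actual code: `greedy_decode`, `step_fn`, `toy_step`, and the toy token ids are hypothetical names chosen for illustration.

```python
from typing import Callable, List, Sequence


def greedy_decode(
    step_fn: Callable[[List[int]], Sequence[float]],
    start_ids: List[int],
    eos_id: int,
    max_new_tokens: int = 32,
) -> List[int]:
    """Append the argmax token at each step until eos_id or the length limit."""
    ids = list(start_ids)
    for _ in range(max_new_tokens):
        # Hypothetical decoder call: returns one score per vocabulary entry.
        logits = step_fn(ids)
        next_id = max(range(len(logits)), key=lambda i: logits[i])
        ids.append(next_id)
        if next_id == eos_id:
            break
    return ids


def toy_step(ids: List[int]) -> List[float]:
    """Toy stand-in for a decoder: always favors (last id + 1), capped at 3."""
    target = min(ids[-1] + 1, 3)
    return [1.0 if i == target else 0.0 for i in range(4)]


decoded = greedy_decode(toy_step, start_ids=[0], eos_id=3)
```

Other strategies such as beam search or sampling would replace the single argmax choice with a search over several candidate tokens, which is presumably why they depend on the generation framework refactoring mentioned above.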

Highlights:

Tasks:

  • Feature extraction
  • Loading saved models (official ones)
  • ONNX inference and serialization
  • Generation and decoding
  • TensorFlow serialization (there are still some issues I need to work out)
  • Upload and Test pretrained models
  • Python Side
  • Documentation
  • Example notebooks for export
    • TF
    • ONNX

How Has This Been Tested?

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • Code improvements with no or little impact
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING page.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@DevinTDHa DevinTDHa added the new-feature (Introducing a new feature) and DON'T MERGE (Do not merge this PR) labels Aug 8, 2023
@DevinTDHa DevinTDHa requested a review from maziyarpanahi August 8, 2023 16:49
@DevinTDHa DevinTDHa self-assigned this Aug 8, 2023
@DevinTDHa DevinTDHa force-pushed the feauture/SPARKNLP-624-WhisperForCTC branch from 3206b6c to ff85269 Compare August 9, 2023 09:12
@DevinTDHa DevinTDHa force-pushed the feauture/SPARKNLP-624-WhisperForCTC branch from 0ca6546 to 7c1dbd1 Compare August 14, 2023 14:16
@DevinTDHa DevinTDHa marked this pull request as ready for review August 14, 2023 14:17

DevinTDHa commented Aug 22, 2023

Pretrained models are in #13931 (whisper-tiny for ONNX and TensorFlow)

@maziyarpanahi maziyarpanahi changed the base branch from master to release/510-release-candidate August 24, 2023 07:46
@maziyarpanahi maziyarpanahi merged commit fd468c4 into JohnSnowLabs:release/510-release-candidate Aug 24, 2023