
Conversation

Contributor

@ylacombe commented Aug 8, 2023

Following @ydshieh's #25290, which laid the groundwork for tiny Bark tests, this PR adds BarkModel tests.

Because BarkModel is an irregular model, with no forward method and a non-standard generate, I've created some hand-made tests, which I can enrich further if you like. @ydshieh, @amyeroberts, WDYT of the tests already implemented?
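For illustration, here is a minimal sketch of the kind of hand-made, slow integration test in question (the checkpoint and the assertion are illustrative, not the exact tests in this PR):

```python
import unittest

from transformers import BarkModel, BarkProcessor
from transformers.testing_utils import require_torch, slow


@require_torch
class BarkModelIntegrationSketch(unittest.TestCase):
    @slow
    def test_generate_end_to_end(self):
        # Use the public small checkpoint; a hand-made test has to rely on a real
        # checkpoint (or a hand-built config) since there is no standard tiny model.
        processor = BarkProcessor.from_pretrained("suno/bark-small")
        model = BarkModel.from_pretrained("suno/bark-small")

        inputs = processor("This is a test sentence.")
        audio = model.generate(**inputs)

        # Only a sanity check: some non-empty waveform was generated.
        self.assertGreater(audio.shape[-1], 0)
```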

Many thanks!

@HuggingFaceDocBuilderDev commented Aug 8, 2023

The documentation is not available anymore as the PR was closed or merged.

```python
@unittest.skip("So far all tests will fail.")
class BarkModelTest(ModelTesterMixin, GenerationTesterMixin, unittest.TestCase):
    all_model_classes = (BarkModel,) if is_torch_available() else ()

class BarkModelTest(unittest.TestCase):
```
Collaborator

Without this being a subclass of ModelTesterMixin, there is no way to create a tiny model for Bark.

As you mentioned, there is not even a forward for BarkModel, so subclassing and overriding everything seems like too much work for nothing.

Let's see what @amyeroberts and @sgugger would say, but I could accept this and try to modify the tiny model script to make it work for Bark.

(Note that run_pipeline_test for pipeline testing would not run if PipelineTesterMixin is not included here, but that is not the point of this comment.)
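For reference, the usual coupling looks roughly like the simplified sketch below (GPT-2 is only used here as a familiar example; the real test files list more classes and tasks, and the relative imports are the ones used inside the repo's tests/models/*/test_modeling_*.py files):

```python
import unittest

from transformers import is_torch_available
from transformers.testing_utils import require_torch

from ...test_modeling_common import ModelTesterMixin
from ...test_pipeline_mixin import PipelineTesterMixin

if is_torch_available():
    from transformers import GPT2LMHeadModel, GPT2Model


@require_torch
class SketchModelTest(ModelTesterMixin, PipelineTesterMixin, unittest.TestCase):
    # ModelTesterMixin provides the common model tests; PipelineTesterMixin
    # contributes run_pipeline_test, which needs a tiny checkpoint for the
    # classes listed in pipeline_model_mapping.
    all_model_classes = (GPT2Model, GPT2LMHeadModel) if is_torch_available() else ()
    pipeline_model_mapping = (
        {"text-generation": GPT2LMHeadModel} if is_torch_available() else {}
    )
```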

Collaborator

It's a slippery slope though. It's a sign that the model does not fit the API of Transformers, so what is even the goal of having it in the library in that case? Transformers' strength is to provide a common API for all models, after all.

Contributor Author

I agree with your points. Much has been done to try to make Bark fit the API of Transformers, but it might be a lost cause to push the integration further.

Adding a tiny model for Bark might not be worth it at the moment, imo.

However, note that arguments for having Bark in the library are numerous:

  • It's a SOTA model with great traction
  • Its sub-models are all in line with the Transformers API
  • It paves the way for TTS integration into Transformers
  • While not totally in line with the Transformers API, it still shares many features with it (a generate method, tokenization, attention mechanisms, ...)

With regard to the TTS pipeline, Bark can still be tested on a stand-alone basis without run_pipeline_test.
However, it would be good to have at least one TTS/TTA model tested by run_pipeline_test. @ydshieh, what do you think of adding a tiny MusicGen? This time it should be much easier, since MusicGen is a proper, classic Transformers model.

Collaborator

Well, it's released and we can't just remove it. But let's be careful next time.

@ylacombe Could you remind us why BarkModel can't have a forward? I can't see it mentioned in the [PR description](#24086 (comment)); it is probably in some review comment though.

For future PRs, such (unusual) information is better stated clearly, so the reviewers can give their opinion.

Contributor

Just so I understand better: what's the motivation behind creating a tiny version of the model? Is it used independently of the tests in test_modeling_bark.py, where a config is defined explicitly and a model created based on these values?

Bark is irregular in the sense that it's a concatenation of three auto-regressive models, meaning there's no notion of a single forward pass (see comment for details). There was a long discussion about this on the PR, where we agreed that having a .generate method was the best fit for the Transformers API, and that a .forward method doesn't really work here.
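To make that concrete: conceptually, generation chains the three autoregressive stages plus a codec decoder, roughly like the sketch below (all names are placeholders, not the real transformers API):

```python
# Conceptual sketch only: why BarkModel exposes generate but no single forward.
def bark_like_generation(text_tokens, semantic_lm, coarse_lm, fine_lm, codec):
    semantic_tokens = semantic_lm.generate(text_tokens)  # stage 1: text tokens -> semantic tokens
    coarse_codes = coarse_lm.generate(semantic_tokens)   # stage 2: semantic tokens -> coarse audio codebooks
    fine_codes = fine_lm.generate(coarse_codes)          # stage 3: coarse -> fine codebooks
    return codec.decode(fine_codes)                      # neural codec turns codebooks into a waveform
```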

I think @ylacombe has very nicely described the motivations for including the model in the Transformers library, but it's a very valid point from @sgugger that overall it's not the best fit. Maybe in the long term we can start a new TTS library that is more flexible for these kinds of audio architectures (given we're at the limit of the Transformers design), or do a Transformers refactor to better accommodate these kinds of new TTS models.

Collaborator

> what's the motivation behind creating a tiny version of the model

The motivation is to be able to run pipeline testing, for example the one added by @ylacombe in

```python
def run_pipeline_test(self, speech_generator, _):
```

which is linked to PipelineTesterMixin.

About 10 months ago, we decided to move the logic of tiny-model creation outside the tests. But even if it were inside the testing code, the creation would fail for Bark.

> a config is defined explicitly and a model created based on these values?

This is true, but those are not used for the pipeline testing, only for the usual model tests.

Collaborator

Do we really need to run any test outside of a slow integration test?

Collaborator

From my side, it's fine not to create a tiny Bark. That has already been the case for a few models.

Collaborator

@ydshieh Aug 9, 2023

@sgugger

> Do we really need to run any test outside of a slow integration test?

If you mean the slow integration tests implemented in each model test file: yes, we need to test the model when it is used in a pipeline. The usual slow tests in the model test files don't use the model as a pipeline.

For each pipeline type, we have two kinds of tests:

  • the ones that run directly in tests/pipelines/test_pipelines_xxx.py (like test_small_model_pt, etc.)
    • not related to the tiny models
  • run_pipeline_test, which is coupled with the model test file via PipelineTesterMixin
    • this one involves the tiny models

We have run_pipeline_test for each pipeline task (I didn't check whether there are exceptions) for the relevant model classes. It's the way for us to know whether a model fits a pipeline task, but @Narsil might have a better comment here.

However, there are already some pipeline tests using Bark here, so we could say it's fine not to have a tiny Bark running against run_pipeline_test.
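For illustration, the first kind of test could look roughly like the sketch below (the checkpoint and the assertions are illustrative, and the real test may check more):

```python
import unittest

from transformers import pipeline
from transformers.testing_utils import require_torch, slow


@require_torch
class TextToAudioPipelineSketch(unittest.TestCase):
    @slow
    def test_bark_checkpoint(self):
        # Runs directly against a real checkpoint, so no tiny model is needed.
        speech_generator = pipeline("text-to-audio", model="suno/bark-small")
        output = speech_generator("Hello, this is a test.")
        # The text-to-audio pipeline returns the waveform and its sampling rate.
        self.assertIn("audio", output)
        self.assertIn("sampling_rate", output)
```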

Contributor

github-actions bot commented Sep 8, 2023

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Collaborator

ydshieh commented Sep 8, 2023

We could close this (at least for now) if @ylacombe is fine with that. The original purpose (from my side) is no longer valid.

Contributor Author

@ylacombe

Totally agree!
