Example of Infinite Streaming for Google Speech-to-Text API v2 #11596

Closed
rodrigoGA opened this issue Apr 27, 2024 · 4 comments · Fixed by #11847
Labels
priority: p3 Desirable enhancement or fix. May not be included in next release. samples Issues that are directly related to samples. triage me I really want to be triaged. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

@rodrigoGA

I have not found any examples of Infinite Streaming with the Google Speech-to-Text API v2, and it's unclear whether there is a mechanism in version 2 of the API that facilitates this functionality. Could you provide an example of implementation or confirm if this capability is supported?

Thanks!

@rodrigoGA rodrigoGA added priority: p3 Desirable enhancement or fix. May not be included in next release. triage me I really want to be triaged. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. labels Apr 27, 2024
@product-auto-label product-auto-label bot added the samples Issues that are directly related to samples. label Apr 27, 2024
@holtskinner
Contributor

I found this code sample for endless streaming from microphone input using V1.

Linked in this documentation: https://cloud.google.com/speech-to-text/docs/endless-streaming-tutorial

https://github.com/GoogleCloudPlatform/python-docs-samples/blob/main/speech/microphone/transcribe_streaming_infinite.py

This code sample should work mostly the same for V2; you would just need to swap the imports from v1 to v2. I believe the client methods are the same.

@rodrigoGA
Author

rodrigoGA commented Apr 29, 2024

No, the code is not interchangeable. The way to start the client and the API response have changed, and I suspect the way it operates has changed too.
I think a working example would be useful, since this is one of the most common use cases for the API.
Here is code based on the v1 example, but it doesn't work as it should: https://gist.github.com/rodrigoGA/644ab63d244a9f5674e935540c6db6da

Here is a basic example I am working on. It is not based on the v1 infinite streaming sample and it does work, but I haven't finished it yet. There are things I still don't understand about the library; I find it quite unfriendly to use. Version 1 was already not user-friendly, and version 2 is even less so.
I plan to create a wrapper around MicrophoneStream so that the resumable streaming code can be reused outside of this example. I would greatly appreciate any guidance: https://gist.github.com/rodrigoGA/020a144cbd239ed0d9eba6e44c8682c0

@gangchen03
Member

Submitted PR #11847; it should address the example request.

holtskinner pushed a commit that referenced this issue Jun 10, 2024
@rodrigoGA
Author

Thank you @gangchen03, it was helpful.
But the example doesn't seem to work continuously.
After the microphone has been active for 5 minutes with the telephony model, the following error is raised:

Traceback (most recent call last):
  File "/home/rodrigo/Proyectos/marta/Call-Services/venv/lib/python3.11/site-packages/google/api_core/grpc_helpers.py", line 116, in __next__
    return next(self._wrapped)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/rodrigo/Proyectos/marta/Call-Services/venv/lib/python3.11/site-packages/grpc/_channel.py", line 543, in __next__
    return self._next()
           ^^^^^^^^^^^^
  File "/home/rodrigo/Proyectos/marta/Call-Services/venv/lib/python3.11/site-packages/grpc/_channel.py", line 969, in _next
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.ABORTED
	details = "Max duration of 5 minutes reached for stream."
	debug_error_string = "UNKNOWN:Error received from peer ipv4:142.251.134.42:443 {grpc_message:"Max duration of 5 minutes reached for stream.", grpc_status:10, created_time:"2024-06-10T16:55:08.43835454-03:00"}"
>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/rodrigo/Proyectos/marta/Call-Services/test/performance_test/testReasumableGoogleTranscribeV2.py", line 389, in <module>
    main(project_id)
  File "/home/rodrigo/Proyectos/marta/Call-Services/test/performance_test/testReasumableGoogleTranscribeV2.py", line 368, in main
    listen_print_loop(responses_iterator, stream)
  File "/home/rodrigo/Proyectos/marta/Call-Services/test/performance_test/testReasumableGoogleTranscribeV2.py", line 241, in listen_print_loop
    for response in responses:
  File "/home/rodrigo/Proyectos/marta/Call-Services/venv/lib/python3.11/site-packages/google/api_core/grpc_helpers.py", line 119, in __next__
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.Aborted: 409 Max duration of 5 minutes reached for stream. [type_url: "type.googleapis.com/util.StatusProto"
value: "\010\n\022\007generic\032-Max duration of 5 minutes reached for stream.*8\013\020\206\326\215\'\032/\022-Max duration of 5 minutes reached for stream.\014"
]

and the script stops working.
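
The traceback above shows the service ending the stream once it reaches 5 minutes. One way to stay under that limit, in the spirit of the v1 endless-streaming sample, is to end each request stream early and open a new one. The following is only a minimal sketch of that idea, not the code from PR #11847: `stream_with_restarts`, `open_stream`, and `STREAMING_LIMIT_S` are hypothetical names, and `open_stream` stands in for whatever wraps `client.streaming_recognize` in a real script. The 240-second budget is an assumption chosen to match the "240000: NEW REQUEST" line in the log further down.

```python
import time
from typing import Callable, Iterable, Iterator

# Assumed per-stream budget, kept below the 5-minute server limit.
STREAMING_LIMIT_S = 240.0


def stream_with_restarts(
    audio: Iterable[bytes],
    open_stream: Callable[[Iterator[bytes]], Iterable[str]],
    limit_s: float = STREAMING_LIMIT_S,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[str]:
    """Feed `audio` to successive streams, none longer than `limit_s`.

    `open_stream` is a hypothetical callable: it receives an iterator of
    audio chunks for one stream and returns that stream's results.
    """
    source = iter(audio)
    exhausted = False

    def one_session() -> Iterator[bytes]:
        nonlocal exhausted
        deadline = clock() + limit_s
        for chunk in source:
            yield chunk
            if clock() >= deadline:
                # Budget spent: end this stream but keep `source` open
                # so the next stream continues where this one stopped.
                return
        exhausted = True  # the audio source itself has ended

    while not exhausted:
        yield from open_stream(one_session())
```

This sketch does not replay audio that was not yet finalized when a stream ended, which the v1 sample does. With the real client, catching `google.api_core.exceptions.Aborted` (the exception shown in the traceback) around the response loop could serve as an additional fallback.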

It also doesn't work correctly with smaller models like telephony_short, which I understand are the ones intended for this scenario.
In these cases, after a sentence is transcribed as final, the following error occurs:

0: NEW REQUEST
9240: Hola

240000: NEW REQUEST
Traceback (most recent call last):
  File "/home/rodrigo/Proyectos/marta/Call-Services/venv/lib/python3.11/site-packages/google/api_core/grpc_helpers.py", line 170, in error_remapped_callable
    return _StreamingResponseIterator(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rodrigo/Proyectos/marta/Call-Services/venv/lib/python3.11/site-packages/google/api_core/grpc_helpers.py", line 92, in __init__
    self._stored_first_result = next(self._wrapped)
                                ^^^^^^^^^^^^^^^^^^^
  File "/home/rodrigo/Proyectos/marta/Call-Services/venv/lib/python3.11/site-packages/grpc/_channel.py", line 543, in __next__
    return self._next()
           ^^^^^^^^^^^^
  File "/home/rodrigo/Proyectos/marta/Call-Services/venv/lib/python3.11/site-packages/grpc/_channel.py", line 969, in _next
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.INVALID_ARGUMENT
	details = "Audio chunk can be of a maximum of 25600 bytes. Received audio of 294400 bytes instead."
	debug_error_string = "UNKNOWN:Error received from peer ipv4:142.251.133.10:443 {created_time:"2024-06-10T17:01:02.223753051-03:00", grpc_status:3, grpc_message:"Audio chunk can be of a maximum of 25600 bytes. Received audio of 294400 bytes instead."}"
>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/rodrigo/Proyectos/marta/Call-Services/test/performance_test/testReasumableGoogleTranscribeV2.py", line 389, in <module>
    main(project_id)
  File "/home/rodrigo/Proyectos/marta/Call-Services/test/performance_test/testReasumableGoogleTranscribeV2.py", line 365, in main
    responses_iterator = client.streaming_recognize(
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rodrigo/Proyectos/marta/Call-Services/venv/lib/python3.11/site-packages/google/cloud/speech_v2/services/speech/client.py", line 1641, in streaming_recognize
    response = rpc(
               ^^^^
  File "/home/rodrigo/Proyectos/marta/Call-Services/venv/lib/python3.11/site-packages/google/api_core/gapic_v1/method.py", line 131, in __call__
    return wrapped_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rodrigo/Proyectos/marta/Call-Services/venv/lib/python3.11/site-packages/google/api_core/grpc_helpers.py", line 174, in error_remapped_callable
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.InvalidArgument: 400 Audio chunk can be of a maximum of 25600 bytes. Received audio of 294400 bytes instead.
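
The error above says a single request carried 294400 bytes of audio while the service accepts at most 25600 bytes per chunk. A possible workaround, assuming the audio replayed at the start of a new request is held as raw bytes, is to slice it into pieces no larger than the limit before sending. This is a sketch only; `split_audio` and `MAX_CHUNK_BYTES` are hypothetical names that do not appear in the sample.

```python
from typing import Iterable, Iterator

# Limit taken from the error message above; an assumption if the
# service ever changes it.
MAX_CHUNK_BYTES = 25600


def split_audio(
    chunks: Iterable[bytes], limit: int = MAX_CHUNK_BYTES
) -> Iterator[bytes]:
    """Yield pieces of at most `limit` bytes, preserving byte order."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    for chunk in chunks:
        # Empty chunks produce no pieces; short chunks pass through whole.
        for start in range(0, len(chunk), limit):
            yield chunk[start:start + limit]
```

Applied to the 294400-byte buffer from the error, this yields 12 pieces: 11 of 25600 bytes and one of 12800 bytes. Each piece would then go into its own streaming request.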
