Merged
24 changes: 12 additions & 12 deletions packages/google-cloud-speech/CONTRIBUTING.rst
Original file line number Diff line number Diff line change
@@ -35,21 +35,21 @@ Using a Development Checkout
You'll have to create a development environment using a Git checkout:

- While logged into your GitHub account, navigate to the
``python-speech`` `repo`_ on GitHub.
``google-cloud-python`` `repo`_ on GitHub.

- Fork and clone the ``python-speech`` repository to your GitHub account by
- Fork and clone the ``google-cloud-python`` repository to your GitHub account by
clicking the "Fork" button.

- Clone your fork of ``python-speech`` from your GitHub account to your local
- Clone your fork of ``google-cloud-python`` from your GitHub account to your local
computer, substituting your account username and specifying the destination
as ``hack-on-python-speech``. E.g.::
as ``hack-on-google-cloud-python``. E.g.::

$ cd ${HOME}
$ git clone [email protected]:USERNAME/python-speech.git hack-on-python-speech
$ cd hack-on-python-speech
# Configure remotes such that you can pull changes from the googleapis/python-speech
$ git clone [email protected]:USERNAME/google-cloud-python.git hack-on-google-cloud-python
$ cd hack-on-google-cloud-python
# Configure remotes such that you can pull changes from the googleapis/google-cloud-python
# repository into your local repository.
$ git remote add upstream [email protected]:googleapis/python-speech.git
$ git remote add upstream [email protected]:googleapis/google-cloud-python.git
# fetch and merge changes from upstream into main
$ git fetch upstream
$ git merge upstream/main
@@ -60,7 +60,7 @@ repo, from which you can submit a pull request.
To work on the codebase and run the tests, we recommend using ``nox``,
but you can also use a ``virtualenv`` of your own creation.

.. _repo: https://github.com/googleapis/python-speech
.. _repo: https://github.com/googleapis/google-cloud-python

Using ``nox``
=============
@@ -113,7 +113,7 @@ Coding Style
export GOOGLE_CLOUD_TESTING_BRANCH="main"

By doing this, you are specifying the location of the most up-to-date
version of ``python-speech``. The
version of ``google-cloud-python``. The
remote name ``upstream`` should point to the official ``googleapis``
checkout and the branch should be the default branch on that remote (``main``).

@@ -209,7 +209,7 @@ The `description on PyPI`_ for the project comes directly from the
``README``. Due to the reStructuredText (``rst``) parser used by
PyPI, relative links which will work on GitHub (e.g. ``CONTRIBUTING.rst``
instead of
``https://github.com/googleapis/python-speech/blob/main/CONTRIBUTING.rst``)
``https://github.com/googleapis/google-cloud-python/blob/main/CONTRIBUTING.rst``)
may cause problems creating links or rendering the description.

.. _description on PyPI: https://pypi.org/project/google-cloud-speech
@@ -236,7 +236,7 @@ We support:

Supported versions can be found in our ``noxfile.py`` `config`_.

.. _config: https://github.com/googleapis/python-speech/blob/main/packages/google-cloud-speech/noxfile.py
.. _config: https://github.com/googleapis/google-cloud-python/blob/main/packages/google-cloud-speech/noxfile.py


**********
2 changes: 1 addition & 1 deletion packages/google-cloud-speech/docs/conf.py
Original file line number Diff line number Diff line change
@@ -156,7 +156,7 @@
html_theme_options = {
"description": "Google Cloud Client Libraries for google-cloud-speech",
"github_user": "googleapis",
"github_repo": "python-speech",
"github_repo": "google-cloud-python",
"github_banner": True,
"font_family": "'Roboto', Georgia, sans",
"head_font_family": "'Roboto', Georgia, serif",
Original file line number Diff line number Diff line change
@@ -63,6 +63,7 @@
CustomClass,
PhraseSet,
SpeechAdaptation,
TranscriptNormalization,
)

__all__ = (
@@ -104,4 +105,5 @@
"CustomClass",
"PhraseSet",
"SpeechAdaptation",
"TranscriptNormalization",
)
Original file line number Diff line number Diff line change
@@ -55,7 +55,12 @@
UpdateCustomClassRequest,
UpdatePhraseSetRequest,
)
from .types.resource import CustomClass, PhraseSet, SpeechAdaptation
from .types.resource import (
CustomClass,
PhraseSet,
SpeechAdaptation,
TranscriptNormalization,
)

from google.cloud.speech_v1.helpers import SpeechHelpers

@@ -99,6 +104,7 @@ class SpeechClient(SpeechHelpers, SpeechClient):
"StreamingRecognitionResult",
"StreamingRecognizeRequest",
"StreamingRecognizeResponse",
"TranscriptNormalization",
"TranscriptOutputConfig",
"UpdateCustomClassRequest",
"UpdatePhraseSetRequest",
Original file line number Diff line number Diff line change
@@ -48,7 +48,7 @@
UpdateCustomClassRequest,
UpdatePhraseSetRequest,
)
from .resource import CustomClass, PhraseSet, SpeechAdaptation
from .resource import CustomClass, PhraseSet, SpeechAdaptation, TranscriptNormalization

__all__ = (
"LongRunningRecognizeMetadata",
@@ -85,4 +85,5 @@
"CustomClass",
"PhraseSet",
"SpeechAdaptation",
"TranscriptNormalization",
)
Original file line number Diff line number Diff line change
@@ -359,6 +359,13 @@ class RecognitionConfig(proto.Message):
adaptation <https://cloud.google.com/speech-to-text/docs/adaptation>`__
documentation. When speech adaptation is set it supersedes
the ``speech_contexts`` field.
transcript_normalization (google.cloud.speech_v1.types.TranscriptNormalization):
Optional. Use transcription normalization to
automatically replace parts of the transcript
with phrases of your choosing. For
StreamingRecognize, this normalization only
applies to stable partial transcripts (stability
> 0.8) and final transcripts.
speech_contexts (MutableSequence[google.cloud.speech_v1.types.SpeechContext]):
Array of
[SpeechContext][google.cloud.speech.v1.SpeechContext]. A
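The streaming rule in the `transcript_normalization` docstring above (normalization applies to final transcripts and to partial transcripts with stability above 0.8) can be sketched as a simple predicate. `normalization_applies` and `STABILITY_THRESHOLD` are hypothetical names used only for illustration; the sketch mirrors the documented condition and is not the service's implementation, which decides on its own when normalization runs.

```python
# Threshold taken from the docstring: "stable partial transcripts (stability > 0.8)".
STABILITY_THRESHOLD = 0.8


def normalization_applies(is_final: bool, stability: float) -> bool:
    """Return True when a streaming result would be normalized (assumed reading).

    Final results always qualify; partial results qualify only when their
    stability is strictly greater than the threshold.
    """
    return is_final or stability > STABILITY_THRESHOLD


# A final result qualifies regardless of stability; a partial result needs > 0.8.
print(normalization_applies(True, 0.0))
print(normalization_applies(False, 0.9))
print(normalization_applies(False, 0.8))
```

Under this reading, a client that displays interim results should expect low-stability partial transcripts to contain the unreplaced text.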
@@ -551,6 +558,12 @@ class AudioEncoding(proto.Enum):
5574. In other words, each RTP header is replaced with a
single byte containing the block length. Only Speex wideband
is supported. ``sample_rate_hertz`` must be 16000.
MP3 (8):
MP3 audio. MP3 encoding is a Beta feature and only available
in v1p1beta1. Supports all standard MP3 bitrates (which range
from 32-320 kbps). When using this encoding,
``sample_rate_hertz`` has to match the sample rate of the
file being used.
WEBM_OPUS (9):
Opus encoded audio frames in WebM container
(`OggOpus <https://wiki.xiph.org/OggOpus>`__).
@@ -565,6 +578,7 @@ class AudioEncoding(proto.Enum):
AMR_WB = 5
OGG_OPUS = 6
SPEEX_WITH_HEADER_BYTE = 7
MP3 = 8
WEBM_OPUS = 9

encoding: AudioEncoding = proto.Field(
@@ -605,6 +619,11 @@ class AudioEncoding(proto.Enum):
number=20,
message=resource.SpeechAdaptation,
)
transcript_normalization: resource.TranscriptNormalization = proto.Field(
proto.MESSAGE,
number=24,
message=resource.TranscriptNormalization,
)
speech_contexts: MutableSequence["SpeechContext"] = proto.RepeatedField(
proto.MESSAGE,
number=6,
@@ -659,7 +678,7 @@ class SpeakerDiarizationConfig(proto.Message):
enable_speaker_diarization (bool):
If 'true', enables speaker detection for each recognized
word in the top alternative of the recognition result using
a speaker_tag provided in the WordInfo.
a speaker_label provided in the WordInfo.
min_speaker_count (int):
Minimum number of speakers in the
conversation. This range gives you more
@@ -1469,8 +1488,17 @@ class WordInfo(proto.Message):
speaker within the audio. This field specifies which one of
those speakers was detected to have spoken this word. Value
ranges from '1' to diarization_speaker_count. speaker_tag is
set if enable_speaker_diarization = 'true' and only in the
top alternative.
set if enable_speaker_diarization = 'true' and only for the
top alternative. Note: Use speaker_label instead.
speaker_label (str):
Output only. A label value assigned for every unique speaker
within the audio. This field specifies which speaker was
detected to have spoken this word. For some models, such as
medical_conversation, this can be the actual speaker role, for
example "patient" or "provider", but generally this would be
a number identifying a speaker. This field is only set if
enable_speaker_diarization = 'true' and only for the top
alternative.
"""

start_time: duration_pb2.Duration = proto.Field(
@@ -1495,6 +1523,10 @@ class WordInfo(proto.Message):
proto.INT32,
number=5,
)
speaker_label: str = proto.Field(
proto.STRING,
number=6,
)


class SpeechAdaptationInfo(proto.Message):
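As a rough illustration of how the new `speaker_label` field on `WordInfo` might be consumed, the sketch below collapses consecutive words with the same label into speaker turns. `Word` is a local stand-in that mirrors only the `word` and `speaker_label` fields, and `speaker_turns` is a hypothetical helper; neither is part of the library, and the sample labels follow the "patient" / "provider" example from the docstring.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Iterable, List, Tuple


@dataclass
class Word:
    """Local stand-in for WordInfo, limited to the two fields used here."""

    word: str
    speaker_label: str


def speaker_turns(words: Iterable[Word]) -> List[Tuple[str, str]]:
    """Collapse consecutive words with the same speaker_label into (label, text) turns."""
    return [
        (label, " ".join(w.word for w in group))
        for label, group in groupby(words, key=lambda w: w.speaker_label)
    ]


# Sample data is invented for the example.
words = [
    Word("how", "provider"),
    Word("are", "provider"),
    Word("you", "provider"),
    Word("fine", "patient"),
    Word("thanks", "patient"),
]
turns = speaker_turns(words)
for label, text in turns:
    print(f"{label}: {text}")
```

Because `speaker_label` is a string, the same grouping works whether the model returns role names or numeric identifiers rendered as text.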
Original file line number Diff line number Diff line change
@@ -25,6 +25,7 @@
"CustomClass",
"PhraseSet",
"SpeechAdaptation",
"TranscriptNormalization",
},
)

@@ -228,4 +229,54 @@ class ABNFGrammar(proto.Message):
)


class TranscriptNormalization(proto.Message):
r"""Transcription normalization configuration. Use transcription
normalization to automatically replace parts of the transcript
with phrases of your choosing. For StreamingRecognize, this
normalization only applies to stable partial transcripts
(stability > 0.8) and final transcripts.

Attributes:
entries (MutableSequence[google.cloud.speech_v1.types.TranscriptNormalization.Entry]):
A list of replacement entries. We will perform replacement
with one entry at a time. For example, the second entry in
["cat" => "dog", "mountain cat" => "mountain dog"] will
never be applied because we will always process the first
entry before it. At most 100 entries.
"""

class Entry(proto.Message):
r"""A single replacement configuration.

Attributes:
search (str):
What to replace. Max length is 100
characters.
replace (str):
What to replace with. Max length is 100
characters.
case_sensitive (bool):
Whether the search is case sensitive.
"""

search: str = proto.Field(
proto.STRING,
number=1,
)
replace: str = proto.Field(
proto.STRING,
number=2,
)
case_sensitive: bool = proto.Field(
proto.BOOL,
number=3,
)

entries: MutableSequence[Entry] = proto.RepeatedField(
proto.MESSAGE,
number=1,
message=Entry,
)


__all__ = tuple(sorted(__protobuf__.manifest))
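The ordering rule in the `TranscriptNormalization` docstring (entries are applied one at a time, so an earlier entry can shadow a later one) can be simulated locally. This is a minimal sketch under stated assumptions: `Entry` here is a plain dataclass mirroring the `search`, `replace`, and `case_sensitive` fields, `normalize` is a hypothetical helper, and plain substring matching is assumed because the diff does not specify how the service matches text. The second entry uses "cougar" rather than the docstring's "mountain dog" so that the shadowing is visible in the output.

```python
import re
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Entry:
    """Local stand-in mirroring the fields of TranscriptNormalization.Entry."""

    search: str
    replace: str
    case_sensitive: bool = False


def normalize(transcript: str, entries: Sequence[Entry]) -> str:
    """Apply each entry in list order, one at a time (assumed semantics)."""
    for entry in entries:
        flags = 0 if entry.case_sensitive else re.IGNORECASE
        # re.escape makes the search text literal; the lambda keeps the
        # replacement text literal as well (no backslash interpretation).
        transcript = re.sub(
            re.escape(entry.search),
            lambda _match, text=entry.replace: text,
            transcript,
            flags=flags,
        )
    return transcript


entries = [
    Entry(search="cat", replace="dog"),
    # Never matches: "cat" has already been rewritten by the first entry.
    Entry(search="mountain cat", replace="cougar"),
]
result = normalize("I saw a mountain cat", entries)
print(result)  # I saw a mountain dog
```

If the longer phrase should win, listing it before the shorter one avoids the shadowing in this model.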
Original file line number Diff line number Diff line change
@@ -102,10 +102,12 @@ replacements:
packages/google-cloud-speech/google/cloud/speech_v1/__init__.py,
]
before: |
from .types.resource import CustomClass, PhraseSet, SpeechAdaptation\n
\)

__all__ = \(
after: |
from .types.resource import CustomClass, PhraseSet, SpeechAdaptation\n
)

from google.cloud.speech_v1.helpers import SpeechHelpers\n\n
class SpeechClient(SpeechHelpers, SpeechClient):
__doc__ = SpeechClient.__doc__\n\n