1717# The IBM® Speech to Text service provides an API that uses IBM's speech-recognition
1818# capabilities to produce transcripts of spoken audio. The service can transcribe speech
1919# from various languages and audio formats. In addition to basic transcription, the
20- # service can produce detailed information about many aspects of the audio. For most
21- # languages, the service supports two sampling rates, broadband and narrowband. It returns
22- # all JSON response content in the UTF-8 character set. For more information about the
23- # service, see the [IBM® Cloud
20+ # service can produce detailed information about many different aspects of the audio. For
21+ # most languages, the service supports two sampling rates, broadband and narrowband. It
22+ # returns all JSON response content in the UTF-8 character set.
23+ #
24+ # For more information about the service, see the [IBM® Cloud
2425# documentation](https://console.bluemix.net/docs/services/speech-to-text/index.html).
2526#
2627# ### API usage guidelines
2728# * **Audio formats:** The service accepts audio in many formats (MIME types). See [Audio
2829# formats](https://console.bluemix.net/docs/services/speech-to-text/audio-formats.html).
29- # * **HTTP interfaces:** The service provides three HTTP interfaces for speech
30- # recognition. The sessionless interface includes a single synchronous method. The
31- # session-based interface includes multiple synchronous methods for maintaining a long,
32- # multi-turn exchange with the service. And the asynchronous interface provides multiple
33- # methods that use registered callbacks and polling for non-blocking recognition. See [The
34- # HTTP REST interface](https://console.bluemix.net/docs/services/speech-to-text/http.html)
35- # and [The asynchronous HTTP
30+ # * **HTTP interfaces:** The service provides two HTTP Representational State Transfer
31+ # (REST) interfaces for speech recognition. The basic interface includes a single
32+ # synchronous method. The asynchronous interface provides multiple methods that use
33+ # registered callbacks and polling for non-blocking recognition. See [The HTTP
34+ # interface](https://console.bluemix.net/docs/services/speech-to-text/http.html) and [The
35+ # asynchronous HTTP
3636# interface](https://console.bluemix.net/docs/services/speech-to-text/async.html).
37- #
38- # **Important:** The session-based interface is deprecated as of August 8, 2018, and
39- # will be removed from service on September 7, 2018. Use the sessionless, asynchronous, or
40- # WebSocket interface instead. For more information, see the August 8 service update in
41- # the [Release
42- # notes](https://console.bluemix.net/docs/services/speech-to-text/release-notes.html#August2018).
4337# * **WebSocket interface:** The service also offers a WebSocket interface for speech
4438# recognition. The WebSocket interface provides a full-duplex, low-latency communication
4539# channel. Clients send requests and audio to the service and receive results over a
4640# single connection in an asynchronous fashion. See [The WebSocket
4741# interface](https://console.bluemix.net/docs/services/speech-to-text/websockets.html).
48- # * **Customization:** Use language model customization to expand the vocabulary of a base
49- # model with domain-specific terminology. Use acoustic model customization to adapt a base
50- # model for the acoustic characteristics of your audio. Language model customization is
51- # generally available for production use by most supported languages; acoustic model
52- # customization is beta functionality that is available for all supported languages. See
53- # [The customization
42+ # * **Customization:** The service offers two customization interfaces. Use language model
43+ # customization to expand the vocabulary of a base model with domain-specific terminology.
44+ # Use acoustic model customization to adapt a base model for the acoustic characteristics
45+ # of your audio. Language model customization is generally available for production use by
46+ # most supported languages; acoustic model customization is beta functionality that is
47+ # available for all supported languages. See [The customization
5448# interface](https://console.bluemix.net/docs/services/speech-to-text/custom.html).
5549# * **Customization IDs:** Many methods accept a customization ID to identify a custom
5650# language or custom acoustic model. Customization IDs are Globally Unique Identifiers
@@ -175,27 +169,27 @@ def get_model(model_id:)
175169 response
176170 end
177171 #########################
178- # Sessionless
172+ # Synchronous
179173 #########################
180174
181175 ##
182176 # @!method recognize(audio:, content_type:, model: nil, customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil)
183177 # Recognize audio.
184- # Sends audio and returns transcription results for a sessionless recognition
185- # request. Returns only the final results; to enable interim results, use
186- # session-based requests or the WebSocket API. The service imposes a data size limit
187- # of 100 MB. It automatically detects the endianness of the incoming audio and, for
188- # audio that includes multiple channels, downmixes the audio to one-channel mono
189- # during transcoding. (For the `audio/l16` format, you can specify the endianness.)
178+ # Sends audio and returns transcription results for a recognition request. Returns
179+ # only the final results; to enable interim results, use the WebSocket API. The
180+ # service imposes a data size limit of 100 MB. It automatically detects the
181+ # endianness of the incoming audio and, for audio that includes multiple channels,
182+ # downmixes the audio to one-channel mono during transcoding. (For the `audio/l16`
183+ # format, you can specify the endianness.)
190184 #
191185 # ### Streaming mode
192186 #
193187 # For requests to transcribe live audio as it becomes available, you must set the
194188 # `Transfer-Encoding` header to `chunked` to use streaming mode. In streaming mode,
195189 # the server closes the connection (status code 408) if the service receives no data
196- # chunk for 30 seconds and the service has no audio to transcribe for 30 seconds.
197- # The server also closes the connection (status code 400) if no speech is detected
198- # for `inactivity_timeout` seconds of audio (not processing time); use the
190+ # chunk for 30 seconds and it has no audio to transcribe for 30 seconds. The server
191+ # also closes the connection (status code 400) if no speech is detected for
192+ # `inactivity_timeout` seconds of audio (not processing time); use the
199193 # `inactivity_timeout` parameter to change the default of 30 seconds.
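
The streaming-mode requirement described above can be sketched with Ruby's standard `net/http` library. This is a minimal illustration, not the SDK's own implementation: the endpoint URL, the `audio/l16; rate=16000` content type, and the in-memory audio source are all assumptions made for the example, and authentication is omitted. The request is only built, not sent.

```ruby
require "net/http"
require "stringio"
require "uri"

# Hypothetical endpoint; substitute the URL of your own service instance.
uri = URI("https://example.invalid/speech-to-text/api/v1/recognize")

# Stand-in for a live audio source such as a microphone pipe (assumption).
audio_io = StringIO.new("\x00\x01" * 1024)

request = Net::HTTP::Post.new(uri)
# Assumed content type for raw PCM audio; use the type that matches your audio.
request["Content-Type"] = "audio/l16; rate=16000"
# Streaming mode requires chunked transfer encoding.
request["Transfer-Encoding"] = "chunked"
# body_stream lets Net::HTTP read and send the audio incrementally.
request.body_stream = audio_io

# Sending is left commented out because the endpoint above is not real:
# Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }
```

With a real audio source, the client would need to keep supplying data so that the 30-second limits described above are not reached.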
200194 #
201195 # ### Audio formats (content types)
@@ -234,38 +228,32 @@ def get_model(model_id:)
234228 # limit imposed by most HTTP servers and proxies. You can encounter this limit, for
235229 # example, if you want to spot a very large number of keywords.
236230 #
237- # For information about submitting a multipart request, see [Submitting multipart
238- # requests as form
239- # data](https://console.bluemix.net/docs/services/speech-to-text/http.html#HTTP-multi).
231+ # For information about submitting a multipart request, see [Making a multipart HTTP
232+ # request](https://console.bluemix.net/docs/services/speech-to-text/http.html#HTTP-multi).
240233 # @param audio [String] The audio to transcribe in the format specified by the `Content-Type` header.
241234 # @param content_type [String] The type of the input.
242- # @param model [String] The identifier of the model that is to be used for the recognition request or, for
243- # the **Create a session** method, with the new session.
235+ # @param model [String] The identifier of the model that is to be used for the recognition request.
244236 # @param customization_id [String] The customization ID (GUID) of a custom language model that is to be used with the
245- # recognition request or, for the **Create a session** method, with the new session.
246- # The base model of the specified custom language model must match the model
247- # specified with the `model` parameter. You must make the request with service
248- # credentials created for the instance of the service that owns the custom model. By
249- # default, no custom language model is used.
237+ # recognition request. The base model of the specified custom language model must
238+ # match the model specified with the `model` parameter. You must make the request
239+ # with service credentials created for the instance of the service that owns the
240+ # custom model. By default, no custom language model is used.
250241 # @param acoustic_customization_id [String] The customization ID (GUID) of a custom acoustic model that is to be used with the
251- # recognition request or, for the **Create a session** method, with the new session.
252- # The base model of the specified custom acoustic model must match the model
253- # specified with the `model` parameter. You must make the request with service
254- # credentials created for the instance of the service that owns the custom model. By
255- # default, no custom acoustic model is used.
242+ # recognition request. The base model of the specified custom acoustic model must
243+ # match the model specified with the `model` parameter. You must make the request
244+ # with service credentials created for the instance of the service that owns the
245+ # custom model. By default, no custom acoustic model is used.
256246 # @param base_model_version [String] The version of the specified base model that is to be used with the recognition
257- # request or, for the **Create a session** method, with the new session. Multiple
258- # versions of a base model can exist when a model is updated for internal
259- # improvements. The parameter is intended primarily for use with custom models that
260- # have been upgraded for a new base model. The default value depends on whether the
261- # parameter is used with or without a custom model. For more information, see [Base
262- # model
247+ # request. Multiple versions of a base model can exist when a model is updated for
248+ # internal improvements. The parameter is intended primarily for use with custom
249+ # models that have been upgraded for a new base model. The default value depends on
250+ # whether the parameter is used with or without a custom model. For more
251+ # information, see [Base model
263252 # version](https://console.bluemix.net/docs/services/speech-to-text/input.html#version).
264253 # @param customization_weight [Float] If you specify the customization ID (GUID) of a custom language model with the
265- # recognition request or, for sessions, with the **Create a session** method, the
266- # customization weight tells the service how much weight to give to words from the
267- # custom language model compared to those from the base model for the current
268- # request.
254+ # recognition request, the customization weight tells the service how much weight to
255+ # give to words from the custom language model compared to those from the base model
256+ # for the current request.
269257 #
270258 # Specify a value between 0.0 and 1.0. Unless a different customization weight was
271259 # specified for the custom model when it was trained, the default value is 0.3. A
@@ -658,8 +646,7 @@ def unregister_callback(callback_url:)
658646 # formats](https://console.bluemix.net/docs/services/speech-to-text/audio-formats.html).
659647 # @param audio [String] The audio to transcribe in the format specified by the `Content-Type` header.
660648 # @param content_type [String] The type of the input.
661- # @param model [String] The identifier of the model that is to be used for the recognition request or, for
662- # the **Create a session** method, with the new session.
649+ # @param model [String] The identifier of the model that is to be used for the recognition request.
663650 # @param callback_url [String] A URL to which callback notifications are to be sent. The URL must already be
664651 # successfully white-listed by using the **Register a callback** method. You can
665652 # include the same callback URL with any number of job creation requests. Omit the
@@ -679,10 +666,12 @@ def unregister_callback(callback_url:)
679666 # * `recognitions.failed` generates a callback notification if the service
680667 # experiences an error while processing the job.
681668 #
682- # Omit the parameter to subscribe to the default events: `recognitions.started`,
683- # `recognitions.completed`, and `recognitions.failed`. The `recognitions.completed`
684- # and `recognitions.completed_with_results` events are incompatible; you can specify
685- # only one of the two events. If the job does not include a callback URL, omit the
669+ # The `recognitions.completed` and `recognitions.completed_with_results` events are
670+ # incompatible. You can specify only one of the two events.
671+ #
672+ # If the job includes a callback URL, omit the parameter to subscribe to the default
673+ # events: `recognitions.started`, `recognitions.completed`, and
674+ # `recognitions.failed`. If the job does not include a callback URL, omit the
686675 # parameter.
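
The rules for the `events` parameter can be sketched as a small client-side check. `resolve_events` is a hypothetical helper, not part of the SDK. Treating events supplied without a callback URL as an error, and returning an empty list when there is no callback URL, are assumptions of this sketch rather than documented service behavior.

```ruby
# Default subscription when a callback URL is given and events is omitted.
DEFAULT_EVENTS = %w[
  recognitions.started
  recognitions.completed
  recognitions.failed
].freeze

# These two events are incompatible; only one of them may be requested.
EXCLUSIVE_EVENTS = %w[
  recognitions.completed
  recognitions.completed_with_results
].freeze

# Hypothetical helper: returns the list of events a job would subscribe to.
def resolve_events(events: nil, callback_url: nil)
  if callback_url.nil?
    # Assumption: reject events when there is no callback URL to notify.
    raise ArgumentError, "events requires a callback URL" unless events.nil?
    return []
  end

  return DEFAULT_EVENTS.dup if events.nil?

  # Both exclusive events are present when nothing is left after subtraction.
  if (EXCLUSIVE_EVENTS - events).empty?
    raise ArgumentError, "specify only one of #{EXCLUSIVE_EVENTS.join(' and ')}"
  end

  events
end
```

Validating on the client side like this avoids a round trip for a request the documentation describes as invalid.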
687676 # @param user_token [String] If the job includes a callback URL, a user-specified string that the service is to
688677 # include with each callback notification for the job; the token allows the user to
@@ -693,30 +682,26 @@ def unregister_callback(callback_url:)
693682 # this time. Omit the parameter to use a time to live of one week. The parameter is
694683 # valid with or without a callback URL.
695684 # @param customization_id [String] The customization ID (GUID) of a custom language model that is to be used with the
696- # recognition request or, for the **Create a session** method, with the new session.
697- # The base model of the specified custom language model must match the model
698- # specified with the `model` parameter. You must make the request with service
699- # credentials created for the instance of the service that owns the custom model. By
700- # default, no custom language model is used.
685+ # recognition request. The base model of the specified custom language model must
686+ # match the model specified with the `model` parameter. You must make the request
687+ # with service credentials created for the instance of the service that owns the
688+ # custom model. By default, no custom language model is used.
701689 # @param acoustic_customization_id [String] The customization ID (GUID) of a custom acoustic model that is to be used with the
702- # recognition request or, for the **Create a session** method, with the new session.
703- # The base model of the specified custom acoustic model must match the model
704- # specified with the `model` parameter. You must make the request with service
705- # credentials created for the instance of the service that owns the custom model. By
706- # default, no custom acoustic model is used.
690+ # recognition request. The base model of the specified custom acoustic model must
691+ # match the model specified with the `model` parameter. You must make the request
692+ # with service credentials created for the instance of the service that owns the
693+ # custom model. By default, no custom acoustic model is used.
707694 # @param base_model_version [String] The version of the specified base model that is to be used with recognition
708- # request or, for the **Create a session** method, with the new session. Multiple
709- # versions of a base model can exist when a model is updated for internal
710- # improvements. The parameter is intended primarily for use with custom models that
711- # have been upgraded for a new base model. The default value depends on whether the
712- # parameter is used with or without a custom model. For more information, see [Base
713- # model
695+ # request. Multiple versions of a base model can exist when a model is updated for
696+ # internal improvements. The parameter is intended primarily for use with custom
697+ # models that have been upgraded for a new base model. The default value depends on
698+ # whether the parameter is used with or without a custom model. For more
699+ # information, see [Base model
714700 # version](https://console.bluemix.net/docs/services/speech-to-text/input.html#version).
715701 # @param customization_weight [Float] If you specify the customization ID (GUID) of a custom language model with the
716- # recognition request or, for sessions, with the **Create a session** method, the
717- # customization weight tells the service how much weight to give to words from the
718- # custom language model compared to those from the base model for the current
719- # request.
702+ # recognition request, the customization weight tells the service how much weight to
703+ # give to words from the custom language model compared to those from the base model
704+ # for the current request.
720705 #
721706 # Specify a value between 0.0 and 1.0. Unless a different customization weight was
722707 # specified for the custom model when it was trained, the default value is 0.3. A
@@ -1546,14 +1531,16 @@ def add_words(customization_id:, words:)
15461531 # Omit this field for the **Add a custom word** method.
15471532 # @param sounds_like [Array[String]] An array of sounds-like pronunciations for the custom word. Specify how words that
15481533 # are difficult to pronounce, foreign words, acronyms, and so on can be pronounced
1549- # by users. For a word that is not in the service's base vocabulary, omit the
1550- # parameter to have the service automatically generate a sounds-like pronunciation
1551- # for the word. For a word that is in the service's base vocabulary, use the
1552- # parameter to specify additional pronunciations for the word. You cannot override
1553- # the default pronunciation of a word; pronunciations you add augment the
1554- # pronunciation from the base vocabulary. A word can have at most five sounds-like
1555- # pronunciations, and a pronunciation can include at most 40 characters not
1556- # including spaces.
1534+ # by users.
1535+ # * For a word that is not in the service's base vocabulary, omit the parameter to
1536+ # have the service automatically generate a sounds-like pronunciation for the word.
1537+ # * For a word that is in the service's base vocabulary, use the parameter to
1538+ # specify additional pronunciations for the word. You cannot override the default
1539+ # pronunciation of a word; pronunciations you add augment the pronunciation from the
1540+ # base vocabulary.
1541+ #
1542+ # A word can have at most five sounds-like pronunciations. A pronunciation can
1543+ # include at most 40 characters not including spaces.
15571544 # @param display_as [String] An alternative spelling for the custom word when it appears in a transcript. Use
15581545 # the parameter when you want the word to have a spelling that is different from its
15591546 # usual representation or from its spelling in corpora training data.
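
The limits on `sounds_like` stated above (at most five pronunciations, each at most 40 characters not counting spaces) can be checked before a request is sent. The helper below is a hypothetical sketch and not part of the SDK; its name and error messages are illustrative only.

```ruby
MAX_PRONUNCIATIONS = 5
MAX_PRONUNCIATION_CHARS = 40

# Hypothetical helper: returns a list of problems found in a sounds_like
# array, or an empty list if it satisfies the documented limits.
def sounds_like_errors(sounds_like)
  errors = []

  if sounds_like.length > MAX_PRONUNCIATIONS
    errors << "too many pronunciations (#{sounds_like.length} > #{MAX_PRONUNCIATIONS})"
  end

  sounds_like.each do |pronunciation|
    # Spaces do not count toward the 40-character limit.
    length = pronunciation.delete(" ").length
    if length > MAX_PRONUNCIATION_CHARS
      errors << "pronunciation too long (#{length} > #{MAX_PRONUNCIATION_CHARS}): #{pronunciation}"
    end
  end

  errors
end
```

For example, `sounds_like_errors(["I triple E"])` returns an empty array, while an array of six pronunciations returns one error.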