```js
var recognizeMic = require('watson-speech/speech-to-text/recognize-microphone');
```
Breaking change for v0.22.0
---------------------------

The format of objects emitted in objectMode has changed from `{alternatives: [...], index: 1}` to `{results: [{alternatives: [...]}], result_index: 1}`.

There is a new `ResultExtractor` class that restores the old behavior; `recognizeMicrophone()` and `recognizeFile()` both accept a new `extract_results` option to enable it.

This change was made to enable the new `speaker_labels` feature. The format now exactly matches what the Watson Speech to Text service returns and should not change again unless the service itself changes.
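The same flattening can be done by hand. This sketch is illustrative only (the helper name `extractResults` is not the SDK's `ResultExtractor` implementation); it maps the new message shape back to the pre-v0.22.0 per-result shape:

```javascript
// Illustrative sketch: flatten the new objectMode message shape
// ({results: [...], result_index: n}) back into the old per-result shape
// ({alternatives: [...], index: n}), similar in spirit to ResultExtractor.
function extractResults(message) {
  return message.results.map(function(result, i) {
    // copy each result and attach its absolute index
    return Object.assign({}, result, { index: message.result_index + i });
  });
}

var flattened = extractResults({
  results: [{ alternatives: [{ transcript: 'hello world' }], final: true }],
  result_index: 1
});
// flattened[0] is {alternatives: [...], final: true, index: 1}
```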
API & Examples
--------------

The basic API is outlined below; see the complete API docs at http://watson-developer-cloud.github.io/speech-javascript-sdk/master/

See several basic examples at http://watson-speech.mybluemix.net/ ([source](https://github.com/watson-developer-cloud/speech-javascript-sdk/tree/master/examples/))

See a more advanced example at https://speech-to-text-demo.mybluemix.net/

All API methods require an auth token that must be [generated server-side](https://github.com/watson-developer-cloud/node-sdk#authorization). (See https://github.com/watson-developer-cloud/speech-javascript-sdk/tree/master/examples/ for a couple of basic examples in Node.js and Python.)

_NOTE_: The `token` parameter only works for CF instances of services. For RC services using IAM for authentication, the `access_token` parameter must be used.
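That distinction can be expressed as a small helper; this is an illustrative sketch only (the function and field names are not part of the SDK), showing which option key to pass based on the credential your server returned:

```javascript
// Illustrative only: choose between the CF `token` and RC/IAM `access_token`
// option parameters based on which credential the server-side endpoint returned.
function authOptions(credentials) {
  if (credentials.accessToken) {
    return { access_token: credentials.accessToken }; // RC instance using IAM
  }
  return { token: credentials.token }; // CF instance
}
// Merge the result into the options passed to the SDK methods.
```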
Speaks the supplied text through an automatically-created `<audio>` element.

Currently limited to text that can fit within a GET URL. (This is particularly an issue on [Internet Explorer before Windows 10](http://stackoverflow.com/questions/32267442/url-length-limitation-of-microsoft-edge), where the max length is around 1000 characters after the token is accounted for.)

Options:

* `text` - the text to speak
* `url` - the Watson Text to Speech API URL (defaults to https://stream.watsonplatform.net/text-to-speech/api)
* `voice` - the desired playback voice's name - see `.getVoices()`. Note that voices are language-specific.
* `customization_id` - GUID of a custom voice model - omit to use the voice with no customization.
* `autoPlay` - set to `false` to prevent the audio from automatically playing

Relies on browser audio support: should work reliably in Chrome and Firefox on desktop and Android. Edge works with a little help. Safari and all iOS browsers do not seem to work yet.
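Because of the GET URL limit noted above, long text may need to be checked (or split) before synthesis. A rough client-side guard, assuming the ~1000-character budget quoted above (the helper name is illustrative, not part of the SDK):

```javascript
// Illustrative guard against the GET URL length limit; the 1000-character
// default is the rough budget quoted above for old IE/Edge after the token.
function fitsInGetUrl(text, maxChars) {
  maxChars = maxChars || 1000;
  // the text is URL-encoded when sent, so measure the encoded length
  return encodeURIComponent(text).length <= maxChars;
}
```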
The `recognizeMicrophone()` and `recognizeFile()` helper methods are recommended for most use-cases. They set up the streams in the appropriate order and enable common options. These two methods are documented below.

The core of the library is the [RecognizeStream] that performs the actual transcription, and a collection of other Node.js-style streams that manipulate the data in various ways. For less common use-cases, the core components may be used directly, with the helper methods serving as optional templates to follow. The full library is documented at http://watson-developer-cloud.github.io/speech-javascript-sdk/master/module-watson-speech_speech-to-text.html

* `keepMicrophone`: if true, preserves the MicrophoneStream for subsequent calls, preventing additional permissions requests in Firefox
* `mediaStream`: optionally pass in an existing media stream rather than prompting the user for microphone access
* Other options passed to [RecognizeStream]
* Other options passed to [SpeakerStream] if `options.resultsbySpeaker` is set to true
* Other options passed to [FormatStream] if `options.format` is not set to false
* Other options passed to [WritableElementStream] if `options.outputElement` is set

Requires the `getUserMedia` API, so browser compatibility is limited (see http://caniuse.com/#search=getusermedia). Also note that Chrome requires https (with a few exceptions for localhost and such) - see https://www.chromium.org/Home/chromium-security/prefer-secure-origins-for-powerful-new-features

No more data will be sent after `.stop()` is called on the returned stream, but additional results may be received for already-sent data.
Can recognize and optionally attempt to play a URL, [File](https://developer.mozilla.org/en-US/docs/Web/API/File) or [Blob](https://developer.mozilla.org/en-US/docs/Web/API/Blob) (such as from an `<input type="file"/>` or from an ajax request).

Options:

* `file`: a String URL or a `Blob` or `File` instance. Note that [CORS] restrictions apply to URLs.
* `play`: (optional, default=`false`) attempt to also play the file locally while uploading it for transcription
* Other options passed to [RecognizeStream]
* Other options passed to [TimingStream] if `options.realtime` is true, or unset and `options.play` is true
* Other options passed to [SpeakerStream] if `options.resultsbySpeaker` is set to true
* Other options passed to [FormatStream] if `options.format` is not set to false
* Other options passed to [WritableElementStream] if `options.outputElement` is set

`play` requires that the browser support the format; most browsers support wav and ogg/opus, but not flac. An `UNSUPPORTED_FORMAT` error will be emitted on the RecognizeStream if playback fails. This error is special in that it does not stop the streaming of results.

Playback will automatically stop when `.stop()` is called on the returned stream.

For Mobile Safari compatibility, a URL must be provided, and `recognizeFile()` must be called in direct response to a user interaction (so the token must be pre-loaded).
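The [TimingStream] condition in the options list above can be restated directly in code; a sketch (the helper name is illustrative, not part of the SDK):

```javascript
// Illustrative restatement of the condition above: the TimingStream applies
// when options.realtime is explicitly true, or when realtime is left unset
// and options.play is true.
function usesTimingStream(options) {
  if (options.realtime !== undefined) {
    return options.realtime === true;
  }
  return options.play === true;
}
```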
## Changes

There have been a few breaking changes in recent releases:

* Removed `SpeechToText.recognizeElement()` due to quality issues. The code is [available in an (unsupported) example](https://github.com/watson-developer-cloud/speech-javascript-sdk/tree/master/examples/static/audio-video-deprecated) if you wish to use it with current releases of the SDK.
* Renamed `recognizeBlob` to `recognizeFile` to make the primary usage more apparent
* Changed the `playFile` option of `recognizeBlob()` to just `play`, and corrected its default
* Changed the format of objects emitted in objectMode to exactly match what the service sends. Added the `ResultStream` class and `extract_results` option to enable the older behavior.
* Changed the `playback-error` event to just `error` when recognizing and playing a file. Check for `error.name == 'UNSUPPORTED_FORMAT'` to identify playback errors. This error is special in that it does not stop the streaming of results.
* Renamed `recognizeFile()`'s `data` option to `file` because it may now be a URL. Using a URL enables faster playback and Mobile Safari support.
* Removed the `continuous` flag from `OPENING_MESSAGE_PARAMS_ALLOWED`
See [CHANGELOG.md](CHANGELOG.md) for a complete list of changes.

## Development

### Use examples for development

The provided examples can be used to test development code in action:

* `cd examples/`
* `npm run dev`

This will build the local code, move the new bundle into the `examples/` directory, and start a server at `localhost:3000` where the examples will be running.

Note: this requires valid service credentials.
### Testing

The test suite is broken up into offline unit tests and integration tests that test against actual service instances.

* `npm test` will run the linter and the offline tests

To run the integration tests, a file with service credentials is required. This file must be called `stt-auth.json` and must be located in `/test/resources/`. There are tests for usage of both CF and RC service instances. For testing CF, the required keys in this configuration file are `username` and `password`. For testing RC, a key of either `iam_access_token` or `iam_apikey` is required. Optionally, a service URL for an RC instance can be provided under the key `rc_service_url` if the service is available under a URL other than `https://stream.watsonplatform.net/speech-to-text/api`.
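For illustration, a minimal hypothetical `stt-auth.json` for a CF instance (all values are placeholders; for an RC instance, use `iam_apikey` or `iam_access_token` instead):

```json
{
  "username": "your-cf-username",
  "password": "your-cf-password"
}
```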
For an example, see `test/resources/stt-auth-example.json`.
## todo

* Further solidify the API
* Break components into standalone npm modules where it makes sense
* Run integration tests on Travis (fall back to offline server for pull requests)