README.md
25 additions & 15 deletions (+25 -15)
@@ -10,7 +10,7 @@ Allows you to easily add voice recognition and synthesis to any web app with min
10
10
This library is primarily intended for use in web browsers.
11
11
Check out [watson-developer-cloud](https://www.npmjs.com/package/watson-developer-cloud) to use Watson services (speech and others) from Node.js.
12
12
13
-
However, a server-side component is required to generate auth tokens.
13
+
However, a **server-side component is required to generate auth tokens**.
14
14
The examples/ folder includes example Node.js and Python servers, and SDKs are available for [Node.js](https://github.com/watson-developer-cloud/node-sdk#authorization),
The format of objects emitted in objectMode has changed from `{alternatives: [...], index: 1}` to `{results: [{alternatives: [...]}], result_index: 1}`.
35
-
This was done to enable the new `speaker_labels` feature.
36
-
There is a new `ResultExtractor` class and `recognizeMicrophone()` and `recognizeFile()` both accept a new `extract_results` option to restore the old behavior.
37
35
38
-
The format now exactly matches what the Watson Speech to Text service returns and shouldn't change again unless the Watson service changes.
36
+
There is a new `ResultExtractor` class that restores the old behavior; `recognizeMicrophone()` and `recognizeFile()` both accept a new `extract_results` option to enable it.
37
+
38
+
This was done to enable the new `speaker_labels` feature. The format now exactly matches what the Watson Speech to Text service returns and shouldn't change again unless the Watson service changes.
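The relationship between the two formats can be sketched in plain JavaScript. This is a minimal illustration only, not the SDK's `ResultExtractor` implementation: `toOldFormat` is a hypothetical helper, and copying `result_index` onto every result as `index` is an assumption based on the example shapes given above.

```javascript
// Hypothetical helper (not part of the SDK): turns one new-style message,
// {results: [{alternatives: [...]}], result_index: n}, into a list of
// old-style objects, {alternatives: [...], index: n}.
function toOldFormat(message) {
  const results = Array.isArray(message.results) ? message.results : [];
  // Assumption: each extracted result carries the message's result_index.
  return results.map(function (result) {
    return Object.assign({}, result, { index: message.result_index });
  });
}

const extracted = toOldFormat({
  results: [{ alternatives: [{ transcript: 'hello world' }] }],
  result_index: 1
});
console.log(JSON.stringify(extracted));
```

In practice, setting the `extract_results` option should make a helper like this unnecessary; the sketch is only meant to show how the two shapes line up.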
39
39
40
40
41
41
API & Examples
42
42
--------------
43
43
44
44
The basic API is outlined below; see the complete API docs at http://watson-developer-cloud.github.io/speech-javascript-sdk/master/
45
45
46
-
See several examples at https://github.com/watson-developer-cloud/speech-javascript-sdk/tree/master/examples/static/
46
+
See code for several basic examples at https://github.com/watson-developer-cloud/speech-javascript-sdk/tree/master/examples/static/
47
+
48
+
See a live example at https://speech-to-text-demo.mybluemix.net/
47
49
48
50
All API methods require an auth token that must be [generated server-side](https://github.com/watson-developer-cloud/node-sdk#authorization).
49
51
(See https://github.com/watson-developer-cloud/speech-javascript-sdk/tree/master/examples/ for a couple of basic examples in Node.js and Python.)
@@ -57,46 +59,52 @@ Currently limited to text that can fit within a GET URL (this is particularly an
57
59
where the max length is around 1000 characters after the token is accounted for.)
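Because the text travels in a GET URL, it may help to check its encoded length before requesting synthesis. The sketch below is an assumption-laden illustration: `fitsInGetUrl` is a hypothetical helper, and the default budget of 1000 characters is taken from the rough figure quoted above, not from a documented limit.

```javascript
// Hypothetical pre-flight check (not part of the SDK).
// maxChars defaults to the approximate figure quoted above; treat it as a
// rough budget, since the real limit depends on the browser and token length.
function fitsInGetUrl(text, maxChars) {
  const budget = typeof maxChars === 'number' ? maxChars : 1000;
  // Measure the URL-encoded length, because non-ASCII characters and spaces
  // expand when they are percent-encoded.
  return encodeURIComponent(text).length <= budget;
}

console.log(fitsInGetUrl('Hello from Watson'));
```

Text that fails the check could be split into sentences and synthesized in several requests.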
58
60
59
61
Options:
60
-
* text - the text to transcribe // todo: list supported languages
62
+
* text - the text to speak
61
63
* voice - the desired playback voice's name - see .getVoices(). Note that the voices are language-specific.
62
64
* autoPlay - set to false to prevent the audio from automatically playing
The `recognizeMicrophone()` and `recognizeFile()` helper methods are recommended for most use-cases. They set up the streams in the appropriate order and enable common options. These two methods are documented below.
70
+
71
+
The core of the library is the [RecognizeStream] that performs the actual transcription, and a collection of other Node.js-style streams that manipulate the data in various ways. For less common use-cases, the core components may be used directly with the helper methods serving as optional templates to follow. The full library is documented at http://watson-developer-cloud.github.io/speech-javascript-sdk/master/module-watson-speech_speech-to-text.html
* `keepMic`: if true, preserves the MicrophoneStream for subsequent calls, preventing additional permissions requests in Firefox
72
77
* Other options passed to [RecognizeStream]
78
+
* Other options passed to [SpeakerStream] if `options.resultsBySpeaker` is set to true
79
+
* Other options passed to [FormatStream] if `options.format` is not set to false
73
80
* Other options passed to [WritableElementStream] if `options.outputElement` is set
74
81
75
82
Requires the `getUserMedia` API, so limited browser compatibility (see http://caniuse.com/#search=getusermedia)
76
83
Also note that Chrome requires https (with a few exceptions for localhost and such) - see https://www.chromium.org/Home/chromium-security/prefer-secure-origins-for-powerful-new-features
77
84
78
-
Pipes results through a [FormatStream] by default, set `options.format=false` to disable.
85
+
No more data will be sent after `.stop()` is called on the returned stream, but additional results may be received for already-sent data.
Can recognize and optionally attempt to play a [File](https://developer.mozilla.org/en-US/docs/Web/API/File) or [Blob](https://developer.mozilla.org/en-US/docs/Web/API/Blob)
90
+
Can recognize and optionally attempt to play a URL, [File](https://developer.mozilla.org/en-US/docs/Web/API/File) or [Blob](https://developer.mozilla.org/en-US/docs/Web/API/Blob)
84
91
(such as from an `<input type="file"/>` or from an ajax request.)
85
92
86
93
Options:
87
-
* `file`: a String URL or a `Blob` or `File` instance.
94
+
* `file`: a String URL or a `Blob` or `File` instance. Note that [CORS] restrictions apply to URLs.
88
95
* `play`: (optional, default=`false`) Attempt to also play the file locally while uploading it for transcription
89
96
* Other options passed to [RecognizeStream]
97
+
* Other options passed to [TimingStream] if `options.realtime` is true, or unset and `options.play` is true
98
+
* Other options passed to [SpeakerStream] if `options.resultsBySpeaker` is set to true
99
+
* Other options passed to [FormatStream] if `options.format` is not set to false
90
100
* Other options passed to [WritableElementStream] if `options.outputElement` is set
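The conditions in the list above can be read as a small decision table. The following sketch restates them in plain JavaScript; `plannedStreams` is a hypothetical helper written for illustration, not an SDK function, and the camelCase spelling `resultsBySpeaker` is an assumption about the option name.

```javascript
// Hypothetical helper (not part of the SDK): reports which optional streams
// the options above would enable. It only mirrors the documented conditions
// and says nothing about the order in which streams are piped.
function plannedStreams(options) {
  const opts = options || {};
  return {
    // TimingStream: realtime is true, or realtime is unset and play is true.
    timing: opts.realtime === true ||
      (opts.realtime === undefined && opts.play === true),
    // SpeakerStream: only when resultsBySpeaker is set to true.
    speaker: opts.resultsBySpeaker === true,
    // FormatStream: on unless format is explicitly set to false.
    format: opts.format !== false,
    // WritableElementStream: only when an outputElement is provided.
    element: Boolean(opts.outputElement)
  };
}

console.log(plannedStreams({ play: true, outputElement: '#output' }));
```

For instance, passing `{play: true, realtime: false}` would keep local playback while skipping the TimingStream.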
91
101
92
102
`play` requires that the browser support the format; most browsers support wav and ogg/opus, but not flac.
93
-
Will emit an `UNSUPPORTED_FORMAT` error on the RecognizeStream if playback fails.
94
-
Playback will automatically stop when `.stop()` is called on the returned stream.
95
-
For Mobile Safari compatibility, a URL must be provided, and `recognizeFile()` must be called in direct response to a user interaction (so the token must be pre-loaded).
103
+
Will emit an `UNSUPPORTED_FORMAT` error on the RecognizeStream if playback fails. This error is special in that it does not stop the streaming of results.
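One way to act on this distinction is to branch on `error.name` inside the error handler. The sketch below is illustrative only: `isPlaybackError` and `handleStreamError` are hypothetical helper names, and the returned `'continue'` / `'abort'` strings are an arbitrary convention chosen for the example.

```javascript
// Hypothetical helpers (not part of the SDK).
function isPlaybackError(err) {
  return Boolean(err) && err.name === 'UNSUPPORTED_FORMAT';
}

// Returns 'continue' when only local playback failed (results keep
// streaming), and 'abort' for any other error.
function handleStreamError(err, log) {
  if (isPlaybackError(err)) {
    log('Playback failed; transcription results will keep arriving.');
    return 'continue';
  }
  log('Recognition error: ' + (err && err.message));
  return 'abort';
}

const playbackError = new Error('cannot play this file');
playbackError.name = 'UNSUPPORTED_FORMAT';
console.log(handleStreamError(playbackError, console.log));
```

A handler like this could be attached with `stream.on('error', function (err) { handleStreamError(err, console.log); })`, where `stream` is the value returned by `recognizeFile()`.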
96
104
97
-
Pipes results through a [TimingStream] by if `options.play=true`, set `options.realtime=false` to disable.
105
+
Playback will automatically stop when `.stop()` is called on the returned stream.
98
106
99
-
Pipes results through a [FormatStream] by default, set `options.format=false`to disable.
107
+
For Mobile Safari compatibility, a URL must be provided, and `recognizeFile()` must be called in direct response to a user interaction (so the token must be pre-loaded).
100
108
101
109
102
110
## Changes
@@ -107,7 +115,7 @@ There have been a few breaking changes in recent releases:
107
115
* renamed `recognizeBlob` to `recognizeFile` to make the primary usage more apparent
108
116
* Changed `playFile` option of `recognizeBlob()` to just `play`, corrected default
109
117
* Changed format of objects emitted in objectMode to exactly match what the service sends. Added `ResultExtractor` class and `extract_results` option to enable older behavior.
110
-
* Changed `playback-error` event to just `error` when recognizing and playing a file. Check for `error.name == 'UNSUPPORTED_FORMAT'` to identify playback errors
118
+
* Changed `playback-error` event to just `error` when recognizing and playing a file. Check for `error.name == 'UNSUPPORTED_FORMAT'` to identify playback errors. This error is special in that it does not stop the streaming of results.
111
119
* Renamed `recognizeFile()`'s `data` option to `file` because it now may be a URL. Using a URL enables faster playback and mobile Safari support
112
120
113
121
See [CHANGELOG.md](CHANGELOG.md) for a complete list of changes.
@@ -131,3 +139,5 @@ See [CHANGELOG.md](CHANGELOG.md) for a complete list of changes.