Commit 1818433

adding support for STT profanity_filter & documenting keywords and words_alternatives options
1 parent 949d4b6 commit 1818433

File tree

2 files changed

+26
-17
lines changed

README.md

Lines changed: 15 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,8 @@ Allows you to easily add voice recognition and synthesis to any web app with min
88

99
**Warning** This library still has a few rough edges and may yet see breaking changes.
1010

11-
**For Web Browsers Only** This library is primarily intended for use in browsers.
11+
### For Web Browsers Only
12+
This library is primarily intended for use in browsers.
1213
Check out [watson-developer-cloud](https://www.npmjs.com/package/watson-developer-cloud) to use Watson services (speech and others) from Node.js.
1314

1415
However, a server-side component is required to generate auth tokens.
@@ -17,15 +18,23 @@ The examples/ folder includes example Node.js and Python servers, and SDKs are a
1718
[Python](https://github.com/watson-developer-cloud/python-sdk/blob/master/examples/authorization_v1.py),
1819
and there is also a [REST API](http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/doc/getting_started/gs-tokens.shtml).
1920

21+
### Examples
2022
See several examples at https://github.com/watson-developer-cloud/speech-javascript-sdk/tree/master/examples
2123

22-
This library is built with [browserify](http://browserify.org/) and easy to use in browserify-based projects (`npm install --save watson-speech`), but you can also grab the compiled bundle from the
23-
`dist/` folder and use it as a standalone library.
24+
### Installation - npm
2425

25-
Basic API
26+
This library is built with [browserify](http://browserify.org/) and is easy to use in browserify-based projects:
27+
28+
npm install --save watson-speech
29+
30+
### Installation - standalone
31+
32+
Pre-compiled bundles are also available from GitHub Releases: https://github.com/watson-developer-cloud/speech-javascript-sdk/releases
33+
34+
API
2635
---------
2736

28-
Complete API docs should be published at http://watson-developer-cloud.github.io/speech-javascript-sdk/
37+
The basic API is outlined here; see the complete API docs at http://watson-developer-cloud.github.io/speech-javascript-sdk/
2938

3039
All API methods require an auth token that must be [generated server-side](https://github.com/watson-developer-cloud/node-sdk#authorization).
3140
(See examples/token-server.js for a basic example.)
@@ -43,11 +52,6 @@ Options:
4352
* voice - the desired playback voice's name - see .getVoices(). Note that the voices are language-specific.
4453
* autoPlay - set to false to prevent the audio from automatically playing
4554

46-
### `.getVoices()` -> Promise
47-
48-
Returns a promise that resolves to an array of objects containing the name, language, gender, and other details for each voice.
49-
50-
Requires[window.fetch](https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API), a [pollyfill](https://www.npmjs.com/package/whatwg-fetch) for IE/Edge and older Chrome/Firefox.
5155

5256
## `WatsonSpeech.SpeechToText`
5357

@@ -140,6 +144,7 @@ Accepts input from `RecognizeStream()` and friends, writes text to supplied `out
140144
* Removed `SpeechToText.recognizeElement()` due to quality issues
141145
* Added `options.element` to TextToSpeech.synthesize() to support playing through existing elements
142146
* Fixed a couple of bugs in the TimingStream
147+
* Added support for STT profanity_filter & documented keywords and words_alternatives options.
143148

144149
### v0.14
145150
* Moved getUserMedia shim to a [standalone library](https://www.npmjs.com/package/get-user-media-promise)

speech-to-text/recognize-stream.js

Lines changed: 11 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -26,7 +26,7 @@ var defaults = require('defaults');
2626
var qs = require('../util/querystring.js');
2727

2828
var OPENING_MESSAGE_PARAMS_ALLOWED = ['continuous', 'max_alternatives', 'timestamps', 'word_confidence', 'inactivity_timeout',
29-
'content-type', 'interim_results', 'keywords', 'keywords_threshold', 'word_alternatives_threshold'];
29+
'content-type', 'interim_results', 'keywords', 'keywords_threshold', 'word_alternatives_threshold', 'profanity_filter'];
3030

3131
var QUERY_PARAMS_ALLOWED = ['model', 'watson-token']; // , 'X-Watson-Learning-Opt-Out' - should be allowed but currently isn't due to a service bug
3232

@@ -38,7 +38,9 @@ var QUERY_PARAMS_ALLOWED = ['model', 'watson-token']; // , 'X-Watson-Learning-Op
3838
*
3939
* By default, only finalized text is emitted in the data events, however in `readableObjectMode` (usually just `objectMode` when using a helper method).
4040
*
41-
* An interim result looks like this (assuming all features are enabled):
41+
* Todo: add keywords, word_alternatives to examples
42+
*
43+
* An interim result looks like this:
4244
```js
4345
{ alternatives:
4446
[ { timestamps:
@@ -62,7 +64,7 @@ var QUERY_PARAMS_ALLOWED = ['model', 'watson-token']; // , 'X-Watson-Learning-Op
6264
result_index: 3 }
6365
```
6466
65-
While a final result looks like this (again, assuming all features are enabled):
67+
While a final result looks like this (some features only appear in final results):
6668
```js
6769
{ alternatives:
6870
[ { word_confidence:
@@ -104,9 +106,7 @@ var QUERY_PARAMS_ALLOWED = ['model', 'watson-token']; // , 'X-Watson-Learning-Op
104106
final: true,
105107
result_index: 3 }
106108
```
107-
108-
109-
109+
110110
*
111111
* @param {Object} options
112112
* @param {String} [options.model='en-US_BroadbandModel'] - voice model to use. Microphone streaming only supports broadband models.
@@ -117,9 +117,13 @@ var QUERY_PARAMS_ALLOWED = ['model', 'watson-token']; // , 'X-Watson-Learning-Op
117117
* @param {Boolean} [options.word_confidence=false] - include confidence scores with results. Defaults to true when in objectMode.
118118
* @param {Boolean} [options.timestamps=false] - include timestamps with results. Defaults to true when in objectMode.
119119
* @param {Number} [options.max_alternatives=1] - maximum number of alternative transcriptions to include. Defaults to 3 when in objectMode.
120+
* @param {Array<String>} [options.keywords] - a list of keywords to search for in the audio
121+
* @param {Number} [options.keywords_threshold] - Number between 0 and 1 representing the minimum confidence before including a keyword in the results. Required when options.keywords is set
122+
* @param {Number} [options.word_alternatives_threshold] - Number between 0 and 1 representing the minimum confidence before including an alternative word in the results. Must be set to enable word alternatives.
123+
* @param {Boolean} [options.profanity_filter=false] - set to true to filter out profanity and replace the words with *'s
120124
* @param {Number} [options.inactivity_timeout=30] - how many seconds of silence before automatically closing the stream (even if continuous is true). use -1 for infinity
121125
* @param {Boolean} [options.readableObjectMode=false] - emit `result` objects instead of string Buffers for the `data` events. Changes several other defaults.
122-
* @param {Number} [options.X-WDC-PL-OPT-OUT=0] set to 1 to opt-out of allowing Watson to use this request to improve it's services
126+
* @param {Number} [options.X-WDC-PL-OPT-OUT=0] - set to 1 to opt-out of allowing Watson to use this request to improve its services
123127
*
124128
* //todo: investigate other options at http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/apis/#!/speech-to-text/recognizeSessionless
125129
*
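
The newly documented options interact in two ways: `keywords` needs a `keywords_threshold`, and only keys on the `OPENING_MESSAGE_PARAMS_ALLOWED` whitelist are meant to reach the opening message. The sketch below illustrates that behavior in isolation. `buildOpeningMessage` and `isConfidence` are hypothetical helpers written for this example and are not functions from the library; the real stream may validate and filter differently.

```js
// Whitelist copied from the diff above, including the newly added 'profanity_filter'.
var OPENING_MESSAGE_PARAMS_ALLOWED = ['continuous', 'max_alternatives', 'timestamps', 'word_confidence', 'inactivity_timeout',
  'content-type', 'interim_results', 'keywords', 'keywords_threshold', 'word_alternatives_threshold', 'profanity_filter'];

// Hypothetical helper: true when value is a number in the documented 0..1 range.
function isConfidence(value) {
  return typeof value === 'number' && value >= 0 && value <= 1;
}

// Hypothetical helper: checks the documented constraints, then copies only
// whitelisted options. Anything else (for example `model`, which is on the
// query-string whitelist instead) is left out.
function buildOpeningMessage(options) {
  if (options.keywords !== undefined) {
    if (!Array.isArray(options.keywords)) {
      throw new TypeError('keywords must be an array of strings');
    }
    // The JSDoc states keywords_threshold is required when keywords is set.
    if (!isConfidence(options.keywords_threshold)) {
      throw new RangeError('keywords_threshold must be a number between 0 and 1 when keywords is set');
    }
  }
  if (options.word_alternatives_threshold !== undefined && !isConfidence(options.word_alternatives_threshold)) {
    throw new RangeError('word_alternatives_threshold must be a number between 0 and 1');
  }

  var message = {};
  OPENING_MESSAGE_PARAMS_ALLOWED.forEach(function (key) {
    if (options[key] !== undefined) {
      message[key] = options[key];
    }
  });
  return message;
}

// Usage: profanity_filter and the keyword options pass through; model does not.
var message = buildOpeningMessage({
  model: 'en-US_BroadbandModel',
  keywords: ['watson'],
  keywords_threshold: 0.5,
  profanity_filter: true
});
console.log(message);
```

Validating up front like this is a design choice for the sketch: it surfaces a missing `keywords_threshold` on the client instead of waiting for the service to reject the request.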
