There were a number of issues with this method:
* Audio was transcoded and resampled twice, leading to lower transcription quality
* The audio output does not seem to be 100% the same through consecutive replays of a given file, leading to slight variations in the transcription
* If the source was paused, stopped, rewound, or fast-forwarded, this would affect the transcription
The last issue could have been handled with significant effort, but the first two could not.
Extract audio from an `<audio>` or `<video>` element and transcribe speech.
-
-This method has some limitations:
-* the audio is run through two lossy conversions: first from the source format to WebAudio, and second to l16 (raw wav) for Watson
-* the WebAudio API does not guarantee the same exact output for the same file played twice, so it's possible to receive slightly different transcriptions for the same file played repeatedly
-* it transcribes the audio as it is heard, so pausing or skipping will affect the transcription
-* audio that is paused for too long will cause the socket to time out and disconnect, preventing further transcription (without setting things up again)
-
-Because of these limitations, there are two alternative methods that may be preferable in some situations:
-* Fetch the audio via ajax and then pass it to `recognizeFile()` - this resolves the issues around lossy conversion and inexact transcription, but the audio playback, if enabled, cannot be paused, rewound, etc.
-* Pre-process the audio and generate a [WebVTT](https://developer.mozilla.org/en-US/docs/Web/API/Web_Video_Text_Tracks_Format) subtitles file to insert in `<track>`, completely bypassing this SDK. This resolves all of the above issues, and gives you an opportunity to review and/or edit the subtitles if desired.
-
-Options:
-* `element`: an `<audio>` or `<video>` element (could be generated programmatically, e.g. `new Audio()`)
-* Other options passed to MediaElementAudioStream and RecognizeStream
-* Other options passed to WritableElementStream if `options.outputElement` is set
-
-Requires that the browser support MediaElement and whatever audio codec is used in your media file.
-
-Will automatically call `.play()` on the `element`; set `options.autoPlay=false` to disable. Calling `.stop()` on the returned stream will automatically call `.stop()` on the `element`.
-
-Pipes results through a `{FormatStream}` by default; set `options.format=false` to disable.
+Known issue: Firefox continues to display a microphone icon in the address bar after recording has ceased. This is a browser bug.
examples/static/index.html: 0 additions & 2 deletions
@@ -16,8 +16,6 @@ <h2>Speech to Text</h2>
 <li><a href="file-realtime-vs-no-realtime.html">Transcribe from file, Comparing <code>{realtime: true}</code> to <code>{realtime: false}</code></a></li>
 <li><a href="file-promise.html">Transcribe from file, Promise</a></li>
 <li><a href="file-ajax.html">Transcribe from file loaded over AJAX</a></li>
-<li><a href="audio-element.html">Transcribe from HTML5 <audio> element, Streaming</a></li>
-<li><a href="audio-element-programmatic.html">Transcribe from <code>new Audio()</code>, Streaming</a></li>
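
With the `<audio>` element examples removed, the remaining "file loaded over AJAX" route is the alternative the commit points to: fetch the audio bytes and hand them to `recognizeFile()`. A minimal sketch of that idea follows. The helper name `transcribeFromUrl`, the injected `fetchFn`, and the `file` option key are assumptions made for illustration, not something this commit confirms; check the SDK documentation for the option names your version expects.

```javascript
// Sketch only: `recognizeFile` and `fetchFn` are injected so the helper does not
// depend on a particular SDK build or on a global fetch being available.
async function transcribeFromUrl(url, recognizeFile, options = {}, fetchFn = fetch) {
  const response = await fetchFn(url);
  if (!response.ok) {
    throw new Error(`Failed to fetch ${url}: HTTP ${response.status}`);
  }
  // Read the whole file as a Blob so the original encoded bytes are sent,
  // rather than audio re-captured from playback.
  const audio = await response.blob();
  // Assumption: the recognizer accepts the audio under a `file` option.
  return recognizeFile({ ...options, file: audio });
}
```

In a browser with the SDK loaded, this might be called as `transcribeFromUrl('audio.wav', WatsonSpeech.SpeechToText.recognizeFile, { token })`, where the URL and `token` are placeholders. Because the file is sent as-is, pausing or seeking any playback does not change what is transcribed.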