Commit 83f53e8

Merge pull request #44 from data-integrations/feature/CDAP-14474-speech
CDAP-14474 Update GCP Speech transform
2 parents e012cd6 + 0e1d5b8 commit 83f53e8

7 files changed, with 458 additions and 288 deletions.

docs/SpeechToText-transform.md

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
+# Google Cloud Speech-to-Text Transform
+
+Description
+-----------
+This plugin converts audio files to text by using Google Cloud Speech-to-Text.
+
+Google Cloud Speech-to-Text enables developers to convert audio to text by applying powerful neural network models.
+
+Credentials
+-----------
+If the plugin is run on a Google Cloud Dataproc cluster, the service account key does not need to be
+provided and can be set to 'auto-detect'.
+Credentials will be automatically read from the cluster environment.
+
+If the plugin is not run on a Dataproc cluster, the path to a service account key must be provided.
+The service account key can be found on the Dashboard in the Cloud Platform Console.
+Make sure the account key has permission to access Google Cloud Speech-to-Text.
+The service account key file needs to be available on every node in your cluster and
+must be readable by all users running the job.
+
+Properties
+----------
+**Audio Field**: Name of the input field which contains the raw audio data in bytes.
+
+**Audio Encoding**: Audio encoding of the data sent in the audio message. All encodings support only 1 channel (mono)
+audio. Only 'FLAC' and 'WAV' include a header that describes the bytes of audio that follow the header.
+The other encodings are raw audio bytes with no header.
+
+**Sampling Rate**: Sample rate in Hertz of the audio data sent in all 'RecognitionAudio' messages.
+Valid values are: 8000-48000. 16000 is optimal. For best results, set the sampling rate of the audio source to
+16000 Hz. If that's not possible, use the native sample rate of the audio source (instead of re-sampling).
+
+**Profanity**: Whether to attempt filtering profanities, replacing all but the initial character in each filtered
+word with asterisks, e.g. "f***". If set to `false`, profanities won't be filtered out.
+
+**Language**: The language of the supplied audio as a [BCP-47](https://www.rfc-editor.org/rfc/bcp/bcp47.txt)
+language tag. Example: "en-US". See [Language Support](https://cloud.google.com/speech/docs/languages) for a list of
+the currently supported language codes.
+
+**Transcription Parts Field**: The field to store the transcription parts. It will be an array of records. Each record
+in the array represents one part of the full audio data and will contain the transcription and confidence for that part.
+
+**Transcription Text Field**: The field to store the transcription of the full audio data. It is generated using the
+transcription for each part with the highest confidence.
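
The description of the Transcription Text Field can be read as: for each part of the audio, take the candidate transcription with the highest confidence, then join those in order. The following is only a sketch of that reading, not the plugin's code. The class `TranscriptionSketch`, the `Alternative` type, and the choice to join parts without a separator are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: names and structure are illustrative, not taken from the plugin source.
final class TranscriptionSketch {

  /** One candidate transcription for a part, with the confidence reported for it. */
  static final class Alternative {
    final String transcript;
    final float confidence;

    Alternative(String transcript, float confidence) {
      this.transcript = transcript;
      this.confidence = confidence;
    }
  }

  /** Returns the candidate with the highest confidence, or null if the part has none. */
  static Alternative best(List<Alternative> alternatives) {
    Alternative best = null;
    for (Alternative candidate : alternatives) {
      if (best == null || candidate.confidence > best.confidence) {
        best = candidate;
      }
    }
    return best;
  }

  /** Joins the best transcription of each part, in order, into the full text. */
  static String fullText(List<List<Alternative>> parts) {
    StringBuilder text = new StringBuilder();
    for (List<Alternative> part : parts) {
      Alternative top = best(part);
      if (top != null) {
        // Assumption: parts are concatenated as-is, with no separator added.
        text.append(top.transcript);
      }
    }
    return text.toString();
  }

  public static void main(String[] args) {
    List<List<Alternative>> parts = new ArrayList<>();
    parts.add(Arrays.asList(new Alternative("hello", 0.91f), new Alternative("hallo", 0.40f)));
    parts.add(Arrays.asList(new Alternative(" word", 0.55f), new Alternative(" world", 0.87f)));
    System.out.println(fullText(parts)); // prints "hello world"
  }
}
```

Parts with no candidates are skipped here rather than treated as an error; the documentation does not say how the plugin handles that case.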

docs/SpeechTranslator-transform.md

Lines changed: 0 additions & 1 deletion
This file was deleted.

src/main/java/co/cask/gcp/common/GCPConfig.java

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@
  * Contains config properties common to all GCP plugins, like project id and service account key.
  */
 public class GCPConfig extends PluginConfig {
-  private static final String AUTO_DETECT = "auto-detect";
+  public static final String AUTO_DETECT = "auto-detect";

   @Description("Google Cloud Project ID, which uniquely identifies a project. "
     + "It can be found on the Dashboard in the Google Cloud Platform Console.")
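
The commit does not state why `AUTO_DETECT` became public. One plausible reason is that code outside `GCPConfig` needs to compare a configured service account value against it. The sketch below illustrates that kind of check with a stand-in class. `ServiceAccountSketch` and `keyFilePath` are hypothetical names, and treating a null or empty value as auto-detect is an assumption rather than documented plugin behavior.

```java
// Hypothetical sketch: a stand-in showing how a caller might use the constant.
final class ServiceAccountSketch {

  // Mirrors the value of GCPConfig.AUTO_DETECT shown in the hunk.
  static final String AUTO_DETECT = "auto-detect";

  /**
   * Returns the key file path to load, or null when credentials should be
   * read from the environment (for example, on a Dataproc cluster).
   */
  static String keyFilePath(String configured) {
    // Assumption: a missing or empty value is treated the same as 'auto-detect'.
    if (configured == null || configured.isEmpty() || AUTO_DETECT.equals(configured)) {
      return null;
    }
    return configured;
  }

  public static void main(String[] args) {
    System.out.println(keyFilePath("auto-detect"));
    System.out.println(keyFilePath("/etc/keys/service-account.json"));
  }
}
```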
