|
1 |
| -# Integrate OCI AI Speech Service and Generative AI Summarization in Visual Builder |
2 |
| - |
3 |
| -# Introduction |
4 |
| - |
5 |
| -OCI Speech is an AI service that applies automatic speech recognition technology to transform audio-based content into text. In Generative AI, a Large Language Model (LLM) analyzes the text input and can generate, summarize, transform, and extract information. Using these AI capabilities, we built a low-code application, “Integrate OCI AI Speech Service and Generative AI Summarization in Visual Builder”, that invokes the AI Speech REST API to convert audio files into text and then invokes the Generative AI REST API to summarize it. |
| 1 | +# Transcribe and summarize speech-to-text |
| 2 | + |
| 3 | +OCI Speech is an AI service that applies automatic speech recognition technology to transform audio-based content into text. In Generative AI, a Large Language Model (LLM) analyzes the text input and can generate, summarize, transform, and extract information. Using these AI capabilities, we built a low-code application, “Integrate OCI AI Speech Service and Generative AI Service for Summarization in Visual Builder”, that invokes the AI Speech REST API to convert audio files into text and then invokes the Generative AI REST API to summarize it. |
6 | 4 |
|
7 | 5 | Reviewed: 20.02.2024
|
8 |
| - |
9 |
| -<img src="./files/AISpeechGenAISummary.png"></img> |
10 |
| - |
11 |
| -# Prerequisites |
12 |
| - |
13 |
| -Before getting started, make sure you have access to these services: |
14 |
| - |
15 |
| -- Oracle Speech Service |
16 |
| -- Oracle Generative AI Service |
17 |
| -- Oracle Visual Builder Cloud Service |
18 |
| -- Oracle Visual Builder Service Connection |
19 |
| - |
20 |
| -# AI Speech and OCI Generative AI Service Integration Architecture |
21 |
| - |
22 |
| -1. AI Speech App using VBCS |
23 |
| - |
24 |
| -- Oracle Visual Builder Cloud Service (VBCS) is a hosted environment for your application development infrastructure. It provides an open-source standards-based development service to create, collaborate on, and deploy applications within Oracle Cloud. This application is developed in VBCS. |
25 |
| - |
26 |
| -2. Transcriptions with OCI AI Speech Service: |
27 |
| -- Speech harnesses the power of spoken language, enabling you to easily convert media files containing human speech into highly accurate text transcriptions. |
28 |
| -- Produces accurate and easy-to-use JSON and SubRip Subtitle (SRT) files written directly to the Object Storage bucket you choose. |
29 |
| - |
30 |
| -3. Integration with OCI Generative AI Service: |
31 |
| -- The transcriptions (text) are sent to the OCI Generative AI Service for text summarization. |
32 |
| - |
33 |
| -4. Integration with OCI AI Speech and OCI Generative AI Service using Visual Builder Service Endpoint: |
34 |
| -- The Service Connection endpoint option in Visual Builder is used to integrate the VBCS app with OCI Object Storage, OCI AI Speech Service, and Generative AI Summarization. |
35 |
| - |
36 |
| -5. Summarization Process: |
37 |
| -- OCI Generative AI Service uses the transcription received from OCI Speech Service to generate a concise summary of the audio or video. |
38 |
| - |
39 |
| - |
40 |
| -<img src="./files/AISpeechSummaryAppArch.svg"></img> |
41 |
| - |
42 |
| -# Application Flow in Detail (VBCS, OCI Speech, OCI Generative AI Service) |
43 |
| - |
44 |
| -In this application, the drag-and-drop component in VBCS allows the user to drop the audio or video. |
45 |
| -- Create a Service Endpoint connection in Visual Builder to handle the communication between Visual Builder and OCI Speech Service. |
46 |
| -- Pass the selected audio or video from Visual Builder to OCI Speech Service to convert it into text. |
47 |
| -- OCI Speech Service analyzes the media (audio or video) file and converts it into text. |
48 |
| -- The OCI Speech Service returns the transcription to the AI Speech Service Endpoint, which returns the results to the Visual Builder app. |
49 |
| -- The transcription is then passed to the Generative AI Service Endpoint, which returns the summarization results to the Visual Builder app. |
50 |
| - |
51 |
| - User (Visual Builder) --> (Drag and Drop File) --> |Media File (audio or video)| --> (Service Endpoint) --> |OCI Speech Service| --> |Speech to Text| --> (Service Endpoint) --> |Result| --> (Visual Builder) --> (Gen AI Service Endpoint) --> |Result| --> (Visual Builder) |
52 |
| - |
53 |
| - <img src="./files/AISpeechEngine.png"></img> |
54 |
| - |
55 |
| -# Service Endpoint call - Invoke OCI Object Storage |
56 |
| - |
57 |
| - uploadfile - /n/{namespaceName}/b/{bucketName}/o/{objectName} |
58 |
| - getObject - /n/{namespaceName}/b/{bucketName}/o/{outputFolderName}/{outputObjectName} |
59 |
| - |
60 |
| - |
61 |
| -# Service Endpoint call - Invoke AI Speech Service |
62 |
| - |
63 |
| - create transcription - /transcriptionJobs |
64 |
| - get transcription - /transcriptionJobs/{transcriptionJobId} |
65 |
| - |
66 |
| -# Service Endpoint call - Invoke Generative AI Service |
67 |
| - |
68 |
| - create summary - /20231130/actions/summarizeText |
69 |
| - |
70 |
| - |
71 |
| -# Conclusion |
72 |
| - |
73 |
| -In this article, we've covered how to use Oracle AI Speech Service to produce a transcription and how to summarize it using the Generative AI Service. |
74 |
| - |
75 |
| -Feel free to modify and expand upon this template according to your specific use case and preferences. |
76 |
| - |
77 |
| - |
| 6 | + |
| 7 | +# When to use this asset? |
| 8 | + |
| 9 | +See the README document in the /files folder. |
| 10 | + |
| 11 | +# How to use this asset? |
| 12 | + |
| 13 | +See the README document in the /files folder. |
| 14 | + |
78 | 15 | # License
|
79 | 16 |
|
80 | 17 | Copyright (c) 2024 Oracle and/or its affiliates.
|
81 | 18 |
|
82 | 19 | Licensed under the Universal Permissive License (UPL), Version 1.0.
|
83 | 20 |
|
84 | 21 | See [LICENSE](https://github.com/oracle-devrel/technology-engineering/blob/main/LICENSE) for more details.
|
85 |
| - |
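
The service endpoint paths listed in the removed README text can be sketched as a small helper. This is only an illustration, not the asset's Visual Builder configuration: the path templates come from the README (with a leading slash assumed on the "get transcription" path), while the names `fill_path` and `call_sequence`, the HTTP methods, the URL-encoding of parameter values, and the ordering of the calls are assumptions made for the example.

```python
from urllib.parse import quote

# Path templates based on the "Service Endpoint call" sections of the README.
OBJECT_STORAGE_UPLOAD = "/n/{namespaceName}/b/{bucketName}/o/{objectName}"
OBJECT_STORAGE_GET = (
    "/n/{namespaceName}/b/{bucketName}/o/{outputFolderName}/{outputObjectName}"
)
SPEECH_CREATE_JOB = "/transcriptionJobs"
# Leading slash is an assumption; the README lists this path without one.
SPEECH_GET_JOB = "/transcriptionJobs/{transcriptionJobId}"
GENAI_SUMMARIZE = "/20231130/actions/summarizeText"


def fill_path(template: str, **params: str) -> str:
    """Substitute URL-encoded values into a path template (hypothetical helper)."""
    encoded = {name: quote(value, safe="") for name, value in params.items()}
    return template.format(**encoded)


def call_sequence(namespace, bucket, object_name, output_folder, output_object, job_id):
    """Return (method, path) pairs in the order the flow describes.

    The HTTP methods are assumptions; the README lists only the paths.
    """
    return [
        # 1. Upload the dropped media file to Object Storage.
        ("PUT", fill_path(OBJECT_STORAGE_UPLOAD, namespaceName=namespace,
                          bucketName=bucket, objectName=object_name)),
        # 2. Start a transcription job, then 3. look it up by ID.
        ("POST", SPEECH_CREATE_JOB),
        ("GET", fill_path(SPEECH_GET_JOB, transcriptionJobId=job_id)),
        # 4. Fetch the transcription output written to the bucket.
        ("GET", fill_path(OBJECT_STORAGE_GET, namespaceName=namespace,
                          bucketName=bucket, outputFolderName=output_folder,
                          outputObjectName=output_object)),
        # 5. Send the transcription text for summarization.
        ("POST", GENAI_SUMMARIZE),
    ]
```

Keeping the templates as constants and filling them in one place makes it easier to compare the paths against what is configured in the Visual Builder service connections.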