On Demand Azure Speech Generation #1894

john0isaac · 2024-08-07T22:44:35Z

Purpose

Does this introduce a breaking change?

When developers merge from main and run the server, azd up, or azd deploy, will this produce an error?
If you're not sure, try it out on an old environment.

[ ] Yes
[x] No

Does this require changes to learn.microsoft.com docs?

This repository is referenced by this tutorial
which includes deployment, settings and usage instructions. If text or screenshots need to change in the tutorial,
check the box below and notify the tutorial author. A Microsoft employee can do this for you if you're an external contributor.

[ ] Yes
[x] No

Type of change

[ ] Bugfix
[x] Feature
[ ] Code style update (formatting, local variables)
[x] Refactoring (no functional changes, no api changes)
[ ] Documentation content changes
[ ] Other... Please describe:

Code quality checklist

See CONTRIBUTING.md for more details.

The current tests all pass (python -m pytest).
I added tests that prove my fix is effective or that my feature works
I ran python -m pytest --cov to verify 100% coverage of added lines
I ran python -m mypy to check for type errors
I either used the pre-commit hooks or ran ruff and black manually on my code.

john0isaac · 2024-08-07T22:46:02Z

TODO

Cache the generated audio after the first time it's generated to avoid incurring additional charges.
Handle cases where the Azure Speech API request function returns null. [This was already being handled by console.error()]

john0isaac · 2024-08-08T21:16:35Z

@TaylorN15, can you look at this and let me know if it's working as you expect?

@pamelafox done, it's not the prettiest implementation but it works.
I couldn't figure out a way to update and cache the URLs on click other than passing all of the URLs, the index, and the setter to the AzureSpeechUrls function, which checks for a cached URL, makes a request if there isn't one, and uses the setter to update the value.

This is the initial implementation of the on-demand speech request (b4c1b21), but every time you click on it, it makes a new request to Azure Speech Service.
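
The caching idea described above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `SpeechUrls`, `withCachedUrl`, `getSpeechUrl`, and the `requestSpeech` callback are hypothetical names, and the setter is assumed to accept an updater function the way a React state setter does.

```typescript
// One slot per answer; null means no audio has been generated for it yet.
type SpeechUrls = (string | null)[];

// Returns a copy of `urls` with `url` stored at `index`. The input is left
// untouched so the result can be handed to a state setter.
function withCachedUrl(urls: SpeechUrls, index: number, url: string): SpeechUrls {
    const next = urls.slice();
    while (next.length <= index) {
        next.push(null); // pad so every earlier slot is an explicit null
    }
    next[index] = url;
    return next;
}

// Hypothetical click handler: reuse the cached URL if there is one, otherwise
// make a single request and cache the result through the setter.
async function getSpeechUrl(
    urls: SpeechUrls,
    index: number,
    setUrls: (update: (prev: SpeechUrls) => SpeechUrls) => void,
    requestSpeech: () => Promise<string | null>
): Promise<string | null> {
    const cached = urls[index] ?? null;
    if (cached !== null) {
        return cached; // no new request, so no extra charge
    }
    const url = await requestSpeech();
    if (url !== null) {
        setUrls(prev => withCachedUrl(prev, index, url));
    }
    return url; // null is passed through so the caller can report the failure
}
```

Keeping the array in the parent page and passing it down is what lets the cached URL survive between clicks in this sketch.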

app/frontend/src/components/Answer/SpeechOutputAzure.tsx

taylorn-ai · 2024-08-08T22:19:43Z

This looks good, very quick, nice job :)

Why is speechUrls an array? And why pass it down from Chat.tsx? Is that because you were just trying not to change too much code?

taylorn-ai · 2024-08-09T01:04:43Z

Also it seems you can play multiple concurrently, this doesn't happen with the web speech API as it seems to handle it automatically. I think it would be best to have some sort of state to manage if audio is playing on any answer, and pause/stop on others if another one is played.

john0isaac · 2024-08-09T07:45:10Z

Why is speechUrls an array? And why pass it down from Chat.tsx? Is that because you were just trying not to change too much code?

Nope. Like I said on the issue, AzureSpeechOutput is a component that can't keep track of its own state; anything done inside it is not saved.
To save its state, you need to keep track of it in the parent page, like we keep an array of answers to get the Thought Process etc.

Also it seems you can play multiple concurrently, this doesn't happen with the web speech API as it seems to handle it automatically. I think it would be best to have some sort of state to manage if audio is playing on any answer, and pause/stop on others if another one is played.

This is out of the scope of this PR, and it was never handled correctly for Azure Speech Service.
The Web Speech API is not handling it automatically; I implemented the logic that prevents multiple audios from playing concurrently.
But happy to fix it too!

I tried working on it but couldn't get it to work. For now, people have to either avoid clicking on multiple audios, or click on the playing audio again to pause it before clicking on another one.

taylorn-ai · 2024-08-09T20:26:16Z

I have gotten this working in my implementation, but there are quite a few changes from what you have, so it's hard to compare.

// Chat.tsx
    const [activeAudioAnswer, setActiveAudioAnswer] = useState<number | null>(null);
    const [isAudioPlaying, setIsAudioPlaying] = useState<boolean>(false);
    const [cachedAudioUrls, setCachedAudioUrls] = useState<{ [key: number]: string }>({});
    const ttsAudio = useRef(new Audio()).current;

    const onToggleAudioPlay = (index: number, speechUrl: string) => {
        if (isAudioPlaying && activeAudioAnswer === index) {
            stopAudio();  // Stop if it's currently playing the same answer
        } else {
            if (isAudioPlaying) {
                stopAudio();  // Stop any currently playing audio
            }
            playAudio(index, speechUrl);
        }
    };

    const playAudio = (index: number, speechUrl: string) => {
        ttsAudio.src = speechUrl;
        ttsAudio.play().then(() => {
            setActiveAudioAnswer(index);
            setIsAudioPlaying(true);
        }).catch((error) => {
            handleError(error, "Error playing audio:");
        });

        ttsAudio.onended = () => {
            setIsAudioPlaying(false);
            setActiveAudioAnswer(null);
        };
    };

    const stopAudio = () => {
        ttsAudio.pause();
        ttsAudio.currentTime = 0;
        setActiveAudioAnswer(null);
        setIsAudioPlaying(false);
    };

    const cacheAudioUrl = (index: number, url: string) => {
        setCachedAudioUrls(prevUrls => ({
            ...prevUrls,
            [index]: url
        }));
    };
...
<Answer
    index={index}
    isAudioPlaying={index === activeAudioAnswer && isAudioPlaying}
    onToggleAudioPlay={onToggleAudioPlay}
    cachedAudioUrl={cachedAudioUrls[index] || null}
    onCacheAudioUrl={cacheAudioUrl}
...
/>

// Answer.tsx
const [isAudioLoading, setIsAudioLoading] = useState<boolean>(false);

    const toggleAudioPlay = async () => {
        if (cachedAudioUrl) {
            onToggleAudioPlay(index, cachedAudioUrl);
        } else {
            setIsAudioLoading(true);
            const plainText = removeHtmlTags(sanitizedAnswerHtml);
            const token = client ? await getToken(client) : undefined;
            const speechUrl = await getSpeechApi(token, plainText);

            if (!speechUrl) {
                handleError("Error", "An error occurred while generating speech");
                setIsAudioLoading(false);
                return;
            }

            onCacheAudioUrl(index, speechUrl);
            onToggleAudioPlay(index, speechUrl);
            setIsAudioLoading(false);
        }
    };

Then I pass the state of the audio loading/playing into the buttons so that it shows a play/stop depending on state.
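
The mapping from loading/playing state to the button could be sketched as a small pure function. The icon names below are placeholders chosen for illustration, not the ones used in either implementation, and `speechButtonIcon` is a hypothetical helper.

```typescript
// Placeholder icon identifiers; a real component would map these to UI icons.
type SpeechButtonIcon = "loading" | "stop" | "play";

// Loading takes priority, then the playing state decides between stop and play.
function speechButtonIcon(isAudioLoading: boolean, isAudioPlaying: boolean): SpeechButtonIcon {
    if (isAudioLoading) {
        return "loading";
    }
    return isAudioPlaying ? "stop" : "play";
}
```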

john0isaac · 2024-08-11T14:17:50Z

@TaylorN15 welp, I didn't want to do this; I found a workaround.
I implemented it differently to maintain the modularity of the implementation.
Thanks for your code though, it helped. Can you give it a go now and let me know if there is anything else?

taylorn-ai · 2024-08-11T21:14:27Z

@TaylorN15 welp, I didn't want to do this; I found a workaround.

I implemented it differently to maintain the modularity of the implementation.

Thanks for your code though, it helped. Can you give it a go now and let me know if there is anything else?

Looks good. I don't have time to deploy and test; like I said, my code is quite different.

prevent speech generation while streaming the response

app/frontend/src/pages/ask/Ask.tsx

app/frontend/src/components/Answer/Answer.tsx

pamelafox · 2024-08-22T22:34:15Z

Hm, I just had a bug where I got an answer, heard the speech, then cleared chat and asked another question, and it played the original speech. I'm trying to replicate it to figure out what leads to it.

john0isaac · 2024-08-23T04:13:25Z

When you clear the chat, all of the generated voice URLs are set to null, so I'm not sure about that.
Maybe it was cached in the audio variable?
Do I need to clear the audio too when you click on clear?
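
If the shared audio element does turn out to hold on to the previous source, one option would be to reset it together with the URLs when the chat is cleared. The sketch below is an assumption about how that could look, not a confirmed fix: `AudioLike` and `resetSpeech` are hypothetical names, and `AudioLike` lists only the members of `HTMLAudioElement` that the function touches.

```typescript
// The subset of HTMLAudioElement this sketch relies on.
interface AudioLike {
    src: string;
    currentTime: number;
    pause(): void;
}

// Stops playback, forgets the old source, and returns a cleared URL list
// (one null per existing slot) for the caller to store in state.
function resetSpeech(audio: AudioLike, urls: (string | null)[]): (string | null)[] {
    audio.pause();
    audio.currentTime = 0;
    // Assumption: an empty src is acceptable here. In a browser,
    // removeAttribute("src") followed by load() may be the cleaner reset.
    audio.src = "";
    return urls.map(() => null);
}
```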

john0isaac · 2024-08-23T17:02:54Z

Hm, I just had a bug where I got an answer, heard the speech, then cleared chat and asked another question, and it played the original speech. I'm trying to replicate it to figure out what leads to it.

I could not replicate this issue.

app/frontend/src/pages/ask/Ask.tsx

app/frontend/src/pages/chat/Chat.tsx

Co-authored-by: Pamela Fox

app/frontend/src/pages/chat/Chat.tsx

pamelafox · 2024-08-23T18:20:15Z

I haven't replicated the issue either. I think the new code architecture looks good.

I just pushed a change which preloads the "sync" icon. That's because I was briefly seeing the no-image indicator before the sync icon came in, and I found that confusing.
I preload it by adding an invisible disabled button with the same icon, and verified it passes accessibility checks.

pamelafox · 2024-08-23T18:37:50Z

Thanks, merged!

john0isaac commented Aug 8, 2024

app/frontend/src/components/Answer/SpeechOutputAzure.tsx (outdated, resolved)

john0isaac force-pushed the on-demand-azure-speech branch 2 times, most recently from c6dcd30 to ff043d8, August 9, 2024 10:26

john0isaac force-pushed the on-demand-azure-speech branch 2 times, most recently from 4019b99 to 84147b3, August 19, 2024 21:36

zedhaque mentioned this pull request Aug 20, 2024

Frontend multi-language support #1690 #1790

Merged

5 tasks

John Aziz added 3 commits August 22, 2024 08:16

on demand speech, fix Azure-Samples#1892

7cf281e

cache speech urls

4a285ee

maintain one audio source across app

9c551f3

prevent speech generation while streaming the response

john0isaac force-pushed the on-demand-azure-speech branch from 84147b3 to 9c551f3, August 22, 2024 05:16

pamelafox reviewed Aug 22, 2024

app/frontend/src/pages/ask/Ask.tsx (outdated, resolved)

pamelafox reviewed Aug 22, 2024

app/frontend/src/components/Answer/Answer.tsx (resolved)

create speechConfig type to group speech config

0b75a39

john0isaac requested a review from pamelafox August 23, 2024 17:03

pamelafox reviewed Aug 23, 2024

app/frontend/src/pages/ask/Ask.tsx (resolved)

pamelafox reviewed Aug 23, 2024

app/frontend/src/pages/chat/Chat.tsx (resolved)

Update app/frontend/src/pages/ask/Ask.tsx

92462a9

Co-authored-by: Pamela Fox

john0isaac commented Aug 23, 2024

app/frontend/src/pages/chat/Chat.tsx (resolved)

Update app/frontend/src/pages/chat/Chat.tsx

2d959bd

john0isaac requested a review from pamelafox August 23, 2024 17:12

Preload the sync icon

4fcd5b5

pamelafox approved these changes Aug 23, 2024

pamelafox merged commit f7969c0 into Azure-Samples:main Aug 23, 2024
10 checks passed

john0isaac deleted the on-demand-azure-speech branch August 23, 2024 19:08
