
Commit c8c5744

Supports continuous speech recognition and barge-in (#5426)
* Add mock SpeechSynthesis
* Clean up
* Use jest-mock
* Add expectingInput
* Complete the test
* Add import map
* Use import map
* Add await to resolveAll()
* Complete case
* No need to wait for send when barge-in
* Add interims
* Support barge-in
* Bump version
* Bump version
* Continue to show "Listening..."
* Bump react-dictate-button
* Add more expectations
* Clean up
* Clean up
* Add tests
* Clean up
* Clean up
* Add more scenarios
* Ignore html2
* Ported test
* Added test
* Bump react-dictate-button
* Add entry
* Update entries
* Bump to [email protected]
* Bump to [email protected]
* Clean up
* Clean up
* More comments
* Add perform card action
* Add perform card action tests
* Add test
* More scenarios
* Better comments
* Better comment
* Add comment
* Add speech error telemetry
* Add types
1 parent 4091100 commit c8c5744

32 files changed (+2246 additions, -129 deletions)

CHANGELOG.md

Lines changed: 4 additions & 1 deletion
@@ -80,6 +80,8 @@ Notes: web developers are advised to use [`~` (tilde range)](https://github.com/
 - When set to `'activity-status'`, feedback buttons appear in the activity status area (default behavior)
 - Added support for including activity ID and key into form data indicated by `data-webchat-include-activity-id` and `data-webchat-include-activity-key` attributes, in PR [#5418](https://github.com/microsoft/BotFramework-WebChat/pull/5418), by [@OEvgeny](https://github.com/OEvgeny)
 - Added dedicated loading animation for messages in preparing state for Fluent theme, in PR [#5423](https://github.com/microsoft/BotFramework-WebChat/pull/5423), by [@OEvgeny](https://github.com/OEvgeny)
+- Resolved [#2661](https://github.com/microsoft/BotFramework-WebChat/issues/2661) and [#5352](https://github.com/microsoft/BotFramework-WebChat/issues/5352). Added speech recognition continuous mode with barge-in support, in PR [#5426](https://github.com/microsoft/BotFramework-WebChat/pull/5426), by [@RushikeshGavali](https://github.com/RushikeshGavali) and [@compulim](https://github.com/compulim)
+  - Set `styleOptions.speechRecognitionContinuous` to `true` with a Web Speech API provider with continuous mode support
 
 ### Changed
 
@@ -101,9 +103,10 @@ Notes: web developers are advised to use [`~` (tilde range)](https://github.com/
 - Switched math block syntax from `$$` to Tex-style `\[ \]` and `\( \)` delimiters with improved rendering and error handling, in PR [#5353](https://github.com/microsoft/BotFramework-WebChat/pull/5353), by [@OEvgeny](https://github.com/OEvgeny)
 - Improved avatar display and grouping behavior by fixing rendering issues and activity sender identification, in PR [#5346](https://github.com/microsoft/BotFramework-WebChat/pull/5346), by [@OEvgeny](https://github.com/OEvgeny)
 - Activity "copy" button will use `outerHTML` and `textContent` for clipboard content, in PR [#5378](https://github.com/microsoft/BotFramework-WebChat/pull/5378), by [@compulim](https://github.com/compulim)
-- Bumped dependencies to the latest versions, by [@compulim](https://github.com/compulim) in PR [#5385](https://github.com/microsoft/BotFramework-WebChat/pull/5385) and [#5400](https://github.com/microsoft/BotFramework-WebChat/pull/5400)
+- Bumped dependencies to the latest versions, by [@compulim](https://github.com/compulim) in PR [#5385](https://github.com/microsoft/BotFramework-WebChat/pull/5385), [#5400](https://github.com/microsoft/BotFramework-WebChat/pull/5400), and [#5426](https://github.com/microsoft/BotFramework-WebChat/pull/5426)
   - Production dependencies
     - [`[email protected]`](https://npmjs.com/package/web-speech-cognitive-services)
+    - [`[email protected]`](https://npmjs.com/package/react-dictate-button)
 - Enabled icon customization in Fluent theme through CSS variables, in PR [#5413](https://github.com/microsoft/BotFramework-WebChat/pull/5413), by [@OEvgeny](https://github.com/OEvgeny)
 
 ### Fixed
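
For reference, a minimal sketch of opting into the new continuous mode. This is a usage sketch rather than code from this commit: it assumes the Cognitive Services Speech Services ponyfill as the continuous-capable Web Speech API provider, a `<script type="module">` context (for top-level `await`), and placeholder token and credential values.

```js
// Sketch only: enable continuous speech recognition with barge-in via the
// `speechRecognitionContinuous` style option added in PR #5426.
const webSpeechPonyfillFactory = await window.WebChat.createCognitiveServicesSpeechServicesPonyfillFactory({
  // Placeholder credentials for an Azure Speech Services resource.
  credentials: { region: 'westus2', subscriptionKey: 'YOUR_SPEECH_SUBSCRIPTION_KEY' }
});

window.WebChat.renderWebChat(
  {
    directLine: window.WebChat.createDirectLine({ token: 'YOUR_DIRECT_LINE_TOKEN' }),
    styleOptions: { speechRecognitionContinuous: true },
    webSpeechPonyfillFactory
  },
  document.getElementById('webchat')
);
```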

__tests__/hooks/useDictateState.js

Lines changed: 0 additions & 39 deletions
This file was deleted.
Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
// Adopted from https://github.com/testing-library/react-testing-library/blob/main/src/pure.js#L292C1-L329C2

/*!
 * The MIT License (MIT)
 * Copyright (c) 2017-Present Kent C. Dodds
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

function wrapUiIfNeeded(innerElement, wrapperComponent) {
  return wrapperComponent ? React.createElement(wrapperComponent, null, innerElement) : innerElement;
}

export default function renderHook(
  /** @type {(props: RenderCallbackProps) => any} */ renderCallback,
  /** @type {{}} */ options = {}
) {
  const { initialProps, ...renderOptions } = options;

  if (renderOptions.legacyRoot && typeof ReactDOM.render !== 'function') {
    const error = new Error(
      '`legacyRoot: true` is not supported in this version of React. ' +
        'If your app runs React 19 or later, you should remove this flag. ' +
        'If your app runs React 18 or earlier, visit https://react.dev/blog/2022/03/08/react-18-upgrade-guide for upgrade instructions.'
    );

    Error.captureStackTrace(error, renderHook);

    throw error;
  }

  const result = React.createRef();

  function TestComponent({ renderCallbackProps }) {
    const pendingResult = renderCallback(renderCallbackProps);

    React.useEffect(() => {
      result.current = pendingResult;
    });

    return null;
  }

  // A stripped down version of render() from `@testing-library/react`.
  const render = ({ renderCallbackProps }) => {
    const element = document.querySelector('main');

    ReactDOM.render(wrapUiIfNeeded(React.createElement(TestComponent, renderCallbackProps), renderOptions.wrapper), element);

    return { rerender: render, unmount: () => ReactDOM.unmountComponentAtNode(element) };
  };

  const { rerender: baseRerender, unmount } = render(
    React.createElement(TestComponent, { renderCallbackProps: initialProps }),
    renderOptions
  );

  function rerender(rerenderCallbackProps) {
    return baseRerender(React.createElement(TestComponent, { renderCallbackProps: rerenderCallbackProps }));
  }

  return { result, rerender, unmount };
}
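
For orientation, a hedged usage sketch of this helper as the test page below uses it. It assumes `React` and `ReactDOM` are the UMD globals loaded by the test page, that a `<main>` element exists for the helper to render into, and that `useDictateState` and a `WebChatWrapper` wrapper component (as defined in the test below) are in scope.

```js
// Usage sketch: read the latest value of a Web Chat hook from inside a wrapper
// that provides the Web Chat context.
const { result, rerender, unmount } = renderHook(() => useDictateState()[0], {
  legacyRoot: true,
  wrapper: WebChatWrapper
});

// `result` is a React ref; `result.current` holds the value most recently returned
// by the callback. Call `rerender()` after app state changes to refresh it.
rerender();
console.log(result.current);

unmount();
```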
Lines changed: 221 additions & 0 deletions
@@ -0,0 +1,221 @@
<!doctype html>
<html lang="en-US">
  <head>
    <link href="/assets/index.css" rel="stylesheet" type="text/css" />
    <script crossorigin="anonymous" src="https://unpkg.com/[email protected]/umd/react.development.js"></script>
    <script crossorigin="anonymous" src="https://unpkg.com/[email protected]/umd/react-dom.development.js"></script>
    <script crossorigin="anonymous" src="/test-harness.js"></script>
    <script crossorigin="anonymous" src="/test-page-object.js"></script>
    <script crossorigin="anonymous" src="/__dist__/webchat-es5.js"></script>
  </head>
  <body>
    <main id="webchat"></main>
    <script type="importmap">
      {
        "imports": {
          "@testduet/wait-for": "https://unpkg.com/@testduet/wait-for@main/dist/wait-for.mjs",
          "jest-mock": "https://esm.sh/jest-mock",
          "react-dictate-button/internal": "https://unpkg.com/react-dictate-button@main/dist/react-dictate-button.internal.mjs"
        }
      }
    </script>
    <script type="module">
      import { waitFor } from '@testduet/wait-for';
      import { fn, spyOn } from 'jest-mock';
      import {
        SpeechGrammarList,
        SpeechRecognition,
        SpeechRecognitionAlternative,
        SpeechRecognitionErrorEvent,
        SpeechRecognitionEvent,
        SpeechRecognitionResult,
        SpeechRecognitionResultList
      } from 'react-dictate-button/internal';
      import { SpeechSynthesis, SpeechSynthesisEvent, SpeechSynthesisUtterance } from '../speech/js/index.js';
      import renderHook from './private/renderHook.js';

      const {
        React: { createElement },
        ReactDOM: { render },
        testHelpers: { createDirectLineEmulator },
        WebChat: {
          Components: { BasicWebChat, Composer },
          hooks: { useDictateState },
          renderWebChat,
          testIds
        }
      } = window;

      run(async function () {
        const speechSynthesis = new SpeechSynthesis();
        const ponyfill = {
          SpeechGrammarList,
          SpeechRecognition: fn().mockImplementation(() => {
            const speechRecognition = new SpeechRecognition();

            spyOn(speechRecognition, 'abort');
            spyOn(speechRecognition, 'start');

            return speechRecognition;
          }),
          speechSynthesis,
          SpeechSynthesisUtterance
        };

        spyOn(speechSynthesis, 'speak');

        const { directLine, store } = createDirectLineEmulator();
        const WebChatWrapper = ({ children }) =>
          createElement(
            Composer,
            { directLine, store, webSpeechPonyfillFactory: () => ponyfill },
            createElement(BasicWebChat),
            children
          );

        // WHEN: Render initially.
        const renderResult = renderHook(() => useDictateState()[0], {
          legacyRoot: true,
          wrapper: WebChatWrapper
        });

        await pageConditions.uiConnected();

        // THEN: `useDictateState` should return IDLE.
        await waitFor(() => expect(renderResult).toHaveProperty('result.current', 0)); // IDLE

        // WHEN: Microphone button is clicked and priming user gesture is done.
        await pageObjects.clickMicrophoneButton();

        await waitFor(() => expect(speechSynthesis.speak).toHaveBeenCalledTimes(1));
        speechSynthesis.speak.mock.calls[0][0].dispatchEvent(
          new SpeechSynthesisEvent('end', { utterance: speechSynthesis.speak.mock.calls[0] })
        );

        // THEN: `useDictateState` should return STARTING.
        renderResult.rerender();
        // Dictate state "1" is for "automatic turning on microphone after current synthesis completed".
        await waitFor(() => expect(renderResult).toHaveProperty('result.current', 2));

        // THEN: Should construct SpeechRecognition().
        expect(ponyfill.SpeechRecognition).toHaveBeenCalledTimes(1);

        const { value: speechRecognition1 } = ponyfill.SpeechRecognition.mock.results[0];

        // THEN: Should call SpeechRecognition.start().
        expect(speechRecognition1.start).toHaveBeenCalledTimes(1);

        // WHEN: Recognition started and interim result is dispatched.
        speechRecognition1.dispatchEvent(new Event('start'));
        speechRecognition1.dispatchEvent(new Event('audiostart'));
        speechRecognition1.dispatchEvent(new Event('soundstart'));
        speechRecognition1.dispatchEvent(new Event('speechstart'));

        // WHEN: Recognized interim result of "Hello".
        speechRecognition1.dispatchEvent(
          new SpeechRecognitionEvent('result', {
            results: new SpeechRecognitionResultList(
              new SpeechRecognitionResult(new SpeechRecognitionAlternative(0, 'Hello'))
            )
          })
        );

        // THEN: `useDictateState` should return DICTATING.
        renderResult.rerender();
        await waitFor(() => expect(renderResult).toHaveProperty('result.current', 3));

        // WHEN: Recognized finalized result of "Hello, World!" and ended recognition.
        await (
          await directLine.actPostActivity(() =>
            speechRecognition1.dispatchEvent(
              new SpeechRecognitionEvent('result', {
                results: new SpeechRecognitionResultList(
                  SpeechRecognitionResult.fromFinalized(new SpeechRecognitionAlternative(0.9, 'Hello, World!'))
                )
              })
            )
          )
        ).resolveAll();

        speechRecognition1.dispatchEvent(new Event('speechend'));
        speechRecognition1.dispatchEvent(new Event('soundend'));
        speechRecognition1.dispatchEvent(new Event('audioend'));
        speechRecognition1.dispatchEvent(new Event('end'));

        // THEN: `useDictateState` should return IDLE.
        renderResult.rerender();
        await waitFor(() => expect(renderResult).toHaveProperty('result.current', 0));

        // WHEN: Bot replied.
        await directLine.emulateIncomingActivity({
          inputHint: 'expectingInput', // "expectingInput" should turn the microphone back on after synthesis completed.
          text: 'Aloha!',
          type: 'message'
        });
        await pageConditions.numActivitiesShown(2);

        // THEN: Should call SpeechSynthesis.speak() again.
        await waitFor(() => expect(speechSynthesis.speak).toHaveBeenCalledTimes(2));

        // THEN: Should start synthesizing "Aloha!".
        expect(speechSynthesis.speak).toHaveBeenLastCalledWith(expect.any(SpeechSynthesisUtterance));
        expect(speechSynthesis.speak).toHaveBeenLastCalledWith(expect.objectContaining({ text: 'Aloha!' }));

        // THEN: `useDictateState` should return WILL_START.
        renderResult.rerender();
        await waitFor(() => expect(renderResult).toHaveProperty('result.current', 1));

        // WHEN: Synthesis completed.
        speechSynthesis.speak.mock.calls[1][0].dispatchEvent(
          new SpeechSynthesisEvent('end', { utterance: speechSynthesis.speak.mock.calls[1] })
        );

        // THEN: `useDictateState` should return STARTING.
        renderResult.rerender();
        await waitFor(() => expect(renderResult).toHaveProperty('result.current', 2));

        // WHEN: Recognition started and interim result is dispatched.
        const { value: speechRecognition2 } = ponyfill.SpeechRecognition.mock.results[1];

        // THEN: Should call SpeechRecognition.start().
        expect(speechRecognition2.start).toHaveBeenCalledTimes(1);

        // WHEN: Recognition started and interim result is dispatched.
        speechRecognition2.dispatchEvent(new Event('start'));
        speechRecognition2.dispatchEvent(new Event('audiostart'));
        speechRecognition2.dispatchEvent(new Event('soundstart'));
        speechRecognition2.dispatchEvent(new Event('speechstart'));

        // WHEN: Recognized interim result of "Good".
        speechRecognition2.dispatchEvent(
          new SpeechRecognitionEvent('result', {
            results: new SpeechRecognitionResultList(
              new SpeechRecognitionResult(new SpeechRecognitionAlternative(0, 'Good'))
            )
          })
        );

        // THEN: `useDictateState` should return LISTENING.
        renderResult.rerender();
        await waitFor(() => expect(renderResult).toHaveProperty('result.current', 3));

        // WHEN: Click on microphone button.
        await pageObjects.clickMicrophoneButton();

        // THEN: `useDictateState` should return STOPPING.
        renderResult.rerender();
        await waitFor(() => expect(renderResult).toHaveProperty('result.current', 4));

        // WHEN: Recognition ended.
        speechRecognition2.dispatchEvent(new Event('speechend'));
        speechRecognition2.dispatchEvent(new Event('soundend'));
        speechRecognition2.dispatchEvent(new Event('audioend'));
        speechRecognition2.dispatchEvent(new Event('end'));

        // THEN: `useDictateState` should return IDLE.
        renderResult.rerender();
        await waitFor(() => expect(renderResult).toHaveProperty('result.current', 0));
      });
    </script>
  </body>
</html>
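
The numeric expectations in the test above map onto Web Chat's dictate states. As a reading aid, here is a hedged mapping inferred from the WHEN/THEN comments in the test (the constant names are illustrative; the numeric values are what `useDictateState()[0]` returns in the test).

```js
// Inferred from the test comments above: the value returned by useDictateState()[0].
const DictateState = {
  IDLE: 0,
  WILL_START: 1, // microphone will turn back on once the current synthesis completes
  STARTING: 2,
  DICTATING: 3, // also described as LISTENING in the comments
  STOPPING: 4
};
```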
