Commit b8e74e1

aalmada authored and keveleigh committed
Added SpeechInputSource for integration of the KeywordRecognizer in the InputManager (#354)
* Implemented KeywordManager as an input source. Added all the required event pipeline.
* Added missing meta files
* Using the InputManager to handle the events.
* Changed the KeywordManager scene to test keyword handlers on focused object and also a global listener.
* Changed the KeywordManager scene to test keyword handlers on focused object and also a global listener.
* Fixed method inheritance in KeywordManager
* Made all the changes suggested to date
* Fixes to the SpeechInputSource test scene
* Fixes to the SpeechInputSource test scene
* Fix not stopping when deactivated #360
* Rolled back enable/disable handling
* Set update() to protected virtual to allow inheritance. Should more methods also change?
* Set Unity methods to private
* Added to input documentation
* Renamed all PhraseRecognized to SpeechKeywordRecognized to avoid name clashing.
* Removed LINQ from SpeechInputSource. Refactored SpeechInputSource.ProcessKeyBindings().
* Process key bindings only when the keyword recognizer is running.
* Duplicate behavior from #360 in SpeechInputSource
* Added a custom editor for SpeechInputSource showing a warning when no keywords are assigned.
* Improved SpeechInputSource keyword list layout in the inspector to show only two columns.
* Removed CanEditMultipleObjects from SpeechInputSourceEditor.
* Added add and remove buttons to the SpeechInputSource custom editor.
* Changed SpeechInputSource initialization to be more readable as in comment at https://github.com/Microsoft/HoloToolkit-Unity/pull/398/files#r92492988
* Added code comments to SpeechInputSourceEditor and KeywordAndKeyCodeDrawer.
* Use TextMesh instead of GUI Text in the SpeechInputSource test scene.
* Added a missing private keyword
* Unload materials in SphereKeywords.cs and SphereGlobalKeywords.cs. Partial fix for #236
* A fix for #236 using a shader. Changing the color requires changing the color of all vertices but rendering requires just one draw call.
* Use [PerRendererData] in the shader to efficiently change color without leaks.
* Unity doesn't like init blocks
* Added SetGlobalListener.cs
* Fixed misspellings
* Reverted changes to HoloToolkit_Default.mat
* Reverted addition of [PerRendererData] to BlinnPhongConfigurable and related code
* Use the SetGlobalListener.cs instead
* Latest comments fixes
1 parent 9009654 commit b8e74e1

26 files changed (+1564 -6 lines changed)

Assets/HoloToolkit/Input/README.md

Lines changed: 31 additions & 3 deletions
@@ -17,6 +17,7 @@ Game objects that want to consume input events can implement one or many **input
 - **IManipulationHandler** for the Windows manipulation gesture.
 - **INavigationHandler** for the Windows navigation gesture.
 - **ISourceStateHandler** for the source detected and source lost events.
+- **ISpeechHandler** for voice commands.
 
 The **input manager** listens to the various events coming from the input sources, and also takes into account the gaze. Currently, that gaze is always coming from the GazeManager class, but this could be extended to support multiple gaze sources if the need arises.
 
@@ -189,16 +190,30 @@ or in your Visual Studio Package.appxmanifest capabilities.
 **RecognizerStart** Set this to determine whether the keyword recognizer will start immediately or if it should wait for your code to tell it to start.
 
 #### Voice
-##### KeywordManager.cs
-Allows you to specify keywords and methods in the Unity Inspector, instead of registering them explicitly in code.
-**IMPORTANT**: Please make sure to add the microphone capability in your app, in Unity under
+
+**IMPORTANT**: Please make sure to add the Microphone capabilities in your app, in Unity under
 Edit -> Project Settings -> Player -> Settings for Windows Store -> Publishing Settings -> Capabilities
 or in your Visual Studio Package.appxmanifest capabilities.
 
+##### KeywordManager.cs
+Allows you to specify keywords and methods in the Unity Inspector, instead of registering them explicitly in code.
+
 **_KeywordsAndResponses_** Set the size as the number of keywords you'd like to listen for, then specify the keywords and method responses to complete the array.
 
 **RecognizerStart** Set this to determine whether the keyword recognizer will start immediately or if it should wait for your code to tell it to start.
 
+##### SpeechInputSource.cs
+Allows you to specify keywords and keyboard shortcuts in the Unity Inspector, instead of registering them explicitly in code. Keywords are handled by scripts that implement ISpeechHandler.cs.
+
+Check out Assets/HoloToolkit/Input/Tests/Scripts/SphereKeywords.cs and Assets/HoloToolkit/Input/Tests/Scripts/SphereGlobalKeywords.cs for an example of implementing these features, which is used in the demo scene at Assets/HoloToolkit/Input/Tests/SpeechInputSource.unity.
+
+**_KeywordsAndKeys_** Set the size as the number of keywords you'd like to listen for, then specify the keywords to complete the array.
+
+**RecognizerStart** Set this to determine whether the keyword recognizer will start immediately or if it should wait for your code to tell it to start.
+
+##### ISpeechHandler.cs
+Interface that a game object can implement to react to speech keywords.
+
 ### [Test Prefabs](TestPrefabs)
 
 Prefabs used in the various test scenes, which you can use as inspiration to build your own.
@@ -276,6 +291,19 @@ Gazing on an object and saying "Select Object" will persistently select that obj
 after which the user can also adjust object size with "Make Smaller" and "Make Bigger" voice commands and finally clear
 currently selected object by saying "Clear Selection".
 
+#### SpeechInputSource.unity
+
+Shows how to use the SpeechInputSource.cs script to add keywords to your scene.
+
+1. Select whether you want the recognizer to start automatically or when you manually start it.
+2. Specify the number of keywords you want.
+3. Type the word or phrase you'd like to register as the keyword and, if you want, set a key code to use in the Editor. You can also use an attached microphone with the Editor.
+4. Attach a script that implements ISpeechHandler.cs to the object in the scene that will require the gaze focus to execute the command. You should register this script with the InputManager.cs as a global listener to handle keywords that don't require a focused object.
+
+When you start the scene, your keywords will automatically be registered on a KeywordRecognizer, and the recognizer will be started (or not) based on your Recognizer Start setting.
+
+####
+
 ---
 ##### [Go back up to the table of contents.](../../../README.md)
 ---
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+// Copyright (c) Microsoft Corporation. All rights reserved.
+// Licensed under the MIT License. See LICENSE in the project root for license information.
+
+using UnityEngine.EventSystems;
+
+namespace HoloToolkit.Unity.InputModule
+{
+    /// <summary>
+    /// Interface to implement to react to speech recognition.
+    /// </summary>
+    public interface ISpeechHandler : IEventSystemHandler
+    {
+        void OnSpeechKeywordRecognized(SpeechKeywordRecognizedEventData eventData);
+    }
+}
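
As a rough illustration of how a component might consume the `ISpeechHandler` interface, the sketch below switches on `RecognizedText` to recolor the object it is attached to. The class name `ColorOnKeyword` and the keywords "red" and "green" are hypothetical and not part of this commit; the keywords would have to match whatever is configured on the speech input source in the scene. The material is cached and destroyed explicitly because reading `Renderer.material` creates a per-object copy.

```csharp
using UnityEngine;
using HoloToolkit.Unity.InputModule;

// Hypothetical example component; not part of this commit.
[RequireComponent(typeof(Renderer))]
public class ColorOnKeyword : MonoBehaviour, ISpeechHandler
{
    private Material cachedMaterial;

    private void Awake()
    {
        // Accessing .material instantiates a copy owned by this object.
        cachedMaterial = GetComponent<Renderer>().material;
    }

    public void OnSpeechKeywordRecognized(SpeechKeywordRecognizedEventData eventData)
    {
        // The keywords below are assumptions; use the ones registered in the Inspector.
        switch (eventData.RecognizedText.ToLowerInvariant())
        {
            case "red":
                cachedMaterial.color = Color.red;
                break;
            case "green":
                cachedMaterial.color = Color.green;
                break;
        }
    }

    private void OnDestroy()
    {
        // Release the instantiated material copy to avoid leaking it.
        Destroy(cachedMaterial);
    }
}
```

Attached to an object with a collider, a component like this would only receive keywords while that object has gaze focus, per the README text in this commit.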

Assets/HoloToolkit/Input/Scripts/InputEvents/ISpeechHandler.cs.meta

Lines changed: 12 additions & 0 deletions
Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+// Copyright (c) Microsoft Corporation. All rights reserved.
+// Licensed under the MIT License. See LICENSE in the project root for license information.
+
+using System;
+using UnityEngine;
+using UnityEngine.EventSystems;
+using UnityEngine.Windows.Speech;
+
+namespace HoloToolkit.Unity.InputModule
+{
+    /// <summary>
+    /// Describes an input event that involves keyword recognition.
+    /// </summary>
+    public class SpeechKeywordRecognizedEventData : InputEventData
+    {
+        /// <summary>
+        /// A measure of correct recognition certainty.
+        /// </summary>
+        public ConfidenceLevel Confidence { get; private set; }
+
+        /// <summary>
+        /// The time it took for the phrase to be uttered.
+        /// </summary>
+        public TimeSpan PhraseDuration { get; private set; }
+
+        /// <summary>
+        /// The moment in time when uttering of the phrase began.
+        /// </summary>
+        public DateTime PhraseStartTime { get; private set; }
+
+        /// <summary>
+        /// A semantic meaning of recognized phrase.
+        /// </summary>
+        public SemanticMeaning[] SemanticMeanings { get; private set; }
+
+        /// <summary>
+        /// The text that was recognized.
+        /// </summary>
+        public string RecognizedText { get; private set; }
+
+        public SpeechKeywordRecognizedEventData(EventSystem eventSystem) : base(eventSystem)
+        {
+        }
+
+        public void Initialize(IInputSource inputSource, uint sourceId, ConfidenceLevel confidence, TimeSpan phraseDuration, DateTime phraseStartTime, SemanticMeaning[] semanticMeanings, string recognizedText)
+        {
+            BaseInitialize(inputSource, sourceId);
+            Confidence = confidence;
+            PhraseDuration = phraseDuration;
+            PhraseStartTime = phraseStartTime;
+            SemanticMeanings = semanticMeanings;
+            RecognizedText = recognizedText;
+        }
+    }
+}

Assets/HoloToolkit/Input/Scripts/InputEvents/SpeechKeywordRecognizedEventData.cs.meta

Lines changed: 12 additions & 0 deletions

Assets/HoloToolkit/Input/Scripts/InputManager.cs

Lines changed: 20 additions & 0 deletions
@@ -33,6 +33,7 @@ public class InputManager : Singleton<InputManager>
         private ManipulationEventData manipulationEventData;
         private NavigationEventData navigationEventData;
         private HoldEventData holdEventData;
+        private SpeechKeywordRecognizedEventData speechKeywordRecognizedEventData;
 
         /// <summary>
         /// Indicates if input is currently enabled or not.
@@ -185,6 +186,7 @@ public void RegisterInputSource(IInputSource inputSource)
             inputSource.NavigationCompleted += InputSource_NavigationCompleted;
             inputSource.NavigationStarted += InputSource_NavigationStarted;
             inputSource.NavigationUpdated += InputSource_NavigationUpdated;
+            inputSource.SpeechKeywordRecognized += InputSource_SpeechKeywordRecognized;
         }
 
         /// <summary>
@@ -210,6 +212,7 @@ public void UnregisterInputSource(IInputSource inputSource)
             inputSource.NavigationCompleted -= InputSource_NavigationCompleted;
             inputSource.NavigationStarted -= InputSource_NavigationStarted;
             inputSource.NavigationUpdated -= InputSource_NavigationUpdated;
+            inputSource.SpeechKeywordRecognized -= InputSource_SpeechKeywordRecognized;
         }
 
         private void Start()
@@ -231,6 +234,7 @@ private void InitializeEventDatas()
             manipulationEventData = new ManipulationEventData(EventSystem.current);
             navigationEventData = new NavigationEventData(EventSystem.current);
             holdEventData = new HoldEventData(EventSystem.current);
+            speechKeywordRecognizedEventData = new SpeechKeywordRecognizedEventData(EventSystem.current);
         }
 
         protected override void OnDestroy()
@@ -653,5 +657,21 @@ private void InputSource_NavigationCanceled(object sender, NavigationEventArgs e
             // Pass handler through HandleEvent to perform modal/fallback logic
             HandleEvent(navigationEventData, OnNavigationCanceledEventHandler);
         }
+
+        private static readonly ExecuteEvents.EventFunction<ISpeechHandler> OnSpeechKeywordRecognizedEventHandler =
+            delegate (ISpeechHandler handler, BaseEventData eventData)
+            {
+                SpeechKeywordRecognizedEventData casted = ExecuteEvents.ValidateEventData<SpeechKeywordRecognizedEventData>(eventData);
+                handler.OnSpeechKeywordRecognized(casted);
+            };
+
+        private void InputSource_SpeechKeywordRecognized(object sender, SpeechKeywordRecognizedEventArgs e)
+        {
+            // Create input event
+            speechKeywordRecognizedEventData.Initialize(e.InputSource, e.SourceId, e.Confidence, e.PhraseDuration, e.PhraseStartTime, e.SemanticMeanings, e.RecognizedText);
+
+            // Pass handler through HandleEvent to perform modal/fallback logic
+            HandleEvent(speechKeywordRecognizedEventData, OnSpeechKeywordRecognizedEventHandler);
+        }
     }
 }
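
The README in this commit mentions registering a handler with the InputManager as a global listener so that keywords work without a focused object. A minimal sketch of that idea follows. It assumes `InputManager` exposes `AddGlobalListener(GameObject)` and `RemoveGlobalListener(GameObject)`, which are not shown in this diff, and the class name `GlobalKeywordLogger` is hypothetical.

```csharp
using UnityEngine;
using HoloToolkit.Unity.InputModule;

// Hypothetical example component; not part of this commit.
public class GlobalKeywordLogger : MonoBehaviour, ISpeechHandler
{
    private void Start()
    {
        // Assumption: AddGlobalListener exists on InputManager (not visible in this diff).
        InputManager.Instance.AddGlobalListener(gameObject);
    }

    private void OnDestroy()
    {
        // The InputManager may already be gone when the scene is torn down.
        if (InputManager.Instance != null)
        {
            InputManager.Instance.RemoveGlobalListener(gameObject);
        }
    }

    public void OnSpeechKeywordRecognized(SpeechKeywordRecognizedEventData eventData)
    {
        Debug.Log(string.Format("Heard \"{0}\" (confidence: {1})",
            eventData.RecognizedText, eventData.Confidence));
    }
}
```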

Assets/HoloToolkit/Input/Scripts/InputSources/BaseInputSource.cs

Lines changed: 10 additions & 0 deletions
@@ -27,6 +27,7 @@ public abstract class BaseInputSource : MonoBehaviour, IInputSource
         public event EventHandler<NavigationEventArgs> NavigationUpdated;
         public event EventHandler<NavigationEventArgs> NavigationCompleted;
         public event EventHandler<NavigationEventArgs> NavigationCanceled;
+        public event EventHandler<SpeechKeywordRecognizedEventArgs> SpeechKeywordRecognized;
 
         public abstract SupportedInputEvents SupportedEvents { get; }
 
@@ -251,6 +252,15 @@ protected void RaiseNavigationCanceledEvent(NavigationEventArgs e)
             }
         }
 
+        protected void RaiseSpeechKeywordRecognizedEvent(SpeechKeywordRecognizedEventArgs e)
+        {
+            EventHandler<SpeechKeywordRecognizedEventArgs> handler = SpeechKeywordRecognized;
+            if (handler != null)
+            {
+                handler(this, e);
+            }
+        }
+
         #endregion
     }
 }
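
The `RaiseSpeechKeywordRecognizedEvent` method above copies the event to a local variable before the null check. The standalone sketch below shows the same pattern outside Unity. `KeywordEventArgs` and `DemoSource` are hypothetical stand-ins for the toolkit types, used only so the snippet runs as a plain console program.

```csharp
using System;

// Hypothetical stand-in for SpeechKeywordRecognizedEventArgs.
public class KeywordEventArgs : EventArgs
{
    public string RecognizedText { get; private set; }

    public KeywordEventArgs(string recognizedText)
    {
        RecognizedText = recognizedText;
    }
}

// Hypothetical stand-in for an input source.
public class DemoSource
{
    public event EventHandler<KeywordEventArgs> KeywordRecognized;

    public void Raise(string text)
    {
        // Copy to a local first so the delegate cannot become null
        // between the check and the invocation.
        EventHandler<KeywordEventArgs> handler = KeywordRecognized;
        if (handler != null)
        {
            handler(this, new KeywordEventArgs(text));
        }
    }
}

public static class Program
{
    public static void Main()
    {
        DemoSource source = new DemoSource();

        source.Raise("ignored"); // no subscribers yet, so nothing happens

        source.KeywordRecognized += (sender, e) => Console.WriteLine("Heard: " + e.RecognizedText);
        source.Raise("select");  // prints "Heard: select"
    }
}
```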

Assets/HoloToolkit/Input/Scripts/InputSources/IInputSource.cs

Lines changed: 7 additions & 1 deletion
@@ -16,7 +16,8 @@ public enum SupportedInputEvents
         SourceClicked = 2,
         Hold = 4,
         Manipulation = 8,
-        Navigation = 16
+        Navigation = 16,
+        SpeechKeyword = 32
     }
 
     /// <summary>
@@ -118,6 +119,11 @@ public interface IInputSource
         /// </summary>
         event EventHandler<NavigationEventArgs> NavigationCanceled;
 
+        /// <summary>
+        /// Event triggered when a speech phrase is recognized.
+        /// </summary>
+        event EventHandler<SpeechKeywordRecognizedEventArgs> SpeechKeywordRecognized;
+
         /// <summary>
         /// Events supported by the input source.
         /// </summary>
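
The power-of-two values in `SupportedInputEvents` suggest it is meant to be combined and tested as a bit mask. The standalone sketch below shows one way a caller might check for the new `SpeechKeyword` value. The enum here is a trimmed local copy containing only the members visible in this diff, the `[Flags]` attribute is an assumption, and the `Supports` helper is hypothetical.

```csharp
using System;

// Trimmed local copy of the enum; only the members visible in the diff are listed.
// The [Flags] attribute is an assumption.
[Flags]
public enum SupportedInputEvents
{
    SourceClicked = 2,
    Hold = 4,
    Manipulation = 8,
    Navigation = 16,
    SpeechKeyword = 32
}

public static class Program
{
    // Hypothetical helper: true when every bit in 'wanted' is set in 'supported'.
    public static bool Supports(SupportedInputEvents supported, SupportedInputEvents wanted)
    {
        return (supported & wanted) == wanted;
    }

    public static void Main()
    {
        SupportedInputEvents speechOnly = SupportedInputEvents.SpeechKeyword;
        SupportedInputEvents gestures = SupportedInputEvents.Hold | SupportedInputEvents.Navigation;

        Console.WriteLine(Supports(speechOnly, SupportedInputEvents.SpeechKeyword)); // True
        Console.WriteLine(Supports(gestures, SupportedInputEvents.SpeechKeyword));   // False
    }
}
```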
Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+// Copyright (c) Microsoft Corporation. All rights reserved.
+// Licensed under the MIT License. See LICENSE in the project root for license information.
+
+using System;
+using UnityEngine;
+using UnityEngine.Windows.Speech;
+
+namespace HoloToolkit.Unity.InputModule
+{
+    /// <summary>
+    /// Event args for a speech keyword recognized event.
+    /// </summary>
+    public class SpeechKeywordRecognizedEventArgs : InputSourceEventArgs
+    {
+        /// <summary>
+        /// A measure of correct recognition certainty.
+        /// </summary>
+        public ConfidenceLevel Confidence { get; private set; }
+
+        /// <summary>
+        /// The time it took for the phrase to be uttered.
+        /// </summary>
+        public TimeSpan PhraseDuration { get; private set; }
+
+        /// <summary>
+        /// The moment in time when uttering of the phrase began.
+        /// </summary>
+        public DateTime PhraseStartTime { get; private set; }
+
+        /// <summary>
+        /// A semantic meaning of recognized phrase.
+        /// </summary>
+        public SemanticMeaning[] SemanticMeanings { get; private set; }
+
+        /// <summary>
+        /// The text that was recognized.
+        /// </summary>
+        public string RecognizedText { get; private set; }
+
+        public SpeechKeywordRecognizedEventArgs(IInputSource inputSource, uint sourceId, ConfidenceLevel confidence, TimeSpan phraseDuration, DateTime phraseStartTime, SemanticMeaning[] semanticMeanings, string recognizedText) : base(inputSource, sourceId)
+        {
+            Confidence = confidence;
+            PhraseDuration = phraseDuration;
+            PhraseStartTime = phraseStartTime;
+            SemanticMeanings = semanticMeanings;
+            RecognizedText = recognizedText;
+        }
+    }
+}
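
The properties on `SpeechKeywordRecognizedEventArgs` mirror the fields of Unity's `PhraseRecognizedEventArgs`, so an input source could plausibly build one from the other. The helper below is a sketch of that mapping only. The class `SpeechEventArgsFactory` and its method are hypothetical, and this is not necessarily how SpeechInputSource.cs (not shown here) does it.

```csharp
using UnityEngine.Windows.Speech;
using HoloToolkit.Unity.InputModule;

// Hypothetical helper; not part of this commit.
public static class SpeechEventArgsFactory
{
    // Maps Unity's recognizer result onto the toolkit's event args,
    // using the constructor added in this commit.
    public static SpeechKeywordRecognizedEventArgs FromPhrase(
        IInputSource source, uint sourceId, PhraseRecognizedEventArgs args)
    {
        return new SpeechKeywordRecognizedEventArgs(
            source,
            sourceId,
            args.confidence,
            args.phraseDuration,
            args.phraseStartTime,
            args.semanticMeanings,
            args.text);
    }
}
```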

Assets/HoloToolkit/Input/Scripts/InputSources/SpeechKeywordRecognizedEventArgs.cs.meta

Lines changed: 12 additions & 0 deletions
