|
| 1 | +# Sample: Using Microsoft Cognitive Speech Service in UniMRCP |
| 2 | + |
| 3 | +This sample demonstrates how to use the Microsoft Cognitive Services Speech service, for both speech recognition (SR) and text-to-speech (TTS), in [UniMRCP](http://www.unimrcp.org/) plugins. |
| 4 | + |
| 5 | +## Prerequisites |
| 6 | + |
| 7 | +* A subscription key for the Speech service. See [Try the speech service for free](https://docs.microsoft.com/azure/cognitive-services/speech-service/get-started). |
| 8 | +* A PC running Ubuntu 18.04, or Windows with Visual Studio installed. |
| 9 | +* On Ubuntu, install these packages to build and run this sample: |
| 10 | + |
| 11 | + ```sh |
| 12 | + sudo apt update |
| 13 | + sudo apt install build-essential libssl1.0.0 libasound2 wget |
| 14 | + sudo apt install pkg-config automake libtool libtool-bin |
| 15 | + sudo apt install libpoco-dev rapidjson-dev |
| 16 | + ``` |
| 17 | + |
| 18 | +* On Windows, install and configure [poco](https://pocoproject.org/) and `rapidjson`. The recommended method is to use [vcpkg](https://github.com/microsoft/vcpkg): |
| 19 | + |
| 20 | + ```bash |
| 21 | + vcpkg.exe install poco rapidjson |
| 22 | + ``` |
| 23 | + |
| 24 | +## Build the sample |
| 25 | + |
| 26 | +* Download and compile UniMRCP 1.6.0 on your PC, following the [installation instructions](http://www.unimrcp.org/index.php/project/get-started). |
| 27 | + |
| 28 | +* Clone or download the sample code to your development PC. |
| 29 | + |
| 30 | +* Copy the `ms-recog`, `ms-synth` and `ms-common` folders to `your_unimrcp_path/plugins` |
| 31 | + |
| 32 | +### On Linux |
| 33 | + |
| 34 | +* Download and extract the Speech SDK |
| 35 | + * **By downloading the Microsoft Cognitive Services Speech SDK, you acknowledge its license; see the [Speech SDK license agreement](https://aka.ms/csspeech/license201809).** |
| 36 | + * Run the following commands after replacing the string `/your/path` with a directory (absolute path) of your choice: |
| 37 | + |
| 38 | + ```sh |
| 39 | + export SPEECHSDK_ROOT="/your/path" |
| 40 | + mkdir -p "$SPEECHSDK_ROOT" |
| 41 | + wget -O SpeechSDK-Linux.tar.gz https://aka.ms/csspeech/linuxbinary |
| 42 | + tar --strip 1 -xzf SpeechSDK-Linux.tar.gz -C "$SPEECHSDK_ROOT" |
| 43 | + ``` |
| 44 | + |
| 45 | +* Navigate to the directory of UniMRCP |
| 46 | +* Open the file `configure.ac` and add the following lines, each group next to the corresponding entries for the existing plugins (`...` marks unrelated content between the groups): |
| 47 | + |
| 48 | + ```shell |
| 49 | + dnl MS recognizer plugin. |
| 50 | + UNI_PLUGIN_ENABLED(msrecog) |
| 51 | + |
| 52 | + AM_CONDITIONAL([MSRECOG_PLUGIN],[test "${enable_msrecog_plugin}" = "yes"]) |
| 53 | + |
| 54 | + dnl MS synthesizer plugin. |
| 55 | + UNI_PLUGIN_ENABLED(mssynth) |
| 56 | + |
| 57 | + AM_CONDITIONAL([MSSYNTH_PLUGIN],[test "${enable_mssynth_plugin}" = "yes"]) |
| 58 | + |
| 59 | + ... |
| 60 | + |
| 61 | + plugins/ms-recog/Makefile |
| 62 | + plugins/ms-synth/Makefile |
| 63 | + |
| 64 | + ... |
| 65 | + |
| 66 | + echo MS recognizer plugin.......... : $enable_msrecog_plugin |
| 67 | + echo MS synthesizer plugin......... : $enable_mssynth_plugin |
| 68 | + ``` |
| 69 | + |
| 70 | +* Edit the file `plugins/Makefile.am` and add the following: |
| 71 | + |
| 72 | + ```shell |
| 73 | + if MSRECOG_PLUGIN |
| 74 | + SUBDIRS += ms-recog |
| 75 | + endif |
| 76 | + |
| 77 | + if MSSYNTH_PLUGIN |
| 78 | + SUBDIRS += ms-synth |
| 79 | + endif |
| 80 | + ``` |
| 81 | + |
| 82 | +* Recompile and install the UniMRCP project. |
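Before recompiling, it can help to confirm that the `configure.ac` edits described above actually landed in the file. The following is a minimal sketch, not part of the sample: the function name `missing_ms_entries` is hypothetical, and it only looks for the four strings shown in the snippet above as fixed text, without checking where in the file they appear.

```shell
#!/bin/sh
# Hypothetical helper (not part of the sample): print each expected entry
# that is missing from the given configure.ac. Prints nothing when all
# four entries are present.
missing_ms_entries() {
    file="$1"
    for entry in \
        'UNI_PLUGIN_ENABLED(msrecog)' \
        'UNI_PLUGIN_ENABLED(mssynth)' \
        'plugins/ms-recog/Makefile' \
        'plugins/ms-synth/Makefile'
    do
        # -F: treat the entry as a fixed string; -q: exit status only.
        grep -qF -e "$entry" "$file" || printf '%s\n' "$entry"
    done
}

# Run the check only when a configure.ac exists in the current directory,
# for example when the script is started from the UniMRCP source directory.
if [ -f configure.ac ]; then
    missing_ms_entries configure.ac
fi
```

An empty output means all four strings were found. Any line that is printed names an entry that still needs to be added.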
| 83 | + |
| 84 | +### On Windows |
| 85 | + |
| 86 | +* Add the `ms-recog`, `ms-synth` and `ms-common` projects to the `unimrcp-2010.sln` solution. |
| 87 | + |
| 88 | +* Restore the NuGet packages. |
| 89 | + |
| 90 | +* Recompile the solution using the `Release` configuration and the `x64` platform. |
| 91 | + |
| 92 | +## Run the sample |
| 93 | + |
| 94 | +* Edit the `unimrcpserver.xml` config file (in `/usr/local/unimrcp/conf` on Linux and `Project_folder\x64\Release\conf` on Windows): |
| 95 | + |
| 96 | + * In the `<plugin-factory>` section, add entries for the Microsoft plugins and disable the demo plugins: |
| 97 | + |
| 98 | + ```xml |
| 99 | + <engine id="Demo-Synth-1" name="demosynth" enable="false"/> |
| 100 | + <engine id="MS-Synth-1" name="mssynth" enable="true"/> |
| 101 | + <engine id="Demo-Recog-1" name="demorecog" enable="false"/> |
| 102 | + <engine id="MS-Recog-1" name="msrecog" enable="true"/> |
| 103 | + ``` |
| 104 | + |
| 105 | + * Add 16 kHz, 16-bit codec support for higher accuracy and quality: |
| 106 | + |
| 107 | + ```xml |
| 108 | + <codecs own-preference="false">PCMU/97/16000 PCMA/98/16000 L16/99/16000 PCMU PCMA L16/96/8000 telephone-event/101/8000</codecs> |
| 109 | + ``` |
| 110 | + |
| 111 | + * Adjust other settings, such as the IP address and port, as needed. |
| 112 | + |
| 113 | +* Add a configuration file named `config.json` to the same folder as `unimrcpserver.xml`. A sample file is in the `sample-conf` folder of this project. Remember to replace `YourSubscriptionKey` and `YourServiceRegion` with the key and region of your subscription. |
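A forgotten placeholder in `config.json` is easy to miss until the first request fails. As a minimal sketch, the check below searches a file for the two placeholder strings named above. The function name `has_placeholders` is hypothetical, and the path assumes the Linux install location mentioned earlier.

```shell
#!/bin/sh
# Hypothetical helper (not part of the sample): succeed (exit status 0)
# when the given file still contains one of the sample placeholders.
has_placeholders() {
    grep -qE 'YourSubscriptionKey|YourServiceRegion' "$1"
}

# Assumed location: the Linux config folder used earlier in this README.
conf=/usr/local/unimrcp/conf/config.json
if [ -f "$conf" ] && has_placeholders "$conf"; then
    echo "$conf still contains placeholder values" >&2
fi
```

The check treats the file as plain text, so it does not depend on the layout of `config.json`.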
| 114 | + |
| 115 | +### On Linux |
| 116 | + |
| 117 | +To run the sample, you'll need to configure the loader's library path to point to the Speech SDK library. |
| 118 | + |
| 119 | +* On an `x64` machine, run: |
| 120 | + |
| 121 | + ```sh |
| 122 | + export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$SPEECHSDK_ROOT/lib/x64" |
| 123 | + ``` |
| 124 | + |
| 125 | +* On an `x86` machine, run: |
| 126 | + |
| 127 | + ```sh |
| 128 | + export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$SPEECHSDK_ROOT/lib/x86" |
| 129 | + ``` |
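The two `export` commands above differ only in the last path component, so the choice can be scripted. The following is a minimal sketch with some assumptions: the function name `speechsdk_lib_subdir` is hypothetical, the mapping from `uname -m` output to `x64`/`x86` covers only the common machine names, and `SPEECHSDK_ROOT` is expected to be set as in the build steps.

```shell
#!/bin/sh
# Hypothetical helper (not part of the sample): map a machine name, as
# printed by `uname -m`, to the SDK library subdirectory used above.
# Returns a non-zero status for machine names it does not recognize.
speechsdk_lib_subdir() {
    case "$1" in
        x86_64|amd64) echo x64 ;;
        i386|i486|i586|i686) echo x86 ;;
        *) return 1 ;;
    esac
}

arch=$(uname -m)
if [ -z "${SPEECHSDK_ROOT:-}" ]; then
    echo "SPEECHSDK_ROOT is not set" >&2
elif subdir=$(speechsdk_lib_subdir "$arch"); then
    # ${VAR:+...} adds the separator only when LD_LIBRARY_PATH is non-empty,
    # which avoids a leading ':' in the resulting value.
    export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$SPEECHSDK_ROOT/lib/$subdir"
else
    echo "no x64/x86 mapping for architecture: $arch" >&2
fi
```

Because the snippet changes an environment variable, it has to be sourced into the shell that later starts the server (for example `. ./set-sdk-path.sh`, where the file name is only an example) rather than run as a separate process.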
| 130 | + |
| 131 | +Run the application: |
| 132 | + |
| 133 | + ```sh |
| 134 | + /usr/local/unimrcp/bin/unimrcpserver |
| 135 | + ``` |
| 136 | + |
| 137 | +### On Windows |
| 138 | + |
| 139 | +Run the executable file `Project_folder\x64\Release\bin\unimrcpserver.exe`. |
| 140 | + |
| 141 | +## Build and run with Docker |
| 142 | + |
| 143 | +You can use Docker to build and deploy UniMRCP with the Microsoft SR/TTS plugins. |
| 144 | + |
| 145 | +To build the Docker image, run: |
| 146 | + |
| 147 | + ```shell |
| 148 | + ./docker/build.sh |
| 149 | + ``` |
| 150 | + |
| 151 | +Then run the following command to start the UniMRCP server: |
| 152 | + |
| 153 | +*Note: running the container on the host network is recommended, because it simplifies configuration and improves network efficiency.* |
| 154 | + |
| 155 | + ```shell |
| 156 | + docker run -dt --name unimrcp_ms -v ~/conf:/usr/local/unimrcp/conf --network=host unimrcp_ms:latest |
| 157 | + ``` |
| 158 | + |
| 159 | +The `~/conf` folder contains the configuration files for UniMRCP and the Microsoft plugins. You can find sample configuration files in the `sample-conf` folder of this project. |
| 160 | +Remember to replace `YourSubscriptionKey` and `YourServiceRegion` with the key and region of your subscription. |
| 161 | + |
| 162 | +## Test |
| 163 | + |
| 164 | +For a quick test, you can use `umc` from the UniMRCP project. |
| 165 | +In the `umc` console, type `run recog` and `run synth` to test SR and TTS, respectively. |
| 166 | + |
| 167 | +*Tip: make sure the codec configuration is correct. The configuration files in `sample-conf` have been tested; you can use them after setting the IP address.* |
| 168 | + |
| 169 | +If you want to test the performance of the MRCP server as well as the plugins, an easy way is to modify `umc` so that it calls SR/TTS repeatedly and concurrently. |
| 170 | +Replace the `UmcConsole::RunCmdLine()` method in `umcconsole.cpp` with the following: |
| 171 | + |
| 172 | + ```c++ |
| 173 | + bool UmcConsole::RunCmdLine() |
| 174 | + { |
| 175 | + // Issue 100 recognition requests, one per second. |
| 176 | + for(int i = 0; i < 100; i++) |
| 177 | + { |
| 178 | + char cmdline[1024] = "run recog"; |
| 179 | + std::this_thread::sleep_for(std::chrono::milliseconds(1000)); |
| 180 | + ProcessCmdLine(cmdline); |
| 181 | + } |
| 182 | + // Wait 20 seconds after the last request before returning. |
| 183 | + std::this_thread::sleep_for(std::chrono::seconds(20)); |
| 184 | + return true; |
| 185 | + } |
| 186 | + ``` |
| 187 | + |
| 188 | +Also add `#include <thread>` and `#include <chrono>` to the file's includes. |
| 189 | +You can change the repeat count and the interval to run your own test. |
| 190 | +Then you can use the Python scripts in `test_scripts` to analyze the log file and get the latency. |
| 191 | + |
| 192 | +Alternatively, you can use FreeSWITCH to make a real call to test the plugins. See [test_with_freeswitch.md](./test_with_freeswitch.md) for details. |
| 193 | + |
| 194 | +## References |
| 195 | + |
| 195 | +* [UniMRCP](http://www.unimrcp.org/) |
| 196 | +* [Speech SDK API reference for C++](https://aka.ms/csspeech/cppref) |