Commit cca6ea6

mamaheux, philippewarren, and MariamFdil authored
Add Piper TTS and Whisper ASR (#80)
* Modified the process of CachedVoiceGenerator
* Add the piper_ros package.
* Added voice type
* Added gender to launch file
* Add ONNX Runtime install steps.
* Update the demos strings to make them compatible with piper_ros.
* Remove non-talk related changes.
* Update the talk node to use piper. Update launch files.
* Add a parameter to use the GPU or not.
* Add the models license.
* Format
* Add the VAD. Add whisper_speecg_to_text.py.
* Fix control_panel and launch files.
* Update audio_utils.
* Fix the Jetson Orin name in the main readme. Fix the device name for the cpu. Fix high CPU usage after the transcription.
* Update talk_node.py
* Update WeatherForecastState.cpp
* Update README.md
* Update vad.launch
* Update README.md
* Update jetson_configuration.sh
* Update 01_COMPUTER_CONFIGURATION.md
* Update jetson_configuration.sh
* Update jetson_configuration.sh
* Fix Jetson setup.
* Update 01_COMPUTER_CONFIGURATION.md

---------

Co-authored-by: philippewarren <philippewarren31@gmail.com>
Co-authored-by: MariamFdil <fdil.mariam@gmail.com>
1 parent 992a673 commit cca6ea6

56 files changed, +2983 -141 lines changed

README.md

Lines changed: 25 additions & 25 deletions
@@ -24,31 +24,31 @@ interacting with people.

 ## Features

-| Category | Type | Description |
-| ---------------- | ----------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| Power | Power Adapter | 19 V |
-| | Battery | 1x [RRC2054-2](https://www.rrc-ps.com/en/battery-packs/standard-battery-packs/products/RRC2054-2) |
-| | Battery Charger | 1x [RRC-PMM240](https://www.rrc-ps.com/en/battery-packs/standard-battery-packs/products/RRC-PMM240) |
-| Sensors | Microphone Array | 16x [xSoundsMicrophones](https://github.com/introlab/xSoundsMicrophones), 1x [16SoundsUSB](https://github.com/introlab/16SoundsUSB) |
-| | RGB-D Camera | 1x [Intel RealSense D435i](https://www.intelrealsense.com/depth-camera-d435i/) |
-| | Wide Angle Camera | 1x [Arducam AR0230](https://www.uctronics.com/arducam-1080p-hd-wide-angle-wdr-usb-camera-module-for-computer-2mp-1-2-7-cmos-ar0230-100-degree-mini-uvc-usb2-0-spy-webcam-board-with-3-3ft-1m-cable-for-windows-linux-mac-os-android.html) |
-| | Touchscreen | 1x 7 inch 1024x600 capacitive touchscreen |
-| | Current/Voltage | [INA220](https://www.ti.com/product/INA220) |
-| | Light Sensors | 4x [Adafruit ALS-PT19 ](https://www.adafruit.com/product/2748) |
-| | Buttons | 4x buttons |
-| Actuators | Stewart Platform | Displacement range: ±3 cm (x, y and z), ±20° (x and y), ±30° (z). Motor: [Dynamixel XL430-W250](https://emanual.robotis.com/docs/en/dxl/x/xl430-w250/) | |
-| | Rotating Base | Displacement range: illimited. Motor: [Dynamixel XL430-W250](https://emanual.robotis.com/docs/en/dxl/x/xl430-w250/) |
-| | Speakers | 4x [Dayton Audio DMA45-8](https://www.daytonaudio.com/product/1613/dma45-8-1-1-2-dual-magnet-aluminum-cone-full-range-driver-8-ohm), 2x [MAX9744](https://www.adafruit.com/product/1752) |
-| | Cooling | 2x [Noctua NF-A4x20 5V](https://noctua.at/en/products/fan/nf-a4x20-5v) |
-| | Touchscreen | 1x 7 inch 1024x600 capacitive touchscreen |
-| | LED | Battery status, volume level, led strip |
-| Network | WiFi | Intel Dual Band Wireless-AC 8265 NGW |
-| | Ethernet | 100 Mbps |
-| Processing | Computer | [NVIDIA Jetson AGX Xavier Developer Kit](https://developer.nvidia.com/embedded/jetson-agx-xavier-developer-kit) or [NVIDIA Jetson AGX Xavier Orin Kit](https://developer.nvidia.com/embedded/jetson-agx-orin-developer-kit) |
-| | Motor MCU | [Teensy 4.0](https://www.pjrc.com/store/teensy40.html) |
-| | Battery MCU | [Teensy LC](https://www.pjrc.com/teensy/teensyLC.html) |
-| Perceptions | | SLAM, Object detection, person pose estimation, face recognition, sound classification, speaker identification, robot name detection, speech to text, person identification, music beat detection, source source localization, ego noise reduction |
-| Behaviors | | Telepresence, emotions, talking, greeting, face following, dancing, exploring, sound following |
+| Category | Type | Description |
+| ---------------- | ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| Power | Power Adapter | 19 V |
+| | Battery | 1x [RRC2054-2](https://www.rrc-ps.com/en/battery-packs/standard-battery-packs/products/RRC2054-2) |
+| | Battery Charger | 1x [RRC-PMM240](https://www.rrc-ps.com/en/battery-packs/standard-battery-packs/products/RRC-PMM240) |
+| Sensors | Microphone Array | 16x [xSoundsMicrophones](https://github.com/introlab/xSoundsMicrophones), 1x [16SoundsUSB](https://github.com/introlab/16SoundsUSB) |
+| | RGB-D Camera | 1x [Intel RealSense D435i](https://www.intelrealsense.com/depth-camera-d435i/) |
+| | Wide Angle Camera | 1x [Arducam AR0230](https://www.uctronics.com/arducam-1080p-hd-wide-angle-wdr-usb-camera-module-for-computer-2mp-1-2-7-cmos-ar0230-100-degree-mini-uvc-usb2-0-spy-webcam-board-with-3-3ft-1m-cable-for-windows-linux-mac-os-android.html) |
+| | Touchscreen | 1x 7 inch 1024x600 capacitive touchscreen |
+| | Current/Voltage | [INA220](https://www.ti.com/product/INA220) |
+| | Light Sensors | 4x [Adafruit ALS-PT19 ](https://www.adafruit.com/product/2748) |
+| | Buttons | 4x buttons |
+| Actuators | Stewart Platform | Displacement range: ±3 cm (x, y and z), ±20° (x and y), ±30° (z). Motor: [Dynamixel XL430-W250](https://emanual.robotis.com/docs/en/dxl/x/xl430-w250/) | |
+| | Rotating Base | Displacement range: illimited. Motor: [Dynamixel XL430-W250](https://emanual.robotis.com/docs/en/dxl/x/xl430-w250/) |
+| | Speakers | 4x [Dayton Audio DMA45-8](https://www.daytonaudio.com/product/1613/dma45-8-1-1-2-dual-magnet-aluminum-cone-full-range-driver-8-ohm), 2x [MAX9744](https://www.adafruit.com/product/1752) |
+| | Cooling | 2x [Noctua NF-A4x20 5V](https://noctua.at/en/products/fan/nf-a4x20-5v) |
+| | Touchscreen | 1x 7 inch 1024x600 capacitive touchscreen |
+| | LED | Battery status, volume level, led strip |
+| Network | WiFi | Intel Dual Band Wireless-AC 8265 NGW |
+| | Ethernet | 100 Mbps |
+| Processing | Computer | [NVIDIA Jetson AGX Xavier Developer Kit](https://developer.nvidia.com/embedded/jetson-agx-xavier-developer-kit) or [NVIDIA Jetson AGX Xavier Orin Developer Kit](https://developer.nvidia.com/embedded/jetson-agx-orin-developer-kit) |
+| | Motor MCU | [Teensy 4.0](https://www.pjrc.com/store/teensy40.html) |
+| | Battery MCU | [Teensy LC](https://www.pjrc.com/teensy/teensyLC.html) |
+| Perceptions | | SLAM, object detection, person pose estimation, face recognition, sound classification, speaker identification, robot name detection, speech to text, person identification, music beat detection, source source localization, ego noise reduction, vad |
+| Behaviors | | Telepresence, emotions, talking, greeting, face following, dancing, exploring, sound following |

 ## Repository Structure

documentation/assembly/01_COMPUTER_CONFIGURATION.md

Lines changed: 47 additions & 4 deletions
@@ -385,9 +385,9 @@ git clone -b ros1 https://github.com/ros-planning/navigation_msgs.git
 git clone -b noetic https://github.com/ros-perception/vision_opencv.git
 git clone -b noetic-devel https://github.com/ros-perception/image_common.git

-clone_git -b 1.7.1 https://github.com/ros-perception/perception_pcl.git
-clone_git -b noetic-devel https://github.com/ros-perception/pcl_msgs.git
-clone_git -b noetic-devel https://github.com/ros-perception/image_transport_plugins.git
+git clone -b 1.7.1 https://github.com/ros-perception/perception_pcl.git
+git clone -b noetic-devel https://github.com/ros-perception/pcl_msgs.git
+git clone -b noetic-devel https://github.com/ros-perception/image_transport_plugins.git


 cd ~/ros_catkin_ws
@@ -438,7 +438,35 @@ sudo apt install -y \
 gstreamer1.0-plugins-bad \
 gstreamer1.0-plugins-ugly \
 gstreamer1.0-libav \
-gstreamer1.0-tools
+gstreamer1.0-tools \
+libspdlog-dev \
+scons
+
+# Install onnxruntime
+cd ~/deps
+git clone https://github.com/microsoft/onnxruntime.git -b v1.14.1 --recurse-submodules
+cd onnxruntime
+./build.sh --config Release --update --build --parallel --build_wheel --build_shared_lib --use_tensorrt --cuda_home /usr/local/cuda --cudnn_home /usr/lib/aarch64-linux-gnu --tensorrt_home /usr/lib/aarch64-linux-gnu
+cd build/Linux/Release
+sudo make install
+
+# Install ComputeLibrary
+cd ~/deps
+git clone --depth 1 -b v22.11 https://github.com/ARM-software/ComputeLibrary.git
+cd ComputeLibrary
+scons Werror=1 -j8 debug=0 neon=1 opencl=0 os=linux arch=armv8a build=native
+mv build lib
+
+# Install onDNN
+cd ~/deps
+git clone --depth 1 -b v3.2.1 https://github.com/oneapi-src/oneDNN.git
+cd oneDNN
+mkdir build
+cd build
+export ACL_ROOT_DIR=~/deps/ComputeLibrary
+cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-march=native -ffast-math" -DCMAKE_C_FLAGS="-march=native -ffast-math -DDNNL_AARCH64_USE_ACL=ON"
+cmake --build . -j4
+sudo cmake --install .
 ```

 ### M. Install Python Dependencies
@@ -464,6 +492,21 @@ sudo apt install -y \
 sudo -H pip3 install 'cython>=0.29.22,<0.30.0'
 sudo -H pip3 install -r ~/t-top_ws/src/t-top/tools/setup_scripts/files/requirements.txt

+#Install CTranslate2
+cd ~/deps
+git clone --depth 1 -b v3.20.0 https://github.com/OpenNMT/CTranslate2.git --recurse-submodule
+cd CTranslate2
+mkdir build
+cd build
+cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-march=native -ffast-math" -DCMAKE_C_FLAGS="-march=native -ffast-math" -DWITH_MKL=OFF -DWITH_CUDA=ON -DWITH_CUDNN=ON -DWITH_OPENBLAS=ON -DWITH_DNNL=ON -DWITH_RUY=ON
+cmake --build . -j4
+sudo cmake --install .
+sudo ldconfig
+cd ../python
+sudo -H pip3 install -r install_requirements.txt
+python3 setup.py bdist_wheel
+sudo -H pip3 install dist/*.whl --no-deps --force-reinstall
+
 # Install PyTorch for Jetson
 cd ~/deps
 wget https://developer.download.nvidia.com/compute/redist/jp/v50/pytorch/torch-1.12.0a0+2c916ef.nv22.3-cp38-cp38-linux_aarch64.whl
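
The Python-side steps in this hunk (the CTranslate2 wheel and the PyTorch wheel for Jetson) can be sanity-checked after installation. The following is a minimal sketch, not part of the commit: the helper name `find_missing_modules` is hypothetical, and the module names `ctranslate2` and `torch` are assumed from the upstream projects' usual import names. It only checks that the modules can be located; it does not verify CUDA support.

```python
import importlib.util

# Assumed import names for the wheels installed in this section (not stated in the commit).
EXPECTED_MODULES = ("ctranslate2", "torch")


def find_missing_modules(names=EXPECTED_MODULES):
    """Return the module names that cannot be located on the current Python path."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Treat lookup errors the same as "not found".
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing_modules()
    if missing:
        print("Missing Python modules:", ", ".join(missing))
    else:
        print("All expected Python modules were found.")
```

Running the script with the same `python3` interpreter used for `pip3` gives a quick indication of whether the wheels landed in the expected environment.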