use newer voicevox for supporting ちび式じい #537

Closed
mqcmd196 wants to merge 12 commits into jsk-ros-pkg:master from mqcmd196:PR/new-voicevox

@mqcmd196
Member

@mqcmd196 mqcmd196 commented Apr 24, 2025

close #535

@mqcmd196 mqcmd196 marked this pull request as ready for review April 24, 2025 08:13
@mqcmd196 mqcmd196 requested a review from iory April 24, 2025 08:13
@mqcmd196 mqcmd196 requested a review from sawada10 April 24, 2025 08:14
@mqcmd196
Member Author

@sawada10

Please try this.

@mqcmd196
Member Author

NOTE: The patch is not needed on Ubuntu 22.04, 24.04.

@sawada10
Contributor

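Thank you.
I ran it with the steps below, but I am getting errors.
I plan to debug it again more carefully tomorrow, but I am sharing the current state for now.

<Steps>

  1. Remove leftovers from previous runs
    In addition to running catkin clean voicevox, I assumed that everything listed in .gitignore is auto-generated, so I moved those files to my home directory for the time being.

  2. Build and start the launch files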

catkin build voicevox
roslaunch sound_play soundplay_node.launch
source ~/kashiwagi_ws/devel/setup.bash
roslaunch voicevox voicevox_texttospeech.launch

The following error appears:

$ roslaunch voicevox voicevox_texttospeech.launch
... logging to /home/leus/.ros/log/d877b714-2103-11f0-bcf4-4fc2871093a1/roslaunch-leus-ThinkPad-T480s-486405.log
Checking log directory for disk usage. This may take a while.
Press Ctrl-C to interrupt
Done checking log file disk usage. Usage is <1GB.

started roslaunch server http://leus-ThinkPad-T480s:43669/

SUMMARY
========

PARAMETERS
 * /rosdistro: noetic
 * /rosversion: 1.16.0
 * /sound_play_jp/default_voice: 2
 * /voicevox_server/cpu_num_threads: 1

NODES
  /
    sound_play_jp (sound_play/soundplay_node.py)
    voicevox_server (voicevox/server.py)

auto-starting new master
process[master]: started with pid [486420]
ROS_MASTER_URI=http://localhost:11311

setting /run_id to d877b714-2103-11f0-bcf4-4fc2871093a1
process[rosout-1]: started with pid [486437]
started core service [/rosout]
process[voicevox_server-2]: started with pid [486444]
process[sound_play_jp-3]: started with pid [486445]
/home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/lib
/opt/ros/noetic/lib/sound_play/soundplay_node.py:330: PyGIDeprecationWarning: Since version 3.11, calling threads_init is no longer needed. See: https://wiki.gnome.org/PyGObject/Threading
  GObject.threads_init()
/opt/ros/noetic/lib/sound_play/soundplay_node.py:331: PyGIDeprecationWarning: GObject.MainLoop is deprecated; use GLib.MainLoop instead
  self.g_loop = threading.Thread(target=GObject.MainLoop().run)
[rosrun] Couldn't find executable named run.py below /home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox
[rosrun] Found the following, but they're either not files,
[rosrun] or not executable:
[rosrun]   /home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/node_scripts/voicevox_engine/run.py
[rosrun]   /home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/node_scripts/run.py
[voicevox_server-2] process has finished cleanly
log file: /home/leus/.ros/log/d877b714-2103-11f0-bcf4-4fc2871093a1/voicevox_server-2*.log
[voicevox_server-2] restarting process
process[voicevox_server-2]: started with pid [486492]
[INFO] [1745496089.167511]: Loading from plugin definitions
/home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/lib
[INFO] [1745496089.677032]: sound_play node is ready to play sound
[rosrun] Couldn't find executable named run.py below /home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox
[rosrun] Found the following, but they're either not files,
[rosrun] or not executable:
[rosrun]   /home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/node_scripts/voicevox_engine/run.py
[rosrun]   /home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/node_scripts/run.py
[voicevox_server-2] process has finished cleanly
log file: /home/leus/.ros/log/d877b714-2103-11f0-bcf4-4fc2871093a1/voicevox_server-2*.log
[voicevox_server-2] restarting process
process[voicevox_server-2]: started with pid [486524]
/home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/lib
[rosrun] Couldn't find executable named run.py below /home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox
[rosrun] Found the following, but they're either not files,
[rosrun] or not executable:
[rosrun]   /home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/node_scripts/voicevox_engine/run.py
[rosrun]   /home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/node_scripts/run.py
[voicevox_server-2] process has finished cleanly
log file: /home/leus/.ros/log/d877b714-2103-11f0-bcf4-4fc2871093a1/voicevox_server-2*.log
[voicevox_server-2] restarting process
process[voicevox_server-2]: started with pid [486552]
/home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/lib
[rosrun] Couldn't find executable named run.py below /home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox
[rosrun] Found the following, but they're either not files,
[rosrun] or not executable:
[rosrun]   /home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/node_scripts/voicevox_engine/run.py
[rosrun]   /home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/node_scripts/run.py
[voicevox_server-2] process has finished cleanly
log file: /home/leus/.ros/log/d877b714-2103-11f0-bcf4-4fc2871093a1/voicevox_server-2*.log
[voicevox_server-2] restarting process
process[voicevox_server-2]: started with pid [486579]
^C/home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/lib
[sound_play_jp-3] killing on exit
[voicevox_server-2] killing on exit
Traceback (most recent call last):
  File "/home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/node_scripts/server.py", line 30, in <module>
    main()
  File "/home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/node_scripts/server.py", line 15, in main
    rospy.init_node("voicevox_server")
  File "/opt/ros/noetic/lib/python3/dist-packages/rospy/client.py", line 323, in init_node
    raise rospy.exceptions.ROSInitException("init_node interrupted before it could complete")
rospy.exceptions.ROSInitException: init_node interrupted before it could complete
[rosout-1] killing on exit
[master] killing on exit
shutting down processing monitor...
... shutting down processing monitor complete
done

  3. Change the file permissions
    Suspecting a file-permission problem with /home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/node_scripts/voicevox_engine/run.py and /home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/node_scripts/run.py, I ran the following:
chmod +x /home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/node_scripts/voicevox_engine/run.py
chmod +x /home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/node_scripts/run.py
  4. Start the launch file again
$ roslaunch voicevox voicevox_texttospeech.launch
... logging to /home/leus/.ros/log/02b927e2-2104-11f0-bcf4-4fc2871093a1/roslaunch-leus-ThinkPad-T480s-487161.log
Checking log directory for disk usage. This may take a while.
Press Ctrl-C to interrupt
Done checking log file disk usage. Usage is <1GB.

started roslaunch server http://leus-ThinkPad-T480s:36035/

SUMMARY
========

PARAMETERS
 * /rosdistro: noetic
 * /rosversion: 1.16.0
 * /sound_play_jp/default_voice: 2
 * /voicevox_server/cpu_num_threads: 1

NODES
  /
    sound_play_jp (sound_play/soundplay_node.py)
    voicevox_server (voicevox/server.py)

auto-starting new master
process[master]: started with pid [487176]
ROS_MASTER_URI=http://localhost:11311

setting /run_id to 02b927e2-2104-11f0-bcf4-4fc2871093a1
process[rosout-1]: started with pid [487193]
started core service [/rosout]
process[voicevox_server-2]: started with pid [487200]
process[sound_play_jp-3]: started with pid [487201]
/home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/lib
/opt/ros/noetic/lib/sound_play/soundplay_node.py:330: PyGIDeprecationWarning: Since version 3.11, calling threads_init is no longer needed. See: https://wiki.gnome.org/PyGObject/Threading
  GObject.threads_init()
/opt/ros/noetic/lib/sound_play/soundplay_node.py:331: PyGIDeprecationWarning: GObject.MainLoop is deprecated; use GLib.MainLoop instead
  self.g_loop = threading.Thread(target=GObject.MainLoop().run)
[rosrun] You have chosen a non-unique executable, please pick one of the following:
1) /home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/node_scripts/voicevox_engine/run.py
2) /home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/node_scripts/run.py
3) /home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/node_scripts/voicevox_engine/run.py
4) /home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/node_scripts/run.py
#? [INFO] [1745496159.660435]: Loading from plugin definitions
[INFO] [1745496160.167585]: sound_play node is ready to play sound
[Text2Wave] Speak using voice_name (四国めたん-あまあま)..
[Text2Wave] Using cached sound file (/home/leus/.ros/voicevox_texttospeech/cache/c0e89a293bd36c7a768e4e9d2c5475a8--0.wav) for こんにちは
[Text2Wave] Speak using voice_name (四国めたん-あまあま)..
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 159, in _new_conn
    conn = connection.create_connection(
  File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 84, in create_connection
    raise err
  File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 74, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 666, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 388, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.8/http/client.py", line 1256, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1302, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1251, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1011, in _send_output
    self.send(msg)
  File "/usr/lib/python3.8/http/client.py", line 951, in send
    self.connect()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 187, in connect
    conn = self._new_conn()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 171, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f94122c4970>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 720, in urlopen
    retries = retries.increment(
  File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 438, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=50021): Max retries exceeded with url: /audio_query?speaker=0&text=%E3%81%93%E3%82%93%E3%81%B0%E3%82%93%E3%81%AF (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f94122c4970>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/bin/text2wave", line 148, in <module>
    request_synthesis(speech_text,
  File "/home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/bin/text2wave", line 111, in request_synthesis
    response = requests.post(url, headers=headers,
  File "/usr/lib/python3/dist-packages/requests/api.py", line 116, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 535, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 648, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=50021): Max retries exceeded with url: /audio_query?speaker=0&text=%E3%81%93%E3%82%93%E3%81%B0%E3%82%93%E3%81%AF (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f94122c4970>: Failed to establish a new connection: [Errno 111] Connection refused'))
[ERROR] [1745496216.473435]: Sound synthesis failed.Is festival installed?Is a festival voice installed?Try running "rosdep satisfy sound_play|sh".Refer to http://wiki.ros.org/sound_play/Troubleshooting
[ERROR] [1745496216.475280]: Failed to generate wavfile.
[ERROR] [1745496216.476659]: Exception in callback: 'こんばんは---四国めたん-あまあま'
[INFO] [1745496216.477999]: Traceback (most recent call last):
  File "/opt/ros/noetic/lib/sound_play/soundplay_node.py", line 192, in callback
    sound = self.select_sound(data)
  File "/opt/ros/noetic/lib/sound_play/soundplay_node.py", line 162, in select_sound
    sound = self.voicesounds[voice_key]
KeyError: 'こんばんは---四国めたん-あまあま'

(The same thing happens when I try ちび式じい-ノーマル.)

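  5. Check the connection
    Since an HTTP connection error is reported, I checked whether the voicevox server is really unreachable.
$ curl http://127.0.0.1:50021/speakers
curl: (7) Failed to connect to 127.0.0.1 port 50021: 接続を拒否されました

=> This confirms that the connection cannot be established.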

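  6. Revert the commit and run again
    Suspecting that I had exceeded an access limit on the server, or that the network was bad, I went back to Yanokura-sensei's branch, which had been working, to see what would happen.
    As before, I ran catkin clean voicevox, deleted the files listed in .gitignore, built, sourced, and launched again. As shown below, it works without problems.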
$ roslaunch voicevox voicevox_texttospeech.launch 
... logging to /home/leus/.ros/log/7230b6a0-2102-11f0-bcf4-4fc2871093a1/roslaunch-leus-ThinkPad-T480s-483014.log
Checking log directory for disk usage. This may take a while.
Press Ctrl-C to interrupt
Done checking log file disk usage. Usage is <1GB.

started roslaunch server http://leus-ThinkPad-T480s:46451/

SUMMARY
========

PARAMETERS
 * /rosdistro: noetic
 * /rosversion: 1.16.0
 * /sound_play_jp/default_voice: 2
 * /voicevox_server/cpu_num_threads: 1

NODES
  /
    sound_play_jp (sound_play/soundplay_node.py)
    voicevox_server (voicevox/server.py)

auto-starting new master
process[master]: started with pid [483029]
ROS_MASTER_URI=http://localhost:11311

setting /run_id to 7230b6a0-2102-11f0-bcf4-4fc2871093a1
process[rosout-1]: started with pid [483046]
started core service [/rosout]
process[voicevox_server-2]: started with pid [483053]
process[sound_play_jp-3]: started with pid [483054]
/home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/node_scripts/server.py:9: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
  import imp
/opt/ros/noetic/lib/sound_play/soundplay_node.py:330: PyGIDeprecationWarning: Since version 3.11, calling threads_init is no longer needed. See: https://wiki.gnome.org/PyGObject/Threading
  GObject.threads_init()
/opt/ros/noetic/lib/sound_play/soundplay_node.py:331: PyGIDeprecationWarning: GObject.MainLoop is deprecated; use GLib.MainLoop instead
  self.g_loop = threading.Thread(target=GObject.MainLoop().run)
[INFO] [1745495487.805842]: Loading from plugin definitions
[INFO] [1745495488.314096]: sound_play node is ready to play sound
/home/leus/kashiwagi_ws/src/jsk_3rdparty/3rdparty/voicevox/node_scripts/server.py:126: DeprecationWarning: 
        on_event is deprecated, use lifespan event handlers instead.

        Read more about it in the
        [FastAPI docs for Lifespan Events](https://fastapi.tiangolo.com/advanced/events/).
        
  @app.on_event("startup")
INFO:     Started server process [483053]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:50021 (Press CTRL+C to quit)
INFO:     127.0.0.1:52928 - "GET /speakers HTTP/1.1" 200 OK
[Text2Wave] Speak using voice_name (四国めたん-あまあま)..
INFO:     127.0.0.1:58726 - "POST /audio_query?speaker=0&text=%E3%81%93%E3%82%93%E3%81%AB%E3%81%A1%E3%81%AF HTTP/1.1" 200 OK
INFO:     127.0.0.1:40414 - "POST /synthesis?speaker=0&text=%E3%81%93%E3%82%93%E3%81%AB%E3%81%A1%E3%81%AF HTTP/1.1" 200 OK
^C[sound_play_jp-3] killing on exit
[voicevox_server-2] killing on exit
INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [483053]
[rosout-1] killing on exit
[master] killing on exit
shutting down processing monitor...
... shutting down processing monitor complete
done
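
The two failures reported in the steps above (rosrun finding run.py files that are not executable, and the connection to port 50021 being refused) can be checked without starting the launch file. The following is a minimal sketch using only the Python standard library. The function names `find_candidates` and `port_is_open` are hypothetical helpers introduced for illustration and are not part of the voicevox package; the default port is the one that appears in the logs.

```python
import os
import socket


def find_candidates(package_dir, name="run.py"):
    """Return sorted (path, is_executable) pairs for files called `name`.

    rosrun refuses to start a file that lacks the executable bit, so
    every candidate is reported together with its permission state.
    """
    found = []
    for root, _dirs, files in os.walk(package_dir):
        if name in files:
            path = os.path.join(root, name)
            found.append((path, os.access(path, os.X_OK)))
    return sorted(found)


def port_is_open(host="127.0.0.1", port=50021, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused" as well as timeouts.
        return False
```

Calling `find_candidates` on the package directory lists every run.py that rosrun could pick up, which also makes the "non-unique executable" situation visible. `port_is_open()` returning False corresponds to the refused curl request in step 5.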

@k-okada
Member

k-okada commented Apr 24, 2025

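@mqcmd196 @iory Watching this, I was thinking that it looks like quite a lot of work every time a new character we want to use comes up. Wouldn't something like
https://github.com/k-okada/jsk_3rdparty/tree/c5c9a1fea57a5c9fe850bfe683f385c4317ab07c/3rdparty/voicevox3

work?
If we bring in a voicevox package from the period when Python 3.8 was still supported, I feel it should work smoothly with only the cmake<4.0 fix.
With this, after starting launch/voicevox_texttospeech.launch,
client.say('こんにちは', voice='42') seems to give the ちび式じい voice. What do you think?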

The list of speakers is available with rosrun voicevox3 list_speakers.py.
String specifications such as voice='四国めたん-あまあま' are not supported yet.

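I don't understand this very well, so a few questions:
Is the original node_scripts/server.py equivalent to running voicevox_engine/run.py?
Is Makefile.model for GPU use? The CPU seems to work fine, but is it needed when the text is long?
As for Makefile.open_jtalk_dic, running run.py seems to download the dictionary internally. Is the Makefile meant to speed things up, or is it needed for cases such as the AquesTalk-style notation mentioned in the documentation?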

@iory
Member

iory commented Apr 24, 2025

Is the original node_scripts/server.py equivalent to running voicevox_engine/run.py?

Yes. It was taken almost entirely from run.py, and since we only need to send requests to run.py, I think this approach is fine.
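
As an illustration of sending requests to the engine, the log of the working branch earlier in this thread shows two POST requests per utterance: `/audio_query` followed by `/synthesis`, both with `speaker` and `text` query parameters. The following is a minimal sketch of that flow with the Python standard library. The names `build_urls` and `synthesize` are hypothetical. That `/synthesis` takes the JSON returned by `/audio_query` as its request body and returns WAV data is an assumption about the engine API and is not shown in this thread.

```python
import urllib.parse
import urllib.request

ENGINE = "http://localhost:50021"  # host and port as they appear in the logs


def build_urls(text, speaker, base=ENGINE):
    """Build the /audio_query and /synthesis URLs seen in the server log."""
    query = urllib.parse.urlencode({"speaker": speaker, "text": text})
    return ("{}/audio_query?{}".format(base, query),
            "{}/synthesis?{}".format(base, query))


def synthesize(text, speaker, base=ENGINE, timeout=10.0):
    """Return synthesized audio bytes (assumed to be WAV data)."""
    audio_query_url, synthesis_url = build_urls(text, speaker, base)
    # Step 1: ask the engine for the synthesis parameters (a JSON document).
    req = urllib.request.Request(audio_query_url, data=b"", method="POST")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        audio_query = resp.read()
    # Step 2 (assumption): post that JSON back unchanged to get the audio.
    req = urllib.request.Request(
        synthesis_url,
        data=audio_query,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()
```

`build_urls('こんにちは', 0)` reproduces the percent-encoded paths in the log. `synthesize` raises `urllib.error.URLError` when the server is not running, which corresponds to the connection-refused traceback above.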

As for Makefile.open_jtalk_dic, running run.py seems to download the dictionary internally.

The open_jtalk dictionary is required for morphological analysis in the first place, so this downloads it in advance.

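Is Makefile.model for GPU use? The CPU seems to work fine, but is it needed when the text is long?

This was for downloading onnxruntime.so, which older versions of voicevox did not include. In the current version it is included in the release zip, so it is no longer needed.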

String specifications such as voice='四国めたん-あまあま' are not supported yet.

It looks like we only need to add support for this separately.
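
One possible shape for that string support is a lookup table from 'speaker-style' strings to numeric ids. The following is a minimal sketch, not the implementation that was eventually merged. It assumes the speaker list is a list of objects with a `name` and a list of `styles`, each style having a `name` and an `id`. That shape, and the helper names `build_voice_table` and `resolve_voice`, are assumptions made for illustration. The sample ids are the ones mentioned in this thread (speaker=0 for 四国めたん-あまあま in the log, voice='42' for ちび式じい).

```python
def build_voice_table(speakers):
    """Map 'speaker-style' strings to style ids.

    `speakers` is assumed to look like
    [{"name": ..., "styles": [{"name": ..., "id": ...}, ...]}, ...].
    """
    table = {}
    for speaker in speakers:
        for style in speaker["styles"]:
            key = "{}-{}".format(speaker["name"], style["name"])
            table[key] = style["id"]
    return table


def resolve_voice(voice, table):
    """Accept either a numeric string such as '42' or a 'speaker-style' string."""
    if voice.isdigit():
        return int(voice)
    return table[voice]  # raises KeyError for an unknown voice name


# Sample data for illustration only; ids are taken from this thread.
SAMPLE_SPEAKERS = [
    {"name": "四国めたん", "styles": [{"name": "あまあま", "id": 0}]},
    {"name": "ちび式じい", "styles": [{"name": "ノーマル", "id": 42}]},
]
```

With this table, both `voice='42'` and `voice='ちび式じい-ノーマル'` resolve to the same id, so existing numeric callers keep working.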

@mqcmd196
Member Author

mqcmd196 commented Apr 25, 2025

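At first I tried to fix it that way, but after looking at https://github.com/VOICEVOX/voicevox_engine/releases/tag/0.18.0 I thought it was impossible, because ちび式じい appeared to be available only from 0.18.0, which requires Python 3.11. Looking at it again now, that entry seems to be about a feature called humming. The voice itself had apparently been included since https://github.com/VOICEVOX/voicevox_engine/releases/tag/0.13.3 ... I should have looked more carefully.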

@mqcmd196
Member Author

https://github.com/k-okada/jsk_3rdparty/tree/c5c9a1fea57a5c9fe850bfe683f385c4317ab07c/3rdparty/voicevox3

Wouldn't something like this work?

I confirmed that it works. To make it work, the old voicevox has to be removed first, for example with catkin clean voicevox.

@mqcmd196 mqcmd196 mentioned this pull request Apr 25, 2025
@sawada10
Contributor

https://github.com/k-okada/jsk_3rdparty/tree/c5c9a1fea57a5c9fe850bfe683f385c4317ab07c/3rdparty/voicevox3
Wouldn't something like this work?

Thank you.
It worked in my environment as well.

@mqcmd196
Member Author

close via #538

@mqcmd196 mqcmd196 closed this Apr 28, 2025
Successfully merging this pull request may close these issues.

I want to increase the number of character voices available in voicevox (voicevoxで使えるキャラクターの声を増やしたい)