Add Emotion Analyzer (new) #527

Merged
k-okada merged 18 commits into jsk-ros-pkg:master from ayaha-n:emotion_analyzer_new
May 10, 2025

Conversation

@ayaha-n
Contributor

@ayaha-n ayaha-n commented Apr 14, 2025

@mqcmd196
Sorry, it seems that eye_status and other unrelated changes were included. I have opened a new pull request, so I would appreciate it if you could review it.

I created an emotion_analyzer that uses Hume AI. Usage is described in the README; it supports:

  • text to emotion
  • audio (wav file) to emotion
  • audio (recorded from /audio) to emotion

However, the audio (record from /audio) to emotion case behaves a bit strangely. When I use a ReSpeaker and run

roslaunch emotion_analyzer emotion_analyzer.launch api_key:=<api_key>
roslaunch emotion_analyzer capture.launch 
rosservice call /analyze_audio "audio_file: ''"

I get

result: "{\"prosody\": null, \"burst\": [{\"name\": \"Admiration\", \"score\": 0.04438357800245285},\
  \ {\"name\": \"Adoration\", \"score\": 0.03663531318306923}, {\"name\": \"Aesthetic\
  \ Appreciation\", \"score\": 0.0275471992790699}, {\"name\": \"Amusement\", \"score\"\
  : 0.15527257323265076}, {\"name\": \"Anger\", \"score\": 0.01609768718481064}, {\"\
  name\": \"Anxiety\", \"score\": 0.09838512539863586}, {\"name\": \"Awe\", \"score\"\
  : 0.0516996867954731}, {\"name\": \"Awkwardness\", \"score\": 0.17967936396598816},\
  \ {\"name\": \"Boredom\", \"score\": 0.18468798696994781}, {\"name\": \"Calmness\"\
  , \"score\": 0.11194272339344025}, {\"name\": \"Concentration\", \"score\": 0.039446331560611725},\
  \ {\"name\": \"Contemplation\", \"score\": 0.07749783992767334}, {\"name\": \"Confusion\"\
  , \"score\": 0.07849768549203873}, {\"name\": \"Contempt\", \"score\": 0.10015680640935898},\
  \ {\"name\": \"Contentment\", \"score\": 0.07816632837057114}, {\"name\": \"Craving\"\
  , \"score\": 0.043796356767416}, {\"name\": \"Determination\", \"score\": 0.022754769772291183},\
  \ {\"name\": \"Disappointment\", \"score\": 0.15245404839515686}, {\"name\": \"\
  Disgust\", \"score\": 0.0362621434032917}, {\"name\": \"Distress\", \"score\": 0.16373476386070251},\
  \ {\"name\": \"Doubt\", \"score\": 0.11785085499286652}, {\"name\": \"Ecstasy\"\
  , \"score\": 0.1383388191461563}, {\"name\": \"Embarrassment\", \"score\": 0.10717830061912537},\
  \ {\"name\": \"Empathic Pain\", \"score\": 0.0768926814198494}, {\"name\": \"Entrancement\"\
  , \"score\": 0.040821533650159836}, {\"name\": \"Envy\", \"score\": 0.021487215533852577},\
  \ {\"name\": \"Excitement\", \"score\": 0.07412480562925339}, {\"name\": \"Fear\"\
  , \"score\": 0.06701570749282837}, {\"name\": \"Guilt\", \"score\": 0.03692568093538284},\
  \ {\"name\": \"Horror\", \"score\": 0.039663150906562805}, {\"name\": \"Interest\"\
  , \"score\": 0.09799767285585403}, {\"name\": \"Joy\", \"score\": 0.13371771574020386},\
  \ {\"name\": \"Love\", \"score\": 0.06643084436655045}, {\"name\": \"Nostalgia\"\
  , \"score\": 0.045610249042510986}, {\"name\": \"Pain\", \"score\": 0.1008995845913887},\
  \ {\"name\": \"Pride\", \"score\": 0.034173380583524704}, {\"name\": \"Realization\"\
  , \"score\": 0.07668226957321167}, {\"name\": \"Relief\", \"score\": 0.10585669428110123},\
  \ {\"name\": \"Romance\", \"score\": 0.0844399556517601}, {\"name\": \"Sadness\"\
  , \"score\": 0.08523412048816681}, {\"name\": \"Satisfaction\", \"score\": 0.2191394865512848},\
  \ {\"name\": \"Desire\", \"score\": 0.14677052199840546}, {\"name\": \"Shame\",\
  \ \"score\": 0.07419771701097488}, {\"name\": \"Surprise (negative)\", \"score\"\
  : 0.020901966840028763}, {\"name\": \"Surprise (positive)\", \"score\": 0.038737643510103226},\
  \ {\"name\": \"Sympathy\", \"score\": 0.042055580765008926}, {\"name\": \"Tiredness\"\
  , \"score\": 0.17382484674453735}, {\"name\": \"Triumph\", \"score\": 0.03647517040371895}]}"
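For reference, the `result` field is a JSON string, so the top-scoring emotions can be pulled out with a few lines of stdlib Python. This is just an illustrative sketch that parses a shortened stand-in for the output above; it is not part of the package:

```python
import json

# A shortened stand-in for the `result` string returned by /analyze_audio.
result = ('{"prosody": null, "burst": ['
          '{"name": "Satisfaction", "score": 0.2191394865512848}, '
          '{"name": "Boredom", "score": 0.18468798696994781}, '
          '{"name": "Awkwardness", "score": 0.17967936396598816}]}')

data = json.loads(result)
# "burst" may be null when analysis fails (as in the built-in-mic case below),
# so fall back to an empty list before sorting.
emotions = data.get("burst") or []
top = sorted(emotions, key=lambda e: e["score"], reverse=True)[:3]
for e in top:
    print(f'{e["name"]}: {e["score"]:.3f}')
```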

so recording → analysis appears to work. However, when I try to use the PC's built-in microphone with

roslaunch emotion_analyzer emotion_analyzer.launch api_key:=<api_key>
roslaunch emotion_analyzer capture.launch device:=hw:0,6 channels:=2 sample_rate:=48000
rosservice call /analyze_audio "audio_file: ''"

I get

result: "{\"prosody\": null, \"burst\": null}"

and when I check the saved recording at /home/leus/tmp/hoge.wav, the file either cannot be played back, or, if it can, it sounds like noise that differs from what I actually said.
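One way to narrow down whether the capture or the analysis is at fault is to inspect the saved file's header with Python's stdlib `wave` module and compare it against the channels/sample rate passed to capture.launch. A minimal sketch (the helper name is hypothetical; the path in the comment is the one mentioned above):

```python
import wave

def inspect_wav(path):
    """Print the header parameters of a WAV file, to spot mismatches
    (e.g. a file captured with the wrong channel count or sample rate)."""
    with wave.open(path, "rb") as w:
        print(f"channels:     {w.getnchannels()}")
        print(f"sample rate:  {w.getframerate()} Hz")
        print(f"sample width: {w.getsampwidth() * 8} bit")
        print(f"duration:     {w.getnframes() / w.getframerate():.2f} s")

# e.g. inspect_wav("/home/leus/tmp/hoge.wav")
```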

Running arecord -l gives

**** List of CAPTURE Hardware Devices ****
card 0: sofhdadsp [sof-hda-dsp], device 0: HDA Analog (*) []
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 0: sofhdadsp [sof-hda-dsp], device 1: HDA Digital (*) []
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 0: sofhdadsp [sof-hda-dsp], device 6: DMIC (*) []
  Subdevices: 0/1
  Subdevice #0: subdevice #0
card 0: sofhdadsp [sof-hda-dsp], device 7: DMIC16kHz (*) []
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 1: ArrayUAC10 [ReSpeaker 4 Mic Array (UAC1.0)], device 0: USB Audio [USB Audio]
  Subdevices: 0/1
  Subdevice #0: subdevice #0

With devices other than (1,0) and (0,6) (and sometimes even with (0,6)), running capture.launch fails:

... logging to /home/leus/.ros/log/603feb96-1919-11f0-a47f-6f63903075d1/roslaunch-leus-ThinkPad-P16s-Gen-2-18937.log
Checking log directory for disk usage. This may take a while.
Press Ctrl-C to interrupt
Done checking log file disk usage. Usage is <1GB.

started roslaunch server http://leus-ThinkPad-P16s-Gen-2:40087/

SUMMARY
========

PARAMETERS
 * /audio_capture/bitrate: 128
 * /audio_capture/channels: 2
 * /audio_capture/depth: 16
 * /audio_capture/device: hw:0,7
 * /audio_capture/dst: appsink
 * /audio_capture/format: wave
 * /audio_capture/sample_format: S16LE
 * /audio_capture/sample_rate: 48000
 * /rosdistro: noetic
 * /rosversion: 1.16.0

NODES
  /
    audio_capture (audio_capture/audio_capture)

ROS_MASTER_URI=http://localhost:11311

process[audio_capture-1]: started with pid [18966]
[ERROR] [1744625746.458648274]: gstreamer: Internal data stream error.
[audio_capture-1] process has died [pid 18966, exit code 1, cmd /opt/ros/noetic/lib/audio_capture/audio_capture audio:=audio __name:=audio_capture __log:=/home/leus/.ros/log/603feb96-1919-11f0-a47f-6f63903075d1/audio_capture-1.log].
log file: /home/leus/.ros/log/603feb96-1919-11f0-a47f-6f63903075d1/audio_capture-1*.log
all processes on machine have died, roslaunch will exit
shutting down processing monitor...
... shutting down processing monitor complete
done

So the audio_capture node dies with a gstreamer error, as shown above.

@mqcmd196 mqcmd196 mentioned this pull request Apr 15, 2025
Member

@mqcmd196 mqcmd196 left a comment


Nice feature.

  1. Please check ayaha-n#3 and merge it if you don't have any problems.
  2. Please check and fix your code following the comments in each line.

@mqcmd196
Member

Waiting for some rosdep keys to be merged

@mqcmd196
Member

@a-ichikura @sawada10
Please check that this specification meets your usage.

@ayaha-n
Contributor Author

ayaha-n commented Apr 16, 2025

@iory @mqcmd196 Thanks for your comments. I fixed it:

  • delete unnecessary comments
  • translate comments from Japanese to English
  • check audio format
  • warn if there is no audio data

I further have to do these:

  • check if all the comments are written in English
  • update the README to note that the audio format must be set: roslaunch audio_capture capture.launch format:=wave
  • check the sample launch again
  • check the case using ReSpeaker

@mqcmd196
Member

I'm sorry, but I edited a rosdep key incorrectly. This patch should fix the issue.

@sawada10
Contributor

@a-ichikura @sawada10 Please check if this specification meets your needs.

Sorry for the late reply, and thank you for all the maintenance work.

@ayaha-n As you mentioned during today's lab meeting, is the current issue that when voice information is used as input, the analysis is based on the audio from two seconds earlier (i.e., it's not real-time)?

@a-ichikura and I used it during the mamoru experiment, but since we were using text data at the time, I don't think that specific use case is particularly relevant here.
Looking ahead, I’m vaguely thinking it would be useful if the robot could analyze the state of the person it’s talking to and adjust its conversation or interaction style accordingly.
Since it’s unlikely that someone’s emotions would change drastically every two seconds, I think the current implementation is still quite usable.

@ayaha-n
Contributor Author

ayaha-n commented Apr 22, 2025

@sawada10 Thanks for your reply.

@ayaha-n As you mentioned during today's lab meeting, is the current issue that when voice information is used as input, the analysis is based on the audio from two seconds earlier (i.e., it's not real-time)?

Yes. I initially thought it would be better to start analyzing audio only after the request, but when you want to analyze audio from a microphone, it is unlikely that you would have only 2 seconds of audio; in that case a streaming style would be used instead. So the present implementation, which analyzes the audio after the request AND the 2 seconds before it, is fine.
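The behavior described here (keeping a short pre-roll of audio so a request can also cover the moments just before it) can be sketched with a bounded buffer. All names and parameters below are illustrative, not the package's actual implementation:

```python
from collections import deque

class RollingAudioBuffer:
    """Keep only the most recent pre_roll_sec seconds of audio samples."""

    def __init__(self, sample_rate, pre_roll_sec):
        # deque with maxlen silently discards the oldest samples on overflow.
        self.buf = deque(maxlen=sample_rate * pre_roll_sec)

    def on_audio(self, samples):
        # Called for each incoming audio chunk (e.g. from the /audio topic).
        self.buf.extend(samples)

    def snapshot(self):
        # At request time, return the pre-roll audio accumulated so far.
        return list(self.buf)

# Tiny rates for illustration: 4 samples/s, 2 s of pre-roll = 8 samples kept.
buf = RollingAudioBuffer(sample_rate=4, pre_roll_sec=2)
for chunk_start in range(0, 20, 4):
    buf.on_audio(range(chunk_start, chunk_start + 4))
print(len(buf.snapshot()))  # 8 -- only the last 2 s of samples are retained
```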

@mqcmd196
Member

@sawada10 @a-ichikura
Finally, please ensure that this package meets the requirements of your application. If you need any new features, please send a PR to this repository to enhance the emotion_analyzer package.

@a-ichikura
Contributor

I checked the text_to_emotion function, and it exactly meets our needs.
Thank you for developing!

Member

@mqcmd196 mqcmd196 left a comment


CI green. LGTM

@mqcmd196 mqcmd196 requested a review from k-okada May 1, 2025 15:28
@k-okada k-okada merged commit 3bd763d into jsk-ros-pkg:master May 10, 2025
16 checks passed
