Collaborator

@MattEqualsCoder MattEqualsCoder commented Mar 28, 2025

I still need to proofread the code here myself, but I wanted to finally get a PR out. The goal was to create an application and NuGet package that could be used as closely as possible to the setup we had before. I basically created classes that mirror what we had before for building grammar, then added a way to build that into native C# System.Speech grammar for Windows users.

For Linux users, this uses the PySpeechService application I created, a Python application that communicates back and forth with the tracker over gRPC. It uses Piper for text to speech and Vosk for speech recognition.

Still have some things I want to do before merging in:

  • Add SMZ3 documentation for Linux users
  • Review and potentially clean up code
  • Remove temp changes to the GitHub action
  • Test on Fedora and clean Arch & Linux Mint installs to verify setup steps
  • Create config PR for the speech replacements
  • Probably need to clean up some warnings

Vivelin previously approved these changes Mar 28, 2025
</ItemGroup>

<ItemGroup>
<PackageReference Include="PySpeechServiceClient" Version="0.1.0" />
Member

If this is only available on Linux, maybe we should have a compiler symbol for it (if one isn't built in already) and surround the code where it's used with that symbol. Then we can avoid adding the dependency on Windows entirely, since it would never be used there anyway.
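
One way this suggestion could look in the project file is a conditional package reference. This is only a sketch, not what the PR does: the `Condition` uses MSBuild's `IsOSPlatform` property function, which checks the OS of the machine running the build rather than the OS the output will run on, and it assumes nothing built on Windows needs types from the package.

```xml
<ItemGroup>
  <!-- Assumption: only reference the package when the build runs on Linux. -->
  <PackageReference Include="PySpeechServiceClient" Version="0.1.0"
                    Condition="$([MSBuild]::IsOSPlatform('Linux'))" />
</ItemGroup>
```

Any code that uses the package would then also need to be excluded from non-Linux builds, for example behind a compiler symbol.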

Collaborator Author

Well, it is used for the grammar construction even on Windows, but I could possibly pull that out into a separate project/repo so that just the actual services themselves can be done that way.

That being said, I do plan on making a Windows version just for possible testing for pronunciations if desired for Pink's streams. It's just a lower priority.

Collaborator Author

I split it out so that grammar building is in its own NuGet package and can be used on Windows, whereas the actual services are Linux only. I marked all of the functions as Linux only for now, until I get it built for Mac and/or Windows. However, I was having issues getting OS-dependent compiler constants working so that the built code is actually different per OS. Might give it another go.
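
For reference, a common approach to OS-dependent compiler constants is to append to `DefineConstants` behind an MSBuild condition. A minimal sketch, where the `LINUX` symbol name is hypothetical and not something this PR defines. The condition reflects the OS of the build machine, so a Linux build produced on a Windows machine would not get the symbol.

```xml
<PropertyGroup Condition="$([MSBuild]::IsOSPlatform('Linux'))">
  <!-- LINUX is a hypothetical symbol name; C# code could then use #if LINUX ... #endif -->
  <DefineConstants>$(DefineConstants);LINUX</DefineConstants>
</PropertyGroup>
```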

Collaborator Author

Okay, I think I've got everything sorted out the best I can for right now. I want to do some end-to-end testing with SMZ3 on various OSes, and I want to do a bit of testing just to make sure this didn't break Windows at all. Once that's done, I think this is ready to merge in and have a new build created.

CPColin previously approved these changes Mar 31, 2025
Collaborator Author

Okay, I think this one is about as done as it's going to be for now. I tested SMZ3 with it on Linux Mint, Fedora, and Arch. This should also fix the issue Pink ran into with items not being tracked together, and it fixes a potential issue that might come up with two identical messages back to back.

@MattEqualsCoder MattEqualsCoder merged commit e9b6dd8 into main Apr 2, 2025
2 checks passed
@MattEqualsCoder MattEqualsCoder deleted the linux-speech-recognition branch April 2, 2025 03:30
4 participants