
modernize: dont write audio to tmp file #45

Merged
JarbasAl merged 3 commits into dev from modernize
Jan 9, 2026

Conversation

@JarbasAl (Member) commented Jan 9, 2026

don't write audio to a tmp file
drop the dependency on the speech_recognition package

Summary by CodeRabbit

  • New Features

    • STT endpoint accepts sample_rate and sample_width query parameters for flexible audio handling.
    • Transcribe function now accepts optional sample_rate (default 16000) and sample_width (default 2).
  • Chores

    • Server implementation updated to FastAPI.
    • Dependency constraints for plugin manager updated.
    • Deprecated Python version classifiers removed.
  • Deprecations

    • Gradio interface initialization now emits a deprecation warning.
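As a rough illustration of the new query parameters, a client could build the request URL as sketched below. The `/stt` path and the `sample_rate` and `sample_width` parameter names come from this PR; the helper name `build_stt_url` and the host and port are assumptions made only for this example.

```python
from urllib.parse import urlencode


def build_stt_url(base_url: str, sample_rate: int = 16000, sample_width: int = 2) -> str:
    """Return the /stt URL with the raw audio format passed as query parameters.

    build_stt_url is a hypothetical helper, not part of ovos_stt_http_server.
    The defaults mirror the ones named in the summary (16000 Hz, 2 bytes).
    """
    query = urlencode({"sample_rate": sample_rate, "sample_width": sample_width})
    return f"{base_url.rstrip('/')}/stt?{query}"


# Hypothetical host and port; the request body would carry the raw audio bytes.
url = build_stt_url("http://localhost:8080", sample_rate=8000, sample_width=2)
```

The raw audio bytes would then be sent as the request body to that URL.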


coderabbitai bot (Contributor) commented Jan 9, 2026

Warning

Rate limit exceeded

@JarbasAl has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 20 minutes and 15 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between d1adb73 and 2e933b5.

📒 Files selected for processing (2)
  • ovos_stt_http_server/__init__.py
  • ovos_stt_http_server/gradio_app.py
📝 Walkthrough

Walkthrough

Refactors audio handling to construct AudioData directly from request payloads and replaces internal bytes2audiodata usage with AudioData imports; updates transcribe signature and Gradio deprecation warning; relaxes ovos-plugin-manager version constraint; and updates package metadata to reference FastAPI and remove old classifiers.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Audio data handling & endpoints (ovos_stt_http_server/__init__.py, ovos_stt_http_server/gradio_app.py) | Removed internal bytes2audiodata; use AudioData from ovos_plugin_manager.utils.audio to build audio from raw request bytes. /stt now reads sr/sw (sample rate/width) and constructs AudioData; transcribe signature extended with sample_rate and sample_width. Gradio binding emits a deprecation warning. |
| Dependency constraint (requirements/requirements.txt) | Bumped ovos-plugin-manager constraint from >=2.1.0,<2.2.0 to >=2.1.1,<3.0.0. |
| Package metadata (setup.py) | Updated project description to reference FastAPI; removed obsolete Python version classifiers and the linguistic topic classifier. |
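To show why the sample rate and sample width have to travel with the raw bytes, here is a minimal sketch. `RawAudio` is a hypothetical stand-in written for this example, not the real `AudioData` class; it only borrows the argument order `(bytes, sample_rate, sample_width)` quoted in the review, and it assumes mono PCM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RawAudio:
    """Hypothetical stand-in for AudioData: raw PCM bytes plus their format."""

    frame_data: bytes
    sample_rate: int   # samples per second
    sample_width: int  # bytes per sample

    def duration_seconds(self) -> float:
        # Assumption: mono audio, so one sample per frame.
        return len(self.frame_data) / (self.sample_rate * self.sample_width)


# One second of silence at 16 kHz with 2-byte samples is 32000 bytes.
one_second = bytes(16000 * 2)
audio = RawAudio(one_second, sample_rate=16000, sample_width=2)
```

The same 32000 bytes read as 8 kHz audio would last twice as long, which is why a wrong or missing format parameter changes how the payload is interpreted.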

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I munched some bytes, then hopped with glee,

Wrapped them in AudioData, neat as can be.
No temp-file crumbs or conversion maze,
Faster paths now brighten my days —
A little rabbit's coding praise!

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly matches the primary change: removing the bytes2audiodata function that wrote audio to temporary files and replacing it with direct AudioData construction. |


coderabbitai bot (Contributor) left a comment

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In @ovos_stt_http_server/__init__.py:
- Around line 125-126: The query params `sample_rate` and `sample_width` are
being read as strings; convert them to integers before creating `AudioData` by
wrapping the `request.query_params.get("sample_rate", 16000)` and
`request.query_params.get("sample_width", 2)` calls with `int(...)` (or parse
and fall back to the defaults if parsing fails) so `sr` and `sw` are ints when
passed to `AudioData`.
- Around line 128-131: The code constructs an AudioData object named audio but
then calls model.detect_language with raw audio_bytes, losing sample rate/width
info; change calls to model.detect_language(audio) (and the second occurrence
later near the other call) so detect_language receives the AudioData instance
consistent with model.process_audio which expects AudioData.

In @requirements/requirements.txt:
- Line 4: Update the pinned Gradio dependency in requirements.txt from
gradio~=3.28 to a patched release (e.g., gradio==5.39.0 or at minimum
gradio>=4.26.0) to remediate multiple CVEs; change the line with "gradio~=3.28"
to the chosen safe version, then regenerate any lockfiles or dependency pins
(pip-compile/Pipfile.lock/poetry.lock) and run tests/build to ensure
compatibility.
- Line 1: Update the package constraint for ovos-plugin-manager in
requirements.txt to require at least v2.1.1 and allow the full 2.x series by
using ">=2.1.1,<3.0.0"; also scan usages of the AudioData class (e.g., type
hints in process_audio methods and any instantiation like AudioData(audio_bytes,
sample_rate, sample_width)) to ensure the call signature and attributes match
the v2.1.1 API and adjust argument order or names if the AudioData constructor
or typing changed.
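The first item above (lines 125-126 of `__init__.py`) comes down to a type mismatch: a mapping lookup with an integer default returns a string when the parameter is present and an integer when it is absent. A minimal sketch of the "parse and fall back to the defaults" option follows; `int_param` is a hypothetical helper name, and a plain dict stands in for the request's query parameters.

```python
from typing import Mapping


def int_param(params: Mapping[str, str], name: str, default: int) -> int:
    """Read an integer query parameter, falling back to `default`.

    Hypothetical helper for illustration; not part of the PR.
    """
    raw = params.get(name)
    if raw is None:
        return default
    try:
        return int(raw)
    except ValueError:
        # Non-numeric input: silently use the default instead of failing.
        return default


query = {"sample_rate": "8000"}  # query parameter values arrive as strings
sr = int_param(query, "sample_rate", 16000)
sw = int_param(query, "sample_width", 2)
```

Silently falling back hides client mistakes, so rejecting bad input with an error response is the stricter alternative.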
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ff0ff80 and 29c66b3.

📒 Files selected for processing (4)
  • ovos_stt_http_server/__init__.py
  • ovos_stt_http_server/gradio_app.py
  • requirements/requirements.txt
  • setup.py
🧰 Additional context used
🧬 Code graph analysis (1)
ovos_stt_http_server/gradio_app.py (1)
ovos_stt_http_server/__init__.py (3)
  • ModelContainer (30-54)
  • process_audio (53-54)
  • process_audio (94-96)
🪛 GitHub Actions: Run Unit Tests
setup.py

[error] 1-1: Command failed or potential issue detected: 'python build_test/setup.py bdist_wheel sdist'. CI emitted an error annotation: Are there relative paths in setup.py?

🪛 OSV Scanner (2.3.1)
requirements/requirements.txt

[CRITICAL] 1-1: gradio 3.50.2

(PYSEC-2023-249)


[CRITICAL] 1-1: gradio 3.50.2

(PYSEC-2023-255)


[CRITICAL] 1-1: gradio 3.50.2

(PYSEC-2024-184)


[CRITICAL] 1-1: gradio 3.50.2

(PYSEC-2024-196)


[CRITICAL] 1-1: gradio 3.50.2

(PYSEC-2024-197)


[CRITICAL] 1-1: gradio 3.50.2

(PYSEC-2024-198)


[CRITICAL] 1-1: gradio 3.50.2

(PYSEC-2024-199)


[CRITICAL] 1-1: gradio 3.50.2

(PYSEC-2024-213)


[CRITICAL] 1-1: gradio 3.50.2

(PYSEC-2024-214)


[CRITICAL] 1-1: gradio 3.50.2

(PYSEC-2024-215)


[CRITICAL] 1-1: gradio 3.50.2

(PYSEC-2024-216)


[CRITICAL] 1-1: gradio 3.50.2

(PYSEC-2024-217)


[CRITICAL] 1-1: gradio 3.50.2

(PYSEC-2024-218)


[CRITICAL] 1-1: gradio 3.50.2

(PYSEC-2024-219)


[CRITICAL] 1-1: gradio 3.50.2

(PYSEC-2024-220)


[CRITICAL] 1-1: gradio 3.50.2

(PYSEC-2024-255)


[CRITICAL] 1-1: gradio 3.50.2: Gradio's dropdown component pre-process step does not limit the values to those in the dropdown list

(GHSA-26jh-r8g2-6fpr)


[CRITICAL] 1-1: gradio 3.50.2: Gradio uses insecure communication between the FRP client and server

(GHSA-279j-x4gx-hfrh)


[CRITICAL] 1-1: gradio 3.50.2: Gradio's Component Server does not properly consider _is_server_fn for functions

(GHSA-34rf-p3r3-58x2)


[CRITICAL] 1-1: gradio 3.50.2: Gradio has a one-level read path traversal in /custom_component

(GHSA-37qc-qgx6-9xjv)


[CRITICAL] 1-1: gradio 3.50.2: Gradios's CORS origin validation is not performed when the request has a cookie

(GHSA-3c67-5hwx-f6wx)


[CRITICAL] 1-1: gradio 3.50.2: Gradio Local File Inclusion vulnerability

(GHSA-3f95-mxq2-2f63)


[CRITICAL] 1-1: gradio 3.50.2: gradio Server Side Request Forgery vulnerability

(GHSA-3gf9-wv65-gwh9)


[CRITICAL] 1-1: gradio 3.50.2: Gradio applications running locally vulnerable to 3rd party websites accessing routes and uploading files

(GHSA-48cq-79qq-6f7x)


[CRITICAL] 1-1: gradio 3.50.2: Gradio has several components with post-process steps allow arbitrary file leaks

(GHSA-4q3c-cj7g-jcwf)


[CRITICAL] 1-1: gradio 3.50.2: Gradio vulnerable to SSRF in the path parameter of /queue/join

(GHSA-576c-3j53-r9jj)


[CRITICAL] 1-1: gradio 3.50.2: Gradio DOS in multipart boundry while uploading the file

(GHSA-5cpq-9538-jm2j)


[CRITICAL] 1-1: gradio 3.50.2: Gradio makes the /file secure against file traversal and server-side request forgery attacks

(GHSA-6qm2-wpxq-7qh2)


[CRITICAL] 1-1: gradio 3.50.2: Local file inclusion in gradio

(GHSA-6v6g-j5fq-hpvw)


[CRITICAL] 1-1: gradio 3.50.2: Gradio's is_in_or_equal function may be bypassed

(GHSA-77xq-6g77-h274)


[CRITICAL] 1-1: gradio 3.50.2: Gradio Vulnerable to Open Redirect

(GHSA-7v2w-h4gh-w5cv)


[CRITICAL] 1-1: gradio 3.50.2: Gradio's CORS origin validation accepts the null origin

(GHSA-89v2-pqfv-c5r9)


[CRITICAL] 1-1: gradio 3.50.2: Gradio lacks integrity checking on the downloaded FRP client

(GHSA-8c87-gvhj-xm8m)


[CRITICAL] 1-1: gradio 3.50.2: Gradio Allows Unauthorized File Copy via Path Manipulation

(GHSA-8jw3-6x8j-v96g)


[CRITICAL] 1-1: gradio 3.50.2: Server-Side Request Forgery in gradio

(GHSA-973g-55hp-3frw)


[CRITICAL] 1-1: gradio 3.50.2: Gradio Path Traversal vulnerability

(GHSA-f3h9-8phc-6gvh)


[CRITICAL] 1-1: gradio 3.50.2: Open redirect in gradio

(GHSA-g6c9-f4xm-9j4x)


[CRITICAL] 1-1: gradio 3.50.2: gradio vulnerable to Path Traversal

(GHSA-g9cj-cfpp-4g2x)


[CRITICAL] 1-1: gradio 3.50.2: Gradio Exposure of Sensitive Information to an Unauthorized Actor vulnerability

(GHSA-gqvf-3hgp-5hxv)


[CRITICAL] 1-1: gradio 3.50.2: Gradio has an XSS on every Gradio server via upload of HTML files, JS files, or SVG files

(GHSA-gvv6-33j7-884g)


[CRITICAL] 1-1: gradio 3.50.2: In Gradio, the enable_monitoring flag set to False does not disable monitoring

(GHSA-hm3c-93pg-4cxw)


[CRITICAL] 1-1: gradio 3.50.2: Gradio apps vulnerable to timing attacks to guess password

(GHSA-hmx6-r76c-85g9)


[CRITICAL] 1-1: gradio 3.50.2: Gradio Blocked Path ACL Bypass Vulnerability

(GHSA-j2jg-fq62-7c3h)


[CRITICAL] 1-1: gradio 3.50.2: Gradio performs a non-constant-time comparison when comparing hashes

(GHSA-j757-pf57-f8r4)


[CRITICAL] 1-1: gradio 3.50.2: Gradio allows users to access arbitrary files

(GHSA-m842-4qm8-7gpq)


[CRITICAL] 1-1: gradio 3.50.2: Gradio Path Traversal vulnerability

(GHSA-prpg-p95c-32fv)


[CRITICAL] 1-1: gradio 3.50.2: gradio Server-Side Request Forgery vulnerability

(GHSA-qh6x-j82h-vpf9)


[CRITICAL] 1-1: gradio 3.50.2: gradio Server-Side Request Forgery vulnerability

(GHSA-r364-m2j9-mf4h)


[CRITICAL] 1-1: gradio 3.50.2: Gradio allows credential leakage on Windows

(GHSA-rvfh-h6c7-fc3c)


[CRITICAL] 1-1: gradio 3.50.2: Gradio has a race condition in update_root_in_config may redirect user traffic

(GHSA-xh2x-3mrm-fwqm)

🔇 Additional comments (4)
setup.py (1)

50-50: LGTM! Accurate description update.

The description correctly reflects the migration to FastAPI as evidenced by the code changes.

ovos_stt_http_server/__init__.py (1)

23-23: LGTM! Import updated to remove speech_recognition dependency.

The import correctly switches to AudioData from ovos_plugin_manager.utils.audio, aligning with the PR objective to remove the speech_recognition dependency.

ovos_stt_http_server/gradio_app.py (2)

6-7: LGTM! Dependencies updated correctly.

The imports now use AudioData from ovos_plugin_manager.utils.audio, successfully removing the dependency on the internal bytes2audiodata function.


26-26: LGTM! Deprecation warning added.

The deprecation notice appropriately informs users that the Gradio interface will be removed in a future release.
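How the warning is emitted is not shown in this conversation. One common way to do it in Python is `warnings.warn` with `DeprecationWarning`, sketched below as an assumption; `bind_gradio_interface` is a hypothetical function name and the message text is illustrative.

```python
import warnings


def bind_gradio_interface() -> None:
    """Hypothetical entry point that announces its own deprecation."""
    warnings.warn(
        "The Gradio interface is deprecated and will be removed in a future release.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this function
    )
    # ... the Gradio setup itself would follow here ...
```

Python's default filters hide `DeprecationWarning` unless it is triggered by code in `__main__`, so a project that wants every user to see the notice might log it instead.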

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
ovos_stt_http_server/__init__.py (2)

135-135: Add default value to prevent AttributeError.

If the valid_langs query parameter is missing, .get("valid_langs") returns None, and calling .split(",") on None will raise an AttributeError.

🐛 Proposed fix
-        valid = request.query_params.get("valid_langs").split(",")
+        valid = request.query_params.get("valid_langs", "").split(",")

Alternatively, handle the missing parameter explicitly with proper error response:

-        valid = request.query_params.get("valid_langs").split(",")
+        valid_langs_param = request.query_params.get("valid_langs")
+        if not valid_langs_param:
+            return PlainTextResponse("Missing valid_langs parameter", status_code=400)
+        valid = valid_langs_param.split(",")
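One detail to keep in mind with the first variant: splitting an empty string yields a list containing one empty string, not an empty list, so an absent parameter would be passed on as `[""]`. A minimal sketch of a parser that avoids this follows; `parse_valid_langs` is a hypothetical helper name, and treating a missing parameter as "no restriction" is an assumption.

```python
from typing import List, Optional


def parse_valid_langs(raw: Optional[str]) -> List[str]:
    """Split a comma-separated language list, dropping empty entries.

    Hypothetical helper; returns [] when the parameter is missing or empty.
    """
    if not raw:
        return []
    return [code.strip() for code in raw.split(",") if code.strip()]


naive = "".split(",")                    # what the one-line default produces
cleaned = parse_valid_langs("en, pt,,es")
```

Whether an empty list should mean "any language" or trigger a 400 response is a design choice for the endpoint; the second variant above picks the 400.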

138-139: Pass AudioData object to detect_language.

Similar to the issue in the /stt endpoint, raw audio_bytes is passed to detect_language instead of an AudioData object. For consistency with the modernization objective, construct an AudioData object with the appropriate sample rate and sample width parameters.

🐛 Proposed fix
+        sr = int(request.query_params.get("sample_rate", 16000))
+        sw = int(request.query_params.get("sample_width", 2))
         audio_bytes = await request.body()
-        lang, prob = model.detect_language(audio_bytes, valid_langs=valid)
+        audio = AudioData(audio_bytes, sr, sw)
+        lang, prob = model.detect_language(audio, valid_langs=valid)
         return {"lang": lang, "conf": prob}
🤖 Fix all issues with AI agents
In @ovos_stt_http_server/__init__.py:
- Around line 128-131: The code constructs an AudioData object (audio =
AudioData(audio_bytes, sr, sw) ) but incorrectly passes raw audio_bytes to
detect_language; update the detect_language call to accept the AudioData
instance (audio) instead of audio_bytes so it matches the call to
model.process_audio(audio, lang) and returns lang, prob as before; ensure you
still handle the "auto" branch (lang, prob = model.detect_language(audio)) and
return model.process_audio(audio, lang).
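The control flow described above can be sketched without the real server. `RecordingModel` and `handle_stt` are hypothetical names used only for this example; the method names `detect_language` and `process_audio` and the `"auto"` branch are taken from the review text.

```python
from typing import Any, List, Tuple


class RecordingModel:
    """Hypothetical stand-in model that records what each method receives."""

    def __init__(self) -> None:
        self.seen: List[Tuple[str, Any]] = []

    def detect_language(self, audio: Any) -> Tuple[str, float]:
        self.seen.append(("detect_language", audio))
        return "en", 0.9  # fixed (lang, prob) pair, for illustration only

    def process_audio(self, audio: Any, lang: str) -> str:
        self.seen.append(("process_audio", audio))
        return f"<{lang} transcript>"


def handle_stt(model: RecordingModel, audio: Any, lang: str = "auto") -> str:
    # Both calls receive the same audio object, so the sample rate and
    # sample width stay attached to the bytes on either path.
    if lang == "auto":
        lang, _prob = model.detect_language(audio)
    return model.process_audio(audio, lang)


model = RecordingModel()
audio = object()  # placeholder for the AudioData instance
result = handle_stt(model, audio)
```

With an explicit language, `detect_language` is never called and only `process_audio` sees the audio.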
🧹 Nitpick comments (2)
ovos_stt_http_server/__init__.py (2)

14-14: Remove unused import.

The NamedTemporaryFile import is no longer needed since the refactoring eliminates temporary file usage.

♻️ Proposed fix
-from tempfile import NamedTemporaryFile

125-126: Add validation for numeric query parameters.

The int() conversion will raise a ValueError if non-numeric values are provided for sample_rate or sample_width. Consider adding input validation or error handling.

♻️ Proposed fix
-        sr = int(request.query_params.get("sample_rate", 16000))
-        sw = int(request.query_params.get("sample_width", 2))
+        try:
+            sr = int(request.query_params.get("sample_rate", 16000))
+            sw = int(request.query_params.get("sample_width", 2))
+        except ValueError:
+            return PlainTextResponse("Invalid sample_rate or sample_width parameter", status_code=400)
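The same idea can be written without any web framework, which makes it easy to unit test. In this sketch `BadRequest` and `require_int` are hypothetical names; the positivity check goes beyond what the review asks for and is an added assumption.

```python
from typing import Mapping


class BadRequest(Exception):
    """Hypothetical error that a handler would translate into an HTTP 400."""

    status_code = 400


def require_int(params: Mapping[str, str], name: str, default: int) -> int:
    """Return the parameter as a positive int, or raise BadRequest."""
    raw = params.get(name)
    if raw is None:
        return default
    try:
        value = int(raw)
    except ValueError:
        raise BadRequest(f"Invalid {name} parameter: {raw!r}") from None
    if value <= 0:
        # Assumption: zero or negative rates and widths are never meaningful.
        raise BadRequest(f"{name} must be positive, got {value}")
    return value
```

A handler would catch `BadRequest` once and return its message with `status_code`, instead of repeating the `try`/`except` around every parameter.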
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 29c66b3 and d1adb73.

📒 Files selected for processing (1)
  • ovos_stt_http_server/__init__.py

coderabbitai bot (Contributor) commented Jan 9, 2026

Note

Docstrings generation - SUCCESS
Generated docstrings for this pull request at #46

coderabbitai bot added a commit that referenced this pull request Jan 9, 2026
Docstrings generation was requested by @JarbasAl.

* #45 (comment)

The following files were modified:

* `ovos_stt_http_server/__init__.py`
* `ovos_stt_http_server/gradio_app.py`
* 📝 Add docstrings to `modernize`

Docstrings generation was requested by @JarbasAl.

* #45 (comment)

The following files were modified:

* `ovos_stt_http_server/__init__.py`
* `ovos_stt_http_server/gradio_app.py`

* Update gradio_app.py

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: JarbasAI <33701864+JarbasAl@users.noreply.github.com>
@JarbasAl JarbasAl merged commit f0f2360 into dev Jan 9, 2026
4 checks passed
@JarbasAl JarbasAl mentioned this pull request Jan 9, 2026