Visualization #7

Open
wizenink wants to merge 2 commits into master from visualization

@wizenink
Owner

No description provided.

@gemini-code-assist
Contributor

Summary of Changes

Hello @wizenink, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the stft-rs crate by introducing a powerful new visualization feature. Users can now easily generate high-quality spectrogram images from their processed audio data, providing a clear visual representation of frequency content over time. This addition makes the crate more versatile for audio analysis, research, and educational purposes.

Highlights

  • New Visualization Feature: Adds an opt-in visualization Cargo feature for generating image representations of spectrograms. To try it, run the bundled example with the feature enabled: cargo run --example visualization --features visualization.
  • Spectrogram Image Generation: Added functionality to convert Spectrum data into image files (e.g., PNG) with customizable settings such as color maps (Viridis, Magma, Inferno, Plasma, Grayscale), output dimensions, and decibel range for color mapping.
  • New Example: A new example (examples/visualization.rs) has been added to demonstrate the usage of the visualization feature. It showcases generating spectrograms for chirp, multi-tone, and Mel-scale signals, and compares different color maps.
  • Dependency Additions: New optional dependencies image and colorgrad have been added to support the image generation capabilities. This also led to numerous transitive dependency additions in Cargo.lock.
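
As a rough illustration of what a decibel range for color mapping means, the sketch below converts a linear magnitude to dB and maps it onto an 8-bit grayscale level. It is not the crate's code: the function names magnitude_to_db and db_to_gray, the 1e-10 floor, and the 20 * log10 amplitude convention are all assumptions made for this example.

```rust
/// Convert a linear magnitude to decibels (20 * log10).
/// The 1e-10 floor is an assumption; it keeps a zero magnitude
/// from producing negative infinity.
fn magnitude_to_db(magnitude: f64) -> f64 {
    20.0 * magnitude.max(1e-10).log10()
}

/// Map a dB value onto an 8-bit gray level using a (min_db, max_db) window.
/// Values below the window clamp to 0 (black), values above clamp to 255 (white).
/// Assumes max_db > min_db.
fn db_to_gray(db: f64, db_range: (f64, f64)) -> u8 {
    let (min_db, max_db) = db_range;
    let normalized = ((db - min_db) / (max_db - min_db)).clamp(0.0, 1.0);
    (normalized * 255.0).round() as u8
}
```

With a window of (-80.0, 0.0), a full-scale magnitude of 1.0 maps to white and silence maps to black. A real color map such as Viridis would replace the final gray-level step with a gradient lookup on the same normalized value.
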
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature | Command | Description
Code Review | /gemini review | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state.
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help | /gemini help | Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new visualization feature to generate spectrogram images from Spectrum data. The implementation is well-structured, adding a visualization module with a SpectrumExt trait and a corresponding example. My review includes a few suggestions to improve the API consistency and fix an issue in the example code.

  • In src/visualization.rs, I've suggested a change to improve type consistency for the db_range configuration, which will make the API more robust, especially for f64 data.
  • In examples/visualization.rs, I've identified that the code for saving the mel spectrogram is commented out, which makes the example's output misleading. This should be addressed to ensure the example works as described.

Comment on lines 174 to 176
// mel_spec
// .save_image_with("mel_speech.png", &vis_config)
// .unwrap();

high

This code to save the mel spectrogram image is commented out, but the main function prints that mel_speech.png is generated. This is misleading for users running the example.

This is likely because SpectrumExt is not implemented for MelSpectrum. To fix this, you could implement the SpectrumExt trait for MelSpectrum<T>. The implementation would be slightly different as MelSpectrum can already contain dB-scaled data.
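
A minimal sketch of that idea follows, under loud assumptions: the real signatures of SpectrumExt, Spectrum, and MelSpectrum<T> are not shown in this review, so the example uses hypothetical stand-in types and a hypothetical ToDecibels trait rather than the crate's API. The point is only the branching: a type that may already hold dB-scaled data should skip the conversion.

```rust
/// Hypothetical stand-in for a linear-magnitude spectrum (not the crate's type).
struct Spectrum {
    magnitudes: Vec<f64>,
}

/// Hypothetical stand-in for a mel spectrogram; `is_db` records whether
/// `values` are already in decibels (this flag is an assumption).
struct MelSpectrum {
    values: Vec<f64>,
    is_db: bool,
}

/// Hypothetical trait: anything that can supply dB values for color mapping.
trait ToDecibels {
    fn to_db(&self) -> Vec<f64>;
}

/// Amplitude to dB with a small floor to avoid log10(0).
/// If the data were power rather than amplitude, 10 * log10 would apply instead.
fn amplitude_to_db(x: f64) -> f64 {
    20.0 * x.max(1e-10).log10()
}

impl ToDecibels for Spectrum {
    fn to_db(&self) -> Vec<f64> {
        self.magnitudes.iter().map(|&m| amplitude_to_db(m)).collect()
    }
}

impl ToDecibels for MelSpectrum {
    fn to_db(&self) -> Vec<f64> {
        if self.is_db {
            // Already dB-scaled: pass the values through unchanged.
            self.values.clone()
        } else {
            self.values.iter().map(|&v| amplitude_to_db(v)).collect()
        }
    }
}
```

With both types behind one trait, the image-writing code can be written once against the trait, and the commented-out save_image_with call in the example could then be restored.
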

pub colormap: ColorMap,
pub width: Option<u32>, // None = 1 pixel per frame
pub height: Option<u32>, // None = 1 pixel per freq bin
pub db_range: (f32, f32), // (min_db, max_db) for color mapping

medium

For better precision and consistency, especially when dealing with f64 data, consider using (f64, f64) for db_range. This allows all intermediate decibel calculations to be done in f64 before the final normalization and conversion to color. This change will require a small adjustment in the to_image_with function.

Suggested change
- pub db_range: (f32, f32), // (min_db, max_db) for color mapping
+ pub db_range: (f64, f64), // (min_db, max_db) for color mapping
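
To make the precision argument concrete, here is a small sketch comparing the two field types. ConfigF32 and ConfigF64 are hypothetical stand-ins for the config struct, and the f32 variant assumes the range is subtracted in f32 before widening, which the existing `range as f64` cast suggests but this review does not confirm.

```rust
/// Hypothetical stand-ins for the two variants of the config field.
struct ConfigF32 {
    db_range: (f32, f32),
}

struct ConfigF64 {
    db_range: (f64, f64),
}

/// Normalize using an f32 range that is widened at the point of use.
/// The widened bounds carry f32 rounding error into the f64 arithmetic.
fn normalize_from_f32(db: f64, cfg: &ConfigF32) -> f64 {
    let (min_db, max_db) = cfg.db_range;
    let range = max_db - min_db; // computed in f32 (assumed)
    ((db - min_db as f64) / range as f64).clamp(0.0, 1.0)
}

/// Normalize using an f64 range: no casts, no extra rounding step.
fn normalize_from_f64(db: f64, cfg: &ConfigF64) -> f64 {
    let (min_db, max_db) = cfg.db_range;
    ((db - min_db) / (max_db - min_db)).clamp(0.0, 1.0)
}
```

For bounds that f32 represents exactly, such as -80.0 and 0.0, the two functions agree. For a bound like -80.1 the widened f32 value differs from the f64 literal by a few millionths of a dB, which is invisible in an 8-bit image but does make results depend on the field type.
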

// Flip Y axis to have low freq at the bottom

let db = mag_db[frame * self.freq_bins + bin];
let normalized = ((db - min_db as f64) / range as f64).clamp(0.0, 1.0);

medium

Following the change of db_range to (f64, f64), the explicit casts to f64 are no longer necessary here, simplifying the code.

Suggested change
- let normalized = ((db - min_db as f64) / range as f64).clamp(0.0, 1.0);
+ let normalized = ((db - min_db) / range).clamp(0.0, 1.0);
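
For context, the loop around this line might look roughly like the sketch below, which indexes frame-major dB data as in the snippet above, normalizes each value with an f64 range, and flips the Y axis so low frequencies land in the bottom row. The function name normalized_grid and the Vec<Vec<f64>> output are assumptions for illustration; the actual to_image_with writes pixels into an image buffer.

```rust
/// Build a grid of normalized values (rows = frequency bins, columns = frames)
/// from frame-major dB data. Row 0 is the highest bin, so the lowest
/// frequency ends up at the bottom of the image.
/// Assumes max_db > min_db so that `range` is non-zero.
fn normalized_grid(
    mag_db: &[f64],
    frames: usize,
    freq_bins: usize,
    db_range: (f64, f64),
) -> Vec<Vec<f64>> {
    assert_eq!(mag_db.len(), frames * freq_bins);
    let (min_db, max_db) = db_range;
    let range = max_db - min_db;

    let mut grid = vec![vec![0.0; frames]; freq_bins];
    for frame in 0..frames {
        for bin in 0..freq_bins {
            let db = mag_db[frame * freq_bins + bin];
            // With an f64 range, no casts are needed here.
            let normalized = ((db - min_db) / range).clamp(0.0, 1.0);
            // Flip Y axis to have low frequencies at the bottom.
            let y = freq_bins - 1 - bin;
            grid[y][frame] = normalized;
        }
    }
    grid
}
```

Each normalized value in [0, 1] would then be passed to the selected color map to obtain the pixel color.
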

@wizenink wizenink added this to the 0.5.0 milestone Nov 12, 2025
@wizenink wizenink changed the base branch from master to staging-0.5.0 November 13, 2025 09:59
@wizenink wizenink modified the milestones: 0.5.0, 0.5.1 Nov 24, 2025
Base automatically changed from staging-0.5.0 to master November 24, 2025 18:48
@wizenink wizenink modified the milestones: 0.5.1, 0.5.2 Dec 21, 2025