feat(seaborn): implement spectrogram-mel #8412

Merged: MarkusNeusinger merged 6 commits into main from implementation/spectrogram-mel/seaborn on Jun 3, 2026

Conversation

@github-actions github-actions Bot commented Jun 3, 2026

Implementation: spectrogram-mel - python/seaborn

Implements the python/seaborn version of spectrogram-mel.

File: plots/spectrogram-mel/implementations/python/seaborn.py

Parent Issue: #4672


🤖 impl-generate workflow

github-actions Bot added 2 commits June 3, 2026 17:57
Regen from quality 93. Addressed:
- Canvas: figsize=(16,9) dpi=300 → figsize=(8,4.5) dpi=400 (exact 3200×1800)
- Save: plot.png → plot-{THEME}.png; removed bbox_inches='tight'
- Theme: hardcoded dark → ANYPLOT_THEME env var + theme-adaptive chrome tokens
- Colors: #ffcc66 → BRAND #009E73; mako cmap → imprint_seq (green→blue)
- Title: updated to 'spectrogram-mel · python · seaborn · anyplot.ai'
- Change request: replaced C-major ascending arpeggio with IPA vowel sequence
  (/a/ /e/ /i/ /o/ /u/ /a/) using formant resonance synthesis; shows
  characteristic F1/F2/F3 bands — aligns with ASR application in spec
- Kept: two-panel waveform+spectrogram layout, harmonic annotations (F0/2F0/3F0),
  manual mel filterbank, seaborn heatmap/lineplot/despine idioms
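
The canvas and save changes listed in this commit message come down to simple arithmetic plus an environment lookup. The following is a minimal sketch, not the PR's code: the helper `canvas_pixels`, the fallback to the light theme, and the constant names are assumptions made for illustration; the figure sizes, the `ANYPLOT_THEME` variable, the file name pattern, and the two background colours are taken from this conversation.

```python
import os

# Pixel size of a saved figure is inches * dpi on each axis,
# provided the canvas is not cropped at save time.
OLD_FIGSIZE, OLD_DPI = (16, 9), 300
NEW_FIGSIZE, NEW_DPI = (8, 4.5), 400


def canvas_pixels(figsize, dpi):
    """Return (width_px, height_px) for an uncropped canvas (hypothetical helper)."""
    return round(figsize[0] * dpi), round(figsize[1] * dpi)


old_px = canvas_pixels(OLD_FIGSIZE, OLD_DPI)  # (4800, 2700)
new_px = canvas_pixels(NEW_FIGSIZE, NEW_DPI)  # (3200, 1800)

# Assumption: an unset or unknown ANYPLOT_THEME falls back to "light".
THEME = os.environ.get("ANYPLOT_THEME", "light")
if THEME not in ("light", "dark"):
    THEME = "light"

# Page backgrounds as reported in the review's image descriptions.
PAGE_BG = {"light": "#FAF8F1", "dark": "#1A1A17"}[THEME]
OUTPUT_NAME = f"plot-{THEME}.png"
```

Dropping `bbox_inches='tight'` matters here because that option crops the canvas to its contents, so the saved image would no longer be guaranteed to measure exactly 3200×1800.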

claude Bot commented Jun 3, 2026

AI Review - Attempt 1/3

Image Description

Light render (plot-light.png): Two-panel plot on a warm off-white #FAF8F1 background. The top panel is a waveform (seaborn lineplot in brand green #009E73) with a semi-transparent fill, IPA vowel labels (/a/, /e/, /i/, /o/, /u/, /a/) annotated in green above the waveform, and dashed vertical lines marking segment boundaries. The bottom panel is a mel-spectrogram heatmap rendered with the Imprint sequential colormap (green → blue), showing clear horizontal harmonic bands in the 100–500 Hz range and distinct energy patches corresponding to different vowel segments. The colorbar on the right is labeled "Power (dB)" with a −80 to 0 dB scale. Frequency axis shows Hz labels (100, 200, 500, 1k, 2k, 4k, 8k); time axis shows 0.0–4.0 s. F0/2F0/3F0 markers appear in the lower-right of the spectrogram. Title "spectrogram-mel · python · seaborn · anyplot.ai" is dark text, clearly readable. All text is readable against the light background.

Dark render (plot-dark.png): Same layout on a warm near-black #1A1A17 background. Title, axis labels ("Frequency (Hz)", "Time (s)", "Amp."), tick labels, and colorbar label all appear in light off-white text, clearly readable. Data colors are identical to the light render: brand green waveform, same imprint_seq green-to-blue colormap on the spectrogram, green IPA labels. The colorbar ticks are light-colored. F0/2F0/3F0 annotation boxes use the elevated dark background (#242420). No dark-on-dark failures detected; all text is legible against the dark background. Chrome correctly flips while data colors remain constant.

Both paragraphs are required. A review that only describes one render is invalid.

Score: 87/100

| Category | Score | Max |
|---|---|---|
| Visual Quality | 28 | 30 |
| Design Excellence | 14 | 20 |
| Spec Compliance | 15 | 15 |
| Data Quality | 15 | 15 |
| Code Quality | 9 | 10 |
| Library Mastery | 6 | 10 |
| Total | 87 | 100 |

Visual Quality (28/30)

  • VQ-01: Text Legibility (7/8) — All font sizes explicitly set; waveform-panel tick labels (6pt) and "Amp." label (8pt) are slightly small for mobile readability but acceptable given the narrow panel height
  • VQ-02: No Overlap (6/6) — No overlapping elements across either panel in either theme
  • VQ-03: Element Visibility (5/6) — Spectrogram energy distribution clearly visible; imprint_seq green-to-blue provides less luminance variation than a dedicated spectrogram cmap, making subtle spectral detail in the 2–8 kHz range harder to distinguish
  • VQ-04: Color Accessibility (2/2) — CVD-safe imprint_seq; good contrast throughout
  • VQ-05: Layout & Canvas (4/4) — Well-proportioned dual-panel layout (1:5 ratio), canvas 3200×1800, generous margins
  • VQ-06: Axis Labels & Title (2/2) — "Time (s)", "Frequency (Hz)", "Power (dB)" all include units
  • VQ-07: Palette Compliance (2/2) — imprint_seq correctly used for continuous spectrogram data; waveform uses #009E73 (first Imprint position); both backgrounds are theme-correct; all chrome is theme-adaptive

Design Excellence (14/20)

  • DE-01: Aesthetic Sophistication (6/8) — Strong design: dual-panel layout, IPA vowel annotations, harmonic markers (F0/2F0/3F0), segment boundary lines, custom colormap — clearly above library defaults and shows intentional design hierarchy
  • DE-02: Visual Refinement (4/6) — Spines removed via sns.despine, subtle grid (alpha=0.15), spine colors set to INK_SOFT, clean heatmap with rasterized rendering; some refinement evident
  • DE-03: Data Storytelling (4/6) — Dual-panel layout guides the viewer from waveform to spectral content; IPA labels and segment boundaries create focal points; harmonic annotations annotate acoustic structure; viewer is led through the phonetic analysis

Spec Compliance (15/15)

  • SC-01: Plot Type (5/5) — Correct mel-spectrogram with manual mel filterbank, STFT, and dB conversion
  • SC-02: Required Features (4/4) — dB scale ✓, mel-scaled y-axis ✓, Hz labels at perceptually relevant positions ✓, colorbar labeled in dB ✓, synthesized audio ✓, parameters n_fft=2048/hop_length=512/n_mels=128 ✓
  • SC-03: Data Mapping (3/3) — Time on x-axis, frequency on y-axis, power mapped to color — all correct
  • SC-04: Title & Legend (3/3) — Title "spectrogram-mel · python · seaborn · anyplot.ai" exactly correct; colorbar correctly labeled in dB
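
The "mel-scaled y-axis with Hz labels" requirement can be illustrated by computing where each Hz tick falls among the 128 mel bands. This is a sketch under assumptions: it uses the common HTK-style mel formula and a 0 Hz to Nyquist band range, neither of which is confirmed by this conversation, and all names are illustrative.

```python
import numpy as np


def hz_to_mel(f_hz):
    """HTK-style mel scale (assumed; the PR's exact formula is not shown)."""
    return 2595.0 * np.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


SAMPLE_RATE = 22050
N_MELS = 128
F_MIN, F_MAX = 0.0, SAMPLE_RATE / 2  # assumed band limits

# Band edges are equally spaced in mel; the interior points are band centres.
mel_edges = np.linspace(hz_to_mel(F_MIN), hz_to_mel(F_MAX), N_MELS + 2)
band_centres_mel = mel_edges[1:-1]

# Hz values the review reports on the frequency axis.
tick_hz = np.array([100, 200, 500, 1000, 2000, 4000, 8000])

# Fractional band index of each tick, interpolated in the mel domain.
tick_rows = np.interp(hz_to_mel(tick_hz), band_centres_mel, np.arange(N_MELS))
tick_labels = [f"{int(f // 1000)}k" if f >= 1000 else str(int(f)) for f in tick_hz]
```

If the matrix is drawn with `sns.heatmap`, cell `i` spans `i` to `i + 1`, so `tick_rows + 0.5` would centre each tick on its band, and the y-axis would need inverting so low frequencies sit at the bottom.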

Data Quality (15/15)

  • DQ-01: Feature Coverage (6/6) — Six vowel segments covering five distinct IPA vowels (/a/ is repeated at the end) with different formant profiles demonstrate the mel-spectrogram's ability to distinguish phonemes; harmonic structure visible; full frequency range covered
  • DQ-02: Realistic Context (5/5) — IPA vowel sequence is a well-known, neutral audio analysis scenario used in speech recognition research; scientifically grounded
  • DQ-03: Appropriate Scale (4/4) — sample_rate=22050 Hz ✓, duration=4s ✓, F0=130 Hz (realistic male voice) ✓, IPA formant frequencies match phonetic literature ✓, −80 to 0 dB range standard for mel-spectrograms ✓
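
The parameters quoted in the spec and data checks fix the resolution of the analysis. The arithmetic below is a sketch using only the values stated in this review (22050 Hz, 4 s, `n_fft=2048`, `hop_length=512`, F0 = 130 Hz); the variable names are illustrative.

```python
SAMPLE_RATE = 22050   # Hz
DURATION_S = 4.0      # seconds
N_FFT = 2048
HOP_LENGTH = 512
F0 = 130.0            # Hz, fundamental reported in the review

n_samples = int(SAMPLE_RATE * DURATION_S)   # 88200 samples
n_freq_bins = N_FFT // 2 + 1                # 1025 one-sided STFT bins
bin_hz = SAMPLE_RATE / N_FFT                # about 10.77 Hz per bin
hop_ms = 1000.0 * HOP_LENGTH / SAMPLE_RATE  # about 23.2 ms between frames

# Frequencies of the annotated harmonics F0, 2F0, 3F0 and their nearest STFT bins.
harmonics_hz = [k * F0 for k in (1, 2, 3)]                # [130.0, 260.0, 390.0]
harmonic_bins = [round(h / bin_hz) for h in harmonics_hz]  # [12, 24, 36]
```

With roughly 10.8 Hz per bin, harmonics spaced 130 Hz apart land about 12 bins apart, which is consistent with the review describing clearly separated horizontal bands at low frequencies.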

Code Quality (9/10)

  • CQ-01: KISS Structure (3/3) — Flat: imports → theme tokens → data synthesis → STFT → filterbank → plot → save; no functions or classes
  • CQ-02: Reproducibility (2/2) — np.random.seed(42) set
  • CQ-03: Clean Imports (2/2) — All imports used (os, sys, matplotlib, numpy, pandas, seaborn, LinearSegmentedColormap, scipy.signal.stft)
  • CQ-04: Code Elegance (1/2) — Mel filterbank uses nested Python loops (outer: 128 mel bands, inner: STFT bins) that could be replaced with numpy broadcasting/slicing; otherwise readable and appropriate
  • CQ-05: Output & API (1/1) — Saves as plot-{THEME}.png without bbox_inches='tight' as required

Library Mastery (6/10)

  • LM-01: Idiomatic Usage (4/5) — sns.heatmap for the mel-spectrogram matrix, sns.lineplot for the waveform, sns.set_theme with rc params, sns.despine — correct and idiomatic seaborn patterns
  • LM-02: Distinctive Features (2/5) — sns.heatmap provides seaborn-specific colorbar management, DataFrame indexing, and tick control (vs matplotlib imshow); however, seaborn's distinctive statistical computing features (violin, pairplot, kde) are not leveraged, as this is a signal-processing visualization

Score Caps Applied

  • None

Strengths

  • Exemplary spec compliance: all required features present including manual mel filterbank, dB conversion, Hz-labeled frequency axis, and synthesized audio
  • Strong dual-panel design with waveform + spectrogram connected by segment boundary lines and IPA vowel labels — tells a clear acoustic story
  • Perfect palette compliance: imprint_seq correctly used for continuous spectrogram data, full theme-adaptive chrome in both renders
  • Scientifically accurate IPA vowel formant frequencies produce realistic spectral patterns that differentiate vowels clearly
  • Harmonic annotations (F0/2F0/3F0) add meaningful acoustic interpretation
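
The dB conversion behind the −80 to 0 dB colorbar can be sketched as follows. This assumes the power is referenced to its own maximum and floored 80 dB below it, which is a common convention but is not confirmed for this PR; `power_to_db`, `top_db`, and `eps` are illustrative names.

```python
import numpy as np


def power_to_db(power, top_db=80.0, eps=1e-10):
    """Convert power to dB relative to the peak, floored at -top_db (assumed convention)."""
    power = np.asarray(power, dtype=float)
    ref = max(power.max(), eps)
    db = 10.0 * np.log10(np.maximum(power, eps)) - 10.0 * np.log10(ref)
    return np.maximum(db, -top_db)


# Tiny demo matrix: peak, 10x weaker, 1000x weaker, and effectively silent.
demo_power = np.array([[1.0, 0.1], [1e-3, 1e-12]])
demo_db = power_to_db(demo_power)
# demo_db is approximately [[0, -10], [-30, -80]]
```

Referencing to the peak pins the top of the scale at 0 dB, and the floor keeps near-silent cells from stretching the colour range.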

Weaknesses

  • Waveform panel tick labels (6pt) and y-label (8pt) are on the small side; increase waveform tick labels to 7–8pt and "Amp." label to 9–10pt for better mobile readability
  • imprint_seq (green→blue) has limited luminance variation, reducing perceptual contrast in the spectrogram; within palette constraints, consider reversing the colormap (blue→green so high energy = green) or tuning vmin/vmax to spread the color range more dynamically
  • The mel filterbank is built with nested Python loops (O(n_mels × n_fft)); replace them with numpy vectorized operations for cleaner, more Pythonic code: use np.arange broadcasting or np.linspace slicing instead of the double for-loop over bins
  • LM-02: The implementation does not leverage seaborn's statistical differentiation — consider adding a kde or violin subplot showing the amplitude distribution per vowel to showcase seaborn's unique capabilities

Issues Found

  1. VQ-01 MINOR: Waveform panel text too small — 6pt tick labels, 8pt y-label.
    • Fix: Increase waveform y-tick labelsize to 7–8pt and ylabel fontsize to 9–10pt
  2. VQ-03 MINOR: imprint_seq green→blue has limited luminance span for spectrograms — subtle spectral features above 2 kHz are hard to distinguish
    • Fix: Reverse colormap to ["#4467A3", "#009E73"] so high energy maps to green (brighter on light bg); OR tighten vmin to −60 dB to stretch the color range over the informative dynamic range
  3. CQ-04 MINOR: Nested Python loops in mel filterbank are inefficient and verbose
    • Fix: Replace with numpy vectorized filterbank construction using np.arange broadcasting
  4. LM-02 MINOR: No seaborn-distinctive statistical feature leveraged
    • Fix: Add a small marginal panel or inset using sns.kdeplot or sns.violinplot showing amplitude distribution per vowel
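
For issue 3, a vectorized triangular mel filterbank might look like the sketch below. It is not the code from the PR: the HTK-style mel formula, the 0 Hz to Nyquist range, the absence of area normalization, and the function names are all assumptions.

```python
import numpy as np


def hz_to_mel(f_hz):
    """HTK-style mel scale (assumed)."""
    return 2595.0 * np.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(sample_rate=22050, n_fft=2048, n_mels=128):
    """Return an (n_mels, n_fft // 2 + 1) matrix of triangular filters, no Python loops."""
    nyquist = sample_rate / 2.0
    fft_freqs = np.linspace(0.0, nyquist, n_fft // 2 + 1)  # (n_bins,)
    edges_hz = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(nyquist), n_mels + 2))

    # Column vectors of shape (n_mels, 1) broadcast against the row of FFT bin frequencies.
    lower = edges_hz[:-2, None]
    centre = edges_hz[1:-1, None]
    upper = edges_hz[2:, None]

    rising = (fft_freqs[None, :] - lower) / (centre - lower)
    falling = (upper - fft_freqs[None, :]) / (upper - centre)

    # Each triangle is the lower of the two slopes, clipped at zero outside its band.
    return np.clip(np.minimum(rising, falling), 0.0, None)


filterbank = mel_filterbank()
```

Given a power spectrogram `power_spec` of shape `(1025, n_frames)`, the mel spectrogram would then be `filterbank @ power_spec`.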

AI Feedback for Next Attempt

Increase waveform panel text sizes (tick labels to 7–8pt, y-label to 9–10pt). Consider reversing the imprint_seq colormap direction or tightening the vmin/vmax range to improve perceptual contrast across the spectrogram's dynamic range. Replace the nested Python mel filterbank loops with numpy vectorized operations for cleaner code. Add one seaborn-distinctive statistical element (e.g., a small sns.kdeplot marginal panel or per-vowel amplitude distribution) to better leverage seaborn's unique capabilities beyond matplotlib.
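
The two colormap options mentioned in this feedback can be sketched with matplotlib's `LinearSegmentedColormap`. The two hex values come from the review; treating `imprint_seq` as a plain two-stop gradient, and the constant names, are assumptions.

```python
from matplotlib.colors import LinearSegmentedColormap, to_rgba

IMPRINT_GREEN = "#009E73"
IMPRINT_BLUE = "#4467A3"

# Canonical direction as described in the review: green (low) to blue (high).
imprint_seq = LinearSegmentedColormap.from_list(
    "imprint_seq", [IMPRINT_GREEN, IMPRINT_BLUE]
)

# Option 1: reversed direction, so high energy maps to green.
imprint_seq_reversed = LinearSegmentedColormap.from_list(
    "imprint_seq_reversed", [IMPRINT_BLUE, IMPRINT_GREEN]
)

# Option 2: keep the direction but tighten the colour limits to the
# informative range suggested in the review (-60 dB instead of -80 dB).
VMIN_DB, VMAX_DB = -60.0, 0.0

low_colour = imprint_seq_reversed(0.0)   # RGBA of the blue end
high_colour = imprint_seq_reversed(1.0)  # RGBA of the green end
```

Either colormap object can be passed as `cmap=` to `sns.heatmap`, together with `vmin=VMIN_DB, vmax=VMAX_DB` for the second option.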

Verdict: REJECTED

github-actions Bot added the labels quality:87 (Quality score 87/100) and ai-rejected (Quality not OK, triggers update) on Jun 3, 2026
github-actions Bot added the label ai-attempt-1 (First repair attempt) and removed the label ai-rejected (Quality not OK, triggers update) on Jun 3, 2026
Attempt 1/3 - fixes based on AI review

github-actions Bot commented Jun 3, 2026

🔧 Repair Attempt 1/4

Applied fixes based on AI review feedback.

Status: Repair completed, re-triggering review...


🤖 impl-repair

claude Bot commented Jun 3, 2026

AI Review - Attempt 2/3

Image Description

Light render (plot-light.png): Three-panel layout on warm off-white #FAF8F1 background. Top waveform panel shows the 4-second synthesized vowel signal in brand green #009E73 with a subtle fill — six IPA vowel labels (/a/, /e/, /i/, /o/, /u/, /a/) annotated at 7pt and dashed segment boundaries at low alpha. Middle mel-spectrogram (the main panel) uses the Imprint-derived blue→green colormap: low-energy regions in #4467A3 blue, high-energy formant bands in #009E73 green. Six distinct horizontal formant bands are clearly visible, transitioning through the vowel sequence. F0/2F0/3F0 harmonic marker annotations appear in the lower-right with badge backgrounds. Colorbar on the right labeled "Power (dB)" ranges -80 to 0 dB. Frequency y-axis shows Hz ticks at 100, 200, 500, 1k, 2k, 4k, 8k. Bottom KDE panel shows amplitude distributions per vowel — /a/ (green) is dominant, /e/ (purple) and /i/ (blue) partially visible, /o/ (ochre) and /u/ (red) barely distinguishable. All primary text is readable against the light background. 7pt annotation text is small but legible at full resolution.

Dark render (plot-dark.png): Same three-panel structure on warm near-black #1A1A17 background. Title, axis labels, tick labels all rendered in light #F0EFE8/#B8B7B0 tokens — fully readable against the dark surface. No dark-on-dark failure detected anywhere. Spectrogram data colors are identical to the light render (blue→green formant bands unchanged — only chrome flipped). Waveform brand green #009E73 remains prominent on the dark surface. Colorbar and KDE legend text are light-colored and readable. KDE distributions show the same visibility pattern as light render. Both panels confirm correct theme adaptation.

Both paragraphs are required. A review that only describes one render is invalid.

Score: 87/100

| Category | Score | Max |
|---|---|---|
| Visual Quality | 26 | 30 |
| Design Excellence | 14 | 20 |
| Spec Compliance | 15 | 15 |
| Data Quality | 15 | 15 |
| Code Quality | 10 | 10 |
| Library Mastery | 7 | 10 |
| Total | 87 | 100 |

Visual Quality (26/30)

  • VQ-01: Text Legibility (7/8) — Title, axis labels, ticks readable in both themes. Minor: 7pt waveform vowel labels and F0/2F0/3F0 annotations borderline small for mobile scaling.
  • VQ-02: No Overlap (5/6) — No major overlaps. Minor: harmonic annotations clustered near frequency tick labels in spectrogram lower-right.
  • VQ-03: Element Visibility (4/6) — Spectrogram formant bands and waveform clearly visible. KDE /o/ (ochre #BD8233) and /u/ (red #AE3030) barely distinguishable at fill alpha=0.3.
  • VQ-04: Color Accessibility (2/2) — Imprint-derived cmap. No red-green sole encoding.
  • VQ-05: Layout & Canvas (4/4) — Canvas gate passed. Three-panel height_ratios=[1,4,2] well-proportioned. No clipping or overflow.
  • VQ-06: Axis Labels & Title (2/2) — 'Frequency (Hz)', 'Time (s)', colorbar 'Power (dB)' — all correct with units.
  • VQ-07: Palette Compliance (2/2) — Imprint-derived sequential cmap for spectrogram. Categorical KDE starts with #009E73. Backgrounds correct. (Note: colormap direction is reversed from canonical green→blue; blue→green is used instead — acceptable design choice for spectrogram.)

Design Excellence (14/20)

  • DE-01: Aesthetic Sophistication (6/8) — Well above default: three-panel scientific dashboard, custom Imprint colormap, scientific harmonic annotations, intentional panel hierarchy.
  • DE-02: Visual Refinement (4/6) — Spines removed via sns.despine on all panels. Colorbar and legend styled with Imprint tokens. Subtle grid (alpha=0.15). Boundary dashed lines at low alpha.
  • DE-03: Data Storytelling (4/6) — Three panels narrate waveform → spectral content → amplitude statistics. IPA vowel labels + boundaries on both waveform and spectrogram link the narrative.

Spec Compliance (15/15)

  • SC-01: Plot Type (5/5) — Correct mel-spectrogram as center panel.
  • SC-02: Required Features (4/4) — Power-to-dB, mel filterbank, Hz-labeled y-axis, dB colorbar, standard parameters (n_fft=2048, hop_length=512, n_mels=128), synthesized audio.
  • SC-03: Data Mapping (3/3) — X: time in seconds, Y: mel-scaled frequency with Hz labels, Color: power in dB.
  • SC-04: Title & Legend (3/3) — 'spectrogram-mel · python · seaborn · anyplot.ai' correct format. KDE legend with vowel categories.

Data Quality (15/15)

  • DQ-01: Feature Coverage (6/6) — Waveform, mel-spectrogram with formant bands, KDE per vowel. Temporal progression, formant structure, and harmonic series all shown.
  • DQ-02: Realistic Context (5/5) — IPA vowel synthesis with realistic formant frequencies (male voice). Neutral linguistic dataset. Standard audio ML parameters.
  • DQ-03: Appropriate Scale (4/4) — -80 to 0 dB dynamic range appropriate for speech. 128 mel bands standard. 22050 Hz sample rate.

Code Quality (10/10)

  • CQ-01: KISS Structure (3/3) — No functions or classes. Procedural throughout.
  • CQ-02: Reproducibility (2/2) — np.random.seed(42). Deterministic synthesis.
  • CQ-03: Clean Imports (2/2) — All imports used.
  • CQ-04: Code Elegance (2/2) — No fake interactivity. Fully vectorized mel filterbank with numpy broadcasting (no nested Python loops).
  • CQ-05: Output & API (1/1) — Saves as plot-{THEME}.png. No bbox_inches='tight'.

Library Mastery (7/10)

  • LM-01: Idiomatic Usage (4/5) — sns.heatmap, sns.lineplot, sns.kdeplot, sns.set_theme, sns.despine all idiomatic. Minor: sns.set_context could unify font scaling.
  • LM-02: Distinctive Features (3/5) — sns.heatmap for the spectrogram is the distinctively seaborn approach. sns.kdeplot with fill=True and hue for multi-series distributions. Three-panel seaborn integration.

Score Caps Applied

  • None

Strengths

  • Sophisticated three-panel layout (waveform + mel-spectrogram + KDE) tells a coherent audio analysis story
  • Fully vectorized mel filterbank with numpy broadcasting — no nested Python loops
  • Scientifically rigorous IPA vowel synthesis with realistic formant frequencies for a male voice
  • Correct theme-adaptive chrome applied throughout all panels, colorbar, and legend
  • Harmonic annotations (F0/2F0/3F0) on the spectrogram add genuine scientific insight
  • Idiomatic seaborn usage across all three panel types

Weaknesses

  • KDE panel: /o/ (#BD8233 ochre) and /u/ (#AE3030 red) distributions barely distinguishable — masked by dominant /a/ green at fill alpha=0.3; increase linewidth or reduce fill alpha on dominant series
  • Waveform IPA vowel labels and harmonic annotations use 7pt fontsize — borderline small for mobile (~400px) scaling; increase to 8–9pt
  • imprint_seq colormap reversed from canonical direction (blue→green vs spec's green→blue); while intuitive for a spectrogram, document the intentional reversal

Issues Found

  1. VQ-03 LOW: KDE /o/ and /u/ distributions barely visible at fill alpha=0.3 with all series heavily overlapping
    • Fix: Reduce fill=True alpha on the dominant /a/ series, or increase linewidth to 1.8–2.0 to make smaller distributions more visible
  2. VQ-01 MINOR: 7pt annotation text (waveform vowel labels, F0/2F0/3F0 markers) borderline at mobile scale
    • Fix: Increase annotation fontsize to 8–9pt; harmonic label text in particular benefits from the extra size
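
One possible contributor to the /a/ series dominating the KDE panel, in addition to the alpha and linewidth points above, is sample count: /a/ occupies two of the six segments. The sketch below only illustrates that arithmetic. It assumes six equal-length segments over the 4 s signal, which this conversation does not confirm, and all names are illustrative.

```python
from collections import Counter

SEGMENTS = ["/a/", "/e/", "/i/", "/o/", "/u/", "/a/"]
TOTAL_SAMPLES = 22050 * 4                              # 88200 samples in the 4 s signal
SAMPLES_PER_SEGMENT = TOTAL_SAMPLES // len(SEGMENTS)   # assumes equal-length segments

# Count how many amplitude samples each vowel contributes.
sample_counts = Counter()
for vowel in SEGMENTS:
    sample_counts[vowel] += SAMPLES_PER_SEGMENT

# Under seaborn's default common_norm=True, each hue level's density is scaled
# by its share of the observations, so these shares are the relative areas.
total = sum(sample_counts.values())
area_share = {vowel: n / total for vowel, n in sample_counts.items()}
# /a/ gets 1/3 of the total area; every other vowel gets 1/6.
```

If this is a factor, passing `common_norm=False` to `sns.kdeplot` would normalize each vowel's density independently, which could be tried alongside the suggested alpha or linewidth changes.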

AI Feedback for Next Attempt

This is a strong, above-average implementation scoring 87/100. Core concerns to address: (1) KDE visibility — the /o/ and /u/ amplitude distributions are barely distinguishable; adjust alpha or linewidth to differentiate the less-dominant series from the dominant /a/. (2) Minor annotation size — bump waveform vowel labels and F0/2F0/3F0 text from 7pt to 8–9pt. The mel-spectrogram itself, the formant synthesis, the harmonic annotations, the multi-panel narrative, and the theme adaptation are all excellent.

Verdict: APPROVED

github-actions Bot added the label ai-approved (Quality OK, ready for merge) on Jun 3, 2026
@MarkusNeusinger MarkusNeusinger merged commit d5da3e3 into main Jun 3, 2026
@MarkusNeusinger MarkusNeusinger deleted the implementation/spectrogram-mel/seaborn branch June 3, 2026 18:19

Labels

ai-approved (Quality OK, ready for merge), ai-attempt-1 (First repair attempt), quality:87 (Quality score 87/100)