MIPT, autumn 2025

| # | Date (DD.MM) | Topic | Materials |
|---|---|---|---|
| 1 | 10.09 | Introduction. Speech Processing Tasks | slides, recording |
| 2 | 17.09 | Digital Signal Processing, RIR, AEC | slides, recording, seminar |
| 3 | 24.09 | STFT, Keyword Spotting | slides, recording, seminar, HW |
| 4 | 01.10 | Speech Recognition: CTC, Beam Search, Rescoring | slides, recording |
| 5 | 08.10 | Speech Recognition: Encoder-Decoder, Streaming, RNN-T, Decoder-only | slides, recording, seminar, HW |
| 6 | 15.10 | Self-Supervised Learning: wav2vec 2.0, HuBERT, BEST-RQ, GigaAM | slides, recording |
| 7 | 22.10 | Speech Recognition: Semi-Supervised Learning, Data | slides, recording, seminar |
| 8 | 29.10 | Speaker Recognition | slides, recording, seminar, HW |
| 9 | 05.11 | Voice Activity Detection, Speaker Diarization, Speaker-attributed ASR | slides, recording, seminar |
| 10 | 12.11 | Audio-Conditioned LLMs | slides, recording, seminar, HW |
| 11 | 19.11 | Text-to-Speech: Conventional Models | slides, recording |
| 12 | 26.11 | Text-to-Speech: Codecs, Vocoders | slides, seminar |
| 13 | 03.12 | Text-to-Speech: Recent Advancements | slides, recording, HW |
| 14 | 17.12 | Speech-to-Speech LLMs | slides, recording |