|
1 | 1 | --- |
2 | | -title: Audio Analyzer |
3 | | -description: Audio Analyzer is a tool designed to obtain detailed information about audio files. |
| 2 | +title: "Audio Analyzer" |
| 3 | +description: "Learn how to use the Audio Analyzer to get detailed information about your audio files." |
4 | 4 | --- |
5 | 5 |
|
6 | | - |
| 6 | +import { Aside, Steps } from '@astrojs/starlight/components'; |
7 | 7 |
|
8 | | -## On what kind of occasion can audio analyzer be useful? |
| 8 | +The **Audio Analyzer** is a powerful tool that provides detailed information about your audio files, including sample rate, frequency distribution, and more. This information is crucial for training high-quality voice models. |
9 | 9 |
|
10 | | -If you want to perform a training session correctly, it is advisable to know the frequency (Sample Rate) of the audio that is being used. Currently applio is compatible and has pretraineds in `32k, 40k and 48k`, these values refer to the hertz rate at which the pretraineds are created to use (32000hz, 40000hz, 48000hz). This clearly means that you will have to use audio in the mentioned frequencies to have an adequate result, especially when you have clean and quality audio. |
11 | | -- You can observe the audio frequency in a reliable software such as audacity, fl studio, [Spek](https://github.com/alexkay/spek/releases/download/v0.8.5/spek-0.8.5-beta.zip) etc, But if you need to have more precise details about it, use the tool. |
| 10 | + |
12 | 11 |
|
13 | | -## Use Audio Analyzer Tool |
| 12 | +## Why is the Sample Rate Important? |
14 | 13 |
|
15 | | -### Upload your Audio |
16 | | -To proceed to use the Analyzer tool, go to the extra section, upload your audio, and click "get information about audio". |
| 14 | +Applio's pre-trained models are available in three sample rates: **32k**, **40k**, and **48k** (corresponding to 32,000 Hz, 40,000 Hz, and 48,000 Hz). For the best training results, the sample rate of your dataset should match the sample rate of the pre-trained model you are using. |
17 | 15 |
|
18 | | -### Check the information given |
| 16 | +While you can check the sample rate in audio editors like Audacity, the Audio Analyzer provides a more detailed analysis of your audio's frequency content. |
19 | 17 |
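As a rough illustration of the sample-rate check described above, the sketch below reads the rate from a WAV header with Python's standard `wave` module and compares it to the three pre-trained rates. The helper names (`wav_sample_rate`, `matches_pretrained`, `write_silent_wav`) are hypothetical and not part of Applio. The `wave` module only reads uncompressed PCM WAV files, so other formats would need a different reader.

```python
import os
import tempfile
import wave

# Sample rates of Applio's pre-trained models, as listed in this page.
SUPPORTED_RATES = (32000, 40000, 48000)


def wav_sample_rate(path):
    """Return the sample rate in Hz stored in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def matches_pretrained(rate):
    """Return True if the rate equals one of the pre-trained model rates."""
    return rate in SUPPORTED_RATES


def write_silent_wav(path, rate, seconds=0.1):
    """Write a short silent mono 16-bit WAV so the sketch has a file to read."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(rate)
        wav_file.writeframes(b"\x00\x00" * int(rate * seconds))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.wav")
        write_silent_wav(path, 44100)
        rate = wav_sample_rate(path)
        print(rate, "Hz, matches a pre-trained rate:", matches_pretrained(rate))
```

A file that does not match (for example 44,100 Hz) would typically be resampled to the rate of the chosen pre-trained model before training.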
|
20 | | -When you get your audio you will see several information about it |
| 18 | +## How to Use the Audio Analyzer |
21 | 19 |
|
22 | | - |
| 20 | +<Steps> |
| 21 | +1. Navigate to the **Extras** tab in Applio. |
| 22 | +2. Upload your audio file using the **Upload Audio** box. |
| 23 | +3. Click the **Get Information About Audio** button. |
| 24 | +</Steps> |
23 | 25 |
|
24 | | -## How these graphs can help |
| 26 | +Once the analysis is complete, you will see a detailed breakdown of your audio file, including a spectrogram and several spectral feature graphs. |
25 | 27 |
|
26 | | -These three graphs provide valuable information about the audio you are about to use, allowing you to fine-tune your settings prior to training for optimal results, like the Spectrogram and Spectral Features, these provide crucial information. The spectrogram displays the full set of frequencies present in the audio, allowing you to identify unwanted sounds, such as background noise or unwanted frequencies. |
| 28 | + |
27 | 29 |
|
28 | | -This also applies to the Spectral Features, with the three data thresholds that are provided, a lot of information is shared about the audio, which helps to further examine its characteristics in low, mid and high frequencies, for example; |
| 30 | +## Understanding the Graphs |
29 | 31 |
|
30 | | -- **Spectral Centroid:** This graph shows both low and high frequencies within the audio, _represented as mentioned_, in the graph the higher frequencies will be upwards, while the lower frequencies will be downwards. |
31 | | -- **Spectral Bandwidth:** This can somewhat represent the "variety of things (in this case taking context of the general content of the audio)", normally this would not take on much importance except for convert something other than a voice. |
32 | | -- **Spectral Rolloff:** Basically, the rolloff takes all the audio context of the above-mentioned graphs under a specific volume threshold (in this case of the audio). |
| 32 | +The graphs provided by the Audio Analyzer can help you identify issues with your audio and fine-tune your training settings for optimal results. |
33 | 33 |
|
34 | | -Finally, you can also get the frequency using the audio analyzer, both for the spectrogram section as the Spectral Features, in the spectrogram you can observe the values and duplicate them, as with the Spectral Features, the numbers shown around the graph will help determine the frequency. |
| 34 | +### Spectrogram |
| 35 | + |
| 36 | +The spectrogram is a visual representation of the frequencies in your audio over time. It can help you identify unwanted noise, such as background hiss or electrical hum, which you can then remove using an audio editor. |
| 37 | + |
| 38 | +### Spectral Features |
| 39 | + |
| 40 | +The spectral feature graphs provide a more detailed look at the frequency content of your audio. |
| 41 | + |
| 42 | +- **Spectral Centroid:** This graph represents the "center of mass" of the spectrum. A higher spectral centroid indicates that the audio has more high-frequency content, while a lower spectral centroid indicates more low-frequency content. This can help you understand the overall brightness or darkness of the audio. |
| 43 | +- **Spectral Bandwidth:** This graph shows how widely the frequencies spread around the spectral centroid. A wider bandwidth means the energy is distributed across a broader range of frequencies, while a narrower bandwidth means it is concentrated close to the centroid.
| 44 | +- **Spectral Rolloff:** This graph shows the frequency below which a set percentage of the total spectral energy lies (85% is a common default; the exact value depends on the analyzer). It indicates how far the energy extends into the high frequencies and can be useful for distinguishing between different types of sounds.
| 45 | + |
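To make the three spectral features more concrete, here is a minimal sketch that computes them for a single frame using only the standard library. It assumes magnitude-weighted definitions and an 85% rolloff threshold, which are common conventions but not necessarily what the Audio Analyzer uses internally. All function names are hypothetical, and the plain DFT is only practical for short frames.

```python
import cmath
import math


def tone(freq, sample_rate, n):
    """Generate n samples of a sine wave at freq Hz (test signal only)."""
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)]


def magnitude_spectrum(samples, sample_rate):
    """Return (frequencies, magnitudes) for the non-negative DFT bins."""
    n = len(samples)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        freqs.append(k * sample_rate / n)
        mags.append(abs(acc))
    return freqs, mags


def spectral_centroid(freqs, mags):
    """Magnitude-weighted mean frequency: the spectrum's 'center of mass'."""
    return sum(f * m for f, m in zip(freqs, mags)) / sum(mags)


def spectral_bandwidth(freqs, mags):
    """Magnitude-weighted standard deviation of frequency around the centroid."""
    c = spectral_centroid(freqs, mags)
    spread = sum(m * (f - c) ** 2 for f, m in zip(freqs, mags)) / sum(mags)
    return math.sqrt(spread)


def spectral_rolloff(freqs, mags, fraction=0.85):
    """Lowest frequency at which the cumulative magnitude reaches `fraction`."""
    threshold = fraction * sum(mags)
    running = 0.0
    for f, m in zip(freqs, mags):
        running += m
        if running >= threshold:
            return f
    return freqs[-1]


if __name__ == "__main__":
    rate, n = 8000, 256
    # Mix a low and a high tone so the features are easy to interpret.
    mix = [a + b for a, b in zip(tone(1000, rate, n), tone(3000, rate, n))]
    freqs, mags = magnitude_spectrum(mix, rate)
    print("centroid:", spectral_centroid(freqs, mags))
    print("bandwidth:", spectral_bandwidth(freqs, mags))
    print("rolloff:", spectral_rolloff(freqs, mags))
```

These functions divide by the total magnitude, so a completely silent frame would need to be handled separately.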
| 46 | +By understanding these graphs, you can make more informed decisions about your audio processing and training settings, leading to better voice models. |