Description
Prerequisites
- Write a descriptive title.
- Make sure you are able to reproduce it on the latest released version.
- Search the existing issues, especially the pinned issues.
Exception report
Last 5 Keys:
 & Space " C :
Exception:
System.ArgumentOutOfRangeException: The value must be greater than or equal to zero and less than the console's buffer size in that dimension.
Parameter name: left
Actual value was -2.
   at System.Console.SetCursorPosition(Int32 left, Int32 top)
   at Microsoft.PowerShell.Internal.VirtualTerminal.set_CursorLeft(Int32 value)
   at Microsoft.PowerShell.PSConsoleReadLine.ReallyRender(RenderData renderData, String defaultColor)
   at Microsoft.PowerShell.PSConsoleReadLine.ForceRender()
   at Microsoft.PowerShell.PSConsoleReadLine.Insert(Char c)
   at Microsoft.PowerShell.PSConsoleReadLine.SelfInsert(Nullable`1 key, Object arg)
   at Microsoft.PowerShell.PSConsoleReadLine.ProcessOneKey(ConsoleKeyInfo key, Dictionary`2 dispatchTable, Boolean ignoreIfNoAction, Object arg)
   at Microsoft.PowerShell.PSConsoleReadLine.InputLoop()
   at Microsoft.PowerShell.PSConsoleReadLine.ReadLine(Runspace runspace, EngineIntrinsics engineIntrinsics)
Screenshot
Environment data
PS HostName: Visual Studio Code Host
PSReadLine EditMode: Windows
Steps to reproduce
import os
import librosa
import numpy as np
import sidekit

# Set your dataset directory
audio_dir = 'path-to-dataset'

# Initialize lists to store MFCC features and session identifiers
mfcc_features_list = []
session_list = []

# Step 1: Extract MFCC features
print("Extracting MFCC features...")
for file_name in os.listdir(audio_dir):
    if file_name.endswith('.WAV'):
        file_path = os.path.join(audio_dir, file_name)

        # Load the audio file
        audio_signal, sample_rate = librosa.load(file_path, sr=16000)

        # Extract MFCC features
        mfcc_features = librosa.feature.mfcc(y=audio_signal, sr=sample_rate, n_mfcc=13)

        # Append the MFCC features and the file name (session identifier)
        mfcc_features_list.append(mfcc_features)
        session_list.append(file_name)

# Determine the maximum number of frames (time steps)
max_frames = max(mfcc.shape[1] for mfcc in mfcc_features_list)

# Pad the MFCC features so that they all have the same shape
padded_mfcc_features_list = [
    np.pad(mfcc, ((0, 0), (0, max_frames - mfcc.shape[1])), mode='constant')
    for mfcc in mfcc_features_list
]

# Convert the list to a numpy array
features = np.array(padded_mfcc_features_list)

# Concatenate features along the time axis
concatenated_features = np.concatenate(features, axis=1)

# Step 2: Train the UBM model
print("Training the UBM model...")
ubm = sidekit.Mixture()
distrib_nb = 512  # Number of distributions (mixtures)
llr = []  # Log-likelihood ratio list
ubm.EM_split(concatenated_features, distrib_nb, session_list, llr)  # Train the UBM with EM
np.save('ubm_model.npy', ubm)  # Save the trained UBM model

# Step 3: Train the TV matrix
print("Training the TV matrix...")
tv_matrix = sidekit.TotalVariability()
tv_matrix.factor_analyze(features, ubm)
np.save('tv_matrix.npy', tv_matrix)  # Save the trained TV matrix

# Step 4: Extract and save i-vectors
print("Extracting and saving i-vectors...")
enrolled_ivectors = []
labels = ['Alice', 'Bob', 'Charlie']  # Replace with actual names
for feature in features:  # Iterate over each speaker's features
    ivector = sidekit.IVector()
    ivector.compute(feature, ubm, tv_matrix)
    enrolled_ivectors.append(ivector)
np.save('enrolled_ivectors.npy', np.array(enrolled_ivectors))
np.save('labels.npy', labels)

print("All models and vectors have been generated and saved.")
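For reference, the pad-and-concatenate steps in the script can be checked in isolation with dummy arrays (the shapes here are arbitrary stand-ins for real MFCC matrices, not values from my dataset):

```python
import numpy as np

# Dummy MFCC-like matrices: 13 coefficients, varying frame counts
# (stand-ins for the librosa.feature.mfcc output in the script).
mfcc_a = np.zeros((13, 50))
mfcc_b = np.zeros((13, 80))
mfcc_list = [mfcc_a, mfcc_b]

# Same padding logic as the script: pad each matrix on the time axis
# so every matrix has max_frames columns.
max_frames = max(m.shape[1] for m in mfcc_list)
padded = [np.pad(m, ((0, 0), (0, max_frames - m.shape[1])), mode='constant')
          for m in mfcc_list]

features = np.array(padded)                      # shape (2, 13, 80)
concatenated = np.concatenate(features, axis=1)  # shape (13, 160)

print(features.shape)
print(concatenated.shape)
```

So `features` stacks one padded matrix per file, and `concatenated_features` joins them along the time axis into a single 13-row matrix.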
Expected behavior
This code should train my model so I can run it and test it on my dataset.
Actual behavior
It does not work and gives the error above.
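Separate from the console exception, one likely follow-on problem in the script: `np.save` is meant for numpy arrays, and saving arbitrary objects (the trained `Mixture` and `TotalVariability` instances) only works via `allow_pickle` and round-trips awkwardly on reload. A minimal sketch of saving such objects with the standard `pickle` module instead — the `model` dict here is a placeholder, not a real sidekit object:

```python
import pickle

# Placeholder for a trained model object (e.g. the ubm or tv_matrix
# above); a dict is used so this sketch runs without sidekit installed.
model = {"distrib_nb": 512, "weights": [0.5, 0.5]}

# Serialize the object to disk.
with open('ubm_model.pkl', 'wb') as f:
    pickle.dump(model, f)

# Restore it later for scoring.
with open('ubm_model.pkl', 'rb') as f:
    restored = pickle.load(f)

print(restored == model)  # True
```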
