voice recognition system #4603

### Prerequisites

- [x] Write a descriptive title.
- [x] Make sure you are able to repro it on the [latest released version](https://www.powershellgallery.com/packages/PSReadLine)
- [x] Search the existing issues, especially the pinned issues.

### Exception report

```console
Last 5 Keys:
 & Space " C :

Exception:
System.ArgumentOutOfRangeException: The value must be greater than or equal to zero and less than the console's buffer size in that dimension.
Parameter name: left
Actual value was -2.
   at System.Console.SetCursorPosition(Int32 left, Int32 top)
   at Microsoft.PowerShell.Internal.VirtualTerminal.set_CursorLeft(Int32 value)
   at Microsoft.PowerShell.PSConsoleReadLine.ReallyRender(RenderData renderData, String defaultColor)
   at Microsoft.PowerShell.PSConsoleReadLine.ForceRender()
   at Microsoft.PowerShell.PSConsoleReadLine.Insert(Char c)
   at Microsoft.PowerShell.PSConsoleReadLine.SelfInsert(Nullable`1 key, Object arg)
   at Microsoft.PowerShell.PSConsoleReadLine.ProcessOneKey(ConsoleKeyInfo key, Dictionary`2 dispatchTable, Boolean ignoreIfNoAction, Object arg)
   at Microsoft.PowerShell.PSConsoleReadLine.InputLoop()
   at Microsoft.PowerShell.PSConsoleReadLine.ReadLine(Runspace runspace, EngineIntrinsics engineIntrinsics)
```

### Screenshot

![Image](https://github.com/user-attachments/assets/268c9ca1-ef16-4344-bc64-becbdd15262e)

### Environment data

```console
PS HostName: Visual Studio Code Host
PSReadLine EditMode: Windows
```

### Steps to reproduce

```python
import os
import librosa
import numpy as np
import sidekit

# Set your dataset directory
audio_dir = 'path-to-dataset'

# Initialize a list to store MFCC features
mfcc_features_list = []
session_list = []  # Initialize a list to store session identifiers

# Step 1: Extract MFCC Features
print("Extracting MFCC features...")
for file_name in os.listdir(audio_dir):
    if file_name.endswith('.WAV'):
        file_path = os.path.join(audio_dir, file_name)

        # Load the audio file
        audio_signal, sample_rate = librosa.load(file_path, sr=16000)

        # Extract MFCC features
        mfcc_features = librosa.feature.mfcc(y=audio_signal, sr=sample_rate, n_mfcc=13)

        # Append the MFCC features to the list
        mfcc_features_list.append(mfcc_features)
        session_list.append(file_name)  # Append the file name as the session identifier

# Determine the maximum number of frames (time steps)
max_frames = max([mfcc.shape[1] for mfcc in mfcc_features_list])

# Pad the MFCC features so that they all have the same shape
padded_mfcc_features_list = [np.pad(mfcc, ((0, 0), (0, max_frames - mfcc.shape[1])), mode='constant') for mfcc in mfcc_features_list]

# Convert the list to a numpy array
features = np.array(padded_mfcc_features_list)

# Concatenate features along the time axis
concatenated_features = np.concatenate(features, axis=1)

# Step 2: Train the UBM Model
print("Training the UBM model...")
ubm = sidekit.Mixture()
distrib_nb = 512  # Number of distributions (mixtures)
llr = []  # Initialize log-likelihood ratio list
ubm.EM_split(concatenated_features, distrib_nb, session_list, llr)  # Train the UBM model using the EM algorithm
np.save('ubm_model.npy', ubm)  # Save the trained UBM model to a file

# Step 3: Train the TV Matrix
print("Training the TV Matrix...")
tv_matrix = sidekit.TotalVariability()
tv_matrix.factor_analyze(features, ubm)
np.save('tv_matrix.npy', tv_matrix)  # Save the trained TV matrix to a file

# Step 4: Extract and Save I-Vectors
print("Extracting and saving I-vectors...")
enrolled_ivectors = []
labels = ['Alice', 'Bob', 'Charlie']  # Replace with actual names

for feature in features:  # Iterate over each speaker's features
    ivector = sidekit.IVector()
    ivector.compute(feature, ubm, tv_matrix)
    enrolled_ivectors.append(ivector)

np.save('enrolled_ivectors.npy', np.array(enrolled_ivectors))
np.save('labels.npy', labels)

print("All models and vectors have been generated and saved.")
```
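
The padding and concatenation steps from the script can be checked on their own, without librosa or sidekit. The following is a minimal sketch that assumes synthetic random arrays in place of real MFCC matrices; the frame counts (40, 55, 32) are hypothetical and chosen only to make the shapes easy to follow.

```python
import numpy as np

# Synthetic stand-ins for per-file MFCC matrices: 13 coefficients each,
# with different numbers of frames (hypothetical values, not real audio).
rng = np.random.default_rng(0)
mfcc_features_list = [rng.standard_normal((13, n)) for n in (40, 55, 32)]

# Longest utterance determines the padded width.
max_frames = max(mfcc.shape[1] for mfcc in mfcc_features_list)

# Zero-pad each matrix on the right along the time axis.
padded = [
    np.pad(mfcc, ((0, 0), (0, max_frames - mfcc.shape[1])), mode="constant")
    for mfcc in mfcc_features_list
]

# Stack into (n_files, n_mfcc, max_frames), then join along the time axis.
features = np.array(padded)
concatenated_features = np.concatenate(features, axis=1)

print(features.shape)               # (3, 13, 55)
print(concatenated_features.shape)  # (13, 165)
```

Note that the zero-padded frames are carried into `concatenated_features`, so any model trained on that matrix also sees the all-zero columns.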


### Expected behavior

This code should train my model so I can run it and test it on the dataset I have.

### Actual behavior

It does not work and produces the error shown above.
