voice recognition system #4603

@Dareen511

Description

Prerequisites

  • Write a descriptive title.
  • Make sure you are able to repro it on the latest released version.
  • Search the existing issues, especially the pinned issues.

Exception report

Last 5 Keys:
 & Space " C :

Exception:
System.ArgumentOutOfRangeException: The value must be greater than or equal to zero and less than the console's buffer size in that dimension.
Parameter name: left
Actual value was -2.
   at System.Console.SetCursorPosition(Int32 left, Int32 top)
   at Microsoft.PowerShell.Internal.VirtualTerminal.set_CursorLeft(Int32 value)
   at Microsoft.PowerShell.PSConsoleReadLine.ReallyRender(RenderData renderData, String defaultColor)
   at Microsoft.PowerShell.PSConsoleReadLine.ForceRender()
   at Microsoft.PowerShell.PSConsoleReadLine.Insert(Char c)
   at Microsoft.PowerShell.PSConsoleReadLine.SelfInsert(Nullable`1 key, Object arg)
   at Microsoft.PowerShell.PSConsoleReadLine.ProcessOneKey(ConsoleKeyInfo key, Dictionary`2 dispatchTable, Boolean ignoreIfNoAction, Object arg)
   at Microsoft.PowerShell.PSConsoleReadLine.InputLoop()
   at Microsoft.PowerShell.PSConsoleReadLine.ReadLine(Runspace runspace, EngineIntrinsics engineIntrinsics)
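
The innermost frame in this trace is Console.SetCursorPosition, which rejects a column value of -2. As a rough illustration of the range rule quoted in the exception message (this is not the .NET implementation), the check can be sketched in Python. The names `check_cursor_left` and `buffer_width`, and the width of 120 columns, are hypothetical and introduced only for this sketch:

```python
def check_cursor_left(left: int, buffer_width: int) -> int:
    """Return `left` if it is a valid column, otherwise raise ValueError.

    Hypothetical helper: it only mirrors the rule stated in the exception
    message (0 <= left < buffer width), not the real .NET code.
    """
    if not 0 <= left < buffer_width:
        raise ValueError(
            f"left must be >= 0 and < {buffer_width}; actual value was {left}"
        )
    return left


# A column inside the buffer is accepted and returned unchanged.
print(check_cursor_left(10, buffer_width=120))

# The value from the report, -2, falls outside the range and is rejected.
try:
    check_cursor_left(-2, buffer_width=120)
except ValueError as err:
    print(f"rejected: {err}")
```

In the trace, that rejection happens while PSReadLine re-renders the input line after a character is inserted (Insert, then ForceRender, then ReallyRender).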

Screenshot

Image

Environment data

PS HostName: Visual Studio Code Host
PSReadLine EditMode: Windows

Steps to reproduce

import os
import librosa
import numpy as np
import sidekit

# Set your dataset directory

audio_dir = 'path-to-dataset'

# Initialize a list to store MFCC features

mfcc_features_list = []
session_list = [] # Initialize a list to store session identifiers

# Step 1: Extract MFCC Features

print("Extracting MFCC features...")
for file_name in os.listdir(audio_dir):
    if file_name.endswith('.WAV'):
        file_path = os.path.join(audio_dir, file_name)

        # Load the audio file
        audio_signal, sample_rate = librosa.load(file_path, sr=16000)

        # Extract MFCC features
        mfcc_features = librosa.feature.mfcc(y=audio_signal, sr=sample_rate, n_mfcc=13)

        # Append the MFCC features to the list
        mfcc_features_list.append(mfcc_features)
        session_list.append(file_name)  # Append the file name as the session identifier

# Determine the maximum number of frames (time steps)

max_frames = max([mfcc.shape[1] for mfcc in mfcc_features_list])

# Pad the MFCC features so that they all have the same shape

padded_mfcc_features_list = [np.pad(mfcc, ((0, 0), (0, max_frames - mfcc.shape[1])), mode='constant') for mfcc in mfcc_features_list]

# Convert the list to a numpy array

features = np.array(padded_mfcc_features_list)

# Concatenate features along the time axis

concatenated_features = np.concatenate(features, axis=1)

# Step 2: Train the UBM Model

print("Training the UBM model...")
ubm = sidekit.Mixture()
distrib_nb = 512 # Number of distributions (mixtures)
llr = [] # Initialize log-likelihood ratio list
ubm.EM_split(concatenated_features, distrib_nb, session_list, llr) # Train the UBM model using the EM algorithm
np.save('ubm_model.npy', ubm) # Save the trained UBM model to a file

# Step 3: Train the TV Matrix

print("Training the TV Matrix...")
tv_matrix = sidekit.TotalVariability()
tv_matrix.factor_analyze(features, ubm)
np.save('tv_matrix.npy', tv_matrix) # Save the trained TV matrix to a file

# Step 4: Extract and Save I-Vectors

print("Extracting and saving I-vectors...")
enrolled_ivectors = []
labels = ['Alice', 'Bob', 'Charlie'] # Replace with actual names

for feature in features: # Iterate over each speaker's features
    ivector = sidekit.IVector()
    ivector.compute(feature, ubm, tv_matrix)
    enrolled_ivectors.append(ivector)

np.save('enrolled_ivectors.npy', np.array(enrolled_ivectors))
np.save('labels.npy', labels)

print("All models and vectors have been generated and saved.")
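
The padding and concatenation steps in the script change array shapes in a way that is easy to misread. Here is a minimal sketch of what each step yields, using synthetic arrays in place of real MFCC matrices. The frame counts 5, 8, and 6 and the name `padded` are made up for illustration:

```python
import numpy as np

# Synthetic stand-ins for per-file MFCC matrices: 13 coefficients each,
# with different (made-up) frame counts.
rng = np.random.default_rng(0)
mfcc_features_list = [rng.standard_normal((13, n)) for n in (5, 8, 6)]

# Longest file, measured in frames.
max_frames = max(mfcc.shape[1] for mfcc in mfcc_features_list)

# Zero-pad every matrix at the end of the time axis up to max_frames.
padded = [
    np.pad(mfcc, ((0, 0), (0, max_frames - mfcc.shape[1])), mode="constant")
    for mfcc in mfcc_features_list
]

# Stacking gives one 3-D array: (files, coefficients, frames).
features = np.array(padded)
print(features.shape)  # (3, 13, 8)

# np.concatenate treats the 3-D array as a sequence of 2-D arrays and joins
# them along axis 1, so the files end up side by side on the time axis.
concatenated_features = np.concatenate(features, axis=1)
print(concatenated_features.shape)  # (13, 24)
```

The zero frames added by padding are kept in the concatenated matrix, so anything fitted on `concatenated_features` also sees those all-zero columns.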

Expected behavior

This code should train my model so that I can run it and test it on the dataset I have.

Actual behavior

It does not work and produces the exception shown above.
