Arduino Voice Command


An Arduino library for voice command recognition powered by Edge Impulse. This library contains signal processing code and machine learning models to classify real-time audio data and recognize common voice commands.

🎯 Features

  • Real-time Voice Recognition: Recognizes 6 voice commands
  • Supported Commands:
    • backward - Move backward
    • down - Move down
    • go - Start/Go forward
    • left - Turn left
    • right - Turn right
    • up - Move up
  • Optimized Model: TensorFlow Lite Micro model compiled with the EON Compiler
  • Low Memory Footprint: Only ~10KB arena size required
  • High Sample Rate: 16kHz audio sampling
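
The six labels are also available programmatically at runtime. A minimal sketch for printing them, assuming the ei_classifier_inferencing_categories array that Edge Impulse Arduino exports typically declare in src/model-parameters/model_variables.h:

#include <voice_command.h>

void setup() {
    Serial.begin(115200);
    while (!Serial);

    // EI_CLASSIFIER_LABEL_COUNT comes from the model metadata;
    // ei_classifier_inferencing_categories is an assumption based on
    // standard Edge Impulse exports, not something this README defines.
    Serial.println("Supported commands:");
    for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
        Serial.println(ei_classifier_inferencing_categories[ix]);
    }
}

void loop() {}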

📋 Requirements

Hardware Requirements

  • Recommended Boards:
    • Arduino Nano 33 BLE Sense (with microphone)
    • Arduino Portenta H7
    • Arduino Nicla Vision
  • Minimum Specifications:
    • ARM Cortex core
    • At least 64KB RAM
    • Microphone input

Software Dependencies

  • Arduino IDE 1.8.0 or higher / Arduino CLI
  • Required libraries (auto-installed via Arduino Library Manager):
    • Arduino_LSM9DS1 - 9-axis inertial sensor
    • PDM - PDM microphone support
    • Arduino_OV767X - Camera module support

🚀 Installation

Method 1: Arduino Library Manager (Recommended)

  1. Open Arduino IDE
  2. Go to Tools → Manage Libraries...
  3. Search for voice_command
  4. Click Install

Method 2: Manual Installation

  1. Download the ZIP file from this repository
  2. In Arduino IDE, select Sketch → Include Library → Add .ZIP Library...
  3. Select the downloaded ZIP file

Method 3: Git Clone

cd ~/Documents/Arduino/libraries/
git clone https://github.com/fobe-projects/arduino-voice-command.git

📖 Usage Example

Basic Voice Recognition

#include <voice_command.h>
#include <PDM.h>

// Audio buffer
#define SAMPLE_BUFFER_SIZE 16000
int16_t sample_buffer[SAMPLE_BUFFER_SIZE];
volatile int samples_read = 0;

void setup() {
    Serial.begin(115200);
    
    // Initialize PDM microphone
    PDM.onReceive(pdm_data_ready_inference_callback);
    PDM.setBufferSize(4096);
    
    if (!PDM.begin(1, EI_CLASSIFIER_FREQUENCY)) {
        Serial.println("Failed to start PDM!");
        while (1);
    }
    
    Serial.println("Voice command recognition started");
}

void loop() {
    // Wait for audio sampling to complete (the callback stops filling
    // the buffer once it is full)
    if (samples_read >= SAMPLE_BUFFER_SIZE) {
        
        // Run inference
        signal_t signal;
        signal.total_length = SAMPLE_BUFFER_SIZE;
        signal.get_data = &get_audio_signal_data;
        
        ei_impulse_result_t result = { 0 };
        
        EI_IMPULSE_ERROR res = run_classifier(&signal, &result, false);
        
        if (res != EI_IMPULSE_OK) {
            Serial.print("Inference failed: ");
            Serial.println(res);
            return;
        }
        
        // Print prediction results
        Serial.println("Predictions:");
        for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
            Serial.print("  ");
            Serial.print(result.classification[ix].label);
            Serial.print(": ");
            Serial.println(result.classification[ix].value, 4);
        }
        
        // Find command with highest confidence
        float max_confidence = 0;
        const char* detected_command = "";
        
        for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
            if (result.classification[ix].value > max_confidence) {
                max_confidence = result.classification[ix].value;
                detected_command = result.classification[ix].label;
            }
        }
        
        // Execute action if confidence exceeds threshold
        if (max_confidence > 0.6) {
            Serial.print("Detected command: ");
            Serial.println(detected_command);
            handleVoiceCommand(detected_command);
        }
        // Re-arm audio capture only after inference has consumed the buffer
        samples_read = 0;
    }
    
    delay(10);
}

// PDM data callback (runs in interrupt context)
void pdm_data_ready_inference_callback() {
    // Clamp the read so a full sample buffer is never overrun
    int bytesFree = (SAMPLE_BUFFER_SIZE - samples_read) * 2;
    int bytesAvailable = min(PDM.available(), bytesFree);
    if (bytesAvailable <= 0) return;

    int bytesRead = PDM.read((char *)&sample_buffer[samples_read], bytesAvailable);
    samples_read += bytesRead / 2; // 16-bit samples
}

// Convert raw int16 samples to the float buffer expected by the classifier
int get_audio_signal_data(size_t offset, size_t length, float *out_ptr) {
    for (size_t i = 0; i < length; i++) {
        out_ptr[i] = (float)sample_buffer[offset + i];
    }
    return 0;
}

// Handle detected voice command
void handleVoiceCommand(const char* command) {
    if (strcmp(command, "go") == 0) {
        Serial.println("Action: Move forward");
        // Add your code here
    }
    else if (strcmp(command, "backward") == 0) {
        Serial.println("Action: Move backward");
        // Add your code here
    }
    else if (strcmp(command, "left") == 0) {
        Serial.println("Action: Turn left");
        // Add your code here
    }
    else if (strcmp(command, "right") == 0) {
        Serial.println("Action: Turn right");
        // Add your code here
    }
    else if (strcmp(command, "up") == 0) {
        Serial.println("Action: Move up");
        // Add your code here
    }
    else if (strcmp(command, "down") == 0) {
        Serial.println("Action: Move down");
        // Add your code here
    }
}

🔧 Configuration Parameters

Main configuration parameters are defined in src/model-parameters/model_metadata.h:

Parameter                                 Value   Description
EI_CLASSIFIER_FREQUENCY                   16000   Sample frequency (Hz)
EI_CLASSIFIER_RAW_SAMPLE_COUNT            16000   Number of samples per inference
EI_CLASSIFIER_LABEL_COUNT                 6       Number of classification labels
EI_CLASSIFIER_THRESHOLD                   0.6     Recognition threshold
EI_CLASSIFIER_INTERVAL_MS                 0.0625  Sampling interval (ms)
EI_CLASSIFIER_TFLITE_LARGEST_ARENA_SIZE   9990    TensorFlow Lite arena size (bytes)
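
These values can also be echoed over serial at startup to confirm which model build is running; a minimal sketch using only the macros documented in the table above:

#include <voice_command.h>

void setup() {
    Serial.begin(115200);
    while (!Serial);

    // Print the model parameters baked into this build
    Serial.print("Sample frequency (Hz): ");
    Serial.println(EI_CLASSIFIER_FREQUENCY);
    Serial.print("Samples per inference: ");
    Serial.println(EI_CLASSIFIER_RAW_SAMPLE_COUNT);
    Serial.print("Label count: ");
    Serial.println(EI_CLASSIFIER_LABEL_COUNT);
}

void loop() {}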

📊 Model Information

  • Model Type: TensorFlow Lite Micro (Quantized)
  • Inference Engine: EON Compiler (Compiled)
  • Input Format: INT8 quantized
  • Output Format: INT8 quantized
  • Feature Extraction: MFCC (Mel-frequency cepstral coefficients)
  • Project ID: 818488
  • Deployment Version: 10
  • Edge Impulse Studio Version: 1.78.1

πŸ› οΈ Development & Debugging

View Inference Performance

Serial.print("Inference time: ");
Serial.print(result.timing.dsp);
Serial.print(" ms (DSP), ");
Serial.print(result.timing.classification);
Serial.println(" ms (classification)");

Adjust Confidence Threshold

If false positives occur, increase the threshold:

// Increase to 0.8 to reduce false positives
if (max_confidence > 0.8) {
    // Handle command
}

Enable Detailed Logging

Define before including voice_command.h:

#define EI_DEBUG 1
#include <voice_command.h>

📚 API Reference

Main Functions

run_classifier(signal_t *signal, ei_impulse_result_t *result, bool debug)

Run classifier inference

  • Parameters:
    • signal: Input signal structure
    • result: Output result structure
    • debug: Enable debug output
  • Returns: EI_IMPULSE_ERROR error code

run_classifier_continuous(signal_t *signal, ei_impulse_result_t *result, bool debug)

Continuous inference mode with sliding window support
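
A minimal sketch of the continuous pattern, assuming the EI_CLASSIFIER_SLICES_PER_MODEL_WINDOW / EI_CLASSIFIER_SLICE_SIZE macros and run_classifier_init() that the Edge Impulse SDK's continuous-inference examples rely on:

// Must be defined before the include so the slice macros are generated
#define EI_CLASSIFIER_SLICES_PER_MODEL_WINDOW 4
#include <voice_command.h>

static int16_t slice_buffer[EI_CLASSIFIER_SLICE_SIZE];

static int get_slice_data(size_t offset, size_t length, float *out_ptr) {
    for (size_t i = 0; i < length; i++) {
        out_ptr[i] = (float)slice_buffer[offset + i];
    }
    return 0;
}

void classify_slice() {
    signal_t signal;
    signal.total_length = EI_CLASSIFIER_SLICE_SIZE;
    signal.get_data = &get_slice_data;

    ei_impulse_result_t result = { 0 };

    // Call run_classifier_init() once before the first slice; each call
    // below then shifts one slice into the model's sliding window.
    EI_IMPULSE_ERROR res = run_classifier_continuous(&signal, &result, false);
    if (res != EI_IMPULSE_OK) {
        Serial.println("Continuous inference failed");
    }
}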

Data Structures

ei_impulse_result_t

typedef struct {
    ei_impulse_result_classification_t classification[EI_CLASSIFIER_LABEL_COUNT];
    ei_impulse_result_timing_t timing;
} ei_impulse_result_t;

signal_t

typedef struct {
    size_t total_length;
    get_signal_data_fn get_data;
} signal_t;
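
When the samples are already in a flat float array, the struct does not have to be filled by hand; a sketch assuming the numpy::signal_from_buffer() helper bundled with the Edge Impulse SDK (edge-impulse-sdk/dsp/numpy.hpp):

#include <voice_command.h>

static float features[EI_CLASSIFIER_RAW_SAMPLE_COUNT];

void classify_buffer() {
    signal_t signal;

    // Wires total_length and get_data to the float array in one call
    // (helper assumed from the bundled SDK, not part of this README's API)
    int err = numpy::signal_from_buffer(features, EI_CLASSIFIER_RAW_SAMPLE_COUNT, &signal);
    if (err != 0) {
        return;
    }

    ei_impulse_result_t result = { 0 };
    run_classifier(&signal, &result, false);
}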

🤝 Contributing

Issues and pull requests are welcome!

  1. Fork this repository
  2. Create a feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

πŸ“ License

This project uses the following licenses:

  • Edge Impulse SDK: Apache License 2.0 (see src/edge-impulse-sdk/LICENSE)
  • Model Files: Edge Impulse Commercial License (see src/model-parameters/model_metadata.h)
  • TensorFlow Lite: Apache License 2.0 (see src/edge-impulse-sdk/tensorflow/LICENSE)

Important Note: The machine learning models included in this library require an active Edge Impulse subscription to use. Please refer to the Edge Impulse Terms of Service for details.

❓ FAQ

Q: What if recognition accuracy is low?

A: Try the following methods:

  • Ensure the microphone is working properly and unobstructed
  • Test in a quiet environment
  • Speak commands clearly
  • Adjust the confidence threshold

Q: Does it support commands in other languages?

A: The current model is trained with English commands. To support other languages, you need to retrain the model on the Edge Impulse platform.

Q: Can I add custom commands?

A: Yes. You need to collect new training data on the Edge Impulse platform, retrain the model, and export a new Arduino library.

Q: What if I run out of memory?

A: Ensure you're using an Arduino board with at least 64KB of RAM, or optimize the model size in Edge Impulse Studio.

πŸ‘¨β€πŸ’» Authors

  • EdgeImpulse Inc. - Library Maintainer
  • chihosin - Project Owner

πŸ™ Acknowledgments

  • Edge Impulse team for the machine learning toolchain
  • TensorFlow Lite Micro team
  • Arduino community

Note: Before using this library, please ensure you have read and agreed to the Edge Impulse Terms of Service. Commercial use requires an active Edge Impulse subscription.
