Arduino Voice Command


An Arduino library for voice command recognition powered by Edge Impulse. This library contains signal processing code and machine learning models to classify real-time audio data and recognize common voice commands.

🎯 Features

  • Real-time Voice Recognition: Recognizes 6 voice commands
  • Supported Commands:
    • backward - Move backward
    • down - Move down
    • go - Start/Go forward
    • left - Turn left
    • right - Turn right
    • up - Move up
  • Optimized Model: TensorFlow Lite Micro model compiled with the EON Compiler
  • Low Memory Footprint: Only ~10KB arena size required
  • High Sample Rate: 16kHz audio sampling
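
The six labels are also available programmatically at runtime. A minimal sketch for printing them, assuming the ei_classifier_inferencing_categories array that Edge Impulse Arduino exports typically declare in src/model-parameters/model_variables.h:

#include <voice_command.h>

void setup() {
    Serial.begin(115200);
    while (!Serial);

    // EI_CLASSIFIER_LABEL_COUNT comes from the model metadata;
    // ei_classifier_inferencing_categories is an assumption based on
    // standard Edge Impulse exports, not something this README defines.
    Serial.println("Supported commands:");
    for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
        Serial.println(ei_classifier_inferencing_categories[ix]);
    }
}

void loop() {}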

📋 Requirements

Hardware Requirements

  • Recommended Boards:
    • Arduino Nano 33 BLE Sense (with microphone)
    • Arduino Portenta H7
    • Arduino Nicla Vision
  • Minimum Specifications:
    • ARM Cortex core
    • At least 64KB RAM
    • Microphone input

Software Dependencies

  • Arduino IDE 1.8.0 or higher / Arduino CLI
  • Required libraries (auto-installed via Arduino Library Manager):
    • Arduino_LSM9DS1 - 9-axis inertial sensor
    • PDM - PDM microphone support
    • Arduino_OV767X - Camera module support

🚀 Installation

Method 1: Arduino Library Manager (Recommended)

  1. Open Arduino IDE
  2. Go to Tools → Manage Libraries...
  3. Search for voice_command
  4. Click Install

Method 2: Manual Installation

  1. Download the ZIP file from this repository
  2. In Arduino IDE, select Sketch → Include Library → Add .ZIP Library...
  3. Select the downloaded ZIP file

Method 3: Git Clone

cd ~/Documents/Arduino/libraries/
git clone https://github.com/fobe-projects/arduino-voice-command.git

📖 Usage Example

Basic Voice Recognition

#include <voice_command.h>
#include <PDM.h>

// Audio buffer
#define SAMPLE_BUFFER_SIZE 16000
int16_t sample_buffer[SAMPLE_BUFFER_SIZE];
volatile int samples_read = 0;

void setup() {
    Serial.begin(115200);
    
    // Initialize PDM microphone
    PDM.onReceive(pdm_data_ready_inference_callback);
    PDM.setBufferSize(4096);
    
    if (!PDM.begin(1, EI_CLASSIFIER_FREQUENCY)) {
        Serial.println("Failed to start PDM!");
        while (1);
    }
    
    Serial.println("Voice command recognition started");
}

void loop() {
    // Wait for audio sampling to complete (the callback stops filling
    // the buffer once it is full)
    if (samples_read >= SAMPLE_BUFFER_SIZE) {
        
        // Run inference
        signal_t signal;
        signal.total_length = SAMPLE_BUFFER_SIZE;
        signal.get_data = &get_audio_signal_data;
        
        ei_impulse_result_t result = { 0 };
        
        EI_IMPULSE_ERROR res = run_classifier(&signal, &result, false);
        
        if (res != EI_IMPULSE_OK) {
            Serial.print("Inference failed: ");
            Serial.println(res);
            return;
        }
        
        // Print prediction results
        Serial.println("Predictions:");
        for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
            Serial.print("  ");
            Serial.print(result.classification[ix].label);
            Serial.print(": ");
            Serial.println(result.classification[ix].value, 4);
        }
        
        // Find command with highest confidence
        float max_confidence = 0;
        const char* detected_command = "";
        
        for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
            if (result.classification[ix].value > max_confidence) {
                max_confidence = result.classification[ix].value;
                detected_command = result.classification[ix].label;
            }
        }
        
        // Execute action if confidence exceeds threshold
        if (max_confidence > 0.6) {
            Serial.print("Detected command: ");
            Serial.println(detected_command);
            handleVoiceCommand(detected_command);
        }
        // Re-arm audio capture only after inference has consumed the buffer
        samples_read = 0;
    }
    
    delay(10);
}

// PDM data callback (runs in interrupt context)
void pdm_data_ready_inference_callback() {
    // Clamp the read so a full sample buffer is never overrun
    int bytesFree = (SAMPLE_BUFFER_SIZE - samples_read) * 2;
    int bytesAvailable = min(PDM.available(), bytesFree);
    if (bytesAvailable <= 0) return;

    int bytesRead = PDM.read((char *)&sample_buffer[samples_read], bytesAvailable);
    samples_read += bytesRead / 2; // 16-bit samples
}

// Convert raw int16 samples to the float buffer expected by the classifier
int get_audio_signal_data(size_t offset, size_t length, float *out_ptr) {
    for (size_t i = 0; i < length; i++) {
        out_ptr[i] = (float)sample_buffer[offset + i];
    }
    return 0;
}

// Handle detected voice command
void handleVoiceCommand(const char* command) {
    if (strcmp(command, "go") == 0) {
        Serial.println("Action: Move forward");
        // Add your code here
    }
    else if (strcmp(command, "backward") == 0) {
        Serial.println("Action: Move backward");
        // Add your code here
    }
    else if (strcmp(command, "left") == 0) {
        Serial.println("Action: Turn left");
        // Add your code here
    }
    else if (strcmp(command, "right") == 0) {
        Serial.println("Action: Turn right");
        // Add your code here
    }
    else if (strcmp(command, "up") == 0) {
        Serial.println("Action: Move up");
        // Add your code here
    }
    else if (strcmp(command, "down") == 0) {
        Serial.println("Action: Move down");
        // Add your code here
    }
}

🔧 Configuration Parameters

Main configuration parameters are defined in src/model-parameters/model_metadata.h:

Parameter                                 Value   Description
EI_CLASSIFIER_FREQUENCY                   16000   Sample frequency (Hz)
EI_CLASSIFIER_RAW_SAMPLE_COUNT            16000   Number of samples per inference
EI_CLASSIFIER_LABEL_COUNT                 6       Number of classification labels
EI_CLASSIFIER_THRESHOLD                   0.6     Recognition threshold
EI_CLASSIFIER_INTERVAL_MS                 0.0625  Sampling interval (ms)
EI_CLASSIFIER_TFLITE_LARGEST_ARENA_SIZE   9990    TensorFlow Lite arena size (bytes)
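
These values can also be echoed over serial at startup to confirm which model build is running; a minimal sketch using only the macros documented in the table above:

#include <voice_command.h>

void setup() {
    Serial.begin(115200);
    while (!Serial);

    // Print the model parameters baked into this build
    Serial.print("Sample frequency (Hz): ");
    Serial.println(EI_CLASSIFIER_FREQUENCY);
    Serial.print("Samples per inference: ");
    Serial.println(EI_CLASSIFIER_RAW_SAMPLE_COUNT);
    Serial.print("Label count: ");
    Serial.println(EI_CLASSIFIER_LABEL_COUNT);
}

void loop() {}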

📊 Model Information

  • Model Type: TensorFlow Lite Micro (Quantized)
  • Inference Engine: EON Compiler (Compiled)
  • Input Format: INT8 quantized
  • Output Format: INT8 quantized
  • Feature Extraction: MFCC (Mel-frequency cepstral coefficients)
  • Project ID: 818488
  • Deployment Version: 10
  • Edge Impulse Studio Version: 1.78.1

πŸ› οΈ Development & Debugging

View Inference Performance

Serial.print("Inference time: ");
Serial.print(result.timing.dsp);
Serial.print(" ms (DSP), ");
Serial.print(result.timing.classification);
Serial.println(" ms (classification)");

Adjust Confidence Threshold

If false positives occur, increase the threshold:

// Increase to 0.8 to reduce false positives
if (max_confidence > 0.8) {
    // Handle command
}

Enable Detailed Logging

Define before including voice_command.h:

#define EI_DEBUG 1
#include <voice_command.h>

📚 API Reference

Main Functions

run_classifier(signal_t *signal, ei_impulse_result_t *result, bool debug)

Run classifier inference

  • Parameters:
    • signal: Input signal structure
    • result: Output result structure
    • debug: Enable debug output
  • Returns: EI_IMPULSE_ERROR error code

run_classifier_continuous(signal_t *signal, ei_impulse_result_t *result, bool debug)

Continuous inference mode with sliding window support
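
A minimal sketch of the continuous pattern, assuming the EI_CLASSIFIER_SLICES_PER_MODEL_WINDOW / EI_CLASSIFIER_SLICE_SIZE macros and run_classifier_init() that the Edge Impulse SDK's continuous-inference examples rely on:

// Must be defined before the include so the slice macros are generated
#define EI_CLASSIFIER_SLICES_PER_MODEL_WINDOW 4
#include <voice_command.h>

static int16_t slice_buffer[EI_CLASSIFIER_SLICE_SIZE];

static int get_slice_data(size_t offset, size_t length, float *out_ptr) {
    for (size_t i = 0; i < length; i++) {
        out_ptr[i] = (float)slice_buffer[offset + i];
    }
    return 0;
}

void classify_slice() {
    signal_t signal;
    signal.total_length = EI_CLASSIFIER_SLICE_SIZE;
    signal.get_data = &get_slice_data;

    ei_impulse_result_t result = { 0 };

    // Call run_classifier_init() once before the first slice; each call
    // below then shifts one slice into the model's sliding window.
    EI_IMPULSE_ERROR res = run_classifier_continuous(&signal, &result, false);
    if (res != EI_IMPULSE_OK) {
        Serial.println("Continuous inference failed");
    }
}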

Data Structures

ei_impulse_result_t

typedef struct {
    ei_impulse_result_classification_t classification[EI_CLASSIFIER_LABEL_COUNT];
    ei_impulse_result_timing_t timing;
} ei_impulse_result_t;

signal_t

typedef struct {
    size_t total_length;
    get_signal_data_fn get_data;
} signal_t;
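
When the samples are already in a flat float array, the struct does not have to be filled by hand; a sketch assuming the numpy::signal_from_buffer() helper bundled with the Edge Impulse SDK (edge-impulse-sdk/dsp/numpy.hpp):

#include <voice_command.h>

static float features[EI_CLASSIFIER_RAW_SAMPLE_COUNT];

void classify_buffer() {
    signal_t signal;

    // Wires total_length and get_data to the float array in one call
    // (helper assumed from the bundled SDK, not part of this README's API)
    int err = numpy::signal_from_buffer(features, EI_CLASSIFIER_RAW_SAMPLE_COUNT, &signal);
    if (err != 0) {
        return;
    }

    ei_impulse_result_t result = { 0 };
    run_classifier(&signal, &result, false);
}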

🤝 Contributing

Issues and pull requests are welcome!

  1. Fork this repository
  2. Create a feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

πŸ“ License

This project uses the following licenses:

  • Edge Impulse SDK: Apache License 2.0 (see src/edge-impulse-sdk/LICENSE)
  • Model Files: Edge Impulse Commercial License (see src/model-parameters/model_metadata.h)
  • TensorFlow Lite: Apache License 2.0 (see src/edge-impulse-sdk/tensorflow/LICENSE)

Important Note: The machine learning models included in this library require an active Edge Impulse subscription to use. Please refer to the Edge Impulse Terms of Service for details.

❓ FAQ

Q: What if recognition accuracy is low?

A: Try the following methods:

  • Ensure the microphone is working properly and unobstructed
  • Test in a quiet environment
  • Speak commands clearly
  • Adjust the confidence threshold

Q: Does it support commands in other languages?

A: The current model is trained with English commands. To support other languages, you need to retrain the model on the Edge Impulse platform.

Q: Can I add custom commands?

A: Yes. You need to collect new training data on the Edge Impulse platform, retrain the model, and export a new Arduino library.

Q: What if I run out of memory?

A: Ensure you're using an Arduino board with at least 64KB of RAM, or optimize the model size in Edge Impulse Studio.

πŸ‘¨β€πŸ’» Authors

  • EdgeImpulse Inc. - Library Maintainer
  • chihosin - Project Owner

πŸ™ Acknowledgments

  • Edge Impulse team for the machine learning toolchain
  • TensorFlow Lite Micro team
  • Arduino community

Note: Before using this library, please ensure you have read and agreed to the Edge Impulse Terms of Service. Commercial use requires an active Edge Impulse subscription.
