# SHL - Grammar Scoring Engine for Voice Samples

## 🎤 Predict Grammar Scores from Spoken Audio

## 🧠 Objective

Build a machine learning model that can automatically evaluate spoken audio and assign a grammar score (1–5) based on sentence structure and syntax quality.

## 🗂️ Dataset Overview

Mind Map - Dataset
├── Audio Files (.wav)
│   ├── audios_train/
│   └── audios_test/
├── train.csv
│   └── filename + grammar score
├── test.csv
│   └── filename only
└── sample_submission.csv
    └── sample format for output
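
As a rough illustration of how the label file described above could be read, here is a minimal sketch using only the standard library. The column names `filename` and `label` and the sample rows are assumptions made for the example; check the header of the real `train.csv` before reusing it.

```python
import csv
import io


def load_labels(csv_file):
    """Return a {filename: score} mapping from a train.csv-style file object.

    Assumes two columns named 'filename' and 'label' (hypothetical names).
    """
    reader = csv.DictReader(csv_file)
    return {row["filename"]: float(row["label"]) for row in reader}


# Made-up rows standing in for train.csv; real data would be read with open("train.csv").
sample = io.StringIO("filename,label\naudio_1.wav,3.5\naudio_2.wav,2.0\n")
labels = load_labels(sample)
```

The same function works on the real file by passing `open("train.csv", newline="")` instead of the in-memory sample.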

---

## ⚙️ Workflow / Pipeline

Mind Map - Workflow

- 🎧 Audio to Text: transcribe each recording with Whisper
- ✨ Text Cleaning: remove punctuation, lowercase, and collapse extra spaces
- 🧮 Feature Extraction: TF-IDF Vectorizer (max 1000 features)
- 🌲 Model Training: Random Forest Regressor
- 📊 Evaluation: Pearson correlation
- 🧪 Prediction on test set: generate `submission.csv`
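
The text side of the workflow above (cleaning, TF-IDF, Random Forest) can be sketched with scikit-learn roughly as follows. This is not the notebook's exact code: the transcripts and scores are made up, the Whisper transcription step is assumed to have already produced plain-text strings, and `n_estimators` and `random_state` are illustrative choices.

```python
import re

from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline


def clean_text(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def build_model(random_state: int = 42):
    """TF-IDF (at most 1000 features) feeding a Random Forest regressor."""
    return make_pipeline(
        TfidfVectorizer(preprocessor=clean_text, max_features=1000),
        RandomForestRegressor(n_estimators=100, random_state=random_state),
    )


# Hypothetical transcripts and grammar scores, standing in for Whisper output and train.csv.
train_texts = [
    "She goes to school every day.",
    "He go school yesterday and he not like.",
    "They have been working on the project since Monday.",
    "Me want going there now.",
]
train_scores = [5.0, 2.0, 5.0, 1.0]

model = build_model()
model.fit(train_texts, train_scores)
predictions = model.predict(["He go to school yesterday!"])
```

Wrapping the vectorizer and the regressor in one pipeline keeps the cleaning step identical at training and prediction time, which matters when the test transcripts are processed later.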


---

## 📈 Evaluation Metric
**Pearson correlation** between predicted and true grammar scores is used to evaluate prediction quality.

📌 Final Public Score: 0.519
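
For reference, the Pearson correlation between true and predicted scores can be computed as in the sketch below. It is a plain-Python illustration of the standard formula, not the competition's scoring code; the function name is a placeholder, and `scipy.stats.pearsonr` gives the same coefficient if SciPy is available.

```python
import math


def pearson_correlation(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_true = sum(y_true) / len(y_true)
    mean_pred = sum(y_pred) / len(y_pred)
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    var_true = sum((t - mean_true) ** 2 for t in y_true)
    var_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    if var_true == 0 or var_pred == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_true * var_pred)
```

A value near 1 means the predictions rise and fall with the true scores; the metric ignores any constant offset or scale in the predictions.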


---

## 📁 Files Included
- `Untitled0.ipynb` - Main notebook with code and explanations
- `submission.csv` - Output file with predictions for test set

---

## 💡 Future Enhancements

Mind Map - Improvements

- Use advanced models (e.g. BERT, XGBoost)
- Handle diverse accents
- Use grammar-checking NLP tools
- Add audio-based features (e.g. fluency, pause detection)


---

## 👤 Author
**Crafted with care by Avin Raj** ✨

📬 For queries or collaborations, feel free to reach out!
