Speaker Identification and Gender Classification

This repository contains the implementation of a Machine Learning pipeline for Speaker Identification and Gender Classification using audio features.

🚀 Project Overview

The goal of this project is to develop robust models that can:

Classify Gender: Determine whether a speaker is male or female.
Identify Speakers: Distinguish between different speakers based on their voice characteristics.

The project combines audio signal processing (silence removal, noise reduction, filtering, and feature extraction) with machine learning models ranging from classical classifiers (SVM, KNN, XGBoost, AdaBoost) to a neural network (MLP).

📂 Repository Structure

├── data/                   # Data directory (raw and processed)
├── notebooks/              # Jupyter notebooks for experimentation
├── scripts/                # Executable scripts for training and evaluation
├── src/                    # Source code for the project
│   ├── data/               # Data loading and cleaning
│   ├── features/           # Audio processing and feature extraction
│   ├── models/             # Model definitions (Sklearn, Keras, etc.)
│   └── visualization/      # Plotting and evaluation utilities
├── requirements.txt        # Project dependencies
├── setup.py                # Package setup script
└── README.md               # Project documentation

🛠️ Installation

Clone the repository:

git clone https://github.com/your-username/Speaker-ID-Gender-Classification.git
cd Speaker-ID-Gender-Classification

Create a virtual environment (recommended):

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt
pip install -e .

📊 Methodology

Feature Extraction

We extract a rich set of audio features including:

Spectral Features: MFCC, Spectral Centroid, Bandwidth, Contrast, Roll-off.
Temporal Features: Zero Crossing Rate, RMS Energy.
Prosodic Features: Fundamental Frequency (F0), Jitter, Shimmer.
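
As a rough illustration of the temporal features listed above, the sketch below computes Zero Crossing Rate and RMS Energy per frame in plain Python. The function names, frame length, hop length, and the 8 kHz test tone are assumptions chosen for the example, not values taken from this project; a real pipeline would normally rely on an audio library instead.

```python
import math


def frame_signal(samples, frame_length, hop_length):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [
        samples[start:start + frame_length]
        for start in range(0, len(samples) - frame_length + 1, hop_length)
    ]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def rms_energy(frame):
    """Root-mean-square amplitude of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


# Hypothetical input: one second of a 440 Hz sine sampled at 8 kHz.
sr = 8000
tone = [math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]

frames = frame_signal(tone, frame_length=400, hop_length=200)
zcr = [zero_crossing_rate(f) for f in frames]
rms = [rms_energy(f) for f in frames]
```

For a pure tone, the zero crossing rate is close to twice the tone frequency divided by the sample rate, and the RMS is close to the amplitude divided by the square root of two, which makes this a convenient sanity check before moving to real speech.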

Processing Pipeline

Silence Removal: Trimming silence using spectral-centroid-based windowing.
Noise Reduction: Spectral subtraction to enhance signal quality.
Filtering: Bandpass filter (80 Hz to 5000 Hz) to isolate the main human speech frequencies.
Resampling: Standardizing the sample rate to 44.1 kHz.
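
The filtering step can be sketched as follows. This README does not say which filter design the project uses, so the example swaps in a simple brick-wall FFT mask (not a designed filter such as a Butterworth) purely to show the effect of the 80 Hz to 5000 Hz passband. The function name `bandpass_fft` and the synthetic test signal are assumptions; NumPy is required.

```python
import numpy as np


def bandpass_fft(signal, sr, low_hz=80.0, high_hz=5000.0):
    """Zero every frequency bin outside [low_hz, high_hz] and transform back.

    Illustrative brick-wall mask only; it is not the project's filter.
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    mask = (freqs >= low_hz) & (freqs <= high_hz)
    return np.fft.irfft(spectrum * mask, n=len(signal))


# Hypothetical input: 50 Hz hum plus a 1 kHz tone, one second at 44.1 kHz.
sr = 44100
t = np.arange(sr) / sr
hum = np.sin(2 * np.pi * 50 * t)
voice_band = np.sin(2 * np.pi * 1000 * t)

filtered = bandpass_fft(hum + voice_band, sr)
```

After filtering, the 50 Hz component falls outside the passband and is removed, while the 1 kHz component passes through unchanged.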

Models

We experiment with multiple architectures:

Support Vector Machine (SVM): RBF kernel for non-linear separation.
K-Nearest Neighbors (KNN): Baseline distance-based classifier.
XGBoost / AdaBoost: Ensemble methods for improved robustness.
Multi-Layer Perceptron (MLP): Neural network approach built with Keras/TensorFlow.
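
To make the distance-based baseline concrete, here is a minimal K-Nearest Neighbors sketch in plain Python: it ranks training points by Euclidean distance to the query and takes a majority vote among the closest k. The function name `knn_predict`, the two-dimensional feature vectors, and the speaker labels are invented for illustration; the project's own models are library implementations, per the repository structure above.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Made-up toy feature vectors, not real audio features.
train_X = [
    (1.0, 1.1), (0.9, 1.0), (1.2, 0.8),   # cluster for speaker_a
    (3.0, 3.2), (3.1, 2.9), (2.8, 3.0),   # cluster for speaker_b
]
train_y = ["speaker_a"] * 3 + ["speaker_b"] * 3

near_a = knn_predict(train_X, train_y, (1.0, 0.9))
near_b = knn_predict(train_X, train_y, (3.0, 3.0))
```

Because KNN compares raw distances, features on very different scales (for example F0 in Hz next to a zero crossing rate below 1) would usually need to be standardized first.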

🏃‍♂️ Usage

1. Download Data

The dataset is hosted on Google Drive. Run the setup script to download and structure the data:

python src/data/download.py

2. Train Gender Classifier

To train and evaluate the gender classification model:

python scripts/train_gender.py --model svm

Available models: svm, knn, xgboost, adaboost.
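
For readers curious how a `--model` flag restricted to these four values could be handled, the following is a minimal sketch using `argparse`. It is not the contents of `scripts/train_gender.py`; the `build_parser` helper and the choice to make the flag required are assumptions made for the example.

```python
import argparse

# Model names as listed in this README.
MODEL_CHOICES = ("svm", "knn", "xgboost", "adaboost")


def build_parser():
    """Build a parser that accepts only the documented model names."""
    parser = argparse.ArgumentParser(
        description="Train and evaluate a gender classifier (illustrative)."
    )
    parser.add_argument(
        "--model",
        choices=MODEL_CHOICES,
        required=True,  # assumption: the real script may define a default instead
        help="Which classifier to train.",
    )
    return parser


# Parse an explicit argument list so the example does not read sys.argv.
args = build_parser().parse_args(["--model", "knn"])
print(args.model)
```

With `choices` set, an unsupported name makes `argparse` exit with a usage message instead of failing later inside the training code.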

📈 Results

Model     Accuracy  Precision  Recall
SVM       0.98      0.98       0.98
XGBoost   0.97      0.97       0.97
KNN       0.96      0.96       0.95

(Note: Results may vary slightly depending on the random seed and data split.)
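
For reference, the three metrics in the table can be computed from predictions as sketched below. The README does not state how precision and recall were averaged across classes, so this example simply treats one class as positive; the function name `binary_metrics` and the toy labels are made up for illustration and are not project results.

```python
def binary_metrics(y_true, y_pred, positive):
    """Return (accuracy, precision, recall) with `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Toy labels only: 3 true positives, 1 false positive, 1 false negative, 3 true negatives.
y_true = ["f", "f", "f", "m", "m", "m", "m", "f"]
y_pred = ["f", "f", "m", "m", "m", "m", "f", "f"]

accuracy, precision, recall = binary_metrics(y_true, y_pred, positive="f")
```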

👥 Contributors

Mostafa Kermani Nia - Lead Developer & Researcher

📄 License

This project is licensed under the MIT License.
