Merged
2 changes: 1 addition & 1 deletion .github/AUTHORS.md
@@ -1,2 +1,2 @@
 ## Author(s)
--[Jeremy Vachier](https://github.com/jvachier)
+-[Jeremy Vachier](https://github.com/jvachier)
2 changes: 1 addition & 1 deletion .github/CODEOWNERS
@@ -1 +1 @@
-* @jvachier
+* @jvachier
2 changes: 1 addition & 1 deletion .github/CONTRIBUTORS.md
@@ -1,4 +1,4 @@
 # Contributor(s)

 ## Main contributor(s)
--[Jeremy Vachier](https://github.com/jvachier)
+-[Jeremy Vachier](https://github.com/jvachier)
6 changes: 3 additions & 3 deletions .github/ISSUE_TEMPLATE/issue_template.md
@@ -16,9 +16,9 @@

 ## Steps to Reproduce
 <!-- For bugs, provide a step-by-step guide to reproduce the issue. Skip this section for feature requests. -->
-1.
-2.
-3.
+1.
+2.
+3.

 ## Expected Behavior
 <!-- Describe what you expected to happen. -->
2 changes: 1 addition & 1 deletion .github/pull_request_template.md
@@ -24,4 +24,4 @@ Before submitting your pull request, ensure the following:
 - [ ] I have added or updated relevant documentation (if applicable).
 - [ ] My changes do not introduce any new warnings or errors.
 - [ ] I have checked for security vulnerabilities in the code.
-- [ ] I have ensured backward compatibility (if applicable).
+- [ ] I have ensured backward compatibility (if applicable).
25 changes: 16 additions & 9 deletions .github/workflows/test.yaml
@@ -25,23 +25,30 @@ jobs:
     steps:
       # Checkout the repository
       - name: Checkout code
-        uses: actions/checkout@v3
+        uses: actions/checkout@v4

       # Set up Python
       - name: Set up Python
-        uses: actions/setup-python@v4
+        uses: actions/setup-python@v5
         with:
           python-version: "3.11"

-      # Install dependencies
+      # Install uv
+      - name: Install uv
+        uses: astral-sh/setup-uv@v2
+
+      # Install dependencies (dev only, audio not needed for tests)
       - name: Install dependencies
         run: |
-          python -m pip install --upgrade pip
-          pip install poetry
-          poetry config virtualenvs.create false
-          poetry install --without macos --without pyaudio --without kaggle
+          uv sync --extra dev

+      # Run code quality checks
+      - name: Run code quality checks
+        run: |
+          uv run ruff check ./src ./app ./tests
+          uv run ruff format --check ./src ./app ./tests
+
-      # Run pytest
+      # Run pytest (excludes audio-dependent modules like speech_to_text)
       - name: Run tests with pytest
         run: |
-          poetry run pytest
+          uv run pytest
7 changes: 5 additions & 2 deletions .gitignore
@@ -105,7 +105,10 @@ ipython_config.py
 # Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control.
 # This is especially recommended for binary packages to ensure reproducibility, and is more
 # commonly ignored for libraries.
-#uv.lock
+# uv
+# uv.lock should be committed to version control for reproducible builds
+# This is especially recommended for applications to ensure reproducibility
+# uv.lock

 # poetry
 # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
@@ -185,4 +188,4 @@ vosk-model-en-us-0.22/
 vosk-model-small-sv-rhasspy-0.15/
 recognized_text.txt
 src/models/*.keras
-src/models/*.json
+src/models/*.json
34 changes: 34 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,34 @@
+# Pre-commit hooks configuration
+# See https://pre-commit.com for more information
+repos:
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v4.6.0
+    hooks:
+      - id: trailing-whitespace
+      - id: end-of-file-fixer
+      - id: check-merge-conflict
+      - id: check-yaml
+      - id: check-toml
+      - id: check-json
+      - id: check-added-large-files
+
+  - repo: https://github.com/astral-sh/ruff-pre-commit
+    rev: v0.6.8
+    hooks:
+      - id: ruff
+        args: [--fix]
+        files: ^(src/|app/|tests/).*\.py$
+      - id: ruff-format
+        files: ^(src/|app/|tests/).*\.py$
+        # Make ruff-format more flexible - only run when SKIP_FORMAT is not set
+        stages: [manual]
+
+  - repo: local
+    hooks:
+      - id: pytest
+        name: pytest
+        entry: uv run python -m pytest
+        language: system
+        pass_filenames: false
+        always_run: true
+        args: [tests/, -v]
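
Note on the `files` filter used by the two ruff hooks: pre-commit treats `files` as a Python regular expression and applies it with `re.search` to each path relative to the repository root. The sketch below shows which paths the pattern would select. The helper name `hook_applies` and the sample paths are hypothetical and only serve as illustration.

```python
import re

# Pattern copied verbatim from the `files:` key of the ruff hooks.
RUFF_FILES = re.compile(r"^(src/|app/|tests/).*\.py$")


def hook_applies(path: str) -> bool:
    """Return True if a repo-relative path passes the hook's `files` filter."""
    return RUFF_FILES.search(path) is not None


# Hypothetical paths, for illustration only.
samples = [
    "src/modules/utils.py",
    "app/voice_to_text_app.py",
    "tests/test_example.py",
    "setup.py",
    "src/models/params.json",
    "docs/src/example.py",
]

for path in samples:
    print(f"{path}: {hook_applies(path)}")
```

Because the pattern is anchored with `^`, only Python files directly under the top-level `src/`, `app/`, and `tests/` trees are passed to ruff; a file such as `docs/src/example.py` is skipped.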
51 changes: 47 additions & 4 deletions Makefile
@@ -1,4 +1,47 @@
-ruff:
-	ruff check ./src ./app ./tests
-	ruff check --fix ./src ./app ./tests
-	ruff format ./src ./app ./tests
+# Makefile for Sentiment Analysis
+.PHONY: help install test lint format clean run
+
+help: ## Show available commands
+	@grep -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | sort | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-15s\033[0m %s\n", $$1, $$2}'
+
+install: ## Install all dependencies including audio support
+	uv sync --extra audio --extra dev
+
+install-audio: ## Install with audio dependencies only
+	uv sync --extra audio
+
+install-dev: ## Install development dependencies with audio support
+	uv sync --extra dev --extra audio
+
+test: ## Run tests with coverage
+	uv run python -m pytest tests/ --cov=src --cov=app --cov-report=term
+
+lint: ## Check and fix code quality
+	uv run ruff check --fix ./src ./app ./tests
+	uv run ruff format ./src ./app ./tests
+
+format: ## Format code only
+	uv run ruff format ./src ./app ./tests
+
+format-check: ## Check if code needs formatting (without changing files)
+	uv run ruff format --check ./src ./app ./tests
+
+run: ## Run the Dash application
+	uv run python app/voice_to_text_app.py
+
+pre-commit-install: ## Install pre-commit hooks
+	uv run pre-commit install
+
+pre-commit: ## Run pre-commit hooks on all files (excluding format)
+	SKIP=ruff-format uv run pre-commit run --all-files
+
+pre-commit-format: ## Run pre-commit including manual formatting stage
+	uv run pre-commit run --all-files --hook-stage manual
+
+pre-commit-full: ## Run all pre-commit hooks including formatting
+	uv run pre-commit run --all-files --hook-stage manual --hook-stage commit
+
+clean: ## Remove temporary files
+	find . -name "__pycache__" -type d -exec rm -rf {} +
+	find . -name "*.pyc" -delete
+	rm -rf .coverage htmlcov/ .pytest_cache/
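
Note on the `help` target: it relies on every documented target carrying a `## description` comment on its rule line. `grep` keeps those lines, `sort` orders them, and `awk` splits each one into a target name and a description. A rough Python equivalent is sketched below, assuming the same line convention. The function name `list_targets` and the shortened sample Makefile are hypothetical.

```python
import re

# Same convention as the Makefile: "target: [prerequisites] ## description".
HELP_LINE = re.compile(r"^([a-zA-Z_-]+):.*?## (.*)$")


def list_targets(makefile_text: str) -> list[tuple[str, str]]:
    """Return (target, description) pairs sorted by target name."""
    pairs = []
    for line in makefile_text.splitlines():
        match = HELP_LINE.match(line)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return sorted(pairs)


# Shortened, hypothetical excerpt of a Makefile using the convention.
sample = """\
.PHONY: help install test
help: ## Show available commands
install: ## Install all dependencies including audio support
test: ## Run tests with coverage
\tuv run python -m pytest tests/
"""

for target, description in list_targets(sample):
    print(f"{target:<15} {description}")
```

The `.PHONY` line and the tab-indented recipe line do not match the pattern, so only the three documented targets are listed.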
53 changes: 36 additions & 17 deletions README.md
@@ -3,7 +3,8 @@
 [![Deep Learning](https://img.shields.io/badge/Deep%20Learning-TensorFlow-orange)](https://www.tensorflow.org/)
 [![Keras](https://img.shields.io/badge/Keras-red)](https://keras.io/)
 [![TensorFlow](https://img.shields.io/badge/TensorFlow-2.0%2B-orange)](https://www.tensorflow.org/)
-[![Python](https://img.shields.io/badge/Python-3.8%2B-blue)](https://www.python.org/)
+[![Python](https://img.shields.io/badge/Python-3.11%2B-blue)](https://www.python.org/)
+[![uv](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/uv/main/assets/badge/v0.json)](https://github.com/astral-sh/uv)
 [![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

 # Sentiment Analysis and Translation
Expand Down Expand Up @@ -49,17 +50,21 @@ The sentiment analysis and translation models included in this repository are **
## Installation

### Prerequisites
- Python 3.8 or higher
- Poetry for dependency management
- Python 3.11 or higher
- uv for fast dependency management (10-100x faster than pip/poetry)

### Install Dependencies
1. Install Poetry:
1. Install uv:
```bash
pip install poetry
pip install uv
```
Or on macOS/Linux:
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
2. Install project dependencies:
```bash
poetry install
uv sync
```

### Download the Vosk Model
@@ -75,6 +80,26 @@ The sentiment analysis and translation models included in this repository are **
 └── ...
 ```

+### Quick Start with Makefile
+For easier development workflow, use the provided Makefile:
+
+```bash
+# Install all dependencies
+make install
+
+# Run tests
+make test
+
+# Check code quality
+make lint
+
+# Run the application
+make run
+
+# See all available commands
+make help
+```
+
 ---

 ## Required Datasets
@@ -109,7 +134,7 @@ Sentiment_Analysis/
 │ │ ├── sentiment_keras_binary.keras
 │ │ ├── transformer_best_model.keras
 │ │ ├── optuna_model_binary.json # Best binary classification model parameters from Optuna
-│ │ └── optuna_transformer_best_params.json # Best transformer model hyperparameters from Optuna
+│ │ └── optuna_transformer_best_params.json # Best transformer model hyperparameters from Optuna
 │ ├── configurations/ # Configuration files
 │ │ ├── model_builder_config.json
 │ │ ├── model_trainer_config.json
@@ -150,7 +175,7 @@ Sentiment_Analysis/
 ├── .gitignore # Git ignore file
 ├── LICENSE # License file
 ├── Makefile # Makefile for common tasks
-├── pyproject.toml # Poetry configuration file
+├── pyproject.toml # uv configuration file
 ├── README.md # Project documentation
 └── ruff.toml # Ruff configuration file
 ```
@@ -162,7 +187,7 @@ Sentiment_Analysis/
 ### Interactive Application
 1. **Run the Application**:
 ```bash
-poetry run python app/voice_to_text_app.py
+uv run python app/voice_to_text_app.py
 ```
 2. **Features**:
 - **Start Recording**: Begin recording your speech.
@@ -175,7 +200,7 @@ Sentiment_Analysis/
 ### Sentiment Analysis
 1. **Train or Load the Model**:
 ```bash
-poetry run python src/sentiment_analysis.py
+uv run python src/sentiment_analysis.py
 ```
 - If a saved model exists, it will be loaded.
 - Otherwise, a new model will be trained and saved in the `src/models/` folder.
@@ -190,7 +215,7 @@ Sentiment_Analysis/
 Place your English-French dataset in the `src/data/` folder.
 2. **Train or Load the Model**:
 ```bash
-poetry run python src/translation_french_english.py
+uv run python src/translation_french_english.py
 ```
 - If a saved model exists, it will be loaded.
 - Otherwise, a new model will be trained and saved in the `src/models/` folder.
@@ -215,9 +240,3 @@ Sentiment_Analysis/
 This project is licensed under the Apache License 2.0. See the [LICENSE](LICENSE) file for details.

 ---
-
-## About
-
-This repository is designed for researchers, developers, and enthusiasts interested in exploring advanced NLP techniques. It provides a practical implementation of speech-to-text, sentiment analysis, and translation pipelines, along with an interactive web application.
-
-For questions or feedback, feel free to open an issue or contact the repository maintainers.
8 changes: 4 additions & 4 deletions app/README.md
@@ -38,17 +38,17 @@ Before running the application, ensure you have:
 - Ensure the sentiment analysis inference model is available at the path defined in `ModelPaths.INFERENCE_MODEL.value`.

 5. **Dependencies**:
-- Install all project dependencies using Poetry:
+- Install all project dependencies using uv:
 ```bash
-poetry install
+uv sync
 ```

 ## How to Run

 From the project root directory:

 ```bash
-poetry run python app.py
+uv run python app/voice_to_text_app.py
 ```

 The application will start and be accessible at: [http://127.0.0.1:8050](http://127.0.0.1:8050)
@@ -100,4 +100,4 @@ If you encounter issues:
 ## Development Notes

 - The app runs in debug mode by default.
-- For production deployment, set `debug=False` in the `app.run_server()` method.
+- For production deployment, set `debug=False` in the `app.run_server()` method.
1 change: 1 addition & 0 deletions app/__init__.py
@@ -0,0 +1 @@
+# App package
22 changes: 14 additions & 8 deletions app/voice_to_text_app.py
@@ -1,13 +1,19 @@
-import dash
-from dash import html, dcc
-from dash.dependencies import Input, Output, State
 import base64
 import logging
-from modules.speech_to_text import SpeechToText
-from translation_french_english import test_translation, transformer_model
-from modules.data_processor import DatasetProcessor, TextPreprocessor
-from modules.utils import ModelPaths
-from typing import Tuple, Any
+from typing import Any, Tuple
+
+import dash
+from dash import dcc, html
+from dash.dependencies import Input, Output, State
+
+
+from src.modules.data_processor import DatasetProcessor, TextPreprocessor
+from src.modules.speech_to_text import SpeechToText
+from src.modules.utils import ModelPaths
+from src.translation_french_english import (
+    translation_test as test_translation,
+    transformer_model,
+)

 # Configure logging
 logging.basicConfig(