
Commit ccfc34d

Merge pull request #4 from cld2labs/dev/Audify

docs: update README benchmarks and rewrite CONTRIBUTING

2 parents ce34e6f + e91149c, commit ccfc34d

3 files changed: 317 additions & 23 deletions

File tree

- .github/workflows/code-scans.yaml
- CONTRIBUTING.md
- README.md

.github/workflows/code-scans.yaml

Lines changed: 1 addition & 1 deletion

```diff
@@ -7,7 +7,7 @@ on:
        description: 'Pull request number'
        required: true
   push:
-    branches: [ main ]
+    branches: [ main, dev/Audify ]
   pull_request:
     types: [opened, synchronize, reopened, ready_for_review]
```
CONTRIBUTING.md

Lines changed: 313 additions & 22 deletions
@@ -1,22 +1,313 @@

Removed (previous CONTRIBUTING.md):

```diff
-# Contributing to Audify
-
-Thank you for your interest in contributing to the
-**Audify** by Cloud2 Labs.
-
-## Scope of Contributions
-Appropriate contributions include:
-- Documentation improvements
-- Bug fixes
-- Reference architecture enhancements
-- Educational clarity and examples
-
-Major feature additions or architectural changes require prior discussion with
-the Cloud2 Labs maintainers.
-
-## Contribution Guidelines
-- Follow existing coding and documentation standards
-- Avoid production-specific assumptions
-- Do not introduce sensitive, proprietary, or regulated data
-
-By submitting a contribution, you agree that your work may be used, modified,
-and redistributed by Cloud2 Labs.
```

Added (new CONTRIBUTING.md):
# Contributing to Audify

Thanks for your interest in contributing to Audify.

Audify is an open-source AI application that turns documents into editable, two-speaker podcast-style scripts and downloadable audio using a FastAPI microservices backend and a React frontend. We welcome improvements across the codebase, documentation, bug reports, UX refinements, observability, and feature enhancements.

Before you start, read the relevant section below. It helps keep contributions focused, reviewable, and aligned with the current project setup.

---

## Quick Setup Checklist

Before you dive in, make sure you have these installed:

```bash
# Check Python (3.11+ recommended)
python --version

# Check Node.js (18+ recommended)
node --version

# Check npm
npm --version

# Check Docker
docker --version
docker compose version

# Check Git
git --version
```

New to contributing?

1. Open an issue or pick an existing one to work on.
2. Sync your branch from `dev/Audify`.
3. Follow the local setup guide below.
4. Run the app locally and verify your change before opening a PR.

## Table of Contents

- [How do I...?](#how-do-i)
  - [Get help or ask a question?](#get-help-or-ask-a-question)
  - [Report a bug?](#report-a-bug)
  - [Suggest a new feature?](#suggest-a-new-feature)
  - [Set up Audify locally?](#set-up-audify-locally)
  - [Start contributing code?](#start-contributing-code)
  - [Improve the documentation?](#improve-the-documentation)
  - [Submit a pull request?](#submit-a-pull-request)
- [Code guidelines](#code-guidelines)
- [Pull request checklist](#pull-request-checklist)
- [Branching model](#branching-model)
- [Thank you](#thank-you)

---

## How do I...

### Get help or ask a question?

- Start with the main project docs in [`README.md`](./README.md), [`docs/PROJECT_ARCHITECTURE.md`](./docs/PROJECT_ARCHITECTURE.md), and the service-level READMEs under [`api/`](./api).
- Review relevant config files such as [`simple_backend.py`](./simple_backend.py), [`api/llm-service/app/config.py`](./api/llm-service/app/config.py), and [`api/tts-service/app/config.py`](./api/tts-service/app/config.py).
- If something is still unclear, open a GitHub issue with your question and the context you already checked.

### Report a bug?

1. Search existing issues first.
2. If the bug is new, open a GitHub issue.
3. Include your environment, what happened, what you expected, and exact steps to reproduce.
4. Add screenshots, logs, request payloads, or response details if relevant.

### Suggest a new feature?

1. Open a GitHub issue describing the feature.
2. Explain the problem, who it helps, and how it fits Audify.
3. If the change is large, get alignment in the issue before writing code.

### Set up Audify locally?

#### Prerequisites

- Python 3.11+
- Node.js 18+ and npm
- Git
- Docker with Docker Compose v2
- One inference path for script generation:
  - Ollama or another OpenAI-compatible local inference endpoint, or
  - An OpenAI-compatible API endpoint for fallback or hosted inference
- OpenAI API key for TTS generation

#### Option 1: Local Development

##### Step 1: Clone the repository

```bash
git clone https://github.com/cld2labs/Audify.git
cd Audify
```

##### Step 2: Configure environment variables

Create the root `.env` file:

```bash
cp .env.example .env
```

If `.env.example` is not present in your branch, create `.env` manually using the values documented in [`README.md`](./README.md).

Create `api/llm-service/.env` with your inference settings. Example:

```env
SERVICE_PORT=8002
OPENAI_API_KEY=sk-...
OPENAI_BASE_URL=
INFERENCE_API_ENDPOINT=
INFERENCE_API_TOKEN=
INFERENCE_MODEL_NAME=gpt-4o-mini
VLLM_BASE_URL=http://localhost:11434/v1
VLLM_MODEL=Qwen/Qwen3-1.7B
DEFAULT_MODEL=gpt-4o-mini
DEFAULT_TONE=conversational
DEFAULT_MAX_LENGTH=2000
TEMPERATURE=0.7
MAX_TOKENS=2048
MAX_RETRIES=3
```
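Before starting the services, you can sanity-check an env file like the one above with a few lines of stdlib Python. This is a minimal illustrative parser, not how the services themselves load configuration (they typically use a dedicated settings library):

```python
from pathlib import Path

def parse_env(path: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and comments."""
    env: dict[str, str] = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

# Example: confirm at least one inference endpoint is configured
# settings = parse_env("api/llm-service/.env")
# assert settings.get("VLLM_BASE_URL") or settings.get("OPENAI_BASE_URL")
```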
Create `api/tts-service/.env` with your TTS settings. Example:

```env
SERVICE_PORT=8003
OPENAI_API_KEY=sk-...
TTS_MODEL=tts-1-hd
DEFAULT_HOST_VOICE=alloy
DEFAULT_GUEST_VOICE=nova
OUTPUT_DIR=static/audio
AUDIO_FORMAT=mp3
AUDIO_BITRATE=192k
SILENCE_DURATION_MS=500
MAX_CONCURRENT_REQUESTS=5
MAX_SCRIPT_LENGTH=100
```
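Inside a service, settings like these are usually materialized as a typed config object with defaults. A hypothetical sketch of that pattern (the real schema lives in `api/tts-service/app/config.py` and may differ):

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class TTSSettings:
    """Illustrative typed view of the TTS env vars above."""
    service_port: int = 8003
    tts_model: str = "tts-1-hd"
    audio_format: str = "mp3"
    max_concurrent_requests: int = 5

    @classmethod
    def from_env(cls) -> "TTSSettings":
        # Fall back to the documented defaults when a variable is unset
        return cls(
            service_port=int(os.getenv("SERVICE_PORT", "8003")),
            tts_model=os.getenv("TTS_MODEL", "tts-1-hd"),
            audio_format=os.getenv("AUDIO_FORMAT", "mp3"),
            max_concurrent_requests=int(os.getenv("MAX_CONCURRENT_REQUESTS", "5")),
        )
```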
##### Step 3: Install backend dependencies

```bash
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements.txt
pip install -r api/pdf-service/requirements.txt
pip install -r api/llm-service/requirements.txt
pip install -r api/tts-service/requirements.txt
```

##### Step 4: Install frontend dependencies

```bash
cd ui
npm install
cd ..
```

##### Step 5: Start the backend services

Open separate terminals and start:

```bash
# Terminal 1: gateway
source .venv/bin/activate
python simple_backend.py
```

```bash
# Terminal 2: PDF service
source .venv/bin/activate
cd api/pdf-service
uvicorn app.main:app --reload --host 0.0.0.0 --port 8001
```

```bash
# Terminal 3: LLM service
source .venv/bin/activate
cd api/llm-service
uvicorn app.main:app --reload --host 0.0.0.0 --port 8002
```

```bash
# Terminal 4: TTS service
source .venv/bin/activate
cd api/tts-service
uvicorn app.main:app --reload --host 0.0.0.0 --port 8003
```
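If juggling four terminals gets tedious, the same commands can be launched from one script. This is a hypothetical helper, not part of the repo; it assumes the layout and ports above and skips gracefully when the virtualenv or uvicorn is missing:

```shell
#!/usr/bin/env bash
# dev-up.sh -- start the gateway and the three microservices in the background
[ -f .venv/bin/activate ] && source .venv/bin/activate
command -v uvicorn >/dev/null 2>&1 || { echo "uvicorn not installed; skipping"; exit 0; }

python simple_backend.py &   # gateway on :8000

# name:port pairs for the pdf, llm, and tts services
for svc in pdf:8001 llm:8002 tts:8003; do
  name="${svc%%:*}"
  port="${svc##*:}"
  (cd "api/${name}-service" && uvicorn app.main:app --reload --host 0.0.0.0 --port "$port") &
done

wait  # Ctrl-C stops all background services
```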
##### Step 6: Start the frontend

Open another terminal:

```bash
cd ui
npm run dev
```

##### Step 7: Access the application

- Frontend: `http://localhost:5173` in local Vite development, or `http://localhost:3000` when using Docker
- Backend gateway health check: `http://localhost:8000/health`
- PDF service docs: `http://localhost:8001/docs`
- LLM service docs: `http://localhost:8002/docs`
- TTS service docs: `http://localhost:8003/docs`
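Once everything is up, a quick smoke test from the shell can confirm each service responds. The `/health` path is documented above only for the gateway; whether the other services expose the same path is an assumption here, so adjust to what their `/docs` pages show:

```shell
# Probe each default port; a failure prints "not responding" instead of aborting
for port in 8000 8001 8002 8003; do
  if curl -fsS "http://localhost:${port}/health" >/dev/null 2>&1; then
    echo "port ${port}: ok"
  else
    echo "port ${port}: not responding"
  fi
done
```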
#### Option 2: Docker

From the repository root:

```bash
# Create and configure the required env files first
docker compose up --build
```

This starts:

- Frontend on `http://localhost:3000`
- Backend gateway on `http://localhost:8000`
- PDF service on `http://localhost:8001`
- LLM service on `http://localhost:8002`
- TTS service on `http://localhost:8003`

#### Common Troubleshooting

- If ports `3000`, `8000`, `8001`, `8002`, or `8003` are already in use, stop the conflicting process before starting Audify.
- If script generation fails, confirm the LLM service `.env` points to a reachable model endpoint.
- If you use Ollama with Docker, make sure Ollama is running on the host and reachable from the container.
- If audio generation fails, verify `OPENAI_API_KEY` is set in `api/tts-service/.env`.
- If a Docker build fails, try a clean rebuild with `docker compose build --no-cache`, then run `docker compose up` again.
- If Python packages fail to install, confirm you are using a supported Python version (3.11+).
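To see which of the default ports are already taken before starting anything, a small stdlib Python check works on any platform (on macOS/Linux, `lsof -i :8000` additionally shows the owning process):

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        return s.connect_ex((host, port)) == 0

for port in (3000, 8000, 8001, 8002, 8003):
    print(f"port {port}: {'in use' if port_in_use(port) else 'free'}")
```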
### Start contributing code?

1. Open or choose an issue.
2. Create a feature branch from `dev/Audify`.
3. Keep the change focused on a single problem.
4. Run the app locally and verify the affected workflow.
5. Update docs when behavior, setup, configuration, or architecture changes.
6. Open a pull request from your feature branch into `dev/Audify`.
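The steps above, sketched as commands (the branch name and commit message are placeholders):

```shell
git checkout dev/Audify
git pull origin dev/Audify
git checkout -b fix/script-generation-timeout

# ...make and verify your change locally...

git add -p
git commit -m "fix: handle script generation timeout"
git push -u origin fix/script-generation-timeout
# then open the PR into dev/Audify on GitHub
```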
### Improve the documentation?

Documentation updates are welcome. Relevant files currently live in:

- [`README.md`](./README.md)
- [`docs/`](./docs/)
- [`api/pdf-service/README.md`](./api/pdf-service/README.md)
- [`api/llm-service/README.md`](./api/llm-service/README.md)
- [`api/tts-service/README.md`](./api/tts-service/README.md)
- [`benchmarks/README.md`](./benchmarks/README.md)

### Submit a pull request?

Follow the checklist below before opening your PR. Your pull request should:

- Stay focused on one issue or topic.
- Explain what changed and why.
- Include manual verification steps.
- Include screenshots or short recordings for UI changes.
- Reference the related GitHub issue when applicable.

Note: pull requests should target the `dev/Audify` branch unless maintainers ask otherwise.

---

## Code guidelines

- Follow the existing project structure and patterns before introducing new abstractions.
- Keep frontend changes consistent with the React + Vite + Tailwind setup already in use under [`ui/`](./ui/).
- Keep backend changes consistent with the FastAPI microservice structure under [`api/`](./api/) and the gateway in [`simple_backend.py`](./simple_backend.py).
- Avoid unrelated refactors in the same pull request.
- Do not commit secrets, API keys, local `.env` files, generated audio, or benchmark artifacts that do not belong in version control.
- Prefer clear, small commits and descriptive pull request summaries.
- Update documentation when contributor setup, behavior, environment variables, or service logic changes.
- If you change API contracts, verify both the service endpoint and the frontend consumer still match.

---

## Pull request checklist

Before submitting your pull request, confirm the following:

- You tested the affected flow locally.
- The application still starts successfully in the environment you changed.
- You removed debug code, stray logs, and commented-out experiments.
- You documented any new setup steps, environment variables, or behavior changes.
- You kept the pull request scoped to one issue or topic.
- You added screenshots for UI changes when relevant.
- You did not commit secrets, API keys, sample private documents, or generated media outputs by mistake.
- You are opening the pull request against `dev/Audify`.

If one or more of these are missing, the pull request may be sent back for changes before review.

---

## Branching model

- Base new work from `dev/Audify`.
- Open pull requests against `dev/Audify`.
- Use descriptive branch names such as `fix/script-generation-timeout` or `docs/update-contributing-guide`.
- Rebase or merge the latest `dev/Audify` before opening your PR if your branch has drifted.
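Syncing a drifted branch might look like this (a sketch; `--force-with-lease` is only needed after rebasing commits you had already pushed):

```shell
git fetch origin
git rebase origin/dev/Audify   # or: git merge origin/dev/Audify
# resolve any conflicts, re-run your local verification, then:
git push --force-with-lease
```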
---

## Thank you

Thanks for contributing to Audify. Whether you are fixing a bug, improving the docs, refining the UI, strengthening the service architecture, or making the generation workflow more reliable, your work helps make the project more useful and easier to maintain.

README.md

Lines changed: 3 additions & 0 deletions

```diff
@@ -395,14 +395,17 @@ The table below compares inference performance across different providers, deplo
 | Provider | Model | Deployment | Context Window | Avg Input Tokens | Avg Output Tokens | Avg Tokens / Request | P50 Latency (ms) | P95 Latency (ms) | Throughput (req/s) | Hardware |
 |----------|-------|------------|----------------|------------------|-------------------|----------------------|------------------|------------------|--------------------|----------|
 | vLLM | `Qwen/Qwen3:1.7b` | Local | 4,096 | 1,183 | 1,308 | 2,492 | 58,855 | 59,773 | 0.0162 | Apple Silicon Metal (Macbook Pro M4) |
+| OPEA EI / SLM | `Qwen/Qwen3:1.7b` | Local | 8.1K | 1,075 | 350 | 1,425 | 10,446 | 23,445 | 0.0853 | Xeon CPU (CPU only) |
 | OpenAI (Cloud) | `gpt-4o-mini` | API (Cloud) | 128K | 1,625 | 680 | 2,330 | 19,848 | 20,733 | 0.0276 | Cloud GPUs |

 > **Notes:**
 >
 > - Context Window for vLLM (4,096) reflects the `LLM_MAX_TOKENS` / `--max-model-len` used during benchmarking, not the model's native maximum context. vLLM shares its configured context between input and output tokens.
+> - EI is configured with an 8,192-token context window for this benchmark run.
 > - All benchmarks use the same Audify script-generation prompt and identical inputs across 3 runs.
 > - Token counts may vary slightly per run due to non-deterministic model output.
 > - vLLM on Apple Silicon requires [vllm-metal](https://github.com/vllm-project/vllm-metal); the standard `pip install vllm` package does not provide macOS Metal support.
+> - [Intel OPEA EI](https://github.com/opea-project/Enterprise-Inference) runs on Intel Xeon CPUs without GPU acceleration.

 ---
```
