AI Agent Instructions for GlobalContentProvider

For GitHub Copilot Coding Agent and AI Assistants

See .github/copilot-instructions.md for comprehensive technical details.

🎯 Quick Start

What This Project Does

An AI-driven video localization platform that translates videos into multiple languages using automated speech recognition (WhisperX), translation (Dify), voice cloning (OpenVoice/MeloTTS), and audio separation and mixing (Demucs, FFmpeg).

Tech Stack

  • Language: Python 3.11.x (3.12+ NOT supported)
  • Framework: PyQt6 GUI, FastAPI (API), asyncio
  • AI/ML: PyTorch, WhisperX, Demucs, OpenVoice, MeloTTS
  • Testing: pytest, pytest-asyncio, pytest-cov
  • Code Quality: black, isort, mypy, flake8

⚠️ CRITICAL RULES

🚫 File Edit Restrictions

Edit ONLY these directories without asking:

  • src/Pipeline/pipelines/ - Pipeline implementations
  • src/Pipeline/steps/ - Pipeline step implementations
  • src/services/ - AI service implementations (see guidelines below)

For ANY other files, ask for permission first.

Guidelines for editing src/services/:

  • ✅ Can add new services
  • ✅ Can modify services, but minimize changes to existing code
  • ⚠️ Must ensure proper use of GPU scheduler (PyTorchTask pattern)
  • ⚠️ Prefer adding new services over modifying existing ones

🔒 Never Do This

  • Don't use Python 3.12+ (dependencies incompatible)
  • Don't modify core architecture (Signal Bus, GPU Scheduler, Task Model) without approval
  • Don't commit secrets or .env files
  • Don't remove existing tests
  • Don't introduce security vulnerabilities

🛠️ Development Essentials

Run Tests (ALWAYS use venv!)

# Windows - Activate venv FIRST
.\venv\Scripts\Activate.ps1

# Then run tests
.\venv\Scripts\python.exe -m pytest tests/unit -v --cov=src

Format & Lint

# Format
black src/ tests/
isort src/ tests/

# Type check
mypy src/

# Lint
flake8 src/ tests/

Project Structure

src/
├── Pipeline/         # ⭐ You can edit: pipelines/ and steps/
├── GpuScheduler/     # 🔒 Core: GPU task scheduling
├── core/signal_bus/  # 🔒 Core: Event system
├── Gui/              # 🔒 PyQt6 interface
├── services/         # ⭐ You can edit: Add/modify services (use GPU scheduler properly)
├── models/           # 🔒 Data models (Task, Enums)
└── utils/            # 🔒 Utilities

💡 Common Tasks

Adding a New Pipeline Step

  1. Copy template: src/Pipeline/steps/_step_template.py
  2. Implement execute() method
  3. Set step_id, step_name, depends_on
  4. Use self.report_progress() for updates
  5. Store results: task.set_step_data(self.step_id, result)
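The steps above can be sketched as follows. This is a minimal, self-contained illustration, not the project's template: `Task` and `StepResult` are stand-ins written for this example, the `execute(config, context, task)` argument names are guesses based on the call shape in the test pattern, and the `report_progress(percent, message)` signature is assumed.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Task:
    """Stand-in for the Task model in src/models/task.py (hypothetical shape)."""
    _step_data: dict[str, Any] = field(default_factory=dict)

    def set_step_data(self, step_id: str, data: Any) -> None:
        self._step_data[step_id] = data

    def get_step_data(self, step_id: str) -> Any:
        return self._step_data.get(step_id)


@dataclass
class StepResult:
    """Hypothetical result type; the real template may return something else."""
    success: bool
    data: Any = None


class WordCountStep:
    """Illustrative step; a real one would start from _step_template.py."""
    step_id = "word_count"
    step_name = "Word Count"
    depends_on = ["transcribe"]  # assumed id of an upstream step

    def __init__(self) -> None:
        self.progress_log: list[tuple[int, str]] = []

    def report_progress(self, percent: int, message: str = "") -> None:
        # Stand-in for the base class's progress hook (signature assumed).
        self.progress_log.append((percent, message))

    async def execute(self, config: dict, context: dict, task: Task) -> StepResult:
        # Read the upstream step's output from the shared task container.
        transcript = task.get_step_data("transcribe") or ""
        self.report_progress(0, "counting words")
        result = {"words": len(transcript.split())}
        self.report_progress(100, "done")
        # Store the result under this step's own id for downstream steps.
        task.set_step_data(self.step_id, result)
        return StepResult(success=True, data=result)
```

In the real project the base class from the template would supply `report_progress()`; only `execute()` and the three class attributes should need to be written by hand.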

Adding a New Pipeline

  1. Copy template: src/Pipeline/pipelines/_pipeline_template.py
  2. Implement register_steps() to return step list
  3. Steps execute in order automatically
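As a rough sketch of what "steps execute in order" could mean in practice: the `Step` and `DemoPipeline` classes and the `run()` loop below are hypothetical stand-ins, since the real base class in `_pipeline_template.py` handles execution itself. Only `register_steps()` and `depends_on` come from the instructions above.

```python
import asyncio


class Step:
    """Hypothetical minimal step: an id plus the ids it depends on."""

    def __init__(self, step_id, depends_on=()):
        self.step_id = step_id
        self.depends_on = list(depends_on)

    async def execute(self, log):
        # A real step would do work here; this one only records that it ran.
        log.append(self.step_id)


class DemoPipeline:
    """Illustrative pipeline; step ids are made up for the example."""

    def register_steps(self):
        # The returned list order is the execution order.
        return [
            Step("extract_audio"),
            Step("transcribe", depends_on=["extract_audio"]),
            Step("translate", depends_on=["transcribe"]),
        ]

    async def run(self):
        # Stand-in for the framework's runner: execute sequentially and
        # refuse to run a step whose dependencies have not completed.
        log, done = [], set()
        for step in self.register_steps():
            missing = [dep for dep in step.depends_on if dep not in done]
            if missing:
                raise RuntimeError(f"{step.step_id} is missing {missing}")
            await step.execute(log)
            done.add(step.step_id)
        return log
```

Listing a step before one of its dependencies would raise in this sketch, which is a cheap way to catch ordering mistakes early.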

Adding or Modifying Services

Prefer adding new services over modifying existing ones.

When adding a new service in src/services/:

  1. Check existing services for similar patterns
  2. Use GPU scheduler (PyTorchTask) for GPU-intensive operations
  3. Follow async patterns (async/await)
  4. Add proper error handling and logging
  5. Write unit tests in tests/unit/services/
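A minimal sketch of the async, error-handling, and logging points in the list above. Everything here is hypothetical: `EchoTranslationService` and `ServiceError` are invented for illustration, and `asyncio.to_thread` stands in for the place where a real service would submit a `PyTorchTask` and await `task.wait()`.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


class ServiceError(RuntimeError):
    """Hypothetical error type raised to callers of the service."""


class EchoTranslationService:
    """Illustrative service; it tags text instead of translating it."""

    async def translate(self, text: str, target_lang: str) -> str:
        # Validate input before any expensive work is scheduled.
        if not text.strip():
            raise ServiceError("empty input text")
        try:
            # Placeholder for GPU work: a real service would go through the
            # GPU scheduler here rather than a plain worker thread.
            return await asyncio.to_thread(self._run_blocking, text, target_lang)
        except Exception as exc:
            # Log with traceback, then raise one consistent error type.
            logger.exception("translation failed for lang=%s", target_lang)
            raise ServiceError("translation failed") from exc

    def _run_blocking(self, text: str, target_lang: str) -> str:
        return f"[{target_lang}] {text}"
```

Wrapping low-level failures in a single error type keeps pipeline steps from depending on the internals of whichever model backs the service.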

When modifying existing services:

  • ⚠️ Minimize changes to existing code
  • ✅ Ensure GPU scheduler usage remains correct
  • ✅ Maintain backward compatibility
  • ✅ Update tests if behavior changes

Using GPU Scheduler

from src.GpuScheduler import PyTorchTask, FFmpegTask

# For PyTorch models (ASR, TTS, RIFE, Wav2Lip)
class MyTask(PyTorchTask):
    def __init__(self, data):
        super().__init__(
            name="my_task",
            model_loader=self._load_model,
            inference_fn=self._inference,
            input_data=data,
            model_key="my_model",  # Enables model sharing
        )
    
    def _load_model(self, device):
        return load_model().to(device)
    
    def _inference(self, model, data, device):
        return model.process(data)

# Usage
task = MyTask(data)
result = await task.wait()

# For FFmpeg NVENC encoding ⭐NEW
task = FFmpegTask(
    name="encode_video",
    command=["ffmpeg", "-i", "input.mp4", "-c:v", "h264_nvenc", "output.mp4"],
    resource_spec={'nvenc_sessions': 1}
)
result = await task.wait()

📚 Key Concepts

Signal Bus (Event System)

  • Location: src/core/signal_bus/
  • Pattern: Pub/Sub with typed signals
  • Usage: SignalBus.emit(StepCompletedSignal(...)) → Controllers listen → UI updates
  • Rule: Each signal emitted from ONE source only (no re-emission)
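The pub/sub pattern with typed signals can be illustrated with a small stand-alone sketch. This is not the project's implementation: the `subscribe()` method, the instance-based bus, and the fields of `StepCompletedSignal` are assumptions made for the example (the usage line above suggests the real `SignalBus.emit` is called on the class itself).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StepCompletedSignal:
    """Hypothetical payload; the real signal's fields may differ."""
    step_id: str
    success: bool


class SignalBus:
    """Minimal typed pub/sub: handlers are keyed by signal class."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, signal_type, handler):
        self._handlers[signal_type].append(handler)

    def emit(self, signal):
        # Dispatch only to handlers registered for this exact signal type.
        for handler in self._handlers[type(signal)]:
            handler(signal)


bus = SignalBus()
received = []
bus.subscribe(StepCompletedSignal, received.append)  # a controller would listen here
bus.emit(StepCompletedSignal(step_id="transcribe", success=True))
```

Handlers in this sketch never call `emit()` themselves, which mirrors the single-source rule: a listener reacts to a signal but does not re-emit it.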

Task Model

  • Location: src/models/task.py
  • Purpose: Central data container for pipeline execution
  • Access: task.get_step_data("step_id") / task.set_step_data("step_id", data)

GPU Scheduler Task Types ⭐NEW

  • PyTorchTask: Deep learning models (ASR, TTS, RIFE, Wav2Lip)
  • FFmpegTask: Hardware encoding with NVENC (video processing)
  • OnnxTask: ONNX runtime models
  • TensorFlowTask: TensorFlow models
  • Key Feature: NVENC can run in parallel with CUDA tasks (independent hardware)
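One way to take advantage of that parallelism from async code is to await both tasks together rather than one after the other. The sketch below uses a hypothetical `StandInTask` that only mimics the awaitable `wait()` method shown earlier; whether the real scheduler tasks start on construction or on `wait()` is not stated here, so treat the timing behaviour as an assumption.

```python
import asyncio


class StandInTask:
    """Hypothetical stand-in exposing an awaitable wait() like the scheduler tasks."""

    def __init__(self, name, seconds, log):
        self.name = name
        self.seconds = seconds
        self.log = log

    async def wait(self):
        self.log.append(f"start:{self.name}")
        await asyncio.sleep(self.seconds)  # simulated CUDA or NVENC work
        self.log.append(f"end:{self.name}")
        return self.name


async def run_concurrently(log):
    inference = StandInTask("tts_inference", 0.02, log)
    encode = StandInTask("encode_video", 0.01, log)
    # gather() awaits both at once; results come back in argument order.
    return await asyncio.gather(inference.wait(), encode.wait())
```

With real tasks, the same `asyncio.gather(...)` call would let an NVENC encode overlap with a CUDA inference job instead of queuing behind it.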

Interactive Pipeline

  • Location: src/Pipeline/core/interactive_pipeline.py
  • Features: Steps can pause for user confirmation/retry/skip
  • GUI: PyQt6 interface shows progress, allows interaction
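The retry/skip behaviour can be pictured with a small control loop. This is only a sketch of the idea, not the code in `interactive_pipeline.py`: the `Decision` enum, `run_with_interaction()`, and the `ask_user` callback are all hypothetical names, and in the real application the decision would come from the PyQt6 GUI.

```python
import asyncio
from enum import Enum


class Decision(Enum):
    RETRY = "retry"
    SKIP = "skip"
    ABORT = "abort"


async def run_with_interaction(step_fn, ask_user, max_attempts=3):
    """Run step_fn; on failure, pause and let ask_user choose what happens next."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await step_fn()
        except Exception as exc:
            # Pause here until the user (or a test double) decides.
            decision = await ask_user(exc, attempt)
            if decision is Decision.SKIP:
                return None  # caller treats None as "step skipped"
            if decision is Decision.ABORT:
                raise
            # Decision.RETRY falls through to the next loop iteration.
    raise RuntimeError(f"step still failing after {max_attempts} attempts")
```

Because `ask_user` is an awaitable callback, a test can supply a canned decision while the GUI supplies a dialog.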

Available Pipelines

  1. Video Translation Pipeline (video_translation_pipeline.py) 🌐

    • Standard translation using the Dify cloud translation service
    • Audio time-stretching for sync
  2. Local Video Translation Pipeline (local_video_translation_pipeline.py) 🎬

    • Uses the NLLB-200 local translation model
    • Preserves background music
    • Precise audio-video sync
  3. Mushroom-Level AV Sync Pipeline (advanced_av_sync_pipeline.py) 🍄 ⭐NEW

    • Advanced AV sync with RIFE frame interpolation
    • Wav2Lip lip sync for precise mouth movements
    • Time remapping + semantic frame insertion
🧪 Testing Guidelines

Required

  • Write tests for new pipeline steps in tests/unit/
  • Mark async tests: @pytest.mark.asyncio
  • Use fixtures for setup/teardown
  • Target 80%+ coverage

Test Patterns

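import pytest

# Import path assumed from the Task model location (src/models/task.py).
from src.models.task import Task
# MyStep is a placeholder: import your step class from src/Pipeline/steps/.

@pytest.mark.asyncio
async def test_my_step():
    step = MyStep()
    task = Task(...)  # pass the constructor arguments your test needs
    result = await step.execute({}, {}, task)
    assert result.success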

🔐 Security Checklist

  • Validate all user inputs
  • Sanitize file paths (pathlib.Path)
  • No secrets in code (use .env)
  • HTTPS for external APIs
  • Handle exceptions gracefully
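For the path-sanitizing item, one common approach with `pathlib.Path` is to resolve the user-supplied path against a base directory and reject anything that ends up outside it. The helper name `resolve_inside` and the example directory are hypothetical; the project may already have its own utility for this.

```python
from pathlib import Path


def resolve_inside(base_dir, user_path):
    """Return user_path resolved under base_dir, or raise if it escapes."""
    base = Path(base_dir).resolve()
    # Joining with an absolute user_path discards base, so the containment
    # check below also rejects absolute paths outside the base directory.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes base directory: {user_path}")
    return candidate
```

A call such as `resolve_inside("/srv/media", "../secret.txt")` raises `ValueError`, while `resolve_inside("/srv/media", "videos/input.mp4")` returns the resolved path inside the base directory.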

🌐 Language Support

The repository has bilingual docs (Chinese/English). Update both versions whenever you change documentation.

📖 Full Documentation


Questions? Check the comprehensive instructions in .github/copilot-instructions.md or ask the user.