For GitHub Copilot Coding Agent and AI Assistants
See `.github/copilot-instructions.md` for comprehensive technical details.
AI-driven video localization platform: Translate videos into multiple languages with automated speech recognition (WhisperX), translation (Dify), voice cloning (OpenVoice/MeloTTS), and audio mixing (Demucs, FFmpeg).
- Language: Python 3.11.x (3.12+ NOT supported)
- Framework: PyQt6 GUI, FastAPI (API), asyncio
- AI/ML: PyTorch, WhisperX, Demucs, OpenVoice, MeloTTS
- Testing: pytest, pytest-asyncio, pytest-cov
- Code Quality: black, isort, mypy, flake8
ONLY edit these directories without asking:
- `src/Pipeline/pipelines/` - Pipeline implementations
- `src/Pipeline/steps/` - Pipeline step implementations
- `src/services/` - AI service implementations (see guidelines below)
For ANY other files, ask for permission first.
Guidelines for editing src/services/:
- ✅ Can add new services
- ✅ Can modify services, but minimize changes to existing code
- ⚠️ Must ensure proper use of the GPU scheduler (PyTorchTask pattern)
- ⚠️ Prefer adding new services over modifying existing ones
- Don't use Python 3.12+ (dependencies incompatible)
- Don't modify core architecture (Signal Bus, GPU Scheduler, Task Model) without approval
- Don't commit secrets or `.env` files
- Don't remove existing tests
- Don't introduce security vulnerabilities
```powershell
# Windows - Activate venv FIRST
.\venv\Scripts\Activate.ps1

# Then run tests
.\venv\Scripts\python.exe -m pytest tests/unit -v --cov=src

# Format
black src/ tests/
isort src/ tests/

# Type check
mypy src/

# Lint
flake8 src/ tests/
```

```
src/
├── Pipeline/          # ⭐ You can edit: pipelines/ and steps/
├── GpuScheduler/      # 🔒 Core: GPU task scheduling
├── core/signal_bus/   # 🔒 Core: Event system
├── Gui/               # 🔒 PyQt6 interface
├── services/          # ⭐ You can edit: Add/modify services (use GPU scheduler properly)
├── models/            # 🔒 Data models (Task, Enums)
└── utils/             # 🔒 Utilities
```
- Copy template: `src/Pipeline/steps/_step_template.py`
- Implement the `execute()` method
- Set `step_id`, `step_name`, `depends_on`
- Use `self.report_progress()` for updates
- Store results: `task.set_step_data(self.step_id, result)`
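The checklist above can be sketched as a minimal step. The class and method shapes here are assumptions based on the conventions listed, not the actual contents of `_step_template.py`, and `FakeTask` is a stand-in for the project's Task model:

```python
import asyncio

class FakeTask:
    """Hypothetical stand-in for the project's Task data container."""
    def __init__(self):
        self._step_data = {}

    def set_step_data(self, step_id, data):
        self._step_data[step_id] = data

    def get_step_data(self, step_id):
        return self._step_data[step_id]

class MyStep:
    step_id = "my_step"
    step_name = "My Step"
    depends_on = ["previous_step"]

    def report_progress(self, percent, message=""):
        # The real base class provides this; stubbed here for illustration.
        print(f"[{self.step_id}] {percent}% {message}")

    async def execute(self, config, context, task):
        self.report_progress(10, "starting")
        result = {"ok": True}
        task.set_step_data(self.step_id, result)  # store results on the task
        self.report_progress(100, "done")
        return result

task = FakeTask()
result = asyncio.run(MyStep().execute({}, {}, task))
```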
- Copy template: `src/Pipeline/pipelines/_pipeline_template.py`
- Implement `register_steps()` to return the step list
- Steps execute in order automatically
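`register_steps()` might look roughly like this. This is a sketch with placeholder step classes; the real template may differ:

```python
class StepA:
    step_id = "a"

class StepB:
    step_id = "b"

class MyPipeline:
    def register_steps(self):
        # Steps run sequentially in the order returned here.
        return [StepA(), StepB()]

order = [s.step_id for s in MyPipeline().register_steps()]
```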
Prefer adding new services over modifying existing ones.
When adding a new service in src/services/:
- Check existing services for similar patterns
- Use GPU scheduler (PyTorchTask) for GPU-intensive operations
- Follow async patterns (`async`/`await`)
- Add proper error handling and logging
- Write unit tests in `tests/unit/services/`
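A minimal new-service skeleton following these guidelines might look like this. Class and method names are illustrative, not the project's actual API, and the body simulates work rather than calling a real model:

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

class UppercaseService:
    """Hypothetical service skeleton: async entry point, error handling, logging."""

    async def run(self, text: str) -> str:
        try:
            # A real service would await model inference or an HTTP client here.
            await asyncio.sleep(0)
            return text.upper()
        except Exception:
            logger.exception("UppercaseService failed")
            raise

result = asyncio.run(UppercaseService().run("hello"))
```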
When modifying existing services:
- ⚠️ Minimize changes to existing code
- ✅ Ensure GPU scheduler usage remains correct
- ✅ Maintain backward compatibility
- ✅ Update tests if behavior changes
```python
from src.GpuScheduler import PyTorchTask, FFmpegTask

# For PyTorch models (ASR, TTS, RIFE, Wav2Lip)
class MyTask(PyTorchTask):
    def __init__(self, data):
        super().__init__(
            name="my_task",
            model_loader=self._load_model,
            inference_fn=self._inference,
            input_data=data,
            model_key="my_model",  # Enables model sharing
        )

    def _load_model(self, device):
        return load_model().to(device)

    def _inference(self, model, data, device):
        return model.process(data)

# Usage
task = MyTask(data)
result = await task.wait()

# For FFmpeg NVENC encoding ⭐NEW
task = FFmpegTask(
    name="encode_video",
    command=["ffmpeg", "-i", "input.mp4", "-c:v", "h264_nvenc", "output.mp4"],
    resource_spec={"nvenc_sessions": 1},
)
result = await task.wait()
```

- Location: `src/core/signal_bus/`
- Pattern: Pub/Sub with typed signals
- Usage: `SignalBus.emit(StepCompletedSignal(...))` → Controllers listen → UI updates
- Rule: Each signal is emitted from ONE source only (no re-emission)
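The pub/sub-with-typed-signals pattern can be illustrated with a minimal, self-contained sketch. This is not the project's actual `SignalBus` API, only the general shape: handlers subscribe by signal type, and `emit` dispatches to them:

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class StepCompleted:
    """Toy typed signal (illustrative only)."""
    step_id: str

class MiniSignalBus:
    """Toy pub/sub bus keyed by signal type (illustrative only)."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, signal_type, handler):
        self._subscribers[signal_type].append(handler)

    def emit(self, signal):
        # Dispatch to every handler registered for this signal's type.
        for handler in self._subscribers[type(signal)]:
            handler(signal)

bus = MiniSignalBus()
seen = []
bus.subscribe(StepCompleted, lambda s: seen.append(s.step_id))
bus.emit(StepCompleted(step_id="asr"))
```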
- Location: `src/models/task.py`
- Purpose: Central data container for pipeline execution
- Access: `task.get_step_data("step_id")` / `task.set_step_data("step_id", data)`
- PyTorchTask: Deep learning models (ASR, TTS, RIFE, Wav2Lip)
- FFmpegTask: Hardware encoding with NVENC (video processing)
- OnnxTask: ONNX runtime models
- TensorFlowTask: TensorFlow models
- Key Feature: NVENC can run parallel with CUDA tasks (independent hardware)
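Because NVENC and CUDA are independent hardware units, an encode task and a model task can be awaited concurrently. The general asyncio pattern looks like this, with placeholder coroutines standing in for the project's `task.wait()` calls:

```python
import asyncio

async def cuda_inference():
    await asyncio.sleep(0.01)  # stands in for a PyTorchTask's wait()
    return "transcript"

async def nvenc_encode():
    await asyncio.sleep(0.01)  # stands in for an FFmpegTask's wait()
    return "output.mp4"

async def main():
    # Both awaitables make progress concurrently instead of serially.
    return await asyncio.gather(cuda_inference(), nvenc_encode())

results = asyncio.run(main())
```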
- Location: `src/Pipeline/core/interactive_pipeline.py`
- Features: Steps can pause for user confirmation/retry/skip
- GUI: PyQt6 interface shows progress, allows interaction
- Video Translation Pipeline (`video_translation_pipeline.py`) 🌐
  - Standard translation with Dify cloud translation
  - Audio time-stretching for sync
- Local Video Translation Pipeline (`local_video_translation_pipeline.py`) 🎬
  - Uses the NLLB-200 local translation model
  - Preserves background music
  - Precise audio-video sync
- Mushroom-grade AV Sync Pipeline (`advanced_av_sync_pipeline.py`) 🍄 ⭐NEW
  - Advanced AV sync with RIFE frame interpolation
  - Wav2Lip lip sync for precise mouth movements
  - Time remapping + semantic frame insertion
- Write tests for new pipeline steps in `tests/unit/`
- Mark async tests: `@pytest.mark.asyncio`
- Use fixtures for setup/teardown
- Target 80%+ coverage
```python
@pytest.mark.asyncio
async def test_my_step():
    step = MyStep()
    task = Task(...)
    result = await step.execute({}, {}, task)
    assert result.success
```

- Validate all user inputs
- Sanitize file paths (`pathlib.Path`)
- No secrets in code (use `.env`)
- HTTPS for external APIs
- Handle exceptions gracefully
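Path sanitization with `pathlib` can look like the helper below. This is a hypothetical sketch, not an existing project function: it resolves a user-supplied path and rejects anything that escapes the base directory:

```python
from pathlib import Path

def resolve_within(base_dir: str, user_path: str) -> Path:
    """Hypothetical helper: resolve a user-supplied path and reject anything
    that escapes base_dir (e.g. via '..' components)."""
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):  # Path.is_relative_to: Python 3.9+
        raise ValueError(f"path escapes {base}: {user_path}")
    return candidate
```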
Repository has bilingual docs (Chinese/English). Maintain both when updating documentation.
- Architecture deep-dive: `.github/copilot-instructions.md`
- Pipeline guide: `README_PIPELINE.md`
- Main README: `README.md`
Questions? Check the comprehensive instructions in `.github/copilot-instructions.md` or ask the user.