AVSync is a powerful Python tool that automatically synchronizes foreign audio tracks to reference videos using advanced visual anchor detection and precise audio timing algorithms. Perfect for dubbing, multilingual content creation, and audio replacement workflows.
- Visual Anchor Detection: Uses scene change detection and template matching to find corresponding frames between videos
- Precise Audio Timing: Iterative audio processing with millisecond-level precision
- Multi-language Support: Automatic audio stream detection by language codes
- Quality Control: Generate side-by-side comparison images and detailed CSV reports
- Parallel Processing: Multi-threaded frame matching for faster processing
- Flexible Configuration: Extensive customization options for different content types
- Progress Tracking: Beautiful colored console output with progress bars
- Image Pairing Stage: Extracts scene change frames and matches them between reference and foreign videos
- Audio Synchronization Stage: Processes audio segments iteratively to match reference timing precisely
- Muxing Stage: Combines reference video, original audio, and synchronized foreign audio into final output
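As a rough illustration of how the pairing and synchronization stages could connect, the sketch below turns a list of matched anchor timestamps into per-segment tempo ratios. It is a minimal sketch under assumptions, not AVSync's actual code: the `Anchor` type, the `segment_ratios` function, and the idea of one constant tempo factor per segment are all hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Anchor(NamedTuple):
    """A matched frame pair (hypothetical type); timestamps in seconds."""
    ref_time: float
    foreign_time: float


def segment_ratios(anchors: List[Anchor]) -> List[Tuple[float, float, float]]:
    """Return (foreign_start, foreign_end, tempo) per pair of consecutive anchors.

    tempo > 1 means the foreign segment must be sped up to fit the
    reference duration; tempo < 1 means it must be slowed down.
    """
    ordered = sorted(anchors, key=lambda a: a.ref_time)
    segments = []
    for prev, cur in zip(ordered, ordered[1:]):
        ref_len = cur.ref_time - prev.ref_time
        foreign_len = cur.foreign_time - prev.foreign_time
        if ref_len <= 0 or foreign_len <= 0:
            continue  # skip degenerate or out-of-order pairs
        segments.append((prev.foreign_time, cur.foreign_time, foreign_len / ref_len))
    return segments


anchors = [Anchor(0.0, 0.0), Anchor(60.0, 63.0), Anchor(120.0, 123.0)]
for start, end, tempo in segment_ratios(anchors):
    print(f"foreign {start:.1f}-{end:.1f}s -> tempo x{tempo:.3f}")
```

In this toy input the foreign video has three extra seconds before the second anchor, so only the first segment needs a tempo change; the second already matches the reference duration.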
- FFmpeg and FFprobe (must be in system PATH)
- Python 3.7 or higher
pip install opencv-python scipy numpy tqdm
pip install Pillow imagehash # For similarity filtering
- Clone the repository:
git clone https://github.com/stinkybread/avsync.git
cd avsync
- Install Python dependencies:
pip install -r requirements.txt
- Install FFmpeg:
  - Windows: Download from FFmpeg.org or use winget install FFmpeg
  - macOS: brew install ffmpeg
  - Linux: sudo apt install ffmpeg (Ubuntu/Debian) or equivalent
- Verify installation:
python AVSync.py --help

python AVSync.py reference_video.mkv foreign_video.mkv output_video.mkv
Specify language codes:
python AVSync.py ref.mkv foreign.mkv output.mkv --ref_lang eng --foreign_lang spa
Use specific audio stream indices:
python AVSync.py ref.mkv foreign.mkv output.mkv --ref_stream_idx 1 --foreign_stream_idx 2
Generate QC images and CSV report:
python AVSync.py ref.mkv foreign.mkv output.mkv \
--qc_output_dir ./qc_images \
--output_csv segments.csv
Keep synchronized audio file:
python AVSync.py ref.mkv foreign.mkv output.mkv --output_audio synced_audio.wav
Fine-tune processing parameters:
python AVSync.py ref.mkv foreign.mkv output.mkv \
--scene_threshold 0.3 \
--match_threshold 0.8 \
--min_segment_duration 10 \
--db_threshold -35

- --scene_threshold: Scene change detection sensitivity (0.0-1.0, default: 0.25)
- --match_threshold: Template matching threshold (0.0-1.0, default: 0.7)
- --similarity_threshold: Perceptual hash difference threshold (default: 4, -1 to disable)

- --ref_lang/--foreign_lang: Language codes for audio stream selection
- --db_threshold: Audio detection threshold in dBFS (default: -40.0)
- --min_segment_duration: Minimum segment duration in seconds (default: 5.0)
- --ref_stream_idx/--foreign_stream_idx: Force specific audio stream indices

- --output_audio: Save synchronized audio as WAV file
- --output_csv: Export segment timing information
- --qc_output_dir: Generate quality control images
- --mux_foreign_codec: Audio codec for foreign track (default: aac)
- --mux_foreign_bitrate: Bitrate for foreign track (default: 192k)
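The options listed above could be declared with `argparse` roughly as follows. This is a reconstruction from the documented flags and defaults only, not the parser in `AVSync.py`; the positional argument names (`reference`, `foreign`, `output`), the `build_parser` function, and the argument types are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="AVSync.py")
    # Positional names are hypothetical; the README only shows their order.
    parser.add_argument("reference")
    parser.add_argument("foreign")
    parser.add_argument("output")
    # Frame matching options
    parser.add_argument("--scene_threshold", type=float, default=0.25)
    parser.add_argument("--match_threshold", type=float, default=0.7)
    parser.add_argument("--similarity_threshold", type=int, default=4)
    # Audio options
    parser.add_argument("--ref_lang")
    parser.add_argument("--foreign_lang")
    parser.add_argument("--db_threshold", type=float, default=-40.0)
    parser.add_argument("--min_segment_duration", type=float, default=5.0)
    parser.add_argument("--ref_stream_idx", type=int)
    parser.add_argument("--foreign_stream_idx", type=int)
    # Output options
    parser.add_argument("--output_audio")
    parser.add_argument("--output_csv")
    parser.add_argument("--qc_output_dir")
    parser.add_argument("--mux_foreign_codec", default="aac")
    parser.add_argument("--mux_foreign_bitrate", default="192k")
    return parser


args = build_parser().parse_args(
    ["ref.mkv", "foreign.mkv", "out.mkv", "--db_threshold", "-35"]
)
print(args.scene_threshold, args.db_threshold)
```

Any option left off the command line falls back to the default shown in the list, which is why the example prints the default scene threshold next to the overridden dB threshold.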
- Video File: Reference video + original audio + synchronized foreign audio
- Synchronized Audio: WAV file with precisely timed foreign audio
- QC Images: Side-by-side frame comparisons for visual verification
- CSV Report: Detailed segment timing and processing statistics
- Use videos with clear scene changes and visual landmarks
- Ensure good video quality for accurate frame matching
- Ensure the reference and foreign videos are essentially identical apart from the audio; extra ads, different intro lengths, and similar differences will throw off the matching
- Ensure audio tracks have clear content boundaries where possible
- Use similar audio quality between reference and foreign tracks
- Lower scene threshold: Detects more frames (more anchor points)
- Higher match threshold: Stricter frame matching (fewer false positives)
- Longer min segment duration: Fewer, longer segments (more stable sync)
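To get a feel for how the scene threshold changes the number of candidate frames, FFmpeg's own scene-change score can be queried directly. The sketch below only builds the command line. It assumes, without confirmation from the source, that `--scene_threshold` behaves like the threshold in FFmpeg's `select='gt(scene,T)'` expression; the function name `scene_detect_command` is hypothetical.

```python
import shlex
from typing import List


def scene_detect_command(video: str, threshold: float = 0.25) -> List[str]:
    """Build an ffmpeg command that logs frames whose scene score exceeds threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    # select keeps frames whose scene-change score is above the threshold;
    # showinfo logs one line (including pts_time) per kept frame to stderr.
    video_filter = f"select='gt(scene,{threshold})',showinfo"
    return ["ffmpeg", "-hide_banner", "-i", video,
            "-vf", video_filter, "-f", "null", "-"]


# Print a shell-safe version of the command for copy and paste.
print(" ".join(shlex.quote(arg) for arg in scene_detect_command("ref.mkv", 0.15)))
```

Passing the returned list to `subprocess.run` and counting the `showinfo` lines on stderr for a few thresholds gives a quick estimate of how many anchor candidates each setting would produce.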
"FFmpeg not found"
- Ensure FFmpeg is installed and in your system PATH
- Test with ffmpeg -version in a terminal
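The same PATH check can be scripted before starting a long run. This is a minimal sketch using only the standard library; the function name `missing_prerequisites` is hypothetical and nothing here is part of AVSync itself.

```python
import shutil
import sys
from typing import List


def missing_prerequisites(min_python=(3, 7), tools=("ffmpeg", "ffprobe")) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < min_python:
        problems.append("Python %d.%d or higher is required" % min_python)
    for tool in tools:
        # shutil.which searches the directories listed in PATH
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found in PATH")
    return problems


for problem in missing_prerequisites():
    print("Missing:", problem)
```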
"No matches found"
- Try lowering --scene_threshold (e.g., 0.15)
- Try lowering --match_threshold (e.g., 0.6)
- Check that the videos actually correspond to each other
Audio sync drift
- Adjust --min_segment_duration for your content type
- Check --db_threshold if audio boundaries are incorrectly detected
- Review QC images to verify visual anchor quality
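For intuition about what a dBFS threshold means, the sketch below measures the RMS level of short windows and reports the first one louder than the threshold. It assumes samples normalized to the range -1.0 to 1.0 and a per-window RMS comparison; whether AVSync measures level this way is not stated in the source, and the function names are hypothetical.

```python
import math
from typing import Optional, Sequence


def rms_dbfs(samples: Sequence[float]) -> float:
    """RMS level in dBFS for samples normalized to [-1.0, 1.0]."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def first_audible_window(samples: Sequence[float], window: int,
                         db_threshold: float = -40.0) -> Optional[int]:
    """Return the start index of the first window louder than db_threshold."""
    for start in range(0, len(samples), window):
        if rms_dbfs(samples[start:start + window]) > db_threshold:
            return start
    return None


# 100 samples of near-silence (about -60 dBFS) followed by
# 100 samples of a louder signal (about -20 dBFS).
signal = [0.001] * 100 + [0.1] * 100
print(first_audible_window(signal, window=50))
```

Raising the threshold (for example from -40 to -35) makes more quiet material count as silence, which can only move the detected boundary later, never earlier.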
Performance issues
- Reduce video resolution for faster processing
- Adjust --similarity_threshold to reduce redundant anchors
- Use SSD storage for temporary files
- Processing time scales with video length and frame extraction count
- Typical processing speed: 1-5x real-time depending on content and hardware
- Memory usage peaks during frame extraction and comparison phases
- Temporary disk space required: ~2-10GB for feature-length content
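Given the temporary disk space figures above, a quick free-space check before a long run can save a failed job. The sketch below assumes temporary files go to the system temp directory, which the source does not confirm; the function name `has_free_temp_space` and the 10 GB default are illustrative.

```python
import shutil
import tempfile
from typing import Optional


def has_free_temp_space(required_gb: float = 10.0, path: Optional[str] = None) -> bool:
    """Return True if the filesystem holding path has at least required_gb free."""
    target = path or tempfile.gettempdir()
    free_bytes = shutil.disk_usage(target).free
    return free_bytes >= required_gb * 1024 ** 3


if not has_free_temp_space():
    print("Warning: less than 10 GB free in", tempfile.gettempdir())
```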
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
git clone https://github.com/stinkybread/avsync.git
cd avsync
pip install -r requirements-dev.txt

This project is licensed under the MIT License - see the LICENSE file for details.
- FFmpeg team for the excellent multimedia framework
- OpenCV community for computer vision tools
- SciPy contributors for audio processing capabilities
- Bug Reports: GitHub Issues
- Feature Requests: GitHub Discussions
- Email: vaibhav.bhat@gmail.com