FFsubsync C++

A ground-up C++ reimplementation of the ffsubsync Python project, built to run natively on Android and other platforms without a Python runtime.

What FFsubsync Does

FFsubsync aligns subtitles to video by:

Extracting speech segments from the video/audio track using Voice Activity Detection (VAD)
Converting subtitles into a binary "speech/not-speech" timeline
Aligning the two binary sequences using FFT-based cross-correlation to find the optimal time offset (and optionally framerate ratio)
Shifting the subtitles by the computed offset and optionally scaling for framerate mismatches
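
The steps above can be sketched in a few lines of C++. This is an illustration under stated assumptions, not the project's implementation: `Cue`, `to_timeline`, and `best_offset` are hypothetical names, the 100 Hz frame rate is taken from the CLI example output in this README, the agree/disagree scoring is an assumption, and the FFT-based correlation is swapped for a brute-force search over a bounded lag window to keep the example short.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical cue type: start and end time in seconds.
struct Cue {
    double start_s;
    double end_s;
};

constexpr int kFrameRateHz = 100;  // assumed: one frame per 10 ms

// Mark every frame covered by a cue as speech (1); all other frames are 0.
std::vector<std::uint8_t> to_timeline(const std::vector<Cue>& cues, std::size_t n_frames) {
    std::vector<std::uint8_t> timeline(n_frames, 0);
    for (const Cue& cue : cues) {
        const long long first = std::max(0LL, std::llround(cue.start_s * kFrameRateHz));
        const long long last = std::min(static_cast<long long>(n_frames),
                                        std::llround(cue.end_s * kFrameRateHz));
        for (long long i = first; i < last; ++i) {
            timeline[static_cast<std::size_t>(i)] = 1;
        }
    }
    return timeline;
}

// Brute-force stand-in for the FFT correlation: try every lag in
// [-max_lag, max_lag] and keep the one where the shifted subtitle timeline
// agrees best with the reference (+1 per matching frame, -1 per mismatch).
// Returns the lag in frames; a positive lag means "delay the subtitles".
int best_offset(const std::vector<std::uint8_t>& ref,
                const std::vector<std::uint8_t>& sub,
                int max_lag) {
    int best_lag = 0;
    long long best_score = std::numeric_limits<long long>::min();
    const long long ref_size = static_cast<long long>(ref.size());
    for (int lag = -max_lag; lag <= max_lag; ++lag) {
        long long score = 0;
        for (std::size_t j = 0; j < sub.size(); ++j) {
            const long long i = static_cast<long long>(j) + lag;
            if (i < 0 || i >= ref_size) continue;  // outside the overlap
            score += (ref[static_cast<std::size_t>(i)] == sub[j]) ? 1 : -1;
        }
        if (score > best_score) {
            best_score = score;
            best_lag = lag;
        }
    }
    return best_lag;
}
```

Dividing the returned lag by `kFrameRateHz` gives the offset in seconds. The brute-force loop costs O(n × lags); an FFT-based correlation computes the scores for all lags at once in O(n log n), which matters for feature-length audio.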

Why a C++ Reimplementation?

The primary motivation is to eliminate the Python runtime dependency so that the tool can be deployed on Android, iOS, embedded systems, and anywhere else Python is impractical.

Android Integration: No Python runtime required; native C++ integrates cleanly via JNI
Performance: FFT and audio processing benefit from native compilation and can leverage SIMD
Distribution: Single binary/library with no Python environment to manage
Portability: C++ can target iOS, embedded Linux, and desktop with the same core code

Status

FFT Aligner: Kiss FFT-based cross-correlation alignment
Sherpa ONNX VAD: Silero VAD via prebuilt macOS binaries
SRT Parser: Custom lightweight parser with round-trip support
Subtitle Speech Extraction: Binary vector generation from subtitle timestamps
macOS CLI Tool: ffsubsync_cli accepts --reference (video/audio), --subs-dir, and --model
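
To illustrate what round-trip support means for SRT timestamps, the sketch below parses the standard `HH:MM:SS,mmm` form into milliseconds and formats it back. The function names `parse_srt_time` and `format_srt_time` are hypothetical and are not taken from the project's `srt_parser.h`; the parser here is also deliberately lenient about digit counts.

```cpp
#include <cstdint>
#include <cstdio>
#include <optional>
#include <string>

// Parse "HH:MM:SS,mmm" into milliseconds. Returns std::nullopt when the text
// is malformed, has trailing characters, or has out-of-range fields.
std::optional<std::int64_t> parse_srt_time(const std::string& text) {
    int h = 0, m = 0, s = 0, ms = 0;
    char trailing = 0;
    // Exactly four fields must convert; a fifth match means trailing garbage.
    if (std::sscanf(text.c_str(), "%d:%d:%d,%d%c", &h, &m, &s, &ms, &trailing) != 4) {
        return std::nullopt;
    }
    if (h < 0 || m < 0 || m > 59 || s < 0 || s > 59 || ms < 0 || ms > 999) {
        return std::nullopt;
    }
    return ((h * 60LL + m) * 60 + s) * 1000 + ms;
}

// Format milliseconds back into "HH:MM:SS,mmm". Negative values are clamped
// to zero because SRT has no representation for negative timestamps.
std::string format_srt_time(std::int64_t total_ms) {
    if (total_ms < 0) total_ms = 0;
    const long long ms = total_ms % 1000;
    const long long s = (total_ms / 1000) % 60;
    const long long m = (total_ms / 60000) % 60;
    const long long h = total_ms / 3600000;
    char buf[32];
    std::snprintf(buf, sizeof buf, "%02lld:%02lld:%02lld,%03lld", h, m, s, ms);
    return buf;
}
```

Keeping timestamps as integer milliseconds means that parsing and re-formatting a well-formed timestamp reproduces the original text exactly, which is the property a round-trip test would check.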

Build

cmake -B build -S .
# sysctl is macOS-specific; on Linux use -j$(nproc) instead
cmake --build build -j$(sysctl -n hw.ncpu)
ctest --test-dir build --output-on-failure

CLI Usage

# Run sync against a video/audio file and a directory of SRT files
./build/ffsubsync_cli \
  --reference input.mp4 \
  --subs-dir /path/to/subtitles/ \
  --model models/silero_vad.onnx

Example output:

Audio: 321916 frames (3219.16s @ 100Hz)
The.Copenhagen.Test.S01E03.1080p.WEB.h264-ETHEL.srt -> offset: 0.42s, score: 260118
The.Copenhagen.Test.S01E03.1080p.10bit.WEBRip.6CH.x265.HEVC-PSA.srt -> offset: -0.07s, score: 256168
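
A reported offset such as `0.42s` is what the shift step applies to each cue. The sketch below shows one way to do that, with an optional framerate ratio applied first. The `CueMs` type and `scale_then_shift` function are hypothetical, and the order (scale, then shift) and the clamping behavior are assumptions rather than confirmed details of the project's transformers.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical cue type with integer millisecond timestamps.
struct CueMs {
    std::int64_t start_ms;
    std::int64_t end_ms;
};

// Scale every timestamp by `ratio`, then add `offset_s` seconds.
// Start times are clamped to zero and end times never precede their start.
std::vector<CueMs> scale_then_shift(std::vector<CueMs> cues, double ratio, double offset_s) {
    const double offset_ms = offset_s * 1000.0;
    for (CueMs& cue : cues) {
        cue.start_ms = std::max<std::int64_t>(
            0, std::llround(static_cast<double>(cue.start_ms) * ratio + offset_ms));
        cue.end_ms = std::max<std::int64_t>(
            cue.start_ms, std::llround(static_cast<double>(cue.end_ms) * ratio + offset_ms));
    }
    return cues;
}
```

For a pure shift, pass a ratio of `1.0`; with the first offset shown above, a cue at 1000 to 2500 ms would move to 1420 to 2920 ms.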

Files in This Directory

| File | Purpose |
| --- | --- |
| `README.md` | This file: high-level overview and quick start |
| `CMakeLists.txt` | Build configuration |
| `docs/ARCHITECTURE.md` | Component-by-component mapping from Python to C++ |
| `docs/DEPENDENCIES.md` | Dependency analysis and C++ equivalents |
| `docs/API_DESIGN.md` | Public C++ API and class structure |
| `docs/IMPLEMENTATION_ROADMAP.md` | Phased implementation plan with milestones |
| `docs/TESTING_STRATEGY.md` | How to verify correctness |
| `docs/CODE_MAP.md` | Detailed line-by-line mapping of Python modules to C++ |
| `docs/ADR/001-vad-dependency.md` | Architecture Decision Record: VAD dependency choice |

Quick Reference: Python -> C++ Mapping

| Python Module | C++ Component | Notes |
| --- | --- | --- |
| `aligners.py` | `src/aligners/` | FFTAligner, MaxScoreAligner |
| `speech_transformers.py` | `src/speech/` | VADProcessor (Sherpa ONNX), subtitle speech extraction |
| `subtitle_parser.py` | `src/subtitles/` | SRT/ASS/VTT parsers |
| `subtitle_transformers.py` | `src/subtitles/` | Shifter, Scaler, Merger |
| `generic_subtitles.py` | `src/subtitles/` | SubtitleEntry, SRTData, ASSData |
| `ffmpeg_utils.py` | `src/media/` | FFmpeg C API wrapper (Phase 2) |
| `golden_section_search.py` | `src/aligners/` | Golden-section search for framerate ratio |
| `sklearn_shim.py` | `src/core/` | Pipeline/Transformer pattern (simplified) |
| `ffsubsync.py` | `src/cli/` | CLI application + main orchestration |
| `constants.py` | `include/ffsubsync/constants.h` | Compile-time constants |
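
The golden-section search listed in the mapping above is a standard one-dimensional minimizer for unimodal functions. A generic sketch follows; the function name `golden_section_minimize` and its signature are hypothetical, and treating the framerate ratio as the search variable with a negated alignment score as the objective is an assumption about how it would be wired in.

```cpp
#include <cmath>
#include <functional>

// Minimize a unimodal function f on [lo, hi] by repeatedly shrinking the
// bracket by the inverse golden ratio, reusing one interior evaluation per
// iteration. Returns the midpoint of the final bracket.
double golden_section_minimize(const std::function<double(double)>& f,
                               double lo, double hi, double tol = 1e-6) {
    const double inv_phi = (std::sqrt(5.0) - 1.0) / 2.0;  // about 0.618
    double x1 = hi - inv_phi * (hi - lo);
    double x2 = lo + inv_phi * (hi - lo);
    double f1 = f(x1);
    double f2 = f(x2);
    while (hi - lo > tol) {
        if (f1 < f2) {
            // The minimum lies in [lo, x2]; the old x1 becomes the new x2.
            hi = x2;
            x2 = x1;
            f2 = f1;
            x1 = hi - inv_phi * (hi - lo);
            f1 = f(x1);
        } else {
            // The minimum lies in [x1, hi]; the old x2 becomes the new x1.
            lo = x1;
            x1 = x2;
            f1 = f2;
            x2 = lo + inv_phi * (hi - lo);
            f2 = f(x2);
        }
    }
    return 0.5 * (lo + hi);
}
```

Each iteration needs only one new evaluation of `f`, which is useful when an evaluation is expensive, for example when it involves rescaling a subtitle timeline and re-running the correlation.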

Key Dependencies

| Python Dependency | C++ Equivalent | Android Feasibility |
| --- | --- | --- |
| `ffmpeg-python` + `ffmpeg` | `libavformat`, `libavcodec`, `libavutil`, `libswresample` (Phase 2) | Excellent |
| `webrtcvad` | Sherpa ONNX + Silero VAD | Prebuilt AAR available |
| `numpy` (FFT) | Kiss FFT (vendored) | Trivial |
| `srt` (parser) | Custom C++ parser | N/A |
| `pysubs2` (ASS/SSA) | Custom C++ parser (Phase 2) | Moderate |
| `argparse` | `cxxopts` | N/A |
| `torch` + `silero-vad` | Sherpa ONNX (same models, native runtime) | Supported |

Project Structure

ffsubsync-cpp/
├── CMakeLists.txt
├── include/ffsubsync/
│   ├── constants.h
│   ├── types.h
│   ├── aligner.h
│   ├── srt_parser.h
│   ├── subtitle_speech.h
│   ├── vad_processor.h
│   └── ffmpeg_audio_decoder.h
├── src/
│   ├── core/pipeline.cpp
│   ├── aligners/
│   │   ├── fft_aligner.cpp
│   │   ├── max_score_aligner.cpp
│   │   └── golden_section_search.cpp
│   ├── speech/
│   │   ├── subtitle_speech.cpp
│   │   └── vad_processor.cpp
│   ├── media/
│   │   └── ffmpeg_audio_decoder.cpp
│   ├── subtitles/
│   │   ├── srt_parser.cpp
│   │   └── subtitle_transformer.cpp
│   └── cli/main.cpp
├── tests/
│   ├── test_fft_aligner.cpp
│   ├── test_subtitle_speech.cpp
│   ├── test_srt_parser.cpp
│   ├── test_vad_processor.cpp
│   └── main.cpp
├── models/
│   └── silero_vad.onnx
├── sherpa-onnx-v1.13.2-osx-arm64-jni/
│   ├── include/sherpa-onnx/c-api/
│   └── lib/
├── test_data/
│   ├── the.copenhagen.test.s01e03/
│   └── la.brea.s01e01/
└── third_party/kiss_fft/

Build Requirements

CMake 3.18+
C++17 compiler (Apple Clang 17+, GCC 11+, MSVC 2022+)
macOS: prebuilt Sherpa ONNX binaries included for arm64
Linux/Windows: download corresponding Sherpa ONNX release
FFmpeg libraries (libavformat, libavcodec, libswresample, libavutil)

FFmpeg Setup (Required)

FFmpeg is a system dependency on desktop. The project links against installed FFmpeg libraries and does not build FFmpeg from source.

macOS:

brew install ffmpeg

Ubuntu/Debian:

sudo apt install libavformat-dev libavcodec-dev libswresample-dev libavutil-dev

Verify:

pkg-config --exists libavformat && echo "FFmpeg OK"

If pkg-config is unavailable, CMake will fall back to searching common library paths (/opt/homebrew/lib, /usr/local/lib, /usr/lib).

Project Phases

Phase 1: Core Library + macOS CLI ✅ COMPLETE

FFT-based aligner (FFTAligner, MaxScoreAligner)
SRT parser and subtitle transformers (shifter, scaler)
Sherpa ONNX VAD integration with Silero model
macOS CLI tool that can sync SRT against WAV
All tests passing (13/13)

Phase 2: Media Pipeline 🔄 IN PROGRESS

ASS/SSA/VTT support
Framerate ratio inference and golden-section search
Subtitle merging
Serialized speech cache (.npz equivalent)

Phase 3: Advanced Features ⏳ PENDING

Full CLI with video input support
Encoding detection
Progress callbacks and logging
Subtitle writing with proper formatting

Phase 4: Android Integration ⏳ PENDING

JNI bindings
Gradle/AAR packaging
Android-specific FFmpeg build (minimal features)
Java/Kotlin wrapper API
Android UI demo app

Critical Design Decisions

FFT Library: Kiss FFT (simple, permissive, vendored) — can upgrade to KFR later
VAD: Sherpa ONNX + Silero VAD (superior accuracy, future ASR path)
Encoding Detection: uchardet vs ICU vs minimal built-in detection (Phase 2)
ASS/SSA Support: Full libass integration vs lightweight custom parser (Phase 2)
Memory Model: Streaming audio processing (chunked) vs loading entire audio
Error Handling: Exceptions in C++ API, return codes in C API boundary
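
The error-handling decision above (exceptions inside the C++ API, return codes at the C boundary) can be sketched as follows. Every name here is hypothetical: the `ffs_status` enum, `ffs_compute_offset`, and the stand-in `sketch::compute_offset_seconds` are illustrations and are not taken from `docs/API_DESIGN.md`.

```cpp
#include <exception>
#include <new>
#include <stdexcept>
#include <string>

extern "C" {
// Hypothetical status codes exposed through the C API.
typedef enum {
    FFS_OK = 0,
    FFS_ERR_INVALID_ARGUMENT = 1,
    FFS_ERR_OUT_OF_MEMORY = 2,
    FFS_ERR_UNKNOWN = 3
} ffs_status;
}

namespace sketch {
// Stand-in for a C++ API call that reports failure by throwing.
double compute_offset_seconds(const std::string& subtitle_path) {
    if (subtitle_path.empty()) {
        throw std::invalid_argument("empty subtitle path");
    }
    return 0.42;  // placeholder result, not a real computation
}
}  // namespace sketch

// C entry point: no exception may cross this boundary, so each one is
// translated into a status code and the result is returned via a pointer.
extern "C" ffs_status ffs_compute_offset(const char* subtitle_path,
                                         double* out_offset_s) noexcept {
    if (subtitle_path == nullptr || out_offset_s == nullptr) {
        return FFS_ERR_INVALID_ARGUMENT;
    }
    try {
        *out_offset_s = sketch::compute_offset_seconds(subtitle_path);
        return FFS_OK;
    } catch (const std::invalid_argument&) {
        return FFS_ERR_INVALID_ARGUMENT;
    } catch (const std::bad_alloc&) {
        return FFS_ERR_OUT_OF_MEMORY;
    } catch (...) {
        return FFS_ERR_UNKNOWN;
    }
}
```

A boundary of this shape is also what a JNI layer would typically call into, since letting a C++ exception propagate through C or JNI frames is undefined behavior.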

See docs/IMPLEMENTATION_ROADMAP.md for a detailed phase breakdown and docs/API_DESIGN.md for the proposed C++ interface.
