Many MoCap

AI-powered markerless motion capture from two synchronized camera views.

Two videos in → 3D human pose reconstruction → animation-ready BVH motion file out.

Overview

Many MoCap is a markerless AI motion-capture pipeline that reconstructs 3D human motion from two synchronized video streams recorded from different camera angles.

The system uses AI-based human pose estimation to detect body and hand landmarks in each camera view, then applies stereo computer vision to triangulate the detected 2D landmarks into 3D space. The reconstructed motion is converted into a hierarchical animation skeleton and exported as a BVH motion-capture file.

This project is built for low-cost motion-capture experiments using ordinary cameras instead of expensive marker suits or optical mocap stages.

Camera 0 video        Camera 1 video
     │                     │
     ├── AI pose detection ┤
     │                     │
     └── 2D body + hand keypoints
               │
               ▼
      Stereo triangulation
               │
               ▼
        3D keypoint tracks
               │
               ▼
   Skeleton solving + smoothing
               │
               ▼
        BVH motion capture file

What kind of AI is used?

This is not generative AI.
Many MoCap uses computer-vision AI for human pose estimation.

The AI part of the pipeline detects human landmarks in every video frame, which gives frame-by-frame motion tracks for:

full-body pose landmarks
left-hand landmarks
right-hand landmarks

The project combines:

Area	Role
AI / Machine Learning	Human pose and hand landmark detection
Computer Vision	Two-camera reconstruction and camera projection
Geometry	DLT triangulation from two views
Animation Systems	Skeleton hierarchy, joint rotations, BVH export
Signal Processing	Median filtering to reduce jitter
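
The signal-processing row above refers to median filtering of landmark tracks. The following is a minimal sketch of that idea, not the project's actual implementation: the function name `median_smooth`, the window size, and the `(frames, landmarks, dims)` array layout are assumptions made for illustration.

```python
import numpy as np


def median_smooth(tracks, window=5):
    """Sliding median over time for an array shaped (frames, landmarks, dims).

    Hypothetical helper: the window size and array layout are assumptions.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    tracks = np.asarray(tracks, dtype=float)
    half = window // 2
    # Repeat the first and last frames so the output keeps the input length.
    pad = [(half, half)] + [(0, 0)] * (tracks.ndim - 1)
    padded = np.pad(tracks, pad, mode="edge")
    # One window per output frame, stacked on a new trailing axis.
    windows = np.lib.stride_tricks.sliding_window_view(padded, window, axis=0)
    return np.median(windows, axis=-1)
```

A median is a reasonable choice for this kind of data because a single-frame spike (a landmark that jumps for one frame) is discarded outright rather than averaged into its neighbours.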

Key features

Takes two synchronized videos from different angles.
Detects body and hand landmarks using AI pose estimation.
Tracks up to 75 landmarks per frame:
- 33 body pose landmarks
- 21 left-hand landmarks
- 21 right-hand landmarks
Reconstructs 3D keypoints using calibrated stereo camera geometry.
Uses Direct Linear Transform (DLT) for triangulation.
Saves intermediate 2D and 3D keypoint data for debugging.
Applies motion smoothing to reduce noisy/jittery landmarks.
Builds a hierarchical human skeleton.
Estimates bone lengths from captured motion.
Computes joint rotations frame by frame.
Exports the final motion as a BVH file.
Includes a 3D visualizer for inspecting reconstructed motion.
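
As an illustration of the skeleton-related features above, here is a hedged sketch of how virtual joints and bone lengths could be derived from 3D keypoints. It assumes the body landmarks follow the MediaPipe Pose ordering (shoulders at 11 and 12, hips at 23 and 24), that the virtual HIP and NECK joints are midpoints, and that a bone length is the median distance over all frames. None of these choices is confirmed by the repository; the function names are hypothetical.

```python
import numpy as np

# MediaPipe Pose indices (assumed to match the body block of the keypoint layout).
L_SHOULDER, R_SHOULDER, L_HIP, R_HIP = 11, 12, 23, 24


def add_virtual_joints(kpts3d):
    """Return (hip, neck), each shaped (frames, 3), as left/right midpoints.

    kpts3d is assumed to be shaped (frames, landmarks, 3).
    """
    kpts3d = np.asarray(kpts3d, dtype=float)
    hip = 0.5 * (kpts3d[:, L_HIP] + kpts3d[:, R_HIP])
    neck = 0.5 * (kpts3d[:, L_SHOULDER] + kpts3d[:, R_SHOULDER])
    return hip, neck


def bone_length(parent, child):
    """Median parent-to-child distance over frames, skipping NaN frames."""
    dist = np.linalg.norm(np.asarray(child) - np.asarray(parent), axis=-1)
    return float(np.nanmedian(dist))
```

Taking the median rather than the mean keeps a few badly triangulated frames from stretching the estimated bone.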

Why this project is important

Many single-camera pose systems only estimate 2D landmarks or an approximate 3D pose inferred from one view.
Many MoCap instead triangulates each landmark from two calibrated camera views, so the 3D positions come from measured camera geometry rather than from a single-view depth guess.

This makes it useful for:

indie animation
game development
virtual production
AR / VR avatar motion
human-motion analysis
robotics and biomechanics experiments
low-cost motion capture research
animation prototyping for Blender, Unity, Unreal Engine, and similar tools

Pipeline

1. Two-view AI landmark detection

bodypose3d.py reads two video streams:

media/studio7/cam0.mp4
media/studio7/cam1.mp4

For each frame, it runs AI-based landmark detection on both views and extracts body and hand keypoints.

The current keypoint layout is:

0  - 32 : body pose landmarks
33 - 53 : left hand landmarks
54 - 74 : right hand landmarks
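
The index ranges above can be expressed as slices. This is a small sketch, assuming one frame is held as a NumPy array with 75 rows (one per landmark); the constant and function names are hypothetical.

```python
import numpy as np

BODY = slice(0, 33)         # indices 0-32
LEFT_HAND = slice(33, 54)   # indices 33-53
RIGHT_HAND = slice(54, 75)  # indices 54-74


def split_landmarks(frame_kpts):
    """Split one frame of 75 landmarks into body, left-hand, right-hand blocks."""
    frame_kpts = np.asarray(frame_kpts)
    if frame_kpts.shape[0] != 75:
        raise ValueError("expected 75 landmarks per frame")
    return frame_kpts[BODY], frame_kpts[LEFT_HAND], frame_kpts[RIGHT_HAND]
```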

If a landmark is not detected, it is stored as:

2D: [-1, -1]
3D: [-1, -1, -1]

This allows the pipeline to continue even when some points are temporarily missing.
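
Downstream code has to treat these sentinel rows as missing rather than as real coordinates. One possible way to do that, shown here only as a sketch with a hypothetical helper name, is to convert them to NaN and keep a boolean mask:

```python
import numpy as np


def mask_missing(kpts):
    """Replace rows equal to the all -1 sentinel with NaN.

    Works for 2D ([-1, -1]) and 3D ([-1, -1, -1]) keypoints; the last axis
    is assumed to hold the coordinates. Returns (cleaned copy, missing mask).
    """
    kpts = np.asarray(kpts, dtype=float).copy()
    missing = np.all(kpts == -1, axis=-1)
    kpts[missing] = np.nan
    return kpts, missing
```

Note that a genuine point at exactly (-1, -1, -1) would also be flagged; that is a limitation of any in-band sentinel value.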

2. Camera projection

utils.py loads camera calibration files and builds projection matrices for both cameras.

Expected calibration structure:

camera_parameters/
└── studio7/
    ├── camera0_intrinsics.dat
    ├── camera1_intrinsics.dat
    ├── world_to_camera0_rot_trans.dat
    └── world_to_camera1_rot_trans.dat

These files describe each camera's intrinsic parameters and its extrinsics: the rotation and translation that map world coordinates into that camera's frame.
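
For orientation, a projection matrix is conventionally built as P = K [R | t] from the intrinsic matrix K and the world-to-camera rotation R and translation t. The sketch below assumes those arrays have already been read from the calibration files; it does not parse the .dat files, whose exact layout is not described here, and the function names are hypothetical rather than taken from utils.py.

```python
import numpy as np


def projection_matrix(K, R, t):
    """Build the 3x4 projection matrix P = K [R | t]."""
    K = np.asarray(K, dtype=float).reshape(3, 3)
    R = np.asarray(R, dtype=float).reshape(3, 3)
    t = np.asarray(t, dtype=float).reshape(3, 1)
    return K @ np.hstack([R, t])


def project(P, point_3d):
    """Project a 3D world point to pixel coordinates with matrix P."""
    x = P @ np.append(np.asarray(point_3d, dtype=float), 1.0)
    return x[:2] / x[2]  # divide by the homogeneous coordinate
```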

3. 3D reconstruction

For every matching landmark pair from the two camera views, the pipeline triangulates a 3D position using DLT.
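
A minimal sketch of two-view DLT triangulation follows. It assumes `P0` and `P1` are the 3x4 projection matrices of the two cameras and that `uv0` and `uv1` are the pixel coordinates of the same landmark in each view; the function name is hypothetical and the repository's own routine in utils.py may differ in detail.

```python
import numpy as np


def triangulate_dlt(P0, P1, uv0, uv1):
    """Triangulate one 3D point from two views with the Direct Linear Transform."""
    P0 = np.asarray(P0, dtype=float)
    P1 = np.asarray(P1, dtype=float)
    u0, v0 = uv0
    u1, v1 = uv1
    # Each view contributes two linear constraints on the homogeneous point X.
    A = np.array([
        u0 * P0[2] - P0[0],
        v0 * P0[2] - P0[1],
        u1 * P1[2] - P1[0],
        v1 * P1[2] - P1[1],
    ])
    # The solution of A X = 0 is the right singular vector belonging to the
    # smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]
```

A caller would normally skip the triangulation when either view reports the missing-landmark value [-1, -1] and store [-1, -1, -1] instead.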

Generated files:

kpts_cam0.dat   # 2D keypoints from camera 0
kpts_cam1.dat   # 2D keypoints from camera 1
kpts_3d.dat     # reconstructed 3D keypoints

These intermediate files make the system easier to debug and improve.

4. 3D visualization

show_3d_pose.py loads the reconstructed 3D keypoints and visualizes the skeleton motion in a 3D plot.

Run:

python show_3d_pose.py

5. BVH export

BVHmaker4.py converts reconstructed 3D keypoints into an animation skeleton and exports a BVH file.

The exporter:

maps keypoint indices to named joints
adds virtual HIP and NECK joints
defines the body hierarchy
calculates bone lengths
creates a normalized base skeleton
computes root motion
computes per-joint rotations
writes the HIERARCHY and MOTION sections of a BVH file
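
To make the per-joint rotation step more concrete, here is a hedged sketch of one building block: the rotation that turns a bone's rest direction into its observed direction in a given frame. This is a generic construction (Rodrigues' formula), not necessarily what BVHmaker4.py does, and the function name is hypothetical. A full solver would also express each rotation relative to the parent joint and convert it to Euler angles in degrees, in the channel order declared in the BVH file.

```python
import numpy as np


def rotation_between(a, b):
    """Return the 3x3 rotation matrix that rotates direction a onto direction b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    v = np.cross(a, b)
    c = float(np.dot(a, b))
    if np.isclose(c, -1.0):
        # Opposite directions: rotate 180 degrees about any axis perpendicular to a.
        axis = np.cross(a, [1.0, 0.0, 0.0])
        if np.linalg.norm(axis) < 1e-8:
            axis = np.cross(a, [0.0, 1.0, 0.0])
        axis = axis / np.linalg.norm(axis)
        return 2.0 * np.outer(axis, axis) - np.eye(3)
    # Skew-symmetric cross-product matrix of v.
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    return np.eye(3) + vx + (vx @ vx) / (1.0 + c)
```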

Run:

python BVHmaker4.py kpts_3d.dat

Default output:

Bebinam_output.bvh

The generated .bvh file can be imported into tools such as:

Blender
Unity
Unreal Engine
MotionBuilder
Maya
other BVH-compatible animation tools
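
For readers unfamiliar with the format, the sketch below writes the smallest useful BVH layout: a HIERARCHY section with a root and one child joint, followed by a MOTION section with one line of channel values per frame. The joint names, offsets, and the function name `bvh_text` are illustrative assumptions and do not reflect the skeleton that BVHmaker4.py exports.

```python
def bvh_text(frames, frame_time=1.0 / 30.0):
    """Return BVH text for a two-joint skeleton (illustrative only).

    Each frame row holds 9 values: root X/Y/Z position, root Z/X/Y rotation,
    then child Z/X/Y rotation. Rotations are in degrees.
    """
    header = "\n".join([
        "HIERARCHY",
        "ROOT Hips",
        "{",
        "  OFFSET 0.0 0.0 0.0",
        "  CHANNELS 6 Xposition Yposition Zposition Zrotation Xrotation Yrotation",
        "  JOINT Spine",
        "  {",
        "    OFFSET 0.0 10.0 0.0",
        "    CHANNELS 3 Zrotation Xrotation Yrotation",
        "    End Site",
        "    {",
        "      OFFSET 0.0 10.0 0.0",
        "    }",
        "  }",
        "}",
    ])
    for row in frames:
        if len(row) != 9:
            raise ValueError("each frame needs 9 channel values")
    motion = ["MOTION", f"Frames: {len(frames)}", f"Frame Time: {frame_time:.6f}"]
    motion += [" ".join(f"{v:.4f}" for v in row) for row in frames]
    return header + "\n" + "\n".join(motion) + "\n"
```

Writing the returned string to a file with a .bvh extension gives something an importer can load, which is a convenient way to check channel order before debugging a full skeleton.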

Repository structure

many_mocap/
├── bodypose3d.py              # Two-view AI pose detection + 3D reconstruction
├── BVHmaker4.py               # 3D keypoints to BVH motion export
├── BVHmaker3.py               # Earlier BVH conversion experiment
├── show_3d_pose.py            # 3D pose visualization
├── utils.py                   # Camera projection, DLT, rotations, file IO
├── kpts_cam0.dat              # Camera 0 detected 2D keypoints
├── kpts_cam1.dat              # Camera 1 detected 2D keypoints
├── kpts_3d.dat                # Reconstructed 3D keypoints
├── kpts_3d_studio7.dat        # Studio capture 3D keypoint data
├── GrandPapa.bvh              # Sample BVH output
├── GrandMama.bvh              # Sample BVH output
└── GrandPapa_refigned.bvh     # Refined BVH output

Installation

Create a virtual environment:

python -m venv .venv
source .venv/bin/activate

On Windows PowerShell:

python -m venv .venv
.\.venv\Scripts\Activate.ps1

Install dependencies:

pip install numpy scipy opencv-python mediapipe matplotlib

Usage

1. Prepare two synchronized videos

Place your videos here:

media/studio7/cam0.mp4
media/studio7/cam1.mp4

The videos should capture the same motion from two different viewpoints.

2. Add camera calibration files

Place calibration files here:

camera_parameters/studio7/

Required files:

camera0_intrinsics.dat
camera1_intrinsics.dat
world_to_camera0_rot_trans.dat
world_to_camera1_rot_trans.dat

3. Generate 3D keypoints

python bodypose3d.py

This creates:

kpts_cam0.dat
kpts_cam1.dat
kpts_3d.dat

You can also use webcam IDs:

python bodypose3d.py 0 1
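
A sketch of how such arguments could be interpreted is shown below. It is an assumption about the behaviour, not a copy of bodypose3d.py: purely numeric arguments are treated as device indices, anything else as a file path, and the default video paths are used when no arguments are given. Both function names are hypothetical.

```python
DEFAULT_SOURCES = ("media/studio7/cam0.mp4", "media/studio7/cam1.mp4")


def parse_source(arg):
    """Digits become an integer device index; anything else stays a path."""
    return int(arg) if arg.isdigit() else arg


def resolve_sources(argv, defaults=DEFAULT_SOURCES):
    """Map command-line arguments (without the script name) to two sources."""
    if not argv:
        return tuple(defaults)
    if len(argv) == 2:
        return tuple(parse_source(a) for a in argv)
    raise ValueError("expected either no arguments or exactly two sources")
```

Either kind of value can then be passed to OpenCV's `cv2.VideoCapture`, which accepts a device index or a file name.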

4. Preview the reconstructed 3D pose

python show_3d_pose.py

5. Export BVH motion

python BVHmaker4.py kpts_3d.dat

Output:

Bebinam_output.bvh

Technical highlights

AI pose estimation for body and hands
Markerless motion capture without suits or body markers
Two-view stereo reconstruction
Camera calibration based projection matrices
DLT triangulation
75-point body + hand landmark representation
3D skeleton visualization
Median filtering for jitter reduction
Bone-length estimation
Hierarchical skeleton solving
BVH motion export

Current status

This repository is a working research prototype of a two-camera markerless motion-capture system.

The core concept is implemented:

two videos → AI keypoints → 3D reconstruction → skeleton motion → BVH

Future work can improve the developer experience, calibration flow, retargeting presets, and production packaging.

Roadmap

Author

Built by Ehsan Moradi as part of research and engineering work in AI, computer vision, 3D reconstruction, real-time systems, and animation pipelines.
