
shaundano/ocap-elephant

 
 


ocap-elephant

This is a fork of ocap by the open-world-agents team. It is set up to run on a VM, adds microphone capture, and is configured for automated graceful shutdown on Windows. When it starts, ocap writes its process ID to a file. A separate Python script can then read that file and send an interrupt signal to the process, which is what makes it possible to automate shutting ocap down.
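The PID-file handoff described above can be sketched roughly as follows. This is a minimal illustration, not the fork's actual code: the file name `ocap.pid` and the helpers `write_pid_file`, `read_pid_file`, `interrupt_signal`, and `request_shutdown` are hypothetical names chosen for the example.

```python
import os
import signal
import sys
from pathlib import Path

# Hypothetical file name; the fork's real PID file location may differ.
PID_FILE = Path("ocap.pid")


def write_pid_file(path: Path = PID_FILE) -> int:
    """Recorder side: store the current process ID so a controller can find it."""
    pid = os.getpid()
    path.write_text(str(pid), encoding="ascii")
    return pid


def read_pid_file(path: Path = PID_FILE) -> int:
    """Controller side: load the process ID that the recorder wrote."""
    return int(path.read_text(encoding="ascii").strip())


def interrupt_signal() -> int:
    """Pick the interrupt signal for the current platform."""
    if sys.platform == "win32":
        # On Windows, CTRL_C_EVENT only reaches console processes that
        # share a console with the sender.
        return signal.CTRL_C_EVENT
    return signal.SIGINT


def request_shutdown(path: Path = PID_FILE) -> None:
    """Read the PID file and ask that process to stop via an interrupt."""
    os.kill(read_pid_file(path), interrupt_signal())
```

In this sketch the recorder would call `write_pid_file()` once at startup, and the controlling script would call `request_shutdown()` when the recording should end. Because Windows only delivers `CTRL_C_EVENT` to console processes that share the sender's console, how the recorder is launched matters for whether the interrupt arrives.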

ocap


High-performance desktop recorder for Windows. Captures screen, audio, keyboard, mouse, and window events.

This project was first introduced and developed for the D2E project. For more details, see D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI. If you find this work useful, please cite our paper.

What is ocap?

ocap (Omnimodal CAPture) captures all essential desktop signals in a synchronized format. It records screen video, audio, keyboard and mouse input, and window events. It was built for the open-world-agents project but works for any desktop recording need.

TL;DR: Complete, high-performance desktop recording tool for Windows. Captures everything in one command.


Citation

Citing the original work:

@article{choi2025d2e,
  title={D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI},
  author={Choi, Suwhan and Jung, Jaeyoon and Seong, Haebin and Kim, Minchan and Kim, Minyeong and Cho, Yongjun and Kim, Yoonshik and Park, Yubeen and Yu, Youngjae and Lee, Yunsung},
  journal={arXiv preprint arXiv:2510.05684},
  year={2025}
}

About

Fork of the open-world-agents ocap multichannel desktop recorder that adds microphone recording, automated graceful shutdown, and non-GPU support.


Languages

  • Python 97.0%
  • Batchfile 3.0%