feat(pose_estimation): support multiprocessing videos_to_poses #130
AmitMY merged 2 commits into sign-language-processing:master
Conversation
```diff
- def pose_video(input_path: str, output_path: str, format: str, additional_config: dict):
+ def pose_video(input_path: str, output_path: str, format: str, additional_config: dict = {'model_complexity': 1}, progress: bool = True):
```
I think `additional_config` should default to `None`; it knows how to handle it.
We had this line here, which would raise an error:
Do we still want `{'model_complexity': 1}` by default? I don't know Mediapipe's default behavior or how much difference it makes.
I see, in that case keep it.
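For reference, a minimal sketch of the `None`-default pattern suggested here (the function body is elided; only the fallback is shown, and it keeps the `{'model_complexity': 1}` default per the discussion above):

```python
from typing import Optional

def pose_video(input_path: str, output_path: str, format: str,
               additional_config: Optional[dict] = None, progress: bool = True):
    # Avoid a mutable default argument: fall back inside the body instead.
    if additional_config is None:
        additional_config = {'model_complexity': 1}
    ...
```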
```diff
  def process_video(vid_path: Path, keep_video_suffixes: bool, pose_format: str, additional_config: dict) -> bool:
+     print(f'Estimating {vid_path} on CPU {psutil.Process().cpu_num()}')
```
We should probably switch to logging at some point rather than print statements, but apparently combining that with multiprocessing can be tricky.
https://docs.python.org/3/howto/logging-cookbook.html#logging-to-a-single-file-from-multiple-processes seems easiest.
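A minimal sketch of the QueueHandler/QueueListener pattern from that cookbook section (the log file name and worker function here are illustrative, not from this repo):

```python
import logging
import logging.handlers
import multiprocessing

def worker(queue, vid_path):
    # Workers only enqueue records; the listener in the parent process
    # does the actual file I/O, so writes from workers never interleave.
    root = logging.getLogger()
    root.addHandler(logging.handlers.QueueHandler(queue))
    root.setLevel(logging.DEBUG)
    root.info("Estimating %s", vid_path)

if __name__ == "__main__":
    queue = multiprocessing.Queue()
    handler = logging.FileHandler("videos_to_poses.log")  # assumed log file name
    handler.setFormatter(logging.Formatter("%(processName)s %(levelname)s %(message)s"))
    listener = logging.handlers.QueueListener(queue, handler)
    listener.start()
    procs = [multiprocessing.Process(target=worker, args=(queue, p))
             for p in ["a.mp4", "b.mp4"]]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    listener.stop()
```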
```diff
  try:
      pose_path = get_corresponding_pose_path(video_path=vid_path, keep_video_suffixes=keep_video_suffixes)
      if pose_path.is_file():
+         print(f"Skipping {vid_path}, corresponding .pose file already created.")
```
Here's an example where logging.debug might be good, as this could potentially output thousands of messages
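A small sketch of what that could look like (hypothetical helper, not the repo's actual function, assuming a module-level logger):

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)

def should_skip(vid_path: Path, pose_path: Path) -> bool:
    # DEBUG level keeps the per-file skip message out of normal output
    # when thousands of videos already have .pose files.
    if pose_path.is_file():
        logger.debug("Skipping %s, corresponding .pose file already created.", vid_path)
        return True
    return False
```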
Can be merged if it looks good to you.
Okay, the current script has a memory leak: memory usage goes to >100 GB after running 64 processes on thousands of large videos (BOBSL videos, 30 to 45 minutes each) overnight. Let me know if you have any clue about it.
@J22Melody last time I was debugging memory usage, I used memray: https://pypi.org/project/memray/
That was to do with the DGS corpus / the datasets library. I discovered in that case that tfds was expanding the entire video to frames and holding them in memory, which got really big really fast. I wonder if something similar is going on here: we've not tested with long videos, and the pose estimation may struggle with really long sequences of frames.
Following that suspicion and making some calculations...
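(For a sense of scale, a back-of-envelope version of such a calculation, with assumed numbers (25 fps, 720p uint8 RGB frames), not measured values:)

```python
# What holding every decoded frame of one 45-minute video in memory would cost,
# under assumed numbers (25 fps, 720p uint8 RGB frames):
frames = 45 * 60 * 25                  # 67,500 frames
bytes_per_frame = 1280 * 720 * 3       # ~2.76 MB per frame
print(frames * bytes_per_frame / 1e9)  # ~186 GB for a single video
```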
Gave it a quick test on Google Colab, found a bug. #135
OK, after a bit of messing around on Google Colab with memray (NB: you have to use --follow-fork or it won't work right with multiprocessing), I have some results. For two 5-minute videos from YouTube-ASL, AKzh_rKNqzo.mp4 and AKzh_rKNqzo_copy.mp4 (I just copied the same file), peak memory usage is about 600 MB. As expected, the peak is in process_holistic, specifically this line.
memray_flamegraphs_on_AKzh_rKNqzo_with_2_workers.zip
https://colab.research.google.com/drive/10vMrhoC0zVXLMzhJHB9kUi_0LqBvFhz7?usp=sharing is the quick notebook I threw together.
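(Side note: the same tracing can be driven from Python with memray's Tracker API; follow_fork mirrors the CLI flag, and main() below is only a stand-in for the real entry point:)

```python
import memray

def main():
    ...  # stand-in for the videos_to_poses entry point

# Trace the parent and any forked worker processes; the resulting .bin
# file can then be rendered with `memray flamegraph`.
with memray.Tracker("videos_to_poses.bin", follow_fork=True):
    main()
```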
Thanks for the investigation. Yes, every video is a few hundred MB, and if you have processed a few hundred videos without releasing them, the memory explodes. I guess the problem is with multiprocessing.
I looked over the code; I don't see why it would leak memory. We are duplicating the memory on the line @cleong110 referred to, though.
I think the first thing to investigate is: run pose estimation with a single process (…)
So in conclusion, if you run 64 processes, and each is 1 GB, but we also duplicate the memory, it makes sense that you ran out of memory.
Starting solution: Option 2 here
@cleong110 @AmitMY
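(Not necessarily the linked option, but one way the duplication could be avoided: a sketch that assumes the cost comes from materializing every frame before estimation. cv2 here is an assumed decoding backend, not necessarily what the repo uses.)

```python
from typing import Iterator

import cv2  # assumed decoding backend for this sketch
import numpy as np

def iter_frames(video_path: str) -> Iterator[np.ndarray]:
    # Yield frames one at a time instead of collecting them in a list,
    # so peak memory per worker stays near a single frame.
    cap = cv2.VideoCapture(video_path)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            yield frame
    finally:
        cap.release()
```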