print("Compute pitch")
pitch_dataset = dataset.cast_column(audio_column_name, Audio(sampling_rate=16_000)).map(
    pitch_apply,
    batched=True,
    batch_size=args.batch_size,
    with_rank=torch.cuda.device_count() > 0,
    num_proc=torch.cuda.device_count() * args.num_workers_per_gpu_for_pitch if torch.cuda.device_count() > 0 else args.cpu_num_workers,
    remove_columns=[audio_column_name],  # trick to avoid rewriting the audio
    fn_kwargs={"audio_column_name": audio_column_name, "penn_batch_size": args.penn_batch_size},
)
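If I understand correctly, with_rank=True makes map pass a rank keyword argument to pitch_apply, so each worker process can pin itself to one of the available GPUs. A minimal sketch of that device-assignment pattern as I read it (the function signature and defaults here are my assumption, not the actual dataspeech implementation):

import torch

def pitch_apply(batch, rank=None, audio_column_name="audio", penn_batch_size=4096):
    # With map(..., with_rank=True), `rank` is this worker's index in
    # [0, num_proc). Round-robin over the visible GPUs, so that
    # num_workers_per_gpu_for_pitch workers end up sharing each device.
    if rank is not None and torch.cuda.device_count() > 0:
        device = f"cuda:{rank % torch.cuda.device_count()}"
    else:
        device = "cpu"
    # ... move the pitch model to `device` and run it over the batch
    # in chunks of `penn_batch_size` frames ...
    return batch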
This is my configuration for running the script (I modified the loading part of it, since I work with local data rather than the Hugging Face Hub):
python main.py \
    --source "local" \
    --metadata_path "/mnt/personal/kubicra3/Czech_par/Czech_par_dataset/metadata_filtered.tsv" \
    --dataset_path "/mnt/personal/kubicra3/Czech_par/Czech_par_dataset" \
    --configuration "default" \
    --text_column_name "text" \
    --audio_column_name "path" \
    --output_dir "/mnt/personal/kubicra3/data-speech/Czech_par" \
    --cpu_num_workers 32 \
    --num_workers_per_gpu_for_squim 4 \
    --num_workers_per_gpu_for_pitch 4 \
    --num_workers_per_gpu_for_snr 4 \
    --rename_column \
    --repo_id "ParlaCZ-tts-tags" \
    --apply_squim_quality_estimation \
    --penn_batch_size 2048 \
    --batch_size 64
# Note: --penn_batch_size 4096 causes a CUDA OOM error.
The pitch computation part takes ages. I am computing it for a fairly large dataset (around 900 hours of recordings, each 0.5-30 s long), on 2 GPUs on my institution's cluster. I have been playing with penn_batch_size and batch_size, but nothing seems to speed it up. It looks like it will take about 30 hours to compute everything in this part. I know that processing that many audio recordings is heavy work, but it still seems oddly slow to me. Is this normal or not?
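For reference, this is roughly how I arrived at the 30-hour estimate: timing the same map call on a small slice and extrapolating (a rough sketch; it assumes dataset, args, and pitch_apply from the script above are in scope, and the 1000-row subset size is arbitrary):

import time

# Time the pitch map on a small slice and extrapolate to the full dataset.
subset = dataset.select(range(1_000)).cast_column(audio_column_name, Audio(sampling_rate=16_000))
start = time.time()
subset.map(
    pitch_apply,
    batched=True,
    batch_size=args.batch_size,
    with_rank=True,
    num_proc=torch.cuda.device_count() * args.num_workers_per_gpu_for_pitch,
    remove_columns=[audio_column_name],
    fn_kwargs={"audio_column_name": audio_column_name, "penn_batch_size": args.penn_batch_size},
    load_from_cache_file=False,  # force recomputation so the timing is real
)
elapsed = time.time() - start
print(f"{elapsed:.1f} s for 1000 rows -> ~{elapsed * len(dataset) / 1_000 / 3600:.1f} h total")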
Thank you for any response!