From de5a79ba4bb6785f966850886b1310999e7a9c99 Mon Sep 17 00:00:00 2001 From: Mayank Jain Date: Tue, 7 Nov 2023 09:32:41 +0530 Subject: [PATCH] Add warning for multi GPU finetuning for ASR --- asr-finetune-conformer-ctc-nemo.ipynb | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/asr-finetune-conformer-ctc-nemo.ipynb b/asr-finetune-conformer-ctc-nemo.ipynb index 40cc2e4..a0bf3b2 100644 --- a/asr-finetune-conformer-ctc-nemo.ipynb +++ b/asr-finetune-conformer-ctc-nemo.ipynb @@ -336,7 +336,11 @@ "\n", "NeMo uses `.yml` files to configure the training parameters. You may update them directly by editing the configuration file or from the command-line interface. For example, if the number of epochs needs to be modified, along with a change in the learning rate, you can add `trainer.max_epochs=100` and `optim.lr=0.02` and train the model. \n", "\n", - "The following sample command uses the `speech_to_text_ctc_bpe.py` script in the `examples` folder to train/fine-tune a Conformer-CTC ASR model for 1 epoch. For other ASR models like Citrinet, you may find the appropriate config files in the NeMo GitHub repo under [examples/asr/conf/](https://github.com/NVIDIA/NeMo/tree/main/examples/asr/conf).\n" + "The following sample command uses the `speech_to_text_ctc_bpe.py` script in the `examples` folder to train/fine-tune a Conformer-CTC ASR model for 1 epoch. For other ASR models like Citrinet, you may find the appropriate config files in the NeMo GitHub repo under [examples/asr/conf/](https://github.com/NVIDIA/NeMo/tree/main/examples/asr/conf).\n", + "\n", + "
\n", + "If using multiple workers, the number of shards should be divisible by the world size so that the shards split evenly among workers. If it is not divisible, a warning is logged and training proceeds, but it will likely hang at the last epoch. In addition, if using distributed processing, each shard must have the same number of entries after filtering is applied, so that each worker ends up with the same number of files. We currently do not check for this in any dataloader, and the user’s program may hang if the shards are uneven.\n", + "" ] }, {