Loss diverged to NaN when training VITS with fp16 option enabled. #4236

@ariacat3366

I trained VITS on the original dataset with the fp16 option enabled (--use_amp true), but after training had run for a few days, the loss values diverged to NaN.
I suspect that the handling of values that fall outside the fp16 range is not working properly.
I did not see this error on the same dataset when the fp16 option was disabled.
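
To make clear what I mean by "handling of values outside the fp16 range", here is a toy sketch in plain Python. It is not ESPnet's or PyTorch's actual code. `to_fp16_range` and `ToyLossScaler` are hypothetical names I made up, and the overflow model ignores rounding. It only illustrates the behavior I would expect from dynamic loss scaling: when a scaled gradient exceeds the fp16 maximum (65504), the update is skipped and the scale is reduced, so inf/NaN never reaches the weights.

```python
import math

FP16_MAX = 65504.0  # largest finite value in IEEE 754 half precision


def to_fp16_range(x: float) -> float:
    """Mimic fp16 overflow only: magnitudes above FP16_MAX become +/-inf.

    Rounding to fp16 precision is deliberately ignored in this toy model.
    """
    if math.isnan(x):
        return x
    if abs(x) > FP16_MAX:
        return math.copysign(math.inf, x)
    return x


class ToyLossScaler:
    """Hypothetical helper: skip the step and shrink the scale on overflow."""

    def __init__(self, scale=2.0 ** 16, backoff=0.5, growth=2.0, growth_interval=3):
        self.scale = scale
        self.backoff = backoff
        self.growth = growth
        self.growth_interval = growth_interval
        self.good_steps = 0

    def step(self, grads):
        """Return unscaled gradients, or None if the step must be skipped."""
        scaled = [to_fp16_range(g * self.scale) for g in grads]
        if not all(math.isfinite(g) for g in scaled):
            # Overflow detected: do not update, retry later with a smaller scale.
            self.scale *= self.backoff
            self.good_steps = 0
            return None
        unscaled = [g / self.scale for g in scaled]
        self.good_steps += 1
        if self.good_steps >= self.growth_interval:
            # Several clean steps in a row: cautiously grow the scale again.
            self.scale *= self.growth
            self.good_steps = 0
        return unscaled


scaler = ToyLossScaler()
for _ in range(3):
    # The first two calls overflow and are skipped; the third succeeds.
    print(scaler.step([0.5, 2.0]), scaler.scale)
```

Note that a guard like this only covers overflow in the scaled gradients. If inf or NaN is already produced inside the forward pass (for example, inf - inf), skipping steps would not help, and that may be what happens in my case.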

Labels: Bug, ESPnet2, TTS, Wontfix
