RuntimeError: Error(s) in loading state_dict for GPT #18

@CAVUling

Hi, when I run the following command from generate_guacamol_prop.sh:

    python generate/generate.py --model_weight gua_tpsa_logp_sas.pt --props tpsa logp sas --data_name guacamol2 --csv_name gua_tpsa_logp_sas_temp1 --gen_size 10000 --batch_size 512 --vocab_size 94 --block_size 100

I get the following error:

    RuntimeError: Error(s) in loading state_dict for GPT
    size mismatch for blocks.0.attn.mask: copying a param with shape torch.Size([1, 1, 101, 101]) from checkpoint, the shape in current model is torch.Size([1, 1, 201, 201]).
    size mismatch for blocks.1.attn.mask: copying a param with shape torch.Size([1, 1, 101, 101]) from checkpoint, the shape in current model is torch.Size([1, 1, 201, 201]).
    size mismatch for blocks.2.attn.mask: copying a param with shape torch.Size([1, 1, 101, 101]) from checkpoint, the shape in current model is torch.Size([1, 1, 201, 201]).
    size mismatch for blocks.3.attn.mask: copying a param with shape torch.Size([1, 1, 101, 101]) from checkpoint, the shape in current model is torch.Size([1, 1, 201, 201]).
    size mismatch for blocks.4.attn.mask: copying a param with shape torch.Size([1, 1, 101, 101]) from checkpoint, the shape in current model is torch.Size([1, 1, 201, 201]).
    size mismatch for blocks.5.attn.mask: copying a param with shape torch.Size([1, 1, 101, 101]) from checkpoint, the shape in current model is torch.Size([1, 1, 201, 201]).
    size mismatch for blocks.6.attn.mask: copying a param with shape torch.Size([1, 1, 101, 101]) from checkpoint, the shape in current model is torch.Size([1, 1, 201, 201]).
    size mismatch for blocks.7.attn.mask: copying a param with shape torch.Size([1, 1, 101, 101]) from checkpoint, the shape in current model is torch.Size([1, 1, 201, 201]).
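
To narrow this down, it may help to list every key whose shape differs between the checkpoint and the freshly built model before calling load_state_dict. The sketch below is only an illustration: the helper name shape_mismatches is hypothetical, and the shapes are copied from the error message above instead of being read from the real files. With --block_size 100, note that 101 = block_size + 1 while 201 = 2 * block_size + 1, which might mean the model is being constructed with about 100 more positions than the checkpoint was trained with. That is an inference from the numbers alone, not something confirmed by the repository code.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def shape_mismatches(
    ckpt_shapes: Dict[str, Shape], model_shapes: Dict[str, Shape]
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for shared keys whose shapes differ."""
    shared = sorted(ckpt_shapes.keys() & model_shapes.keys())
    return [
        (name, ckpt_shapes[name], model_shapes[name])
        for name in shared
        if ckpt_shapes[name] != model_shapes[name]
    ]


# Shapes copied from the error message (blocks 0 through 7).
# With PyTorch installed, and ASSUMING the .pt file is a plain state dict,
# the two mappings could instead be built like this:
#   ckpt  = {k: tuple(v.shape) for k, v in torch.load(path, map_location="cpu").items()}
#   model = {k: tuple(v.shape) for k, v in gpt_model.state_dict().items()}
# where `path` and `gpt_model` are placeholders for your own objects.
ckpt = {f"blocks.{i}.attn.mask": (1, 1, 101, 101) for i in range(8)}
model = {f"blocks.{i}.attn.mask": (1, 1, 201, 201) for i in range(8)}

for name, ckpt_shape, model_shape in shape_mismatches(ckpt, model):
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

If only the attn.mask entries show up in such a listing, the learned weights probably match and the difference lies in how the model's sequence length is configured at generation time compared with training.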
