
Some questions about finetuning #36

@victorup

Hi,
I would like to finetune CogView2 on my own dataset. I used cogdata to process the dataset, with a JSON file mapping image names to captions ({"img1": "text1", "img2": "text2", ...}) and a tar file containing the images named by the JSON keys.
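
For concreteness, the caption file I generate looks roughly like this (the names captions.json and img1.jpg are just my own, and I am not certain this matches the exact schema cogdata expects):

    import json

    # Hypothetical layout (file names are mine): keys are the image file
    # names inside the tar, values are the matching captions.
    captions = {
        "img1.jpg": "text1",
        "img2.jpg": "text2",
    }
    with open("captions.json", "w", encoding="utf-8") as f:
        json.dump(captions, f, ensure_ascii=False)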

When I run pretrain_coglm.py, an error occurs:

File "pretrain_coglm.py", line 210, in forward_step
    tokens, position_ids, labels, attention_mask, loss_mask = get_batch(
  File "pretrain_coglm.py", line 61, in get_batch
    raise ValueError('temporally not support pure image samples')
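
If I understand the error, get_batch is seeing samples that contain no text tokens. A small script along these lines can verify that every image in the tar has a non-empty caption (file names are my own):

    import json
    import tarfile

    # Hypothetical file names; checks that every image in the tar has a
    # non-empty caption, since text-free samples trigger the ValueError.
    with open("captions.json", encoding="utf-8") as f:
        captions = json.load(f)
    with tarfile.open("images.tar") as tar:
        names = set(tar.getnames())
    for img, text in captions.items():
        assert img in names, f"{img} missing from tar"
        assert text.strip(), f"{img} has an empty caption"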

I commented out the raise line and encountered another error:

File "pretrain_coglm.py", line 214, in forward_step
    logits, *mems = model(tokens, position_ids, attention_mask)
  File "/home/xinpeng/miniconda3/envs/cogview_py38/lib/python3.8/site-packages/torch/nn/modules/mo
dule.py", line 1102, in _call_impl                                                                
    return forward_call(*input, **kwargs)                                                         
  File "/home/xinpeng/miniconda3/envs/cogview_py38/lib/python3.8/site-packages/deepspeed/utils/nvt
x.py", line 11, in wrapped_fn
    return func(*args, **kwargs)
  File "/home/xinpeng/miniconda3/envs/cogview_py38/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 1568, in forward
    loss = self.module(*inputs, **kwargs)
  File "/home/xinpeng/miniconda3/envs/cogview_py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/xinpeng/miniconda3/envs/cogview_py38/lib/python3.8/site-packages/SwissArmyTransformer/model/base_model.py", line 111, in forward
    return self.transformer(*args, **kwargs)
  File "/home/xinpeng/miniconda3/envs/cogview_py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/xinpeng/miniconda3/envs/cogview_py38/lib/python3.8/site-packages/SwissArmyTransformer/model/transformer.py", line 411, in forward
    hidden_states = HOOKS_DEFAULT['word_embedding_forward'](self, input_ids, output_cross_layer=output_cross_layer,**kw_args)
  File "/home/xinpeng/miniconda3/envs/cogview_py38/lib/python3.8/site-packages/SwissArmyTransforme
r/transformer_defaults.py", line 117, in word_embedding_forward_default
    return self.transformer.word_embeddings(input_ids)
  File "/home/xinpeng/miniconda3/envs/cogview_py38/lib/python3.8/site-packages/torch/nn/modules/mo
dule.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/xinpeng/miniconda3/envs/cogview_py38/lib/python3.8/site-packages/SwissArmyTransforme
r/mpu/layers.py", line 121, in forward
    output_parallel = F.embedding(masked_input, self.weight,
  File "/home/xinpeng/miniconda3/envs/cogview_py38/lib/python3.8/site-packages/torch/nn/functional
.py", line 2044, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: 'weight' must be 2-D
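
For what it's worth, this error can be reproduced in isolation whenever the embedding weight is not a 2-D [vocab_size, hidden_size] matrix, so my guess is that the word-embedding parameter is never properly materialized (perhaps DeepSpeed parameter partitioning is involved, though I have not confirmed this):

    import torch
    import torch.nn.functional as F

    # If the embedding weight is not a [vocab_size, hidden_size] matrix
    # (e.g. a placeholder left by parameter partitioning), F.embedding
    # fails with exactly this message.
    weight = torch.empty(0)
    ids = torch.tensor([0, 1, 2])
    F.embedding(ids, weight)  # RuntimeError: 'weight' must be 2-D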

My environment is as follows:

  • python==3.8.0
  • torch==1.10.0+cu111
  • ipython==7.21.0
  • deepspeed==0.6.3
  • SwissArmyTransformer==0.2.1
  • icetk==0.0.7
  • sentencepiece==0.1.98

Additionally, I would like to know whether there is a command to resume from the checkpoint you provided, so that I can fine-tune from it. How can I do that?

Thank you!
