Describe the bug
Based on your expected output file for `flan_t5_small`, you got 128,743,488 parameters.
The paper mentions 80M and Hugging Face says 77M.
With my own script I arrive at 76,961,152, which matches the expected order of magnitude.
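For reference, a minimal sketch of how such a count can be obtained directly from a PyTorch module (the helper name `count_parameters` is mine; for the real check you would pass the model loaded with `AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")`):

```python
import torch.nn as nn


def count_parameters(model: nn.Module) -> int:
    # Total number of parameters, trainable or not.
    # Note that tied/shared weights (e.g. T5's shared embedding)
    # are counted only once here, since .parameters() deduplicates them.
    return sum(p.numel() for p in model.parameters())


# Toy module to demonstrate: 10*5 weights + 5 biases = 55 parameters.
toy = nn.Linear(10, 5)
print(count_parameters(toy))  # 55
```

Shared weights are one likely source of a discrepancy between such a count and a layer-by-layer summary, since a per-layer tally can count a tied tensor once per layer that uses it.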
To Reproduce
Run:
```python
import torch
from torchinfo import summary
from transformers import AutoModelForSeq2SeqLM


def test_flan_t5_small() -> None:
    model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
    inputs = {
        "input_ids": torch.zeros(3, 100).long(),
        "attention_mask": torch.zeros(3, 100).long(),
        "labels": torch.zeros(3, 100).long(),
    }
    summary(model, input_data=inputs)
```
Expected behavior
Get the correct number of parameters.
The snippet above corresponds to torchinfo/tests/torchinfo_xl_test.py, lines 158 to 165 at commit 164597f.