Performance discrepancy with NaN coordinates vs coordinates=None #294

There is a performance discrepancy between passing in null coordinates (a tensor full of NaNs) and passing `coordinates=None`.
Passing in null coordinates gives significantly worse results: mean pTM drops from about 0.43 to about 0.10 in the runs below. When coordinates are supplied, `torch.inf` padding tokens are added to them by default, whereas with `coordinates=None` the coordinates become a tensor consisting only of NaNs.
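
To check which of the two cases a given coordinates tensor falls into, the non-finite entries can be counted directly. This is a minimal sketch using only standard PyTorch calls. `summarize_coords` is a hypothetical helper name, and the example tensors only imitate the two situations described above; they are not what the library builds internally.

```python
import torch


def summarize_coords(coords: torch.Tensor) -> dict:
    """Count NaN, infinite, and finite entries in a coordinates tensor."""
    return {
        "nan": int(torch.isnan(coords).sum()),
        "inf": int(torch.isinf(coords).sum()),
        "finite": int(torch.isfinite(coords).sum()),
    }


# All-NaN coordinates, shaped like the tensor passed explicitly in the report
# (shortened to 4 residues for readability).
nan_coords = torch.full((4, 3, 3), float("nan"))

# Hypothetical illustration of inf padding: one extra row at each end.
# The real padding layout is an assumption here, not taken from the library.
pad = torch.full((1, 3, 3), float("inf"))
padded = torch.cat([pad, nan_coords, pad], dim=0)

print(summarize_coords(nan_coords))
print(summarize_coords(padded))
```

Comparing these counts for the tensors produced on each code path would show whether the inputs reaching the model really differ only in their padding.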

To reproduce the discrepancy:
```
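import numpy as np
import torch
from esm.sdk.api import ESMProtein, GenerationConfig

# `model` is assumed to be an already-loaded ESM3 model on the GPU.
ptms = []
for _ in range(30):
    seq_cfg = GenerationConfig(track="sequence", num_steps=8)
    struct_cfg = GenerationConfig(track="structure", num_steps=8)
    # Case 1 (coordinates=None): uncomment this line and comment out Case 2.
    # seq = model.generate(ESMProtein(sequence="_" * 256, coordinates=None), seq_cfg)
    # Case 2 (all-NaN coordinates):
    nan_coords = torch.full((256, 3, 3), float("nan"), dtype=torch.float, device="cuda")
    seq = model.generate(ESMProtein(sequence="_" * 256, coordinates=nan_coords), seq_cfg)
    # Fold the generated sequence and record the predicted TM-score.
    structure = model.generate(seq, struct_cfg)
    ptms.append(structure.ptm.item())
print(np.mean(ptms))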
```

- `coordinates=None`: mean pTM 0.4264
- `coordinates` = NaNs: mean pTM 0.09578
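
Until the cause is understood, one possible workaround is to normalize all-NaN coordinates to `None` before building the protein. This is a minimal sketch that assumes the two inputs are meant to be equivalent. `coords_or_none` is a hypothetical helper, not part of the library.

```python
from typing import Optional

import torch


def coords_or_none(coords: Optional[torch.Tensor]) -> Optional[torch.Tensor]:
    """Return None when coords carries no structural information.

    Treats None, an empty tensor, and an all-NaN tensor as "no coordinates".
    Any tensor with at least one non-NaN entry is returned unchanged.
    """
    if coords is None or coords.numel() == 0:
        return None
    if torch.isnan(coords).all():
        return None
    return coords
```

It would be used as `ESMProtein(sequence=..., coordinates=coords_or_none(coords))`, so that an all-NaN tensor takes the same path as `coordinates=None`. This sidesteps the discrepancy rather than explaining it.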

