slow predict_as_dataframe function #830

@sevmag

Description

The function:

def predict_as_dataframe(

Can be really slow when additional attributes are requested; the attribute collection is handled here:

for batch in dataloader:
    for attr in attributes:
        attribute = batch[attr]
        if isinstance(attribute, torch.Tensor):
            attribute = attribute.detach().cpu().numpy()
        # Check if node level predictions
        # If true, additional attributes are repeated
        # to make dimensions fit
        if pulse_level_predictions:
            if len(attribute) < np.sum(
                batch.n_pulses.detach().cpu().numpy()
            ):
                attribute = np.repeat(
                    attribute, batch.n_pulses.detach().cpu().numpy()
                )
        attributes[attr].extend(attribute)
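
For context, the repetition step maps one attribute value per event onto one value per pulse. A tiny standalone illustration of that alignment (the numbers below are made up, not taken from the issue):

import numpy as np

energy = np.array([10.0, 250.0])   # one attribute value per event
n_pulses = np.array([3, 2])        # pulses per event in the batch
aligned = np.repeat(energy, n_pulses)
print(aligned)                     # [ 10.  10.  10. 250. 250.] -> one value per pulse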

In the prediction loop we are already iterating over the whole dataset. Can we think of a way to use that initial loop to collect the additional attributes as well, instead of looping through the dataloader a second time? At the very least, the second loop should get a progress bar so the user knows the program has not stalled (I was confused more than once about why it was taking so long).
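
One possible direction, sketched below under explicit assumptions: run the model and gather the additional attributes in the same pass over the dataloader, and wrap that pass in a progress bar. The function name predict_and_collect and the direct model(batch) call are hypothetical and not the current graphnet implementation.

import numpy as np
import torch
from tqdm import tqdm


def predict_and_collect(model, dataloader, attribute_names, pulse_level_predictions=False):
    """Hypothetical single-pass version: run the model and gather the
    additional attributes from each batch in the same loop."""
    predictions = []
    attributes = {attr: [] for attr in attribute_names}
    model.eval()
    with torch.no_grad():
        # tqdm makes the (single) pass over the dataloader visible to the user.
        for batch in tqdm(dataloader, desc="Predicting", unit="batch"):
            predictions.append(model(batch))  # assumes the model accepts a batch directly
            n_pulses = batch.n_pulses.detach().cpu().numpy()
            for attr in attribute_names:
                values = batch[attr]
                if isinstance(values, torch.Tensor):
                    values = values.detach().cpu().numpy()
                # Repeat event-level attributes so they line up with
                # pulse-level predictions (one row per pulse).
                if pulse_level_predictions and len(values) < np.sum(n_pulses):
                    values = np.repeat(values, n_pulses)
                attributes[attr].extend(values)
    return predictions, attributes

Even if the prediction itself has to stay inside the existing Trainer-based predict call, simply wrapping the current second loop in tqdm(dataloader, desc=...) would already make it obvious that the program is progressing rather than stalled.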
