Skip to content

Add KeyData.train_index_bounds#704

Merged
takluyver merged 3 commits intomasterfrom
feat/keydata-train-index
Mar 10, 2026
Merged

Add KeyData.train_index_bounds#704
takluyver merged 3 commits intomasterfrom
feat/keydata-train-index

Conversation

@philsmt
Copy link
Copy Markdown
Contributor

@philsmt philsmt commented Dec 19, 2025

When using data with more than one entry per train, I find it sometimes convenient (and also regularly more performant) to use plain old ndarrays and have arrays lying around indicating the first and last index of a given train.

This PR adds a method KeyData.train_index_bounds() to deliver this based on KeyData.data_counts(). I went with the traditional labelled approach, though I don't have hard feelings for that.

@philsmt philsmt requested a review from takluyver December 19, 2025 10:41
Comment thread extra_data/tests/test_keydata.py Fixed
@takluyver
Copy link
Copy Markdown
Member

If this fits in with using .ndarray(), how about we make labelled=False the default? I quite like having a convenient way to do lower-level analysis without involving pandas/xarray, and this seems like it would be a natural part of that.

Comment thread extra_data/tests/mockdata/xgm.py
@philsmt philsmt force-pushed the feat/keydata-train-index branch from 8857c23 to b9daf72 Compare January 26, 2026 10:13
@philsmt
Copy link
Copy Markdown
Contributor Author

philsmt commented Jan 26, 2026

Sure, I switched the default to labelled=False.

Comment thread extra_data/keydata.py Outdated
Comment on lines +539 to +543
"""Generate first and last indices of trains to use alongside data
from ``.ndarray()``.

If *labelled* is True, returns a pandas dataframe with columns first
and last. Otherwise, returns a tuple of two NumPy arrays.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpicking, could the docstring and the column names refer to 'start' and 'stop' instead of first & last? Or we can find other names.

If you're talking about a section of the data which you'd slice as [10:20], the index of the last item is 19, not 20.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's no problem, I don't have any strong feelings and your argument makes sense.

@takluyver takluyver added this to the 1.24 milestone Mar 10, 2026
@philsmt philsmt force-pushed the feat/keydata-train-index branch from b9daf72 to ffdd5dd Compare March 10, 2026 14:59
@takluyver takluyver merged commit 5bda4f8 into master Mar 10, 2026
10 checks passed
@takluyver takluyver deleted the feat/keydata-train-index branch March 10, 2026 15:20
@takluyver
Copy link
Copy Markdown
Member

Thanks!

@philsmt
Copy link
Copy Markdown
Contributor Author

philsmt commented Mar 10, 2026

Thanks for review!

I should refrain from making fixup commits with your quick merge finger 🫣

@takluyver
Copy link
Copy Markdown
Member

Oops, sorry. I sometimes get my wires crossed between XFEL projects where I say LGTM and let people merge their own PRs, and projects where I hit merge myself. I'm happy to LGTM a fixup PR if you want to do one. 😉

@philsmt
Copy link
Copy Markdown
Contributor Author

philsmt commented Mar 10, 2026

No, that's quite alright. I refered to the fact I left a fixup commit in the history, and now it's in main. But I'm probably caring too much about history 😉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants