@abouteiller abouteiller commented Mar 12, 2025

Issue parsec_fatal when the datatype_arenas have not been set (rather than crashing in a cryptic way), as exemplified by ICLDisco/dplasma#138

abouteiller added the documentation and enhancement labels Mar 12, 2025
abouteiller added this to the v4.1 milestone Mar 12, 2025
abouteiller self-assigned this Mar 12, 2025
abouteiller requested a review from a team as a code owner March 12, 2025 14:20
abouteiller changed the title from "Issue parsec_fatal when the datatype_arenas have not been set (rather" to "Issue parsec_fatal when the datatype_arenas have not been set in the PTG" Mar 12, 2025

@bosilca bosilca left a comment

Why here in the iterator? This will bail out on the first flow without a datatype, and you should only bail out if, at the end of get_datatype, you don't get a proper datatype, i.e. in remote_dep_mpi.c:934.

@abouteiller

We got a non-zero count and no datatype; that combination should not exist.

bosilca commented Mar 12, 2025

It should not matter until the end of the iterator. We don't have to police every dep type, only that at the end we have one.

devreal commented Mar 27, 2025

Is it ever valid to have a non-zero count with an invalid datatype? How? Otherwise this error seems legit.

bosilca commented Mar 28, 2025

Define valid in this particular context? From the communication perspective it is invalid to have a non-zero count with an invalid datatype, but in the receive case the communication is bound to a flow, not to any particular dep inside the flow. This PR checks each dep, even the ones that will not ultimately be associated with the communication, instead of checking only once at the end to make sure the flow (and therefore the communication to be issued) is correct.
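
To make the difference between the two checks concrete, here is a minimal sketch in plain C. It is a toy model, not PaRSEC code: `dep_desc_t`, `check_each_dep`, and `check_flow_at_end` are hypothetical names, a NULL `datatype` stands in for "no datatype set", and the way the per-flow description is assembled (each dep may contribute a count and/or a datatype) is an assumption made only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for what one dep contributes; not a PaRSEC type. */
typedef struct {
    int count;            /* number of elements to transfer */
    const char *datatype; /* NULL models "no datatype set" */
} dep_desc_t;

/* Per-dep check: reject as soon as any single dep has a non-zero count
 * but no datatype, even if another dep of the same flow provides one. */
bool check_each_dep(const dep_desc_t *deps, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (deps[i].count > 0 && deps[i].datatype == NULL) {
            return false;
        }
    }
    return true;
}

/* End-of-flow check: iterate over every dep, accumulate the description
 * for the flow (assumed merge rule), and validate only the final result. */
bool check_flow_at_end(const dep_desc_t *deps, size_t n)
{
    dep_desc_t flow = { 0, NULL };
    for (size_t i = 0; i < n; i++) {
        if (deps[i].count > 0) {
            flow.count = deps[i].count;
        }
        if (deps[i].datatype != NULL) {
            flow.datatype = deps[i].datatype;
        }
    }
    /* Invalid only if the flow as a whole has data to move but no datatype. */
    return flow.count == 0 || flow.datatype != NULL;
}
```

In this model, a flow whose first dep has a count but no datatype and whose second dep supplies the datatype is rejected by the per-dep check and accepted by the end-of-flow check. A flow in which no dep ever supplies a datatype is rejected by both.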
