Skip to content

PML-302 MerLinProcessor export_config and typing#224

Open
LF-Vigneux wants to merge 17 commits into
merlinquantum:release/0.4from
LF-Vigneux:PML-302-MerLinProcessor-export_config-and-typing
Open

PML-302 MerLinProcessor export_config and typing#224
LF-Vigneux wants to merge 17 commits into
merlinquantum:release/0.4from
LF-Vigneux:PML-302-MerLinProcessor-export_config-and-typing

Conversation

@LF-Vigneux
Copy link
Copy Markdown
Contributor

Summary

Reinforced typing around the layer's export_config method necessity and its output dictionary

Related Issue

PML-302

Type of change

  • Bug fix
  • New feature
  • Documentation update
  • Refactor / Cleanup
  • Performance improvement
  • CI / Build / Tooling
  • Breaking change (requires migration notes)

Proposed changes

  • Created the runtime checkable SupportsExportConfig that makes sure that layers have a export_config function. The error message clearly indicates what should be the output of this method as well
  • Created the class ValidatedLayerConfig that validates and keeps the typed checked values of the output of export_config.

How to test / How to run

pytest -q

Documentation

  • User docs updated (Sphinx)
  • Examples / notebooks updated
  • Docstrings updated
  • Updated the API

@LF-Vigneux
Copy link
Copy Markdown
Contributor Author

I am not sure that I fullfill this criteria perfectly:

There is one obvious typed place to look to understand what MerlinProcessor expects from offloadable layers.

However, When there is no export_config method, I clearly state what the output of this method should look like. Also, a specific definition of types is generated for the errors of the output dictionary. I want your opinion on this point.

@ben9871 ben9871 self-assigned this May 29, 2026
@LF-Vigneux LF-Vigneux marked this pull request as ready for review May 29, 2026 15:30
Copy link
Copy Markdown
Contributor

@ben9871 ben9871 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Token validation is too extreme and stops the branches tests from running. Other notes covered in a PR to this fork unrelated to the ticket(only visible now following tests with scaleway token)

Comment thread merlin/core/merlin_processor.py Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This token requirement should only apply to the remote_processor= path. For session=, authentication is owned by the ISession; Merlin creates fresh processors via session.build_remote_processor() and does not need to clone them with a token. With a real Scaleway Session, this currently makes every MerlinProcessor(session=scaleway_session) construction fail before any job is submitted.

Fix Scaleway compatibility with Perceval 1.2
@LF-Vigneux
Copy link
Copy Markdown
Contributor Author

From your markdown fixes

  1. Ownership of Session: From what I understand, I removed the session paths to enter and exit in the corresponding methods

@LF-Vigneux
Copy link
Copy Markdown
Contributor Author

For 3, changed in next commit

@CassNot CassNot added this to the v0.4 milestone May 29, 2026
@LF-Vigneux
Copy link
Copy Markdown
Contributor Author

For 2, commented the probs test, right now there is no session that has probs.
Here are all the platforms
SAMPLES ONLY: EMU-SAMPLING-4H100SXM ['samples', 'sample_count']
SAMPLES ONLY: QPU-BELENOS-12PQ ['sample_count', 'samples', 'entropy']
FAILED: QPU-ALTAIR-10PQ {'help_message': 'platform QPU in maintenance, wait for platform to be available or use another platform', 'message': 'precondition is not respected', 'precondition': 'resource_not_usable', 'type': 'precondition_failed'}
SAMPLES ONLY: EMU-ALTAIR-10PQ ['sample_count', 'samples']
SAMPLES ONLY: EMU-BELENOS-12PQ ['sample_count', 'samples']
SAMPLES ONLY: EMU-ASCELLA-6PQ ['sample_count', 'samples']
SAMPLES ONLY: EMU-SAMPLING-L4 ['samples', 'sample_count']
SAMPLES ONLY: EMU-SAMPLING-8H100SXM ['samples', 'sample_count']
SAMPLES ONLY: EMU-SAMPLING-P100 ['samples', 'sample_count']
SAMPLES ONLY: EMU-SAMPLING-2L4 ['samples', 'sample_count']
FAILED: QPU-ASCELLA-6PQ {'help_message': 'platform QPU in maintenance, wait for platform to be available or use another platform', 'message': 'precondition is not respected', 'precondition': 'resource_not_usable', 'type': 'precondition_failed'}
SAMPLES ONLY: EMU-SAMPLING-4L4 ['samples', 'sample_count']
SAMPLES ONLY: EMU-SAMPLING-H100 ['samples', 'sample_count']

So the test on scaleway is not useful yet but, it is tested locally to work on artifiical backends

@LF-Vigneux
Copy link
Copy Markdown
Contributor Author

All tests are now passing scaleway and local

# Output should be normalized probabilities
assert torch.all(y >= 0)
assert torch.allclose(y.sum(dim=1), torch.ones(4), atol=0.001)
# Uncomment when there is a probs backend
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we cannot commit commented out code which the user must 'uncomment'. Find another way to do this life a conditional or a flag

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants