Add mp.freeze_support to every entry point, for frozen binaries #1191

Merged
sergey-yaroslavtsev merged 1 commit into master from freeze_support
Feb 20, 2026

Conversation

@sergey-yaroslavtsev
Collaborator

Bring back parts removed in #1190
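For context, the restored pattern is a multiprocessing.freeze_support() call at the top of each frozen entry point (a minimal sketch, not actual PyMca code; main is a placeholder for the real entry function):

```python
import multiprocessing


def main():
    # placeholder for the real application start-up code
    return 0


if __name__ == "__main__":
    # In a frozen binary, child processes spawned by multiprocessing
    # re-execute this same executable; freeze_support() detects that case
    # and runs the multiprocessing bootstrap instead of main() again.
    # It must be called before any other multiprocessing use.
    multiprocessing.freeze_support()
    main()
```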

@sergey-yaroslavtsev sergey-yaroslavtsev requested review from vasole and woutdenolf and removed request for vasole February 17, 2026 22:32
@sergey-yaroslavtsev
Collaborator Author

sergey-yaroslavtsev commented Feb 17, 2026

5.9.6 has the same issue as before (at least on our Mac).

Here is the dry-run for this PR: dry-run
I will test it in the morning. If you want and are able to, feel free to do it before me.

If it works, I would suggest replacing the DMG in the release with the one from this dry-run.

@sergey-yaroslavtsev
Collaborator Author

Python 3.9 (the minimum supported version) fails on the first run with:

Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.9.25/x64/lib/python3.9/multiprocessing/queues.py", line 116, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.9.25/x64/lib/python3.9/site-packages/PyMca5/tests/HDF5UtilsTest.py", line 52, in testSegFault
    self.assertEqual(_safe_cause_segfault(default=123), 123)
  File "/opt/hostedtoolcache/Python/3.9.25/x64/lib/python3.9/site-packages/PyMca5/tests/HDF5UtilsTest.py", line 29, in _safe_cause_segfault
    return HDF5Utils.run_in_subprocess(_cause_segfault, *args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.9.25/x64/lib/python3.9/site-packages/PyMca5/PyMcaIO/HDF5Utils.py", line 46, in run_in_subprocess
    return queue.get(block=False)
  File "/opt/hostedtoolcache/Python/3.9.25/x64/lib/python3.9/multiprocessing/queues.py", line 120, in get
    self._rlock.release()
ValueError: semaphore or lock released too many times

A simple restart of the CI job was enough to make it succeed.

@woutdenolf
Collaborator

woutdenolf commented Feb 18, 2026

5.9.6 has the same issue as before (at least on our Mac).

What issue exactly? An issue with the unit test testSegFault or an issue in production when using one of the PyMca applications?

If it is an issue with the unit test testSegFault: we should fix the test (see #1190 (comment)).

If it is an issue with one of the PyMca applications: can we see a traceback or error? I would be surprised, because safe_hdf5_group_keys() catches and ignores all issues with multiprocessing. Unless it is a BaseException or a segfault, of course.

@vasole
Member

vasole commented Feb 18, 2026

What issue exactly?

When opening an HDF5 file with a frozen macOS binary, the entries are not shown.

The problem is not really the test suite itself (although I have sent #1192, that will not solve anything).

The only information we get is:

[screenshot]

We have two solutions:

  • To put back multiprocessing.freeze_support() calls

  • To exclude multiprocessing from the frozen binaries as it was done prior to 5.9.5

@vasole
Member

vasole commented Feb 18, 2026

I would vote for accepting this PR on the grounds that even if we do not use multiprocessing ourselves, it is a standard module and some dependency might use it. This PR anticipates that.

@vasole
Member

vasole commented Feb 18, 2026

It is not necessary to do it for every module that includes an if __name__ == "__main__": block. It is only needed in those that are frozen, and this PR covers all of them.

@vasole vasole self-requested a review February 18, 2026 06:39
@vasole
Member

vasole commented Feb 18, 2026

If it works, I would suggest replacing the DMG in the release with the one from this dry-run.

Fine with me. It does not change anything for people not using the macOS frozen binary, and it has not been uploaded to sourceforge yet.

@vasole
Member

vasole commented Feb 18, 2026

I can confirm the dry-run mentioned above works under macOS Big Sur.

@sergey-yaroslavtsev
Collaborator Author

The dry-run seems to work. The HDF5 tree is visible, and I reran the tests through the interactive console (114 tests, 4 skipped) on both macOS and Windows, just to be sure.

To exclude multiprocessing from the frozen binaries as it was done prior to 5.9.5

If so, it is only necessary for macOS: 5.9.4 shipped multiprocessing on Windows without user complaints.

To put back multiprocessing.freeze_support() calls

I would prefer this one.

If it works, I would suggest replacing the DMG in the release with the one from this dry-run.

Fine with me. It does not change anything for people not using the macOS frozen binary, and it has not been uploaded to sourceforge yet.

Done.

@sergey-yaroslavtsev
Collaborator Author

sergey-yaroslavtsev commented Feb 18, 2026

    p.start()
    try:
        p.join()
        try:
            return queue.get(block=False)
        except Empty:
            return default
    finally:
        try:
            p.kill()
        except AttributeError:
            p.terminate()

That is why #1190 did not work: no error is raised there. Thus, the exception handler in:

def safe_hdf5_group_keys(file_path, data_path=None):
    try:
        return run_in_subprocess(
            get_hdf5_group_keys, file_path, data_path=data_path, default=list()
        )
    except Exception:
        _logger.warning("run_in_subprocess not available")
        return get_hdf5_group_keys(file_path, data_path)

could not be reached.

If we want to get back to the logic of #1190, this is the code that needs to be modified.

However, a potential issue is that it is not clear when and why it fails to work properly for one combination of OS, freezing procedure, and presence or absence of mp.freeze_support, but not for another.

Update:
In any case, we need to modify this function so that it fails if the subprocess created by multiprocessing exits immediately. Then, depending on the final choice for this fix, either (1) the test fails, (2) it runs properly through the exception path, or (3) this particular combination of frozen binaries works without freeze_support (if multiprocessing is excluded).

@woutdenolf
Collaborator

woutdenolf commented Feb 18, 2026

OK, after talking to @sergey-yaroslavtsev I finally understood that all of this works fine:

import multiprocessing

ctx = multiprocessing.get_context(context)
queue = ctx.Queue(maxsize=1)
p = ctx.Process(
    target=subprocess_main,
    args=(queue, target) + args,
    kwargs=kwargs,
)
p.start()
p.join()

I was convinced one of these calls would fail, and that we would then capture this in safe_hdf5_group_keys, ignore it, and fall back. But no, all of this works fine; we only get an Empty here:

return queue.get(block=False)

We capture that Empty and return the default, which is list().

@woutdenolf
Collaborator

woutdenolf commented Feb 18, 2026

The thing is that Empty is also raised when the sub-process segfaults, which is precisely what we are trying to protect ourselves from (so we do not want to fall back, because the fallback would segfault the main process). So the solution is not to simply stop capturing Empty and treat it like any other exception.

Instead we now add freeze_support to the CLIs (some of them). And capturing the run_in_subprocess() exceptions is there only in case importing multiprocessing raises an ImportError, right? It is probably best to put comments in run_in_subprocess documenting all these findings. And perhaps only capture ImportError?
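The symptom can be reproduced without an actual segfault: a child that dies before putting anything on the queue makes queue.get(block=False) raise Empty, while start and join succeed. A hypothetical sketch (the fork start method is used here for brevity and is Unix-only; frozen binaries typically use spawn):

```python
import multiprocessing
import os
from queue import Empty


def _crash(queue):
    # simulate a hard crash (like a segfault): exit abruptly
    # without putting any result on the queue
    os._exit(1)


def run_crashing_child():
    ctx = multiprocessing.get_context("fork")  # fork for brevity (Unix only)
    queue = ctx.Queue(maxsize=1)
    p = ctx.Process(target=_crash, args=(queue,))
    p.start()
    p.join()  # join succeeds even though the child crashed
    try:
        return queue.get(block=False)
    except Empty:
        # same symptom as a real segfault: no error, just an empty queue
        return "Empty raised, exitcode=%d" % p.exitcode


print(run_crashing_child())  # → Empty raised, exitcode=1
```

Only p.exitcode distinguishes the crash from a child that simply returned nothing, which is why checking it was proposed below.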

@sergey-yaroslavtsev
Collaborator Author

sergey-yaroslavtsev commented Feb 18, 2026

Actually, join could be called safely outside of the try block, and there is no point in calling kill or terminate after join.
Probably the original return default was there to protect the main process from crashing. We can move this logic to safe_hdf5_group_keys.
Finally, it can be cleaned up and modified to:

def safe_hdf5_group_keys(file_path, data_path=None):
    try:
        return run_in_subprocess(
            get_hdf5_group_keys, file_path, data_path=data_path
        )
    except Exception:
        _logger.warning(
            "run_in_subprocess not available: multiprocessing could not be imported or is not protected."
        )
        try:
            return get_hdf5_group_keys(file_path, data_path=data_path)
        except Exception:
            # avoid crashing the main process
            _logger.warning("Failed to get HDF5 group keys for %s", file_path)
            return list()


def run_in_subprocess(target, *args, context=None, **kwargs):
    import multiprocessing
    from queue import Empty

    ctx = multiprocessing.get_context(context)
    queue = ctx.Queue(maxsize=1)
    p = ctx.Process(
        target=subprocess_main,
        args=(queue, target) + args,
        kwargs=kwargs,
    )
    p.start()
    p.join()
    # check whether the subprocess exited with an error
    if p.exitcode != 0:
        raise RuntimeError(f"Subprocess failed with exit code {p.exitcode}")
    try:
        return queue.get(block=False)
    except Empty:
        # the subprocess succeeded but did not return a result
        raise RuntimeError("Subprocess did not return a result")

This holds whether we add freeze_support (then multiprocessing will be used in the frozen binaries) or not (then the direct call will be used).

Am I missing something?

Update:

  1. join can be called because start was already called: if start succeeded, join will too, no matter what happens in the subprocess.
  2. There is a way to protect against "hanging" during join, but I think it is too much for this particular case.
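Point 2 could be sketched as a join with a timeout, force-killing the child if it is still alive afterwards. This is purely illustrative (run_with_timeout and _worker are invented names, not PyMca code, and fork is used for brevity):

```python
import multiprocessing


def _worker(queue):
    # illustration worker: hand a value back through the queue
    queue.put(42)


def run_with_timeout(timeout=10.0):
    ctx = multiprocessing.get_context("fork")  # fork for brevity (Unix only)
    queue = ctx.Queue(maxsize=1)
    p = ctx.Process(target=_worker, args=(queue,))
    p.start()
    p.join(timeout)  # returns after `timeout` seconds even if the child hangs
    if p.is_alive():
        p.kill()  # Python >= 3.7; use p.terminate() on older versions
        p.join()
        raise RuntimeError("subprocess timed out")
    return queue.get(timeout=1)  # short timeout lets the feeder thread flush
```

For a quick sanity check, run_with_timeout() returns 42 when the worker completes normally.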

@woutdenolf
Collaborator

Actually, join could be called safely outside of the try block

No, join can definitely fail

AssertionError: can only join a started process
RuntimeError: cannot join current process

@sergey-yaroslavtsev
Collaborator Author

sergey-yaroslavtsev commented Feb 18, 2026

Actually, join could be called safely outside of the try block

No, join can definitely fail

AssertionError: can only join a started process
RuntimeError: cannot join current process

Yes, but in HDF5Utils we have:

    p.start()
    try:
        p.join()

either start() fails or neither call fails, so there is no point in the try here.

join can be called because start was already called: if start succeeded, join will too, no matter what happens in the subprocess.

And if start fails, then we immediately fall back to the direct method.

@vasole
Member

vasole commented Feb 18, 2026

I guess this PR, combined with skipping the test in frozen code regardless of whether multiprocessing is present, is the simplest way to move forward.

The safe reading of HDF5 files while the files are being written is unlikely to be a concern when using frozen binaries.

@woutdenolf
Collaborator

woutdenolf commented Feb 19, 2026

This pattern ensures that, whatever happens after start, the process is definitely cleaned up:

    p.start()
    try:
        p.join()
        ...
    finally:
        try:
            p.kill()
        except AttributeError:
            p.terminate()

I don't see a need to change that pattern.

@sergey-yaroslavtsev sergey-yaroslavtsev merged commit 3fa4213 into master Feb 20, 2026
59 of 60 checks passed
@sergey-yaroslavtsev
Collaborator Author

sergey-yaroslavtsev commented Feb 20, 2026

It was merged because it is better than the current state, and to be able to test other branches.
Further discussion on the proper handling of multiprocessing takes place in #1194. The changes implemented here will probably be modified soon.

@sergey-yaroslavtsev sergey-yaroslavtsev deleted the freeze_support branch February 20, 2026 17:37