Conversation
|
A build with tests can be found here: Now the HDF5 tree is showing (tested on two ARM Macs), all tests are passing. I do not see any other issue currently. |
|
We could probably combine this with other open PRs and make a release, since this one is crucial for macOS. |
|
I feel uncomfortable importing a module that PyMca does not need at all when frozen. The proof is that it was never an issue until the last release. The multiprocess module is only needed to access HDF5 files while they are being written. Spawning or forking a process each time an HDF5 file is opened/accessed is overkill. I am much more comfortable with not shipping the multiprocess module with the frozen binary (as it was before this release) and simply not trying to import it when sys.frozen exists and is true. |
The general point would be that if we do not import MP, then every method that utilizes MP needs a workaround and will run in the main process. Since MP is a standard library and
However, I do agree with the concept of "if it works, do not touch it". So if we decide to stay on the very safe side, this could be a workaround for now.
You are 100% right. I will add it there as well, since there are other entry points. Regarding the last point: do macOS users really use them? Because it is not trivial to open |
The user does not use them, but the code does. Perhaps not QStackWidget.py, but all the others are called as separate processes when performing batches. |
As far as I know, prior to this version there was only one place and it was handled in HDF5Widget.py |
|
Is it possible to protect each import of, and access to, multiprocess with a try/except or a check for sys.frozen and its value? Without that, the previous freezing pipeline will not work and there will be no fallback for easy testing. In fact, we are modifying 9 files instead of just two (multiprocess was only imported in a single place). |
You mean:

```python
try:
    import multiprocessing
except Exception:
    pass

...

if __name__ == '__main__':
    if hasattr(sys, 'frozen'):
        multiprocessing.freeze_support()
```

I do not see a problem in it.
We can adjust accordingly.
True. If we are sure there is no other MP (and we do not want to "clean" P.S. |
|
It should be `if getattr(sys, 'frozen', False)` to make it future-proof. |
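For illustration, a minimal sketch of why the getattr form is more robust: `hasattr` only checks that the attribute exists, while `getattr` with a default also does the right thing if `frozen` is ever set to a false value (the `FakeSys` class below is a hypothetical stand-in for `sys`):

```python
class FakeSys:
    """Illustrative stand-in object to compare the two checks."""

fake = FakeSys()
fake.frozen = False   # some tooling sets frozen explicitly to False

# hasattr only tests existence, so it reports True here:
exists = hasattr(fake, "frozen")          # True
# getattr with a default tests the value itself:
value = getattr(fake, "frozen", False)    # False
print(exists, value)
```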
|
Same comment as I have for argparse: let's try to have a common CLI base and re-use it everywhere. #1183 (comment) |
I realized that if we need it to work then Thus, the whole fix probably makes much less sense... Because if we do it this way, we are not allowed to directly import MP in other modules, making this fix much less useful for development, since one of the ideas was to avoid workarounds and be able to use MP directly. I think we need to decide between two options:
Options in between seem to combine the cons of both at the same time and unfortunately do not make a lot of sense. If after this discussion you have a strong opinion, please let me know. |
That was one of the two files to fix. The other one is pyinstaller_github.spec
MP does not add any speed-up to PyMca. PyMca uses subprocess and not the multiprocess module. Whatever works standalone works with subprocess. The multiprocess approach is the method @woutdenolf found to achieve robustness when trying to access HDF5 files while they are being written.
The fewer dependencies the better. If I can manage without using a module, I prefer to avoid it. Then I do not have to care about whether the module works on some exotic platform or under some conditions.
See above. MP does not add any speed to PyMca. I would expect the opposite (spawning or forking a process just to access a file)
I'm afraid you will have to decide. I expect @woutdenolf and myself to have different opinions on this. |
|
Two options for me:
I don't agree with anything bespoke. |
|
If you want to replace the LLM slop because I don't have time:

```python
# caller side
import subprocess
import pickle
import sys

def run_in_subprocess_popen(func, *args, default=None, **kwargs):
    payload = (func, args, kwargs)
    proc = subprocess.Popen(
        [sys.executable, "worker.py"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    try:
        # Send pickled payload
        pickle.dump(payload, proc.stdin)
        proc.stdin.close()
        # Read result
        result = pickle.load(proc.stdout)
        proc.wait()
        if proc.returncode != 0:
            return default
        return result
    except Exception:
        proc.kill()
        return default
```

```python
# worker.py
import sys
import pickle

def main():
    try:
        func, args, kwargs = pickle.load(sys.stdin.buffer)
        result = func(*args, **kwargs)
        pickle.dump(result, sys.stdout.buffer)
        sys.stdout.buffer.flush()
    except Exception:
        # You may want better error handling
        raise

if __name__ == "__main__":
    main()
```
|
More LLM slop. Anyway, you get the idea. The point here is the data serialization to and from the subprocess, which is why I used a length-prefixed protocol:

```python
# caller side
import subprocess
import pickle
import sys
import struct

def run_in_subprocess(func_name, *args, default=None, **kwargs):
    payload = (func_name, args, kwargs)
    data = pickle.dumps(payload)
    proc = subprocess.Popen(
        [sys.executable, "-m", "yourpackage.worker"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    try:
        # Send length-prefixed pickle (prevents framing issues)
        proc.stdin.write(struct.pack("!I", len(data)))
        proc.stdin.write(data)
        proc.stdin.flush()
        proc.stdin.close()
        # Read response length
        raw_len = proc.stdout.read(4)
        if not raw_len:
            return default
        msg_len = struct.unpack("!I", raw_len)[0]
        result_data = proc.stdout.read(msg_len)
        result = pickle.loads(result_data)
        proc.wait()
        if proc.returncode != 0:
            return default
        return result
    except Exception:
        proc.kill()
        return default
```

```python
# yourpackage/worker.py
import sys
import pickle
import struct
from yourpackage.module import get_hdf5_group_keys

FUNCTIONS = {
    "get_hdf5_group_keys": get_hdf5_group_keys,
}

def main():
    # Read length-prefixed pickle
    raw_len = sys.stdin.buffer.read(4)
    if not raw_len:
        return
    msg_len = struct.unpack("!I", raw_len)[0]
    data = sys.stdin.buffer.read(msg_len)
    func_name, args, kwargs = pickle.loads(data)
    result = FUNCTIONS[func_name](*args, **kwargs)
    result_data = pickle.dumps(result)
    sys.stdout.buffer.write(struct.pack("!I", len(result_data)))
    sys.stdout.buffer.write(result_data)
    sys.stdout.buffer.flush()

if __name__ == "__main__":
    main()
```
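To see the length-prefixed framing in action, a self-contained sketch that inlines a toy worker via `python -c` (the worker body and the `send_framed`/`recv_framed` helper names are illustrative, not part of the PR):

```python
import pickle
import struct
import subprocess
import sys

# Toy worker: reads one length-prefixed pickled payload from stdin,
# computes a result, and writes it back length-prefixed.
WORKER = r"""
import pickle, struct, sys
raw_len = sys.stdin.buffer.read(4)
msg_len = struct.unpack("!I", raw_len)[0]
numbers = pickle.loads(sys.stdin.buffer.read(msg_len))
out = pickle.dumps(sum(numbers))
sys.stdout.buffer.write(struct.pack("!I", len(out)))
sys.stdout.buffer.write(out)
sys.stdout.buffer.flush()
"""

def send_framed(stream, obj):
    data = pickle.dumps(obj)
    stream.write(struct.pack("!I", len(data)))
    stream.write(data)
    stream.flush()

def recv_framed(stream, default=None):
    raw_len = stream.read(4)
    if len(raw_len) < 4:
        return default
    msg_len = struct.unpack("!I", raw_len)[0]
    return pickle.loads(stream.read(msg_len))

proc = subprocess.Popen(
    [sys.executable, "-c", WORKER],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
)
send_framed(proc.stdin, [1, 2, 3, 4])
proc.stdin.close()
result = recv_framed(proc.stdout)
proc.wait()
print(result)  # 10
```

The length prefix is what lets both sides know exactly how many bytes to read, so a partial pipe read cannot be mistaken for a complete message.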
|
I am more comfortable with PyMca not using multiprocessing. It is only used for the robust HDF5 access and it was never available in frozen binaries. I see it as adding a dependency that can make things crash/misbehave at application start for no gain. Concerning the subprocess alternative: is it worth replacing multiprocess with subprocess just to have that functionality available in frozen binaries? My feeling is that it is overkill and that we are taking the risk of replacing something that works at beamlines with something untested. I really see the least risky approach as leaving things as they were prior to the last release. It guarantees that things work at beamlines and with frozen binaries. |
|
If we do something like the following:

```python
import sys
import logging
from queue import Empty

_logger = logging.getLogger(__name__)

# subprocess_main is assumed to be a small wrapper (defined elsewhere) that
# calls target(*args, **kwargs) and puts the result on the queue.

def run_in_subprocess(target, *args, context=None, default=None, **kwargs):
    try:
        import multiprocessing
        ctx = multiprocessing.get_context(context)
        queue = ctx.Queue(maxsize=1)
        p = ctx.Process(
            target=subprocess_main,
            args=(queue, target) + args,
            kwargs=kwargs,
        )
        p.start()
        try:
            p.join()
            try:
                return queue.get(block=False)
            except Empty:
                return default
        finally:
            try:
                p.kill()
            except AttributeError:
                # Process.kill() only exists since Python 3.7
                p.terminate()
    except Exception:
        if getattr(sys, 'frozen', False):
            _logger.debug("Frozen executable. Using standard approach")
        else:
            _logger.warning("Multiprocessing is not available. Using standard approach")
        try:
            return target(*args, **kwargs)
        except Exception:
            return default
```

and exclude MP from frozen binaries. Of course this test could be skipped for frozen-binary tests. |
|
If I understood correctly, that was the idea of @vasole:
|
it will require importing |
|
I'm not going to die on this hill. I gave my opinion and you are free not to follow it. |
|
Please can you check whether the frozen binaries generated after my modifications still work? |
vasole left a comment
Successfully tested on MacOS BigSur
|
MP was not excluded in @vasole, is this intended (just to be sure)? To be noted: |
Yes, it is. |


Closes #1181
Just a proper freeze and hidden imports
Does not affect Windows