Skip to content

Build NCrystal from Source for OpenMC#3304

Closed
ahnaf-tahmid-chowdhury wants to merge 8 commits intoopenmc-dev:developfrom
ahnaf-tahmid-chowdhury:ncrystal-source-build
Closed

Build NCrystal from Source for OpenMC#3304
ahnaf-tahmid-chowdhury wants to merge 8 commits intoopenmc-dev:developfrom
ahnaf-tahmid-chowdhury:ncrystal-source-build

Conversation

@ahnaf-tahmid-chowdhury
Copy link
Contributor

Description

This PR modifies the OpenMC build process to compile NCrystal from source instead of relying on the manylinux-provided package. The motivation behind this change is to ensure a more streamlined integration, avoiding unnecessary dependencies and improving compatibility.

Motivation

  • Using the manylinux version results in a dependency on libNCrystal.so, which is dynamically loaded via subprocess, adding inefficiencies.

  • Setting LD_LIBRARY_PATH is an option, but it introduces complexity.

Reference

This PR follows up on discussions in another PR (shimwell/openmc/pull/67), where the issue of dependency management was raised. As suggested, this change is being made separately to integrate source-built NCrystal before addressing packaging concerns.

@shimwell
Copy link
Member

Many thanks for making this PR @ahnaf-tahmid-chowdhury

This would be very enabling for the sci kit core wheel building.

@tkittel perhaps has some comments on this

@tkittel
Copy link
Contributor

tkittel commented Feb 14, 2025

Yes, I have some comments. On vacation and busy with kids right now, but will try to find time a bit later :-)

@ahnaf-tahmid-chowdhury
Copy link
Contributor Author

It seems that ncrystal-python requires ncrystal-core to be installed; otherwise, it is broken. We need to install the full ncrystal package.

How would you like to proceed? We can either make ncrystal a runtime dependency, or create a fallback so users can install ncrystal when it's called but not installed, also add an optional dependency on pyproject.toml file.

@tkittel
Copy link
Contributor

tkittel commented Feb 17, 2025

Sorry for not being able to respond sooner, but let me try to catch up now.

First of all, I am a bit confused about the context of all of this, and additionally there are now comments both here, on shimwell#67, and mctools/ncrystal#227. Not to forget the already merged #3274 which was needed not only to make OpenMC work with NCrystal4, but also to ensure that you are not using NCrystal without actually having a self-consistent environment. So maybe it is just me now having enough coffee, but please forgive me if I am misinterpreting some intensions, plans, or oversaw some comments :-)

I most assuredly have misunderstood something actually, since some of what I see here doesn't really make sense to me. I have written some comments below, but perhaps rather than responding to them here and risking further mutual misunderstanding, you would be interested in having a zoom call and discuss this? If yes, what time zones are you in? I am CET, but could have a late-evening call if you are interested.

Anyway, here are my comments, take them with a grain of salt and a smilie of course :-)

First of all, I would like to point out that I am not very keen to see downstream projects trying to put the pieces of NCrystal together manually if it can be avoided, since that will inevitably become very fragile and need constant attention and maintenance overhead in both projects (and as you can tell, it is not easy without first understanding a bit more about NCrystal packaging and deployment). Instead, we have invested quite a lot of effort in trying to make sure that it is easy to install NCrystal in various channels (notably both PyPI and Conda), and I would like to understand if there actually any reason for not using such channels? In particular if you plan to add openmc to the PyPI ecosystem, it seems contradictory to not want to be able to work with the packages your dependencies provide there already. Personally I would think it would be desirable if people can simply "pip install openmc ncrystal" and get the latest and greatest of both openmc and ncrystal working together. We did some tutorials using cloud-based jupyter notebooks in the last few years, and we had to constrain ourselves and only add a single example with openmc since the need to rebuild openmc from scratch was a bit complicated and slow.

Anyway, I am a bit horrified to find that this PR tries to revert some of the "recommended NCrystal practices" that I just added. Some of these might seem a bit weird to you perhaps, but they are needed because we 1) want to support windows, 2) want to work whether installed from Python wheels or manual CMake builds (by experts!), 3) Want to make sure that our python layer (which is noarch) is completely separate from the underlying core layer. We ran into a lot of trouble: There are no cross-platform places to put libraries, no CMake-based way to properly install python modules (absent cmake being called from scikit-build-core of course), no Python-wheel supported places to put CMake config files or libraries, etc. The primary solution to all of this comes from the fact that the one thing you CAN provide in a manner which works on all platforms and environment is a command line utility. Hence we base everything around ncrystal-config which is a small app (written in C, so very efficient to launch) provided along with the NCrystal library in ncrystal-core. The ncrystal-config command is always able to locate the rest of ncrystal-core, and tell you about it. And that is how the python modules (and cmdline scripts installed via project.scripts entry points) are able to locate the NCrystal library.

So please do NOT remove the call to ncrystal-config from the CMake code, it is the only way we want to support NCrystal find_package calls. If you want to be able to work in an inconsistent environment, or one where the runtime PATH is not yet ready, you can always use -DNCrystal_DIR=... to specify the location of NCrystalConfig.cmake as needed.

I do not understand the sentence:

Using the manylinux version results in a dependency on libNCrystal.so, which is dynamically loaded via subprocess, adding inefficiencies.

What subprocess, what inefficiencies? Surely you are not advocating against shared libraries? I can guarantee you that neither a call to subprocess.run['ncrystal-config',...] nor the context switching between the openmc binary and the NCrystal shared library will have any sort of measurable inefficiency involved. This smells like premature optimisation, and I would say that 99.99% of the challenge here is related to making something which is easy to configure and maintain, and not these few nanoseconds of runtime.

Also, I really don't like that you resurrect tools/ci/gha-install-ncrystal.sh. This was added by me and @marquezj in the dark ages before NCrystal was on conda-forge and PyPI, and one had to slaughter a chicken at full moon to get the environment set up correctly to run everything. I mean, all the variables in that has to be maintained, which I can guarantee you that they won't before they break enough that some CI test fails. Additionally, we might restructure the NCrystal packaging further (moving data or header files our of ncrystal-core comes to mind), and it really is a hassle to have to go and submit PRs to all the projects that had some sort of hardwired manual build of NCrystal.

Next, the call to nctool --test is a super cheap (0.2seconds on my machine) to verify that the NCrystal installation has been put together correctly, with cmdline scripts installed and able to find python modules, and those able to find both data files and the ncrystal library. For a complicated CI setup like you have here, with jobs running for, what, 20mins, it will be much easier to get a short early failure clearly indicating that NCrystal has been installed wrongly, rather than having to wonder why some output tally changed at the end of a 20min job.

Finally, I understand of course that you want to keep the dependency on NCrystal optional, but I am curious how on earth you plan to do that in the context of scikit-build-core and prebuilt wheels on PyPI ? On that note, I should mention that I have previously agreed with @paulromano to add a PR for the openmc conda feedstock to enable NCrystal there (always). The reason I have not done so yet, is simply that I want to clarify a bit how ABI compatibility issues works on conda first -- and I have been overloaded with other tasks (in the meantime I have reached the conclusion that it will be simpler and better to simply provide a dedicated ultra-stable header with NCrystal, which can be used by OpenMC to completely avoid such ABI concerns, but that is getting to another topic).

In case you are interested in ncrystal-core vs. ncrystal-python, you can read more here: https://github.com/mctools/ncrystal/blob/main/INSTALL.md#package-structure

But in general we actually recommend that downstream packages and users simply depend on the package named ncrystal, since that will allow us to further modify the packaging structure in the future if needed.

Anyway, that's all I could remember right now. As I wrote, I would recommend discussing this in a call (perhaps @paulromano wants to join as well?), since there are a lot of moving parts and opinions :-)

@ahnaf-tahmid-chowdhury
Copy link
Contributor Author

Hi @tkittel , I sincerely apologize for any confusion caused. As I am relatively new to OpenMC development, I wasn’t fully aware that the latest version of NCrystal was designed with OpenMC’s build process in mind.

What I have observed is that when OpenMC is built with NCrystal, libopenmc.so creates a link to libNCrystal.so. Because of this, we need to manually call libNCrystal before libopenmc when using the PyPI version of NCrystal. This issue became evident while @shimwell and I were working on the OpenMC wheel build.

Since pip builds wheels in an isolated environment, NCrystal gets installed in a temporary directory, where CMake creates a dynamic link that later breaks. This prevents auditwheel from detecting the library. However, if we build NCrystal from source in a known directory and pass that path to CMAKE_PREFIX_PATH, CMake can locate NCrystal without relying on ncrystal-config. This also ensures that auditwheel correctly identifies and stores libNCrystal under openmc.libs on Linux and openmc/.dylibs on macOS. On Windows, this isn’t an issue since CMake automatically copies the NCrystal DLL to the OpenMC library folder.

The reason I removed nctool --test is that I’ve set the NCrystal version tag to 4.0.0, which is stable. Since the test will always pass, running it is unnecessary. Additionally, running it would require setting up the PATH since NCrystal is installed in a custom location.

That said, I found calling ncrystal-config via subprocess at runtime is unnecessary when OpenMC has a working manylinux wheel because OpenMC will store its own version of libNCrystal in openmc.libs, whether we build it from source or use the PyPI package. So, it is not an issue anymore.

Another reason I initially proposed building from source was to ensure compatibility for forks that install OpenMC from source with different compilers and versions. Ensuring compatibility across various setups seemed beneficial. However, given what I’ve learned about the updated NCrystal, I’d love to hear your thoughts. Do you think building from source still offers advantages, or should we refine the PyPI based workflow instead?

As I’m still getting up to speed with OpenMC development, I may not be fully aware of all existing discussions and decisions. Please forgive any oversight on my part. I’d be happy to discuss this further perhaps we could schedule a Zoom meeting? I also believe we have a Slack channel for OpenMC wheel builds, and @shimwell might be able to invite you there.

@shimwell
Copy link
Member

If we can find a solution that allows us to make the OpenMC wheel using the scikit core like we could before PR #3274 was merged in then that would be great.

Unfortunately I don't think adding ncrystal as build time dependency and run time dependency to the pyproject.toml is a good solution as that forces a dependency on people who may not want it and makes the environment harder to solve.

This is different to the conda package where I assume/hope ncrystal would be added as build variety to the openmc conda package like mpi and dagmc which are also added as build options and are not mandatory dependencies for every build variety. I would be keen to keep it non mandatory for the same reasons as it makes the environment harder to solve and forces a dependency which not all users want.

Perhaps we can make some changes in ncrystal to improve the packaging that @ahnaf-tahmid-chowdhury suggested a while back?

#===============================================================================

if(OPENMC_USE_NCRYSTAL)
if(NOT DEFINED "NCrystal_DIR")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed this as it forces the definition of NCrystal_DIR, which would break the standard method of defining other CMake variables for NCrystal. However, if this is recommended, please let me know.

Comment on lines -127 to -130
execute_process(
COMMAND "ncrystal-config" "--show" "cmakedir"
OUTPUT_VARIABLE "NCrystal_DIR" OUTPUT_STRIP_TRAILING_WHITESPACE
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can make it available with a different fallback, right?

endif()

if(@OPENMC_USE_NCRYSTAL@)
if(NOT DEFINED "NCrystal_DIR")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is broken. We should set NCrystal_DIR to @NCrystal_DIR@ at the top of OpenMC's config file. The current version can break OpenMC, as the installed version of NCrystal might be different. For example, OpenMC could have been built with NCrystal 4.0.0, while the latest version of NCrystal from PyPI might be 4.1.0.

Comment on lines +19 to +20
chmod +x ./tools/ci/gha-install-ncrystal.sh
./tools/ci/gha-install-ncrystal.sh
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why I need to chmod this from the action. I have tried multiple times on my system, and the file is always executable. Maybe someone else can give this a try.

@tkittel
Copy link
Contributor

tkittel commented Feb 17, 2025

Hi @ahnaf-tahmid-chowdhury and @shimwell . Thanks a lot for your feedback. Especially the explanations concerning auditwheel etc. makes it more clear to me what you are doing, and why you might have made the changes you have. I also agree, it is all needlessly complicated, with build flags, optional dependencies, etc.

However, while mulling it over this afternoon, I think I came up with something that would be a much better way forward. I hope I can sell you on that idea, so here is my elevator pitch. Let us first hear about the advantages:

  1. You can completely forget about NCrystal in CMake code, pyproject.toml, conda recipes. Rip it all out and be happy! (both projects should of course still keep testing the integration in CI).
  2. All builds of OpenMC (manual builds, conda builds, scikit-core-build's) will work with all builds of NCrystal (conda builds, wheels, manual builds, ...), assuming only some simply platform compatiblity. And there won't be any particular technical need to pair up versions (i.e. an old OpenMC will be able to work with a new NCrystal and vice versa).
  3. People not adding NCrystal materials to their OpenMC geometries will have ~0 overhead. People adding NCrystal materials to their geometries will get a nice pedagogical error message if NCrystal is not already installed, telling them to install it.
  4. I am happy to try to put together a new version of NCrystal which would support this, and also a PR for OpenMC (and it will in a sense be ortogonal to these other scikit-build-core related PRs).
  5. It will also work on Windows, if you ever plan to support that platform.

The disadvantage is that I have to do some work :-)

The technology to make this work would be that the openmc binary upon realising it needs NCrystal will have to query ncrystal-config to see if NCrystal is available, and then do a good old dlopen and use a plugin interface header to be shared between the two repos (it will be small so we can simply add a copy of that file in both repos).

Let me know what you think, if you are positive then I could try to put that up my priority list since it would solve a whole bunch of headaches for me (and hopefully also for you) :-)

@shimwell
Copy link
Member

Sounds good to me, appreciated.

There is some effort to get openmc working on windows. #2919

@tkittel
Copy link
Contributor

tkittel commented Feb 17, 2025

Ok @shimwell. I will have a go at it. It will be a bit of work though, and the next days are a bit busy for me. So it might take 1-2 weeks before it is ready. Is that compatible with your plans?

@shimwell
Copy link
Member

Ok @shimwell. I will have a go at it. It will be a bit of work though, and the next days are a bit busy for me. So it might take 1-2 weeks before it is ready. Is that compatible with your plans?

Many thanks that would be great

@ahnaf-tahmid-chowdhury
Copy link
Contributor Author

I think the current PR already supports building NCrystal from source, as the latest version of NCrystal still supports CMake builds.

On the Python side, I found that only two files (material.py and plotter.py) import NCrystal at the moment. I can add a fallback mechanism in these files to check if NCrystal is installed and prompt users to install it if needed.

This approach should cover all the scenarios you mentioned, ensuring flexibility while keeping the build process straightforward. Is there anything I might be missing? Or would you like me to look into something specific on the NCrystal side?

@tkittel
Copy link
Contributor

tkittel commented Feb 18, 2025

I am not sure I 100% get your point @ahnaf-tahmid-chowdhury ? Are we talking about what you will do in the ~2 weeks time until I finish the promised PR, or something else?

@ahnaf-tahmid-chowdhury
Copy link
Contributor Author

I assume there is no need to create an additional PR if we can fix the issue here, as everything seems to be working fine. Would you like me to cover anything else specific to test?

@tkittel
Copy link
Contributor

tkittel commented Feb 25, 2025

Hi @ahnaf-tahmid-chowdhury @shimwell

Just a small update from my side! I was pretty busy this last week, but managed to more or less finish up all the new infrastructure on the NCrystal side, including new CI tests etc., and a preliminary modification to openmc sources to see that everything will work. It seems so! :-)

So the next step is for me to release NCrystal 4.1.0 with the new infrastructure needed there, and after that I will open a new PR for OpenMC with the required changes. I hope we are talking 1-2 days more in total, pending issues in CI etc.

And I much prefer a separate PR, since nothing in the new PR will depend on this existing PR, and it will get too complicated to intertwine everything.

Cheers,
Thomas

@shimwell
Copy link
Member

Super, thanks for the update, sounds like great progress

@tkittel
Copy link
Contributor

tkittel commented Feb 26, 2025

So I have now created #3328 as promised :-)

@paulromano
Copy link
Contributor

Closing in favor of #3328

@paulromano paulromano closed this Mar 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants