Hi,
I'm having the following problems running some of the GAMBIT example files as described
in arXiv:1705.07908 (the GAMBIT manual) and arXiv:2107.00030 (the GUM manual).
Would you kindly help me resolve these issues?
Thanks.
Asesh K Datta
#########################################################################
(I built GAMBIT with
"cmake -DWITH_AXEL=ON -DWITH_HEPMC=ON -DWITH_YODA=ON -DWITH_MPI=ON
-Ditch=pybind11 -DBUILD_FS_MODELS=MSSM .. ")
With the MSSM7 scan:
time mpirun -np 16 ./gambit -f yaml_files/MSSM7.yaml
Issue 1
The run aborts after around half an hour with the following output.
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
71 CMS_ttH-3leptons_8TeV_19.6fb-1_125.6GeV_1682105.txt
72 CMS_ttH-4leptons_8TeV_19.6fb-1_125.6GeV_1682106.txt
73 C
At line 106 of file datatables.f90 (unit = 133, file = '/home/asesh/Packages/Gambit-BSM/gambit_2.4/Backends/installed/higgssignals/1.4.0/Expt_tables/latestresults/C')
Fortran runtime error: End of file
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[63137,1],13]
Exit code: 2
Issue 2
I also cannot restart the scan by including the "-r" option in the command, i.e.,
time mpirun -np 16 ./gambit -r -f yaml_files/MSSM7.yaml
(Only removing the "MSSM7" folder under "gambit_2.4/runs" allows a fresh scan to start.)
The error message follows.
Starting GAMBIT
Running in MPI-parallel mode with 16 processes
Running with 16 OpenMP threads per MPI process (set by the environment variable OMP_NUM_THREADS).
YAML file: yaml_files/MSSM7.yaml
Importing: include/StandardModel_SLHA2_scan.yaml
Initialising logger... log_debug_messages = true; log messages tagged as 'Debug' WILL be logged.
WARNING: This may lead to very large log files!
Group readable: runs/MSSM7//samples//MSSM7.hdf5 , /MSSM7 : 1
FATAL ERROR
GAMBIT has exited with fatal exception: GAMBIT error
ERROR: A problem has occurred in the printer utilities.
Error preparing pre-existing output file 'runs/MSSM7//samples//MSSM7.hdf5' for writing via hdf5printer! The requested output group '/MSSM7 already exists in this file! Please take one of the following actions:
- Choose a new group via the 'group' option in the Printer section of your input YAML file;
- Delete the existing group from 'runs/MSSM7//samples//MSSM7.hdf5';
- Delete the existing output file, or set 'delete_file_on_restart: true' in your input YAML file to give GAMBIT permission to automatically delete it (applies when -r/--restart flag used);
*** Note: This error most commonly occurs when you try to resume a scan that has already finished! ***
Raised at: line 1524 in function Gambit::Printers::HDF5Printer2::HDF5Printer2(const Gambit::Options&, Gambit::Printers::BasePrinter*) of /home/asesh/Packages/Gambit-BSM/gambit_2.4/Printers/src/printers/hdf5printer_v2/hdf5printer_v2.cpp.
rank 0: FinalizeWithTimeout failed to sync for clean MPI shutdown, calling MPI_Abort...
rank 0: Issuing MPI_Abort command, attempting to terminate all processes...
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
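(For what it's worth, of the three actions the printer suggests, the simplest seems to be giving GAMBIT permission to delete the old file on restart. My guess, untested, is that the option would go in the Printer section of MSSM7.yaml roughly like this, assuming the hdf5 printer settings shipped with the example file:

```yaml
Printer:
  printer: hdf5
  options:
    output_file: "MSSM7.hdf5"
    group: "/MSSM7"
    # Allow the -r/--restart flag to delete the pre-existing output file,
    # as suggested by the error message above.
    delete_file_on_restart: true
```

I have not verified whether this interacts badly with resuming unfinished scans.)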
===============================
With the GUM-implemented model MDMSM
Issue 1
For a smaller value of NP (= 1000) in a Diver scan (as set in
"yaml_files/MDMSM_Tute.yaml"), the scan exits smoothly, creating an
"MDMSM.hdf5" file in the "runs/MDMSM/samples" folder.
However, executing "pippi MDMSM.pip" under "gum/Tutorial" returns the
following error messages for two different installations of pippi.
Case A
When using "pippi" from the "pippi" folder created under the GAMBIT root folder by
"make get-pippi":
File "/home/asesh/Packages/Gambit-BSM/gambit_2.4/gum/Tutorial/../../pippi/pippi", line 41
print 'Beginning pippi '+arguments[1]+' operation...'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print(...)?
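(This looks like Python-2 code from the "make get-pippi" copy being run under Python 3; presumably either running it with python2 or using a Python-3 port of pippi would avoid it. For reference, the failing line in Python-3 form would be something like the following, where "parse" stands in for the actual command-line argument:

```python
# Python-2 original (line 41 of pippi):
#   print 'Beginning pippi '+arguments[1]+' operation...'
# Python-3 form; arguments[1] mocked here as 'parse':
arguments = ["pippi", "parse"]
message = 'Beginning pippi ' + arguments[1] + ' operation...'
print(message)
```

I have not patched the file myself, so I may be missing other Python-2-only constructs further down.)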
Case B
When using "pippi" from a directly git-cloned folder:
"Beginning pippi parse-to-plot operation...
Running pippi failed in parse operation.
Error: field specific_bins required for requested operation not found in MDMSM.pip.
Quitting...
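(My guess, untested: this version of pippi expects a "specific_bins" entry in the parse section of the .pip file even when it is not used. Adding an empty entry, mirroring the style of the other fields in MDMSM.pip, might get past the check, though I am not sure of the exact syntax:

```
specific_bins =
```

)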
--------------------------------------------------------
Issue 2
When I increase NP (to, say, 5000 or 10000), the Diver scan routinely crashes
with the following output (after running for more than 20 minutes).
:::::::::::::::::::::::::::::::::::::
:::::::::::::::::::::::::::::::::::::
theta13: 0.15495
theta23: 0.76958
nuclear_params_sigmas_sigmal:
deltad: -0.427
deltas: -0.085
deltau: 0.842
sigmal: 58
sigmas: 43
Raised at: line 329 in function void Gambit::Printers::HDF5DataSet::write_buffer(const T (&)[100000], std::size_t, std::size_t, bool) [with T = double; std::size_t = long unsigned int] of /home/asesh/Packages/Gambit-BSM/gambit_2.4/Printers/include/gambit/Printers/printers/hdf5printer_v2.hpp.
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[54161,1],13]
Exit code: 1