Merged
Commits
137 commits
e0f1fec
[oct23av] rerun 78 tput alltees, all ok (itscrd90 Silver4216 el9 VM, …
valassi Oct 25, 2023
319f035
[oct23av] rerun 18 tmad alltees, all ok (itscrd90 Silver4216 el9 VM, …
valassi Oct 25, 2023
ed1cd75
[oct23av] (TEMPORARY TESTS ON PLATINUM) rerun 78 tput alltees, all ok…
valassi Oct 26, 2023
93f2978
[oct23av] (TEMPORARY TESTS ON PLATINUM) rerun 18 tmad alltees, all ok…
valassi Oct 26, 2023
ca18b76
[oct23av] go back to performance baseline logs on itscrd90 (REVERT TE…
valassi Oct 26, 2023
b20257a
[oct23av] regenerate 8 processes mad (move from itscrd80 to itscrd90)
valassi Oct 26, 2023
ae5498a
[oct23av] regenerate 7 processes sa (move from itscrd80 to itscrd90) …
valassi Oct 26, 2023
50ad9b7
Merge commit 'a6731bd9ead2f378b5e1a61624a55f5753a1139b' into oct23av
valassi Oct 26, 2023
2a67667
[oct23av] regenerate 7 processes mad (all but pp012j) and all 7 sa, o…
valassi Oct 26, 2023
1903d30
[oct23av] regenerate 7 pp_tt012j.mad, there are two issues: clang for…
valassi Oct 26, 2023
8641b29
fixing issue of stefan with nprocesses>1
oliviermattelaer Aug 31, 2023
8a34ba1
[oct23av] regenerate pp_tt012j.mad after cherry-picking Olivier's fix…
valassi Oct 26, 2023
ec4c8b5
[oct23av] in CODEGEN, improve comments to generated code for mirror p…
valassi Oct 26, 2023
6e6dcbf
[oct23av] regenerate pp_tt012j.mad with my own changes to code comments
valassi Oct 26, 2023
45ebd42
[oct23av] regenerate all other 7 processes mad and 7 sa, no code chan…
valassi Oct 26, 2023
4597ecf
Merge commit '6bf4a658edbd9b3f3f0d3969ba38832e6e002d63' into oct23av
valassi Oct 26, 2023
6a4192a
[oct23av] regenerate 7 processes mad and 6 sa (all except ee_mumu) af…
valassi Oct 26, 2023
7964ff9
[oct23av] rerun gq_ttq tmad tests after Stefan's PR #757 - gqttq xsec…
valassi Oct 26, 2023
34cf1c7
[oct23av] rerun 8 tput tests for gqttq - now runTest fails (I guess t…
valassi Oct 26, 2023
7e57228
[oct23av] in CODEGEN, update gqttq ref file for runTest after fixing …
valassi Oct 26, 2023
4eaa091
[oct23av] regenerate gqttq sa and mad with the correct ref file for r…
valassi Oct 26, 2023
77cd619
[oct23av] rerun 8 tput tests for gqttq - now all ok after fixing the …
valassi Oct 26, 2023
b9d336e
Merge commit 'd64423586' into oct23av
valassi Oct 26, 2023
69c4090
Merge commit '9fc9873d0' into oct23av
valassi Oct 26, 2023
c01ca6f
[oct23av] in CODEGEN MatrixElementKernels.cc, fix clang-format after …
valassi Oct 26, 2023
d883478
[oct23av] TEMPORARILY UNDO Olivier's changes to CODEGEN in 9fc9873d0 …
valassi Oct 26, 2023
88f45f2
[oct23av] regenerate 7 processes mad and 6 sa (all except 2x eemumu) …
valassi Oct 26, 2023
6d449e9
[oct23av] reapply previous patches by Olivier, otherwise the followin…
valassi Oct 26, 2023
cdd0251
Merge commit '5b22a9201' into oct23av
valassi Oct 26, 2023
6fa765d
Merge commit '3fbf7b10c' into oct23av
valassi Oct 26, 2023
d5b9d55
[oct23av] in CODEGEN, try to recover my 'tmad mode' in patchMad.sh, a…
valassi Oct 27, 2023
5720491
[oct23av] regenerate 7 mad and 6 sa processes (all but 2x eemumu wher…
valassi Oct 27, 2023
2389f74
Merge commit 'a062c0fbd' into oct23av
valassi Oct 27, 2023
99f0008
[oct23av] in CODEGEN, minor improvements in comments and verbosity of…
valassi Oct 27, 2023
b7122bf
[oct23av] regenerate 7 mad and 6 sa processes (all but eemumu) after …
valassi Oct 27, 2023
328b18e
[oct23av] in 8 mad directories (copy it manually also for eemumu), ad…
valassi Oct 27, 2023
eeab712
[oct23av] in 8 mad directories (create it manually also for eemumu), …
valassi Oct 27, 2023
739f8d3
Merge commit '216ed1833' into oct23av
valassi Oct 27, 2023
a298ba0
[oct23av] regenerate 7 mad and 6 sa processes (all but eemumu) after …
valassi Oct 27, 2023
80c98f3
Merge commit '8e34ccae3' into oct23av
valassi Oct 27, 2023
2037678
[oct23av] regenerate all 8 mad and 7 sa processes (now including eemu…
valassi Oct 27, 2023
9fb2a07
[oct23av] fix clang format in eemumu after Olivier's "Ccoeff" patch f…
valassi Oct 27, 2023
fbba1a8
[oct23av] regenerate all 8 mad and 7 sa processes (including eemumu)
valassi Oct 27, 2023
7677a74
Merge commit 'c586208a9' into oct23av
valassi Oct 27, 2023
408955d
[oct23av] add copyright and license to Stephan's runCodegen.sh script
valassi Oct 27, 2023
ba4a19e
[oct23AV] in CODEGEN/generateAndCompare.sh, add file mg5.in in each g…
valassi Oct 27, 2023
71e46f1
[oct23av] in CODEGEN, fix build warning in counters.cc (improve Steph…
valassi Oct 27, 2023
216db39
[oct23av] in CODEGEN/generateAndCompare.sh, minor fix for code genera…
valassi Oct 27, 2023
8a0c4d7
[oct23av] regenerate all 8 mad and 7 sa processes after including Ste…
valassi Oct 27, 2023
cd444c6
[oct23av] temporarily move to c586208a9 generated code to avoid confl…
valassi Oct 27, 2023
5c76ed5
Merge commit 'f1244bf14' into oct23av
valassi Oct 27, 2023
39e7519
[oct23av] go back to the latest 8 mad and 7 sa generated processes (u…
valassi Oct 27, 2023
f3d4ef7
Merge commit 'f75e99418' into oct23av
valassi Oct 27, 2023
73a5f23
[oct23av] in CODEGEN/generateAndCompare.sh, remove py3_model.pkl duri…
valassi Oct 27, 2023
06d970e
[oct23av] regenerate all 8 mad and 7 sa processes again, removing py3…
valassi Oct 27, 2023
bcec0af
Merge commit 'fbdacbd54' into oct23av
valassi Oct 27, 2023
0af6151
[oct23av] in CODEGEN, fix a silly issue in my previous conflict resol…
valassi Oct 27, 2023
f237f9f
[oct23av] in CODEGEN, ensure that patchMad.sh stdout/stderr are alway…
valassi Oct 27, 2023
b940731
[oct23av] in CODEGEN, add copyright and license to launch_plugin.py
valassi Oct 27, 2023
a30fd57
[oct23av] in CODEGEN, improve formatting and add an optional CUDACPPR…
valassi Oct 27, 2023
2fa4830
[oct23av] in CODEGEN move tmadmode steps (SDE config, runcard/paramca…
valassi Oct 27, 2023
e8031f4
[oct23av] in CODEGEN launch_plugin.py, make the exception in reset_si…
valassi Oct 27, 2023
226ac3f
[oct23av] in CODEGEN, add a workaround for the exception thrown by re…
valassi Oct 27, 2023
70b770c
[oct23av] regenerate all 8 mad and 7 sa processes after merging and p…
valassi Oct 27, 2023
540f369
Merge commit 'a3d3490b8' into oct23av
valassi Oct 27, 2023
5ed0e4b
[oct23av] in CODEGEN/generateAndCompare.sh, no longer copy or clean u…
valassi Oct 27, 2023
f87854e
[oct23av] in CODEGEN, fix clang format for 'COUPs[ndcoup+0]' from one…
valassi Oct 27, 2023
759e60c
[oct23av] regenerate all 8 mad and 7 sa processes after completing th…
valassi Oct 27, 2023
2d57c0d
Merge commit '6771781ae' into oct23av
valassi Oct 27, 2023
b3badbf
[oct23av] regenerate all 8 mad and 7 sa processes after completing th…
valassi Oct 27, 2023
fad8a92
[oct23av] in eemumu mgOnGpuCxtypes.h, add missing function 'cxsmpl<fl…
valassi Oct 28, 2023
bfa2f9f
[oct23av] in CODEGEN mgOnGpuCxtypes.h, add missing function 'cxsmpl<f…
valassi Oct 28, 2023
a71881e
[oct23av] regenerate eemumu.mad after copying mgOnGpuCxtypes.h to COD…
valassi Oct 28, 2023
ee83b62
[oct23av] rerun 78 tput tests - gqttq runTest failures, NaNs have dis…
valassi Oct 28, 2023
536ad49
[oct23av] rerun 18 tmad tests - failures in ggttggg (madevent crashes…
valassi Oct 28, 2023
973349d
[oct23av] in ggtt.mad, fix all build errors and warnings for macOS on…
valassi Oct 29, 2023
2065bb9
[oct23av] in CODEGEN, backport from gg_tt.mad the fixes for macOS bui…
valassi Oct 29, 2023
84ddaf8
[oct23av] in CODEGEN, regenerate patch.P1 and patch.common from gg_tt…
valassi Oct 29, 2023
2902dfb
[oct23av] regenerate ggtt.mad after updating CODEGEN with macOS patch…
valassi Oct 29, 2023
2cbd4aa
[oct23av] regenerate all other 7 mad and 7 sa processes after updatin…
valassi Oct 29, 2023
b525447
[oct23av] in github workflows, switch CI tests from .sa to .mad direc…
valassi Oct 29, 2023
0605140
[oct23av] fix issues in my previous patch in the CI configuration
valassi Oct 29, 2023
4cb6d11
Merge remote-tracking branch 'roiser/preserve_coupling_order' into oc…
valassi Oct 29, 2023
1594914
[oct23av] regenerate all 8 mad and 7 sa processes after merging Stefa…
valassi Oct 29, 2023
36080e0
[oct23av] rerun 8 tput tests for gqttq - now all ok again after inclu…
valassi Oct 29, 2023
ab05eb4
[oct23av] rerun 3 tmad tests for gqttq - now all ok again after inclu…
valassi Oct 29, 2023
f5dea39
Merge remote-tracking branch 'upstream/master' into ghav_oct23av
valassi Oct 29, 2023
5a140d2
[oct23av] rerun all 78 tput tests - now all of them succeed including …
valassi Oct 29, 2023
6ddb1d3
[oct23av] rerun 18 tmad tests - still failures in ggttggg (madevent c…
valassi Oct 29, 2023
aea8b17
[oct23av] add to the repo heft_gg_h.sa/mg5.in which I had forgotten
valassi Oct 29, 2023
ce9892d
[oct23av] in CODEGEN fix cudacpp_src.mk for non SM processes (fix bug…
valassi Oct 29, 2023
904d688
[oct23av/nobm] in CODEGEN check_sa.cc, enable FPEs in check_sa.cc to …
valassi Oct 29, 2023
354d511
[oct23av] "regenerate" all processes with the new check_sa.cc that op…
valassi Oct 29, 2023
b6cdb02
[oct23av] in tput tests, enable FPEs in check.exe by default (unless …
valassi Oct 29, 2023
2a82105
[oct23av] rerun 78 tput tests, with FPEs enabled in the check executa…
valassi Oct 30, 2023
dd6fefe
[oct23av] rerun 18 tmad tests (while rerunning also tput with FPEs en…
valassi Oct 30, 2023
8caf257
[oct23av] in CODEGEN patchMad.sh, reorder leading make_ops lines befo…
valassi Oct 30, 2023
134b57f
[oct23av] in CODEGEN, add a quieter (-q) option to generateAndCompare.sh
valassi Oct 30, 2023
87c43d6
[oct23av] in CODEGEN, minor improvements in generateAndCompare.sh (ni…
valassi Oct 30, 2023
8f96c0a
[oct23av] in tput/throughputX.sh, BUG FIX (remove the build of topdir…
valassi Oct 30, 2023
7929623
[oct23av] in CODEGEN, add target gtestlibs in cudacpp.mk to allow the…
valassi Oct 30, 2023
cc11757
[oct23av] in CODEGEN finally add allGenerateAndCompare.sh to generate…
valassi Oct 30, 2023
3e2af01
[oct23av] in tput/throughputX.sh, improve the bug fix to build gtestl…
valassi Oct 30, 2023
1ab6b78
[oct23av] regenerate all 8 mad and 7 sa processes with latest CODEGEN…
valassi Oct 30, 2023
6c12785
[oct23av] in tput/throughputX.sh, further improve the bug fix to buil…
valassi Oct 30, 2023
30fbebb
[oct23av] rerun 78 tput tests, with FPEs enabled in the check executa…
valassi Oct 31, 2023
201f880
[oct23av] rerun 18 tmad tests (while rerunning also tput with FPEs en…
valassi Oct 31, 2023
e4768a4
[oct23av] in CODEGEN output.py, modify a few comments as suggested in…
valassi Nov 1, 2023
5bf9589
[oct23av] in CODEGEN output.py, move 'tmadmode' patches from patchMad…
valassi Nov 1, 2023
a2149a2
[oct23av] in CODEGEN output.py, try to reenable exception in CPPRunCa…
valassi Nov 1, 2023
efc1b0c
[oct23av] in CODEGEN output.py, disable again the exception in CPPRun…
valassi Nov 1, 2023
07b1090
[oct23av] (complete oct23av?) regenerate all 15 processes, no changes…
valassi Nov 1, 2023
355b9bb
Merge remote-tracking branch 'upstream/master' into oct23av
valassi Nov 1, 2023
a39aa0b
[oct23av] (complete oct23av?) regenerate all 15 processes including t…
valassi Nov 1, 2023
51a2e03
remove all monkeypatch method
oliviermattelaer Nov 1, 2023
3a73803
force flag also for GCC on mac
oliviermattelaer Nov 1, 2023
b46724e
put back the make_opts as required for consistency with MG5aMC practi…
oliviermattelaer Nov 1, 2023
4e1dccb
forbid openmp on mac
oliviermattelaer Nov 1, 2023
d4452f5
[oct23av] upgrade back mg5amcnlo to the latest gpucpp (Olivier downgr…
valassi Nov 2, 2023
d69bd0b
[oct23av] in CODEGEN cudacpp.mk, only include Source/make_opts if Sou…
valassi Nov 2, 2023
75fbc33
[oct23av] regenerate ggtt.mad including Olivier's patches - 'make cle…
valassi Nov 3, 2023
707065e
[oct23av] in ggtt.mad cudacpp.mk, add some debug printouts to show ho…
valassi Nov 3, 2023
c92ec8e
[oct23av] in ggtt.mad cudacpp.mk, try to fix the CUDACPP_MAKEFILE iss…
valassi Nov 3, 2023
113d332
[oct23av] in ggtt.mad cudacpp.mk, use ':=' to set CUDACPP_MAKEFILE on…
valassi Nov 3, 2023
784f37c
[oct23av] in ggtt.mad cudacpp.mk, add override on top of ':=' to ensu…
valassi Nov 3, 2023
418c8b1
[oct23av] in ggtt.mad cudacpp.mk, remove debug printouts - 'make clea…
valassi Nov 3, 2023
b13ae49
[oct23av] in CODEGEN, backport the fixes in cudacpp.mk for make_opts …
valassi Nov 3, 2023
6b39fcb
[oct23av] regenerate all processes including the last changes to make…
valassi Nov 3, 2023
c4d2e9e
[oct23av] rerun 78 tput tests, with FPEs enabled in the check executa…
valassi Nov 3, 2023
5d28956
[oct23av] rerun 18 tmad tests (while rerunning also tput with FPEs en…
valassi Nov 3, 2023
c492e2c
[oct23av] in CODEGEN, fix BUG in Olivier's 4e1dccb44 for OpenMP on Ma…
valassi Nov 3, 2023
b647330
[oct23av] regenerate all processes including the last changes to make…
valassi Nov 3, 2023
5418ed5
[oct23av] in ggttmad cudacpp.mk, fix one third error for openmp on Ma…
valassi Nov 3, 2023
1b3ea52
[oct23av] in CODEGEN, backport the third bug fix in openmp for mac fr…
valassi Nov 3, 2023
08a7a7d
[oct23av] regenerate all processes including the third bug fix in mak…
valassi Nov 3, 2023
4351daa
[oct23av] rerun 78 tput tests, with FPEs enabled in the check executa…
valassi Nov 3, 2023
f53166d
[oct23av] ** COMPLETE OCT23AV ** rerun 18 tmad tests (while rerunning…
valassi Nov 3, 2023
20 changes: 10 additions & 10 deletions .github/workflows/c-cpp.yml
@@ -15,40 +15,40 @@ jobs:
fail-fast: false
steps:
- uses: actions/checkout@v2
- name: make epoch1
- name: make debug
run: make -C ${{ matrix.folder }} debug
CPU:
runs-on: ubuntu-latest
strategy:
matrix:
folder: [ epochX/cudacpp/ee_mumu.sa/SubProcesses/P1_Sigma_sm_epem_mupmum , epochX/cudacpp/gg_ttgg.sa/SubProcesses/P1_Sigma_sm_gg_ttxgg ]
folder: [ epochX/cudacpp/ee_mumu.mad/SubProcesses/P1_epem_mupmum , epochX/cudacpp/gg_ttgg.mad/SubProcesses/P1_gg_ttxgg ]
precision: [ d , f , m ]
fail-fast: false
steps:
- uses: actions/checkout@v2
- name: make info
run: make FPTYPE=${{ matrix.precision }} -C ${{ matrix.folder }} info
run: make FPTYPE=${{ matrix.precision }} -C ${{ matrix.folder }} -f cudacpp.mk info
- name: make
run: make FPTYPE=${{ matrix.precision }} -C ${{ matrix.folder }}
- name: make check
run: make FPTYPE=${{ matrix.precision }} -C ${{ matrix.folder }} check
run: make FPTYPE=${{ matrix.precision }} -C ${{ matrix.folder }} -f cudacpp.mk check
CPU_MAC:
runs-on: macos-latest
env:
FC: gfortran-11
strategy:
matrix:
folder: [ epochX/cudacpp/ee_mumu.sa/SubProcesses/P1_Sigma_sm_epem_mupmum, epochX/cudacpp/gg_ttgg.sa/SubProcesses/P1_Sigma_sm_gg_ttxgg ]
folder: [ epochX/cudacpp/ee_mumu.mad/SubProcesses/P1_epem_mupmum, epochX/cudacpp/gg_ttgg.mad/SubProcesses/P1_gg_ttxgg ]
precision: [ d , f , m ]
fail-fast: false
steps:
- uses: actions/checkout@v2
- name: make info
run: make AVX=none OMPFLAGS= FPTYPE=${{ matrix.precision }} -C ${{ matrix.folder }} info
run: make AVX=none OMPFLAGS= FPTYPE=${{ matrix.precision }} -C ${{ matrix.folder }} -f cudacpp.mk info
- name: make
run: make AVX=none OMPFLAGS= FPTYPE=${{ matrix.precision }} -C ${{ matrix.folder }}
- name: make check
run: make AVX=none OMPFLAGS= FPTYPE=${{ matrix.precision }} -C ${{ matrix.folder }} check
run: make AVX=none OMPFLAGS= FPTYPE=${{ matrix.precision }} -C ${{ matrix.folder }} -f cudacpp.mk check
GPU:
runs-on: self-hosted
env:
@@ -57,16 +57,16 @@ jobs:
REQUIRE_CUDA: 1
strategy:
matrix:
folder: [ epochX/cudacpp/ee_mumu.sa/SubProcesses/P1_Sigma_sm_epem_mupmum , epochX/cudacpp/gg_ttgg.sa/SubProcesses/P1_Sigma_sm_gg_ttxgg ]
folder: [ epochX/cudacpp/ee_mumu.mad/SubProcesses/P1_epem_mupmum , epochX/cudacpp/gg_ttgg.mad/SubProcesses/P1_gg_ttxgg ]
precision: [ d , f , m ]
fail-fast: false
steps:
- uses: actions/checkout@v2
- name: path
run: echo "PATH=$PATH"
- name: make info
run: make FPTYPE=${{ matrix.precision }} -C ${{ matrix.folder }} info
run: make FPTYPE=${{ matrix.precision }} -C ${{ matrix.folder }} -f cudacpp.mk info
- name: make
run: make FPTYPE=${{ matrix.precision }} -C ${{ matrix.folder }}
- name: make check
run: make FPTYPE=${{ matrix.precision }} -C ${{ matrix.folder }} check
run: make FPTYPE=${{ matrix.precision }} -C ${{ matrix.folder }} -f cudacpp.mk check
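The effect of this workflow change can be sketched in Python as follows. This is only an illustration of how the folder and precision matrix expands into make invocations: the helper names `ci_commands` and `all_cells` are hypothetical, and the comment on why `-f cudacpp.mk` is needed for `info` and `check` is an assumption drawn from the diff (the plain `make` step is unchanged, so it presumably uses the default makefile of the `.mad` directory).

```python
from itertools import product

# Values copied from the CPU job matrix in the workflow diff above.
FOLDERS = [
    "epochX/cudacpp/ee_mumu.mad/SubProcesses/P1_epem_mupmum",
    "epochX/cudacpp/gg_ttgg.mad/SubProcesses/P1_gg_ttxgg",
]
PRECISIONS = ["d", "f", "m"]


def ci_commands(folder, precision, extra_vars=()):
    """Return the three make invocations of one matrix cell (hypothetical helper)."""
    base = ["make", *extra_vars, f"FPTYPE={precision}", "-C", folder]
    return [
        base + ["-f", "cudacpp.mk", "info"],   # cudacpp target: explicit makefile
        base,                                  # unchanged step: default makefile
        base + ["-f", "cudacpp.mk", "check"],  # cudacpp target: explicit makefile
    ]


def all_cells(extra_vars=()):
    """Expand the folder x precision matrix into a flat list of command lines."""
    return [
        " ".join(cmd)
        for folder, precision in product(FOLDERS, PRECISIONS)
        for cmd in ci_commands(folder, precision, extra_vars)
    ]


if __name__ == "__main__":
    # The CPU_MAC job passes two extra make variables on every command.
    for line in all_cells(extra_vars=("AVX=none", "OMPFLAGS=")):
        print(line)
```

Each job therefore runs 2 folders x 3 precisions x 3 steps, and only the `info` and `check` steps gain the explicit `-f cudacpp.mk`.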
@@ -1,7 +1,7 @@
// Copyright (C) 2020-2023 CERN and UCLouvain.
// Licensed under the GNU Lesser General Public License (version 3 or later).
// Created by: A. Valassi (Dec 2022) for the MG5aMC CUDACPP plugin.
// Further modified by: A. Valassi (2022-2023) for the MG5aMC CUDACPP plugin.
// Further modified by: S. Hageboeck, A. Valassi (2022-2023) for the MG5aMC CUDACPP plugin.

#include "timer.h"
#define TIMERTYPE std::chrono::high_resolution_clock
@@ -40,7 +40,6 @@ extern "C"
static float smatrix1_totaltime = 0;
static mgOnGpu::Timer<TIMERTYPE> smatrix1multi_timer[nimplC];
static float smatrix1multi_totaltime[nimplC] = { 0 };
static int matrix1_counter = 0;
static int smatrix1_counter = 0;
static int smatrix1multi_counter[nimplC] = { 0 };

@@ -1,8 +1,8 @@
diff --git b/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig1.f a/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig1.f
index 27ed1439e..3b24a9924 100644
index 880769442..5a3da931f 100644
--- b/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig1.f
+++ a/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig1.f
@@ -469,23 +469,140 @@ C
@@ -484,23 +484,140 @@ C
INTEGER VECSIZE_USED

INTEGER IVEC
@@ -284,7 +284,7 @@ index 71fbf2b25..0f1d199fc 100644
open(unit=lun,file=tempname,status='old',ERR=20)
fopened=.true.
diff --git b/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/matrix1.f a/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/matrix1.f
index 3ac962688..ef18aff22 100644
index 3ac962688..daea73a6d 100644
--- b/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/matrix1.f
+++ a/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/matrix1.f
@@ -72,7 +72,10 @@ C
@@ -13,7 +13,7 @@ index a59181c70..af7e0efbc 100644
PARAMETER(MAXTRIES=25)
C To pass the helicity configuration chosen by the DiscreteSampler to
diff --git b/epochX/cudacpp/gg_tt.mad/Source/makefile a/epochX/cudacpp/gg_tt.mad/Source/makefile
index 617f10b93..dbe08b846 100644
index 617f10b93..00c73099a 100644
--- b/epochX/cudacpp/gg_tt.mad/Source/makefile
+++ a/epochX/cudacpp/gg_tt.mad/Source/makefile
@@ -120,7 +120,7 @@ $(LIBDIR)libiregi.a: $(IREGIDIR)
@@ -37,12 +37,11 @@ index 617f10b93..dbe08b846 100644
+ for i in `ls -d ../SubProcesses/P*`; do cd $$i; make cleanavxs; cd -; done;
+cleanall: cleanSource # THIS IS THE ONE
+ for i in `ls -d ../SubProcesses/P*`; do cd $$i; make cleanavxs; cd -; done;
+
diff --git b/epochX/cudacpp/gg_tt.mad/SubProcesses/makefile a/epochX/cudacpp/gg_tt.mad/SubProcesses/makefile
index 348c283be..74db44d84 100644
index 348c283be..65369d610 100644
--- b/epochX/cudacpp/gg_tt.mad/SubProcesses/makefile
+++ a/epochX/cudacpp/gg_tt.mad/SubProcesses/makefile
@@ -1,6 +1,22 @@
@@ -1,6 +1,28 @@
+SHELL := /bin/bash
+
include ../../Source/make_opts
@@ -54,6 +53,12 @@ index 348c283be..74db44d84 100644
+# Compile counters with -O3 as in the cudacpp makefile (avoid being "unfair" to Fortran #740)
+CXXFLAGS = -O3 -Wall -Wshadow -Wextra
+
+# Add -std=c++17 explicitly to avoid build errors on macOS
+# Add -mmacosx-version-min=11.3 to avoid "ld: warning: object file was built for newer macOS version than being linked"
+ifneq ($(shell $(CXX) --version | egrep '^Apple clang'),)
+CXXFLAGS += -std=c++17 -mmacosx-version-min=11.3
+endif
+
+# Enable ccache if USECCACHE=1
+ifeq ($(USECCACHE)$(shell echo $(CXX) | grep ccache),1)
+ override CXX:=ccache $(CXX)
@@ -65,7 +70,7 @@ index 348c283be..74db44d84 100644
# Load additional dependencies of the bias module, if present
ifeq (,$(wildcard ../bias_dependencies))
BIASDEPENDENCIES =
@@ -24,7 +40,26 @@ else
@@ -24,7 +46,26 @@ else
MADLOOP_LIB =
endif

@@ -81,19 +86,19 @@ index 348c283be..74db44d84 100644
+CUDACPP_MAKEENV:=$(shell echo '$(.VARIABLES)' | tr " " "\n" | egrep "(USEBUILDDIR|AVX|FPTYPE|HELINL|HRDCOD)")
+###$(info CUDACPP_MAKEENV=$(CUDACPP_MAKEENV))
+###$(info $(foreach v,$(CUDACPP_MAKEENV),$(v)="$($(v))"))
+CUDACPP_BUILDDIR:=$(shell $(MAKE) $(foreach v,$(CUDACPP_MAKEENV),$(v)="$($(v))") -f $(CUDACPP_MAKEFILE) -pn 2>/dev/null | awk '/Building/{print $$3}' | sed s/BUILDDIR=//)
+#ifeq ($(CUDACPP_BUILDDIR),)
+#$(error CUDACPP_BUILDDIR='$(CUDACPP_BUILDDIR)' should not be empty!)
+#else
+CUDACPP_BUILDDIR:=$(shell $(MAKE) $(foreach v,$(CUDACPP_MAKEENV),$(v)="$($(v))") -f $(CUDACPP_MAKEFILE) -pn 2>&1 | awk '/Building/{print $$3}' | sed s/BUILDDIR=//)
+ifeq ($(CUDACPP_BUILDDIR),)
+$(error CUDACPP_BUILDDIR='$(CUDACPP_BUILDDIR)' should not be empty!)
+else
+$(info CUDACPP_BUILDDIR='$(CUDACPP_BUILDDIR)')
+#endif
+endif
+CUDACPP_COMMONLIB=mg5amc_common
+CUDACPP_CXXLIB=mg5amc_$(processid_short)_cpp
+CUDACPP_CULIB=mg5amc_$(processid_short)_cuda

LIBS = $(LIBDIR)libbias.$(libext) $(LIBDIR)libdhelas.$(libext) $(LIBDIR)libdsample.$(libext) $(LIBDIR)libgeneric.$(libext) $(LIBDIR)libpdf.$(libext) $(LIBDIR)libgammaUPC.$(libext) $(LIBDIR)libmodel.$(libext) $(LIBDIR)libcernlib.$(libext) $(MADLOOP_LIB) $(LOOP_LIBS)

@@ -43,41 +78,112 @@ ifeq ($(strip $(MATRIX_HEL)),)
@@ -43,41 +84,117 @@ ifeq ($(strip $(MATRIX_HEL)),)
endif


@@ -113,7 +118,12 @@ index 348c283be..74db44d84 100644

-$(PROG): $(PROCESS) auto_dsig.o $(LIBS) $(MATRIX)
- $(FC) -o $(PROG) $(PROCESS) $(MATRIX) $(LINKLIBS) $(LDFLAGS) $(BIASDEPENDENCIES) -fopenmp
+#LDFLAGS+=-Wl,--no-relax # avoid 'failed to convert GOTPCREL relocation' error #458 (flag not universal -> skip?)
+ifeq ($(UNAME),Darwin)
+LDFLAGS += -lc++ # avoid 'Undefined symbols' for chrono::steady_clock on macOS (checked with otool -L libmg5amc_gg_ttx_cpp.so)
+LDFLAGS += -mmacosx-version-min=11.3 # avoid "ld: warning: object file was built for newer macOS version than being linked"
+else
+LDFLAGS += -Wl,--no-relax # avoid 'failed to convert GOTPCREL relocation' error #458 (not supported on macOS)
+endif

-$(PROG)_forhel: $(PROCESS) auto_dsig.o $(LIBS) $(MATRIX_HEL)
- $(FC) -o $(PROG)_forhel $(PROCESS) $(MATRIX_HEL) $(LINKLIBS) $(LDFLAGS) $(BIASDEPENDENCIES) -fopenmp
@@ -128,8 +138,8 @@ index 348c283be..74db44d84 100644
+else ifneq ($(shell $(CXX) --version | egrep '^clang'),)
+override OMPFLAGS = -fopenmp
+$(CUDACPP_BUILDDIR)/$(PROG)_cpp: LINKLIBS += -L $(shell dirname $(shell $(CXX) -print-file-name=libc++.so)) -lomp # see #604
+###else ifneq ($(shell $(CXX) --version | egrep '^Apple clang'),)
+###override OMPFLAGS = -fopenmp # OMP is not supported yet by cudacpp for Apple clang
+else ifneq ($(shell $(CXX) --version | egrep '^Apple clang'),)
+override OMPFLAGS = # OMP is not supported yet by cudacpp for Apple clang
+else
+override OMPFLAGS = -fopenmp
+endif
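The compiler-dependent OpenMP choice in this hunk can be sketched in Python as follows. This is a rough model of the visible branches only: any earlier branches that are collapsed in this diff view are not modeled, the function name `omp_settings` is hypothetical, and the sample version strings in the usage lines are assumptions about what `$(CXX) --version` prints.

```python
def _any_line_starts_with(prefix, text):
    """Emulate `egrep '^prefix'`: true if any output line starts with prefix."""
    return any(line.startswith(prefix) for line in text.splitlines())


def omp_settings(cxx_version_output):
    """Return (OMPFLAGS, needs_lomp) for the branches visible in the hunk above.

    needs_lomp mirrors the extra `-lomp` that the makefile appends to LINKLIBS
    for plain clang (see #604 in the makefile comment).
    """
    if _any_line_starts_with("clang", cxx_version_output):
        return "-fopenmp", True
    if _any_line_starts_with("Apple clang", cxx_version_output):
        # New in this change: OpenMP is explicitly switched off for Apple clang.
        return "", False
    return "-fopenmp", False


if __name__ == "__main__":
    print(omp_settings("Apple clang version 15.0.0\nTarget: arm64-apple-darwin"))
    print(omp_settings("clang version 16.0.6\nTarget: x86_64-unknown-linux-gnu"))
```

Note that a line beginning with "Apple clang" does not match `^clang`, so the order of the two tests does not matter here.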
@@ -166,24 +176,24 @@ index 348c283be..74db44d84 100644
+madevent_fortran_link: $(PROG)_fortran
+ rm -f $(PROG)
+ ln -s $(PROG)_fortran $(PROG)
+

-$(LIBDIR)libpdf.$(libext):
- cd ../../Source/PDF; make
+madevent_cpp_link: $(CUDACPP_BUILDDIR)/$(PROG)_cpp
+ rm -f $(PROG)
+ ln -s $(CUDACPP_BUILDDIR)/$(PROG)_cpp $(PROG)
+

-$(LIBDIR)libgammaUPC.$(libext):
- cd ../../Source/PDF/gammaUPC; make
+madevent_cuda_link: $(CUDACPP_BUILDDIR)/$(PROG)_cuda
+ rm -f $(PROG)
+ ln -s $(CUDACPP_BUILDDIR)/$(PROG)_cuda $(PROG)

-$(LIBDIR)libpdf.$(libext):
- cd ../../Source/PDF; make
+
+# Building $(PROG)_cpp also builds $(PROG)_cuda if $(CUDACPP_CULIB) exists (improved patch for cpp-only builds #503)
+$(CUDACPP_BUILDDIR)/$(PROG)_cpp: $(PROCESS) $(DSIG_cudacpp) auto_dsig.o $(LIBS) $(MATRIX) counters.o ompnumthreads.o $(CUDACPP_BUILDDIR)/.cudacpplibs
+ $(FC) -o $(CUDACPP_BUILDDIR)/$(PROG)_cpp $(PROCESS) $(DSIG_cudacpp) auto_dsig.o $(MATRIX) $(LINKLIBS) $(BIASDEPENDENCIES) $(OMPFLAGS) counters.o ompnumthreads.o -L$(LIBDIR)/$(CUDACPP_BUILDDIR) -l$(CUDACPP_COMMONLIB) -l$(CUDACPP_CXXLIB) $(LIBFLAGSRPATH) $(LDFLAGS)
+ if [ -f $(LIBDIR)/$(CUDACPP_BUILDDIR)/lib$(CUDACPP_CULIB).* ]; then $(FC) -o $(CUDACPP_BUILDDIR)/$(PROG)_cuda $(PROCESS) $(DSIG_cudacpp) auto_dsig.o $(MATRIX) $(LINKLIBS) $(BIASDEPENDENCIES) $(OMPFLAGS) counters.o ompnumthreads.o -L$(LIBDIR)/$(CUDACPP_BUILDDIR) -l$(CUDACPP_COMMONLIB) -l$(CUDACPP_CULIB) $(LIBFLAGSRPATH) $(LDFLAGS); fi

-$(LIBDIR)libgammaUPC.$(libext):
- cd ../../Source/PDF/gammaUPC; make
+
+$(CUDACPP_BUILDDIR)/$(PROG)_cuda: $(CUDACPP_BUILDDIR)/$(PROG)_cpp
+
+counters.o: counters.cc timer.h
@@ -222,7 +232,7 @@ index 348c283be..74db44d84 100644

# Dependencies

@@ -97,5 +203,61 @@ unwgt.o: genps.inc nexternal.inc symswap.inc cluster.inc run.inc message.inc \
@@ -97,5 +214,61 @@ unwgt.o: genps.inc nexternal.inc symswap.inc cluster.inc run.inc message.inc \
run_config.inc
initcluster.o: message.inc

@@ -287,10 +297,10 @@ index 348c283be..74db44d84 100644
+distclean: cleanall # Clean all fortran and cudacpp builds as well as the googletest installation
+ $(MAKE) -f $(CUDACPP_MAKEFILE) distclean
diff --git b/epochX/cudacpp/gg_tt.mad/bin/internal/gen_ximprove.py a/epochX/cudacpp/gg_tt.mad/bin/internal/gen_ximprove.py
index 4dd71db86..3b8ec3121 100755
index ebbc1ac1d..a88d60b28 100755
--- b/epochX/cudacpp/gg_tt.mad/bin/internal/gen_ximprove.py
+++ a/epochX/cudacpp/gg_tt.mad/bin/internal/gen_ximprove.py
@@ -380,8 +380,20 @@ class gensym(object):
@@ -385,8 +385,20 @@ class gensym(object):
done = True
if not done:
raise Exception('Parsing error in gensym: %s' % stdout)
@@ -314,7 +324,7 @@ index 4dd71db86..3b8ec3121 100755
self.submit_to_cluster(job_list)
job_list = {}
diff --git b/epochX/cudacpp/gg_tt.mad/bin/internal/madevent_interface.py a/epochX/cudacpp/gg_tt.mad/bin/internal/madevent_interface.py
index a056d3861..b70b548e5 100755
index 389b93ab8..d72270289 100755
--- b/epochX/cudacpp/gg_tt.mad/bin/internal/madevent_interface.py
+++ a/epochX/cudacpp/gg_tt.mad/bin/internal/madevent_interface.py
@@ -3614,8 +3614,20 @@ Beware that this can be dangerous for local multicore runs.""")
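For the `CUDACPP_BUILDDIR` change in the SubProcesses makefile patch above (stderr now merged with `2>&1`, and the empty-value check turned from a comment into a real `$(error ...)`), the extraction pipeline can be approximated in Python as below. This is a sketch under stated assumptions: the function names are hypothetical, the sample "Building ..." line is invented to fit the `awk '/Building/{print $3}'` pattern and is not copied from cudacpp.mk, and the whitespace handling of `$(shell ...)` is simplified.

```python
def extract_builddir(make_output):
    """Approximate: awk '/Building/{print $3}' | sed s/BUILDDIR=//."""
    found = []
    for line in make_output.splitlines():
        if "Building" in line:
            fields = line.split()  # awk splits fields on whitespace by default
            third = fields[2] if len(fields) >= 3 else ""
            # sed s/BUILDDIR=// removes the first occurrence on each line
            found.append(third.replace("BUILDDIR=", "", 1))
    return " ".join(item for item in found if item)


def require_builddir(make_output):
    """Mirror the now-active makefile check: an empty result is a hard error."""
    builddir = extract_builddir(make_output)
    if not builddir:
        raise RuntimeError("CUDACPP_BUILDDIR='' should not be empty!")
    return builddir


if __name__ == "__main__":
    # Hypothetical output of `make -f cudacpp.mk -pn 2>&1`
    sample = "# make data base\nBuilding in BUILDDIR=build.none_d_inl0_hrd0 for tag=x\n"
    print(require_builddir(sample))
```

Failing loudly on an empty value surfaces a broken `cudacpp.mk` invocation at parse time instead of producing paths with an empty build directory later in the build.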
32 changes: 13 additions & 19 deletions epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/launch_plugin.py
@@ -1,3 +1,7 @@
# Copyright (C) 2020-2023 CERN and UCLouvain.
# Licensed under the GNU Lesser General Public License (version 3 or later).
# Created by: O. Mattelaer (Aug 2023) for the MG5aMC CUDACPP plugin.
# Further modified by: O. Mattelaer, A. Valassi (2023) for the MG5aMC CUDACPP plugin.

import logging
import os
@@ -19,22 +23,15 @@
import madgraph.various.banner as banner_mod

class CPPMEInterface(madevent_interface.MadEventCmdShell):

def compile(self, *args, **opts):
""" """

import multiprocessing
if not self.options['nb_core'] or self.options['nb_core'] == 'None':
self.options['nb_core'] = multiprocessing.cpu_count()

if args and args[0][0] == 'madevent' and hasattr(self, 'run_card'):
import pathlib
import os
pjoin = os.path.join




cudacpp_backend = self.run_card['cudacpp_backend'].upper() # the default value is defined in banner.py
logger.info("Building madevent in madevent_interface.py with '%s' matrix elements"%cudacpp_backend)
if cudacpp_backend == 'FORTRAN':
@@ -50,15 +47,14 @@ def compile(self, *args, **opts):
return misc.compile(nb_core=self.options['nb_core'], *args, **opts)

class CPPRunCard(banner_mod.RunCardLO):

def reset_simd(self, old_value, new_value, name):
if not hasattr(self, 'path'):
raise Exception

logger.warning('WARNING! CPPRunCard instance has no attribute path')
return
###raise Exception('INTERNAL ERROR! CPPRunCard instance has no attribute path')
if name == "vector_size" and new_value <= int(old_value):
# code can handle the new size -> do not recompile
return

Sourcedir = pjoin(os.path.dirname(os.path.dirname(self.path)), 'Source')
subprocess.call(['make', 'cleanavx'], cwd=Sourcedir, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)

@@ -68,33 +64,31 @@ def plugin_input(self, finput):
def default_setup(self):
super().default_setup()
self.add_param('cudacpp_backend', 'CPP', include=False, hidden=False)


def write_one_include_file(self, output_dir, incname, output_file=None):
"""write one include file at the time"""

if incname == "vector.inc" and 'vector_size' not in self.user_set:
return
super().write_one_include_file(output_dir, incname, output_file)


def check_validity(self):
"""ensure that PLUGIN information are consistent"""

super().check_validity()

if self['SDE_strategy'] != 1:
logger.warning('SDE_strategy different of 1 is not supported with SMD/GPU mode')
self['sde_strategy'] = 1

if self['hel_recycling']:
self['hel_recycling'] = False

class GPURunCard(CPPRunCard):

def default_setup(self):
super(CPPRunCard, self).default_setup()
self.add_param('cudacpp_backend', 'CUDA', include=False, hidden=False)


#class CUDACPPRunCard(CPPRunCard):
# def default_setup(self):
# super(CPPRunCard, self).default_setup()
# self.add_param('cudacpp_backend', 'CPP', include=False, hidden=False)

MEINTERFACE = CPPMEInterface
RunCard = CPPRunCard
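The new control flow of `reset_simd` (warn and return when the card has no `path`, skip the rebuild when `vector_size` does not grow, otherwise run `make cleanavx` in `Source`) can be sketched as a standalone function. This is a minimal illustration, not the plugin code: `reset_simd_action` and its `card_path` argument are hypothetical stand-ins for the method and `self.path`, and the function returns the command instead of calling `subprocess` so that it can be exercised without a MadGraph installation.

```python
import logging
import os

logger = logging.getLogger("reset_simd_sketch")  # hypothetical logger name


def reset_simd_action(card_path, old_value, new_value, name):
    """Return (command, cwd) for the cleanup to run, or None if nothing to do."""
    if card_path is None:
        # Behavior after this PR: log a warning and return instead of raising.
        logger.warning("WARNING! CPPRunCard instance has no attribute path")
        return None
    if name == "vector_size" and new_value <= int(old_value):
        # The existing build can handle a smaller or equal size: no recompile.
        return None
    # <process dir>/Cards/<card file> -> <process dir>/Source
    source_dir = os.path.join(os.path.dirname(os.path.dirname(card_path)), "Source")
    return (["make", "cleanavx"], source_dir)


if __name__ == "__main__":
    # Hypothetical run card location, used only for illustration.
    print(reset_simd_action("/proc/Cards/run_card.dat", "16", 32, "vector_size"))
```

In the plugin itself the returned command corresponds to the `subprocess.call(['make', 'cleanavx'], cwd=Sourcedir, ...)` line in the diff above.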
@@ -116,7 +116,7 @@ namespace mg5amcCpu
bool known = true;
bool ok = __builtin_cpu_supports( "sse4.2" );
const std::string tag = "nehalem (SSE4.2)";
#else
#else // AV FIXME! Added by OM for Mac, should identify the correct __xxx__ flag that should be targeted
bool known = false; // __builtin_cpu_supports is not supported
// See https://gcc.gnu.org/onlinedocs/gcc/Basic-PowerPC-Built-in-Functions-Available-on-all-Configurations.html
// See https://stackoverflow.com/q/62783908