@harrywaugh harrywaugh commented Jun 17, 2019

Summary of Changes:

  • Added restrict and const qualifiers so that the compiler can auto-vectorize the propagate loop more effectively.
  • Edited the Makefile so that pb_mpi can be built with the Intel compiler, using make INTEL=1. The Intel build is currently faster than the GNU build. To compile with Intel, some missing return statements were also added so that the output trace file remains correct.
  • Removed the explicit MPI_Barrier calls; with the barriers gone, MPI_Bcast shows up as a major bottleneck as the core count increases.
  • Added benchmark report.
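
For readers unfamiliar with the first item, here is a minimal sketch of how restrict and const qualifiers can help auto-vectorization. The function name `propagate_sketch`, its parameters, and the loop body are hypothetical and are not the code changed in this PR; the sketch assumes C99 or later.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a propagate-style loop (not the PR's code).
 * `restrict` promises the compiler that src and dst never overlap, and
 * `const` marks src as read-only through this pointer. Together they
 * remove the aliasing doubt that can stop the loop being vectorized.
 */
void propagate_sketch(size_t n,
                      const float *restrict src,
                      float *restrict dst,
                      const float scale)
{
    for (size_t i = 0; i < n; ++i) {
        /* Each iteration is independent, so it maps onto SIMD lanes. */
        dst[i] = scale * src[i];
    }
}
```

Note that restrict is a promise made by the caller: passing overlapping buffers to a function declared this way is undefined behaviour.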
