| 1 | +When ./configure-mana is called in MANA, it calls ./configure, which |
| 2 | +in turn configures the DMTCP submodule as: |
| 3 | + ./configure ... --enable-debug CFLAGS=-fno-stack-protector CXXFLAGS=-fno-stack-protector MPI_BIN=/usr/local/bin ... MANA_USE_LH_FIXED_ADDRESS=1 --with-mana-helper-dir=../restart_plugin --disable-dlsym-wrapper ... |
| 4 | + |
| 5 | +Hence, '--with-mana-helper-dir' above points to this directory. |
| 6 | + |
| 7 | +../dmtcp/src/mtcp/Makefile.in contains hardwired MANA-specific code |
| 8 | +that compiles the source files of this directory into object files in ../dmtcp/src/mtcp: |
| 9 | + |
| 10 | +ifneq ($(MANA_HELPER_DIR),) |
| 11 | + HEADERS += $(MANA_HELPER_DIR)/mtcp_split_process.h \ |
| 12 | + $(MANA_HELPER_DIR)/ucontext_i.h |
| 13 | + OBJS += mtcp_restart_plugin.o mtcp_split_process.o getcontext.o |
| 14 | + CFLAGS += -DMTCP_PLUGIN_H="<mtcp_restart_plugin.h>" |
| 15 | + INCLUDES += -I$(MANA_HELPER_DIR) |
| 16 | +endif |
| 17 | + |
| 18 | +(Note that MANA_HELPER_DIR was set by ./configure using --with-mana-helper-dir.) |
| 19 | + |
| 20 | +==== |
| 21 | +When MANA restarts using mana_restart, the relevant logic is found in the |
| 22 | +files of this directory (at the time of restart) and |
| 23 | +../mpi-proxy-split/mpi_plugin.cpp (earlier at the time of checkpoint). |
| 24 | + |
| 25 | +mpi_plugin.cpp has written libsStart, libsEnd and highMemStart into the |
| 26 | +MTCP header of each checkpoint image at the time of checkpoint. |
| 27 | + |
| 28 | +At the time of checkpoint, control comes to: |
| 29 | + ../mpi-proxy-split/mpi_plugin.cpp:computeUnionOfCkptImageAddresses() |
| 30 | + (i) which computes libsStart, libsEnd, highMemStart |
| 31 | + (ii) and saves them in the MTCP header of the checkpoint image, |
| 32 | + (iii) such that [libsStart, libsEnd]+[highMemStart, STACK] should |
| 33 | + cover all memory regions of the upper half for every rank. |
| 34 | + |
| 35 | +At the time of restart, control comes to: |
| 36 | + ../dmtcp/src/mtcp/mtcp_restart.c:main() -> |
| 37 | + mtcp_restart_plugin.c:mtcp_plugin_hook() -> |
| 38 | + mtcp_split_process.c:splitProcess() -> |
| 39 | + mtcp_split_process.c:initializeLowerHalf() -> |
| 40 | + (i) mtcp_split_process.c:splitProcess() |
| 41 | + // forks proxy process for lower half |
| 42 | + // and then copies it into cur. process |
| 43 | + (ii) initializes the lower half with libc_start_main (now that it is |
| 44 | + in the current process) |
| 45 | + (iii) returns to 'splitProcess()', which returns to 'mtcp_plugin_hook()': |
| 46 | + mtcp_restart_plugin.c:mtcp_plugin_hook() -> |
| 47 | + (i) We finished 'splitProcess()', above. |
| 48 | + (ii) reserve_fds_upper_half() |
| 49 | + reserveUpperHalfMemoryRegionsForCkptImgs() // mmap memory regions |
| 50 | + // of future upper half |
| 51 | + (iii) JUMP_TO_LOWER_HALF() |
| 52 | + (iv) // MPI_Init is called here. Network memory areas are loaded by MPI_Init. |
| 53 | + // This includes /dev/xpmem, *shared_mem*, etc. |
| 54 | + // Also, MPI_Cart_create will be called to restore the Cartesian topology. |
| 55 | + // The checkpoint image to restore is then chosen by Cartesian |
| 56 | + // coordinates instead of by world rank. |
| 57 | + (v) RETURN_TO_UPPER_HALF() |
| 58 | + (vi) releaseUpperHalfMemoryRegionsForCkptImgs() |
| 59 | + unreserve_fds_upper_half() |
| 60 | + (vii) getCkptImageByRank() // Sets ckpt image for upper half for this rank |
| 61 | + (viii) returns to ../dmtcp/src/mtcp/mtcp_restart.c:main() |
| 62 | + ../dmtcp/src/mtcp/mtcp_restart.c:main() -> |
| 63 | + (i) Load ckpt image file found by 'mtcp_plugin_hook()' |
| 64 | + (ii) Control passes to program counter and stack from time of checkpoint |
| 65 | + (iii) The upper half then rebinds MPI wrappers, etc. |
| 66 | + |
| 67 | +==== |
| 68 | +DEBUGGING mana_restart: |
| 69 | + Note that the coordinator dumps a *.json file in the directory where the |
| 70 | + coordinator was launched, at the time of checkpoint (and during restart). |
| 71 | + The checkpoint version includes: |
| 72 | + libsStart, libsEnd, highMemStart, and the /proc/*/maps during checkpoint. |
| 73 | + This can be used to verify that [libsStart, libsEnd]+[highMemStart, STACK] |
| 74 | + truly covers all upper-half memory regions. |
| 75 | + This can also be checked in GDB by comparing two snapshots of |
| 76 | + /proc/self/maps: one taken inside ../dmtcp/src/mtcp/mtcp_restart.c:main() |
| 77 | + just before it loads the checkpoint image file, and one taken afterward |
| 78 | + when execution reaches 'case DMTCP_EVENT_RESTART:' in the file |
| 79 | + ../mpi-proxy-split/mpi_plugin.cpp. |
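
The coverage check described under DEBUGGING can be sketched in C as follows. This is only an illustration, not code from MANA or DMTCP: the names CoverageBounds, parse_maps_line, region_is_covered, and count_uncovered are hypothetical, and the stackEnd field is an assumed stand-in for the STACK bound. It also assumes the input holds only the upper-half lines of a /proc/*/maps listing (for example, the lines copied from the checkpoint-time *.json file), since lower-half regions are not expected to fall inside these ranges.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the values that the MTCP header is said to
 * hold, plus an assumed upper bound standing in for STACK. */
typedef struct {
  unsigned long libsStart;
  unsigned long libsEnd;
  unsigned long highMemStart;
  unsigned long stackEnd; /* assumption: end address of the stack region */
} CoverageBounds;

/* Parse the leading "start-end" hex pair of one maps line.
 * Returns 1 on success and 0 if the line does not match. */
int parse_maps_line(const char *line, unsigned long *start,
                    unsigned long *end)
{
  assert(line != NULL && start != NULL && end != NULL);
  return sscanf(line, "%lx-%lx", start, end) == 2;
}

/* Return 1 if the region [start, end) lies entirely inside
 * [libsStart, libsEnd] or inside [highMemStart, stackEnd]. */
int region_is_covered(const CoverageBounds *b, unsigned long start,
                      unsigned long end)
{
  int in_libs = start >= b->libsStart && end <= b->libsEnd;
  int in_high = start >= b->highMemStart && end <= b->stackEnd;
  return in_libs || in_high;
}

/* Count the lines of maps_text whose region is not covered.
 * Returns -1 if a line cannot be parsed. */
int count_uncovered(const CoverageBounds *b, const char *maps_text)
{
  int uncovered = 0;
  const char *line = maps_text;

  while (line != NULL && *line != '\0') {
    unsigned long start, end;
    if (!parse_maps_line(line, &start, &end)) {
      return -1;
    }
    if (!region_is_covered(b, start, end)) {
      uncovered++;
    }
    line = strchr(line, '\n'); /* advance to the next line, if any */
    if (line != NULL) {
      line++;
    }
  }
  return uncovered;
}
```

A result of 0 from count_uncovered() would indicate that every listed region lies inside one of the two ranges. Any positive count points to regions worth inspecting by hand in the maps listing.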