The tutorials are written in C++ and use MPI for inter-process communication. The algorithm implemented in the samples is simple: multiply each element of an input array by a constant factor, in parallel. Example output looks something like: Array [0,1,2,3,4] * multiplication factor [2] = [0,2,4,6,8]. The samples are fully documented, so download them and study the source code at your leisure:
C++ and MPI Tutorial 1 — Basic application
C++ and MPI Tutorial 2 — Extended to demonstrate screen output
C++ and MPI Tutorial 3 — Extended to demonstrate file output
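The tutorial sources remain the reference. Purely as an illustration of the idea, the sketch below shows one way the multiplication could be divided among processes. It is plain C++ with no MPI calls and simulates the ranks in a loop, with each simulated rank multiplying its own contiguous slice of the array. The names chunk_bounds and multiply_in_chunks, and the block-partitioning scheme itself, are assumptions made for this example and are not taken from the tutorials.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: half-open index range [begin, end) owned by `rank`
// when `n` elements are split as evenly as possible over `num_ranks` ranks.
// The first (n % num_ranks) ranks each receive one extra element.
std::pair<std::size_t, std::size_t> chunk_bounds(std::size_t n,
                                                 std::size_t num_ranks,
                                                 std::size_t rank) {
    assert(num_ranks > 0 && rank < num_ranks);
    const std::size_t base = n / num_ranks;
    const std::size_t extra = n % num_ranks;
    const std::size_t begin = rank * base + (rank < extra ? rank : extra);
    const std::size_t end = begin + base + (rank < extra ? 1 : 0);
    return {begin, end};
}

// Hypothetical helper: multiplies every element by `factor`, one slice per
// simulated rank. In a real MPI program each rank would process only its
// own slice, and the partial results would then be collected on one rank.
std::vector<int> multiply_in_chunks(const std::vector<int>& input, int factor,
                                    std::size_t num_ranks) {
    std::vector<int> output(input.size());
    for (std::size_t rank = 0; rank < num_ranks; ++rank) {
        const std::pair<std::size_t, std::size_t> bounds =
            chunk_bounds(input.size(), num_ranks, rank);
        for (std::size_t i = bounds.first; i < bounds.second; ++i) {
            output[i] = input[i] * factor;
        }
    }
    return output;
}
```

For example, multiply_in_chunks({0, 1, 2, 3, 4}, 2, 2) gives rank 0 the indices 0 to 2 and rank 1 the indices 3 to 4, and returns {0, 2, 4, 6, 8}, matching the example output above.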
Once you have downloaded the tutorials, copy them to your home directory and run the commands listed below to compile the programs into binaries:
/opt/mpich/gnu/bin/mpicc -o MPI_Tutorial1.out MPI_Tutorial1.c
/opt/mpich/gnu/bin/mpicc -o MPI_Tutorial2.out MPI_Tutorial2.c
/opt/mpich/gnu/bin/mpicc -o MPI_Tutorial3.out MPI_Tutorial3.c
mpicc is the command used to compile the source code; it compiles and links in a single step. The -o option sets the name of the output file, so the parameter that follows it is the name of the binary file. The last parameter is the name of the source code file. Run the command from within the directory where the MPI_Tutorialx.c files are stored.
Precompiled binaries of the tutorials can also be downloaded here:
Once compiled, the binaries can be executed with the following commands:
mpirun -machinefile host.txt --mca btl ^openib -np 2 /home/itcadmin/MPI_Tutorial1.out
mpirun -machinefile host.txt --mca btl ^openib -np 2 /home/itcadmin/MPI_Tutorial2.out
mpirun -machinefile host.txt --mca btl ^openib -np 2 /home/itcadmin/MPI_Tutorial3.out
mpirun is the command used to run the binary. The -machinefile host.txt option names the file that describes the hosts and processors available on the HPC cluster. An example machinefile (named host.txt) can be downloaded here. This is the same machinefile that was used in these examples. The --mca btl ^openib option tells Open MPI to exclude the openib (InfiniBand) transport. -np 2 specifies the number of processes to launch, not the number of nodes, so -np 2 starts 2 processes. There are 8 processors per node on the POC, so occupying every processor on two nodes would typically take -np 16, with the machinefile determining where the processes are placed. The last parameter is the path of the binary you wish to run.
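Because the POC has 8 processors per node, converting between node counts and processor counts is simple arithmetic. The sketch below is only an illustration of that arithmetic: the helper names total_processors and nodes_needed are hypothetical, and nodes_needed assumes one process per processor with nodes filled one after another, which the actual machinefile and MPI implementation may not follow.

```cpp
#include <cassert>

// Processors per node on the POC, as stated in the text above.
constexpr int kProcessorsPerNode = 8;

// Hypothetical helper: total number of processors on `nodes` nodes.
constexpr int total_processors(int nodes) {
    return nodes * kProcessorsPerNode;
}

// Hypothetical helper: smallest number of nodes able to host `processes`
// processes, assuming one process per processor (ceiling division).
int nodes_needed(int processes) {
    assert(processes >= 0);
    return (processes + kProcessorsPerNode - 1) / kProcessorsPerNode;
}
```

Under these assumptions, total_processors(2) is 16, nodes_needed(16) is 2, and nodes_needed(17) is 3.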
Lastly, binaries can also be queued on the sandbox using the qsub command, in exactly the same manner as on the live environment. This is discussed in the next section.