
MPICH2

Last edited by mpvillerius Apr 08, 2016

How to submit an MPI job with qsub

MPICH2 on Shark

MPICH is a freely available, portable implementation of MPI, the message-passing standard for distributed-memory applications used in parallel computing. The MPICH2 libraries are installed on all execution nodes. MPICH2 can be used either with the old startup method MPD or with the newer startup method Hydra.
In order to use the MPICH2 implementation you need a parallel environment. To list all parallel environments execute this command:

qconf -spl
BWA
make
mpich2
mpich2_mpd
smp

The libraries are located at:

/usr/local/mpich/mpich2    
/usr/local/mpich/mpich-3.2    

You can load either one of these MPI libraries by using the module command.
module load mpich2
module load mpich/3.2
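
To check which MPI environment is actually active, you can list the loaded modules and inspect the compiler wrapper that is picked up. A quick check (the paths in the comments are what you would expect if the module prepends the corresponding bin directory to your PATH):

module list
which mpicc        # e.g. /usr/local/mpich/mpich2/bin/mpicc or /usr/local/mpich/mpich-3.2/bin/mpicc
mpicc -show        # prints the underlying compiler command and MPI flags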

To use the mpich2 library with MPD

The mpich2 library uses the parallel environment mpich2_mpd. First create a .mpd.conf file in your home directory with the following command:

echo "MPD_SECRETWORD=set_some_passwd_or_use_this_one" >> $HOME/.mpd.conf

Change the file permissions so that only you can read and write it:

chmod 600 $HOME/.mpd.conf

To submit a test helloworld with 2 slots:

qsub -pe mpich2_mpd 2 mpich2_mpd.sh

The mpich2_mpd.sh submit script:

#!/bin/bash
#
# sample mpich2 job
# you will need to adjust your environment to the mpich2 installation,
# which is done below with: module load mpich2
# be sure to get the correct mpiexec for mpich2_mpd!

. /usr/local/Modules/current/init/bash

module load mpich2

echo "MPICH2_ROOT = $MPICH2_ROOT"

export MPD_CON_EXT="sge_$JOB_ID.$SGE_TASK_ID"

echo "Got $NSLOTS slots."

mpiexec -machinefile $TMPDIR/machines -n $NSLOTS mpihello

exit 0

The mpihello.c source code:

#include <stdio.h>
#include <mpi.h>
#include <unistd.h>

int main(int argc, char **argv)
{
   int node;

   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &node);

   printf("Hello World from Node %d.\n", node);
   sleep(60);

   MPI_Finalize();

   return 0;
}

To compile mpihello.c, first load the right module:
module load mpich2
Now compile the code:
mpicc -o mpihello mpihello.c

Now you are ready to submit your first MPI job with 2 slots (cores/CPUs):
qsub -pe mpich2_mpd 2 mpich2_mpd.sh
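
After submission you can follow the job with qstat. When the job has finished, SGE writes the program's output to files named after the submit script (the file names below assume the default SGE naming scheme, <jobname>.o<job_id> and <jobname>.e<job_id>):

qstat -u $USER                    # show the state of your jobs
cat mpich2_mpd.sh.o<job_id>       # standard output, containing the Hello World lines
cat mpich2_mpd.sh.e<job_id>       # standard error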

To use the mpich2 library with Hydra

With version 1.3 of MPICH2, Hydra became the default startup method for the slave tasks, and the other startup methods will be removed over time. Hydra is compiled with tight integration into SGE by default, so no special setup in SGE is necessary any longer to support MPICH2 jobs.

Hydra works out of the box with a parallel environment whose start_proc_args and stop_proc_args are both set to NONE (in essence, the same PE can now be used for Open MPI and MPICH2), and in the job script a plain mpiexec automatically discovers the granted slots and nodes without any further options. Nevertheless, if more than one MPI installation is available in the cluster, the mpiexec corresponding to the MPI library the application was compiled with must be used, as usual.
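
As an illustration only (the exact definition on Shark may differ), a Hydra-compatible parallel environment as described above could look like this when inspected with qconf -sp mpich2:

qconf -sp mpich2
pe_name            mpich2
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    NONE
stop_proc_args     NONE
allocation_rule    $fill_up
control_slaves     TRUE
job_is_first_task  FALSE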

The module for this version of MPICH2 is: module load mpich/3.2

The right parallel environment for MPICH2 with Hydra is: mpich2

The same mpihello.c can be used; you only need to recompile the source code with the right mpicc from

/usr/local/mpich/mpich-3.2/bin/mpicc  

Make sure that the mpich2 module is unloaded:

module unload mpich2

Now load the right module, mpich/3.2:

module load mpich/3.2

Now compile the code:
mpicc -o mpi3hello mpihello.c
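
To verify that the Hydra-based MPICH is now the one being used, check the compiler wrapper and the library version (the expected path assumes the mpich/3.2 module prepends /usr/local/mpich/mpich-3.2/bin to your PATH):

which mpicc          # expected: /usr/local/mpich/mpich-3.2/bin/mpicc
mpichversion         # prints the MPICH version information, e.g. MPICH Version: 3.2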

For MPICH2 with Hydra we use the parallel environment mpich2. To submit an mpich2 job with qsub and 10 slots you need a shell script. It is important to pass the option -rmk sge to mpiexec and to set the environment variable QRSH_WRAPPER. The submit script, mpich2-qsub.sh, would look like this:

#!/bin/bash
#$ -V
#$ -N mpich2_tst
#$ -cwd
#$ -pe mpich2 10
#$ -v QRSH_WRAPPER=/usr/local/OpenGridScheduler/gridengine/bin/linux-x64/qrshwrapper

. /usr/local/Modules/current/init/bash

module load mpich/3.2

echo "Got $NSLOTS slots."

mpiexec -rmk sge -np $NSLOTS mpi3hello

You can now submit this script with qsub:

qsub mpich2-qsub.sh

If you get an error in your mpich2_tst.e<job_id> file like this:

[proxy:0:0@greenlandshark] HYDU_create_process (utils/launch/launch.c:75): execvp error on file mpi3hello (No such file or directory)

You can solve this by using the absolute path to your mpi3hello file.
If, for example, you saved the file as /home/<username>/mpi3hello,
use this location in your submit script mpich2-qsub.sh:

mpiexec -rmk sge -np $NSLOTS /home/<username>/mpi3hello
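
Alternatively, you can avoid hard-coding your user name by letting the shell expand your home directory (equivalent, assuming the binary is stored directly in your home directory):

mpiexec -rmk sge -np $NSLOTS $HOME/mpi3hello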