UPR HPCf

MPIClusterErrors

last edited 4 years ago by dayala

Common errors when using MPI on the cluster

  • What does p4_error: semget failed for setnum: 0 mean?

    This means that the master node has reached its maximum number of semaphores, so the program you are trying to run cannot allocate a new one for inter-process communication. This can happen when somebody has been testing software that does not exit properly, leaving semaphores and shared memory segments allocated.

    If the leftover semaphores are owned by you, you can fix this by running the following two commands. The first cleans up the master node; the second uses cluster-fork to run the same cleanup on the compute nodes:

    $ /opt/mpich/gnu/sbin/cleanipcs

    $ cluster-fork /opt/mpich/gnu/sbin/cleanipcs

    (Either the intel or the gnu version can be used here; the scripts are identical.)

    It is possible that other users have filled up the semaphore table. In that case, either they or root will need to clean the tables.

  • What does p4_error: net_recv failed for fd = 7, errno = : 104 mean?

    On Linux, errno 104 is ECONNRESET (connection reset by peer): a process on one of the child nodes dropped its connection. On this cluster it usually means that the child node has reached its maximum number of semaphores, so the program you are trying to run cannot allocate a new one for inter-process communication with that node. This can happen when somebody has been testing software that does not exit properly, leaving semaphores and shared memory segments allocated.

    If the leftover semaphores are owned by you, you can fix this by running the following command, which cleans up the compute nodes:

    $ cluster-fork /opt/mpich/gnu/sbin/cleanipcs
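
Before cleaning up, it can help to confirm that the leftover semaphores really belong to you. The following is a minimal sketch, not part of the MPICH scripts: `list_my_semids` is a hypothetical helper name, and it assumes the usual Linux `ipcs -s` layout (key, semid, owner, perms, nsems), which may differ on other systems.

```shell
# Hypothetical helper: print the IDs of semaphore arrays owned by a user.
# Reads `ipcs -s` output on stdin; $1 is the owner name to match.
# Assumes the Linux column order: key semid owner perms nsems.
list_my_semids() {
    awk -v owner="$1" '$1 ~ /^0x/ && $3 == owner { print $2 }'
}

# Typical use on the master node (left commented out so that
# sourcing this snippet has no side effects):
#   ipcs -s | list_my_semids "$(id -un)"
```

If this prints nothing for your account but the errors persist, the semaphore table was most likely filled by another user. Note that `ipcs` may truncate long user names, in which case the owner comparison above will not match.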

 
