Plumed and GROMACS openmpi #492
Comments
Can you try to:
I suspect there is a problem with the MPI library that is not directly related to PLUMED.
Thank you for the response. The run without PLUMED works. Could you suggest a way to debug the issue?
Do you mean case 1 above or case 2? If it is case 1, can you check whether GROMACS was built with MPI or with threadMPI?
I've run case 1, i.e. `mpirun -np 2 gmx_mpi mdrun -s topol.tpr -multi 2 -nsteps 500000`, on the same data.
Can you point to the script used for building GROMACS? In addition, please also check what happens if you call `mpirun -np 1` (that is, with a single process). Thanks
The run with `-np 1` runs well if you remove the multi option. Here is the EasyBuild script we use. The build happens inside a Singularity CentOS 7.5 container:

```python
bootstrap_cmds = [
easyblock = "CMakeMake"
name = 'GROMACS'
homepage = 'http://www.gromacs.org'
toolchain = {'name': 'foss', 'version': '2017a'}
source_urls = [
sources = [
dependencies = [
hiddendependencies = [
builddependencies = [
preconfigopts = 'pwd && cd %(builddir)s/gromacs-2018.6 && plumed patch -p --runtime -e gromacs-2018.6 && cd /opt/easybuild/build/GROMACS/%(version)s/foss-2017a-mpi/easybuild_obj && '
separate_build_dir = True
config_list = [
configopts = ' '.join(config_list)
buildopts_list = [
buildopts = ' '.join(buildopts_list)
postinstallcmds = [
modextrapaths = {
modextravars = {
moduleclass = 'bio'
```
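Since the easyconfig patches GROMACS with `plumed patch --runtime`, the PLUMED kernel is resolved at run time via the `PLUMED_KERNEL` environment variable rather than at link time. A minimal sketch of such a launch; the library path is an assumption copied from the log output later in this issue:

```shell
# With a --runtime patch, the patched GROMACS dlopens the kernel named
# by PLUMED_KERNEL at startup (path below is from this thread's logs).
export PLUMED_KERNEL=/opt/scp/software/PLUMED/2.5.1-foss-2017a/lib/libplumedKernel.so

# Then launch as usual on the cluster, e.g.:
# mpirun -np 2 gmx_mpi mdrun -s topol.tpr -plumed plumed.dat -multi 2 -nsteps 500000
```

A consequence of runtime loading is that the kernel can be swapped without rebuilding GROMACS, but it also means the two binaries may silently pick up different MPI libraries.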
Sorry, I am afraid I do not know how to help... Did you manage to run with other PLUMED or GROMACS versions on the same cluster? Another thing you can check is whether PLUMED and GROMACS are linked to the same MPI library. These two commands should report exactly the same file:
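The two commands themselves were not preserved in this copy of the thread. A typical way to perform such a check (an assumption on my part, using `ldd`; the binary and library paths are taken from the logs in this issue and should be substituted with your own) would be:

```shell
# Hypothetical check: list the MPI library each object is dynamically
# linked against. Both should print the SAME libmpi.so file.
for obj in "$(command -v gmx_mpi)" \
           /opt/scp/software/PLUMED/2.5.1-foss-2017a/lib/libplumedKernel.so; do
  ldd "$obj" 2>/dev/null | grep libmpi || echo "could not inspect $obj"
done
```

If the two lines resolve to different `libmpi.so` files, the segfault inside `MPI_Allreduce` would be consistent with mixing two MPI installations in one process.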
Closing because of no clear documentation.
Hello,
I've built GROMACS 2018.6 patched with PLUMED 2.5.1 using EasyBuild. I'm trying to run exercise 4 of the CINECA tutorial (https://www.plumed.org/doc-v2.5/user-doc/html/cineca.html).
`gmx_mpi mdrun -s topol.tpr -plumed plumed.dat -nsteps 500000` runs fine, but
`mpirun -np 2 gmx_mpi mdrun -s topol.tpr -plumed plumed.dat -multi 2 -nsteps 500000`
fails with the following errors:
[kfhl160@seskscpg009 SCRIPTS]$ mpirun -np 2 gmx_mpi mdrun -s ../SETUP/topol.tpr -plumed plumed.dat -multi 2 -nsteps 500000
WARNING: Open MPI will create a shared memory backing file in a
directory that appears to be mounted on a network filesystem.
Creating the shared memory backup file on a network file system, such
as NFS or Lustre is not recommended -- it may cause excessive network
traffic to your file servers and/or cause shared memory traffic in
Open MPI to be much slower than expected.
You may want to check what the typical temporary directory is on your
node. Possible sources of the location of this temporary directory
include the $TEMPDIR, $TEMP, and $TMP environment variables.
Note, too, that system administrators can set a list of filesystems
where Open MPI is disallowed from creating temporary files by setting
the MCA parameter "orte_no_session_dir".
Local host: seskscpg009.prim.scp
Filename: /scratch/kfhl160/openmpi-sessions-486452227@seskscpg009_0/5849/1/shared_mem_pool.seskscpg009
You can set the MCA paramter shmem_mmap_enable_nfs_warning to 0 to
disable this message.
:-) GROMACS - gmx mdrun, 2018.6 (-:
GROMACS is written by:
Emile Apol Rossen Apostolov Paul Bauer Herman J.C. Berendsen
Par Bjelkmar Aldert van Buuren Rudi van Drunen Anton Feenstra
Gerrit Groenhof Aleksei Iupinov Christoph Junghans Anca Hamuraru
Vincent Hindriksen Dimitrios Karkoulis Peter Kasson Jiri Kraus
Carsten Kutzner Per Larsson Justin A. Lemkul Viveca Lindahl
Magnus Lundborg Pieter Meulenhoff Erik Marklund Teemu Murtola
Szilard Pall Sander Pronk Roland Schulz Alexey Shvetsov
Michael Shirts Alfons Sijbers Peter Tieleman Teemu Virolainen
Christian Wennberg Maarten Wolf
and the project leaders:
Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2017, The GROMACS development team at
Uppsala University, Stockholm University and
the Royal Institute of Technology, Sweden.
check out http://www.gromacs.org for more information.
GROMACS is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation; either version 2.1
of the License, or (at your option) any later version.
GROMACS: gmx mdrun, version 2018.6
Executable: /opt/scp/software/GROMACS/2018.6_GPU-foss-2017a-mpi/bin/gmx_mpi
Data prefix: /opt/scp/software/GROMACS/2018.6_GPU-foss-2017a-mpi
Working dir: /home/kfhl160/gromacstest1/cineca/SCRIPTS
Command line:
gmx_mpi mdrun -s ../SETUP/topol.tpr -plumed plumed.dat -multi 2 -nsteps 500000
Back Off! I just backed up md0.log to ./#md0.log.8#
Back Off! I just backed up md1.log to ./#md1.log.8#
++ Loading the PLUMED kernel runtime ++
++ PLUMED_KERNEL="/opt/scp/software/PLUMED/2.5.1-foss-2017a/lib/libplumedKernel.so" ++
++ Loading the PLUMED kernel runtime ++
++ PLUMED_KERNEL="/opt/scp/software/PLUMED/2.5.1-foss-2017a/lib/libplumedKernel.so" ++
++ Loading the PLUMED kernel runtime ++
++ PLUMED_KERNEL="/opt/scp/software/PLUMED/2.5.1-foss-2017a/lib/libplumedKernel.so" ++
++ Loading the PLUMED kernel runtime ++
++ PLUMED_KERNEL="/opt/scp/software/PLUMED/2.5.1-foss-2017a/lib/libplumedKernel.so" ++
Reading file ../SETUP/topol1.tpr, VERSION 4.6.7 (single precision)
Note: file tpx version 83, software tpx version 112
NOTE: GPU found, but the current simulation can not use GPUs
To use a GPU, set the mdp option: cutoff-scheme = Verlet
Overriding nsteps with value passed on the command line: 500000 steps, 1e+03 ps
Reading file ../SETUP/topol0.tpr, VERSION 4.6.7 (single precision)
Note: file tpx version 83, software tpx version 112
NOTE: GPU found, but the current simulation can not use GPUs
To use a GPU, set the mdp option: cutoff-scheme = Verlet
Overriding nsteps with value passed on the command line: 500000 steps, 1e+03 ps
This is simulation 1 out of 2 running as a composite GROMACS
multi-simulation job. Setup for this simulation:
Using 1 MPI process
This is simulation 0 out of 2 running as a composite GROMACS
multi-simulation job. Setup for this simulation:
Using 1 MPI process
Non-default thread affinity set probably by the OpenMP library,
disabling internal thread affinity
Non-default thread affinity set probably by the OpenMP library,
disabling internal thread affinity
NOTE: This file uses the deprecated 'group' cutoff_scheme. This will be
removed in a future release when 'verlet' supports all interaction forms.
NOTE: This file uses the deprecated 'group' cutoff_scheme. This will be
removed in a future release when 'verlet' supports all interaction forms.
Back Off! I just backed up traj_comp1.xtc to ./#traj_comp1.xtc.6#
Back Off! I just backed up traj_comp0.xtc to ./#traj_comp0.xtc.6#
Back Off! I just backed up ener1.edr to ./#ener1.edr.6#
Back Off! I just backed up ener0.edr to ./#ener0.edr.6#
starting mdrun 'alanine dipeptide in vacuum'
500000 steps, 1000.0 ps.
starting mdrun 'alanine dipeptide in vacuum'
500000 steps, 1000.0 ps.
[seskscpg009:90377] *** Process received signal ***
[seskscpg009:90376] *** Process received signal ***
[seskscpg009:90376] Signal: Segmentation fault (11)
[seskscpg009:90376] Signal code: Address not mapped (1)
[seskscpg009:90376] Failing at address: 0x30
[seskscpg009:90377] Signal: Segmentation fault (11)
[seskscpg009:90377] Signal code: Address not mapped (1)
[seskscpg009:90377] Failing at address: 0x30
[seskscpg009:90376] [ 0] [seskscpg009:90377] [ 0] /lib64/libpthread.so.0(+0xf6d0)[0x7f65825396d0]
[seskscpg009:90376] [ 1] /lib64/libpthread.so.0(+0xf6d0)[0x7f8897b1e6d0]
[seskscpg009:90377] [ 1] /opt/scp/software/OpenMPI/2.0.2-GCC-6.3.0-2.27/lib/libmpi.so.20(MPI_Allreduce+0x1a4)[0x7f657aa95f24]
[seskscpg009:90376] [ 2] /opt/scp/software/OpenMPI/2.0.2-GCC-6.3.0-2.27/lib/libmpi.so.20(MPI_Allreduce+0x1a4)[0x7f889007af24]
[seskscpg009:90377] [ 2] /opt/scp/software/PLUMED/2.5.1-foss-2017a/lib/libplumedKernel.so(_ZN4PLMD4GREX3cmdERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEPv+0xb78)[0x7f65639d0198]
[seskscpg009:90376] [ 3] /opt/scp/software/PLUMED/2.5.1-foss-2017a/lib/libplumedKernel.so(_ZN4PLMD4GREX3cmdERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEPv+0xb78)[0x7f8885258198]
[seskscpg009:90377] [ 3] /opt/scp/software/PLUMED/2.5.1-foss-2017a/lib/libplumedKernel.so(_ZN4PLMD10PlumedMain3cmdERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEPv+0x1c66)[0x7f65639df1e6]
[seskscpg009:90376] [ 4] /opt/scp/software/PLUMED/2.5.1-foss-2017a/lib/libplumedKernel.so(_ZN4PLMD10PlumedMain3cmdERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEPv+0x1c66)[0x7f88852671e6]
[seskscpg009:90377] [ 4] /opt/scp/software/PLUMED/2.5.1-foss-2017a/lib/libplumedKernel.so(plumed_plumedmain_cmd+0x5d)[0x7f65639ef37d]
[seskscpg009:90376] [ 5] gmx_mpi[0x4159af]
[seskscpg009:90376] [ 6] gmx_mpi[0x438095]
[seskscpg009:90376] [ 7] gmx_mpi[0x41faae]
[seskscpg009:90376] [ 8] /opt/scp/software/PLUMED/2.5.1-foss-2017a/lib/libplumedKernel.so(plumed_plumedmain_cmd+0x5d)[0x7f888527737d]
[seskscpg009:90377] [ 5] gmx_mpi[0x4159af]
[seskscpg009:90377] [ 6] gmx_mpi[0x438095]
[seskscpg009:90377] [ 7] gmx_mpi[0x4204e2]
[seskscpg009:90376] [ 9] gmx_mpi[0x444cdb]
[seskscpg009:90376] [10] gmx_mpi[0x40fbcc]
[seskscpg009:90376] [11] gmx_mpi[0x41faae]
[seskscpg009:90377] [ 8] gmx_mpi[0x4204e2]
[seskscpg009:90377] [ 9] gmx_mpi[0x444cdb]
[seskscpg009:90377] [10] gmx_mpi[0x40fbcc]
[seskscpg009:90377] [11] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f6579baa445]
[seskscpg009:90376] [12] gmx_mpi[0x412dee]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f888f18f445]
[seskscpg009:90377] [12] gmx_mpi[0x412dee]
[seskscpg009:90377] *** End of error message ***
[seskscpg009:90376] *** End of error message ***
mpirun noticed that process rank 1 with PID 0 on node seskscpg009 exited on signal 11 (Segmentation fault).
[seskscpg009.prim.scp:90335] 3 more processes have sent help message help-opal-shmem-mmap.txt / mmap on nfs
[seskscpg009.prim.scp:90335] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
Could you advise what can be wrong in the installation? The PLUMED version is 2.5.1.
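The backtrace above ends inside `PLMD::GREX::cmd` calling `MPI_Allreduce`, i.e. in the code that communicates across the two replicas created by `-multi`. As a toy sketch (an illustration, not PLUMED's or GROMACS's actual code) of how such a run partitions MPI ranks into simulations:

```shell
# Toy illustration: 2 MPI ranks are split into 2 one-rank simulations;
# PLUMED's GREX module then communicates across the per-simulation groups,
# which is the call that crashes in the backtrace above.
nranks=2
nsim=2
per_sim=$((nranks / nsim))   # ranks per simulation (1 here)
rank=0
while [ "$rank" -lt "$nranks" ]; do
  echo "MPI rank $rank -> simulation $((rank / per_sim))"
  rank=$((rank + 1))
done
```

This is why `mpirun -np 1` without `-multi` works while `-np 2 -multi 2` fails: only the multi-simulation case exercises the inter-replica MPI communication path.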