Using AmberMD

Amber is a suite of biomolecular simulation software maintained by an active developer community. It consists of two main parts: AmberTools23, a collection of standalone packages for preparing, running, and analyzing molecular simulations, and Amber22, an add-on built around the high-performance 'pmemd' engine, which makes molecular dynamics simulations significantly faster, especially on multiple CPUs and GPUs.

 

This KB Article References: High Performance Computing
This Information is Intended for: Instructors, Researchers, Staff
Created: 02/05/2024 Last Updated: 05/15/2024

Current Version

AmberTools23 and Amber22 are available on SeaWulf through a single module, which can be loaded with:

module load amber/22
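
After loading the module you can, if you like, confirm what it provides; the $AMBERHOME variable it sets is the same one used in the job scripts below:

echo $AMBERHOME                    # installation path set by the module
ls $AMBERHOME/bin | grep pmemd     # lists the pmemd, pmemd.MPI, pmemd.cuda, and pmemd.cuda.MPI binaries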

Amber Examples

The example simulation we will be running comes from a 50 nanosecond (ns) explicit solvent molecular dynamics (MD) production run of a protein system. The system has already been prepared: the protein structure was built and equilibrated in explicit solvent. While production runs of this kind normally take many hours, this example has been shortened considerably to keep it practical.

 

AmberMD jobs require three input files:

1. Input file (input.mdin) - Specifies the parameters and settings for the simulation (an illustrative example follows this list).

2. Topology file (topology.prmtop) - Contains the atom types, charges, and connectivity that define the force field and the structure of the system. It is typically generated with a tool such as LEaP.

3. Coordinate file (coordinates.rst7) - Contains the initial coordinates of the atoms in the system, typically taken from experimental data or a previous simulation.
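
The sample files already include a ready-to-use input.mdin, so nothing needs to be edited for this exercise. For orientation only, a typical explicit-solvent production input file uses the Fortran namelist format; the settings below are illustrative, not the contents of the provided sample:

Example production MD input (illustrative)
 &cntrl
  imin=0, irest=1, ntx=5,            ! run MD, restarting from previous coordinates and velocities
  ntb=2, ntp=1, pres0=1.0,           ! constant-pressure periodic box
  ntt=3, gamma_ln=2.0, temp0=300.0,  ! Langevin thermostat at 300 K
  ntc=2, ntf=2, cut=8.0,             ! SHAKE on bonds to hydrogen, 8 Angstrom cutoff
  nstlim=500000, dt=0.002,           ! 500,000 steps x 2 fs = 1 ns of MD
  ntpr=5000, ntwx=5000, ntwr=50000,  ! energy, trajectory, and restart output frequencies
 /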

 

First, make a new directory and copy the example Slurm scripts, input file, topology file, and coordinate file from the samples folder:

cd /gpfs/projects/samples/amber_samples
mkdir -p $HOME/amber_examples
cp * $HOME/amber_examples && cd $_

 

Next, you will run the simulation as a Slurm batch job. Amber provides several builds of the pmemd engine so you can match your simulation to the available hardware, and we have provided example Slurm scripts for pmemd, pmemd.MPI, pmemd.cuda, and pmemd.cuda.MPI. Because these programs produce a large number of output files, avoid submitting multiple jobs from the same directory; see the sketch below for one way to keep them separate.
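
For example, you could give each engine its own working directory (a sketch; adjust the names as you like):

cd $HOME/amber_examples
for engine in pmemd pmemd_MPI pmemd_cuda pmemd_cuda_MPI; do
    mkdir -p $engine
    cp input.mdin topology.prmtop coordinates.rst7 $engine/
done
# then copy the matching Slurm script (e.g. amber_md_MPI.slurm) into each
# directory and submit from there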


pmemd 

Pmemd stands for Particle Mesh Ewald Molecular Dynamics. It is Amber's optimized, high-performance molecular dynamics engine. The base pmemd executable runs on CPUs, while the MPI and CUDA variants described below extend it to multiple cores, nodes, and GPUs, enabling detailed studies of large, complex molecular systems.

 

The simulation will be run using the following command within a Slurm script:

$AMBERHOME/bin/pmemd -O -i input.mdin -p topology.prmtop -c coordinates.rst7 \
-ref coordinates.rst7 -o output.mdout -r restart.rst7 -x trajectory.nc

Important flags to consider:

    -O   overwrite existing output files
    -i   input file
    -p   topology file
    -c   starting coordinate file
    -ref reference coordinates; use the same file as the starting coordinates
    -o   output file
    -r   restart file (the last set of x, y, z coordinates from the simulation)
    -x   trajectory file
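
The same flags are also used to continue a finished run from its restart file: point -c (and -ref) at restart.rst7 and choose new output names so earlier results are not overwritten. A sketch, assuming the file names above and an input file set up for restarts (e.g. irest=1, ntx=5):

$AMBERHOME/bin/pmemd -O -i input.mdin -p topology.prmtop -c restart.rst7 \
-ref restart.rst7 -o output2.mdout -r restart2.rst7 -x trajectory2.nc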

 

amber_md.slurm:

#!/bin/bash
#SBATCH --job-name=amber_md
#SBATCH --partition=short-40core
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=40
#SBATCH --time=4:00:00
#SBATCH --output=amber_md.out

module load amber/22

$AMBERHOME/bin/pmemd -O -i input.mdin -p topology.prmtop -c coordinates.rst7 \
-ref coordinates.rst7 -o output.mdout -r restart.rst7 -x trajectory.nc

 

Then load the Slurm module and submit the job script to the scheduler:

module load slurm
sbatch amber_md.slurm
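
While the job is running, its progress can be checked from a login node with standard commands (mdinfo and output.mdout are described in the Analysis section below):

squeue -u $USER        # state of your queued and running jobs
cat mdinfo             # timing and performance snapshot written periodically by pmemd
tail -f output.mdout   # follow the main simulation output as it is written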

pmemd.MPI 

Pmemd.MPI extends the capabilities of pmemd by supporting Message Passing Interface (MPI) for parallelization across multiple nodes. By distributing computational tasks among multiple processors, pmemd.MPI can significantly accelerate simulations, making it particularly useful for simulating large biomolecular systems or for reducing simulation time in time-sensitive research projects.

amber_md_MPI.slurm:

#!/bin/bash
#SBATCH --job-name=amber_md_mpi
#SBATCH --partition=short-40core
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=40
#SBATCH --cpus-per-task=1
#SBATCH --time=4:00:00
#SBATCH --output=amber_md_mpi.out

module load amber/22

mpirun -np 80 $AMBERHOME/bin/pmemd.MPI -O -i input.mdin -p topology.prmtop -c coordinates.rst7 \
-ref coordinates.rst7 -o output.mdout -r restart.rst7 -x trajectory.nc

 

Load the Slurm module and submit the job script to the scheduler:

module load slurm
sbatch amber_md_MPI.slurm

pmemd.cuda 

Pmemd.cuda is a variant of pmemd optimized for Graphics Processing Unit (GPU) acceleration. It leverages the parallel computing power of GPUs to accelerate molecular dynamics simulations. This acceleration is especially beneficial for large-scale systems, where traditional CPU-based simulations may be computationally expensive or time-consuming. Pmemd.cuda allows researchers to perform simulations faster, enabling more extensive exploration of molecular systems within a given time frame.

amber_md_cuda.slurm:*

#!/bin/bash
#SBATCH --job-name=amber_md_cuda
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --gpus-per-task=1
#SBATCH --time=4:00:00
#SBATCH --output=amber_md_gpu.out

module load amber/22

$AMBERHOME/bin/pmemd.cuda -O -i input.mdin -p topology.prmtop -c coordinates.rst7 \
-ref coordinates.rst7 -o output.mdout -r restart.rst7 -x trajectory.nc

*While this example uses the gpu queue, it can also be run on the A100 GPUs via the a100 queue.
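
In that case the partition line of the script would change accordingly, e.g.:

#SBATCH --partition=a100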

 

Load the Slurm module and submit the job script to the scheduler:

module load slurm
sbatch amber_md_cuda.slurm

pmemd.cuda.MPI 

Pmemd.cuda.MPI combines the advantages of both CUDA optimization and MPI parallelization. It allows simulations to be distributed across multiple GPUs on multiple nodes, further enhancing the performance and scalability of molecular dynamics simulations. By harnessing the combined power of GPU acceleration and MPI parallelization, pmemd.cuda.MPI enables researchers to tackle even larger and more complex molecular systems efficiently.

amber_md_cuda_MPI.slurm:*

#!/bin/bash
#SBATCH --job-name=amber_md_cuda_mpi
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2
#SBATCH --gres=gpu:2
#SBATCH --time=4:00:00
#SBATCH --output=amber_md_cuda_mpi.out

module load amber/22

mpirun -np 2 $AMBERHOME/bin/pmemd.cuda.MPI -O -i input.mdin -p topology.prmtop \
-c coordinates.rst7 -ref coordinates.rst7 -o output.mdout -r restart.rst7 -x trajectory.nc

*While this example uses the gpu queue, it can also be run on the A100 GPUs via the a100 queue.

 

Load the Slurm module and submit the job script to the scheduler:

module load slurm
sbatch amber_md_cuda_MPI.slurm

Analysis

The submission scripts will produce several output files. 

  • amber_md.out (or the file named by --output) -- standard output (stdout) from the simulation job.
  • mdinfo -- information about the molecular dynamics run, including performance metrics and timing information.
  • output.mdout -- the main output of the molecular dynamics simulation.
  • restart.rst7 -- saves the current state of the simulation, allowing it to be paused and resumed where it left off; useful for long and complex simulations.
  • trajectory.nc -- the trajectory data generated during the simulation: the atomic coordinates saved over time and, depending on the input settings, velocities and other properties.

 

The next step is to analyze the calculated trajectory. This can be done with the cpptraj program included in AmberTools; tutorials on trajectory analysis are available on the Amber website.
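
As a starting point, a minimal cpptraj run might compute the backbone RMSD of each frame relative to the first; this is a sketch using the file names produced above, and the atom mask is illustrative:

module load amber/22
cpptraj <<'EOF'
# hypothetical analysis: backbone RMSD of each frame vs. the first frame
parm topology.prmtop
trajin trajectory.nc
rms first @CA,C,N out rmsd.dat
run
EOF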


Visualizing the Results

Visualization of the trajectory is also possible using software such as VMD and Chimera. SeaWulf currently supports VMD 1.9.2 and Chimera 1.13.1.

 

Load the VMD module with:

module load vmd/1.9.2

Load the Chimera module with:

module load Chimera/1.13.1

 

An extensive tutorial on using VMD in conjunction with Amber is also available online.
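
For example, with the VMD module loaded in a session that supports graphics, the topology and trajectory produced above can usually be opened together; the -f flag asks VMD to load both files into the same molecule, and the formats are detected from the .prmtop and .nc extensions:

module load vmd/1.9.2
vmd -f topology.prmtop trajectory.nc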

 
