Amber is a suite of software packages for simulating molecules, maintained by a dedicated developer community. It consists of two main parts: AmberTools23, a collection of standalone packages for molecular simulations, and Amber22, an add-on that provides the high-performance 'pmemd' program. pmemd makes molecular dynamics simulations significantly faster, especially when run on multiple CPUs and GPUs.
Current Version
AmberTools23 and Amber22 are available on SeaWulf as a module, which can be loaded with:
module load amber/22
Amber Examples
The example we will be running comes from a 50 nanosecond (ns) explicit solvent molecular dynamics (MD) production run of a protein system. The system has already been prepared: the protein structure was built and equilibrated in explicit solvent. While most production simulations run over the course of several hours, this example has been shortened considerably to make it a more practical exercise.
AmberMD jobs require three distinct input files:
1. Input file (input.mdin) - This file specifies the parameters and settings for the simulation (an illustrative example is shown after this list).
2. Topology file (topology.prmtop) - This file includes the necessary information regarding atom types, charges, and connectivity for defining the force field and structure of the system. It is typically generated using software tools such as LEaP.
3. Coordinate file (coordinates.rst7) - This file contains the initial coordinates of the atoms in the system. It is typically generated from experimental data or previous simulations.
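For orientation, a production-style input.mdin generally looks something like the sketch below. The parameter values here are illustrative only and are not the settings from the provided sample file; consult the Amber manual for the meaning of each &cntrl variable.

Example production MD input (illustrative values only)
 &cntrl
   imin=0, ntx=5, irest=1,            ! no minimization; restart using saved velocities
   nstlim=500000, dt=0.002,           ! 500,000 steps x 2 fs = 1 ns
   ntt=3, gamma_ln=2.0, temp0=300.0,  ! Langevin thermostat at 300 K
   ntp=1, ntb=2, pres0=1.0,           ! constant pressure with a periodic box
   ntc=2, ntf=2, cut=8.0,             ! SHAKE on bonds to hydrogen, 8 Angstrom cutoff
   ntpr=5000, ntwx=5000, ntwr=50000,  ! energy, trajectory, and restart output frequency
 /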
First make a new directory and copy the example slurm script, input file, topology file, and coordinate file from the samples folder:
cd /gpfs/projects/samples/amber_samples
mkdir -p $HOME/amber_examples
cp * $HOME/amber_examples && cd $_
Next you will run the simulation via a Slurm batch job. AmberMD provides multiple versions of the pmemd program to make the best use of the available hardware, and we have provided example Slurm scripts for pmemd, pmemd.MPI, pmemd.cuda, and pmemd.cuda.MPI. Because each run writes the same set of output files, submitting multiple jobs from the same directory would cause them to overwrite one another, so run each variant in its own directory.
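One simple way to keep the runs separate (a suggested layout, not a requirement) is to give each pmemd variant its own working directory and copy the three input files into each, for example:

# create one working directory per pmemd variant so their output files do not collide
cd $HOME/amber_examples
for variant in serial mpi cuda cuda_mpi; do
    mkdir -p run_$variant
    cp input.mdin topology.prmtop coordinates.rst7 run_$variant/
done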
pmemd
Pmemd stands for Particle Mesh Ewald Molecular Dynamics. It's an optimized and high-performance molecular dynamics simulation program designed to efficiently utilize parallel processing. pmemd is well-suited for large-scale simulations and high-core-count computing environments, allowing for detailed studies of complex molecular systems.
The simulation will be run using the following command within a slurm script:
$AMBERHOME/bin/pmemd -O -i input.mdin -p topology.prmtop -c coordinates.rst7 \
    -ref coordinates.rst7 -o output.mdout -r restart.rst7 -x trajectory.nc
Important flags to consider:
-O    overwrite output files
-i    input file
-p    topology file
-c    starting coordinate file
-ref  reference coordinates; use the same file as the starting coordinates
-o    output file
-r    restart file (last set of xyz coordinates from the simulation)
-x    trajectory file
amber_md.slurm:
#!/bin/bash
#SBATCH --job-name=amber_md
#SBATCH --partition=short-40core
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=40
#SBATCH --time=4:00:00
#SBATCH --output=amber_md.out

module load amber/22

$AMBERHOME/bin/pmemd -O -i input.mdin -p topology.prmtop -c coordinates.rst7 \
    -ref coordinates.rst7 -o output.mdout -r restart.rst7 -x trajectory.nc
Then load the Slurm module and submit the job script to the scheduler:
module load slurm
sbatch amber_md.slurm
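Once the job has been submitted, it can be monitored with standard Slurm and shell commands, for example:

squeue -u $USER        # check whether the job is pending or running
tail -f amber_md.out   # follow the job's standard output as the simulation progresses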
pmemd.MPI
Pmemd.MPI extends the capabilities of pmemd by supporting Message Passing Interface (MPI) for parallelization across multiple nodes. By distributing computational tasks among multiple processors, pmemd.MPI can significantly accelerate simulations, making it particularly useful for simulating large biomolecular systems or for reducing simulation time in time-sensitive research projects.
amber_md_MPI.slurm:
#!/bin/bash
#SBATCH --job-name=amber_md_mpi
#SBATCH --partition=short-40core
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=40
#SBATCH --time=4:00:00
#SBATCH --output=amber_md_mpi.out

module load amber/22

mpirun -np 80 $AMBERHOME/bin/pmemd.MPI -O -i input.mdin -p topology.prmtop -c coordinates.rst7 \
    -ref coordinates.rst7 -o output.mdout -r restart.rst7 -x trajectory.nc
Load the Slurm module and submit the job script to the scheduler:
module load slurm
sbatch amber_md_MPI.slurm
pmemd.cuda
Pmemd.cuda is a variant of pmemd optimized for Graphics Processing Unit (GPU) acceleration. It leverages the parallel computing power of GPUs to accelerate molecular dynamics simulations. This acceleration is especially beneficial for large-scale systems, where traditional CPU-based simulations may be computationally expensive or time-consuming. Pmemd.cuda allows researchers to perform simulations faster, enabling more extensive exploration of molecular systems within a given time frame.
amber_md_cuda.slurm:*
#!/bin/bash
#SBATCH --job-name=amber_md_cuda
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --gpus-per-task=1
#SBATCH --time=4:00:00
#SBATCH --output=amber_md_gpu.out

module load amber/22

$AMBERHOME/bin/pmemd.cuda -O -i input.mdin -p topology.prmtop -c coordinates.rst7 \
    -ref coordinates.rst7 -o output.mdout -r restart.rst7 -x trajectory.nc
*While this example uses the gpu queue, it can also be run on the A100 GPUs via the a100 queue.
Load the Slurm module and submit the job script to the scheduler:
module load slurm
sbatch amber_md_cuda.slurm
pmemd.cuda.MPI
Pmemd.cuda.MPI combines the advantages of both CUDA optimization and MPI parallelization. It allows simulations to be distributed across multiple GPUs on multiple nodes, further enhancing the performance and scalability of molecular dynamics simulations. By harnessing the combined power of GPU acceleration and MPI parallelization, pmemd.cuda.MPI enables researchers to tackle even larger and more complex molecular systems efficiently.
amber_md_cuda_MPI.slurm:*
#!/bin/bash
#SBATCH --job-name=amber_md_cuda_mpi
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --gres=gpu:2
#SBATCH --time=4:00:00
#SBATCH --output=amber_md_gpu.out

module load amber/22

mpirun -np 2 $AMBERHOME/bin/pmemd.cuda.MPI -O -i input.mdin -p topology.prmtop \
    -c coordinates.rst7 -ref coordinates.rst7 -o output.mdout -r restart.rst7 -x trajectory.nc
*While this example uses the gpu queue, it can also be run on the A100 GPUs via the a100 queue.
Load the Slurm module and submit the job script to the scheduler:
module load slurm
sbatch amber_md_cuda_MPI.slurm
Analysis
The submission scripts will produce several output files.
- amber_md.out -- standard output (stdout) from the simulation job.
- mdinfo -- information about the molecular dynamics run, including performance metrics and timing information (see the example after this list).
- output.mdout -- main output from the molecular dynamics simulation.
- restart.rst7 -- restart file that saves the current state of the simulation, allowing it to be paused and resumed where it left off; useful for long and complex simulations.
- trajectory.nc -- contains the trajectory data generated during the simulation, including the positions, velocities, and properties of the atoms over time.
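As a quick check on simulation throughput, the timing summaries that pmemd writes can be pulled out directly, for example (assuming the file names used above):

grep "ns/day" mdinfo output.mdout   # reported speed in nanoseconds of simulation per day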
The next step is to analyze the calculated trajectory. This can be done using the cpptraj program included with AmberTools. Tutorials regarding trajectory analysis can be found here.
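As a minimal sketch (the residue and atom masks below are hypothetical and must be adapted to your system), a cpptraj input script might look like:

# analysis.cpptraj -- illustrative only; adjust masks and file names to your system
parm topology.prmtop
trajin trajectory.nc
rms first :1-100@CA out rmsd.dat     # C-alpha RMSD relative to the first frame
radgyr :1-100 out radgyr.dat         # radius of gyration over the trajectory
run

It could then be run with:

$AMBERHOME/bin/cpptraj -i analysis.cpptraj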
Visualizing the Results
Visualization of the trajectory is also possible using software such as VMD and Chimera. SeaWulf currently supports VMD 1.9.2 and Chimera 1.13.1.
Load the VMD module with:
module load vmd/1.9.2
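VMD can usually load the Amber topology and NetCDF trajectory directly, since it typically recognizes the .prmtop and .nc file types by extension (if it does not, the file types can be set explicitly when loading):

vmd topology.prmtop trajectory.nc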
Load the Chimera module with:
module load Chimera/1.13.1
Click here to view an extensive tutorial on using VMD in conjunction with AMBER.