GROMACS蛋白配体复合物模拟
编辑gromacs小分子的配体的电荷问题一直时gromacs模拟蛋白-配体复合物中的难点与重点~网上有许多相关方面的教程,但是都没有成为系统,应李继存老师之邀,把最近摸索的方法过程总结出来,分享给大家~感谢网络上的朋友对我的帮助~
一、配体文件构建
1. 高斯输入文件建立:
antechamber -i lig.mol2 -fi mol2 -o lig.gjf -fo gcrt -pf y -gm "%mem=32GB" -gn "%nproc=12" -nc 0 -gk "#HF/6-31G*SCF=tight Test Pop=MK iop(6/33=2) iop(6/42=6) iop(6/50=1) opt nosymm" -ge lig.gesp -gv 1 -at gaff2
当然也可以进行基组的更换:
antechamber -i lig.mol2 -fi mol2 -o lig.gjf -fo gcrt -pf y -gm "%mem=32GB" -gn "%nproc=12" -nc 0 -gk "#b3lyp/6-31G(d) SCF=tight Test Pop=MK iop(6/33=2) iop(6/42=6) iop(6/50=1) opt nosymm" -ge lig.gesp -gv 1 -at gaff2
高斯运行:
g16 lig.gjf
当然还有一种方法进行计算
antechamber -i lig2.pdb -fi pdb -o LIG.mol2 -fo mol2 -pf y -c bcc -s 2 -rn LIG -at gaff2
2.拟合成resp文件
antechamber -i lig.gesp -fi gesp -o LIG.mol2 -fo mol2 -pf y -c resp -rn LIG -at gaff2
3.生成缺失参数文件
parmchk2 -i LIG.mol2 -f mol2 -o lig.frcmod -s gaff2
生成AMBER参数文件以及坐标文件,由于现在最新的ff19sb力场在gromacs里面使用还是不那么完善,所以目前我还是用的
ff14sb
source leaprc.protein.ff14SB
source leaprc.gaff2
loadamberparams lig.frcmod
lig=loadmol2 LIG.mol2
check lig
saveamberparm lig lig.prmtop lig.inpcrd
quit
命名为leap.in
tleap -f leap.in
4. 力场转化
安装acpype,当然也可以用卢老师的sobtop,更加方便
conda create -n acpypemd python=3.8
conda activate acpypemd
conda install -c conda-forge acpype
力场转化:
acpype -p lig.prmtop -x lig.inpcrd -d
二、蛋白模拟
1. 蛋白力场生成
由于用的ff14sb,需要下载力场文件:
https://github.com/intbio/gromacs_ff
进行蛋白的模拟:
gmx_mpi pdb2gmx -f re.pdb -o protein_processed.gro -water tip3p -ignh
如果报错可以尝试使用pdbfixer进行修复一下:
conda install -c conda-forge pdbfixer
2. 复合物力场构建
复制一份protein_processed.gro并重新命名为complex.gro
将得到的LIG_GMX.top中顶部的[defaults],[ atomtypes ],底部的[ system ],[ molecules ]删除,改名为LIG.itp。记得[ moleculetype ]需要被保留!!
将LIG_GMX.gro中内容放入complex.gro中:
示例如下:
1 LIG C1 1 17.821 15.972 15.867
1 LIG C2 2 17.782 15.845 15.831
1 LIG C3 3 17.651 15.821 15.787
1 LIG C4 4 17.559 15.924 15.780
1 LIG C5 5 17.596 16.054 15.819
1 LIG C6 6 17.730 16.076 15.858
1 LIG C7 7 17.423 15.887 15.725
1 LIG O1 8 17.376 16.146 15.829
1 LIG C8 9 17.505 16.172 15.821
1 LIG C9 10 17.105 15.994 15.670
1 LIG C10 11 17.139 16.145 15.664
1 LIG C11 12 17.146 16.210 15.804
1 LIG C12 13 17.285 16.258 15.843
1 LIG C13 14 17.401 15.932 15.581
1 LIG C14 15 17.274 15.892 15.511
1 LIG C15 16 17.149 15.918 15.547
1 LIG O2 17 17.546 16.285 15.822
1 LIG C16 18 17.289 16.310 15.986
1 LIG O3 19 17.484 15.992 15.520
1 LIG O4 20 17.872 15.747 15.840
1 LIG H1 21 17.835 15.664 15.810
1 LIG O5 22 17.778 16.195 15.893
1 LIG H2 23 17.714 16.264 15.872
1 LIG H3 24 17.921 15.992 15.899
1 LIG H4 25 17.624 15.722 15.756
1 LIG H5 26 17.344 15.924 15.788
1 LIG H6 27 17.413 15.778 15.724
1 LIG H7 28 16.998 15.982 15.683
1 LIG H8 29 17.150 15.951 15.758
1 LIG H9 30 17.065 16.195 15.603
1 LIG H10 31 17.234 16.158 15.613
1 LIG H11 32 17.110 16.141 15.880
1 LIG H12 33 17.081 16.297 15.809
1 LIG H13 34 17.318 16.335 15.775
1 LIG H14 35 17.292 15.844 15.416
1 LIG H15 36 17.072 15.888 15.479
1 LIG H16 37 17.387 16.347 16.012
1 LIG H17 38 17.218 16.391 15.998
1 LIG H18 39 17.262 16.231 16.056
并且需要修改顶部的原子数目,改为两个gro文件之和
在topol文件中添加小分子的位置限制:
; Include Position restraint file
#ifdef POSRES
#include "posre.itp"
#endif
; Include ligand topology
#include "LIG.itp"
; Include water topology
#include "amber99sb.ff/tip3p.itp"
[ molecules ]
; Compound #mols
Protein 1
LIG 1
[注:增加了itp文件,以及molecules增加了小分子,即:
; Include ligand topology
#include "LIG.itp"
LIG 1
将[ atomtypes ]内的内容写入top文件中
位于[ moleculetype ]上部,例如:
;新增部分如下
[ atomtypes ]
;name bond_type mass charge ptype sigma epsilon Amb
ca ca 0.00000 0.00000 A 3.31521e-01 4.13379e-01 ; 1.86 0.0988
c3 c3 0.00000 0.00000 A 3.39771e-01 4.51035e-01 ; 1.91 0.1078
os os 0.00000 0.00000 A 3.15610e-01 3.03758e-01 ; 1.77 0.0726
c c 0.00000 0.00000 A 3.31521e-01 4.13379e-01 ; 1.86 0.0988
ce ce 0.00000 0.00000 A 3.31521e-01 4.13379e-01 ; 1.86 0.0988
c2 c2 0.00000 0.00000 A 3.31521e-01 4.13379e-01 ; 1.86 0.0988
o o 0.00000 0.00000 A 3.04812e-01 6.12119e-01 ; 1.71 0.1463
oh oh 0.00000 0.00000 A 3.24287e-01 3.89112e-01 ; 1.82 0.0930
ho ho 0.00000 0.00000 A 5.37925e-02 1.96648e-02 ; 0.30 0.0047
ha ha 0.00000 0.00000 A 2.62548e-01 6.73624e-02 ; 1.47 0.0161
hc hc 0.00000 0.00000 A 2.60018e-01 8.70272e-02 ; 1.46 0.0208
h1 h1 0.00000 0.00000 A 2.42200e-01 8.70272e-02 ; 1.36 0.0208
[ moleculetype ]
; Name nrexcl
Protein_chain_A 3
3.定义单元盒子
gmx_mpi editconf -f complex.gro -o newbox.gro -bt cubic -d 1.0
4.加水
gmx_mpi solvate -cp newbox.gro -cs spc216.gro -p topol.top -o solv.gro
5.添加离子
ions.mdp文件如下:
; LINES STARTING WITH ';' ARE COMMENTS
title = Minimization ; Title of run
; Parameters describing what to do, when to stop and what to save
integrator = steep ; Algorithm (steep = steepest descent minimization)
emtol = 1000.0 ; Stop minimization when the maximum force < 10.0 kJ/mol
emstep = 0.01 ; Energy step size
nsteps = 50000 ; Maximum number of (minimization) steps to perform
; Parameters describing how to find the neighbors of each atom and how to calculate the interactions
nstlist = 1 ; Frequency to update the neighbor list and long range forces
cutoff-scheme = Verlet
ns_type = grid ; Method to determine neighbor list (simple, grid)
rlist = 1.0 ; Cut-off for making neighbor list (short range forces)
coulombtype = cutoff ; Treatment of long range electrostatic interactions
rcoulomb = 1.0 ; long range electrostatic cut-off
rvdw = 1.0 ; long range Van der Waals cut-off
pbc = xyz ; Periodic Boundary Conditions
离子添加
gmx_mpi grompp -f ions.mdp -c solv.gro -p topol.top -o ions.tpr
gmx_mpi genion -s ions.tpr -o solv_ions.gro -p topol.top -pname NA -nname CL -neutral
替换 15 SOL内的元素
6.能量最小化
em.mdp文件内容如下:
; LINES STARTING WITH ';' ARE COMMENTS
title = Minimization ; Title of run
; Parameters describing what to do, when to stop and what to save
integrator = steep ; Algorithm (steep = steepest descent minimization)
emtol = 1000.0 ; Stop minimization when the maximum force < 10.0 kJ/mol
emstep = 0.01 ; Energy step size
nsteps = 50000 ; Maximum number of (minimization) steps to perform
; Parameters describing how to find the neighbors of each atom and how to calculate the interactions
nstlist = 1 ; Frequency to update the neighbor list and long range forces
cutoff-scheme = Verlet
ns_type = grid ; Method to determine neighbor list (simple, grid)
rlist = 1.2 ; Cut-off for making neighbor list (short range forces)
coulombtype = PME ; Treatment of long range electrostatic interactions
rcoulomb = 1.2 ; long range electrostatic cut-off
vdwtype = cutoff
vdw-modifier = force-switch
rvdw-switch = 1.0
rvdw = 1.2 ; long range Van der Waals cut-off
pbc = xyz ; Periodic Boundary Conditions
DispCorr = no
可以根据进行修改例如emtol等等的修改
gmx_mpi grompp -f em.mdp -c solv_ions.gro -p topol.top -o em.tpr
gmx_mpi mdrun -v -deffnm em
7. NVT
首先对配体进行限制以免爆开
gmx_mpi make_ndx -f LIG.gro -o index_lig.ndx
选择所有的非氢数据
> 0 & ! a H*
> q
gmx_mpi genrestr -f LIG.gro -n index_lig.ndx -o posre_lig.itp -fc 1000 1000 1000
增加配体约束在topol文件中的Include water topology之前
; Ligand position restraints
#ifdef POSRES
#include "posre_lig.itp"
#endif
例如:
; Include ligand topology
#include "LIG.itp"
; Ligand position restraints
#ifdef POSRES
#include "posre_lig.itp"
#endif
; Include water topology
#include "./amber14sb_parmbsc1_cufix.ff/tip3p.itp"
构建索引:
gmx_mpi make_ndx -f em.gro -o index.ndx
选择蛋白和配体
> 1 | 13
> q
构建 nvt.mdp文件例如如下:
title = Protein-ligand complex NVT equilibration
define = -DPOSRES ; position restrain the protein and ligand
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 50000 ; 2 * 50000 = 100 ps
dt = 0.002 ; 2 fs
; Output control
nstenergy = 500 ; save energies every 1.0 ps
nstlog = 500 ; update log file every 1.0 ps
nstxout-compressed = 500 ; save coordinates every 1.0 ps
; Bond parameters
continuation = no ; first dynamics run
constraint_algorithm = lincs ; holonomic constraints
constraints = h-bonds ; bonds to H are constrained
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
; Neighbor searching and vdW
cutoff-scheme = Verlet
ns_type = grid ; search neighboring grid cells
nstlist = 20 ; largely irrelevant with Verlet
rlist = 1.2
vdwtype = cutoff
vdw-modifier = force-switch
rvdw-switch = 1.0
rvdw = 1.2 ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
rcoulomb = 1.2 ; short-range electrostatic cutoff (in nm)
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = Protein_LIG Water_and_ions ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 300 300 ; reference temperature, one for each group, in K
; Pressure coupling
pcoupl = no ; no pressure coupling in NVT
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Dispersion correction is not used for proteins with the C36 additive FF
DispCorr = no
; Velocity generation
gen_vel = yes ; assign velocities from Maxwell distribution
gen_temp = 300 ; temperature for Maxwell distribution
gen_seed = -1 ; generate a random seed
其中nvt的时间可以修改,主要是看是否平衡,此处我未进行修改
执行nvt:
gmx_mpi grompp -f nvt.mdp -c em.gro -r em.gro -p topol.top -n index.ndx -o nvt.tpr
gmx_mpi mdrun -v -deffnm nvt
8. NPT
npt模拟的mdp如下:
title = Protein-ligand complex NPT equilibration
define = -DPOSRES ; position restrain the protein and ligand
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 250000 ; 2 * 250000 = 500 ps
dt = 0.002 ; 2 fs
; Output control
nstenergy = 500 ; save energies every 1.0 ps
nstlog = 500 ; update log file every 1.0 ps
nstxout-compressed = 500 ; save coordinates every 1.0 ps
; Bond parameters
continuation = yes ; continuing from NVT
constraint_algorithm = lincs ; holonomic constraints
constraints = h-bonds ; bonds to H are constrained
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
; Neighbor searching and vdW
cutoff-scheme = Verlet
ns_type = grid ; search neighboring grid cells
nstlist = 20 ; largely irrelevant with Verlet
rlist = 1.2
vdwtype = cutoff
vdw-modifier = force-switch
rvdw-switch = 1.0
rvdw = 1.2 ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
rcoulomb = 1.2
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = Protein_LIG Water_and_ions ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 300 300 ; reference temperature, one for each group, in K
; Pressure coupling
pcoupl = C-rescale ; pressure coupling is on for NPT
pcoupltype = isotropic ; uniform scaling of box vectors
tau_p = 2.0 ; time constant, in ps
ref_p = 1.0 ; reference pressure, in bar
compressibility = 4.5e-5 ; isothermal compressibility of water, bar^-1
refcoord_scaling = com
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Dispersion correction is not used for proteins with the C36 additive FF
DispCorr = no
; Velocity generation
gen_vel = no ; velocity generation off after NVT
其中npt的时间可以修改,主要是看是否平衡,此处我未进行修改
执行nvt:
gmx_mpi grompp -f npt.mdp -c nvt.gro -t nvt.cpt -r nvt.gro -p topol.top -n index.ndx -o npt.tpr
gmx_mpi mdrun -v -deffnm npt
9. MD 正式模拟
md的mdp文件如下:
title = Protein-ligand complex MD simulation
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 5000000 ; 2 * 5000000 = 10000 ps (10 ns)
dt = 0.002 ; 2 fs
; Output control
nstenergy = 5000 ; save energies every 10.0 ps
nstlog = 5000 ; update log file every 10.0 ps
nstxout-compressed = 5000 ; save coordinates every 10.0 ps
; Bond parameters
continuation = yes ; continuing from NPT
constraint_algorithm = lincs ; holonomic constraints
constraints = h-bonds ; bonds to H are constrained
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
; Neighbor searching and vdW
cutoff-scheme = Verlet
ns_type = grid ; search neighboring grid cells
nstlist = 20 ; largely irrelevant with Verlet
rlist = 1.2
vdwtype = cutoff
vdw-modifier = force-switch
rvdw-switch = 1.0
rvdw = 1.2 ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
rcoulomb = 1.2
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = Protein_LIG Water_and_ions ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 300 300 ; reference temperature, one for each group, in K
; Pressure coupling
pcoupl = Parrinello-Rahman ; pressure coupling is on for NPT
pcoupltype = isotropic ; uniform scaling of box vectors
tau_p = 2.0 ; time constant, in ps
ref_p = 1.0 ; reference pressure, in bar
compressibility = 4.5e-5 ; isothermal compressibility of water, bar^-1
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Dispersion correction is not used for proteins with the C36 additive FF
DispCorr = no
; Velocity generation
gen_vel = no ; continuing from NPT equilibration
正式模拟:
gmx_mpi grompp -f md.mdp -c npt.gro -t npt.cpt -p topol.top -n index.ndx -o md_0_10.tpr
gmx_mpi mdrun -v -deffnm md_0_10
三、数据分析
- 0
- 0
-
分享