Water molecules play significant roles in biological process through interacting with all kinds of solutes, such as proteins, organic compounds and ions. Many research have been performed on the thermal dynamics and kinetics of water molecules in solutions experimentally (infrared spectroscopy,Raman spectroscopy) and computationally (MD). They are so popular and tiny that makes them negligible to the research compared to large and distinct molecules. This study is to investigate the functionalities that water molecules might have on solute molecules through molecular simulation.
The anomalies from water were believed to be connected to the hydrogen bond network in liquid water. We will be studying the hydrogen bond network properties, and investigate how it is influenced by solutes in this network.
Table of Contents
1.Simulation Parameters for Water Molecule and Hydrogen Bond
1.1. Water Molecule
1.2. Hydrogen Bond
2.Simulation Protocol
3.Properties of Hydrogen Bond
3.1.Degree of Hydrogen Bond
3.2.Hydrogen Bond Lifetime
2.Hydrogen Bond Network in Liquid Water
2.1. Hydrogen Bond Lifetime
2.2. Complex Network Analysis
3.Water Bridges
3.1. Water Bridge Calculation
7.2Protein In Water Simulation
7.2Water Network & Water Bridge
7.2.1 Complex Network Analysis
1.Simulation Parameters for Water Molecule and Hydrogen Bond
Water molecule is V-shaped tetrahedrally arranged through sp3-hybridization. The ideal bond angle for perfect tetrahedral structure is 109.5º, but in case of water , the lone pairs are not positioned as they are with covalent bond, so the negative charge is more "evenly smeared out" between these two lines, this effect makes the hydrogen bonding less stringent and more flexible in angles.
The bond length of OH and angle HOH increase as from gaseous state to solid ice state.
gas | liquid | solid* | |
---|---|---|---|
r(OH) Å | 0.95718 | ~0.990 | ~1.01 |
HOH deg | 104.474° | ~105.5° | ~109.5° |
* https://en.wikipedia.org/wiki/Ice_Ih
below are two water models in my research:
TIP3P | SPC/E | |
---|---|---|
r(OH) Å | 0.9572 | 1.0 |
HOH deg | 104.52 | 109.47 |
q(O) | -0.834 | -0.8476 |
q(H) | +0.417 | +0.4238 |
Hydrogen bond is NOT just dipole-dipole interaction OR ion-dipole interaction, it is a weak covalent bond in which hydrogen bridges two different molecules, it is a partial covalent bond formation between H and A......, see IUPAC definition of hydrogen bond here.
The mean donor-acceptor distances in protein secondary structure elements are close to 3.0 Å, (Jeffrey). Jeffrey categorizes hbonds with donor-acceptor distances of 2.2-2.5 as "strong, mostly covalent", 2.5-3.2 Å as "moderate, mostly electrostatic", and 3.2-4.0 Å as "weak, electrostatic". Energies are given as 40-15, 14-4, and <4 kcal/mol respectively. Most hydrogen bonds in proteins are in the moderate category.
The parameters vary from their environment, hydrogen bonding, solutes, ions, etc..The molecular simulation employ fixed parameters which is not actual situation in reality, so we need to show some flexibility when we extract and analyze our data. In this research, we choose geometric definition and employ distance of O-H-O as 3.5Å, and angle between 150º~180º.
We will be using CHARMM, NAMD and AMBER to run the simulation, the detailed information can be found in their linked pages.
timestep | 0.002ps |
cutoff | 14 |
NPT | 1atm |
temp | 300K |
shake | off |
STEP1: load production trajectory into vmd and calculate all hbonds.
vmd md.psf md.dcd
open tcl console and type:
source ~/toolset/water_bridge.tcl
start_hbonding -water
Hbond information will be saved in ./hbonds_water.txt.
STEP2:Calculate degree of water hbonding.
R
>source("~/toolset/deg.R")
("watch -n 1 cat tmp" to watch the progress of calculation)
this will save degree information in deg.txt:
0.1626174 0.3690746 0.313681 0.1305475 0.02332362 0.0007558579 0
.....
every line indicate average frequency of 0,1,2,3,4,5,6 hydrogen bonds in one frame.
now, we plot the degree of hbonding:
R:
z<-read.table("deg.txt")
z.means<-colMeans(z)
df=data.frame(z.means)
df$degree<-c(0,1,2,3,4,5,6,7)
ggplot(df,aes(x=degree,y=z.means))+geom_bar(stat="identity",fill="red")+theme_bw()+xlab("degree of hydrogen bonding")+ylab("fraction")+theme(axis.text=element_text(size=20),axis.title=element_text(size=24,face="bold"))+scale_x_continuous(breaks=c(0,1,2,3,4,5,6,7))
library(data.table)
library(ggplot2)
library(RColorBrewer) #display.brewer.all() to see all color palettes
filelist<-c("./Pure_9261/deg.txt","./NaCl_1/deg.txt","./Na20Cl20/deg.txt","./Mg20Cl40_1/deg.txt","./Al20Cl60_1/deg.txt","./Na650Cl650/deg.txt","./Na1300Cl1300/deg.txt")
temp <- lapply(filelist, fread, sep=" ")
data<-melt(mapply(c, colMeans(temp[[1]]), colMeans(temp[[2]]),colMeans(temp[[3]]),colMeans(temp[[4]]),colMeans(temp[[5]]),colMeans(temp[[6]]),colMeans(temp[[7]])))
for (i in 1:nrow(data)) {if (data$Var1[i] ==1) {data$Var1[i]<-"Pure_9261"} else if (data$Var1[i] == 2) {data$Var1[i]<-"NaCl"} else if (data$Var1[i] == 3) {data$Var1[i]<-"Na20Cl20"} else if (data$Var1[i] ==4) {data$Var1[i]<-"Mg20Cl40"} else if (data$Var1[i] == 5) {data$Var1[i]<-"Al20Cl60"} else if (data$Var1[i] == 6) {data$Var1[i]<-"Na650Cl650"} else if (data$Var1[i] == 7) {data$Var1[i]<-"Na1300Cl1300"} }
for (i in 1:8) {nam<-paste("V",i,sep="");levels(data$Var2)[levels(data$Var2)==nam]<-i}
data$Var1<-factor(c("Pure_9261","NaCl","Na20Cl20","Mg20Cl40","Al20Cl60","Na650Cl650","Na1300Cl1300"),levels=c("Pure_9261","NaCl","Na20Cl20","Mg20Cl40","Al20Cl60","Na650Cl650","Na1300Cl1300"))
ggplot(data,aes(x=Var2,y=value,fill=Var1))+geom_bar(stat="identity",position="dodge")+labs(title="Comparison of Degree of HBonds",x="Degree of HBonds",y="Frequency")+theme(axis.title=element_text(size=20),axis.text=element_text(size=12),legend.key.size=unit(2.5,"cm"),legend.title=element_blank(),plot.title=element_text(size=30))+scale_fill_brewer(palette = "Greens")
As we see from graph above, the degree of hbonds in bulk water show a normal distribution like pattern, the pattern was altered by adding ions into it. Adding ions into the the box significantly increase the population of hydrogen bonds with less degree, while decreasing the population of hydrogen bonds with higher degree except Mg ion which show a similar pattern as bulk water. The high the concentration the distribution shift to less degree of hydrogen bonds.
One factor of dynamics of H-bond network is hydrogen bond lifetime,The relaxation time for water dipole reorientation at room temperature is about 10ps, the average hydrogen bond lifetime will roughly agree with (or smaller than) this value because reorientation is impossible without breaking at least one hbond. The picture below is calculation of average hbond lifetime () from reference 5. and 8.

( is the mean of the distribution of hbond lifetime P(t), which measures the probability that an initially bonded pair remains bonded at all times up to time t, and breaks at t. P(t) is obtained from simulations by building a histogram of hbond lifetimes for each configuration, .
In this research, for simplicity, we just calculate the discontinuous hbond lifetime, that is we sum up all times that a certain hydrogen bond spans over the whole trajectory no matter continuous or discontinuous in between.
Amber simulation of Pure water and water with salt in it can be found here.
Pure_6859(6859 water molecules)
Pure_8000(8000 water molecules)
Pure_9261(9261 water molecules)
NaCl(9259 water molecules, 1 Na+, 1 Cl-, 27779 atoms total,~0.007M)
Na20Cl20(9221 water molecules,20 Na+, and 20 Cl-,27703 atoms total,~0.13M)
Mg20Cl40(9261 water molecules,20 Mg2+, and 40Cl-, 27663 atoms total,~0.13M)
Al20Cl60(9181 water molecules, 20 Al3+, and 60 Cl-, 27623 atoms total, ~0.13M)
Na650Cl650(7961 water molecules, 650 Na+, 650 Cl-,25183 atoms total,~4.1M)
Na1300Cl1300(6661 water molecules, 1300 Na+, 1300 Cl-,22583 atoms total, ~8.2M)
Amber simulation of "TOP100" proteins can be found here. To study the influence of proteins on hydrogen bond network, we performed molecular simulation of 96 proteins from dynameo benchmark, each protein was downloaded from PDB data bank, and only amino residues were kept, and blocked by ACE and NME at the N-terminal and C-terminal respectively.
Following are steps to analyze the lifetime(discontinuous) of hydrogen bonds in pure water and salt solutions, run the same commands in each directory:
vmd water_box.prmtop -netcdf md.mdcrd
(delete beginning 9000 frames by choosing last=8999, we will only analyze last 1000 frames)
tcl/tk windows:
source ~/toolset/water_bridge.tcl
start_hbonding -water -residence
bash shell:
residence4.pl hbonds_residence_all.txt [name_of_system] 1
copy loophbond.sh loophbond.tcl to ./TOP100.
./loophbond.sh
./loopresidence4R.sh
cat *_hbonds_residence_all_R.txt>tmp.txt
awk '{$1="protein";print $0}' tmp.txt > protein_hbonds_residence_all_R.txt
R:
closeAllConnections()
rm(list=ls())
library(data.table)
library(ggplot2)
filelist<-c("./Pure_6859/hbonds_residence_all_R.txt","./Pure_8000/hbonds_residence_all_R.txt","./Pure_9261/hbonds_residence_all_R.txt","./NaCl_1/hbonds_residence_all_R.txt","./Na20Cl20_1/hbonds_residence_all_R.txt","./Mg20Cl40_1/hbonds_residence_all_R.txt","./Al20Cl60_1/hbonds_residence_all_R.txt","./Na650Cl650/hbonds_residence_all_R.txt","./Na1300Cl1300/hbonds_residence_all_R.txt","../../../work2/TOP100/protein_hbonds_residence_all_R.txt")
temp <- lapply(filelist, fread, sep=" ")
data<-melt(temp)
Na1300<-subset(data,V1=='Na1300Cl1300')
Na650<-subset(data,V1=='Na650Cl650')
rest<-subset(data,V1!='Na650Cl650' & V1!='Na1300Cl1300')
medx_Na1300<-median(Na1300$value)
medx_Na650<-median(Na650$value)
medx_rest<-median(rest$value)
ggplot(data,aes(x=value,colour=V1,fill=V1))+geom_density(alpha=0)+scale_colour_manual(values=c("black","black","red","black","green","black","blue","black","black","black"))+xlim(0,100)+theme(axis.title=element_text(size=20),axis.text=element_text(size=12),legend.key.size=unit(2.5,"cm"),legend.title=element_blank(),plot.title=element_text(size=30))+labs(title="lifetime of HBonds",x="steps(in hundreds)")+geom_vline(xintercept=medx_Na1300,colour="red",linetype="longdash")+geom_vline(xintercept=medx_Na650,colour="green",linetype="longdash")+geom_vline(xintercept=medx_rest,colour="black",linetype="longdash")+annotate("text",x=9,y=0.01,label="10",angle=90)+annotate("text",x=11,y=0.01,label="12",angle=90)+annotate("text",x=15,y=0.01,label="15",angle=90)
As we see from the graph above,majority of hbonds lasts within 5000 steps which is 10ps, the average lifetime for most system except Na1300Cl1300 and Na560Cl650 is about 10 steps(=10*0.002ps=0.02ps), while the average lifetime for Na1300Cl1300 and Na560Cl650 are 12 steps (12*0.002ps=0.024ps) and 16 steps (16* 0.002ps=0.036ps) respectively. As the ionic strength increase, the distribution of lifetime shift to longer life time, which means the existence of ions in solution extend the lifetime of hbonds. Since this is statistics about whole box of waters, a very small portion of ions will not significantly change the lifetime of hydrogen bonds of most water molecules inside the box, and obviously the existence of protein will not influence the lifetime of hydrogen bonds in bulk water as well.
To further invesigate the influence of solutes to hbonds lifetime, we need to take closer look at hydrogen bonds which are adjacent to solute molecules. To realize this, we invent a variale called adjacent factor, which takes the form of equation below:
in which:
0<base# <1
distance_to_# is the shortest distance from hbond donor to the terminal carbon atoms of classified groups which shows in the table below:
name description
cations AL,MG,NA
ions CL
protein_A Aminoacids with Positively Charged Side Chain:Arg,His or HIE,Lys
protein_B Aminoacids with Negtively Charged Side Chain:Asp,Glu
protein_C Aminoacids with Polar Uncharged Side Chain:Ser,Thr,Asn,Gln
protein_D Special Cases:Cys,Gly,Pro
Protein_E Aminoacids with Hydorphobic Side Chain:Ala,Ile,Leu,Met,Phe,Trp,Tyr,Val
Accumulated adjacent factor:
"It appears that although the occupancy and residence times of the majority of sites are rather bulk-like, the residence time distribution is shifted toward the longer components, relative to bulk. The sites with particularly long residence times are located only in the cavities and clefts of the protein. This indicates that other factors, such as hydrogen bonds and hydrophobicity of underlying protein residues, play a lesser role in determining the residence times of the longest-lived sites." (7)
filelist<-c("./Pure_9261/Pure9261_hbonds_water_residence.txt","./NaCl_1/NaCl_hbonds_water_residence.txt","./Na20Cl20_1/Na20Cl20_hbonds_water_residence.txt","./Mg20Cl40_1/Mg20Cl40_hbonds_water_residence.txt","./Al20Cl60_1/Al20Cl60_hbonds_water_residence.txt","./Na650Cl650/Na650Cl650_hbonds_water_residence.txt","./Na1300Cl1300/Na1300Cl1300_hbonds_water_residence.txt")
temp <- lapply(filelist, fread, sep=",",select=4:5)
data<-melt(temp,id.vars='V4')
data$L1<-as.character(data$L1)
ggplot(data,aes(x=V4,y=value,color=L1))+geom_point()+scale_colour_manual(values=c("red","yellow","orange","gray","green","blue","black"))
2.Hydrogeon Bond Network in Liquid Water
Hydrogen bonded networks play an important role in determining the physical properties of many molecular liquids and solids, and play a crucial role in the structure and fuction of most biomolecules, Among these, water is the most extensively studied molecule.(6,13,14) First Paper I found talk about water network in liquid water is ref.10, this paper discuss the distribution of non-short-circuited hydrogen-bond polygons. This hydrogeon bond network become complicated due to the variety ionization form of water molecules. The hydrogen ion must exist as ions with formula H3O+(H2O)n, H3O+(hydroxonium ion, hydronium ion is obsolete) H5O2+ H9O4+ H13O6+. Below are some popular hydrogen ions and hydroxide ions.
The dynamics of the H-bond network of water is believed to play a key role in many processes in physics, chemicstry, and biology. Experiments(depolarized rayleigh scattering, quasi-elastic incoherent neutron scattering) and MD simulation were employed to study the lifetime of H-bonds(9).
"In the context of network theory, a complex network is a graph (network) with non-trivial topological features—features that do not occur in simple networks such as lattices or random graphs but often occur in graphs modelling of real systems. "(https://en.wikipedia.org/wiki/Complex_network). The reason we perform complex network analysis on water network is to seek if there is any "non-trivial" substructures in water network which makes water unique thus become the popular solvent for biological processes.
The most popular network analysis tools include:
1.Boost Graph Library(BGL)http://www.boost.org/doc/libs/1_42_0/libs/graph/
2.QuickGraph http://www.codeplex.com/quickgraph
3.igraph http://igraph.sourceforge.net/
4.NetworkX http://networkx.lanl.gov/
Here we perform water network analysis by igraph in R.
STEP1: load trajectory into vmd and calculate hbond pairs, skip to step2 if this has already been done.
vmd md_npt.psf skip10000.dcd
open tcl console and type:
source ~/toolset/water_bridge.tcl
start_hbonding -water
This will save hbond information in ./current directory/hbonds_all.txt.
subset(data,L1=="7" & variable=="V1")
z<-read.table("deg.txt")
z.means<-colMeans(z) or apply(z,2,mean)
error.bar <- function(x, y, upper, lower=upper, length=0.1,...){
+if(length(x) != length(y) | length(y) !=length(lower) | length(lower) != length(upper))
+stop("vectors must be same length")
+arrows(x,y+upper, x, y-lower, angle=90, code=3, length=length, ...)
+}
z.sd<-apply(z,2,sd)
barx<-barplot(z.means,names.arg=0:7,axis.lty=1,ylim=c(0,0.5),xlab="number of hbonds",ylab="density",font.lab=2,cex.lab=2,col="red",main="hbonds of 100 frames(skip 10000)",cex.main=2)
error.bar(barx,z.means,z.sd)
3.1. Water Bridges Calculation
Water as a popular solvent plays a key role in biological reactions, for example protein folding and protein recoganition. The mechanism of these process has not been fully understood yet, a lot of focus has been given to protein as a big solute molecule, and shed much less light on this abundant and small molecule. In this part we will investigate the role of water molecules in stabilizing folded protein and to explore the characteristics of water bridge. Compared to direct hydrogen bond, interactions through water bridges are considered to be more weak, and the contribution of direct hydrogen bonds between amino residues to gibbs free energy to be around 25 kJ mol-1, the contribution of indrect hydrogen bond through one water molecule is about -10.5kJ mol-1, (page 248, ref. 18.).
In this study, I wrote a vmd script(water_bridge.tcl) to find all hydrogen bond information, then calculate water bridges that span over amino residues along the trajectory. Here we only calculate water bridge up to 3 water bridges for the sake of saving calculation time.
copy loophbondbridge.sh loophbondbridge.tcl to ./work2/TOP100/
./loophbondbridge.sh
this will generate *_bhonds_bridge.txt & *_hbonds_bridge_residence.txt
the formats of *_hbonds_bridge_residence.txt is:
atom index, atom index, atom name, atom name, residue name,residue name, resid, resid, backbone type(BB,BS,SB,SS),frames,total number of frames, bridge score
6,842,N,OG,ALA,SER,2,54,BS,3 5 8,3,0.125
*bridge score:
direct HB:1.0
one water bridge:0.5
two water bridge:0.25
three water bridge:0.125.
final bridge score = sum of bridge score/total frames that bridge exists
R
library(data.table)
filelist<-list.files(path=".",pattern="*_hbonds_bridge_residence.txt")
temp <- lapply(filelist, fread, sep=",",select=c(9,11))
data<-melt(temp)
tmp1<-aggregate(value~V9,data,sum)
names(tmp1)<-c("type","frame_sum")
tmp1$frequency<-tmp1$frame_sum/(sum(data$value))
tmp1[2,2:3]<-tmp1[2,2:3]+tmp1[3,2:3]
tmp1[2,1]<-"BS/SB"
tmp2<-tmp1[rownames(tmp1) != "3",]
type frame_sum frequency 1 BB 652440 0.3797902 2 BS/SB 613165 0.3569279 4 SS 452291 0.2632819 |
ggplot(tmp2,aes(x=type,y=frequency))+geom_bar(stat="identity",fill="red")+labs(title="Frequency of HBonds Type",x="hbonds type")
postscript("frequency_of_hydrogen_bond_type.eps",onefile=FALSE,horizontal=FALSE,paper="special",width=5,height=5)
ggplot(tmp2,aes(x=type,y=frequency))+geom_bar(stat="identity",fill="red")+labs(x="bridge type",y="frequency density")
dev.off()
Above is the frequency of appearing of hbonds between backbone and sidechain. As we can see, backbone to backbone are a little bit more popular than backbone sidechain and sidechain sidechain has least population.
library(data.table)
filelist<-list.files(path=".",pattern="*_hbonds_bridge_residence.txt")
temp <- lapply(filelist, fread, sep=",",select=c(9,12))
data<-melt(temp)
tmp1<-aggregate(value~V9,data,sum)
names(tmp1)<-c("type","score_sum")
tmp1[2,2:3]<-tmp1[2,2:3]+tmp1[3,2:3]
tmp1[2,2] <- tmp1[2,2]+tmp1[3,2]
tmp1[2,1]<-"BS/SB"
tmp1$average_score[tmp1$type=="BB"]<-tmp1$score_sum[tmp1$type=="BB"]/(nrow(data[data$V9=="BB",]))
tmp1$average_score[tmp1$type=="BS/SB"]<-tmp1$score_sum[tmp1$type=="BS/SB"]/(nrow(data[data$V9=="SB",])+nrow(data[data$V9=="SB",]))
tmp1$average_score[tmp1$type=="SS"]<-tmp1$score_sum[tmp1$type=="SS"]/(nrow(data[data$V9=="SS",]))
tmp2<-tmp1[rownames(tmp1) != "3",]
type score_sum average_score 1 BB 18326.82 0.4619003 2 BS/SB 29170.66 0.4026538 4 SS 19203.35 0.3504069 |
Above is average bridge score of each type, backbone tend to form direct hydrogen bond and less water bridge than side chain.
temp <- lapply(filelist, fread, sep=",",select=c(5,6,12))
data<-melt(temp)
tmp1<-aggregate(data$value,by=list(data$V5,data$V6),sum)
source("residue_residue_bridge.R")
#!/usr/bin/Rscript\ df<-{} df<-data.frame(RESI1=character(0),RESI2=character(0),value=numeric(0)) for (i in 1:nrow(tmp1)) { print(i) a<-0 if(tmp1[i,1]==tmp1[i,2]) { df<-rbindlist(list(df,as.list(c(tmp1[i,1],tmp1[i,2],tmp1[i,3])))) } else { for (j in 1:nrow(tmp1)) { if((tmp1[i,1]==tmp1[j,2])&(tmp1[i,2]==tmp1[j,1])&(tmp1[i,1] a<-1 break } else if ((tmp1[i,1]==tmp1[j,2])&(tmp1[i,2]==tmp1[j,1])&(tmp1[i,1]>tmp1[i,2])) { a<-1 break } else { } } if (a==0) {df<-rbindlist(list(df,as.list(c(tmp1[i,1],tmp1[i,2],tmp1[i,3]))))} } } |
library(ggthemes)
theme_set(theme_solarized())
df$value<-as.numeric(as.character(df$value))
d<-transform(df,new=round(value/5250.5,digits=2))
levels(df$RESI1)<-list("ALA"="ALA","ARG"="ARG","ASN"="ASN","ASP"="ASP","CYS"="CYS","CYX"="CYX","GLN"="GLN","GLU"="GLU","GLY"="GLY","HID"="HID","HIE"="HIE","HIP"="HIP","ILE"="ILE","LEU"="LEU","LYS"="LYS","MET"="MET","PHE"="PHE","PRO"="PRO","SER"="SER","THR"="THR","TRP"="TRP","TYR"="TYR","VAL"="VAL")
levels(df$RESI2)<-list("ALA"="ALA","ARG"="ARG","ASN"="ASN","ASP"="ASP","CYS"="CYS","CYX"="CYX","GLN"="GLN","GLU"="GLU","GLY"="GLY","HID"="HID","HIE"="HIE","HIP"="HIP","ILE"="ILE","LEU"="LEU","LYS"="LYS","MET"="MET","PHE"="PHE","PRO"="PRO","SER"="SER","THR"="THR","TRP"="TRP","TYR"="TYR","VAL"="VAL")
ggplot(data=d,aes(x=RESI1,y=RESI2,fill=new)) + geom_tile()+geom_text(aes(RESI1,RESI2,label = new), color = "#073642", size = 4)+scale_fill_gradient(name=expression("scale"), low = "#fdf6e3", high = "steelblue",breaks=seq(0, 1, by = 0.1),limits = c(0.0, 1.0))
Hydrogen bonded networks play an important rold in determining the phsical properties of many molecular liquids and solids, and play a crucial role in the structure and fuction of most biomolecules, Among these, water is the most extensively studied molecule.(6) First Paper I found talk about water network in liquid water is ref.10, this paper discuss the distribution of non-short-circuited hydrogen-bond polygons.
The hydrogen ion can not survive alone in the solution, it must exist as ions with formula H3O+(H2O)n, H3O+(hydroxonium ion, hydronium ion is obsolete) H5O2+ H9O4+ H13O6+ H3O+ can donate three hydrogen bonds and accept almost none, the strength of these donated hydrogen bonds being over twice as strong as those between H2O molecules in bulk water. The presence of an H3O+ ion influence at least 100 surrounding water molecules.
We will perform water simulation by using charmm,namd and amber14 for comparison.
/home/zwei/work/MD/WATERBOX_charmm/
/home/zwei/work/MD/WATERBOX_namd/
/home/zwei/work/MD/WATERBOX_amber/
A water box which has 9261 water molecules will be created by charmm:
charmm < water.inp (http://gene.bio.jhu.edu/charmmscripts/waterbox.inp)
This will creat a water box which has 21*21*21=9261 water molecules. output files:water_box.pdb/crd.
Next we minimize the system and then slowly heating the water box from 50K to 300K.
mpirun -np 12 charmm < heat_euqi_nvt.inp
*CHARMM
!open unit 1 card name /home/zwei/softwares/c37a2/toppar/top_all27_prot_na.rtf
open unit 1 card name water_box.pdb
mini sd nprint 10 inbfreq -1 nstep 1000 open unit 11 write file name heat_equi.dcd
shake bonh
open unit 1 write card name heat_equi.psf
open unit 1 write card name heat_equi.crd
open unit 1 write card name heat_equi.pdb |
Then we perform NPT equilibration on the system:
*CHARMM
open unit 1 card name /home/zwei/softwares/c37a2/toppar/top_all27_prot_na.rtf
open unit 1 card name heat_equi.pdb
crystal define cubic 64.0 64.0 64.0 90.0 90.0 90.0 !mini abnr nstep 1000 nprint 100 lattice open unit 11 write file name md_npt.dcd
dyna cpt start timestep 0.002 nstep 1000000 -
open unit 1 write card name md_npt.psf
open unit 1 write card name md_npt.crd |
Here we set nsavc=1, which is smaller than normally set values(50,100), This simulation will create ~300G trajectory file which inlude 2 ns of NPT pute water box simulation.
We can use catdcd to extract whatever frames we want to analyze from the original trajectory:
example: extract last 10000 frames:
catdcd -o last10000.dcd -first 990001 md_npt.dcd.
or 100 frames by skiping 10000 frames:
catdcd -o skip10000.dcd -stride 10000 md_npt.dcd
namd input configuration file:
# input
# Temperature control
# Simulation space partitioning
#PBC
#PME
# Multiple timestepping
|
Amber input scripts:
Minimization:
Minimization of water box &cntrl imin = 1, ! Turn on minimization ncyc=1000, ! If ncyc<maxcyc steepest descent before switching to conjugated gradient for remaining maxcyc = 5000, ! Total number of minimization cycles cut = 8, ! Nonbonded cutoff in angstroms ntpr=100, ! Print frequency / |
pmemd.cuda -O -i min.mdin -o min.out -c water_box.inpcrd -p water_box.prmtop -r min.rst
Heating:
heating and equilibrating &cntrl imin=0, irest=0, ntx=1,ntxo=2, ntpr=100, ntwx=100, nstlim=20000, dt=0.002, ntt=3, tempi=50.0, temp0=300, gamma_ln=5.0, ig=-1, ntc=2, ntf=2, cut=8, iwrap=1, ioutfm=1,ntrelax=100, / &wt type='TEMP0', istep=50.0,istep2=5000,value1=0.0,value2=300.0 / &wt type='TEMP0', istep=5001,istep2=20000,value1=300.0,value2=300.0 / &wt type='END' / |
pmemd.cuda -O -i heat.mdin -o heat.out -c min.rst -p water_box.prmtop -r heat.rst
Production Run:
Explicit solvent production run &cntrl imin=0, irest=1, ntx=5,ntxo=2, ntpr=100, ntwx=100, nstlim=1000000, dt=0.002, ntt=3,ntp=1, temp0=300, gamma_ln=5.0, ig=-1, ntc=2, ntf=2, cut=8, iwrap=1, ioutfm=1,ntrelax=100, / |
mpirun -np 2 pmemd.cuda.MPI -O -i md.mdin -o md.out -c heat.rst -p water_box.prmtop -r md.rst -x md.mdcrd
7.2 Protein In Water Simulation
7.2.1 1E0Q
/home/zwei/work/MD/1E0Q/amber
pdb4amber --model=1 --noter -i 1E0Q.pdb -o tmp.pdb
edit tmp.pdb(acetylation ACE and amidation NME)
move H3 to top of the file and replace H3 to C and residue name to ACE.
ATOM 11 H3 MET A 1 -14.434 0.885 -2.035 1.00 0.00 H
to
ATOM 11 C ACE A 1 -14.434 0.885 -2.035 1.00 0.00 C
&& move atom OXT from last residue to bottom of the file and replace OXT to N, residue name to NME
ATOM 276 OXT VAL A 17 -14.136 -2.108 2.130 1.00 0.00 O
to
ATOM 276 N NME A 17 -14.136 -2.108 2.130 1.00 0.00 N
tleap -f tleap.in
mpirun -np 12 pmemd.MPI -O -i min.mdin -c 1e0q_box.inpcrd -p 1e0q_box.prmtop -r min.rst
pmemd.cuda -O -i heat.mdin -o heat.out -c min.rst -p 1e0q_box.prmtop -r heat.rst -x heat.mdcrd
mpirun -np 2 pmemd.cuda.MPI -O -i md.mdin -o md.out -c heat.rst -p 1e0q_box.prmtop -r md.rst -x md.mdcrdmpirun -np 2 pmemd.cuda.MPI -O -i md.mdin -o md.out -c heat.rst -p 1e0q_box.prmtop -r md.rst -x md.mdcrd
TOP100 proteins bechmark:
copy loopwaterbridge.tcl, loop.sh, pipeline.sh to ./TOP100,
pipeline.sh(to prepare and generate trajectories)
loop.sh(to analyze the last 100 frames, and generate *_bridge_residence.txt *_bridges.txt)
7.2Water Network & Water Bridge
Water is a so common but most fascinating substances in the universe(12). It possesses a number of anomalous properties(high melting and boiling points relative to water's small molecular weight, water's expansion of volume upon freezing). All of the anomalous properties of water are connected to the fact that water forms an extensive hydrogen-bond network.
7.2.1 Complex Network Analysis
"In the context of network theory, a complex network is a graph (network) with non-trivial topological features—features that do not occur in simple networks such as lattices or random graphs but often occur in graphs modelling of real systems. "(https://en.wikipedia.org/wiki/Complex_network). The reason we perform complex network analysis on water network is to seek if there is any "non-trivial" substructures in water network which makes water unique thus become the popular solvent for biological processes.
The most popular network analysis tools include:
1.Boost Graph Library(BGL)http://www.boost.org/doc/libs/1_42_0/libs/graph/
2.QuickGraph http://www.codeplex.com/quickgraph
3.igraph http://igraph.sourceforge.net/
4.NetworkX http://networkx.lanl.gov/
Here we use igraph in R to analyze the water network structure.
STEP1: load trajectory into vmd and calculate hbond pairs
vmd md_npt.psf skip10000.dcd
open tcl console and type:
source ~/toolset/water_bridge.tcl
start_hbonding -water
This will save hbond information in ./current directory/hbonds.txt with the format as follows,
0 25443 18441
....
*0 is the frame number,
255443 and 18443 are index of water oxygens.
STEP2:Calculate degree of water hbonding.
R
>source("deg.R")
#!/usr/bin/Rscript\ library(igraph) rm(list=ls()) if(file.exists("deg.txt")){file.remove("deg.txt")} for (i in seq(from=0,to=99,by=1)) { name<-paste0("g",(i)) name1<-paste0("net",(i)) name2<-paste0("deg",(i)) name3<-paste0("H",(i)) write(i,file="tmp") assign(name,read.table(pipe("awk -v var=`cat tmp` '{if ($1==var) print $2,$3}' hbonds.txt"))) assign(name1,graph_from_data_frame(eval(as.symbol(name)),directed=F)) for (j in seq(0,27781,by=3)) {if(!j %in% V(eval(as.symbol(name1)))$name) {assign(name1,add_vertices(eval(as.symbol(name1)),1,name=j))}} assign(name2,degree(eval(as.symbol(name1)),mode="all")) assign(name3,hist(eval(as.symbol(name2)),plot=FALSE,breaks=c(-0.5,0.5,1.5,2.5,3.5,4.5,5.5,6.5,7.5))) write(eval(as.symbol(name3))$density,file="deg.txt",append=TRUE,ncolumns=8) |
("watch -n 1 cat tmp" to watch the progress of calculation)
this will save degree information in deg.txt:
0.1626174 0.3690746 0.313681 0.1305475 0.02332362 0.0007558579 0
.....
every line indicate average frequency of 0,1,2,3,4,5,6 hydrogen bonds in one frame.
now, we plot the degree of hbonding:
z<-read.table("deg.txt")
z.means<-colMeans(z) or apply(z,2,mean)
error.bar <- function(x, y, upper, lower=upper, length=0.1,...){
+if(length(x) != length(y) | length(y) !=length(lower) | length(lower) != length(upper))
+stop("vectors must be same length")
+arrows(x,y+upper, x, y-lower, angle=90, code=3, length=length, ...)
+}
z.sd<-apply(z,2,sd)
barx<-barplot(z.means,names.arg=0:7,axis.lty=1,ylim=c(0,0.5),xlab="number of hbonds",ylab="density",font.lab=2,cex.lab=2,col="red",main="hbonds of 100 frames(skip 10000)",cex.main=2)
error.bar(barx,z.means,z.sd)
STEP3:Calculation Commnunity Of Water Network
>source("community.R")
#!/usr/bin/Rscript\
if(file.exists("community_100.txt")){file.remove("community_100.txt")} |
This will save all communities into file community.txt, and community_100.txt, which includes commmunities that has more than 100 nodes, the first two columns are frame number and index of communities. Now let's look what the community looks like in water network.
in vmd console, type:
source water_community.tcl
start_community
Comparison between different simulation package:
datatable<-data.frame(
+V1=c(2.433754e-02,0.0360760180,0.0322772921)
+,V2=c(1.370284e-01,0.1746074940,0.1646366490)
+,V3=c(3.078944e-01,0.3371266640,0.3321099220)
+,V4=c(3.465382e-01,0.3134164810,0.3223874270)
+,V5=c(1.707310e-01,0.1280153320,0.1375423850)
+,V6=c(1.337652e-02,0.0106241232,0.0109156685)
+,V7=c(9.286254e-05,0.0001338948,0.0001306554)
)
barplot(as.matrix(datatable),beside=T,col=c("red","green","blue"),names.arg=0:6,xlab="number of hbonds",font.lab=2,cex.lab=2,ylab="density",main="degree of hbonds",cex.main=2)
red:charmm;green:namd;blue:amber
Below is result from reference 6.
datatable<-data.frame(
+V1=c(0.9991561979,5.282691e-02,0.0248968788)
+,V2=c(0.0008438001,2.185164e-01,0.1351150000)
+,V3=c(0.0000000000,3.585423e-01,0.3107580190)
+,V4=c(0.0000000000,2.752068e-01,0.3439336970)
+,V5=c(0.0000000000,9.156355e-02,0.1715516740)
+,V6=c(0.0000000000,3.320376e-03,0.0136108405)
+,V7=c(0.0000000000,2.375553e-05,0.0001338948)
+,V8=c(0.0000000000,0.000000e+00,0.0000000000)
)
mar.default <- c(5,4,4,2) + 0.1
par(mar = mar.default + c(0, 4, 0, 0))
png("degree_of_freedom.png",width=800,height=800)
barplot(as.matrix(datatable),beside=T,col=c("red","green","blue"),names.arg=0:7,xlab="number of HB neighbours",font.lab=2,cex.lab=2,ylab="Fraction of bonded neighbours",main="",cex.main=2)
dev.off()
in thesis calculation of hbond lifetime:/home/zwei/work2/WATERBOX_charmm/last_1000
hbond_lifetime.pl -inx 0:999 hbonds_water.txt hbond_lifetime.dat
gnuplot lifetime.gnu
plot network
plot(net0, edge.arrow.size=.4,vertex.label=NA,vertex.shape="none",layout=layout.auto)
Let's take a look at distribution of water communities:
in R terminal:
library(data.table)
library(ggplot2)
tmp<-count.fields("community.txt")
tmptable<-data.table(x=c(tmp),y=1)
tmptablex<-tmptable[order(x)]
ddd<-tmptablex[,sum(y*x),by=x]
write.table(ddd,file="community_stat.txt")
g<-read.table("community_stat.txt")
barplot(prop.table(g$V1),names.arg=g$x,main="density of water communities",col="red")
m<-ggplot(g,aes(x=g$x,y=g$V1))
m+geom_bar(aes(y=g$V1/sum(g$V1)),stat="identity")+xlab("community size")+ylab("population")+ggtitle("population of different community size")
Let's add cumulative density to the graph
m+geom_bar(aes(y=g$V1/sum(g$V1)),stat="identity")+xlab("community size")+ylab("population")+ggtitle("population of different community size")+geom_line(aes(y=cumsum(g$V1/sum(g$V1))))
now we compare water box with ions and proteins in it.
Pure<-count.fields(pipe("cat ./Pure1/community.txt ./Pure2/community.txt ./Pure3/community.txt"))
puretablex<-puretable[order(x)]
puresum<-puretablex[,sum(y*x),by=x]
Na20Cl20<-count.fields(pipe("cat ./Na20Cl20_1/community.txt ./Na20Cl20_2/community.txt ./Na20Cl20_3/community.txt"))
Na20Cl20table<-data.table(x=c(Na20Cl20),y=1)
Na20Cl20tablex<-Na20Cl20table[order(x)]
Na20Cl20sum<-Na20Cl20tablex[,sum(y*x),by=x]
Mg20Cl40<-count.fields(pipe("cat ./Mg20Cl40_1/community.txt ./Mg20Cl40_2/community.txt ./Mg20Cl40_3/community.txt"))
Mg20Cl40table<-data.table(x=c(Mg20Cl40),y=1)
Mg20Cl40tablex<-Mg20Cl40table[order(x)]
Mg20Cl40sum<-Mg20Cl40tablex[,sum(y*x),by=x]
Al20Cl60table<-data.table(x=c(Al20Cl60),y=1)
Al20Cl60tablex<-Al20Cl60table[order(x)]
Al20Cl60sum<-Al20Cl60tablex[,sum(y*x),by=x]
eq<-count.fields(pipe("cat /home/zwei/work/MD/1E0Q/amber/Pure1/community.txt /home/zwei/work/MD/1E0Q/amber/Pure2/community.txt /home/zwei/work/MD/1E0Q/amber/Pure3/community.txt"))
eqtable<-data.table(x=c(eq),y=1)
eqtablex<-eqtable[order(x)]
eqsum<-eqtablex[,sum(y*x),by=x]
m<-ggplot()+geom_line(data=puresum,aes(x=puresum$x,y=puresum$V1/sum(puresum$V1)),color="green")+geom_line(data=Na20Cl20sum,aes(x=Na20Cl20sum$x,y=Na20Cl20sum$V1/sum(Na20Cl20sum$V1)),color="red")
m+geom_line(data=puresum,aes(x=puresum$x,y=cumsum(puresum$V1/sum(puresum$V1))),color="blue")+geom_line(data=Na20Cl20sum,aes(x=Na20Cl20sum$x,y=cumsum(Na20Cl20sum$V1/sum(Na20Cl20sum$V1))),color="green")+geom_line(data=Mg20Cl40sum,aes(x=Mg20Cl40sum$x,y=cumsum(Mg20Cl40sum$V1/sum(Mg20Cl40sum$V1))),color="yellow")+geom_line(data=Al20Cl60sum,aes(x=Al20Cl60sum$x,y=cumsum(Al20Cl60sum$V1/sum(Al20Cl60sum$V1))),color="orange")+geom_line(data=eqsum,aes(x=eqsum$x,y=cumsum(eqsum$V1/sum(eqsum$V1))),color="red")
We plot the density by groups of communities:
in bash terminal:
awk '{if($2 <= 100) {sum1+=$3} else if ($2<=200 && $2>100) {sum2+=$3} else if ($2<=300 && $2>200) {sum3+=$3} else if ($2<=400 && $2>300) {sum4+=$3} else if ($2<=500 && $2>400) {sum5+=$3} else if ($2<=600 && $2>500) {sum6+=$3} else if ($2<=700 && $2>600) {sum7+=$3} else {sum8+=$3}}END {print "x<=100",sum1,"\n100<x<=200",sum2,"\n200<x<=300",sum3,"\n300<x<=400",sum4,"\n400<x<=500",sum5,"\n500<x<=600",sum6,"\n600<x<=700",sum7,"\nx>700",sum8}' community_stat.txt > community_bar.txt
awk '{if($2 <= 100) {sum1+=$3*$2} else if ($2<=200 && $2>100) {sum2+=$3*$2} else if ($2<=300 && $2>200) {sum3+=$3*$2} else if ($2<=400 && $2>300) {sum4+=$3*$2} else if ($2<=500 && $2>400) {sum5+=$3*$2} else if ($2<=600 && $2>500) {sum6+=$3*$2} else if ($2<=700 && $2>600) {sum7+=$3*$2} else {sum8+=$3*$2}}END {print "x<=100",sum1,sum1/(sum1+sum2+sum3+sum4+sum5+sum6+sum7+sum8),"\n100<x<=200",sum2,sum2/(sum1+sum2+sum3+sum4+sum5+sum6+sum7+sum8),"\n200<x<=300",sum3,sum3/(sum1+sum2+sum3+sum4+sum5+sum6+sum7+sum8),"\n300<x<=400",sum4,sum4/(sum1+sum2+sum3+sum4+sum5+sum6+sum7+sum8),"\n400<x<=500",sum5,sum5/(sum1+sum2+sum3+sum4+sum5+sum6+sum7+sum8),"\n500<x<=600",sum6,sum6/(sum1+sum2+sum3+sum4+sum5+sum6+sum7+sum8),"\n600<x<=700",sum7,sum7/(sum1+sum2+sum3+sum4+sum5+sum6+sum7+sum8),"\nx>700",sum8,sum8/(sum1+sum2+sum3+sum4+sum5+sum6+sum7+sum8)}' community_stat.txt > community_bar1.txt
[zwei@acer Al20Cl60_1]$ cat community_bar1.txt
in R:
g<-read.table("community_bar.txt")
barplot(prop.table(g$V2),names.arg=g$V1,main="density of water communities",col="red")
filelist<-c("./WATERBOX_charmm/community_bar.txt","./WATERBOX_namd/community_bar.txt","./WATERBOX_amber/community_bar.txt")
library(data.table)
temp <- lapply(filelist, fread, sep=" ")
data<-melt(temp)
for (i in 1:nrow(data)) {if (data$L1[i] ==1) {data$L1[i]<-"charmm"} else if (data$L1[i] == 2) {data$L1[i]<-"namd"} else if (data$L1[i] ==3) {data$L1[i]<-"amber"}}
x<-c("x<=100","100<x<=200","200<x<=300","300<x<=400","400<x<=500","500<x<=600","600<x<=700","x>700")
ggplot(data,aes(x=group,y=value,fill=L1))+geom_bar(stat="identity",position="dodge")+scale_x_discrete(label=x)
library(plyr)
library(data.table)
library(ggplot2)
filelist<-c("./Pure1/community_bar.txt","./Pure2/community_bar.txt","./Pure3/community_bar.txt","./NaCl_1/community_bar.txt","./NaCl_2/community_bar.txt","./NaCl_3/community_bar.txt","./Na20Cl20_1/community_bar.txt","./Na20Cl20_2/community_bar.txt","./Na20Cl20_3/community_bar.txt","./Mg20Cl40_1/community_bar.txt","./Mg20Cl40_2/community_bar.txt","./Mg20Cl40_3/community_bar.txt","./Al20Cl60_1/community_bar.txt","./Al20Cl60_2/community_bar.txt","./Al20Cl60_3/community_bar.txt")
temp <- lapply(filelist, fread, sep=" ")
data<-melt(temp,variable.name="group")
data$group<-as.character(data$group)
for (i in 1:nrow(data)) {if (data$L1[i] <=3) {data$group[i]<-"Pure"} else if ((data$L1[i] <=6) & (data$L1[i] >3)) {data$group[i]<-"NaCl"} else if ((data$L1[i] <=9) & (data$L1[i] >6)) {data$group[i]<-"Na20Cl20"} else if ((data$L1[i] <=12) & (data$L1[i] >9)) {data$group[i]<-"Mg20Cl40"} else if ((data$L1[i] <=15) & (data$L1[i] >12)) {data$group[i]<-"Al20Cl60"}}
data_summary <- function(data, varname, groupnames){
require(plyr)
summary_func <- function(x, col){
c(mean = mean(x[[col]], na.rm=TRUE),
sd = sd(x[[col]], na.rm=TRUE))
}
data_sum<-ddply(data, groupnames, .fun=summary_func,
varname)
data_sum <- rename(data_sum, c("mean" = varname))
return(data_sum)
}
df2<-data_summary(data,varname="value",groupnames=c("V1","group"))
x<-c("x<=100","100<x<=200","200<x<=300","300<x<=400","400<x<=500","500<x<=600","600<x<=700","x>700")
ggplot(df2,aes(x=V1,y=value,fill=group))+geom_bar(stat="identity",position=position_dodge())+geom_errorbar(aes(ymin=value-sd,ymax=value+sd),width=0.2,position=position_dodge(.9))+scale_x_discrete(label=x)
ggplot(data,aes(x=V1,y=value,fill=group))+geom_bar(stat="identity",position=position_dodge())+theme(axis.text=element_text(size=40),axis.title=element_text(size=40,face="bold"))+xlab("community size")+ylab("density")+theme(legend.text=element_text(size=40),legend.title=element_text(size=40)) (in thesis)
Preparation of node network:
>R
>library(reshape2)
>g<-read.table("residence.txt",sep=",",stringsAsFactors=F)
>seq1<-seq(0,0,length=max(g$V1)**2)
>mat1<-matrix(seq1,max(g$V1))
write.table(mat1,file="tmp",row.names=F,col.names=F)
>for (i in seq (1,length(g$V1))) { g$V3[i]=sapply(gregexpr("\\S+", g$V3[i]), length)}
>g$V3<-as.numeric(as.character(g$V3))
>x<-dcast(g,V1~V2,value.var='V3')
>write.table(x,file="contact.dat")
In bash:
pulling of BPTI
library(gridExtra)
library(ggplot2)
x<-read.table("protein_hbonds_bridge.txt")
y<-read.table("protein_hbondsonly.txt")
z<-read.table("protein_hbonds_bridgeonly.txt")
a<-cbind(x,appear="1")
b<-cbind(y,appear="1")
c<-cbind(z,appear="1")
a[,6]<-as.numeric(as.character(a[,6]))
b[,6]<-as.numeric(as.character(b[,6]))
c[,6]<-as.numeric(as.character(c[,6]))
atmp1<-aggregate(x=a$V5,by=list(a$V1),FUN=sum)
btmp1<-aggregate(x=b$V5,by=list(b$V1),FUN=sum)
ctmp1<-aggregate(x=c$V5,by=list(c$V1),FUN=sum)
m1<-ggplot()+geom_smooth(data=btmp1,aes(x=Group.1,y=x,colour="green"))+geom_smooth(data=ctmp1,aes(x=Group.1,y=x,colour="blue"))+labs(x="steps",y="bridge score")
postscript("bridge_score_smd.eps",onefile=FALSE,horizontal=FALSE,paper="special",width=10,height=5)
m1+scale_colour_manual(labels=c("water bridge","direct hydrogen bond"),values=c("green","blue"))
tmp2<-aggregate(x=a$appear,by=list(a$V1),FUN=sum)
m2<-ggplot(tmp2,aes(x=Group.1,y=x))+geom_smooth()+labs(x="steps",y="frequency")
grid.arrange(m1,m2)
frequency of hbond type:
set style line 1 lc rgb '#0060ad' lt 1 lw 2 pt 7 ps 1.5
set style line 2 lc rgb '#dd181f' lt 1 lw 2 pt 5 ps 1.5
plot 'frequency_of_hydrogen_bond_type.txt' u 2:xtic(1) with linespoints ls 2 title "frequency", 'frequency_of_hydrogen_bond_type.txt' u 3:xtic(1) with linespoints ls 1 title "bridge score"
misfoled
awk '{sum+=$5}END{print sum/20}' 6pti_aa9_md_hbonds_bridge.txt
plot 'rmsd_bridgescore.txt' u 2:3 pt 7 ps 2 ti "water bridge", 'rmsd_bridgescore.txt' u 2:4 pt 7 ps 2 ti "hydrogen bond"
Links:
http://www1.lsbu.ac.uk/water/index.html
http://book.bionumbers.org/what-is-the-energy-scale-associated-with-the-hydrophobic-effect/
http://ww2.chemistry.gatech.edu/~lw26/structure/molecular_interactions/mol_int.html
http://www.chem1.com/acad/webtext/states/interact.html#1A
https://answers.yahoo.com/question/index?qid=20130112050750AAj6Fcl
http://proteopedia.org/wiki/index.php/Hydrogen_bonds
https://upload.wikimedia.org/wikipedia/commons/b/b8/Reference_ranges_for_blood_tests_-_by_molarity.png ions concentration in blood
References:
1. JACQUELINE A. Empirical Correlation Between Hydorphobic Free Energy and Aqueous Cavity Surface Area.1974.
2.Real-Space Identification of Intermolecular Bonding with Atomic Force Microscopy
3.An Introduction to Hydrogen bonding. George A. Jeffrey. New York,1997.
4.Protons and hydroxide ions in aqueous systems, Chem. Rev. (2016).
5.fast and Slow Dynamics of Hydrogen Bonds in Liquid Water
6.Hydrogen bond network topology in liquid water and methanol a graph theory approach
7. Residence Times of Water Molecules in the Hydration Sites of Myoglobin.
8. Insights on hydrogen bond lifetimes in liquid and supercooled water
9.A molecular dynamics simulation study of hydrogen bonding in aqueous ionic solutions.
10. Hydrogeon-Bond Patterns in Liquid Water.1973
11.Water as a hydrogen bonding bridge between a phenol and imidazole A simple model for water binding in enzymes. 2006
The hydrogen bond between the phenolic hydroxyl and water appears to have about the same strength as the direct hydrogen bond to imidazole, suggesting that the structure can be a good model for hydrogen bonds that are mediated by a water molecule in enzymes.
12. Analysis of the hydrogen bond network of water using the TIP4P/2005 Model,2012
13.The hydrogen-bond network of water supports propagating optical phonon-like modes.2015
Our results indicate the dynamics of liquid water have more similarities to ice than previously
thought.
14.Hydrogen Bond Networks: Structure and Evolution after Hydrogen Bond Breaking.2004
15.Water in Carbon Nanotubes: The Peculiar Hydrogen Bond Network Revealed by Infrared Spectroscopy.2016.
16.Molecular level investigation of water's hydrogen bonding network in salt solutions.
https://phys.org/news/2016-11-molecular-hydrogen-bonding-network-salt.html
which implies that cations can structure the hydrogen-bond network to a certain extent.
17.Distinct roles of a tyrosine-associated hydrogen-bond network in fine-tuning the structure and function of heme proteins: two cases designed for myoglobin.2016
Tyr-associated H-bond network, plays key roles in regulating the structure and function of proteins.Tyr107 and Tyr138 form a distinct H-bond network involving water molecules and neighboring residues, which fine-tunes ligand binding to the heme iron and enhances the protein stability, respectively.
18.molecular theory of water and aqueous solutions. Arieh Ben-Naim.