Example usage

The following example requires that the MDAnalysisTests datafiles and seaborn packages are installed. Both of these can be installed through pip:

pip install MDAnalysisTests seaborn

pKa estimates from a molecular dynamics trajectory

We first import the paths to the test topology and trajectory and load them into an MDAnalysis Universe. This universe contains a trajectory of an adenylate kinase protein with 3341 atoms and 214 residues.

import MDAnalysis as mda
from MDAnalysisTests.datafiles import PSF, DCD

u = mda.Universe(PSF, DCD)

The pKa estimates of are generated by the PropkaTraj analysis class

import propkatraj

pkatraj = propkatraj.PropkaTraj(u)
pkatraj.run()

Once the run method has finished executing, the resulting pKa estimates, a function of residue number and time, can be found in the pkatraj.results.pkas as a panda dataframe. The columns of the dataframe are the residue numbers and the DataFrame index is the time in nanoseconds. These data can be visualized using matplotlib and seaborn:

import matplotlib.pyplot as plt
import seaborn as sns

fig = plt.figure(figsize=(28, 8))
ax = fig.add_subplot(1,1,1)
sns.boxplot(data=pkatraj.results.pkas, ax=ax)
ax.set_xlabel("residue number")
ax.set_ylabel(r"p$K_a$")
plt.show()

Residues can easily be selected by using information stored in the original Universe. For instance, assume we are only interested in the pKa distribution of the protein’s lysines. We can select all lysine residues by name and collect their resids, which are the columns in the DataFrame.

resides = u.select_atoms('resname LYS').residues.resids

sns.boxplot(data=pkatraj.results.pkas[resids])
ax = plt.gca()
ax.set_xlabel('residue number')
ax.set_ylabel(r'p$K_a$')
plt.show()