Significance of triple torsional correlations in proteins†
Abstract
The free energy landscape (FEL) of a complex molecular system is fundamentally the joint probability density of its many constituent degrees of freedom (DOFs). Computing a complete FEL at the atomistic scale is unfortunately intractable for a typical biomolecular system. The difficulty of entropy calculation arises from the various correlations among different DOFs. The common strategy for treating such complexity is to expand the full correlation into local correlations of increasing order. In practice, the expansion is usually truncated at the second order (i.e., pairwise interactions) for protein torsional correlations, without a reliable estimate of the resulting error. Here, we estimated the mutual information of different torsion sets and found that triple correlations were significant both for local and distant residue pairs and for consecutive backbone torsional segments. As expected, the third-order approximations were found to be consistently better than the second-order approximations. These findings held for all analyzed proteins with different folds, were independent of the two force fields used to generate the trajectory sets, and are therefore likely to be of general importance for proteins. Additionally, because binning strategies are of universal importance for the numerical computation of correlations, we provide a detailed comparison between equal-width and equal-sample binning for different bin numbers and demonstrate the impact of the binning strategy on the variance and bias of the calculated mutual information. Our observations suggest that caution should be taken when correlations are compared quantitatively between studies that use different binning strategies.
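To make the quantities named above concrete, the following is a minimal sketch of histogram-based (plug-in) estimates of pairwise and triple mutual information under the two binning schemes. It is an illustration under stated assumptions, not the procedure used in this work: the function names are hypothetical, torsions are assumed to be angles in radians on [-π, π), entropies are in nats, and the third-order term follows the alternating-sum convention (sign conventions for this term differ between authors).

```python
import numpy as np


def bin_equal_width(x, n_bins, lo=-np.pi, hi=np.pi):
    """Label each angle with one of n_bins bins of equal width on [lo, hi)."""
    edges = np.linspace(lo, hi, n_bins + 1)
    # Digitizing against the interior edges yields labels 0 .. n_bins - 1.
    return np.clip(np.digitize(x, edges[1:-1]), 0, n_bins - 1)


def bin_equal_sample(x, n_bins):
    """Label each value with one of n_bins bins holding (nearly) equal counts."""
    # Rank of every sample (0 .. N - 1), then split the ranks into n_bins groups.
    ranks = np.argsort(np.argsort(x, kind="stable"), kind="stable")
    return (ranks * n_bins) // len(x)


def entropy(*labels):
    """Plug-in joint entropy (nats) of one or more discrete label arrays."""
    joint = np.stack(labels, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))


def mutual_information(a, b):
    """Second-order term: I2(a; b) = H(a) + H(b) - H(a, b)."""
    return entropy(a) + entropy(b) - entropy(a, b)


def triple_information(a, b, c):
    """Third-order term (assumed convention): sum H1 - sum H2 + H3."""
    return (entropy(a) + entropy(b) + entropy(c)
            - entropy(a, b) - entropy(a, c) - entropy(b, c)
            + entropy(a, b, c))
```

In use, each torsion time series would be binned with one of the two schemes and the resulting labels passed to `mutual_information` or `triple_information`. Plug-in estimates of this kind are biased for finite samples, and the bias depends on both the number of bins and the binning scheme, which is why values obtained with different binning choices are not directly comparable.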