The program calculates the correlation between each pair of columns [i, j], where column i comes from the protein alignment and column j comes from the site alignment.
As a measure of correlation, the mutual information is used:
$$I(i, j) = \sum_{x \in X} \sum_{y \in Y} f_{ij}(x, y) \log \frac{f_{ij}(x, y)}{f_i(x)\, f_j(y)}$$
|$X$, $Y$||Alphabets of the 20 amino acids and 4 nucleotides, respectively.|
|$f_{ij}(x, y)$||Observed frequency of amino acid $x$ in position $i$ together with nucleotide $y$ in position $j$.|
|$f_i(x)\, f_j(y)$||Expected frequency under the hypothesis that the columns are uncorrelated, calculated as the frequency of amino acid $x$ in column $i$ multiplied by the frequency of nucleotide $y$ in column $j$.|
A large difference between the observed and the expected distributions is reflected in a high mutual information value.
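
The calculation for a single pair of columns can be sketched in Python as follows. This is a minimal illustration, not the program's actual code: the function name `mutual_information`, the base-2 logarithm, and the absence of gap handling or pseudocounts are all assumptions. It also assumes that the k-th character of the protein column and the k-th character of the site column belong to the same protein-site pair.

```python
from collections import Counter
from math import log2


def mutual_information(protein_column, site_column):
    """Mutual information (in bits, an assumed unit) between two alignment columns.

    protein_column: amino acids found in column i, one per protein-site pair
    site_column:    nucleotides found in column j, in the same pair order
    """
    if len(protein_column) != len(site_column):
        raise ValueError("both columns must have one character per pair")
    n = len(protein_column)
    if n == 0:
        raise ValueError("columns must not be empty")

    # Joint counts of (amino acid, nucleotide) and the per-column counts.
    pair_counts = Counter(zip(protein_column, site_column))
    aa_counts = Counter(protein_column)
    nt_counts = Counter(site_column)

    mi = 0.0
    # Combinations that never occur contribute zero, so only observed pairs are summed.
    for (aa, nt), count in pair_counts.items():
        observed = count / n
        expected = (aa_counts[aa] / n) * (nt_counts[nt] / n)
        mi += observed * log2(observed / expected)
    return mi


# Amino acid fully determines the nucleotide: high mutual information.
print(mutual_information("AAKK", "GGCC"))  # 1.0
# Every combination occurs equally often: no correlation.
print(mutual_information("AAKK", "GCGC"))  # 0.0
```

In the first call the observed joint frequencies (0.5 for each occurring combination) are twice the expected ones (0.25), which yields one bit. In the second call the observed and expected frequencies coincide, so the value is zero.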