- long range contacts (separation >= 24);
- medium range contacts (12 <= separation <= 23);
- short range contacts (6 <= separation <= 11).
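The three separation classes can be sketched as a small lookup. This is an illustration only: the function name `contact_range` is hypothetical, and it assumes the classes are mutually exclusive, so that short range covers separations 6 through 11 and pairs closer than 6 residues belong to no class.

```python
def contact_range(i, j):
    """Return the range class of residue pair (i, j), or None if the
    residues are fewer than 6 positions apart in the sequence."""
    separation = abs(i - j)
    if separation >= 24:
        return "long"
    if separation >= 12:
        return "medium"
    if separation >= 6:      # assumed upper bound of 11 for short range
        return "short"
    return None
```

For example, residues 10 and 22 are 12 positions apart and fall in the medium range class.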

F1 = 2*precision*recall/(precision+recall).

Precision = TP/Np,

where Np = TP + FP is the number of predicted contacts,

TP and FP are the numbers of correctly and incorrectly predicted contacts, respectively.

Recall = TP/Nc,

where TP is the number of correctly predicted contacts,

Nc is the number of all contacts in the target structure.
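A minimal sketch of these three scores, assuming the usual definitions precision = TP/Np and recall = TP/Nc. The function name and the representation of contacts as (i, j) tuples with i < j are illustrative choices, not part of the evaluation software. Returning 0 when a denominator is zero is also an assumption.

```python
def precision_recall_f1(predicted, native):
    """predicted, native: iterables of (i, j) residue pairs with i < j."""
    predicted, native = set(predicted), set(native)
    tp = len(predicted & native)          # correctly predicted contacts
    n_p = len(predicted)                  # Np = TP + FP
    n_c = len(native)                     # Nc = contacts in the target
    precision = tp / n_p if n_p else 0.0
    recall = tp / n_c if n_c else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

With four predicted pairs of which two are among three native contacts, this gives precision 0.5, recall 2/3 and F1 = 4/7.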

MCC = (TP*TN - FP*FN)/sqrt((TP + FP)(TP + FN)(TN + FP)(TN + FN))

where TP and FP are the numbers of correctly and incorrectly predicted contacts, respectively,

TN is the number of non-contacts in the target structure not appearing in the prediction list,

FN is the number of contacts in the target structure missing in the prediction list.
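The MCC formula translates directly into code. The sketch below takes the four counts as plain integers. The name `mcc` is illustrative, and returning 0 when the denominator vanishes is a common convention assumed here, not something the text above specifies.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from the four confusion counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # assumed convention for a degenerate confusion matrix
    return (tp * tn - fp * fn) / denominator
```

A perfect prediction (FP = FN = 0) yields 1, and a fully inverted one (TP = TN = 0) yields -1.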

The score measures the relative drop in entropy introduced by a set of distance constraints (in our case, correctly predicted residue-residue contacts) with respect to the reference entropy of an unconstrained protein of the same length. The score is calculated by the formula (ref.):

ES = 100% * (Entropy|0 - Entropy|C) / Entropy|0 ,

where

Entropy|0 is the entropy value for the protein without constraints,

Entropy|C is the entropy value given a set of constraints C.

Entropy|x = AVERAGE_over_all_pairs_of_residues (LOG(UpperLimit - LowerLimit)),

where

x = '0' or 'C',

LowerLimit (both for contacts and non-contacts) = 3.2Å

UpperLimit for contacts = 8Å

UpperLimit for non-contacts = diameter of gyration (DG).

The diameter of gyration is calculated by the formula (ref):

DG = 5.54 * L^0.34, where L is the length of the protein sequence.
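The ES definition can be sketched as follows. This is one reading of the formulas above, not the reference implementation. It assumes that "all pairs of residues" means all L(L-1)/2 unordered pairs, that every pair is treated as a non-contact when there are no constraints, and that each correctly predicted contact tightens its pair's upper limit to 8Å. All names are hypothetical. The base of the logarithm cancels in the ratio, so the natural logarithm is used.

```python
import math

LOWER_LIMIT = 3.2          # Å, for contacts and non-contacts
CONTACT_UPPER_LIMIT = 8.0  # Å

def diameter_of_gyration(length):
    """DG = 5.54 * L^0.34 for a protein of `length` residues."""
    return 5.54 * length ** 0.34

def entropy_score(length, n_correct_contacts, noncontact_upper=None):
    """ES in percent for a protein with the given number of correct contacts."""
    if length < 2:
        raise ValueError("need at least two residues")
    if noncontact_upper is None:
        noncontact_upper = diameter_of_gyration(length)
    n_pairs = length * (length - 1) // 2
    if not 0 <= n_correct_contacts <= n_pairs:
        raise ValueError("contact count out of range")
    free = math.log(noncontact_upper - LOWER_LIMIT)
    constrained = math.log(CONTACT_UPPER_LIMIT - LOWER_LIMIT)
    entropy_0 = free  # every pair unconstrained
    entropy_c = (n_correct_contacts * constrained
                 + (n_pairs - n_correct_contacts) * free) / n_pairs
    return 100.0 * (entropy_0 - entropy_c) / entropy_0
```

The optional `noncontact_upper` argument lets the same sketch be evaluated with a different upper limit for non-contacts, for example `3.8 * length`.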

A version of the ES score (see above) with

UpperLimit for non-contacts = 3.8Å * N, where N is the number of residues in the protein.

Prec(prob) = TP_pw/Np,

Recall(prob) = TP_pw/Nc,

where

TP_pw is the sum of the predicted probabilities of the correctly predicted contacts within the selected list of predictions,

Np is the number of predicted contacts,

Nc is the number of contacts in the target structure.

F1 = 2*Prec(prob)*Recall(prob)/(Prec(prob)+Recall(prob)).
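The probability-weighted scores differ from the plain ones only in replacing the count TP with the sum TP_pw. A minimal sketch, assuming predictions are given as a dict mapping (i, j) pairs to probabilities. The function name and the data layout are illustrative assumptions.

```python
def weighted_precision_recall_f1(predictions, native):
    """predictions: dict {(i, j): probability}; native: set of (i, j) pairs."""
    native = set(native)
    # TP_pw: summed probabilities of the correctly predicted contacts
    tp_pw = sum(prob for pair, prob in predictions.items() if pair in native)
    n_p = len(predictions)   # Np
    n_c = len(native)        # Nc
    precision = tp_pw / n_p if n_p else 0.0
    recall = tp_pw / n_c if n_c else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

For four predictions with probabilities 0.9, 0.6, 0.5 and 0.2, of which the first two are among three native contacts, TP_pw = 1.5, so Prec(prob) = 0.375, Recall(prob) = 0.5 and F1 = 3/7.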

Prec-FDR(prob) = (TP_pw-FP_pw)/Np,

where

TP_pw is the sum of probabilities of correctly predicted contacts,

FP_pw is the sum of probabilities of wrongly predicted contacts,

Np is the number of predicted contacts.

Prec(pwa) = TP_pw/(TP_pw+FP_pw),

Prec-FDR(pwa) = (TP_pw-FP_pw)/(TP_pw+FP_pw),

where

TP_pw is the sum of probabilities of correctly predicted contacts,

FP_pw is the sum of probabilities of wrongly predicted contacts.
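The three scores that penalize false positives, Prec-FDR(prob), Prec(pwa) and Prec-FDR(pwa), share the same two sums and can be sketched together. As before, the function name and the dict-based input are assumptions made for illustration, and a zero denominator is mapped to 0 by convention.

```python
def fdr_weighted_scores(predictions, native):
    """predictions: dict {(i, j): probability}; native: set of (i, j) pairs.
    Returns (Prec-FDR(prob), Prec(pwa), Prec-FDR(pwa))."""
    native = set(native)
    tp_pw = sum(p for pair, p in predictions.items() if pair in native)
    fp_pw = sum(p for pair, p in predictions.items() if pair not in native)
    n_p = len(predictions)
    weight = tp_pw + fp_pw   # total predicted probability mass
    prec_fdr_prob = (tp_pw - fp_pw) / n_p if n_p else 0.0
    prec_pwa = tp_pw / weight if weight else 0.0
    prec_fdr_pwa = (tp_pw - fp_pw) / weight if weight else 0.0
    return prec_fdr_prob, prec_pwa, prec_fdr_pwa
```

With TP_pw = 1.5, FP_pw = 0.7 and Np = 4, the scores are 0.2, 1.5/2.2 and 0.8/2.2. Both FDR variants become negative when the wrongly predicted contacts carry more probability than the correct ones.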

Protein Structure Prediction Center