A trick to win CASP

Post by Bharati on Fri Jan 16, 2009 10:11 pm

I found this trick when I tried to reproduce the data at
http://predictioncenter.gc.ucdavis.edu/ ... alysis.cgi
by summing the GDT scores of all individual domains from the data at
http://predictioncenter.gc.ucdavis.edu/ ... esults.cgi

I noticed that when some groups submit their models as several segments,
i.e. in the format
PARENT N/A
ATOM 1 N MET 1 -4.558 -1.223 0.345 1.00 0.00
ATOM 2 CA MET 1 -3.800 0.000 0.000 1.00 0.00
SEGMENT-1
TER
PARENT N/A
ATOM 396 N PRO 81 -4.492 -1.275 0.292 1.00 0.00
ATOM 397 CA PRO 81 -3.800 0.000 0.000 1.00 0.00
SEGMENT-2
END
the assessors took the sum of the GDT scores of the two segments as the GDT
score of the domain! So my trick is this: if you split your model into N parts,
where N is the number of secondary structure elements in the chain, the
total GDT score will be close to 1 for all your models, no matter how hard
the targets are. This is my trick to win CASP. :)

People may not really want to use the trick. But it does hurt the groups
who submitted each model as a single whole body: if the inter-segment
orientation of their model is wrong, only one segment is actually counted
in the GDT calculation, whereas if they submit the model as multiple
segments, all the segments are counted. This did influence some of the top
players in CASP8. For example, in T0476_1 the IBT_LT group submitted their
model in two pieces with individual GDT scores of 37.64 and 16.38, giving a
total GDT=54.02, which is significantly higher than all other groups. If they
had put these pieces together, I bet the GDT score would not be that high,
because the orientation of the pieces (the key to the modeling) is unknown.
This also happened in the Server section. For example, RAPTOR submitted two
pieces for T0405_2 with GDT=17.79 and 13.58, giving a total GDT=31.37, which
is higher than all other groups.
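
To make the arithmetic in these two examples concrete, here is a minimal sketch that compares the two ways of turning per-segment GDT scores into one domain score: summing the segments, or keeping only the best one. The scores are the ones quoted above; the dictionary layout and the function names are illustrative assumptions, not part of any CASP tool.

```python
# Per-segment GDT scores quoted in the post.
# The data layout and helper names are hypothetical, for illustration only.
segment_gdt = {
    ("IBT_LT", "T0476_1"): [37.64, 16.38],
    ("RAPTOR", "T0405_2"): [17.79, 13.58],
}


def summed_gdt(scores):
    """Domain score if every submitted segment is added up."""
    return round(sum(scores), 2)


def best_segment_gdt(scores):
    """Domain score if only the highest-scoring segment counts."""
    return max(scores)


for (group, domain), scores in segment_gdt.items():
    print(group, domain,
          "sum:", summed_gdt(scores),
          "best segment:", best_segment_gdt(scores))
```

Under the summing rule the two examples come out at 54.02 and 31.37; under the best-segment rule they drop to 37.64 and 17.79.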

If only the single highest-scoring piece from each model is counted, the top
ten groups I get are listed below. You will find they are slightly different
from the result listed at http://predictioncenter.gc.ucdavis.edu/ ... alysis.cgi.
BTW, I also do not understand why an average GDT score rather than a
cumulative GDT score is listed, because any group can easily improve its
ranking on average GDT by not submitting the targets it has no confidence
in. For the same reason, I think the cumulative Z-score should be used
rather than counting only positive Z-scores (just a thought).

########### Server groups (164 domains) ################
groups            rank   cumul_GDT   Z-score   N_domain
---------------------------------------------------------
Zhang-Server         1    11216.90   127.810        164
RAPTOR               2    10834.29    93.640        164
pro-sp3-TASSER       3    10786.48    94.960        164
BAKER-ROBETTA        4    10726.91    95.140        164
Phyre_de_novo        5    10722.53    84.820        164
MULTICOM-CLUSTER     6    10639.43    78.000        164
METATASSER           7    10620.92    85.350        164
MULTICOM-REFINE      8    10589.24    73.480        164
MUProt               9    10547.74    69.980        164
HHpred4             10    10494.98    64.070        164

########### Human/Server groups (71 domains) ##########
groups            rank   cumul_GDT   Z-score   N_domain
---------------------------------------------------------
DBAKER               1     4361.50    73.690         71
Zhang                2     4293.69    62.350         71
IBT_LT               3     4271.41    57.590         70
fams-ace2            4     4239.12    59.350         71
Zhang-Server         5     4223.99    56.350         71
TASSER               6     4205.31    52.810         71
ZicoFullSTP          7     4180.33    58.420         71
Zico                 8     4178.53    56.740         71
GeneSilico           9     4157.95    52.400         71
3DShot1             10     4143.30    49.090         71
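
The point about average versus cumulative GDT can be illustrated with a small sketch. The two score lists are made-up numbers, not CASP results, and the function names are hypothetical; `None` marks a target the group chose not to submit.

```python
# Hypothetical per-target GDT scores, NOT real CASP data.
# None means the group skipped that target.
full_submitter = [80.0, 60.0, 20.0]       # submits every target
selective_submitter = [80.0, 60.0, None]  # skips the hard target


def cumulative_gdt(scores):
    """Sum over submitted targets; a skipped target contributes nothing."""
    return sum(s for s in scores if s is not None)


def average_gdt(scores):
    """Mean over submitted targets only; skipped targets are ignored."""
    submitted = [s for s in scores if s is not None]
    return sum(submitted) / len(submitted) if submitted else 0.0


for name, scores in [("full", full_submitter),
                     ("selective", selective_submitter)]:
    print(name, cumulative_gdt(scores), average_gdt(scores))
```

In this toy case the selective group wins on average GDT while the full submitter wins on cumulative GDT, which is the ranking effect described above.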
Bharati
 
Posts: 1
Joined: Fri Jan 16, 2009 4:42 pm

Re: A trick to win CASP

Post by akryshtafovych on Wed Jan 28, 2009 11:51 am

The assumption is wrong, so the conclusions make no sense. Since at least CASP6, assessors have used for their analysis only predictions that are 20 residues or longer. If there were several independent PARENT-TER segments in the prediction, the one with the highest GDT_TS (not the sum of all of them!) was used as the prediction for the domain. This is written in the CASP papers and is also mentioned in Roland Dunbrack's presentation, which is available from the CASP6 web page. The same principles were applied in the CASP8 automatic standard analysis and were recommended to the assessors by the organizers.
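
The rule described here can be sketched in a few lines. This is only an illustration, not the assessors' code: the function name and the `(n_residues, gdt_ts)` tuple layout are hypothetical, the residue counts in the usage example are invented, and applying the 20-residue cutoff to each segment separately is an assumption.

```python
MIN_RESIDUES = 20  # cutoff quoted above; per-segment use is an assumption


def domain_gdt_ts(segments, min_residues=MIN_RESIDUES):
    """Return the highest GDT_TS among sufficiently long segments.

    `segments` is a list of (n_residues, gdt_ts) tuples, one per
    PARENT-TER segment. Returns None if no segment is long enough.
    """
    eligible = [gdt for n_res, gdt in segments if n_res >= min_residues]
    return max(eligible) if eligible else None


# Residue counts are invented; the GDT values are the T0476_1 pair
# mentioned earlier in the thread.
print(domain_gdt_ts([(45, 37.64), (30, 16.38)]))
```

With this rule, splitting a model into more segments can never raise the domain score above that of its single best segment.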
akryshtafovych
 
Posts: 10
Joined: Wed Jan 28, 2009 11:35 am

