Column Definitions for GetDomainHits query resultset

Revised: 2004-03-23

Each row of the table corresponds to a single hit to an annotation source for a given protein. If a protein has multiple domains, or multiple weak hits (or redundant strong hits) for a given domain, that protein will have multiple rows in the table. Each column is described below; columns are populated only when applicable.
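
Because one protein can span several rows, it is often convenient to group rows by protein before reading them. The following is a minimal sketch only, assuming the result set has been exported as tab-delimited text whose header row uses the column names listed below. That export format, the function name group_hits_by_protein, and the sample rows are assumptions made for illustration; they are not part of the query itself.

```python
import csv
import io
from collections import defaultdict


def group_hits_by_protein(tsv_text):
    """Group rows by 'Biosequence Name' (assumed header name in the export)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    hits = defaultdict(list)
    for row in reader:
        # One row is one hit, so a protein with several hits
        # accumulates several rows under the same key.
        hits[row["Biosequence Name"]].append(row)
    return dict(hits)


# Made-up rows for illustration only; these are not real query results.
SAMPLE = (
    "Biosequence Name\tMatch Source\tMatch name\n"
    "protA\tPfam\texample_domain_1\n"
    "protA\tpdbblast\texample_pdb_1\n"
    "protB\tmamSum\texample_pdb_2\n"
)

grouped = group_hits_by_protein(SAMPLE)
for name, rows in grouped.items():
    print(name, len(rows), [r["Match Source"] for r in rows])
```

Grouping on Biosequence Name is one choice; Accession would work the same way if it is the more stable identifier in a given sequence set.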

Column Name Definition
Biosequence Set Name name of organism or sequence set
Biosequence Name gene/sequence name
Accession gene/sequence accession
Gene Symbol
Full gene name
Protein EC number EC number corresponding to
TMR Class Results of TMHMM. 0 == soluble protein, TM == transmembrane protein
Number of TMRs number of TM regions
pI isoelectric point
Match Source program used to derive/detect match. Pfam indicates that HMMER was used to find the match to Pfam. pdbblast indicates that PSI-BLAST was used to detect a match to a given PDB sequence. mamSum indicates that the match is the result of a Rosetta prediction.
Domain Index If the protein is a multi-domain protein, as detected by Ginzu, domains are labeled domain 0 through domain N.
Overall-P Probability that one of the top five Rosetta predictions is the correct fold
BH Best Hit. Y if, among the Rosetta models for this protein, this model has the highest Z-score for its match to the PDB.
Cluster Rosetta model number. Cluster00 is the center of the largest cluster and thus the top Rosetta model prior to considering the Z-score of each model's best match to the PDB.
Query start (stop) Start and stop of the hit with reference to the query protein (or domain)
Query length Length of the protein if the query is a whole protein; length of the domain if the query is a single domain from Ginzu.
Match start(end) Start and end of the match with reference to the domain matched in the external database (e.g. PDB, Pfam)
Match length
Match Type possible types are pdbblast, pfam and PDB
Match name ID in external database (PDB id, Pfam domain)
Domain EC numbers EC number if hit/match has a corresponding EC number.
Prob Probability of individual Rosetta prediction being correct.
E value Confidence of the match, expressed as the E-value from the corresponding method
Score Confidence of the match, expressed as the score from the corresponding method
Z-score strength of match between Rosetta predicted structure and the PDB, expressed as a Z-score.
Second Reference CATH id of matched PDB.
Match Annotation Additional annotation information associated with this match in the match source
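
As an illustration of how the definitions above might be applied, the sketch below checks the TMR Class value and selects the rows flagged as the best Rosetta hit. It assumes each row is available as a dictionary keyed by column name, that unpopulated columns appear as empty strings, and that the values "TM", "mamSum" and "Y" appear exactly as written in the definitions. The helper names and the sample rows are hypothetical.

```python
def is_populated(row, column):
    """True if the column holds a non-blank value in this row."""
    return bool((row.get(column) or "").strip())


def is_transmembrane(row):
    """TMR Class is 'TM' for transmembrane proteins and '0' for soluble ones."""
    return row.get("TMR Class") == "TM"


def rosetta_best_hits(rows):
    """Keep rows that come from a Rosetta prediction and are flagged BH = Y."""
    return [
        r for r in rows
        if r.get("Match Source") == "mamSum" and r.get("BH") == "Y"
    ]


# Made-up rows for illustration only; these are not real query results.
ROWS = [
    {"Biosequence Name": "protA", "TMR Class": "0",
     "Match Source": "Pfam", "BH": "", "Z-score": ""},
    {"Biosequence Name": "protB", "TMR Class": "TM",
     "Match Source": "mamSum", "BH": "Y", "Z-score": "5.0"},
    {"Biosequence Name": "protB", "TMR Class": "TM",
     "Match Source": "mamSum", "BH": "", "Z-score": "3.0"},
]

best = rosetta_best_hits(ROWS)
print([r["Biosequence Name"] for r in best])
print([is_transmembrane(r) for r in ROWS])
print([is_populated(r, "Z-score") for r in ROWS])
```

Rosetta-specific columns such as BH and Z-score are only expected to be populated on rows whose Match Source is mamSum, which is why the sketch tests for a blank value rather than assuming a number is present.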