Column Definitions for GetDomainHits query resultset

Revised: 2004-03-23

Each row of the table corresponds to a single hit to an annotation source for a given protein. If a protein has multiple domains, or multiple weak hits (or redundant strong hits) for a given domain, that protein will have multiple rows in the table. Each column is described below; columns are populated only when applicable.

Column Name	Definition
Biosequence Set Name	name of organism or sequence set
Biosequence Name	gene/sequence name
Accession	gene/sequence accession
Gene Symbol
Full gene name
Protein EC number	EC number corresponding to
TMR Class	Results of TMHMM. 0 == soluble protein, TM == transmembrane protein
Number of TMRs	number of TM regions
pI	isoelectric point
Match Source	program used to derive/detect match. Pfam indicates that HMMER was used to find the match to Pfam. pdbblast indicates that PSI-BLAST was used to detect a match to a given PDB sequence. mamSum indicates that the match is the result of a Rosetta prediction.
Domain Index	If the protein is a multi-domain protein, as detected by Ginzu, its domains are labeled domain 0 through domain N.
Overall-P	Probability that one of the top five Rosetta predictions is the correct fold.
BH	Best Hit. Y if this is the Rosetta model for this protein whose match to the PDB has the highest Z-score.
Cluster	Rosetta model number. Cluster00 is the center of the largest cluster and thus the top Rosetta model prior to considering the Z-score of each model's best match to the PDB.
Query start (stop)	Start and stop of the hit with reference to the query protein (or domain)
Query length	length of protein in case of single protein, length of domain if query is a single domain from Ginzu.
Match start(end)	start and stop of match with reference to the domain matched in the external database (e.g. PDB, Pfam)
Match length
Match Type	possible types are pdbblast, pfam and PDB
Match name	ID in external database (PDB id, Pfam domain)
Domain EC numbers	EC number if hit/match has a corresponding EC number.
Prob	Probability of individual Rosetta prediction being correct.
E value	confidence of corresponding method
Score	confidence of corresponding method
Z-score	strength of match between Rosetta predicted structure and the PDB, expressed as a Z-score.
Second Reference	CATH id of matched PDB.
Match Annotation	Additional annotation information associated with this match in the match source
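
Because a protein can occupy several rows, it is often convenient to group the rows by protein before reading them. The following is a minimal sketch in Python, assuming the result set has been exported as tab-delimited text whose header row uses the column names listed above. That export format, the helper names group_hits and best_rosetta_hits, and all sample values are assumptions made for illustration; they are not part of the query itself.

```python
import csv
import io
from collections import defaultdict

# Hypothetical tab-delimited export. Only a few columns are shown, and every
# value below is a made-up placeholder, not real query output.
SAMPLE = (
    "Biosequence Name\tMatch Source\tDomain Index\tBH\tMatch name\n"
    "protA\tPfam\t0\t\tpfam_domain_1\n"
    "protA\tmamSum\t1\tY\tpdb_id_1\n"
    "protA\tmamSum\t1\t\tpdb_id_2\n"
    "protB\tpdbblast\t0\t\tpdb_id_3\n"
)


def group_hits(handle):
    """Group result rows by protein, keyed on the Biosequence Name column."""
    hits = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        hits[row["Biosequence Name"]].append(row)
    return dict(hits)


def best_rosetta_hits(rows):
    """Return the Rosetta rows (Match Source == mamSum) flagged BH == Y."""
    return [
        row for row in rows
        if row["Match Source"] == "mamSum" and row["BH"] == "Y"
    ]


hits_by_protein = group_hits(io.StringIO(SAMPLE))
for name, rows in hits_by_protein.items():
    best = [row["Match name"] for row in best_rosetta_hits(rows)]
    print(name, len(rows), best)
```

Columns that do not apply to a row are read as empty strings here, which matches the note above that columns are filled in only when applicable.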