|
|
Analyze Data | After analysis through the pipeline, the data can be loaded into the GetExpression SBEAMS table, which allows a user to query the data, receive a result set, and then combine it with other types of data for analysis in Cytoscape.
Find a previous Analysis run | Follow these steps to find a previous analysis run.
1) | Log into SBEAMS
| |
2) | Select the Microarray Link in the main navigation bar
| |
3) | Select the Project of interest at the top of the page
| |
4) | Select the Data Pipeline Button on the main navigation bar
| |
5) | If the project has Affy data, a link to "Affy Analysis Pipeline" will be displayed at the bottom of the page. Click the link
| |
6) | Click on the "Analysis Results" tab to see any previous runs
| |
7) | Click on the "Show files" link to view the run of interest
| |
8) | Under the heading "Add Results to Get Expression", click the "Add Data" link
| |
| |
Adding data to GetExpression | If the names of the conditions about to be uploaded are unique, a simple form similar to the one below is displayed. The condition names can be edited if you wish. | |
Warnings Adding Data to GetExpression | If a condition name already exists in the database, a warning will indicate so. There are two options at this point: change the name and click the "Check condition Name" button to see if it is unique, or replace the data already in the database by keeping the same condition name. Be careful not to stomp on someone else's data if you see a warning. | |
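The uniqueness check described above amounts to comparing the names about to be uploaded against the names already stored. The following is only a minimal sketch of that idea: `find_name_conflicts` is a hypothetical helper, not part of SBEAMS, and it assumes an exact, case-sensitive match on names.

```python
def find_name_conflicts(new_names, existing_names):
    """Return the new condition names that already exist.

    Hypothetical helper for illustration only; assumes exact,
    case-sensitive name matching.
    """
    existing = set(existing_names)
    return [name for name in new_names if name in existing]
```

Any name returned by such a check would need to be renamed before upload, unless the intent really is to replace the stored data.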
What Data is Being uploaded from SAM | The data from the SAM analysis is parsed and added to specific database fields. See below for the mapping.
In addition, an attempt is made to produce a canonical name that will be useful for merging Affy expression result sets with other types of data. See below for more details.

| SAM Column Name | SBEAMS Column Name | Data Description |
|---|---|---|
| Probe_set_id | reporter_name & gene_name | Affy probe set ID |
| Gene_Symbol | common_name | Gene symbol, usually from NCBI |
| Gene_Title | full_name | Full gene name |
| Unigene | external_identifier | Unigene ID |
| LocusLink | second_name | LocusLink ID |
| Public_ID | canonical_name | Canonical name (see below for how it is derived from the Affy data) |
| FDR | false_discovery_rate | False discovery rate |
| Log_10_Ratio | log10_ratio | Log 10 expression ratio |
| mu_X | mu_X | Mean linear expression value for the non-reference sample |
| mu_Y | mu_Y | Mean linear expression value for the reference sample |
| D_stat | NOT USED | SAM statistic, based on the ratio of change in gene expression to standard deviation in the data for a gene |
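The column mapping above can be written down as a simple lookup table. The sketch below is illustrative only: the column names come from the table, but `SAM_TO_SBEAMS` and `map_sam_row` are hypothetical names, and the actual SBEAMS loader may work differently. It maps the raw `Public_ID` value to `canonical_name` and does not apply the later refinement of the canonical name.

```python
# Column names copied from the mapping table; D_stat is listed as NOT USED,
# so it has no entry and is dropped.
SAM_TO_SBEAMS = {
    "Probe_set_id": ("reporter_name", "gene_name"),
    "Gene_Symbol": ("common_name",),
    "Gene_Title": ("full_name",),
    "Unigene": ("external_identifier",),
    "LocusLink": ("second_name",),
    "Public_ID": ("canonical_name",),
    "FDR": ("false_discovery_rate",),
    "Log_10_Ratio": ("log10_ratio",),
    "mu_X": ("mu_X",),
    "mu_Y": ("mu_Y",),
}


def map_sam_row(sam_row):
    """Hypothetical helper: rename the keys of one parsed SAM row.

    Columns without a mapping (such as D_stat) are skipped.
    """
    mapped = {}
    for sam_col, value in sam_row.items():
        for sbeams_col in SAM_TO_SBEAMS.get(sam_col, ()):
            mapped[sbeams_col] = value
    return mapped
```

Because `Probe_set_id` feeds two fields, a row such as `{"Probe_set_id": "example_at"}` (a placeholder value) would come back with both `reporter_name` and `gene_name` set to the same value.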
|
Choosing the canonical name | The canonical name is chosen in a very simple manner. During analysis, Affymetrix provides a file with annotation for each probe set. One of its columns, 'Representative Public ID', holds a DNA accession number, usually from GenBank. Initially this was used as the canonical name, but it was difficult to match it up with other data sets. So additional columns of annotation were pulled from the Affy annotation file, including 'RefSeq Protein ID' and 'Locus Link ID'. During the upload process a better canonical name is chosen by checking whether a RefSeq Protein ID is present; if so, it becomes the new canonical name. If not, the Unigene name is used, and finally the original ID from Affymetrix, the so-called 'Representative Public ID', is used. In the event that a column of annotation holds more than one accession number (which happens quite frequently), the first accession number is chosen.
1) | Canonical names in GetExpression could come from a combination of places
| |
|
Potential Names Used for the Canonical Name
RefSeq Protein ID
LocusLink ID
Affy Representative Public ID
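The fallback logic for the canonical name can be sketched as a priority list. This is a sketch under stated assumptions, not the SBEAMS implementation: the function names, the dictionary keys, the separators, and the `---`/`NA` missing-value markers are all assumptions. The default order follows the prose description (RefSeq Protein ID, then Unigene, then Representative Public ID); since the list above names LocusLink ID as a candidate, the order is left as a parameter.

```python
import re

# Assumed default order, taken from the prose description of the upload step.
DEFAULT_PRIORITY = ("RefSeq Protein ID", "Unigene", "Representative Public ID")


def first_accession(value):
    """Return the first accession in a field that may hold several, or None.

    Assumed separators: '///', commas, semicolons, and whitespace.
    Assumed missing-value markers: '---' and 'NA'.
    """
    if value is None:
        return None
    tokens = [t for t in re.split(r"\s*///\s*|[,;\s]+", value.strip()) if t]
    if not tokens or tokens[0] in ("---", "NA"):
        return None
    return tokens[0]


def choose_canonical_name(annotation, priority=DEFAULT_PRIORITY):
    """Pick the first usable accession, walking the columns in priority order."""
    for column in priority:
        name = first_accession(annotation.get(column))
        if name:
            return name
    return None
```

To try a different order, pass it explicitly, for example `choose_canonical_name(row, priority=("RefSeq Protein ID", "LocusLink", "Representative Public ID"))`, where `row` is a dictionary of annotation values for one probe set.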
|
|
Go Back to the Pipeline Overview |
|