
interactionLoader logical flowchart for error and dupe checking of bioentities and interactions

column order is checked 
	exits program if error
	
InteractionLoadingError.txt is created
	this is a tab delimited txt file 
	columns are set which are identical to the input file 	
	
InteractionErrorLog.txt is created
	this records errors resulting from database returns
	
Input file is processed
	skips all empty lines
	
The value of required fields can not be undef	

Bioentity1
	the value of required fields are checked
		I assume that there is always a bioentity1 present with a bioentity_common_name/or bioentity_canonical_name
		
		bioentity1_organism 
			 the value is validated against present database entries
		bioentity_type 			
			the value is validated against present database entries
		group
			the value is validated against present database entries
			if validated 
				the association between the group and organism is checked
				
	if any of the above fail, the offending row is written to InteractionLoadingError.txt, 
	a descriptive Error message is added, the row is purged from memory and the next line is processed
	
Bioentity2
	bioentity2 values are only checked if bioentity1 validation was successful
		presence of a bioentity2_common_name and/or bioentity2_canonical_name are checked
		same values as in bioentity1 are checked
	
	if any of the above fail, the offending row is written to InteractionLoadingError.txt, 
	a descriptive Error message is added, the row concerning bioentity2 is purged from memory and the next line is processed	
	
	At this point one could have a bioentity1 but no bioentity2
	No interaction check is done for such cases
	
	only for a complete set (bioentity1 and bioentity2 validation is successful) interaction requirements are checked

Interaction
	the value of the required fields are checked
		interaction_type
			 the value is validated against present database entries
		
		if this fails, the offending row is written to InteractionLoadingError.txt, 
		a descriptive Error message is added, this interaction is purged from memory and the next line is processed
		However, complete information regarding bioentity1 and bioentity2 are kept for database entry
		
		the values of bioentity1_state
									bioentity1_regulatory_feature
									bioentity2_regulatory_feature
									assay_type
									confidence_score
									pubMedId
		are checked. If any of them is defined than that value is validated against present database entries
		If any of them fail, the offending row is written to InteractionLoadingError.txt, 
		a descriptive Error message is added. The interaction is purged from memory and the next line is processed	
		However, complete information regarding bioentity1 and bioentity2 are kept for database entry
		
		
Database update or insert

at this point: bioentity1 >= bioentity2 >= interaction
all of these records are keyed on a common record_id

Bioentity1 and Bioentity2

Main Query:			
			Select BE.bioentity_id
			from $TBIN_BIOENTITY BE
			join $TBIN_BIOENTITY_TYPE BT on (BE.bioentity_type_id = BT.bioentity_type_id)
			full outer join $TBIN_INTERACTION I on (BE.bioentity_id = I.bioentity1_id)
			full outer join $TBIN_INTERACTION I2 on (BE.bioentity_id = I2.bioentity2_id)
			full outer join $TB_ORGANISM SDO on (BE.organism_id = SDO.organism_id)
			full outer join $TBIN_INTERACTION_GROUP IG on (SDO.organism_id = IG.organism_id)
			where BT.bioentity_type_name = \'$hashRef->{$record}->{'bioentityType'.$num}\'
			and SDO.organism_name ='$hashRef->{$record}->{'organismName'.$num}\' 
			and IG.interaction_group_name = \'$hashRef->{$record}->{group}\'";

the presence of bioentity1 and bioentity2 in the database is checked

	commonQuery: "and  bioentity_id from bioentity where bioentity_common_name = ? if (bioentity(1/2)_common_name)"		
	canonicalQuery: "and  bioentity_id from bioentity where bioentity_canonical_name = ? if (bioentity(1/2)_canonical_name)" 
		
	if both names are defined, 
	the result maybe:
		-	each returns 1 row and the same bioentity_id -- success  -- do an UPDATE, build an update query, use only defined value from the input file
			add the bioentity_id to the interaction record if it exists.  			
		-	returns >1 row and/or different bioentity_id for each  -- failure --  Error is written to the ErrorLog and the interaction record is deleted if it exists	 		
		-	one returns 0 rows and no bioentity_id  --  the record is checked further: 
				a new query "Select bioentity_common_name (bioentity_canonical_name) from bioentity where bioentity_canonical_name
									(bionetity_common_name) = ?"
				is performed to check that either name (common or canonical) is not associated with a defined value for the other
				if this query returns a defined value then it will be an Error. Error is written to the ErrorLog and the interaction record is deleted if it exists	  						
				otherwise -- success -- do an UPDATE, build an update query, use only defined values from the input file. 	add the bioentity_id to the interaction record if it exists.  
		- no rows are returned from either query -- success -- do an INSERT. Insert query, use only defined values from the input file. add the bioentity_id to the interaction record if it exists.  

	if one name is defined
		-	returns 1 row	 -- success -- 	do an UPDATE, build an update query, use only defined value from the input file
		add the bioentity_id to the interaction record if it exists
		- retuns 0 row -- success  --  do an UPDATE, build an update query, use only defined value from the input file
		add the bioentity_id to the interaction record if it exists
		- returns >1 row -- failure --  Error is written to the ErrorLog and the interaction record is deleted if it exists
	 		
at this point only complete interaction records are in the Interaction

Interaction Query
	"select Interaction_id from $TBIN_INTERACTION I	where bioentity1_id = $hashRef->{$record}->{bioentityID1} and 
	 bioentity2_id = $hashRef->{$record}->{bioentityID2}"
	 
	 - return 1 row  -- success --  do an INSERT.   
	 - return >1 row -- failure --  Error is written to the ErrorLog
	 
	 		
 
