Description
We noticed the following issue: starting from the term UBERON:0001295 endometrium, in platypus (speciesId 9258) we can reach the following ancestors through indirect relations:
mysql> select t1.anatEntityTargetId, t3.anatEntityName from anatEntityRelation as t1 inner join anatEntityRelationTaxonConstraint as t2 on t1.anatEntityRelationId = t2.anatEntityRelationId inner join anatEntity as t3 on t1.anatEntityTargetId = t3.anatEntityId where (t2.speciesId is null or t2.speciesId = 9258) and t1.anatEntitySourceId = 'UBERON:0001295' and t1.relationType = 'is_a part_of' and t1.relationStatus = 'indirect';
+--------------------+----------------------------+
| anatEntityTargetId | anatEntityName |
+--------------------+----------------------------+
| UBERON:0000025 | tube |
| UBERON:0000060 | anatomical wall |
| UBERON:0000061 | anatomical structure |
| UBERON:0000062 | organ |
| UBERON:0000064 | organ part |
| UBERON:0000344 | mucosa |
| UBERON:0000465 | material anatomical entity |
| UBERON:0000467 | anatomical system |
| UBERON:0000468 | multi-cellular organism |
| UBERON:0000474 | female reproductive system |
| UBERON:0000480 | anatomical group |
| UBERON:0000990 | reproductive system |
| UBERON:0000993 | oviduct |
| UBERON:0003100 | female organism |
| UBERON:0003133 | reproductive organ |
| UBERON:0004111 | anatomical conduit |
| UBERON:0004120 | mesoderm-derived structure |
| UBERON:0004175 | internal genitalia |
| UBERON:0004923 | organ component layer |
| UBERON:0005156 | reproductive structure |
| UBERON:0013515 | subdivision of oviduct |
| UBERON:0013522 | subdivision of tube |
+--------------------+----------------------------+
But we cannot reach the following structures by following the chain of direct relations stored in the database for platypus:
UBERON:0000025 tube
UBERON:0000993 oviduct
UBERON:0004111 anatomical conduit
UBERON:0004175 internal genitalia
UBERON:0013515 subdivision of oviduct
UBERON:0013522 subdivision of tube
They are all ancestors of uterus, which does not exist in platypus. Hmm... The queries below walk the chain of direct relations stored in the database for platypus, one level at a time:
mysql> select t1.anatEntityTargetId, t3.anatEntityName from anatEntityRelation as t1 inner join anatEntityRelationTaxonConstraint as t2 on t1.anatEntityRelationId = t2.anatEntityRelationId inner join anatEntity as t3 on t1.anatEntityTargetId = t3.anatEntityId where (t2.speciesId is null or t2.speciesId = 9258) and t1.relationType = 'is_a part_of' and t1.relationStatus = 'direct' and t1.anatEntitySourceId = 'UBERON:0001295';
+--------------------+----------------------------+
| anatEntityTargetId | anatEntityName |
+--------------------+----------------------------+
| UBERON:0019042 | reproductive system mucosa |
+--------------------+----------------------------+
1 row in set (0.09 sec)
mysql> select t1.anatEntityTargetId, t3.anatEntityName from anatEntityRelation as t1 inner join anatEntityRelationTaxonConstraint as t2 on t1.anatEntityRelationId = t2.anatEntityRelationId inner join anatEntity as t3 on t1.anatEntityTargetId = t3.anatEntityId where (t2.speciesId is null or t2.speciesId = 9258) and t1.relationType = 'is_a part_of' and t1.relationStatus = 'direct' and t1.anatEntitySourceId = 'UBERON:0019042';
+--------------------+------------------------+
| anatEntityTargetId | anatEntityName |
+--------------------+------------------------+
| UBERON:0000344 | mucosa |
| UBERON:0005156 | reproductive structure |
+--------------------+------------------------+
2 rows in set (0.12 sec)
mysql> select t1.anatEntityTargetId, t3.anatEntityName from anatEntityRelation as t1 inner join anatEntityRelationTaxonConstraint as t2 on t1.anatEntityRelationId = t2.anatEntityRelationId inner join anatEntity as t3 on t1.anatEntityTargetId = t3.anatEntityId where (t2.speciesId is null or t2.speciesId = 9258) and t1.relationType = 'is_a part_of' and t1.relationStatus = 'direct' and t1.anatEntitySourceId in ('UBERON:0000344', 'UBERON:0005156');
+--------------------+-----------------------+
| anatEntityTargetId | anatEntityName |
+--------------------+-----------------------+
| UBERON:0004923 | organ component layer |
| UBERON:0000990 | reproductive system |
+--------------------+-----------------------+
2 rows in set (0.11 sec)
mysql> select t1.anatEntityTargetId, t3.anatEntityName from anatEntityRelation as t1 inner join anatEntityRelationTaxonConstraint as t2 on t1.anatEntityRelationId = t2.anatEntityRelationId inner join anatEntity as t3 on t1.anatEntityTargetId = t3.anatEntityId where (t2.speciesId is null or t2.speciesId = 9258) and t1.relationType = 'is_a part_of' and t1.relationStatus = 'direct' and t1.anatEntitySourceId in ('UBERON:0004923', 'UBERON:0000990');
+--------------------+----------------------------+
| anatEntityTargetId | anatEntityName |
+--------------------+----------------------------+
| UBERON:0004120 | mesoderm-derived structure |
| UBERON:0000060 | anatomical wall |
+--------------------+----------------------------+
2 rows in set (0.10 sec)
mysql> select t1.anatEntityTargetId, t3.anatEntityName from anatEntityRelation as t1 inner join anatEntityRelationTaxonConstraint as t2 on t1.anatEntityRelationId = t2.anatEntityRelationId inner join anatEntity as t3 on t1.anatEntityTargetId = t3.anatEntityId where (t2.speciesId is null or t2.speciesId = 9258) and t1.relationType = 'is_a part_of' and t1.relationStatus = 'direct' and t1.anatEntitySourceId in ('UBERON:0004120', 'UBERON:0000060');
+--------------------+----------------------+
| anatEntityTargetId | anatEntityName |
+--------------------+----------------------+
| UBERON:0000064 | organ part |
| UBERON:0000061 | anatomical structure |
+--------------------+----------------------+
2 rows in set (0.07 sec)
mysql> select t1.anatEntityTargetId, t3.anatEntityName from anatEntityRelation as t1 inner join anatEntityRelationTaxonConstraint as t2 on t1.anatEntityRelationId = t2.anatEntityRelationId inner join anatEntity as t3 on t1.anatEntityTargetId = t3.anatEntityId where (t2.speciesId is null or t2.speciesId = 9258) and t1.relationType = 'is_a part_of' and t1.relationStatus = 'direct' and t1.anatEntitySourceId in ('UBERON:0000064', 'UBERON:0000061');
+--------------------+----------------------------+
| anatEntityTargetId | anatEntityName |
+--------------------+----------------------------+
| UBERON:0000465 | material anatomical entity |
| UBERON:0000062 | organ |
+--------------------+----------------------------+
2 rows in set (0.08 sec)
mysql> select t1.anatEntityTargetId, t3.anatEntityName from anatEntityRelation as t1 inner join anatEntityRelationTaxonConstraint as t2 on t1.anatEntityRelationId = t2.anatEntityRelationId inner join anatEntity as t3 on t1.anatEntityTargetId = t3.anatEntityId where (t2.speciesId is null or t2.speciesId = 9258) and t1.relationType = 'is_a part_of' and t1.relationStatus = 'direct' and t1.anatEntitySourceId in ('UBERON:0000465', 'UBERON:0000062');
+--------------------+-------------------+
| anatEntityTargetId | anatEntityName |
+--------------------+-------------------+
| UBERON:0000467 | anatomical system |
+--------------------+-------------------+
1 row in set (0.10 sec)
mysql> select t1.anatEntityTargetId, t3.anatEntityName from anatEntityRelation as t1 inner join anatEntityRelationTaxonConstraint as t2 on t1.anatEntityRelationId = t2.anatEntityRelationId inner join anatEntity as t3 on t1.anatEntityTargetId = t3.anatEntityId where (t2.speciesId is null or t2.speciesId = 9258) and t1.relationType = 'is_a part_of' and t1.relationStatus = 'direct' and t1.anatEntitySourceId in ('UBERON:0000467');
+--------------------+------------------+
| anatEntityTargetId | anatEntityName |
+--------------------+------------------+
| UBERON:0000480 | anatomical group |
+--------------------+------------------+
1 row in set (0.09 sec)
mysql> select t1.anatEntityTargetId, t3.anatEntityName from anatEntityRelation as t1 inner join anatEntityRelationTaxonConstraint as t2 on t1.anatEntityRelationId = t2.anatEntityRelationId inner join anatEntity as t3 on t1.anatEntityTargetId = t3.anatEntityId where (t2.speciesId is null or t2.speciesId = 9258) and t1.relationType = 'is_a part_of' and t1.relationStatus = 'direct' and t1.anatEntitySourceId in ('UBERON:0000480');
+--------------------+-------------------------+
| anatEntityTargetId | anatEntityName |
+--------------------+-------------------------+
| UBERON:0000468 | multi-cellular organism |
+--------------------+-------------------------+
1 row in set (0.14 sec)
mysql> select t1.anatEntityTargetId, t3.anatEntityName from anatEntityRelation as t1 inner join anatEntityRelationTaxonConstraint as t2 on t1.anatEntityRelationId = t2.anatEntityRelationId inner join anatEntity as t3 on t1.anatEntityTargetId = t3.anatEntityId where (t2.speciesId is null or t2.speciesId = 9258) and t1.relationType = 'is_a part_of' and t1.relationStatus = 'direct' and t1.anatEntitySourceId in ('UBERON:0000468');
+--------------------+----------------------+
| anatEntityTargetId | anatEntityName |
+--------------------+----------------------+
| UBERON:0000061 | anatomical structure |
+--------------------+----------------------+
1 row in set (0.10 sec)
mysql> select t1.anatEntityTargetId, t3.anatEntityName from anatEntityRelation as t1 inner join anatEntityRelationTaxonConstraint as t2 on t1.anatEntityRelationId = t2.anatEntityRelationId inner join anatEntity as t3 on t1.anatEntityTargetId = t3.anatEntityId where (t2.speciesId is null or t2.speciesId = 9258) and t1.relationType = 'is_a part_of' and t1.relationStatus = 'direct' and t1.anatEntitySourceId in ('UBERON:0000061');
+--------------------+----------------------------+
| anatEntityTargetId | anatEntityName |
+--------------------+----------------------------+
| UBERON:0000465 | material anatomical entity |
+--------------------+----------------------------+
1 row in set (0.16 sec)
mysql> select t1.anatEntityTargetId, t3.anatEntityName from anatEntityRelation as t1 inner join anatEntityRelationTaxonConstraint as t2 on t1.anatEntityRelationId = t2.anatEntityRelationId inner join anatEntity as t3 on t1.anatEntityTargetId = t3.anatEntityId where (t2.speciesId is null or t2.speciesId = 9258) and t1.relationType = 'is_a part_of' and t1.relationStatus = 'direct' and t1.anatEntitySourceId in ('UBERON:0000465');
Empty set (0.08 sec)
The direct and indirect relations are retrieved in our pipeline from our custom version of Uberon, generated_files/uberon/custom_composite.obo, by the method org.bgee.pipeline.uberon.InsertUberon.generateRelationTOsFirstPass(Map, Map, Uberon, Set, Collection) in bgee_pipeline of BgeeDB/bgee_apps. This code uses OWLGraphWrapper. First, it retrieves all relations, with chains of object properties combined where possible, using the method OWLGraphWrapper.getOutgoingEdgesNamedClosureOverSupPropsWithGCI(OWLClass). Then it also retrieves the direct relations using the method OWLGraphWrapper.getOutgoingEdgesWithGCI(OWLClass). That is how the distinction between direct and indirect relations is made.
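As a rough model of that two-step classification (a sketch only, not the actual InsertUberon code; the function name and the tiny edge sets are hypothetical stand-ins for what the two OWLGraphWrapper calls return), a relation can be tagged as direct if it appears among the direct outgoing edges, and as indirect if it appears only in the closure:

```python
def classify_relations(closure_edges, direct_edges):
    """Tag each (source, target) pair as 'direct' or 'indirect'.

    closure_edges: pairs standing in for the result of the closure call.
    direct_edges:  pairs standing in for the result of the direct-edge call.
    """
    status = {}
    for edge in closure_edges | direct_edges:
        status[edge] = "direct" if edge in direct_edges else "indirect"
    return status


# The direct pairs are the first two steps shown in the queries above;
# the closure set simply adds their transitive combination.
direct = {
    ("UBERON:0001295", "UBERON:0019042"),
    ("UBERON:0019042", "UBERON:0000344"),
    ("UBERON:0019042", "UBERON:0005156"),
}
closure = direct | {
    ("UBERON:0001295", "UBERON:0000344"),
    ("UBERON:0001295", "UBERON:0005156"),
}

status = classify_relations(closure, direct)
```

In this model, nothing guarantees that an indirect pair can be rebuilt from the direct pairs that are finally stored, which is exactly the situation observed here.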
=> This means that OWLGraphWrapper has inferred by relation reduction a set of indirect relations that we cannot recover just by following the direct relations stored in the Bgee database. We need to investigate how the relations returned by getOutgoingEdgesNamedClosureOverSupPropsWithGCI are produced. Is it a bug, or is it correct behavior?
IDEA: maybe the GCI relations do not take taxon constraints on OWLClasses into account, whereas we do take them into account when inserting into the database. Maybe getOutgoingEdgesNamedClosureOverSupPropsWithGCI and getOutgoingEdgesWithGCI did retrieve relations between endometrium and uterus in platypus, but we removed them at insertion time because uterus does not exist in platypus. However, we still kept the relations that had been inferred through the relations incoming to or outgoing from uterus.
The fix would be to discard the relations returned by getOutgoingEdgesNamedClosureOverSupPropsWithGCI if they go through an OWLClass that does not exist in the requested species.
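To illustrate the idea (a minimal sketch in Python rather than the pipeline's Java; the function name is hypothetical and the edges are made up for illustration, they are NOT taken from Uberon), the closure can be computed so that it never traverses a class that is absent from the requested species:

```python
from collections import deque


def closure_within_species(edges, start, exists_in_species):
    """Return the ancestors of `start`, never traversing through a class
    for which exists_in_species(cls) is False."""
    seen = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in edges.get(current, ()):
            if target in seen or not exists_in_species(target):
                continue
            seen.add(target)
            queue.append(target)
    return seen


# Illustrative edges only: they mimic the hypothesis that some ancestors
# are reachable solely through uterus.
edges = {
    "endometrium": ["reproductive system mucosa", "uterus"],
    "uterus": ["subdivision of oviduct"],
    "subdivision of oviduct": ["oviduct"],
    "reproductive system mucosa": ["mucosa"],
}
absent_in_platypus = {"uterus"}

unfiltered = closure_within_species(edges, "endometrium", lambda c: True)
filtered = closure_within_species(
    edges, "endometrium", lambda c: c not in absent_in_platypus
)
print(sorted(unfiltered - filtered))
# ['oviduct', 'subdivision of oviduct', 'uterus']
```

A class that is also reachable through a path made only of valid classes is still kept, so the filter removes only ancestors whose every path goes through an absent class.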
If this is really the source of the problem, then after the fix we can add a check after insertion into the database: walk the path of direct relations and verify that we retrieve exactly the same terms as those reached through indirect relations.
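Such a check could look roughly like the following (a sketch under assumptions: the relations for one species are assumed to be already loaded into two dictionaries mapping a source ID to its target IDs, and both function names are hypothetical). The sample data is a small subset of the rows shown above, with oviduct as the only indirect target that the walk cannot reach:

```python
def reachable(direct, start):
    """All terms reachable from `start` by chaining direct relations."""
    seen, stack = set(), [start]
    while stack:
        for target in direct.get(stack.pop(), ()):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


def check_relations(direct, indirect):
    """Compare, per source, the terms reached by walking direct relations
    with the stored direct + indirect targets. Returns only mismatches."""
    problems = {}
    for source in set(direct) | set(indirect):
        walked = reachable(direct, source)
        stored = set(direct.get(source, ())) | set(indirect.get(source, ()))
        if walked != stored:
            problems[source] = {
                "unreachable": stored - walked,
                "not_stored": walked - stored,
            }
    return problems


# Subset of the platypus rows listed in this issue.
direct = {
    "UBERON:0001295": ["UBERON:0019042"],
    "UBERON:0019042": ["UBERON:0000344", "UBERON:0005156"],
}
indirect = {
    "UBERON:0001295": ["UBERON:0000344", "UBERON:0005156", "UBERON:0000993"],
}

problems = check_relations(direct, indirect)
```

With this subset, `problems` contains a single entry for UBERON:0001295 whose "unreachable" set is {"UBERON:0000993"}. An empty result would mean that the direct and indirect relations are consistent.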
But if it is not a bug and these ancestors are really reachable through relation reduction, we need to think about how to provide relations to topGO: we provide it with only the direct relations, so it cannot reach the terms that we reach in Bgee through inferred indirect relations.