geneontology association table and uniprot table entry duplication #12

@sinanshi

http://geneontology.org/gene-associations/gene_association.goa_uniprot_noiea.gz
Running the script with either the old or the new file fails with the following errors:

Can't execute: Duplicate entry 'UniProtKB-P04637-GO:0005737-IDA-PMID:16131611-' for key 'PRIMARY'
DBD::mysql::st execute failed: Duplicate entry 'UniProtKB-P04637-GO:0005737-IDA-PMID:16131611-' for key 'PRIMARY' at perl/yogy_add_go_assocs.pl line 94, <GO_TERMS> line 154471.

One example in gene_association.dictyBase:

dictyBase       DDB_G0268004    snrp70          GO:0005685      GO_REF:0000024  ISS     UniProtKB:Q00916        C       U1 small nuclear ribonucleoprotein 70 kDa protein               gene    taxon:44689     20060120        dictyBase
dictyBase       DDB_G0268004    snrp70          GO:0005685      GO_REF:0000024  ISS     UniProtKB:P08621        C       U1 small nuclear ribonucleoprotein 70 kDa protein               gene    taxon:44689     20131213        UniProt

Apparently these two entries are identical in the columns used as the primary key in the SQL insert; they differ only in the UniProtKB accession (Q00916 vs. P08621) and in the last two columns. So I think the best fix is to remove the duplicated rows. Please let me know if you have any other concerns.
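
To make the proposal concrete, here is a minimal sketch of the de-duplication step. It is written in Python for illustration, not in the Perl of yogy_add_go_assocs.pl, and it is not the script's actual logic. The key columns in `KEY_COLUMNS` are an assumption inferred from the shape of the duplicate key in the error message; `assoc_key` and `dedupe_associations` are hypothetical names. The two sample rows are the dictyBase entries quoted above, with the empty columns restored as tab-separated fields.

```python
# Hypothetical key columns (0-based indices into a tab-separated association line).
# ASSUMPTION: DB, object ID, GO ID, evidence code, reference, qualifier, guessed
# from the 'UniProtKB-P04637-GO:0005737-IDA-PMID:16131611-' key in the error.
KEY_COLUMNS = (0, 1, 4, 6, 5, 3)


def assoc_key(line):
    """Build the assumed primary-key string for one association line."""
    fields = line.rstrip("\n").split("\t")
    return "-".join(fields[i] for i in KEY_COLUMNS)


def dedupe_associations(lines):
    """Yield the first line seen for each key; skip '!' comment lines."""
    seen = set()
    for line in lines:
        if line.startswith("!"):
            continue
        key = assoc_key(line)
        if key in seen:
            continue  # same key as an earlier row: drop it before the insert
        seen.add(key)
        yield line


# The two dictyBase rows from this issue, rebuilt with explicit empty fields.
ROW_A = "\t".join([
    "dictyBase", "DDB_G0268004", "snrp70", "", "GO:0005685", "GO_REF:0000024",
    "ISS", "UniProtKB:Q00916", "C",
    "U1 small nuclear ribonucleoprotein 70 kDa protein", "", "gene",
    "taxon:44689", "20060120", "dictyBase",
])
ROW_B = "\t".join([
    "dictyBase", "DDB_G0268004", "snrp70", "", "GO:0005685", "GO_REF:0000024",
    "ISS", "UniProtKB:P08621", "C",
    "U1 small nuclear ribonucleoprotein 70 kDa protein", "", "gene",
    "taxon:44689", "20131213", "UniProt",
])

if __name__ == "__main__":
    for kept in dedupe_associations([ROW_A, ROW_B]):
        print(assoc_key(kept))
```

Keeping the first row per key is an arbitrary choice here; keeping the most recent one instead would need a comparison on the date column. Another option would be to leave the file untouched and let MySQL skip the clashing rows with `INSERT IGNORE`, at the cost of silently hiding them.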
