Mismatch between Illumina Mouse Methylation BeadChip annotation and genome browsers (UCSC/Ensembl) #3

@LuisaTurco

Hi all,

I'm currently analyzing data from the Illumina Infinium Mouse Methylation BeadChip and have encountered some inconsistencies between the annotation data (downloaded via BioMart, reference strain C57BL/6J) and public genome databases (e.g., UCSC, Ensembl).

When I manually map differentially methylated CpG sites (dmCpGs) based on the chromosomal coordinates provided in the annotation file, they often do not correspond to the same genes or regions shown in UCSC or Ensembl (tested with both mm10/GRCm38 and mm39/GRCm39). In some cases, the CpG is annotated to a gene in the Illumina manifest, but the coordinate does not fall within or near that gene in the genome browsers.

In addition, I would like to annotate each dmCpG with its exact position relative to promoter regions and transcription start sites (TSS), but the annotation inconsistencies described above make this unreliable.

Questions:

  1. What genome build was used to generate the current annotation files?
  2. Are the CpG probe annotations based on custom gene models, or were they lifted over between assemblies?
  3. Is there an updated annotation file that better aligns with UCSC/Ensembl references?
  4. Are there recommended tools or workflows to reliably annotate CpG sites in terms of proximity to TSS or promoter regions?
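
To make question 4 concrete, here is a minimal sketch of the kind of strand-aware check I have in mind. Everything in it is an assumption for illustration: the `Gene` record, the example coordinates, and the promoter window (2 kb upstream, 500 bp downstream of the TSS) are hypothetical placeholders, not values taken from the Illumina manifest, BioMart, or any genome build. It also assumes 1-based inclusive coordinates on a single, consistent assembly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; coordinates are 1-based and inclusive."""
    name: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"

    @property
    def tss(self) -> int:
        # The TSS is the 5' end of the gene: start on "+", end on "-".
        return self.start if self.strand == "+" else self.end


def signed_distance_to_tss(pos: int, gene: Gene) -> int:
    """Distance in transcript orientation: negative = upstream, positive = downstream."""
    offset = pos - gene.tss
    return offset if gene.strand == "+" else -offset


def classify_cpg(chrom: str, pos: int, gene: Gene,
                 upstream: int = 2000, downstream: int = 500) -> str:
    """Label a CpG relative to one gene; the window sizes are assumed defaults."""
    if chrom != gene.chrom:
        return "other_chromosome"
    distance = signed_distance_to_tss(pos, gene)
    if -upstream <= distance <= downstream:
        return "promoter"
    if gene.start <= pos <= gene.end:
        return "gene_body"
    return "intergenic"


# Made-up minus-strand gene, so the TSS sits at the higher coordinate.
example_gene = Gene("GeneA", "chr1", 10_000, 20_000, "-")
for cpg_pos in (20_500, 15_000, 30_000):
    label = classify_cpg("chr1", cpg_pos, example_gene)
    print(cpg_pos, signed_distance_to_tss(cpg_pos, example_gene), label)
```

A check like this only gives meaningful results if the CpG coordinate and the gene model come from the same assembly and use the same coordinate convention, which is exactly what I cannot currently confirm for the annotation file.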

Thanks in advance for your help!
