iGenomes data set....which to use?

  • iGenomes data set....which to use?

    Can anyone shed any light on the relative merits of the 3 sources of reference data available on iGenomes (UCSC/NCBI/Ensembl)?

    Thanks

    Huw

  • #2
    They all (currently) share the same genome assembly for humans and mice.
    They differ in the annotation (where exactly are the genes, what are they called, what do the transcripts look like?).

    I personally prefer to work with Ensembl, since they have a clear release policy (making it easy to reference the exact data you were using), archives of older releases, a consistent set of 'stable ids' for genes/transcripts/proteins, and the largest set of (chordate) genomes, and they provide backlinks to the evidence for each transcript.

    UCSC is often a bit more cutting edge: they have something of an ongoing release process, and nowadays they include the ENCODE genes (which in turn come from Ensembl...).

    Can't really say anything about the NCBI.

    So long,
    Florian

    • #3
      Good question. I've heard anecdotally that running Cufflinks with RABT assembly against RefSeq/UCSC gives you MANY more novel isoforms than running it against Ensembl. Why that happens, or whether it's a bad thing, I'm not sure.

      • #4
        @turnersed: I guess the first question is whether they're really 'novel' isoforms, or whether they were already in Ensembl.
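
        One rough way to check this would be to compare intron chains between the 'novel' transcripts and an Ensembl GTF. The sketch below is only an illustration, not anything Cufflinks or iGenomes provides: the function names are made up, it assumes standard 9-column GTF lines with a transcript_id attribute, and it assumes the only naming difference between the two files is a "chr" prefix on chromosome names (which is a simplification, e.g. for the mitochondrial chromosome).

```python
import re
from collections import defaultdict


def intron_chains(gtf_lines):
    """Map transcript_id -> (chrom, strand, introns) built from GTF exon records.

    Hypothetical helper: assumes tab-separated, 9-column GTF lines.
    """
    exons = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match is None:
            continue
        chrom = fields[0]
        # Assumption: "chr1" in one file corresponds to "1" in the other.
        if chrom.startswith("chr"):
            chrom = chrom[3:]
        key = (match.group(1), chrom, fields[6])
        exons[key].append((int(fields[3]), int(fields[4])))

    chains = {}
    for (transcript_id, chrom, strand), coords in exons.items():
        coords.sort()
        # An intron spans the gap between two consecutive exons.
        introns = tuple(
            (left_end + 1, right_start - 1)
            for (_, left_end), (right_start, _) in zip(coords, coords[1:])
        )
        chains[transcript_id] = (chrom, strand, introns)
    return chains


def already_annotated(novel_lines, reference_lines):
    """For each multi-exon 'novel' transcript, report whether the reference
    contains a transcript with exactly the same intron chain."""
    reference = {
        chain for chain in intron_chains(reference_lines).values() if chain[2]
    }
    return {
        transcript_id: chain in reference
        for transcript_id, chain in intron_chains(novel_lines).items()
        if chain[2]  # single-exon transcripts have no introns to compare
    }
```

        Matching on intron chains rather than exact exon coordinates means that transcripts differing only in their outer start/end positions count as the same isoform; whether that is the right definition of 'novel' is a judgment call. Usage would be something like already_annotated(open("novel.gtf"), open("ensembl.gtf")), with both file names being placeholders.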
