
  • RNA-seq: reference genome of another species

    Hi,

    I am running TopHat and Cufflinks on dog RNA-seq data using the human reference genome. I get very few genes this way, and many more when I use the canine reference genome.

    I know I will probably have to relax some parameters in the alignment step, etc.
    Has anyone here done this before? Could you please give me some advice?

    Thanks

  • #2
    If you have a canine reference genome, why are you trying to align to the human reference genome? It really makes no sense to do this.

    That being said, I have a lot of experience aligning reads to the genome of a different species, particularly in the Brassicas, where until recently I did not have a reference. You can increase mapping by relaxing some of the parameters, but this also lowers the quality of your mapping and results in more multi-mapped reads. The approach I typically use is to align in multiple steps. I first align everything using very stringent settings, extract the unmapped reads, and realign those at slightly relaxed settings. Then I repeat, relaxing the settings a little more each time.
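A minimal sketch of the multi-step idea described above, written as a tool-agnostic Python loop. Everything here is an assumption for illustration: `toy_aligner` is a brute-force, ungapped stand-in for a real aligner such as TopHat or BWA, and "stringency" is reduced to a single maximum-mismatch count. It only shows the bookkeeping: reads that map at a strict setting are kept as they are, and only the leftovers are retried at the next, looser setting.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical aligner signature: (read sequence, max mismatches) -> position or None.
# A real pipeline would call an external aligner here instead.
Aligner = Callable[[str, int], Optional[int]]


def toy_aligner(reference: str) -> Aligner:
    """Return a brute-force aligner: first ungapped hit within the mismatch limit."""
    def align(read: str, max_mismatches: int) -> Optional[int]:
        for start in range(len(reference) - len(read) + 1):
            window = reference[start:start + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= max_mismatches:
                return start
        return None
    return align


def iterative_align(
    reads: Dict[str, str],
    align: Aligner,
    mismatch_steps: Iterable[int],
) -> Tuple[Dict[str, Tuple[int, int]], List[str]]:
    """Align strictly first, then retry only the unmapped reads at each looser step."""
    mapped: Dict[str, Tuple[int, int]] = {}  # read name -> (position, setting that mapped it)
    unmapped = dict(reads)
    for max_mismatches in mismatch_steps:
        still_unmapped: Dict[str, str] = {}
        for name, seq in unmapped.items():
            position = align(seq, max_mismatches)
            if position is None:
                still_unmapped[name] = seq
            else:
                mapped[name] = (position, max_mismatches)
        unmapped = still_unmapped
        if not unmapped:
            break  # nothing left to retry
    return mapped, sorted(unmapped)


reference = "ACGTACGTTTGACCA"
reads = {"r1": "ACGTAC", "r2": "TTGTCC", "r3": "GGGGGG"}
mapped, unmapped = iterative_align(reads, toy_aligner(reference), [0, 1, 2])
```

Recording which setting mapped each read makes it easy to see later how much of the result depends on the relaxed passes, and to drop those passes if they look unreliable.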

    Since Brassica varieties can differ considerably, even within the same species, I have also taken the approach of doing an initial mapping and then calling variants. I then use those variants to create a consensus genome sequence for each variety using GATK's FastaAlternateReferenceMaker, realign to that consensus, and obtain much better mapping. However, I have only used this approach for highly divergent varieties within the same species, and I would probably not recommend it for species as divergent as humans and dogs.
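To make the consensus step concrete, here is a toy sketch of what "use the variants to create a consensus sequence" means. It is not GATK and does not reproduce GATK's behaviour: it handles SNPs only, and the `Snp` tuple and `apply_snps` function are hypothetical names invented for this example.

```python
from typing import Dict, List, Tuple

# (contig, 1-based position, reference base, alternate base), as in a VCF row.
Snp = Tuple[str, int, str, str]


def apply_snps(reference: Dict[str, str], snps: List[Snp]) -> Dict[str, str]:
    """Substitute each SNP's alternate base into a copy of the reference."""
    consensus = {name: list(seq) for name, seq in reference.items()}
    for contig, position, ref_base, alt_base in snps:
        index = position - 1  # VCF positions are 1-based
        found = consensus[contig][index]
        if found.upper() != ref_base.upper():
            # Guard against applying variants called on a different reference.
            raise ValueError(
                f"{contig}:{position} expected {ref_base}, found {found}"
            )
        consensus[contig][index] = alt_base
    return {name: "".join(bases) for name, bases in consensus.items()}


reference = {"chr1": "ACGTACGT"}
snps = [("chr1", 2, "C", "T"), ("chr1", 8, "T", "G")]
consensus = apply_snps(reference, snps)
```

Indels are left out on purpose: they shift coordinates, which is one reason to let a dedicated tool build the real consensus.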


    • #3
      Thanks for your input. I agree with you, except that the human genome is much better annotated than the dog genome, so there is a benefit to aligning against it if it works.

      I recently came across a paper (below) that reported a very high alignment rate between dog RNA and the human genome, although it was unclear how they achieved it. They didn't use the Tuxedo software, though.

      Regards


      *********************************************

      J Card Fail. 2012 Nov;18(11):872-8. doi: 10.1016/j.cardfail.2012.09.004.

      A global transcriptome analysis of a dog model of congestive heart failure with the human genome as a reference.

      Isono T, Matsumoto T, Wada A, Suzaki M, Chano T.


      Source

      Central Research Laboratory, Shiga University of Medical Science, Otsu, Shiga, Japan.


      Abstract


      BACKGROUND:

      The global molecular changes in cardiac tissue during congestive heart failure (CHF) have not been fully examined. Transcriptome analysis with the use of next-generation sequencers is a useful tool for elucidating the pathogenesis of CHF. Although there are some advantages in a dog CHF model, transcriptome analyses in dogs are limited by the relative lack of genomic information.

      METHODS AND RESULTS:

      The transcriptome analysis of hearts from dogs with CHF was conducted with the use of a genome analyzer and the Casava software. The mRNA sequence reads showed alignments with ∼800 of 1,019 genes from the dog reference database. On the other hand, the reads aligned with ∼15,000 of the 21,407 genes in the hg19 human reference database. The correlation of expressed genes was extremely high (r = 0.93; P < .0001) between the dog and human databases. A pathway analysis using the hg19 reference revealed increased expression of p53 pathway-related (P < 10^-10) and inflammatory interleukin-related (P < 10^-10) genes in the CHF model.

      CONCLUSIONS:

      The use of the human genome as a reference in global transcriptome analyses of dogs is a useful approach for investigating diseases such as CHF. Such an approach would also be useful for analyzing disease models in other experimental animals.

      Copyright © 2012 Elsevier Inc. All rights reserved.


      • #4
        I have the same problem when it comes to new plant genomes, which are poorly annotated in comparison to the Arabidopsis genome.

        My solution to this was to BLAST the transcripts from the other genomes against Arabidopsis and use the top hit as my annotation. This was also the approach used just recently in a paper on tomato, and it is the basic principle behind Blast2GO.

        My problem with aligning to a genome that is too divergent is that polyploidy is very common in plants. In mapping Brassica reads onto Arabidopsis or some other species, I miss some of the polyploidization events that occurred in the evolution of these species, which can lead to messy results.

        EDIT: I realize you are not working with plants. I am just pointing out that there are major issues in trying to do RNA-seq using reference genomes from other species, and that if you have a reference, even one that is not as well annotated, mapping to another reference may not be the best approach.
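As a rough illustration of the "top hit as annotation" idea, the sketch below picks the best hit per query from BLAST tabular output (`-outfmt 6`, whose default twelve columns end with e-value and bit score). Choosing by highest bit score is an assumption about what "top hit" means here, and the transcript and gene names in the sample are made up.

```python
from typing import Dict, Iterable, Tuple


def top_hits(blast_lines: Iterable[str]) -> Dict[str, Tuple[str, float]]:
    """Map each query ID to (subject ID, bit score) of its highest-scoring hit."""
    best: Dict[str, Tuple[str, float]] = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        bitscore = float(fields[11])  # 12th column of the default outfmt 6 layout
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


# Made-up example rows: qseqid sseqid pident length mismatch gapopen
# qstart qend sstart send evalue bitscore
sample = [
    "tx1\tgeneA\t91.2\t300\t26\t0\t1\t300\t5\t304\t1e-100\t350.0",
    "tx1\tgeneB\t85.0\t280\t42\t0\t1\t280\t9\t288\t1e-60\t210.5",
    "tx2\tgeneC\t78.4\t150\t32\t1\t3\t152\t40\t189\t1e-20\t95.1",
    "tx2\tgeneD\t88.9\t200\t22\t0\t1\t200\t1\t200\t1e-70\t240.0",
]
annotation = top_hits(sample)
```

In practice you would probably also filter on e-value or coverage before trusting a top hit, since a weak best hit is still a best hit.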
        Last edited by chadn737; 07-09-2013, 12:41 PM.


        • #5
          Hiya all,

          I am trying to do something similar (aligning tiger Illumina paired-end reads to the cat reference genome using BWA). The problem is that I am an extreme newbie.

          chadn737, your technique of aligning to a non-reference species sounds like a good way to go, but would you (or anyone else) mind expanding on it a little, if possible in the context of BWA? How do you realign the unmapped reads with less stringency while keeping the already aligned reads?

          Also, how far would you go? I have no idea how many unmapped reads I should expect or when I should stop relaxing settings. Which settings should or shouldn't I change? I am guessing that I would allow more gaps and mismatches, but do I then need to decrease the penalties too?


          • #6
            If you have reasonably deep data sets, why not do a de novo transcriptome assembly and map the reads back to that for the expression testing? Trinity builds all of this into a nice package.


            • #7
              Originally posted by Wallysb01:
              If you have reasonably deep data sets, why not do a de novo transcriptome assembly and map the reads back to that for the expression testing? Trinity builds all of this into a nice package.

              Oops, sorry, I totally forgot to mention that I am working with genomic, not transcriptomic, data.


              • #8
                @JQL: Why do you want to align to human? You can align to the dog genome, call DE genes or whatever you like, and then find the human orthologs of your dog genes using BioMart or a similar tool. In practical terms, I think this might be a better way to determine gene function.
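A small sketch of the ortholog lookup suggested above, assuming you have already exported a two-column dog-to-human ortholog table (for example from BioMart) and have a set of DE results keyed by dog gene ID. The function names, the gene IDs, and the log fold-change values are all hypothetical placeholders.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def load_orthologs(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group human orthologs by dog gene; one dog gene may have several."""
    table: Dict[str, List[str]] = defaultdict(list)
    for dog_gene, human_gene in pairs:
        # Rows with an empty human column mean "no ortholog found"; skip them.
        if human_gene and human_gene not in table[dog_gene]:
            table[dog_gene].append(human_gene)
    return dict(table)


def annotate(
    de_genes: Dict[str, float],
    orthologs: Dict[str, List[str]],
) -> List[Tuple[str, float, List[str]]]:
    """Attach human orthologs (possibly none) to each DE dog gene, sorted by gene ID."""
    return [
        (gene, log_fold_change, orthologs.get(gene, []))
        for gene, log_fold_change in sorted(de_genes.items())
    ]


# Placeholder IDs and values, for illustration only.
pairs = [("dogA", "humA"), ("dogB", "humB1"), ("dogB", "humB2"), ("dogC", "")]
de_genes = {"dogB": 2.5, "dogC": -1.0, "dogA": 0.7}
annotated = annotate(de_genes, load_orthologs(pairs))
```

Keeping genes without an ortholog in the output (with an empty list) avoids silently dropping dog-specific hits from the DE results.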
