  • Unmapped ratio very high on mouse genome

    Hi,
    My problem concerns RNA-Seq data. I've downloaded public data (SAGE libraries with 6 different samples from mouse liver) to analyse using ArrayStudio. When I try to map them to the B38 Mus musculus genome, approximately 95% of the reads are unmapped on all the samples! Quality scores are fine (around 40) and the read length is correct (35 bp), but the base distribution QC is very heterogeneous and I don't understand why. This is the first time I've worked on mouse data. Has anybody had the same problem, or does anyone have an idea regarding the mapping and/or the base distribution?

    Thanks, LN
    Gene R' Us!

  • #2
    You may want to do a FastQC run on the data first to check on the quality. The data you downloaded may be raw and you may need to trim/clean the data before doing analysis/alignments.


    • #3
      They say there is a 16 bp adaptor on each read, but my reads are at the correct length 35 bp on the QC. Do I really need to trim them?
      Gene R' Us!


      • #4
        Quality score histograms look very good for each sample.
        Gene R' Us!


        • #5
          Originally posted by le.nono:
          They say there is a 16 bp adaptor on each read, but my reads are at the correct length 35 bp on the QC. Do I really need to trim them?
          Is this supposed to be an "inline" adapter that is part of the actual sequence? Are you able to tell by looking at the reads?


          • #6
            I don't think so; the reads are very short. What do you have in mind?
            Gene R' Us!


            • #7
              Can you post a FastQC (or whichever kind of QC you used) graph of the base distribution?

              I was thinking that one way you would get 95% of reads unmapped is if the barcodes/adapter were still present in the reads (inline). Do you know if they have already been removed?
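
If no QC graph is at hand, the per-position base distribution can be approximated directly from the read sequences. The following is only a minimal sketch: `base_composition` is a hypothetical helper, and the reads are made up for illustration, not taken from the dataset discussed in this thread. A position where a single base has a fraction near 1.0 across all reads would suggest a shared inline sequence such as a barcode or adapter.

```python
from collections import Counter

def base_composition(reads):
    """Return, for each read position, the fraction of each base seen there."""
    counts = []  # counts[i] is a Counter of bases observed at position i
    for seq in reads:
        for i, base in enumerate(seq.upper()):
            if i == len(counts):
                counts.append(Counter())
            counts[i][base] += 1
    return [
        {base: n / sum(c.values()) for base, n in c.items()}
        for c in counts
    ]

# Made-up reads for illustration only (assumption, not real data).
example_reads = ["GCCATTGA", "GCCAAGCT", "GCCACCGT", "GCCAGTAA"]
composition = base_composition(example_reads)

for position, fractions in enumerate(composition, start=1):
    print(position, fractions)
```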


              • #8
                No, I don't have this information.

                Gene R' Us!


                • #9
                  Maybe this one is better in quality and size.

                  Last edited by le.nono; 06-17-2013, 08:25 AM.
                  Gene R' Us!


                  • #10
                    All the sequences appear to be starting with exactly the same 4 nucleotides (GCCA). Is that a barcode?
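
One way to confirm this observation outside of a QC plot is to tally the first 4 bases of every read. This is a rough sketch, assuming a plain 4-line-per-record FASTQ file with unwrapped sequence lines; `prefix_counts` is a hypothetical helper and the records below are invented for illustration.

```python
from collections import Counter
import io

def prefix_counts(fastq_handle, k=4):
    """Count how often each k-base prefix starts a read in a 4-line FASTQ."""
    counts = Counter()
    for line_number, line in enumerate(fastq_handle):
        if line_number % 4 == 1:  # second line of each record is the sequence
            counts[line.strip()[:k].upper()] += 1
    return counts

# Invented records for illustration only (assumption, not real data).
example_fastq = io.StringIO(
    "@r1\nGCCATTGACC\n+\nIIIIIIIIII\n"
    "@r2\nGCCAAGCTTA\n+\nIIIIIIIIII\n"
    "@r3\nTTGACCGTAA\n+\nIIIIIIIIII\n"
)
counts = prefix_counts(example_fastq)

for prefix, n in counts.most_common():
    print(prefix, n)
```

With a real file, passing `open("reads.fastq")` (hypothetical file name) in place of the `io.StringIO` object should work the same way. If nearly all reads share one prefix, it is probably not genomic sequence from random positions.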


                    • #11
                      Are you able to map other SAGE data with your pipeline? Maybe it is not set up for such short tags.

                      The 4bp starting sequence is the cut site, right?

                      Also, are these ditags of 16 bp? Those would not map unless you split them first.
                      Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com


                      • #12
                        I'm going to try trimming those 4 bp first and then map the reads. I definitely need more information on the reads... I don't know much about that 16 bp adapter; it's just mentioned in the abstract that comes with the data. Are you saying that what is displayed on the base distribution histograms are ditags of 16 bp?
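
Trimming a fixed 4 bp prefix means removing the first 4 characters from both the sequence line and the quality line of each record, so the two stay the same length. A minimal sketch, assuming each record is held as a (header, sequence, separator, quality) tuple; `trim_prefix` is a hypothetical helper and the record is invented for illustration.

```python
def trim_prefix(record, n=4):
    """Drop the first n bases and their quality scores from one FASTQ record.

    record is a (header, sequence, separator, quality) tuple of strings.
    """
    header, seq, sep, qual = record
    return (header, seq[n:], sep, qual[n:])

# Invented record for illustration only (assumption, not real data).
example = ("@r1", "GCCATTGACC", "+", "IIIIHHHHGG")
trimmed = trim_prefix(example)
print("\n".join(trimmed))
```

Dedicated trimming tools do the same thing at scale; this only shows what the operation does to a single read.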
                        Last edited by le.nono; 06-17-2013, 10:44 AM.
                        Gene R' Us!


                        • #13
                          If it is SAGE data, then you should look here for an overview of the method:

                          (August 2004) With the advent of the human genome project, a vast amount of information about genes and gene structure is suddenly at our fingertips. But this information is limited. Every cell within an organism has the same genetic composition (with the exception of its gametes), and yet, obviously skin tissue is very different from


                          It is an older method meant to increase the sampling of transcripts with Sanger sequencing. There are some mouse mapping tools here:


                          But I suspect you'll want to find some newer RNA-Seq data that isn't SAGE based and you'll find it easier to go forward.
                          Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com


                          • #14
                            OK, it's becoming clearer now. I really need the data I'm using, so I'm going to stick with them even if it's harder. I'm going to look in the literature for RNA-Seq done with SAGE preps; I think it's been done before. What do you think?
                            Gene R' Us!
