  • Cutadapt Poor Mapping

    Hello.

    I am still researching this issue, but essentially I am mapping Cutadapt-trimmed RNA-seq reads. When I trim them according to the manual and align them as single-end reads with the library type set to unstranded, I get a mapping rate of 91%.

    However, if I align them with the library type set to first strand or second strand using tophat2, the mapping rate drops to 4%.

    What is the cause of this? I do have my QC reports.

  • #2
    This sounds like it could be an issue related to concordant reads.

    Have you tried mapping the reads using bowtie2 (which might give you more verbose statistics)? What is the fragment size of the reads (and did you specify that in the tophat2 command)? Did you prime tophat with a transcriptome GTF file?


    • #3
      Length Sizes/ Frag Size

      Hello. thank you for the response.

      I am not sure how to use the bowtie2 mapper, so I will have to look into that.

      As for your questions:

      1) I am not sure how to find the fragment size. Below is the PDF of the sequence length distribution from FastQC; the red line is the median sequence length. There is not a single length; I have a distribution of lengths. As for the TH2 parameter, I set the mate inner distance to 100. Which parameter should I use to account for the fragment size?

      2) Yes, I did include the GTF transcriptome. I am almost certain that was not the issue, because the single-end reads had fantastic results.

      Thank you again.

      Could you give some insight into how concordant reads work? It seems as though the lengths are not matching up, or the distance between the two reads is out of bounds, and thus the matching fails.


      • #4
        I am not sure how to find the fragment size. Below is the PDF of the sequence length distribution from FastQC; the red line is the median sequence length. There is not a single length; I have a distribution of lengths.
        Fragment size (the length of template from which the reads are sequenced) doesn't necessarily relate to the read length (what you are describing). You need to ask for that information from the person who did the sequencing. With 150bp paired-end, I would expect a fragment size of about 400bp, so mate inner distance of 100bp (i.e. what you specified -- that's the correct parameter), but you do need to find that out.

        Could you give some insight into how concordant reads work? It seems as though the lengths are not matching up, or the distance between the two reads is out of bounds, and thus the matching fails.
        The bowtie2 manual discusses this:

        A pair that aligns with the expected relative mate orientation and with the expected range of distances between mates is said to align "concordantly". If both mates have unique alignments, but the alignments do not match paired-end expectations (i.e. the mates aren't in the expected relative orientation, or aren't within the expected distance range, or both), the pair is said to align "discordantly". Discordant alignments may be of particular interest, for instance, when seeking structural variants.
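To make the manual's definition concrete, here is a minimal sketch of the concordant/discordant test. It is not bowtie2's actual implementation: the `Mate` class and `classify_pair` function are hypothetical names, the coordinates are made up, and the sketch assumes forward-reverse mate orientation with a fragment-length window of 0 to 500 bp (bowtie2's documented defaults for `--fr`, `-I` and `-X`). It also ignores the requirement that both mates align uniquely.

```python
from dataclasses import dataclass


@dataclass
class Mate:
    start: int      # leftmost reference position of the alignment (0-based)
    end: int        # rightmost reference position, exclusive
    forward: bool   # True if the mate aligned to the forward strand


def classify_pair(m1: Mate, m2: Mate, min_frag: int = 0, max_frag: int = 500) -> str:
    """Simplified concordance test for forward-reverse paired-end reads."""
    # Mates on the same strand are never in the expected orientation.
    if m1.forward == m2.forward:
        return "discordant"
    fwd, rev = (m1, m2) if m1.forward else (m2, m1)
    # The forward mate is expected to start upstream of the reverse mate.
    if fwd.start > rev.start:
        return "discordant"
    # Fragment length spans from the start of the forward mate
    # to the end of the reverse mate, so it includes both reads.
    fragment = rev.end - fwd.start
    if min_frag <= fragment <= max_frag:
        return "concordant"
    return "discordant"


print(classify_pair(Mate(100, 250, True), Mate(350, 500, False)))   # 400 bp fragment
print(classify_pair(Mate(100, 250, True), Mate(900, 1050, False)))  # 950 bp fragment
```

The point of the sketch is that a pair can fail the test purely because the allowed fragment window does not match the library, even when each mate aligns well on its own.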



        • #5
          So I contacted the lab that conducted the sequencing and was told that the insert size is 170bp. Should I change the TH2 alignment parameters?
          Thanks again.



          • #6
            Is that total length 470bp (in which case inner distance should be 170), or total length 170bp (inner distance -130)? Either way, it would be a good idea to set the tophat value to what it actually is.
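The arithmetic behind these numbers can be sketched as follows, assuming 150 bp reads at both ends as in the earlier reply. The function name `mate_inner_distance` is only illustrative; the relationship it encodes is inner distance = total fragment length minus the two read lengths.

```python
def mate_inner_distance(fragment_length: int, read_length: int) -> int:
    """Inner distance between mates: fragment length minus both reads.

    A negative value means the two mates overlap each other.
    """
    return fragment_length - 2 * read_length


# 150 bp paired-end reads, as assumed earlier in this thread
for fragment in (400, 470, 170):
    print(fragment, mate_inner_distance(fragment, 150))
```

Note that after adapter trimming the reads no longer all have the same length, so a single read length is itself an approximation.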



            • #7
              I just ran the bowtie aligner

              31584036 reads; of these:
              31584036 (100.00%) were paired; of these:
              6417877 (20.32%) aligned concordantly 0 times
              12924721 (40.92%) aligned concordantly exactly 1 time
              12241438 (38.76%) aligned concordantly >1 times
              ----
              6417877 pairs aligned concordantly 0 times; of these:
              583467 (9.09%) aligned discordantly 1 time
              ----
              5834410 pairs aligned 0 times concordantly or discordantly; of these:
              11668820 mates make up the pairs; of these:
              9583027 (82.13%) aligned 0 times
              1226455 (10.51%) aligned exactly 1 time
              859338 (7.36%) aligned >1 times
              84.83% overall alignment rate


              This is strange because TH2 gives the alignment rate as 4% for both library types. Very odd. I am not sure how to interpret this.
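For reading the summary above, it may help to see how the overall rate follows from the individual counts. This is a sketch of the arithmetic, not bowtie2 code, and the function name is hypothetical: each concordantly or discordantly aligned pair contributes two aligned mates, and the mates from the remaining pairs are counted individually. With the counts posted above it gives the same 84.83%.

```python
def overall_alignment_rate(total_pairs: int,
                           concordant_once: int,
                           concordant_multi: int,
                           discordant_once: int,
                           mates_once: int,
                           mates_multi: int) -> float:
    """Percentage of individual mates that aligned in any way."""
    aligned_pairs = concordant_once + concordant_multi + discordant_once
    aligned_mates = 2 * aligned_pairs + mates_once + mates_multi
    return 100.0 * aligned_mates / (2 * total_pairs)


# Counts copied from the bowtie2 summary in this post
rate = overall_alignment_rate(
    total_pairs=31584036,
    concordant_once=12924721,
    concordant_multi=12241438,
    discordant_once=583467,
    mates_once=1226455,
    mates_multi=859338,
)
print(f"{rate:.2f}% overall alignment rate")
```

So roughly 80% of pairs aligned concordantly under bowtie2's default fragment window, which is a very different picture from the 4% reported by TH2.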



              • #8
                Calculating the Mate Inner Distance

                The lab responded and said the average insert size is 170 bp. However, how do I find the total length or inner distance?

                Thank you in advance.



                • #9
                  However, how do I find the total length or inner distance?
                  Statistics relating to fragment length are confusing, and each program seems to choose a different statistic for its testing. "Average insert size" could be the same thing as "inner distance" as defined by tophat, or it could be the total fragment size (including sequences at both ends), or it could be the distance from the start of one end to the start of the other end. I've already given answers for the first two cases (which are most likely).

                  I just ran the bowtie aligner... This is strange because TH2 gives the alignment rate as 4% for both library types.
                  Okay, good. Now you need to tweak the fragment length using bowtie2 parameters -I and -X to match what tophat2 was doing. That depends on the mean fragment length when the sequencing was carried out, which is difficult to determine unless you know precisely what the lab's "average insert size" statistic refers to.
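One way to try this is sketched below. It converts an inner distance into a bowtie2 fragment-length window, since `-I` and `-X` are minimum and maximum fragment lengths that include both reads. The helper name `bowtie2_insert_args`, the standard deviation of 20 bp, and the window of three standard deviations either side of the mean are all assumptions for illustration, not values taken from the lab or from tophat2's internals.

```python
def bowtie2_insert_args(read_length: int,
                        inner_distance: int,
                        std_dev: int = 20,   # assumed spread of fragment lengths
                        n_sd: int = 3) -> list[str]:
    """Build bowtie2 -I/-X arguments from an inner distance (hypothetical helper)."""
    # bowtie2 measures the whole fragment, so add both reads back on.
    mean_fragment = inner_distance + 2 * read_length
    min_fragment = max(0, mean_fragment - n_sd * std_dev)
    max_fragment = mean_fragment + n_sd * std_dev
    return ["-I", str(min_fragment), "-X", str(max_fragment)]


# The two interpretations of "average insert size is 170 bp" with 150 bp reads
print(bowtie2_insert_args(150, 170))    # 170 bp is the inner distance
print(bowtie2_insert_args(150, -130))   # 170 bp is the total fragment length
```

Running bowtie2 once with each window and comparing the concordant alignment rates should show which interpretation fits the data better.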
