
  • Questions on my bowtie2 alignment results

    Hi everyone,
    I am trying to identify the genes that are up-regulated or down-regulated by a compound. Here are the alignment results for one control sample, run with different parameters.

    $~/my_rnaseq_dat$ ~/bin/bowtie2 -p 8 --sensitive -x Amhg45 -U ~/sequencing_data2013/CK_2.fastq -S CK_2.sam
    56874766 reads; of these:
    56874766 (100.00%) were unpaired; of these:
    9752857 (17.15%) aligned 0 times
    43225883 (76.00%) aligned exactly 1 time
    3896026 (6.85%) aligned >1 times
    82.85% overall alignment rate

    $~/my_rnaseq_dat$ ~/bin/bowtie2 -p 8 --very-sensitive -x Amhg45 -U ~/sequencing_data2013/CK_2.fastq -S ~/sequencing_data2013/CK_2.sam
    56874766 reads; of these:
    56874766 (100.00%) were unpaired; of these:
    9695758 (17.05%) aligned 0 times
    43182043 (75.92%) aligned exactly 1 time
    3996965 (7.03%) aligned >1 times
    82.95% overall alignment rate
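As a quick sanity check on the two summaries above, the overall alignment rate is the share of reads that aligned exactly once plus the share that aligned more than once. A minimal sketch, using only the counts printed by the two runs (the function name is illustrative, not part of bowtie2):

```python
def overall_alignment_rate(total, aligned_once, aligned_multi):
    """Percentage of reads with at least one reported alignment."""
    return 100.0 * (aligned_once + aligned_multi) / total


# Counts copied from the two bowtie2 summaries above.
sensitive = overall_alignment_rate(56874766, 43225883, 3896026)
very_sensitive = overall_alignment_rate(56874766, 43182043, 3996965)

print(f"--sensitive:      {sensitive:.2f}%")
print(f"--very-sensitive: {very_sensitive:.2f}%")
```

The two presets differ by about 0.1 percentage points on this sample, so the choice of preset changes very little here.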

    Is an alignment rate of 83% for a sample enough for my purpose?

    Although I changed the parameters, I could not increase the percentage of reads aligned exactly 1 time or decrease the percentage of reads aligned >1 times.

    What does "reads (7%) aligned >1 times" mean, and can I use those reads in the downstream analysis? If they cannot be used, how can I remove them from the SAM file?

    Thanks a lot!

    Richard

  • #2
    83% is an OK alignment rate (I assume that there's not much splicing in your organism, otherwise it'd be lower). You can probably increase that percentage by adapter/quality trimming your reads (see Trimmomatic, cutadapt, or Trim Galore). The reads that aligned more than once are often called "multimappers". For these, it's ambiguous where in the genome/transcriptome they originate, since they align equally well to multiple places. With increased read length or paired-end reads, this number would likely decrease, but probably not to 0 (there are a LOT of repetitive elements out there).


    • #3
      Hi dpryan,
      Thanks for your answer.
      Can I use the reads aligned >1 time for the following analysis?
      Last edited by wmseq; 10-30-2013, 09:26 AM.


      • #4
        It depends on how you do it. For the normal count-based methods (DESeq/edgeR/etc.), no, you can't use those. Having said that, you have pretty good depth on that sample, so I wouldn't stress about it.


        • #5
          dpryan,
          Thanks a lot!!
          Yes, I will use edgeR and DESeq to identify the up-regulated or down-regulated genes. Therefore, I have to remove the reads aligned >1 times, right?

          Do I need to remove them right after alignment, or when I run edgeR/DESeq?


          • #6
            If you use htseq-count, just have it count only reads with sufficiently high MAPQ scores (a threshold of 10 or 20 keeps only reliable alignments). Multimappers in bowtie2 have MAPQs of 0 or 1.
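To make the MAPQ threshold concrete, here is a minimal sketch of the same filter applied to SAM text directly. It relies only on the SAM layout (header lines start with `@`, and MAPQ is the fifth tab-separated column). The function name and the example records are hypothetical, made up for illustration and not taken from CK_2.sam. In practice htseq-count's `-a` option or `samtools view -q` does this for you.

```python
def filter_sam_by_mapq(lines, min_mapq=10):
    """Yield header lines plus alignments whose MAPQ (SAM column 5) is >= min_mapq."""
    for line in lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if int(fields[4]) >= min_mapq:
            yield line


# Hypothetical records: one confident alignment, one multimapper-like
# alignment with MAPQ 1, and one unaligned read (flag 4, MAPQ 0).
example = [
    "@HD\tVN:1.0\tSO:unsorted",
    "read1\t0\tchr1\t100\t42\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "read2\t0\tchr1\t200\t1\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
]

kept = list(filter_sam_by_mapq(example, min_mapq=10))
for record in kept:
    print(record.split("\t")[0])
```

With a threshold of 10, only the header and `read1` survive. The low-MAPQ alignment and the unaligned read are both dropped.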


            • #7
              You can filter out the bowtie reads that aligned >1 times by using 'samtools view -q 5', which will remove reads with a mapping quality (MAPQ, a Phred-scaled score) of less than 5. A read that maps equally well to two separate positions has a maximum match likelihood of 50% (0.5), which corresponds to a Phred score of just over 3.
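The "just over 3" figure comes from the standard Phred scale, Q = -10 * log10(p), where p is the probability that the alignment is wrong. A small sketch of that arithmetic follows (the function names are illustrative). It shows the generic Phred conversion only, and does not say how any particular aligner assigns MAPQ.

```python
import math


def phred(p_error):
    """Phred-scaled score for an error probability p_error (0 < p_error <= 1)."""
    return -10.0 * math.log10(p_error)


def error_probability(q):
    """Inverse conversion: error probability implied by a Phred-scaled score q."""
    return 10.0 ** (-q / 10.0)


# Two equally good positions: at best a 50% chance the reported one is wrong.
print(f"Q for p = 0.5: {phred(0.5):.2f}")
# Error probability implied by the -q 5 cutoff on the Phred scale.
print(f"p for Q = 5:   {error_probability(5):.3f}")
```

On this scale, a cutoff of 5 corresponds to an error probability of roughly 0.32.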


              • #8
                @gringer: FYI for bowtie2, MAPQ != -10*log10(probability alignment is incorrect). The situation you describe producing a MAPQ of 3 will actually produce one of 0 or 1 (the latter if both alignments have the maximum possible alignment score given the read length).
