  • EGrassi
    Member
    • Oct 2010
    • 66

    Tophat + htseq_count

    Hello, I'm performing RNAseq analyses and I've stumbled upon some puzzling results.
    I aligned some data with tophat2 (default settings) and, since the results were disappointing (only about 5% of properly paired reads), I changed the -r and --mate-std-dev parameters and got to 60% (I know, still not very high). I then ran htseq_count on both sets of BAM alignments and, comparing the two results, I see no differences.
    Am I missing something? Does htseq_count use the information about properly paired reads or not? Based on these results I am inclined to say no; I will check the code...
    Last edited by EGrassi; 01-14-2013, 05:20 AM.
  • EGrassi
    Member
    • Oct 2010
    • 66

    #2
    A quick check of the htseq_count code tells me that it never uses the read's "mate_aligned" attribute and simply considers all of the paired reads. Am I the only one who finds this behaviour strange? I don't see a check anywhere on whether the two reads fall at a sensible distance from each other to be reliably considered in the counts.
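
A minimal sketch of the kind of check described above, written against raw SAM text rather than against the htseq_count code. It keeps a record only if the properly paired bit (0x2) is set in FLAG and the absolute template length (TLEN, 9th field) is within a limit. The names `is_plausible_pair` and `filter_sam` and the `max_tlen` default are hypothetical, chosen only for illustration. For spliced RNA-seq alignments TLEN spans introns, so any such threshold is a judgment call.

```python
PROPER_PAIR = 0x2  # SAM FLAG bit: each segment properly aligned according to the aligner


def is_plausible_pair(sam_line, max_tlen=500000):
    """Return True if the record is flagged properly paired and |TLEN| <= max_tlen.

    Hypothetical helper for illustration; max_tlen is an arbitrary assumption.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # 2nd SAM field: FLAG
    tlen = int(fields[8])  # 9th SAM field: TLEN (observed template length)
    return bool(flag & PROPER_PAIR) and abs(tlen) <= max_tlen


def filter_sam(lines, max_tlen=500000):
    """Yield header lines unchanged and only the records that pass the check."""
    for line in lines:
        if line.startswith("@") or is_plausible_pair(line, max_tlen):
            yield line
```

The filtered lines could then be written to a new SAM file and counted separately, to see whether the pairing information changes the counts at all.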


    • Simon Anders
      Senior Member
      • Feb 2010
      • 995

      #3
      The "mate_aligned" bit in the FLAG field indicates, in my reading of the SAM spec, that an alignment for the mate is given in the SAM file, not that this alignment is considered plausible. If TopHat really changes the mate_aligned field according to the distance, I'd consider this a very odd behaviour. In my opinion, it should set the alignment quality (5th field in the SAM file) to a low value to indicate that an alignment is reported but should not be trusted.

      htseq-count, by the way, filters by the alignment quality only if you use the -a option. I guess I should change this to be the default.
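
To make the distinction concrete, here is a small sketch that decodes the relevant FLAG bits as defined in the SAM specification and applies a mapping-quality cutoff on the 5th field. The function names, the reading of "mate_aligned" as "paired and mate not flagged unmapped", and the cutoff of 10 are assumptions for illustration; this is not htseq-count's own code.

```python
# FLAG bits from the SAM specification
PAIRED = 0x1         # template has multiple segments (read is paired)
PROPER_PAIR = 0x2    # each segment properly aligned according to the aligner
MATE_UNMAPPED = 0x8  # next segment in the template is unmapped


def describe_flag(flag):
    """Decode the pairing-related bits of a SAM FLAG value."""
    return {
        "paired": bool(flag & PAIRED),
        "proper_pair": bool(flag & PROPER_PAIR),
        # Assumption: "mate aligned" means paired and mate not flagged unmapped.
        "mate_aligned": bool(flag & PAIRED) and not (flag & MATE_UNMAPPED),
    }


def passes_mapq(sam_line, min_mapq=10):
    """Return True if MAPQ (5th SAM field) is at least min_mapq (arbitrary cutoff)."""
    mapq = int(sam_line.split("\t")[4])
    return mapq >= min_mapq
```

For example, `describe_flag(65)` reports a paired read whose mate is aligned but whose `proper_pair` bit is unset, which is the case under discussion: the mate's alignment is present in the file even though the aligner did not consider the pair proper.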


      • EGrassi
        Member
        • Oct 2010
        • 66

        #4
        Originally posted by Simon Anders
        The "mate_aligned" bit in the FLAG field indicates, in my reading of the SAM spec, that an alignment for the mate is given in the SAM file, not that this alignment is considered plausible. If TopHat really changes the mate_aligned field according to the distance, I'd consider this a very odd behaviour. In my opinion, it should set the alignment quality (5th field in the SAM file) to a low value to indicate that an alignment is reported but should not be trusted.
        Since the percentage of properly paired reads reported by samtools flagstat on accepted_hits changed when I set the tophat -r parameter, I believed that the reads reported as not properly paired were present in the SAM file but should not be considered as aligned in the analyses.

        (Filtering on quality only via an option is fine in my opinion, by the way.)
