Seqanswers Leaderboard Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • apredeus
    Senior Member
    • Jul 2012
    • 151

    what command line options do you mean? stranded=yes is default behaviour, is it not? I'm using "-t exon" options and that's about it, really. Default strandedness and default "overlap" mode all work for me...

    Comment

    • gringer
      David Eccles (gringer)
      • May 2011
      • 845

      The default options are fine for doing this.

      Comment

      • apredeus
        Senior Member
        • Jul 2012
        • 151

        great, thank you!

        Comment

        • dlepe
          Junior Member
          • Jul 2014
          • 8

          Any idea about post #161???

          Comment

          • apredeus
            Senior Member
            • Jul 2012
            • 151

            flags 0/16 should be enough to tell the strand.

            did you try to visualize your bam (or small portion of it) in IGV or something like that? Make sure you have a picture you would expect to see.

            Comment

            • kmcarr
              Senior Member
              • May 2008
              • 1181

              Originally posted by apredeus View Post
              what command line options do you mean? stranded=yes is default behaviour, is it not? I'm using "-t exon" options and that's about it, really. Default strandedness and default "overlap" mode all work for me...
              Originally posted by gringer View Post
              The default options are fine for doing this.
              Be aware that there are two possible --stranded options: "yes" and "reverse". (I guess there's really 3 if you also count "no".) Which one is correct depends on the method used to create the strand specific RNA-Seq library. "yes" tells htseq that the first read for paired end reads (or only read for single end reads) will match the sense strand of the mRNA. "reverse" means that read 1 will match the anti-sense strand. If your libraries were prepared using an Illumina TruSeq Stranded RNA library prep kit (uses the dUTP method) then you should use '--stranded=reverse' for htseq. For other library prep kits/methods carefully check the manufacturer's documentation to determine which orientation the library is.

              Comment

              • kmcarr
                Senior Member
                • May 2008
                • 1181

                Originally posted by dlepe View Post
                Since my protocol wasn't stranded I should be losing half the counts when --stranded=yes but as you can see this was not the case.. I tried the same for some Illumina data I have access to and got this, which I think its alright.
                Are you positive that the method/kit used to prepare the library is not strand-specific? The Life Tech SOLiD™ Total RNA-Seq Kit does generate strand-specific libraries. How was your library prepared?

                Comment

                • liaojinyue
                  Junior Member
                  • Oct 2012
                  • 8

                  Error: 'pair_alignments' needs a sequence of paired-end alignments

                  Hi Simon,

                  I'm new to HTSeq and encountered an error when analyzing our stranded paired-end RNA-seq data.

                  Let me describe the procedure of my analysis.
                  1. Use cutadapt to trim adapters and trim by quality
                  2. Use tophat2 to map to the mouse genome
                  3. Sort the accepted_hits.bam by name with samtools
                  4. Error reported when I was trying to count the read using htseq-count

                  'pair_alignments' needs a sequence of paired-end alignments.

                  I search the forum, this error is reported when combining paired-end and single end read accepted_hits.bam into one input file for htseq-count. I know htseq-count handles orphan reads from tophat output .bam file well. Is it possible cutadapt causes some problem in our case? Thanks.

                  Jason

                  Comment

                  • emolinari
                    Member
                    • May 2013
                    • 47

                    Hi everyone,

                    sorry for sneaking in this thread, but I guess you can definitively help and maybe it is also an easy one (- I am not a bioinformatician...yet I am trying to do my best!)


                    I am doing RNA-seq on human samples. I have run several samples on 2x75 Illumina HiSeq200, then processed raw reads with Tophat, run samtools to convert bam into sam files, sorted sam file and proceeded with Htseq. *flagstat results from samtools are all ok (excellent mapping and paired mapping rate).


                    Here my Htseq commands:

                    htseq-count -m HTSeq.scripts.count -m intersection-nonempty -s yes -i gene_id file.sam annotation.gtf > HTSeq_counts.txt 2> HTSeq_sterr.txt

                    Although HTSeq count provides the .txt file with all the ensemble ids and relative counts, the sterror file is a huge (3-4Gb) text file full of warnings:
                    Warning: Read HWI-ST1144:606:H9U7WADXX:1:2208:15447:12819 claims to have an aligned mate which could not be found. (Is the SAM file properly sorted?)

                    So I figured out that maybe I had to sort the sam file. I did it with this command, which I found in other threads:

                    sort -k 3,3 -k 4,4n hits.sam > hits.sam.sorted

                    Now I run HTSeq again (same commands, sorted sam file as input) and the sterror file comes out gigantic as before.

                    I really don't know how to proceed...
                    1) is this error file a relatively acceptable thing?
                    2) am I doing something terribly wrong which I am not aware of?

                    I appreciate any help...
                    thanks!!!
                    Manu

                    Comment

                    • dlepe
                      Junior Member
                      • Jul 2014
                      • 8

                      Originally posted by kmcarr View Post
                      Are you positive that the method/kit used to prepare the library is not strand-specific? The Life Tech SOLiD™ Total RNA-Seq Kit does generate strand-specific libraries. How was your library prepared?
                      I'm not sure how the library was prepared but I was told that it was not stranded and when I look at my mapping file I see about half of the reads aligned to each of the strands which is what you'd expect for a no stranded protocol..

                      Comment

                      • dlepe
                        Junior Member
                        • Jul 2014
                        • 8

                        Originally posted by apredeus View Post
                        flags 0/16 should be enough to tell the strand.

                        did you try to visualize your bam (or small portion of it) in IGV or something like that? Make sure you have a picture you would expect to see.
                        As I stated in the post previous to this one, it looks ok to me..

                        Comment

                        • fanli
                          Senior Member
                          • Jul 2014
                          • 197

                          Originally posted by emolinari View Post
                          sort -k 3,3 -k 4,4n hits.sam > hits.sam.sorted
                          This sorts your alignment by genomic position. You want to sort by read name:

                          sort -k 1,1 hits.sam > hits.sam.sorted

                          Comment

                          • emolinari
                            Member
                            • May 2013
                            • 47

                            Originally posted by fanli View Post
                            This sorts your alignment by genomic position. You want to sort by read name:
                            Hi fanli,

                            thanks for your reply! Meanwhile I had done samtools sort -n on my .bam file, samtools view -h to create a .sam file and run HTSeq again.
                            I guess it worked, as the sterror file created was way way smaller, and contains this:
                            100000 sam line pairs processed.
                            200000 sam line pairs processed.
                            300000 sam line pairs processed.
                            500000 sam line pairs processed.
                            600000 sam line pairs processed.
                            700000 sam line pairs processed.
                            800000 sam line pairs processed.
                            900000 sam line pairs processed.
                            1000000 sam line pairs processed.
                            1100000 sam line pairs processed.
                            1200000 sam line pairs processed.
                            1300000 sam line pairs processed.
                            1400000 sam line pairs processed.
                            1500000 sam line pairs processed.
                            1600000 sam line pairs processed.
                            1700000 sam line pairs processed.
                            1800000 sam line pairs processed.
                            1900000 sam line pairs processed.
                            2000000 sam line pairs processed.
                            2100000 sam line pairs processed.
                            2200000 sam line pairs processed.
                            2300000 sam line pairs processed.
                            2500000 sam line pairs processed.
                            [...]
                            31936941 sam line pairs processed.

                            So I guess this is correct, right?
                            So I guess my question is: does it make any difference practically if you sort the .bam and then make it .sam or if you sort a .sam file with the command you suggested? I am too ignorant to appreciate the difference!

                            Thanks again!!!

                            Comment

                            • fanli
                              Senior Member
                              • Jul 2014
                              • 197

                              Yep, that output looks good.

                              samtools sort -n gives the same result as Unix sort -k 1,1 on the SAM file.

                              Comment

                              • emolinari
                                Member
                                • May 2013
                                • 47

                                Originally posted by fanli View Post
                                Yep, that output looks good.

                                samtools sort -n gives the same result as Unix sort -k 1,1 on the SAM file.
                                That's reassuring!!!
                                thanks indeed

                                Manu

                                Comment

                                Latest Articles

                                Collapse

                                • seqadmin
                                  Pathogen Surveillance with Advanced Genomic Tools
                                  by seqadmin




                                  The COVID-19 pandemic highlighted the need for proactive pathogen surveillance systems. As ongoing threats like avian influenza and newly emerging infections continue to pose risks, researchers are working to improve how quickly and accurately pathogens can be identified and tracked. In a recent SEQanswers webinar, two experts discussed how next-generation sequencing (NGS) and machine learning are shaping efforts to monitor viral variation and trace the origins of infectious...
                                  03-24-2025, 11:48 AM
                                • seqadmin
                                  New Genomics Tools and Methods Shared at AGBT 2025
                                  by seqadmin


                                  This year’s Advances in Genome Biology and Technology (AGBT) General Meeting commemorated the 25th anniversary of the event at its original venue on Marco Island, Florida. While this year’s event didn’t include high-profile musical performances, the industry announcements and cutting-edge research still drew the attention of leading scientists.

                                  The Headliner
                                  The biggest announcement was Roche stepping back into the sequencing platform market. In the years since...
                                  03-03-2025, 01:39 PM
                                • seqadmin
                                  Investigating the Gut Microbiome Through Diet and Spatial Biology
                                  by seqadmin




                                  The human gut contains trillions of microorganisms that impact digestion, immune functions, and overall health1. Despite major breakthroughs, we’re only beginning to understand the full extent of the microbiome’s influence on health and disease. Advances in next-generation sequencing and spatial biology have opened new windows into this complex environment, yet many questions remain. This article highlights two recent studies exploring how diet influences microbial...
                                  02-24-2025, 06:31 AM

                                ad_right_rmr

                                Collapse

                                News

                                Collapse

                                Topics Statistics Last Post
                                Started by seqadmin, 03-20-2025, 05:03 AM
                                0 responses
                                41 views
                                0 reactions
                                Last Post seqadmin  
                                Started by seqadmin, 03-19-2025, 07:27 AM
                                0 responses
                                44 views
                                0 reactions
                                Last Post seqadmin  
                                Started by seqadmin, 03-18-2025, 12:50 PM
                                0 responses
                                35 views
                                0 reactions
                                Last Post seqadmin  
                                Started by seqadmin, 03-03-2025, 01:15 PM
                                0 responses
                                191 views
                                0 reactions
                                Last Post seqadmin  
                                Working...