  • hbt
    Member
    • Jan 2011
    • 20

    Bowtie and Tophat

    We are currently trying to analyse SOLiD RNA-seq data using tophat and/or bowtie.
    With default settings tophat is able to align about 35% of reads, whereas bowtie with default settings is able to align 53%!
    I'm assuming that tophat first runs bowtie and then tries to align reads that span exon splice junctions, so how is it aligning fewer reads than bowtie alone?
    With some parameter adjustments (i.e. the --best flag) we are able to get up to 70% of reads mapped using bowtie alone, but in doing this we are not able to map the splice junction reads, and the --best flag cannot be set for bowtie when running it through tophat.
    Does anyone know if it is possible to get tophat to output only the splice junction reads (i.e. everything bowtie would not have aligned) so they can be added to the bowtie output, or is there a fairly simple way of altering the bowtie parameters when running it within tophat?

    As you might have guessed, I am only just beginning my journey into the wonderful world of bioinformatics, so go easy on me!

    Any help or advice would be gratefully received!

    Huw
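
One way to approach the "splice junction reads only" part of the question is to filter TopHat's alignments after the run rather than change what TopHat writes. In the SAM format, a spliced alignment carries an `N` (skipped region) operation in its CIGAR string, so the records can be separated on that field. The sketch below assumes the alignments are available as SAM text lines; the function names `is_spliced` and `split_spliced` are illustrative and are not part of tophat or bowtie.

```python
import re

# An 'N' operation in a SAM CIGAR string marks a skipped reference region,
# which is how spliced alignments are encoded (e.g. "20M500N15M").
_SPLICE_OP = re.compile(r"\d+N")


def is_spliced(sam_line: str) -> bool:
    """Return True if the record's CIGAR (column 6) contains an N operation."""
    fields = sam_line.rstrip("\n").split("\t")
    if len(fields) < 11:  # not a complete SAM alignment record
        return False
    return bool(_SPLICE_OP.search(fields[5]))


def split_spliced(sam_lines):
    """Split SAM records into (spliced, unspliced) lists, skipping '@' header lines."""
    spliced, unspliced = [], []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        (spliced if is_spliced(line) else unspliced).append(line)
    return spliced, unspliced


if __name__ == "__main__":
    # Hypothetical records, made up purely for illustration.
    seq, qual = "A" * 35, "I" * 35
    demo = [
        "@HD\tVN:1.0\tSO:coordinate",
        f"read1\t0\tchr1\t100\t255\t35M\t*\t0\t0\t{seq}\t{qual}",
        f"read2\t0\tchr1\t200\t255\t20M500N15M\t*\t0\t0\t{seq}\t{qual}",
    ]
    spliced, unspliced = split_spliced(demo)
    print(len(spliced), "spliced;", len(unspliced), "unspliced")
```

Whether the spliced records can simply be appended to a separate bowtie run depends on the two runs using the same reference and reporting settings, which this sketch does not check.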
  • hlwright
    Member
    • Feb 2011
    • 30

    #2
    We also can't understand why tophat maps fewer reads than bowtie, despite using bowtie for the initial mapping.


    • adumitri
      Member
      • Jan 2010
      • 27

      #3
      Hi,

      Hopefully, this is the right place to post my question:

      I ran TopHat v1.2.0 on single-end reads (40 nt) generated with an Illumina GA IIx. These were the options used:

      tophat --max-multihits 2 \
      --segment-mismatches 2 \
      --library-type fr-unstranded \
      -p 4 \
      -o sample_tophat_out \
      hg19 sample.fastq

      After getting the results, I tried to collect some statistics about the runs. Interestingly, in the log folder generated for each sample, there are two files that contain summary statistics such as:

      ==> fileq5okX8.log <==
      # reads processed: 28183908
      # reads with at least one reported alignment: 1081639 (3.84%)
      # reads that failed to align: 26764052 (94.96%)
      # reads with alignments suppressed due to -m: 338217 (1.20%)
      Reported 1219857 alignments to 1 output stream(s)

      ==> fileueW7Yw.log <==
      # reads processed: 28183908
      # reads with at least one reported alignment: 20657663 (73.30%)
      # reads that failed to align: 3984728 (14.14%)
      # reads with alignments suppressed due to -m: 3541517 (12.57%)
      Reported 23408344 alignments to 1 output stream(s)

      There is no documentation on TopHat's website on what these two files actually represent. As you can see, the summaries look pretty different - I am not sure why there are two files in the first place, nor do I understand why there is such a large difference between the % of aligned reads.

      There are also two additional files that seem to contain relevant data: reports.log and prep_reads.log. Does anyone know what the results in all these files mean?

      Thank you so much!
      Alexandra
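
The two summaries quoted above can at least be compared programmatically. The following is a minimal sketch that parses a summary in the format shown in the post and checks that the three read categories add up to the number of reads processed. It does not explain what each file represents; the function names `parse_summary` and `percent` are illustrative, not part of TopHat.

```python
import re

# Line prefixes as they appear in the log excerpts quoted in the post.
_PATTERNS = {
    "processed": r"# reads processed: (\d+)",
    "aligned": r"# reads with at least one reported alignment: (\d+)",
    "failed": r"# reads that failed to align: (\d+)",
    "suppressed": r"# reads with alignments suppressed due to -m: (\d+)",
}


def parse_summary(text: str) -> dict:
    """Extract the four read counts from one summary; missing lines count as 0."""
    counts = {}
    for key, pattern in _PATTERNS.items():
        match = re.search(pattern, text)
        counts[key] = int(match.group(1)) if match else 0
    return counts


def percent(part: int, total: int) -> float:
    """Percentage of total, rounded to two decimals (0.0 if total is 0)."""
    return round(100.0 * part / total, 2) if total else 0.0


if __name__ == "__main__":
    # First summary from the post, copied verbatim.
    log_text = """\
# reads processed: 28183908
# reads with at least one reported alignment: 1081639 (3.84%)
# reads that failed to align: 26764052 (94.96%)
# reads with alignments suppressed due to -m: 338217 (1.20%)
"""
    c = parse_summary(log_text)
    accounted = c["aligned"] + c["failed"] + c["suppressed"]
    print("categories sum to reads processed:", accounted == c["processed"])
    print("aligned %:", percent(c["aligned"], c["processed"]))
```

In both quoted summaries the three categories sum exactly to the 28183908 reads processed, so each file appears to account for the full read set independently.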


      • ttnguyen
        Member
        • Mar 2010
        • 41

        #4
        It looks like the files in the log folder are not very informative. It is quite easy to collect mapping statistics by looking at 'accepted_hits.bam' instead.
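
As a rough sketch of what collecting statistics from 'accepted_hits.bam' could look like: the script below counts distinct mapped read names from SAM text, for example piped in from `samtools view accepted_hits.bam`. It assumes single-end reads (as in post #3), so that each read name corresponds to one read, and it treats a read with exactly one record as uniquely mapped. The function name `mapping_stats` is illustrative.

```python
import sys


def mapping_stats(sam_lines):
    """Count alignments, distinct mapped reads, and reads with exactly one alignment.

    Assumes single-end data: every record sharing a read name is treated as
    another reported hit for the same read.
    """
    hits_per_read = {}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        if int(fields[1]) & 0x4:
            continue  # FLAG bit 0x4 means the read is unmapped
        name = fields[0]
        hits_per_read[name] = hits_per_read.get(name, 0) + 1
    return {
        "alignments": sum(hits_per_read.values()),
        "mapped_reads": len(hits_per_read),
        "uniquely_mapped_reads": sum(1 for n in hits_per_read.values() if n == 1),
    }


if __name__ == "__main__":
    # Example: samtools view accepted_hits.bam | python this_script.py
    for key, value in mapping_stats(sys.stdin).items():
        print(f"{key}\t{value}")
```

Dividing `mapped_reads` by the total number of input reads (which has to come from the FASTQ file, since it is not stored in the BAM) gives a percentage comparable to the ones bowtie reports.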


        • sterding
          Member
          • Sep 2010
          • 36

          #5
          This post helps a lot with the question here:
