  • TopHat+BWT = reads that failed to align ~ 99%

    I ran TopHat and Cufflinks successfully!
    But I think some of the output looks wrong (the "reads that failed to align" lines in the logs below), though I'm not sure.

    I ran 17246957 reads and my TopHat outputs are:
    accepted_hits.sam = 152434
    junctions.bed = 12672

    logs
    $ cat file2lqYQ1.log
    Code:
    # reads processed: 17184877
    # reads with at least one reported alignment: 17194 (0.10%)
    # reads that failed to align: 17156835 (99.84%)
    # reads with alignments suppressed due to -m: 10848 (0.06%)
    Reported 127509 alignments to 1 output stream(s)
    $ cat fileIw0Tgh.log

    Code:
    # reads processed: 17184877
    # reads with at least one reported alignment: 69661 (0.41%)
    # reads that failed to align: 17098690 (99.50%)
    # reads with alignments suppressed due to -m: 16526 (0.10%)
    Reported 487470 alignments to 1 output stream(s)
    $ cat prep_reads.log
    Code:
    prep_reads v1.0.13
    ---------------------------
    62080 out of 17246957 reads have been filtered out
    Code:
    segment_juncs v1.0.13
    ---------------------------
    Loading reference sequences...
            Loading chr1...done
            Loading chr2...done
            Loading chr3...done
            Loading chr4...done
            Loading chr5...done
            Loading chr6...done
            Loading chr7...done
            Loading chr8...done
            Loading chr9...done
            Loading chr10...done
            Loading chr11...done
            Loading chr12...done
            Loading chr13...done
            Loading chr14...done
            Loading chr15...done
            Loading chr16...done
            Loading chr17...done
            Loading chr18...done
            Loading chr19...done
            Loading chr20...done
            Loading chr21...done
            Loading chr22...done
            Loading chrX...done
            Loading chrY...done
            Loading chrM...done
    Found 0 potential split-segment junctions
    Indexing extensions in Tophat_Brain/tmp//left_kept_reads_missing.fq
    Total extensions: 394607205
    Looking for junctions by island end pairings
    Adding hits from segment file 0 to coverage map
    Map covers 382631 bases
    Map covers 374273 bases in sufficiently long segments
    Map contains 8351 good islands
    417440 are left looking bases
    417332 are right looking bases
    Collecting potential splice sites in islands
    reporting synthetic splice junctions...
    Examining donor-acceptor pairings in chr20
    Examining donor-acceptor pairings in chr21
    Examining donor-acceptor pairings in chr22
    Examining donor-acceptor pairings in chr19
    Examining donor-acceptor pairings in chr18
    Examining donor-acceptor pairings in chr11
    Examining donor-acceptor pairings in chr10
    Examining donor-acceptor pairings in chr13
    Examining donor-acceptor pairings in chr12
    Examining donor-acceptor pairings in chr15
    Examining donor-acceptor pairings in chr14
    Examining donor-acceptor pairings in chr17
    Examining donor-acceptor pairings in chr16
    Examining donor-acceptor pairings in chrX
    Examining donor-acceptor pairings in chrY
    Examining donor-acceptor pairings in chr2
    Examining donor-acceptor pairings in chr3
    Examining donor-acceptor pairings in chr1
    Examining donor-acceptor pairings in chr6
    Examining donor-acceptor pairings in chr7
    Examining donor-acceptor pairings in chr4
    Examining donor-acceptor pairings in chr5
    Examining donor-acceptor pairings in chr8
    Examining donor-acceptor pairings in chr9
    Found 865 potential island-end pairing junctions
    done
    Looking for junctions between and within islands
    Adding hits from segment file 0 to coverage map
    Recording coverage islands
    Found 62807 islands covering 2120609 bases
    Collecting potential splice sites in islands
    reporting synthetic splice junctions...
    Examining donor-acceptor pairings in chr20
    Examining donor-acceptor pairings in chr21
    Examining donor-acceptor pairings in chr22
    Examining donor-acceptor pairings in chr19
    Examining donor-acceptor pairings in chr18
    Examining donor-acceptor pairings in chr11
    Examining donor-acceptor pairings in chr10
    Examining donor-acceptor pairings in chr13
    Examining donor-acceptor pairings in chr12
    Examining donor-acceptor pairings in chr15
    Examining donor-acceptor pairings in chr14
    Examining donor-acceptor pairings in chr17
    Examining donor-acceptor pairings in chr16
    Examining donor-acceptor pairings in chrX
    Examining donor-acceptor pairings in chrY
    Examining donor-acceptor pairings in chr2
    Examining donor-acceptor pairings in chr3
    Examining donor-acceptor pairings in chr1
    Examining donor-acceptor pairings in chr6
    Examining donor-acceptor pairings in chr7
    Examining donor-acceptor pairings in chr4
    Examining donor-acceptor pairings in chr5
    Examining donor-acceptor pairings in chr8
    Examining donor-acceptor pairings in chr9
    Found 173876 potential intra-island junctions
    done
    Reporting potential splice junctions...done
    Reported 174467 total possible splices
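
The counts in these logs can be cross-checked before looking for a cause. The following is a minimal sketch, assuming the summary lines keep the `# label: count (pct%)` layout shown in this post; the function names `parse_bowtie_summary` and `summarize` are made up for illustration and are not part of TopHat or Bowtie.

```python
import re

# Summary lines copied from the first log in this post (forum markup removed).
LOG_TEXT = """\
# reads processed: 17184877
# reads with at least one reported alignment: 17194 (0.10%)
# reads that failed to align: 17156835 (99.84%)
# reads with alignments suppressed due to -m: 10848 (0.06%)
Reported 127509 alignments to 1 output stream(s)
"""

# Figures from prep_reads.log in the same post.
READS_IN = 17246957
READS_FILTERED = 62080


def parse_bowtie_summary(text):
    """Return {label: count} for every '# label: count' line."""
    counts = {}
    for line in text.splitlines():
        match = re.match(r"#\s*(.+?):\s*(\d+)", line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def summarize(counts):
    """Check that the three categories add up and recompute the percentages."""
    total = counts["reads processed"]
    aligned = counts["reads with at least one reported alignment"]
    failed = counts["reads that failed to align"]
    suppressed = counts.get("reads with alignments suppressed due to -m", 0)
    return {
        "categories_sum_to_total": aligned + failed + suppressed == total,
        "matches_prep_reads": READS_IN - READS_FILTERED == total,
        "aligned_pct": round(100.0 * aligned / total, 2),
        "failed_pct": round(100.0 * failed / total, 2),
    }


if __name__ == "__main__":
    print(summarize(parse_bowtie_summary(LOG_TEXT)))
```

For the numbers in this post both consistency checks pass, so the logs are internally consistent: the reads surviving prep_reads are the same reads that were aligned, and nearly all of them failed to align. That suggests a problem with the reads or the index rather than with bookkeeping.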

  • #2
    What's your mapping rate if you just run them through bowtie instead of tophat?



    • #3
      It is the same with Bowtie alone.

      Why do ~99% of the reads fail to align?
      I used Eric Wang's public data, so the data itself should be good.



      • #4
        Did you remember to clip the adapters?

        I once aligned a FASTQ file that still contained adapters and got a similarly low alignment percentage.

        That's my guess.
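
A quick way to test this guess is to count how many reads contain the start of the adapter. This is only a sketch: it assumes plain 4-line FASTQ records and exact matching, and the adapter sequence and reads below are hypothetical placeholders, not the ones used for this dataset.

```python
import io


def iter_fastq_sequences(handle):
    """Yield the sequence line of each 4-line FASTQ record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()


def adapter_fraction(handle, adapter, min_match=8):
    """Fraction of reads containing the first `min_match` bases of `adapter`."""
    seed = adapter[:min_match]
    total = hits = 0
    for seq in iter_fastq_sequences(handle):
        total += 1
        if seed in seq:
            hits += 1
    return hits / total if total else 0.0


# Hypothetical adapter and toy reads, for illustration only.
ADAPTER = "ACGTACGTAAGG"
DEMO_FASTQ = (
    "@r1\nTTTTGGGGACGTACGTAAGG\n+\nIIIIIIIIIIIIIIIIIIII\n"
    "@r2\nCCCCAAAATTTTGGGGCCCC\n+\nIIIIIIIIIIIIIIIIIIII\n"
)

if __name__ == "__main__":
    print(adapter_fraction(io.StringIO(DEMO_FASTQ), ADAPTER))
```

If a large fraction of the real reads contain the adapter seed, untrimmed adapters are a plausible cause of the low alignment rate.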



        • #5
          Hey repinementer,

          Did you solve this problem by clipping adapters? And do you know what the different log files you posted actually mean? I get similar log files with different suffixes (fileXYZ.log) that contain different entries, and I cannot make much sense of them.

          Also, does anyone know whether it is appropriate to divide the number of lines in TopHat's accepted_hits.sam file by the total number of reads to get the overall ratio of aligned reads?

          Best, Moritz
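
On the line-count question: a SAM file has header lines starting with `@`, and a read that maps to several places gets one line per alignment (the first log in this thread reports 127509 alignments for 17194 reads). Dividing raw line counts by the number of reads can therefore overstate the aligned fraction. A minimal sketch that counts distinct read names instead is shown below; it assumes single-end reads (paired-end mates share a name) and the function name `count_aligned_reads` is made up for illustration.

```python
import io


def count_aligned_reads(sam_handle):
    """Return (alignment_lines, distinct_aligned_read_names) for SAM text."""
    lines = 0
    names = set()
    for line in sam_handle:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x4:
            continue  # FLAG bit 0x4 marks an unmapped read
        lines += 1
        names.add(fields[0])  # QNAME is the first column
    return lines, len(names)


# Toy SAM text: read r1 aligns twice, r2 once, r3 is unmapped.
DEMO_SAM = (
    "@HD\tVN:1.0\tSO:unsorted\n"
    "r1\t0\tchr1\t100\t255\t20M\t*\t0\t0\tACGTACGTACGTACGTACGT\tIIIIIIIIIIIIIIIIIIII\n"
    "r1\t16\tchr2\t500\t255\t20M\t*\t0\t0\tACGTACGTACGTACGTACGT\tIIIIIIIIIIIIIIIIIIII\n"
    "r2\t0\tchr1\t900\t255\t20M\t*\t0\t0\tTTTTGGGGCCCCAAAATTTT\tIIIIIIIIIIIIIIIIIIII\n"
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tGGGGGGGGGGGGGGGGGGGG\tIIIIIIIIIIIIIIIIIIII\n"
)

if __name__ == "__main__":
    n_lines, n_reads = count_aligned_reads(io.StringIO(DEMO_SAM))
    total_reads = 3  # number of input reads in this toy example
    print(n_lines, n_reads, n_reads / total_reads)
```

The aligned ratio is then the number of distinct names divided by the total number of input reads. For very large files the set of names can use a lot of memory, so this is a sketch rather than a tuned tool.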



          • #6
            Hi drd2009,
            could you tell me how to clip the adapters?



            • #7
              Originally posted by northbio
              Hi drd2009,
              could you tell me how to clip the adapters?
              I suggest looking for an existing pipeline for processing the sequences.
              It is not difficult to get "clean" sequences.
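
To make the idea of clipping concrete, here is a minimal sketch of 3' adapter removal with exact matching. It removes a full adapter found anywhere in the read, or a partial adapter (a prefix of at least `min_overlap` bases) hanging off the read's end. The adapter sequence is a hypothetical placeholder and `clip_adapter` is an illustrative name; dedicated trimmers such as cutadapt or the FASTX-Toolkit also tolerate mismatches, which this sketch does not.

```python
def clip_adapter(seq, qual, adapter, min_overlap=5):
    """Trim a 3' adapter from a read and its quality string (exact match only)."""
    idx = seq.find(adapter)
    if idx == -1:
        # No full adapter: look for the longest adapter prefix at the read's end.
        longest = min(len(adapter) - 1, len(seq))
        for k in range(longest, min_overlap - 1, -1):
            if seq.endswith(adapter[:k]):
                idx = len(seq) - k
                break
    if idx == -1:
        return seq, qual  # nothing to clip
    return seq[:idx], qual[:idx]


# Hypothetical adapter, for illustration only.
ADAPTER = "ACGTACGTAAGG"

if __name__ == "__main__":
    print(clip_adapter("TTTTGGGGACGTACGTAAGG", "I" * 20, ADAPTER))
    print(clip_adapter("TTTTGGGGCCACGTACG", "I" * 17, ADAPTER))
    print(clip_adapter("CCCCAAAATTTTGGGGCCCC", "I" * 20, ADAPTER))
```

After clipping, very short reads are usually dropped before alignment, since they tend to map ambiguously.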
