  • TopHat 1.1 failing on colorspace SE reads

    I'm trying to analyze a single-end dataset from SRA with the brand new version of TopHat (1.1). TopHat crashes with the error message below. Judging by the traceback and the code, it appears that even with single-end reads it tries to run a validation check on the second set of reads.


    Code:
      File "/usr/local/bin/tophat", line 2093, in main
        params.read_params = check_reads(params.read_params, left_reads_list + "," + right_reads_list)
    TypeError: cannot concatenate 'str' and 'NoneType' objects

    If I replace line 2093 with

    Code:
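            # Only validate the reads when the check has not been disabled.
            if not params.skip_check_reads:
                # Single-end runs have no right-hand reads, so right_reads_list is None.
                if right_reads_list is not None:
                    params.read_params = check_reads(params.read_params, left_reads_list + "," + right_reads_list)
                else:
                    params.read_params = check_reads(params.read_params, left_reads_list)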
    Then I get farther but hit a new error:

    Code:
    Length mismatch between sequence and quality strings for SRR040290.1 VAB_ugc_85__100_137__138_121__123_bc_Frag50_solid0032_20090715_ugc_121__1231_49_36 length=50 (51 vs 51).
    I'm too worn out to puzzle out how to get past that one. My best guess is that it is related to the "extra" colorspace value that the bowtie option "--col-keepends" deals with.
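
The guard in the patch above can also be written so that a missing read list is simply skipped before joining. This is only a minimal sketch: `join_read_lists` is a hypothetical helper for illustration, not a function from the TopHat source, and the file names are made up.

```python
def join_read_lists(*read_lists):
    """Join comma-separated read-file lists, skipping any that are None or empty.

    Hypothetical helper for illustration; not part of the TopHat source.
    """
    return ",".join(r for r in read_lists if r)


# Single-end run: the right-hand list is None, so only the left list is used.
single = join_read_lists("reads_1.csfasta", None)

# Paired-end run: both lists are joined with a comma.
paired = join_read_lists("left.fq", "right.fq")
```

With a helper like this, the single-end case never reaches the `str + None` concatenation that raised the TypeError.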

  • #2
    Yep, I just got the same error message (the first one; I haven't tried to modify the code). I'm also using single-end colorspace reads (.csfasta + .qual).


    • #3
      Yes, just to confirm that you are not the only one: I'm getting this error too, but on standard single-end reads (not colorspace).


      • #4
        A Google search for the same error message led me here. Same problem with single-end colorspace reads.


        • #5
          Same here.


          • #6
            Hi guys,

            I'm Daehwan, who introduced this bug and has now fixed it. You can grab a fixed version at http://tophat.cbcb.umd.edu/index.html

            Thanks


            • #7
              Thanks for the rapid fix!!


              • #8
                Great, I will try it right now! Thanks for the great support!


                • #9
                  Great. Thanks for the speedy fix. It looks like it's working fine so far (fingers crossed).


                  • #10
                    Hello. I can now start a run successfully and get past the first error reported here. However, I still run into the second error mentioned above:

                    Code:
                    [Tue Oct  5 16:10:32 2010] Beginning TopHat run (v1.1.0)
                    -----------------------------------------------
                    [Tue Oct  5 16:10:32 2010] Preparing output location /home/schaefer/tophat/RBM20/Sample14//
                    [Tue Oct  5 16:10:32 2010] Checking for Bowtie index files
                    [Tue Oct  5 16:10:32 2010] Checking for reference FASTA file
                    [Tue Oct  5 16:10:32 2010] Checking for Bowtie
                    	Bowtie version:			 0.12.3.0
                    [Tue Oct  5 16:10:32 2010] Checking for Samtools
                    	Samtools version:		 0.1.8.0
                    [Tue Oct  5 16:10:39 2010] Checking reads
                    
                    Error encountered parsing file ...fastq:
                     Length mismatch between sequence and quality strings for 853_8_25/1 (49 vs 49).
                    When I check the fastq file, everything seems fine:

                    Code:
                    @853_8_25/1
                    GNNGTGNTNCANNCGTNNGAGNNCACNNACANCCGANNACGNAAAGNAN
                    +
                    *""%%%"%"%%""%%%""%)&""%%%""%%+"&'%&""'(%"'))'"&"
                    @853_8_35/1
                    CNNACGNANACNNACCNNCCGNNTAANNNNGNGAACNNCNANCNCNNTN
                    +
                    :""=54"@"=+""A98""745"";98""""2"=@>8""<"4"<"6"";"
                    @853_8_75/1
                    GNNACCNCNTCNNAACNNTACNNCGANNGTGNGGACNNGTCNCGAGNCN
                    +
                    ="";<7"0";4""=;:"">;=""94<"".,5".;26""9%)"(%(("("
                    @853_8_96/1
                    ...
                    Is anyone else still seeing this error?
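
One way to check the file independently of TopHat is to compare the sequence and quality lengths record by record. The sketch below is an illustration only: `find_length_mismatches` is a hypothetical helper, and it assumes plain four-line FASTQ records (no wrapped sequence or quality lines).

```python
import io


def find_length_mismatches(handle):
    """Yield (read_name, seq_len, qual_len) for four-line FASTQ records whose
    sequence and quality strings differ in length."""
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line
        qual = handle.readline().rstrip("\n")
        if len(seq) != len(qual):
            yield header[1:].strip(), len(seq), len(qual)


# First record from the post above, wrapped in an in-memory file.
example = io.StringIO(
    "@853_8_25/1\n"
    "GNNGTGNTNCANNCGTNNGAGNNCACNNACANCCGANNACGNAAAGNAN\n"
    "+\n"
    '*""%%%"%"%%""%%%""%)&""%%%""%%+"&\'%&""\'(%"\'))\'"&"\n'
)
mismatches = list(find_length_mismatches(example))
```

For a real file, pass an open file handle instead of the `io.StringIO` object. An empty result means the plain character counts agree, which suggests that the complaint comes from how the reads are being interpreted rather than from a truncated record.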


                    • #11
                      DerSeb, what's your command?


                      • #12
                        This is my command:

                        Code:
                        tophat -G /data/genetics/datasets/genome-annotation/ensembl-56/Rattus_norvegicus.RGSC3.4.56.gtf -o /home/schaefer/tophat/Sample14 -C rn4_c /home/schaefer/tophat/fastq/Sample_14_Qual.fastq


                        • #13
                          Since you are using -C, which is for colorspace reads, you need to supply colorspace reads rather than nucleotide reads.
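
A quick way to tell the two kinds of read apart is to look at the characters in the sequence line. The sketch below assumes the usual SOLiD .csfasta convention of one primer base followed by color calls 0-3, with '.' for a missing call. `looks_like_colorspace` is a hypothetical helper and is not how TopHat itself decides.

```python
import re

# One primer base followed by color calls 0-3 ('.' marks a missing call).
COLORSPACE_READ = re.compile(r"^[ACGT][0-3.]+$")


def looks_like_colorspace(read):
    """Return True if the read looks like primer base + color calls."""
    return bool(COLORSPACE_READ.match(read))


# Made-up colorspace read: primer base T, then colors.
cs_result = looks_like_colorspace("T0120.3321")

# Start of the first read posted above: letters only, so not colorspace.
nt_result = looks_like_colorspace("GNNGTGNTNCA")
```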


                          • #14
                            I see... I converted my CS reads to fastq using scripts supplied with MAQ (solid2fastq.pl or fq_all2std.pl csfa2std). They convert the CS values to letters, mimicking a "pseudo" genetic sequence.

                            I will look into this and see how I can change it.

                            Thx!
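
For illustration, the letter encoding described above can be reversed with a simple translation table. The mapping used here (0->A, 1->C, 2->G, 3->T, '.'->N) is an assumption about the conversion scripts and should be checked against the script actually used. `pseudo_to_colors` is a hypothetical helper, and the sketch does not restore a primer base, so its output is not necessarily in the form that TopHat's -C mode expects.

```python
# Assumed double-encoding table (0->A, 1->C, 2->G, 3->T, '.'->N);
# verify it against the conversion script before relying on it.
PSEUDO_TO_COLOR = str.maketrans("ACGTN", "0123.")


def pseudo_to_colors(pseudo_seq):
    """Map a "pseudo" nucleotide string back to color calls."""
    return pseudo_seq.translate(PSEUDO_TO_COLOR)


# Start of the first read posted earlier in the thread.
colors = pseudo_to_colors("GNNGTGNTNCA")
```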


                            • #15
                              I have now started a thread dedicated to reformatting SOLiD reads for TopHat:

                              Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc
