  • HSV-1
    Member
    • Jul 2012
    • 38

    Bowtie error when mapping ABI RNA-seq data with Tophat

    Hi all,
    I got the errors shown below:

    No. 1
    =============
    [2012-07-26 16:37:43] Beginning TopHat run (v2.0.4)
    -----------------------------------------------
    [2012-07-26 16:37:43] Checking for Bowtie
    Bowtie version: 0.12.8.0
    [2012-07-26 16:37:44] Checking for Samtools
    Samtools version: 0.1.13.0
    [2012-07-26 16:37:44] Checking for Bowtie index files
    [2012-07-26 16:37:44] Checking for reference FASTA file
    Warning: Could not find FASTA file hg19_c.fa
    [2012-07-26 16:37:44] Reconstituting reference FASTA file from Bowtie index
    Executing: /home/xiaoyu/bin/bowtie-inspect hg19_c > VVinfect0h/tmp/hg19_c.fa
    [2012-07-26 16:42:49] Generating SAM header for hg19_c
    format: fastq
    quality scale: phred33 (default)
    [2012-07-26 16:43:42] Reading known junctions from GTF file
    [2012-07-26 16:43:54] Preparing reads
    left reads: min. length=50, max. length=50, 49053373 kept reads (712587 discarded)
    [2012-07-26 17:00:17] Creating transcriptome data files..
    [2012-07-26 17:01:36] Building Bowtie index from hg19UCSC.fa
    [2012-07-26 17:53:34] Mapping left_kept_reads to transcriptome hg19UCSC with Bowtie
    [FAILED]
    Error running bowtie:
    Too few quality values for read: 460T3#
    are you sure this is a FASTQ-int file?
    Command: /home/xiaoyu/bin/bowtie -q -C --col-keepends -v 1 -k 60 -m 60 -S -p 2 --sam-nohead --max /dev/null VVinfect0h/tmp/hg19UCSC -


    No. 2
    ==============


    [2012-07-26 16:43:15] Beginning TopHat run (v2.0.4)
    -----------------------------------------------
    [2012-07-26 16:43:15] Checking for Bowtie
    Bowtie version: 0.12.8.0
    [2012-07-26 16:43:15] Checking for Samtools
    Samtools version: 0.1.13.0
    [2012-07-26 16:43:15] Checking for Bowtie index files
    [2012-07-26 16:43:15] Checking for reference FASTA file
    Warning: Could not find FASTA file hg19_c.fa
    [2012-07-26 16:43:15] Reconstituting reference FASTA file from Bowtie index
    Executing: /home/xiaoyu/bin/bowtie-inspect hg19_c > VVinfect4h/tmp/hg19_c.fa
    [2012-07-26 16:48:16] Generating SAM header for hg19_c
    format: fastq
    quality scale: phred33 (default)
    [2012-07-26 16:48:22] Reading known junctions from GTF file
    [2012-07-26 16:48:33] Preparing reads
    left reads: min. length=50, max. length=50, 34854552 kept reads (1262790 discarded)
    [2012-07-26 16:59:38] Creating transcriptome data files..
    [2012-07-26 17:01:05] Building Bowtie index from hg19UCSC.fa
    [2012-07-26 17:51:34] Mapping left_kept_reads to transcriptome hg19UCSC with Bowtie
    [FAILED]
    Error running bowtie:
    Too few quality values for read: 51300T13
    are you sure this is a FASTQ-int file?
    Command: /home/xiaoyu/bin/bowtie -q -C --col-keepends -v 1 -k 60 -m 60 -S -p 2 --sam-nohead --max /dev/null VVinfect4h/tmp/hg19UCSC -


    How can I fix this? Is it possible to trim the reads "460T3#" and "51300T13" out? If so, how? Please help.
    Last edited by HSV-1; 07-26-2012, 04:51 AM. Reason: ask more
  • goudurix
    Junior Member
    • Mar 2012
    • 6

    #2
    This happened to me once when using -C with an Illumina dataset. Are you sure your reads are in color space?
    Cheers

    • HSV-1
      Member
      • Jul 2012
      • 38

      #3
      The data is from an ABI SOLiD instrument. When I open the data file, there are no ACGTs, only digits such as 1, 2, 3, ...
      So the reads are in color space.
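
A quick way to confirm this without opening the file by eye is to test the sequence lines against the usual SOLiD layout. The sketch below assumes that layout is one primer base followed by the colors 0-3, with '.' for a missing call; the function name `guess_encoding` and the 1000-read sample size are hypothetical choices for illustration, not part of TopHat or Bowtie.

```python
import re

# Assumed SOLiD convention: one primer base, then colors 0-3 ('.' = no call).
COLOR_READ = re.compile(r"^[ACGT][0-3.]+$")
# Ordinary base-space read: nucleotides only.
BASE_READ = re.compile(r"^[ACGTN]+$", re.IGNORECASE)


def guess_encoding(fastq_lines, sample=1000):
    """Classify the first `sample` reads as 'color', 'base' or 'mixed/unknown'."""
    color = base = total = 0
    for i, line in enumerate(fastq_lines):
        if i % 4 != 1:  # only the second line of each 4-line record is sequence
            continue
        seq = line.strip()
        total += 1
        if COLOR_READ.match(seq):
            color += 1
        elif BASE_READ.match(seq):
            base += 1
        if total >= sample:
            break
    if total and color == total:
        return "color"
    if total and base == total:
        return "base"
    return "mixed/unknown"


if __name__ == "__main__":
    import sys
    with open(sys.argv[1]) as handle:
        print(guess_encoding(handle))
```

If this reports "base" for a file that is being passed to TopHat with -C, the mismatch goudurix describes is the likely cause; "color" suggests the problem lies elsewhere.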

      • sonia.bao
        Member
        • May 2012
        • 12

        #4
        I got the same error when giving tophat2 csfastq files as input.

        I checked the csfastq file and found nothing wrong with it: no truncated reads or quality values. I then ran bowtie directly on the same csfastq file to align the reads to the reference genome. It finished without any error, and over 90% of the reads mapped.

        Try giving tophat2 csfasta + qual files as input instead of csfastq. I tried that and tophat2 ran through successfully.
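
The check for truncated reads or quality values mentioned above can be sketched roughly as follows. It assumes a 4-line-per-record file and that a color-space read may or may not carry a quality value for the primer base, so a quality string of either the same length as the sequence line or one shorter is accepted. The function name `find_length_mismatches` is hypothetical.

```python
def find_length_mismatches(fastq_lines):
    """Yield (read_name, sequence_length, quality_length) for suspicious records."""
    name = seq = None
    for i, line in enumerate(fastq_lines):
        text = line.rstrip("\r\n")
        mod = i % 4  # position within the 4-line FASTQ record
        if mod == 0:
            name = text[1:]  # drop the leading "@"
        elif mod == 1:
            seq = text
        elif mod == 3:
            # Assumption: the primer base may or may not have its own quality value.
            if len(text) not in (len(seq), len(seq) - 1):
                yield name, len(seq), len(text)


if __name__ == "__main__":
    import sys
    with open(sys.argv[1]) as handle:
        for name, n_seq, n_qual in find_length_mismatches(handle):
            print("%s: %d sequence characters, %d quality characters" % (name, n_seq, n_qual))
```

An empty report matches what is described here: the input file itself is consistent, and the error arises somewhere after TopHat has prepared the reads.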
        Last edited by sonia.bao; 08-01-2012, 12:07 AM.

        • HSV-1
          Member
          • Jul 2012
          • 38

          #5
          Thanks for your reply. How can I get the csfasta and qual files from the same csfastq file?

          • sonia.bao
            Member
            • May 2012
            • 12

            #6
            Try this Python script. It takes a color-space .fastq file as input and writes two files, .csfasta and .QV.qual.

            (It was not written by me. Someone wrote this script and shared it on this board (much appreciated!!!). If anybody knows who the author is, please let me know and I'll update it)

            csfastq2solid.py
            Code:
            import sys

            fq = sys.argv[1]

            # Output files share the input's base name (the text before ".fastq").
            base = fq.split(".fastq")[0]

            with open(fq) as reads, \
                    open(base + ".csfasta", "w") as seq, \
                    open(base + ".QV.qual", "w") as quals:
                for i, line in enumerate(reads):
                    mod = i % 4  # position within the 4-line FASTQ record
                    if mod == 0:  # read name
                        assert line[0] == "@"
                        quals.write(">" + line[1:])
                        seq.write(">" + line[1:])
                    elif mod == 1:  # color-space sequence
                        seq.write(line)
                    elif mod == 3:  # qualities: Phred+33 characters to integers
                        scores = (str(ord(q) - 33) for q in line.rstrip("\r\n"))
                        quals.write(" ".join(scores) + "\n")

            print("wrote %s, %s" % (quals.name, seq.name), file=sys.stderr)
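
To make the conversion concrete, here is a small sketch of what happens to a single record. It assumes Phred+33 qualities, as the script above does; the helper name `record_to_solid` and the example read are made up for illustration.

```python
def record_to_solid(name_line, cseq_line, qual_line, offset=33):
    """Return (csfasta_text, qual_text) for one color-space FASTQ record."""
    header = ">" + name_line.rstrip("\r\n")[1:]  # "@name" becomes ">name"
    cseq = cseq_line.rstrip("\r\n")
    # Each quality character becomes an integer score, separated by spaces.
    scores = " ".join(str(ord(q) - offset) for q in qual_line.rstrip("\r\n"))
    return header + "\n" + cseq + "\n", header + "\n" + scores + "\n"


csfasta, qual = record_to_solid("@read_1\n", "T0120\n", "!!5I\n")
print(csfasta, end="")  # >read_1
                        # T0120
print(qual, end="")     # >read_1
                        # 0 0 20 40
```

The sequence line is copied unchanged, and only the quality line is re-encoded, so both output files keep the reads in the same order as the input.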
            Last edited by sonia.bao; 08-01-2012, 01:04 AM. Reason: Added description
