Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to change Bowtie settings under TopHat

    Dear all

    After setting bowtie and tophat under a path I ran tophat on 20MB fastq reads.
    And I found only few aligned reads and junctions in output. When I checked logs the results made me suspicious.

    After that
    when i ran just default bowtie command, 90% of my reads are failed to align. Does any one know why ?

    If I need to change any parameters in bowtie where and how could I change them so that tophat recognizes them every time when i ran tophat?

    My reads are 35bp single end illumina-solexa fastq reads (hg18). It would be great if you provide any parameters if I need to change in bowtie especially for these kinda reads.

    Thanx in advance
    Last edited by repinementer; 08-09-2010, 01:36 AM.

  • #2
    You may not be able to do it as TopHat needs Bowtie to run in a specific way. The only way to crack is to re-compile TopHat, changing the way it calls Bowtie, and you're at the risk of crashing TopHat.

    Comment


    • #3
      Actually, the 'tophat' command itself is a python script that could potentially be edited if you wanted to hack at that. Save a copy first in case a change makes it stop working.

      Comment


      • #4
        Here is the "call bowtie" part in tophat python script. From what I can see (by a quick search), only -k, -m, -v (or -n) are set (see the red part below), which means only output the reads that have mismatch number <= a specific number (set by -n or -v) and total number of qualified reads <= maxhit (set by -k and -m, in tophat it's -g/--max-multihits). So, by default, tophat call bowtie as "bowtie -v N -k 20 -m 20". You can change it if you want to use parameters like "-a --best --strata" (be aware that -a and -k are mutually exclusive).


        # Call bowtie
        def bowtie(params,
        bwt_idx_prefix,
        reads_list,
        reads_format,
        num_mismatches,
        mapped_reads,
        unmapped_reads,
        extra_output = "",
        t_mapping = False,
        multihits_out = None): #only --prefilter-multihits should activate this parameter for the initial prefilter search
        start_time = datetime.now()
        bwt_idx_name = bwt_idx_prefix.split('/')[-1]
        reads_file=reads_list[0]
        readfile_basename=getFileBaseName(reads_file)
        if t_mapping:
        print >> sys.stderr, "[%s] Mapping %s against transcriptome %s with Bowtie %s" % (start_time.strftime("%c"),
        readfile_basename, bwt_idx_name, extra_output)
        else:
        print >> sys.stderr, "[%s] Mapping %s against %s with Bowtie %s" % (start_time.strftime("%c"),
        readfile_basename, bwt_idx_name, extra_output)
        bwt_log = open(logging_dir + 'bowtie.'+readfile_basename+'.fixmap.log', "w")
        #bwt_mapped=mapped_reads
        unmapped_reads_out=unmapped_reads
        if use_BWT_FIFO:
        global unmapped_reads_fifo
        if os.path.exists(unmapped_reads_fifo):
        os.remove(unmapped_reads_fifo)
        try:
        os.mkfifo(unmapped_reads_fifo)
        except OSError, o:
        die(fail_str+"Error at mkfifo("+unmapped_reads_fifo+'). '+str(o))
        unmapped_reads_out=unmapped_reads_fifo

        # Launch Bowtie
        try:
        bowtie_cmd = [bowtie_path]

        if reads_format == "fastq":
        bowtie_cmd += ["-q"]
        elif reads_format == "fasta":
        bowtie_cmd += ["-f"]

        if params.read_params.color:

        bowtie_cmd += ["-C", "--col-keepends"]
        unzip_cmd=None
        if use_zpacker:
        unzip_cmd=[ params.system_params.zipper ]
        unzip_cmd.extend(params.system_params.zipper_opts)
        zip_cmd=unzip_cmd[:]
        unzip_cmd+=['-cd']
        zip_cmd+=['-c']

        fifo_pid=None
        if unmapped_reads:
        bowtie_cmd += ["--un", unmapped_reads_out]

        if use_BWT_FIFO:
        print >> run_log, ' '.join(zip_cmd)+' < '+ unmapped_reads_fifo + ' > '+ unmapped_reads + ' & '
        fifo_pid=os.fork()
        if fifo_pid==0:
        def on_sig_exit(sig, func=None):
        os._exit(os.EX_OK)
        signal.signal(signal.SIGTERM, on_sig_exit)
        subprocess.call(zip_cmd,
        stdin=open(unmapped_reads_fifo, "r"),
        stdout=open(unmapped_reads, "wb"))
        os._exit(os.EX_OK)

        fix_map_cmd = [prog_path('fix_map_ordering')]

        max_hits = params.max_hits
        if t_mapping:
        max_hits = params.t_max_hits
        if params.bowtie_alignment_option == "-n" and num_mismatches>3:
        num_mismatches=3
        bowtie_cmd += [params.bowtie_alignment_option, str(num_mismatches),
        "-p", str(params.system_params.num_cpus),
        "-k", str(max_hits),
        "-m", str(max_hits)]

        if multihits_out:
        bowtie_cmd += ["--max", multihits_out]
        else:
        bowtie_cmd += ["--max", "/dev/null"]


        bowtie_cmd += [ bwt_idx_prefix ]
        bowtie_proc=None
        shellcmd=""
        unzip_proc=None
        zip_proc=None
        if multihits_out:
        #special prefilter bowtie run: we use prep_reads on the fly
        #in order to get multi-mapped reads to exclude later
        prep_cmd = prep_reads_cmd(params, ",".join(reads_list))
        preplog_fname=logging_dir + "prep_reads_prefilter.log"
        prepfilter_log = open(preplog_fname,"w")
        unzip_proc = subprocess.Popen(prep_cmd,
        stdout=subprocess.PIPE,
        stderr=prepfilter_log)
        shellcmd=' '.join(prep_cmd) + "|"
        else:
        z_input=use_zpacker and reads_file.endswith(".z")
        if z_input:
        unzip_proc = subprocess.Popen(unzip_cmd,
        stdin=open(reads_file, "rb"),
        stdout=subprocess.PIPE)
        shellcmd=' '.join(unzip_cmd) + "< " +reads_file +"|"
        else:
        #must be uncompressed fastq input (e.g. unmapped reads from a previous run)
        bowtie_cmd += [reads_file]
        bowtie_proc = subprocess.Popen(bowtie_cmd,
        stdout=subprocess.PIPE,
        stderr=bwt_log)
        if unzip_proc:
        #input is compressed OR prep_reads is used as a filter
        bowtie_cmd += ['-']
        bowtie_proc = subprocess.Popen(bowtie_cmd,
        stdin=unzip_proc.stdout,
        stdout=subprocess.PIPE,
        stderr=bwt_log)
        unzip_proc.stdout.close() # see http://bugs.python.org/issue7678

        fix_map_cmd += ['-']
        shellcmd += ' '.join(bowtie_cmd) + '|' + ' '.join(fix_map_cmd)
        if use_zpacker:
        shellcmd += "|"+ ' '.join(zip_cmd)
        fix_order_proc = subprocess.Popen(fix_map_cmd,
        stdin=bowtie_proc.stdout,

        stdout=subprocess.PIPE)
        zip_proc = subprocess.Popen(zip_cmd,
        preexec_fn=subprocess_setup,
        stdin=fix_order_proc.stdout,
        stdout=open(mapped_reads, "wb"))
        fix_order_proc.stdout.close()
        else:
        fix_order_proc = subprocess.Popen(fix_map_cmd,
        preexec_fn=subprocess_setup,
        stdin=bowtie_proc.stdout,
        stdout=open(mapped_reads, "w"))
        bowtie_proc.stdout.close()
        shellcmd += " > " + mapped_reads
        print >> run_log, shellcmd
        if use_zpacker:
        zip_proc.communicate()
        else:
        fix_order_proc.communicate()
        if use_BWT_FIFO:
        if fifo_pid and not os.path.exists(unmapped_reads):
        try:
        os.kill(fifo_pid, signal.SIGTERM)
        except:
        pass
        except OSError, o:
        die(fail_str+"Error: "+str(o))

        # Success
        #finish_time = datetime.now()
        #duration = finish_time - start_time
        #print >> sys.stderr, "\t\t\t[%s elapsed]" % formatTD(duration)
        if use_BWT_FIFO:
        try:
        os.remove(unmapped_reads_fifo)
        except:
        pass
        if multihits_out and not os.path.exists(multihits_out):
        open(multihits_out, "w").close()
        return (mapped_reads, unmapped_reads)
        Last edited by sterding; 04-04-2012, 07:52 AM.

        Comment

        Latest Articles

        Collapse

        • seqadmin
          Non-Coding RNA Research and Technologies
          by seqadmin




          Non-coding RNAs (ncRNAs) do not code for proteins but play important roles in numerous cellular processes including gene silencing, developmental pathways, and more. There are numerous types including microRNA (miRNA), long ncRNA (lncRNA), circular RNA (circRNA), and more. In this article, we discuss innovative ncRNA research and explore recent technological advancements that improve the study of ncRNAs.

          Nobel Prize for MicroRNA Discovery
          This week,...
          10-07-2024, 08:07 AM
        • seqadmin
          Recent Developments in Metagenomics
          by seqadmin





          Metagenomics has improved the way researchers study microorganisms across diverse environments. Historically, studying microorganisms relied on culturing them in the lab, a method that limits the investigation of many species since most are unculturable1. Metagenomics overcomes these issues by allowing the study of microorganisms regardless of their ability to be cultured or the environments they inhabit. Over time, the field has evolved, especially with the advent...
          09-23-2024, 06:35 AM

        ad_right_rmr

        Collapse

        News

        Collapse

        Topics Statistics Last Post
        Started by seqadmin, 10-02-2024, 04:51 AM
        0 responses
        103 views
        0 likes
        Last Post seqadmin  
        Started by seqadmin, 10-01-2024, 07:10 AM
        0 responses
        112 views
        0 likes
        Last Post seqadmin  
        Started by seqadmin, 09-30-2024, 08:33 AM
        1 response
        115 views
        0 likes
        Last Post EmiTom
        by EmiTom
         
        Started by seqadmin, 09-26-2024, 12:57 PM
        0 responses
        21 views
        0 likes
        Last Post seqadmin  
        Working...
        X