
  • Using Tophat-Fusion to detect Structural Variation in E. coli

    Hello,

    I'm trying to use Tophat-Fusion to detect structural variation in a set of E. coli reads. The program runs without reporting any errors, but the output doesn't look correct.

    I have a few simple questions first.

    When giving paired-end reads to Tophat-Fusion, how should they be passed? Should they be in two separate files with similar names, like set_of_reads_1.fastq and set_of_reads_2.fastq, with corresponding read names for each pair? Or should they be merged into one file, like set_of_reads.fastq, so that each consecutive pair of reads in the file is treated as a mate pair?

    I built my Bowtie index like so:
    bowtie-build REL606.5.gbk bowtie_REL606.5
    This seems to build correctly.

    My reads are 50 bp long, with an inner distance of 100 bp between mates. I then call Tophat-Fusion like so:

    tophat-fusion -p 12 --solexa-quals -r 100 --mate-std-dev 20 -o paired_tophat bowtie_REL606.5 set_1.fastq set_2.fastq

    tophat-fusion -p 12 --solexa-quals -r 100 --mate-std-dev 20 -o merged_tophat bowtie_REL606.5 set.fastq

    I ran it both ways because I'm unsure which of the two input layouts described above is expected.

    When the run completed, I tried to examine the SAM file, but the samtools view command failed with the following error:
    [sam_read1] reference 'REL606.5-REL606.5' is recognized as '*'.
    Parse error at line 2428: invalid CIGAR operation

    How can I examine the SAM file?
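One way to see what samtools is objecting to is to scan the SAM file directly for CIGAR strings that fall outside the standard operation set. The following is a minimal sketch, not part of Tophat-Fusion or samtools: the function name `find_bad_cigars` is hypothetical, and it assumes a tab-delimited SAM file with the CIGAR in the sixth column.

```python
import re

# Standard SAM CIGAR: "*" or one or more <length><op> pairs, op in MIDNSHP=X.
CIGAR_RE = re.compile(r"^(\*|([0-9]+[MIDNSHP=X])+)$")


def find_bad_cigars(lines):
    """Yield (line_number, cigar) for alignment lines with a non-standard CIGAR.

    find_bad_cigars is a hypothetical helper written for illustration.
    Lines with fewer than the 11 mandatory SAM fields are reported with
    cigar set to None.
    """
    for number, line in enumerate(lines, start=1):
        if line.startswith("@"):
            # Header lines carry no alignment fields.
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            yield number, None
            continue
        cigar = fields[5]
        if not CIGAR_RE.match(cigar):
            yield number, cigar


if __name__ == "__main__":
    import sys

    # Usage (file name is a placeholder): python check_cigar.py some_output.sam
    with open(sys.argv[1]) as handle:
        for number, cigar in find_bad_cigars(handle):
            print(number, cigar)
```

Printing the offending lines shows which operation letter the parser rejects, which helps decide whether the file is damaged or simply uses a non-standard extension.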

    Platform: Linux

    Versions:
    TopHat v0.1.0 (Beta)
    bowtie version 0.12.7
    Samtools Version: 0.1.15 (r949:203)

    Thanks
    Last edited by aaronreba; 06-27-2012, 12:48 PM.

  • #2
    I've found the answers to my questions, for anyone else interested. I was apparently using an old version of Tophat, which I learned after downloading just the Tophat2 binaries. The reads must also be in separate files, like so:

    File 1:

    read1.1
    atgatgc...
    +
    #$@#$...
    read2.1
    atgatgc...
    +
    #$@#$...

    File 2:

    read1.2
    atgatgc...
    +
    #$@#$...
    read2.2
    atgatgc...
    +
    #$@#$...
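The layout above requires the two files to list mates in the same order with matching names. A minimal sketch for checking this before a run is given below; the helper names (`read_names`, `base_name`, `first_mismatch`) are hypothetical, and the sketch assumes strict 4-line FASTQ records and mate suffixes of the form `.1`/`.2` or `/1`/`/2`.

```python
from itertools import zip_longest


def read_names(lines):
    """Yield read names from 4-line FASTQ records (name, sequence, '+', qualities)."""
    for index, line in enumerate(lines):
        if index % 4 == 0 and line.strip():
            # Drop a leading '@' if present and any description after whitespace.
            yield line.strip().lstrip("@").split()[0]


def base_name(name):
    """Drop a trailing mate suffix such as '.1', '.2', '/1' or '/2'."""
    if len(name) > 2 and name[-2] in "./" and name[-1] in "12":
        return name[:-2]
    return name


def first_mismatch(lines_1, lines_2):
    """Return (record_index, name_1, name_2) for the first unpaired record, or None."""
    pairs = zip_longest(read_names(lines_1), read_names(lines_2))
    for index, (name_1, name_2) in enumerate(pairs):
        if name_1 is None or name_2 is None:
            # One file ran out of records before the other.
            return index, name_1, name_2
        if base_name(name_1) != base_name(name_2):
            return index, name_1, name_2
    return None


if __name__ == "__main__":
    import sys

    # Usage (file names are placeholders): python check_pairs.py set_1.fastq set_2.fastq
    with open(sys.argv[1]) as file_1, open(sys.argv[2]) as file_2:
        print(first_mismatch(file_1, file_2))
```

A result of `None` means every record in the first file has a same-named mate at the same position in the second file.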
    Last edited by aaronreba; 06-28-2012, 10:52 AM.
