  • Using Tophat-Fusion to detect Structural Variation in E. coli

    Hello,

    I'm trying to use TopHat-Fusion to detect structural variation in E. coli sequencing data. The program runs without reporting errors, but the output doesn't look correct.

    I have a few simple questions first.

    When giving paired-end reads to TopHat-Fusion, how should they be passed? Should they be in two separate files with similar names, such as set_of_reads_1.fastq and set_of_reads_2.fastq, with corresponding read names for the pairs? Or should they be merged into one file, such as set_of_reads.fastq, so that every two consecutive reads form a pair?

    I built my Bowtie index like so:
    bowtie-build REL606.5.gbk bowtie_REL606.5
    The index seems to build correctly.

    My reads are 50 bp long, with a gap of 100 bp between mates. I then call TopHat-Fusion like so:

    tophat-fusion -p 12 --solexa-quals -r 100 --mate-std-dev 20 -o paired_tophat bowtie_REL606.5 set_1.fastq set_2.fastq

    tophat-fusion -p 12 --solexa-quals -r 100 --mate-std-dev 20 -o merged_tophat bowtie_REL606.5 set.fastq

    I called it in two different ways because I'm unsure which of the input layouts I mentioned above is correct.

    When it completed, I tried to examine the SAM file, but the samtools view command failed with the following error:
    [sam_read1] reference 'REL606.5-REL606.5' is recognized as '*'.
    Parse error at line 2428: invalid CIGAR operation

    How can I examine the SAM file?

    Platform: Linux

    Versions:
    TopHat v0.1.0 (Beta)
    bowtie version 0.12.7
    Samtools Version: 0.1.15 (r949:203)

    Thanks
    Last edited by aaronreba; 06-27-2012, 12:48 PM.
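
For anyone hitting the same "invalid CIGAR operation" error from samtools view: since SAM is plain text, one way to look at the offending records is to scan the file directly. The following is a minimal Python sketch, not part of TopHat-Fusion or samtools. It assumes tab-separated SAM records with the reference name in column 3 and the CIGAR string in column 6, and it treats only the operations M, I, D, N, S, H, P, = and X as standard. The function name find_nonstandard_cigars is made up for this illustration, and the line numbers it reports count header lines too, so they may not match the number printed by samtools.

```python
import re
import sys

# A standard CIGAR is "*" or one or more <length><operation> pairs.
STANDARD_CIGAR = re.compile(r"^(\*|([0-9]+[MIDNSHP=X])+)$")


def find_nonstandard_cigars(sam_lines):
    """Return (line_number, reference_name, cigar) for suspicious records.

    Header lines (starting with "@") and blank lines are skipped.
    Records with fewer than 6 columns are reported with None values.
    """
    problems = []
    for line_number, line in enumerate(sam_lines, start=1):
        if line.startswith("@") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            problems.append((line_number, None, None))
            continue
        reference_name, cigar = fields[2], fields[5]
        if not STANDARD_CIGAR.match(cigar):
            problems.append((line_number, reference_name, cigar))
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (the path is whatever SAM file the run produced):
    #   python check_cigar.py path/to/file.sam
    with open(sys.argv[1]) as handle:
        for line_number, reference_name, cigar in find_nonstandard_cigars(handle):
            print(line_number, reference_name, cigar, sep="\t")
```

Printing the reference name next to the CIGAR string shows whether the records that samtools rejects are the same ones that carry a combined reference name such as 'REL606.5-REL606.5'.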

  • #2
    I've found the answers to my questions, for anyone else interested. I was apparently using an old version of TopHat; I learned this after downloading just the TopHat2 binaries. The reads must also be in separate files, like so:

    File 1:

    read1.1
    atgatgc...
    +
    #$@#$...
    read2.1
    atgatgc...
    +
    #$@#$...

    File 2:

    read1.2
    atgatgc...
    +
    #$@#$...
    read2.2
    atgatgc...
    +
    #$@#$...
    Last edited by aaronreba; 06-28-2012, 10:52 AM.
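
Before running the aligner, it can help to confirm that two FASTQ files really follow the layout above, with the n-th read in file 1 pairing with the n-th read in file 2. Here is a minimal Python sketch of such a check. It assumes plain 4-line FASTQ records (no wrapped sequences) and mate suffixes of the form .1/.2 or /1 and /2; the function names read_names, base_name and first_mismatch are made up for this illustration and are not part of TopHat.

```python
import sys
from itertools import zip_longest


def read_names(fastq_lines):
    """Yield the read name of each 4-line FASTQ record."""
    for index, line in enumerate(fastq_lines):
        if index % 4 != 0:
            continue
        header = line.strip()
        if not header:
            continue  # tolerate a trailing blank line
        if header.startswith("@"):
            header = header[1:]
        yield header.split()[0]


def base_name(name):
    """Drop a trailing mate suffix such as '.1', '.2', '/1' or '/2'."""
    for suffix in (".1", ".2", "/1", "/2"):
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name


def first_mismatch(fastq_1_lines, fastq_2_lines):
    """Return (record_index, name_1, name_2) for the first bad pair, else None.

    A pair is bad when the base names differ or when one file runs out
    of records before the other (the missing name is reported as None).
    """
    pairs = zip_longest(read_names(fastq_1_lines), read_names(fastq_2_lines))
    for record_index, (name_1, name_2) in enumerate(pairs, start=1):
        if name_1 is None or name_2 is None:
            return record_index, name_1, name_2
        if base_name(name_1) != base_name(name_2):
            return record_index, name_1, name_2
    return None


if __name__ == "__main__" and len(sys.argv) > 2:
    # Usage: python check_pairs.py set_1.fastq set_2.fastq
    with open(sys.argv[1]) as file_1, open(sys.argv[2]) as file_2:
        print(first_mismatch(file_1, file_2))
```

A result of None means every record lined up with its mate; anything else gives the record position and the two names that disagree.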
