
**vinay052003** · 07-11-2012, 10:10 AM

I am sure you have run tophat-fusion-post to remove the potential false positives. Beyond that, PCR experiments are the only way to validate the potential fusions.

"How do you go about to find the true positives and potential de novo fusions?"

That's the question I have been trying to answer. There is no way TopHat-Fusion can tell whether a fusion is due to a de novo rearrangement, or is the result of trans-splicing or some cDNA library artifact.

**NKAkers** · 07-11-2012, 03:53 PM

I have been looking into this a bit lately. My results are dominated by "fusions" of two genes on the same chromosome, generally about 200kbp apart. After reading the methods of the tophat-fusion paper, I'm fairly convinced these are read-through transcripts. I didn't know what that was, so I read a few papers on the topic; this is a particularly interesting one: http://genome.cshlp.org/content/16/1/30.full

Once those are eliminated, it's less clear. The tophat-fusion authors used different filters for different datasets, determined empirically so that they found the known fusions in their datasets. For example they varied the minimum supporting pairs and spanning reads needed to call a fusion. It makes some sense to me that these requirements should get more restrictive as your number of reads goes up, but picking the right filters for my data is guesswork at this point, as I have no known fusions as a positive control.
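To make the filtering idea above concrete, here is a minimal Python sketch. It is not how tophat-fusion-post works internally; the `Candidate` fields, the thresholds, and the 500 kbp read-through distance are all assumptions chosen for illustration, and the toy candidates are invented.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # Hypothetical record layout, not a TopHat-Fusion data structure.
    chrom_left: str
    pos_left: int
    chrom_right: str
    pos_right: int
    spanning_reads: int
    supporting_pairs: int


def is_possible_read_through(c, max_distance=500_000):
    """Flag same-chromosome candidates whose two ends lie close together."""
    same_chrom = c.chrom_left == c.chrom_right
    return same_chrom and abs(c.pos_left - c.pos_right) <= max_distance


def passes_filters(c, min_reads=3, min_pairs=2, max_distance=500_000):
    """Keep a candidate only if it is not a likely read-through and has
    enough spanning reads and supporting pairs (thresholds are guesses)."""
    if is_possible_read_through(c, max_distance):
        return False
    return c.spanning_reads >= min_reads and c.supporting_pairs >= min_pairs


# Invented toy candidates, for illustration only.
candidates = [
    Candidate("chr1", 1_000_000, "chr1", 1_200_000, 40, 25),  # ~200 kbp apart
    Candidate("chr2", 5_000, "chr5", 9_000, 1, 2),            # too few reads
    Candidate("chr3", 7_000, "chr8", 3_000, 12, 30),          # kept
]
kept = [c for c in candidates if passes_filters(c)]
print([f"{c.chrom_left}-{c.chrom_right}" for c in kept])
```

Raising `min_reads` and `min_pairs` as sequencing depth goes up matches the intuition above, but without a known fusion as a positive control the values remain guesswork.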

As vinay points out, PCR in the wet lab is probably the only way to make a convincing argument that you've got a fusion. Hopefully you have DNA from your samples in addition to the RNA.

**bharati** · 09-08-2012, 03:49 AM

Confusion in understanding the tophat fusion results

Can anyone explain the tophat-fusion output file to me?

For each predicted fusion candidate, columns 9 and 10 of the first row are the number of bases on the left and right sides of the fusion, i.e. their sum should be 100, as I used 100 bp PE Illumina data.
I couldn't figure out the 11th column of the first row for each candidate fusion.
I am also unable to understand the meaning of the second row.

Kindly help me clear up these confusions.

**NKAkers** · 09-10-2012, 12:28 PM

bharati,

Although I can't help with column 11 or the second row, I may be able to help with columns 9 and 10.

These are "the number of bases on the left and right sides of a fusion, respectively, covered by spanning reads". So in the case where your left read only spans the fusion by 1 base, you should have 100-1=99 bases covered on the left side. Likewise for the right side. I believe that with good coverage these should each be about equal to the size of your read length.
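As a worked version of that arithmetic, here is a toy Python model. It assumes the covered bases on each side are simply the furthest reach of any spanning read on that side, which is my simplification and not taken from the TopHat-Fusion source; `covered_bases` and its argument are hypothetical names.

```python
READ_LENGTH = 100  # 100 bp reads, as in the question above


def covered_bases(left_overhangs, read_length=READ_LENGTH):
    """left_overhangs: for each spanning read, how many of its bases fall
    to the left of the fusion point (the rest fall to the right)."""
    left = max(left_overhangs)
    right = max(read_length - k for k in left_overhangs)
    return left, right


# One read that crosses the fusion by a single base on the right side.
print(covered_bases([99]))         # (99, 1)

# Several reads crossing at different offsets: both sides approach 100.
print(covered_bases([99, 50, 1]))  # (99, 99)
```

Under this model neither column can exceed the read length, so values larger than that would have to come from something the model leaves out.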

**bharati** · 11-08-2012, 03:05 AM

Hi NKAkers,
This is not always the case, as some of the entries have 133 and 54, or 6 and 66, or 17 and 259, or even 36 and 710 in columns 9 and 10 respectively (as shown below). I am unable to figure this out.
Can you please guide me and help?
chr7-chr9 20414 141122659 fr 1 2 0 17 17 259 33.000000 @ 2 2 2 2 2 @ AGATCAGTGATAGGGCATGGTGTGGATATTATTACATTAGTATTGGAAGC GATGGTGTGGATTAGATCAGTGATAGGGCATGGTGTGGATATTATTACAT @ AGATCAGTGATAGGGCATGGTGTGGATATTATTACATTAGTATTGGAAGT GATGGTGTGTATTAGATCAGTGATAGGGCATGGTGTGGATATTATTACAT @ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 @ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 @ -55:-288 -127:-893

chr16-chr1 65607 15922 ff 1 3 0 1 36 710 14.000000 @ 2 3 6 6 6 @ CAGCTTGGCGGATGGACTCTAGCAGAGTGGCCCAGCCACCGGAGGGGTCG ACCACTTCTCTGGGAGCTCCCTGGACTGGAGCCGGGAGGTGGGGAACAGG @ CCAGCTTGGCGGATGGACTCTAGCAGAGTGGCCAGCCACCGGAGGGGTCA ACCACTTCCCTGGGAGCTCCCTGGACTGGAGCCGGGAGGTGGGGAACAGG @ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 @ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 @ -249:253 -249:310 -1470:1446

**NKAkers** · 11-08-2012, 10:47 AM

Hi bharati,

The manual has a whole section on this output. http://tophat.cbcb.umd.edu/fusion_manual.html

In your specific cases, you're getting weird numbers because you're looking at false positives. The strings of 1s after the sequences give, position by position, the number of reads that support the fusion at that site, and a read depth of 1 isn't enough to call anything. In essence, your chr7-chr9 fusion is called based on two reads in different spots. I would strongly recommend running tophat-fusion-post before trying to analyze any of the 'results' from TopHat-Fusion. The raw output is unfiltered and riddled with false positives.
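For anyone who wants to check the depth strings programmatically, here is a rough Python sketch that splits one line of the raw output on its `@` separators. The field names are my own labels based on this thread, not names from the manual, and the example line is the chr7-chr9 entry above with the sequences and depth strings truncated for brevity.

```python
def parse_fusion_line(line):
    """Split one raw fusion line into '@'-separated sections and pull out
    the fields discussed in this thread. Field names are hypothetical."""
    sections = [s.split() for s in line.split("@")]
    head = sections[0]
    return {
        "chromosomes": head[0],
        "left_pos": int(head[1]),
        "right_pos": int(head[2]),
        "orientation": head[3],
        "left_bases_covered": int(head[8]),   # column 9
        "right_bases_covered": int(head[9]),  # column 10
        # The two runs of small integers that follow the sequences.
        "depth_a": [int(x) for x in sections[4]],
        "depth_b": [int(x) for x in sections[5]],
    }


# Truncated version of the chr7-chr9 line posted above.
line = (
    "chr7-chr9 20414 141122659 fr 1 2 0 17 17 259 33.000000 "
    "@ 2 2 2 2 2 @ AGATC GATGG @ AGATC GATGG "
    "@ 1 1 1 0 0 @ 1 1 1 1 1 @ -55:-288 -127:-893"
)
rec = parse_fusion_line(line)
max_depth = max(rec["depth_a"] + rec["depth_b"])
print(rec["chromosomes"], rec["left_bases_covered"],
      rec["right_bases_covered"], "max depth:", max_depth)
```

A maximum depth of 1 across both strings is a quick way to spot candidates like this one that rest on a single read per position.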

**bharati** · 12-04-2012, 10:21 PM

Explanation for result.html

Hi NKAkers,
Can you please explain the result.html file generated by tophat-fusion-post? I am unable to correlate this file (result.html) with its potential_fusions.txt file.

Thanx

**bharati** · 12-05-2012, 05:53 AM

table description in result.html

Hi
In the table description of result.html generated by tophat-fusion-post, I am unable to understand the last line, i.e. "If you follow the 9th column, it shows coordinates "number1:number2" where one end is located at a distance of "number1" bases from the left genomic coordinate of a fusion and "number2" is similarly defined".
As we know, the table has the following fields:

chr12-chr8 fr
RB005 PCBP2 chr12 53858636 FLJ39080 chr8 75515898 139 1509 175

So which field is the line in red above about?
Please explain if anybody can.

Thanx

**NKAkers** · 12-05-2012, 06:01 PM

Hi bharati,

If you open the result.html file and click on the 9th column hyperlink (i.e. the '1509'), it should move your view down the HTML page to a heading that says 1509 pairs, and below that heading there are 1509 lines, looking something like:

1509 pairs
29:10
1311:-75
1328:-78
-2434:-68
-3770:187
-3745:252
...

Each line gives the coordinates, relative to the proposed fusion site, of the two mates of one read pair.
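If you copy that list out of the page, a short Python sketch like the one below can turn the "number1:number2" lines into integer pairs. The function name is hypothetical and the input is just the six example lines shown above, pasted as text.

```python
def parse_pair_offsets(lines):
    """Parse 'number1:number2' lines into (int, int) tuples, skipping
    anything without a colon (headings, '...', blank lines)."""
    pairs = []
    for line in lines:
        line = line.strip()
        if ":" not in line:
            continue
        first, second = line.split(":")
        pairs.append((int(first), int(second)))
    return pairs


example = """1509 pairs
29:10
1311:-75
1328:-78
-2434:-68
-3770:187
-3745:252
..."""

offsets = parse_pair_offsets(example.splitlines())

# The pair whose two mates lie closest to the proposed fusion site.
closest = min(offsets, key=lambda p: abs(p[0]) + abs(p[1]))
print(len(offsets), closest)
```

Sorting or plotting these offsets shows how tightly the supporting pairs cluster around the proposed fusion site.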

I'm not certain what the purpose of potential_fusions.txt is, apart from giving some of the info from result.html in text form.

**bharati** · 12-06-2012, 09:33 PM

Difference between Spanning Reads and Spanning Matepairs

Hi,
Can you please explain the difference between spanning reads and spanning mate pairs? As far as I understand, the number of spanning mate pairs should be lower than the number of spanning reads, but this is not the case in my results.

thanx

**bharati** · 12-08-2012, 05:01 AM

Confusion between Spanning reads and spanning mate pairs

Hi NKAkers,

What I understand is that spanning reads are reads that do not harbor the fusion point, whereas split reads do harbor it, and spanning mate pairs are spanning reads that are supported by their mates.

Are spanning reads the single reads that do not have mates?

Analyze Tophat Fusion Output
