I am confused about the mate inner distance setting. According to the manual, it should be set to fragment length - 2 * read length (e.g. 300 - 2*50 = 200). But if the distance is counted in genome coordinates, then the distance between the mates should be fragment length - 2 * read length + the length of any intervening introns. Does anybody know how TopHat handles the introns?
-
I think the mate inner distance is set so that the software can detect whether the mates fall in different exons, etc. That is, if the observed distance is much larger or smaller than the 200 (300 - 2*50) that the software has been told to expect, something interesting might be going on.
-
Hi snp_analyser. I'm afraid there is no good value - the inner distances depend on the sizes of your fragments and the lengths of your reads, which are experiment specific.
We typically use a tool like Bowtie to help us find our fragment sizes empirically. We run a paired-end alignment with Bowtie, using default parameters for -I and -X. We then examine the output to see, in general, how far apart the reads in a pair are once aligned. This indicates the mate inner distance.
In terms of terminology, the "gap" or "inner distance" is the distance between the aligned reads (not counting the reads themselves). The "insert size", on the other hand, includes the reads themselves, so can be thought of as the "fragment size".
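To make the terminology concrete, here is a minimal Python sketch of the relationship between the two values. The function names are just made up for illustration, and it assumes both reads in a pair have the same length.

```python
def inner_distance(insert_size: int, read_length: int) -> int:
    """Gap between the two mates: the fragment minus both reads."""
    return insert_size - 2 * read_length


def insert_size(inner: int, read_length: int) -> int:
    """Fragment size: the gap plus both reads."""
    return inner + 2 * read_length


# The example from the first post: 300 bp fragments, 50 bp reads.
gap = inner_distance(300, 50)   # 200
frag = insert_size(gap, 50)     # back to 300
print(gap, frag)
```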
If you look at Bowtie output, the alignment position of a read is the position of the first base in the read (from the perspective of the forward reference strand). This means that if you subtract the alignment positions of the two reads in a pair, the result is actually "inner distance" + "read length". So you will need to subtract the read length to get the inner distance.
You should probably write (or find) a script to do this for you to ensure you examine enough pairs to get a representative feel for the value. Our data typically shows a normally distributed inner distance.
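A rough sketch of such a script in Python is below. It does not parse Bowtie output (the column layout depends on the output format you use), so it assumes you have already pulled out the leftmost alignment position of each mate; the `pairs` values here are made-up numbers for illustration, not real data. It also assumes both mates have the same read length.

```python
from statistics import mean, stdev


def inner_distances(pairs, read_length):
    """Inner distance per pair from the leftmost positions of the two mates.

    The difference between the two positions is inner distance + read
    length, so the read length is subtracted once.
    """
    return [abs(right - left) - read_length for left, right in pairs]


# Hypothetical (left mate position, right mate position) values.
pairs = [(1000, 1250), (5000, 5245), (9000, 9255)]
dists = inner_distances(pairs, read_length=50)

mu = mean(dists)   # centre of the distribution
sd = stdev(dists)  # spread (sample standard deviation)
print(dists, mu, sd)
```

With real data you would feed in many thousands of pairs, and the mean and standard deviation give you a feel for the value and how tightly it is distributed.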