What is the alignment difference between a single end and paired end read?
Hi, I'm currently working on an alignment program (bwa). To verify functionality, I need to run tests with paired end reads. I know how paired end reads are made, but how would you make a sample paired end read from a reference genome? For a single end read I just take any random 35 bp sequence, but what do I do for a paired end read?
Thanks,
Matt
-
Browsing through the old posts, I found this quite useful. But isn't it a deletion when the distance is 100bp and an insertion if the reads are 1kb away?
The reads, which come from your sequence, are ~500 bases apart. They are always 500 bases apart. That is a biological fact, assuming that you did the laboratory work correctly.
If you map your reads onto the reference genome and find that they are ~500 bases apart then you know that there is no insertion or deletion -- or at least no single indel event.
If you map your reads on the reference and find that they are 100 bases apart then you have to think -- how did those 100 bases on the reference become the 500 bp that separate the reads biologically? Either your genome had an insertion compared to the reference, or the reference had a deletion compared to your sequence. The original post said:
Same thing with an insertion, if your reads mapped 100bp apart on the reference, this suggests that your genome has an insertion.
As I said in my first paragraph, the confusion may be arising from which genome you are talking about. As I was writing this message my mind kept flipping back and forth between the genomes. Usually what I want to know about is my genome -- did it have an insertion or deletion. But the mapping is done to the reference genome, and it is there that we find smaller or larger pairing distances ... these are the inverse of what biologically happened to my sequence.
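To make the "inverse" relationship concrete, here is a small sketch. Everything in it is made up for illustration: a random reference, a hypothetical sample genome that is missing 400 bases, and exact string matching standing in for a real aligner. Both reads are kept on the forward strand to keep the arithmetic easy to follow.

```python
import random

rng = random.Random(1)
reference = "".join(rng.choice("ACGT") for _ in range(3000))

# Hypothetical sample genome: the reference with bases 1200..1599 deleted.
sample = reference[:1200] + reference[1600:]

# A 500 bp fragment taken from the SAMPLE, spanning the deletion point.
frag_start = 1000
fragment = sample[frag_start:frag_start + 500]
left_end = fragment[:35]
right_end = fragment[-35:]

# Exact-match "mapping" of both ends back to the REFERENCE.
left_pos = reference.find(left_end)
right_pos = reference.find(right_end) + 35  # end coordinate of the right read

# The ends are 500 bp apart in the sample but further apart on the reference.
span_on_reference = right_pos - left_pos
print(span_on_reference)  # 500 + 400 = 900, provided both 35-mers are unique
```

The fragment is still 500 bp in the sample, but its ends land 900 bp apart on the reference, because the reference still contains the 400 bases that the sample lost.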
-
Originally posted by ECO:
Structural rearrangements can be deduced when your read pairs map to a reference at a distance that is substantially different from how that library was constructed (~500bp in the above example). Let's say you had two reads that mapped to your reference 1000bp apart...this suggests there has been a deletion between those two sequence reads within your genome. Same thing with an insertion, if your reads mapped 100bp apart on the reference, this suggests that your genome has an insertion.
-
Paired end (mate pair) sequencing explanation
biocc,
"paired end" or "mate pair" refers to how the library is made, and then how it is sequenced. Both are methodologies that, in addition to the sequence information, give you information about the physical distance between the two reads in your genome.
For example, you shear up some genomic DNA, and cut a region out at ~500bp. Then you prepare your library, and sequence 35bp from each end of each molecule. Now you have three pieces of information:
--the tag 1 sequence
--the tag 2 sequence
--that they were 500bp ± (some) apart in your genome
This gives you the ability to map to a reference (or denovo for that matter) using that distance information. It helps dramatically to resolve larger structural rearrangements (insertions, deletions, inversions), as well as helping to assemble across repetitive regions.
Structural rearrangements can be deduced when your read pairs map to a reference at a distance that is substantially different from how that library was constructed (~500bp in the above example). Let's say you had two reads that mapped to your reference 1000bp apart...this suggests there has been a deletion between those two sequence reads within your genome. Same thing with an insertion, if your reads mapped 100bp apart on the reference, this suggests that your genome has an insertion.
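The rule above can be sketched in a few lines. This is only an illustration: the function name `classify_pair` and the 100bp tolerance are assumptions, not something taken from any particular aligner, and a real tool would estimate the tolerance from the observed insert size distribution.

```python
def classify_pair(mapped_distance, expected=500, tolerance=100):
    """Interpret the distance between two mapped reads of a pair.

    mapped_distance: distance between the reads on the REFERENCE.
    expected: library fragment size (~500bp in the example above).
    tolerance: assumed allowance for normal fragment size variation.
    """
    if mapped_distance > expected + tolerance:
        # Reads are further apart on the reference than in the sample,
        # so the sample is missing sequence between them.
        return "deletion in sample"
    if mapped_distance < expected - tolerance:
        # Reads are closer on the reference than in the sample,
        # so the sample carries extra sequence between them.
        return "insertion in sample"
    return "concordant"


for distance in (1000, 100, 520):
    print(distance, classify_pair(distance))
```

The labels are always stated relative to your own (sample) genome, since that is usually what you want to know about.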
Mapping over repeats is similar...if one read is unmappable because it falls in a very repetitive region (eg. LINE, LTR, SINE), but the other is unique, you can again use that distance information to map both reads. The first read would likely come from the repeat that is ~500bp away from your unique second read.
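As a rough sketch of that idea: given the position of the uniquely mapped read and a list of candidate positions for the repetitive read, keep the candidate whose distance best fits the library. The function name, the coordinates, and the 100bp tolerance are all hypothetical, and read orientation is ignored here for simplicity.

```python
def place_repeat_read(unique_pos, candidate_positions, expected=500, tolerance=100):
    """Pick the repeat copy whose distance from the unique mate best fits the library.

    Returns the chosen position, or None if no candidate is within tolerance.
    """
    best = None
    best_error = None
    for pos in candidate_positions:
        # How far this candidate's distance deviates from the expected fragment size.
        error = abs(abs(pos - unique_pos) - expected)
        if error <= tolerance and (best_error is None or error < best_error):
            best, best_error = pos, error
    return best


# The repetitive read hits three copies of the repeat; only one is ~500bp
# from the unique mate at position 10000.
print(place_repeat_read(10000, [2300, 10480, 55000]))
```

If no copy of the repeat lies at a plausible distance, the sketch returns None rather than guessing.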
Hope that helps. It's a weird concept at first, but very useful for all types of sequencing. It's been around at some levels since the days of shotgun sequencing.
And lastly, the terminology between "paired end" and "mate pair" is typically that "paired end" refers to sequencing both ends of the same molecule, while "mate pair" (in ABI's case) refers to sequencing only two tags (made by Type IIS restriction enzymes a la SAGE) from the ends of a typically much larger molecule. I could be wrong here though...
-
I hope I understand what you're asking, and that my answers are not too basic...
No, the reads won't be complementary unless you're sequencing very short molecules so that a read from each end simply sequences the other strand. Generally, though, the molecule is longer, so you get the read from one end of the molecule and the read from the other end on the other strand. You don't know what the sequence is in the central section of the molecule because the reads are not long enough to span all the way across the molecule. So basically, you have no way of knowing, just by looking at two sequences, whether they're pairs or not.
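For anyone who wants to generate test pairs from a reference, here is a minimal sketch of that picture: pick a fragment, take one read from its left end, and take the other read from its right end on the opposite strand. It assumes the usual inward-facing paired-end orientation, a made-up random reference, and a normally distributed fragment size; the names and default values are illustrative, and mate-pair libraries would need a different orientation.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def simulate_pair(reference, rng, fragment_len=500, fragment_sd=30, read_len=35):
    """Draw one fragment from the reference and read both of its ends."""
    # Fragment sizes vary around the target; never shorter than one read
    # and never longer than the reference itself.
    size = max(read_len, int(round(rng.gauss(fragment_len, fragment_sd))))
    size = min(size, len(reference))
    start = rng.randrange(0, len(reference) - size + 1)
    fragment = reference[start:start + size]
    read1 = fragment[:read_len]                       # forward strand, left end
    read2 = reverse_complement(fragment[-read_len:])  # other strand, right end
    return start, size, read1, read2


rng = random.Random(0)
reference = "".join(rng.choice("ACGT") for _ in range(5000))
start, size, read1, read2 = simulate_pair(reference, rng)
print(start, size)
print(read1)
print(read2)
```

The two reads share no sequence unless the fragment is shorter than two read lengths, which is why you cannot tell from the sequences alone that they are a pair.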
-
Originally posted by ScottC:
The term 'paired ends' refers to the two ends of the same DNA molecule. So you can sequence one end, then turn it around and sequence the other end. The two sequences you get are 'paired end reads'. Sometimes they're called 'mate pairs' (but with Illumina technology, I think what they call 'mate pair' and 'paired end' methodology is different). Is that what you want to know?
Last edited by biocc; 08-20-2008, 06:54 PM.
-
The term 'paired ends' refers to the two ends of the same DNA molecule. So you can sequence one end, then turn it around and sequence the other end. The two sequences you get are 'paired end reads'. Sometimes they're called 'mate pairs' (but with Illumina technology, I think what they call 'mate pair' and 'paired end' methodology is different). Is that what you want to know?
-
what is a paired-end read?
When I read papers, I find paired-end reads and single-end reads mentioned many times. But what is a paired-end read? It is not very clear to me.
For example:
1 1 119 395 GAAGAGGAGATAAATAAAACTCAAAATACAGCTGAA
1 1 852 893 GTTATTAATATTATTGATGTATTCATCTTTTCTTTT
1 1 814 900 GTTAAAGCATTAAGAAAAGATGTACTTGCAAAATGC
1 1 241 454 GGTGGAAGAGATGTCATTGGAGAAGCCCAAACAGGT
1 1 759 899 GTGTGCTTTTTGAATGAGTAGGTATTGTAATTAGCT
1 1 123 438 GAAAGCCAAACTTTTCATAAAAGCCTTCCTTGCCAT
which are generated by Solexa. Are they paired-end reads?
Thanks