I can't seem to find an answer to this simple question, although this must be a fairly common issue. I have Illumina paired-end reads from an RNA-Seq run, and have run a script to filter adapter dimers. This operation has resulted in my having different numbers of reads in my _R1 and _R2 files. Does Bowtie identify mate pairs in these files simply using their order, or does it use the read IDs? If it goes in order, then Bowtie will fail to match the correct pairs from my filtered files. If it uses the IDs, then Bowtie should be ok, unless it crashes when a read doesn't have a mate. Does anyone know how this works? Should I just run this data as single-end to avoid these issues?
-
Dear Volklor,
Paired inputs for Bowtie2:
Pairs are often stored in a pair of files, one file containing the mate 1s and the other containing the mates 2s. The first mate in the file for mate 1 forms a pair with the first mate in the file for mate 2, the second with the second, and so on. When aligning pairs with Bowtie 2, specify the file with the mate 1s mates using the -1 argument and the file with the mate 2s using the -2 argument. This causes Bowtie 2 to take the paired nature of the reads into account when aligning them.
(http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml)
Please write a simple perl script to order your reads.
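A minimal sketch of such a script, written in Python rather than Perl. It matches mates by read ID instead of by position, writes the matched pairs out in the same order, and puts reads that have lost their mate in a separate file. The function name `repair_pairs` and the file arguments are made up for illustration. It assumes plain 4-line FASTQ records, unique read IDs, mate IDs that differ only by a trailing `/1` or `/2` or by the text after the first space, and an R2 file small enough to hold in memory.

```python
def read_fastq(path):
    """Yield (read_id, record_text) for each 4-line FASTQ record."""
    with open(path) as handle:
        while True:
            record = [handle.readline() for _ in range(4)]
            if not record[0].strip():
                break  # end of file
            # "@ID/1" or "@ID 1:N:0:..." -> "ID" (assumed naming schemes)
            read_id = record[0][1:].split()[0]
            if read_id.endswith(("/1", "/2")):
                read_id = read_id[:-2]
            text = "".join(line.rstrip("\n") + "\n" for line in record)
            yield read_id, text


def repair_pairs(r1_in, r2_in, r1_out, r2_out, singles_out):
    """Write mates that are present in both inputs, in matching order.

    Reads without a mate go to singles_out. Returns the number of pairs kept.
    """
    # Assumption: the whole R2 file fits in memory.
    mates = dict(read_fastq(r2_in))
    kept = 0
    with open(r1_out, "w") as out1, open(r2_out, "w") as out2, \
            open(singles_out, "w") as single:
        for read_id, record in read_fastq(r1_in):
            mate = mates.pop(read_id, None)
            if mate is None:
                single.write(record)   # R1 read whose mate was filtered out
            else:
                out1.write(record)
                out2.write(mate)
                kept += 1
        for record in mates.values():
            single.write(record)       # R2 reads whose mate was filtered out
    return kept
```

The two paired output files then have the same number of reads in the same order, which is what the positional pairing described in the manual excerpt above needs. The orphaned reads could still be aligned separately as single-end data.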
Best wishes,
Rahul

Rahul Sharma,
Ph.D
Frankfurt am Main, Germany
-
Thanks for your reply, Rahul. I assume that Bowtie 1 (the version I'm using) works the same way as Bowtie 2. In my case, it is not that read order is the only issue; it's that certain reads don't have mates because they've been filtered out. I think my best bet will be to run this data as single-end.
-
Hi Volklor,
If you still have the original files somewhere, it might be worth running a trimming program that is aware of paired-end reads, such as Trimmomatic. We have also written a wrapper around Cutadapt that can do this (Trim Galore), even though it was originally written for a different purpose. It would be a shame to let the paired-end information go to waste, wouldn't it?
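To illustrate what "aware of paired-ends" means in practice, here is a rough Python sketch. It is not how Trimmomatic or Trim Galore are implemented. It reads both files in lockstep and drops a pair as a whole whenever either mate fails the filter, so the two outputs stay in sync. The minimum-length test merely stands in for a real adapter-dimer check, and `filter_pairs` and `min_len` are hypothetical names. It assumes plain 4-line FASTQ input that is still in matching order.

```python
from itertools import zip_longest


def records(path):
    """Yield each 4-line FASTQ record as a list of lines."""
    with open(path) as handle:
        while True:
            rec = [handle.readline() for _ in range(4)]
            if not rec[0].strip():
                return  # end of file
            yield rec


def filter_pairs(r1_in, r2_in, r1_out, r2_out, min_len=20):
    """Keep a pair only if BOTH mates have at least min_len bases.

    Returns the number of pairs written.
    """
    kept = 0
    with open(r1_out, "w") as out1, open(r2_out, "w") as out2:
        for rec1, rec2 in zip_longest(records(r1_in), records(r2_in)):
            if rec1 is None or rec2 is None:
                raise ValueError("input files have different numbers of reads")
            # Placeholder filter: sequence length of each mate.
            if len(rec1[1].strip()) >= min_len and len(rec2[1].strip()) >= min_len:
                out1.writelines(rec1)
                out2.writelines(rec2)
                kept += 1
    return kept
```

Because a failing read takes its mate with it, the filtered files always contain the same number of reads, and the problem from the original question does not arise.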