Thanks to everyone who contributes to these basic threads. I know they may get tedious for the old-timers, but they're vital for newer folks. Even if the same info is technically available at reference sites, it's often easier to understand when the answer is to a specific, basic question, rather than organized as vendor documentation or generalized teaching material!
-
This helped, but not enough! I think I get it, but then questions pop up. This mates-in-a-pair, paired-read stuff comes up when I look at the specification for the SAM format output from the bowtie program. There are flag values that are returned to indicate whether the aligned read is one of a pair and/or the first one in a pair and/or the second one in a pair, and so on.
But how does it, meaning bowtie, know? Am I right in thinking that this is specified in the input record? I can see the .fastq input format has a placeholder for this, so I'm guessing that other input formats, e.g. sra, also have it? And if they don't, then there's no way for bowtie (or other algorithms) to derive it?
Seeing as this is all equipment-specific, I'm going to need to look at some videos that describe the front parts of this entire operation. Any ideas where? I was hoping to just pick it up and work with it from the point of view of sequences of characters, an IT perspective, but I guess not.
-
Originally posted by edilana.gomes
Hi!!
Can anyone clarify the difference between mate-pair, paired-end and single-end reads?
Thanks.
Mate-pairs and paired-end reads have been covered. Single end reads are just that... a single read from one end of each sheared DNA fragment.
Scott.
-
Originally posted by raonyguimaraes
Hello,
Suppose I have two reads from an exome in fastq. How do I determine whether they are paired-end, single-end or mate-pair?
Paired-end reads run towards each other and are about 100-500 bp apart. Mate pairs run away from each other and tend to be a few kb apart, but I believe they are sometimes contaminated with ordinary paired-end data.
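To make the "towards each other" versus "away from each other" distinction concrete, here is a minimal Python sketch that classifies one aligned pair. It assumes both reads mapped to the same chromosome and that you already have each read's leftmost position and strand (for example from the POS and FLAG columns of a SAM file). The function name and the labels it returns are made up for illustration.

```python
def pair_orientation(pos1, reverse1, pos2, reverse2):
    """Classify how two mates on the same chromosome point relative to each other.

    pos1/pos2 are leftmost mapped coordinates; reverse1/reverse2 are True
    when that read aligned to the reverse strand. Hypothetical helper.
    """
    # Work out which read sits on the left of the reference.
    if pos1 <= pos2:
        left_reverse, right_reverse = reverse1, reverse2
    else:
        left_reverse, right_reverse = reverse2, reverse1

    if not left_reverse and right_reverse:
        return "inward"       # --> <--  what you expect from paired-end data
    if left_reverse and not right_reverse:
        return "outward"      # <-- -->  what you expect from mate-pair data
    return "same-strand"      # both reads on the same strand


print(pair_orientation(100, False, 400, True))    # inward
print(pair_orientation(5000, False, 100, True))   # outward
```

Looking at one pair proves little. Tallying the orientation and the distance between mates over many pairs is what tells you which kind of library you have.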
-
I have some questions about aligning paired-end sequencing reads. I am using the BWA sampe function to align my paired-end reads and it worked, but surprisingly almost all reads are being paired with reads on different chromosomes, resulting in a lot of "improper reads". I don't understand why BWA did that, and I wonder if it was because I used the command "bwa sampe -a 15000 -A" to force bwa not to run Smith-Waterman alignment for unmapped reads.
Also, if paired end reads share the same x and y coordinates, which are indicated by the first line of their fastq files, why doesn't bwa just pair them up by their coordinates? That seems like the most straightforward way to find the right pair to me.
-
Hi all, I am still struggling with using BWA to align my paired-end reads. I used the command:
bwa sampe -P -s hg19.fasta CATTCG_1.sai CATTCG_3.sai CATTCG_1.fastq CATTCG_3.fastq > CATTCG_PE.sam
and the first few lines of the program running look like this:
[bwa_sai2sam_pe_core] convert to sequence coordinate...
[infer_isize] fail to infer insert size: too few good pairs
[bwa_sai2sam_pe_core] time elapses: 10.96 sec
[bwa_sai2sam_pe_core] changing coordinates of 6 alignments.
[bwa_sai2sam_pe_core] align unmapped mate...
[bwa_sai2sam_pe_core] time elapses: 0.00 sec
[bwa_sai2sam_pe_core] refine gapped alignments... 0.82 sec
[bwa_sai2sam_pe_core] print alignments... 1.99 sec
[bwa_sai2sam_pe_core] 262144 sequences have been processed.
[bwa_sai2sam_pe_core] convert to sequence coordinate...
[infer_isize] (25, 50, 75) percentile: (3520, 39961, 70863)
[infer_isize] low and high boundaries: 94 and 205549 for estimating avg and std
[infer_isize] inferred external isize from 27 pairs: 37726.370 +/- 35311.353
[infer_isize] skewness: 0.341; kurtosis: -1.395; ap_prior: 1.00e-05
[infer_isize] inferred maximum insert size: 251007 (6.04 sigma)
[bwa_sai2sam_pe_core] time elapses: 10.87 sec
[bwa_sai2sam_pe_core] changing coordinates of 178 alignments.
[bwa_sai2sam_pe_core] align unmapped mate...
[bwa_sai2sam_pe_core] time elapses: 0.00 sec
[bwa_sai2sam_pe_core] refine gapped alignments... 0.82 sec
[bwa_sai2sam_pe_core] print alignments... 1.97 sec
[bwa_sai2sam_pe_core] 524288 sequences have been processed.
DJB775P1_0215:5:1101:1262:2347#0 65 chr13 92966174 37 94M chr5 33346129 0 AATAACCACCTAGATAAATGTTCACTCATCTCGCCTGTCTAGCCTGTCTTGAGGCCGGTTTCATCATGAGTCACTCCACCAATTACTTCAAAAC cgggeghhhhfhhfffgffegbcfgffdhfffhhhdb^efgfddfhhhhffS\eefgg\W`c]RZ^__GU]UMMZ_Z]\_a^_TR_bbbbYYYW XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:94
DJB775P1_0215:5:1101:1495:2155#0/3 129 chr5 33346129 37 94M chr13 92966174 0 AATTAACTTCCTTTTTTTGTCTTCATATAACACTGTTGACCTACTCATATTGAGCCCTCAGTCTTTTTTGTACACATGCTCATCCCTGGCATGT ceggggiiiiiiiiiiiiihiiiiihiiihihifgffhiig`fgfghhfhghffhhihigfgfgeeceacabbcccb`b_bcccccc_X[`b^Y XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:94
DJB775P1_0215:5:1101:1465:2351#0 81 chr14 44601925 37 94M chr5 33346129 0 TCTCCTACCTCCTCTCCCTTATAGAAATCCCTGTGATTCCATTAGTCTCACCTGGATAAACCAGAGTATTCTTATTATCCCAAGATCCCCATCT XTR^YGG\^ZZZRbb]]ccc_db]d^dgfcb`bZZe_bSfc`fgf_fc^Iffhhhhfhfagddhfhhhhfffgeefddfedbbd_agfcgebe\ XT:A:U NM:i:1 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:38A55
DJB775P1_0215:5:1101:1483:2169#0/3 161 chr5 33346129 37 94M chr14 44601925 0 AATTAACTTCCTTTTTTTGTCTTCATATAACACTGTTGACCTACTCATATTGAGCCCTCAGTCTTTTTTGTACACATGCTCATCCCTGGCATGT ceYbgae_egihffhhd_^efhihfhhfXcghfcgacgfI^[cba\eecgZ_HWWHLaZ`VVVb`gacccZb_RZ]`]bbcbbbb^bc^`X][S XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:94
DJB775P1_0215:5:1101:1317:2430#0 81 chr9 8239023 37 94M chr12 51288201 0 TTAAGTATTAAATGACATAAAACCTATAAAGCACATAGCAGGTAAATGTGGTAAACTCTTGATAAATGTTATTGTTATCATCATCATCATCACT b`]a^VHRcaggggbgeghgc\bhgefbZ\MW[gfgce^[gfff^^aa^OXeaYIae[hhhhgf^[d[hgfge_hhhgd_hY^bfdbcegecZc XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:94
DJB775P1_0215:5:1101:1461:2198#0/3 161 chr12 51288201 37 94M chr9 8239023 0 CTGTTGGCTGGAATGTAAAATGGTGCAGCTGCTGTGGAAAACTGCATGGCAGTTCCTAGAAAAATTAAAAATAGAATTACCATATGATCCAGCA egggggefdf`egg[bdgh`]gh^dbe`dfhhbffbgIIX^^e_fgffhabgH\\_\HM\d]dUGV\\VV_ZVVHHUZ__bbc]`BBBBBBBBB XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0
Can someone tell me why my reads are paired so weirdly? Any help is greatly appreciated!
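For reference, the FLAG values in the records above (65, 129, 81, 161) can be decoded bit by bit. This is a minimal Python sketch, assuming the bit meanings given in the SAM specification. The label strings are illustrative names rather than official ones.

```python
# Bit values follow the SAM specification; the label strings are illustrative.
SAM_FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def decode_flag(flag):
    """Return the labels of every bit that is set in a SAM FLAG value."""
    return [label for bit, label in SAM_FLAG_BITS if flag & bit]


for flag in (65, 129, 81, 161):
    print(flag, decode_flag(flag))
# 65  ['paired', 'first_in_pair']
# 129 ['paired', 'second_in_pair']
# 81  ['paired', 'reverse', 'first_in_pair']
# 161 ['paired', 'mate_reverse', 'second_in_pair']
```

None of these four values has the proper_pair bit (0x2) set, which matches each read's mate being reported on a different chromosome.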
-
If the reads have wildly different names, they aren't supposed to be paired with each other.
bwa assumes that the first read of the first fq goes with the first read of the second fq, and so on. That doesn't appear to be the case here, which is why your "pairs" are all over the place.
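A quick way to test whether two FASTQ files really are in the same order is to walk them in lockstep and compare read names. The sketch below is a rough Python version, assuming plain 4-line FASTQ records (no wrapped sequence lines) and mate suffixes of the form /1, /2 or /3 as seen in this thread. The function names are made up for illustration.

```python
import re
from itertools import zip_longest


def read_names(fastq_path):
    """Yield the read name from each header line (every 4th line) of a FASTQ file."""
    with open(fastq_path) as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 0:
                # Drop the leading '@' and anything after the first whitespace.
                yield line[1:].split()[0]


def base_name(name):
    """Strip a trailing mate suffix such as /1, /2 or /3."""
    return re.sub(r"/\d$", "", name)


def first_mismatch(names1, names2):
    """Return (index, name1, name2) for the first out-of-sync pair, or None."""
    for index, (a, b) in enumerate(zip_longest(names1, names2)):
        if a is None or b is None or base_name(a) != base_name(b):
            return index, a, b
    return None


# Example with the first pair of names from the SAM output in this thread:
print(first_mismatch(
    ["DJB775P1_0215:5:1101:1262:2347#0"],
    ["DJB775P1_0215:5:1101:1495:2155#0/3"],
))
```

On real files you would call `first_mismatch(read_names("CATTCG_1.fastq"), read_names("CATTCG_3.fastq"))`. A result other than `None` means the files are not in the same order, or one has more reads than the other.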
-
Hey,
I just wanna ask about paired-end data filtering. Do I need to filter read 1 and read 2 separately, or combine read 1 and read 2 and then filter? Later on I want to use the filtered data for RNA-seq analysis with TopHat and Cufflinks, but TopHat requires read 1 and read 2 as separate inputs, not as one combined paired-end file.
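Since aligners match mates by their position in the two files (as noted earlier in the thread), one common approach is to filter the two files together and drop a pair whenever either mate fails. That keeps read 1 and read 2 in sync. Below is a minimal Python sketch of that idea. It assumes each record is a (name, sequence, quality) tuple and uses a simple length cutoff as a stand-in for whatever filter you actually want. All names are hypothetical.

```python
def filter_pairs(records1, records2, keep):
    """Yield (mate1, mate2) only when BOTH mates pass keep().

    Dropping whole pairs keeps the two output files in the same order.
    """
    for mate1, mate2 in zip(records1, records2):
        if keep(mate1) and keep(mate2):
            yield mate1, mate2


def long_enough(record, min_len=50):
    """Placeholder filter: keep reads with at least min_len bases."""
    name, sequence, quality = record
    return len(sequence) >= min_len


reads1 = [("a/1", "A" * 94, "h" * 94), ("b/1", "C" * 20, "h" * 20)]
reads2 = [("a/2", "G" * 94, "h" * 94), ("b/2", "T" * 94, "h" * 94)]

kept = list(filter_pairs(reads1, reads2, long_enough))
print([(m1[0], m2[0]) for m1, m2 in kept])  # [('a/1', 'a/2')]
```

The surviving mates would then be written to two separate files, one for read 1 and one for read 2, in the same order.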