Thanks to everyone who contributes to these basic threads. I know they may get tedious for the old-timers, but they're vital for newer folks. Even if the same info is technically available at reference sites, it's often easier to understand when the answer is to a specific, basic question, rather than organized as vendor documentation or generalized teaching material!
-
This helped, but not enough! I think I get it, but then questions pop up. This mates-in-a-pair, paired-read stuff comes up when I look at the specification for the SAM format output from the bowtie program. There are flag values returned to indicate whether the aligned read is one of a pair, and/or the first one in a pair, and/or the second one in a pair, and so on.
But how does it, meaning bowtie, know? Am I right in thinking that this is specified in the input record? I can see the .fastq input format has a placeholder for this, so I'm guessing that other input formats, e.g. SRA, also have it? And if they don't, then there's no way for bowtie (or other algorithms) to derive it?
Seeing as this is all equipment-specific, I'm going to need to look at some videos that describe the front parts of this entire operation. Any ideas where? I was hoping to just pick it up and work with it from the point of view of sequences of characters (an IT perspective), but I guess not.
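For anyone coming at this from the IT side: the SAM flag values mentioned above are a bit field, so a single integer encodes several yes/no answers at once. Here is a minimal sketch of decoding one, assuming the bit meanings listed in the SAM specification. The names `SAM_FLAGS` and `decode_flag` are made up for illustration and are not part of bowtie or any library.

```python
# Bit values and meanings as listed for the FLAG field in the SAM specification.
# SAM_FLAGS and decode_flag are illustrative names, not part of any aligner.
SAM_FLAGS = {
    0x1: "read is paired",
    0x2: "mapped in a proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read on reverse strand",
    0x20: "mate on reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
}


def decode_flag(flag):
    """Return the descriptions of every bit that is set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


if __name__ == "__main__":
    # 99 = 1 + 2 + 32 + 64
    for description in decode_flag(99):
        print(description)
```

A flag of 0 decodes to an empty list, which is what an unpaired, forward-strand, mapped read would carry.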
-
Originally posted by edilana.gomes
Hi!!
Can anyone clarify the difference between mate-pair, paired-end and single-end reads?
Thanks.
Mate-pairs and paired-end reads have been covered. Single end reads are just that... a single read from one end of each sheared DNA fragment.
Scott.
-
Originally posted by raonyguimaraes
Hello,
Suppose I have two reads from an exome in FASTQ. How can I determine whether they are paired-end, single-end or mate-pair?
-A
-
Originally posted by raonyguimaraes
Hello,
Suppose I have two reads from an exome in FASTQ. How can I determine whether they are paired-end, single-end or mate-pair?
Paired ends run towards each other, and are about 100-500 bp apart. Mate pairs run away from each other, and tend to be a few kb apart, but I believe they are sometimes contaminated with ordinary paired-end data.
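The rule of thumb above can be turned into a small check once both reads are aligned to the same chromosome. This is only a sketch: `classify_pair` is a hypothetical helper, and it assumes you already have each read's leftmost mapping position and strand (for example from the POS and FLAG columns of a SAM file).

```python
def classify_pair(pos1, reverse1, pos2, reverse2):
    """Describe how two reads on the same chromosome sit relative to each other.

    pos1/pos2 are leftmost mapping positions; reverse1/reverse2 are True when
    the read aligned to the reverse strand. Returns (orientation, distance).
    """
    # Order the two reads by position so "left" and "right" are well defined.
    (left_pos, left_rev), (right_pos, right_rev) = sorted(
        [(pos1, reverse1), (pos2, reverse2)]
    )
    distance = right_pos - left_pos
    if not left_rev and right_rev:
        orientation = "inward"    # --> <--  reads run towards each other
    elif left_rev and not right_rev:
        orientation = "outward"   # <-- -->  reads run away from each other
    else:
        orientation = "same-strand"
    return orientation, distance


if __name__ == "__main__":
    print(classify_pair(1000, False, 1300, True))
    print(classify_pair(5000, False, 1000, True))
```

Looking at the orientation and distance over many pairs, rather than a single one, is what makes the call reliable, since individual pairs can be chimeric or mismapped.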
-
I have some questions about aligning paired-end sequencing reads. I am using the BWA sampe function to align my paired-end reads and it worked, but surprisingly almost all reads are being paired with reads on different chromosomes, resulting in a lot of "improper pairs". I don't understand why BWA did that, and I wonder if it was because I used the command "bwa sampe -a 15000 -A" to force bwa not to run Smith-Waterman alignment for unmapped reads.
Also, if paired-end reads share the same x and y coordinates, which are indicated in the first line of each record in their FASTQ files, why doesn't bwa just pair them up by their coordinates? That seems like the most straightforward way to find the right pair to me.
-
Hi all, I am still struggling with using BWA to align my paired-end reads. I used the command:
bwa sampe -P -s hg19.fasta CATTCG_1.sai CATTCG_3.sai CATTCG_1.fastq CATTCG_3.fastq > CATTCG_PE.sam
and the first few lines of the program running look like this:
[bwa_sai2sam_pe_core] convert to sequence coordinate...
[infer_isize] fail to infer insert size: too few good pairs
[bwa_sai2sam_pe_core] time elapses: 10.96 sec
[bwa_sai2sam_pe_core] changing coordinates of 6 alignments.
[bwa_sai2sam_pe_core] align unmapped mate...
[bwa_sai2sam_pe_core] time elapses: 0.00 sec
[bwa_sai2sam_pe_core] refine gapped alignments... 0.82 sec
[bwa_sai2sam_pe_core] print alignments... 1.99 sec
[bwa_sai2sam_pe_core] 262144 sequences have been processed.
[bwa_sai2sam_pe_core] convert to sequence coordinate...
[infer_isize] (25, 50, 75) percentile: (3520, 39961, 70863)
[infer_isize] low and high boundaries: 94 and 205549 for estimating avg and std
[infer_isize] inferred external isize from 27 pairs: 37726.370 +/- 35311.353
[infer_isize] skewness: 0.341; kurtosis: -1.395; ap_prior: 1.00e-05
[infer_isize] inferred maximum insert size: 251007 (6.04 sigma)
[bwa_sai2sam_pe_core] time elapses: 10.87 sec
[bwa_sai2sam_pe_core] changing coordinates of 178 alignments.
[bwa_sai2sam_pe_core] align unmapped mate...
[bwa_sai2sam_pe_core] time elapses: 0.00 sec
[bwa_sai2sam_pe_core] refine gapped alignments... 0.82 sec
[bwa_sai2sam_pe_core] print alignments... 1.97 sec
[bwa_sai2sam_pe_core] 524288 sequences have been processed.
DJB775P1_0215:5:1101:1262:2347#0 65 chr13 92966174 37 94M chr5 33346129 0 AATAACCACCTAGATAAATGTTCACTCATCTCGCCTGTCTAGCCTGTCTTGAGGCCGGTTTCATCATGAGTCACTCCACCAATTACTTCAAAAC cgggeghhhhfhhfffgffegbcfgffdhfffhhhdb^efgfddfhhhhffS\eefgg\W`c]RZ^__GU]UMMZ_Z]\_a^_TR_bbbbYYYW XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:94
DJB775P1_0215:5:1101:1495:2155#0/3 129 chr5 33346129 37 94M chr13 92966174 0 AATTAACTTCCTTTTTTTGTCTTCATATAACACTGTTGACCTACTCATATTGAGCCCTCAGTCTTTTTTGTACACATGCTCATCCCTGGCATGT ceggggiiiiiiiiiiiiihiiiiihiiihihifgffhiig`fgfghhfhghffhhihigfgfgeeceacabbcccb`b_bcccccc_X[`b^Y XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:94
DJB775P1_0215:5:1101:1465:2351#0 81 chr14 44601925 37 94M chr5 33346129 0 TCTCCTACCTCCTCTCCCTTATAGAAATCCCTGTGATTCCATTAGTCTCACCTGGATAAACCAGAGTATTCTTATTATCCCAAGATCCCCATCT XTR^YGG\^ZZZRbb]]ccc_db]d^dgfcb`bZZe_bSfc`fgf_fc^Iffhhhhfhfagddhfhhhhfffgeefddfedbbd_agfcgebe\ XT:A:U NM:i:1 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:38A55
DJB775P1_0215:5:1101:1483:2169#0/3 161 chr5 33346129 37 94M chr14 44601925 0 AATTAACTTCCTTTTTTTGTCTTCATATAACACTGTTGACCTACTCATATTGAGCCCTCAGTCTTTTTTGTACACATGCTCATCCCTGGCATGT ceYbgae_egihffhhd_^efhihfhhfXcghfcgacgfI^[cba\eecgZ_HWWHLaZ`VVVb`gacccZb_RZ]`]bbcbbbb^bc^`X][S XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:94
DJB775P1_0215:5:1101:1317:2430#0 81 chr9 8239023 37 94M chr12 51288201 0 TTAAGTATTAAATGACATAAAACCTATAAAGCACATAGCAGGTAAATGTGGTAAACTCTTGATAAATGTTATTGTTATCATCATCATCATCACT b`]a^VHRcaggggbgeghgc\bhgefbZ\MW[gfgce^[gfff^^aa^OXeaYIae[hhhhgf^[d[hgfge_hhhgd_hY^bfdbcegecZc XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:94
DJB775P1_0215:5:1101:1461:2198#0/3 161 chr12 51288201 37 94M chr9 8239023 0 CTGTTGGCTGGAATGTAAAATGGTGCAGCTGCTGTGGAAAACTGCATGGCAGTTCCTAGAAAAATTAAAAATAGAATTACCATATGATCCAGCA egggggefdf`egg[bdgh`]gh^dbe`dfhhbffbgIIX^^e_fgffhabgH\\_\HM\d]dUGV\\VV_ZVVHHUZ__bbc]`BBBBBBBBB XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0
Can someone tell me why my reads are paired so weirdly? Any help is greatly appreciated!
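The [infer_isize] lines in the log above can be sanity-checked by hand. The reported boundaries (94 and 205549) are consistent with a simple interquartile-range outlier rule applied to the printed percentiles. The sketch below is a reconstruction from the printed numbers only, not a statement of what bwa's code does: the multiplier `k=2.0` and the clamping of the lower bound to the read length (94, taken from the 94M CIGAR strings) are assumptions that happen to reproduce the log, and `outlier_bounds` is a made-up name.

```python
def outlier_bounds(p25, p75, read_length, k=2.0):
    """Interquartile-range bounds for insert sizes.

    k and the clamping of the lower bound to read_length are assumptions
    chosen because they reproduce the numbers printed in the log.
    """
    iqr = p75 - p25
    low = max(int(p25 - k * iqr), read_length)
    high = int(p75 + k * iqr)
    return low, high


if __name__ == "__main__":
    # 25th and 75th percentiles as printed by [infer_isize]; 94 bp reads.
    print(outlier_bounds(3520, 70863, 94))
```

The more telling number is the median itself: about 40 kb, estimated from only 27 pairs out of 262144 sequences, which is a hint that something is off with the pairing rather than with the library.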
-
If the reads have wildly different names, they aren't supposed to be paired with each other.
bwa assumes that the first read of the first fq goes with the first read of the second fq, and so on. That doesn't appear to be the case here, which is why your "pairs" are all over the place.
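A quick way to test this before aligning is to walk both FASTQ files in step and compare the read names. Here is a rough sketch, assuming plain 4-line FASTQ records and that mates differ only by a trailing "/digit" suffix, as in the names shown above. The helpers `fastq_names`, `base_name` and `first_mismatch` are invented for this example.

```python
import re
from itertools import islice


def fastq_names(handle):
    """Yield the header line of each record, assuming 4 lines per record."""
    return (line.strip() for line in islice(handle, 0, None, 4))


def base_name(read_name):
    """Drop the leading '@', any comment after whitespace, and a '/digit' suffix."""
    fields = read_name.lstrip("@").split()
    name = fields[0] if fields else ""
    return re.sub(r"/\d$", "", name)


def first_mismatch(names1, names2):
    """Return the index of the first pair of names that disagree, else None.

    Only compares up to the shorter input; differing lengths need a separate check.
    """
    for index, (a, b) in enumerate(zip(names1, names2)):
        if base_name(a) != base_name(b):
            return index
    return None


if __name__ == "__main__":
    # The first two read names from the SAM output earlier in the thread.
    print(first_mismatch(
        ["DJB775P1_0215:5:1101:1262:2347#0"],
        ["DJB775P1_0215:5:1101:1495:2155#0/3"],
    ))
```

With real files you would pass `fastq_names(open(path))` for each of the two inputs. A result of `None` means every name matched position for position.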
-
Hey,
I just wanna ask about paired-end data filtering. Do I need to filter read 1 and read 2 separately, or combine read 1 and read 2 and then filter? Later on I want to use the filtered data for RNA-seq analysis with TopHat and Cufflinks, but TopHat requires read 1 and read 2 as separate inputs rather than as a single combined paired-end file.
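One common way to approach this, sketched under assumptions: filter at the level of the pair, so that a pair is kept only when both mates pass and the read 1 and read 2 files stay in the same order. The mean-quality criterion, the threshold of 20, the Phred offset of 33 and the `(name, seq, qual)` tuple layout are all placeholders for illustration, and `mean_quality` and `filter_pairs` are hypothetical names.

```python
def mean_quality(qual, offset=33):
    """Mean Phred score of a quality string; offset depends on the encoding."""
    if not qual:
        return 0.0
    return sum(ord(char) - offset for char in qual) / len(qual)


def filter_pairs(records1, records2, min_mean_q=20, offset=33):
    """Yield (read1, read2) only when both mates pass the quality threshold.

    records1 and records2 are iterables of (name, seq, qual) tuples in the
    same order, so dropping a pair removes it from both outputs at once.
    """
    for read1, read2 in zip(records1, records2):
        if (mean_quality(read1[2], offset) >= min_mean_q
                and mean_quality(read2[2], offset) >= min_mean_q):
            yield read1, read2


if __name__ == "__main__":
    reads1 = [("a/1", "ACGT", "IIII"), ("b/1", "ACGT", "IIII")]
    reads2 = [("a/2", "TTTT", "IIII"), ("b/2", "TTTT", "!!!!")]
    for read1, read2 in filter_pairs(reads1, reads2):
        print(read1[0], read2[0])
```

Writing the surviving `read1` and `read2` records to two separate output files keeps the two inputs in sync. Filtering each file independently risks dropping a read from one file but not its mate from the other, which shifts every later pair.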