-
I am also about to analyze AB SOLiD data with TopHat and then Cufflinks, and I hope to find a successful example here.
-
Hi clariet,
Actually, I've never considered this column as a filtering criterion, so I can't give you a definite answer. But I think it would be good to filter out those low-quality mappings. Perhaps someone could recommend a threshold?
-
Thank you very much for the reply. I was referring to the 5th column. It IS the mapping quality according to the SAM manual. But I guess this mapping quality must be correlated with the alignment score in some way.
I have seen a lot of alignments with low mapping quality (0 or 1). For the Cufflinks input, do you usually filter out these low-quality reads? What cutoff do you usually use?
Thanks
Originally posted by Haneko:
Hi there,
That is actually not the score, but the mapping quality (I'm assuming you're referring to column 5 for the 255 "score"). To calculate the score, you will have to take the alignment length in colorspace (XL:Z) and the number of colorspace mismatches (XU:Z), then use SOLiD's formula.
A score of 10 shouldn't be appearing in the output. A 25bp seed mapping with at most 2 mismatches gives you the lowest possible score for an alignment to be reported, which in my case is 18.
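For anyone who does decide to filter on column 5, here is a minimal sketch of what that could look like. It is only an illustration: the function names and the `min_mapq` and `keep_unavailable` parameters are made up for this example, nobody in the thread has recommended a specific cutoff, and it assumes plain tab-delimited SAM text. Records with MAPQ 255 are handled separately because the SAM manual defines 255 as "mapping quality not available".

```python
UNAVAILABLE_MAPQ = 255  # SAM manual: 255 means the mapping quality is not available


def keep_record(sam_line, min_mapq, keep_unavailable=True):
    """Decide whether one SAM line passes the (hypothetical) MAPQ cutoff."""
    if sam_line.startswith("@"):
        return True  # header lines are always kept
    mapq = int(sam_line.split("\t")[4])  # column 5 is MAPQ
    if mapq == UNAVAILABLE_MAPQ:
        return keep_unavailable  # a choice, not a rule: 255 carries no quality information
    return mapq >= min_mapq


def filter_sam(lines, min_mapq, keep_unavailable=True):
    """Return only the SAM lines that pass keep_record, in their original order."""
    return [line for line in lines if keep_record(line, min_mapq, keep_unavailable)]
```

Whether to keep the 255 records is a judgment call. With `keep_unavailable=True` they pass through regardless of the cutoff.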
-
Originally posted by Xi Wang:
According to the SAM manual, column 5 is for mapping quality. I just refer to this column as mapping quality, and I think the two, "alignment score" and "mapping quality", are the same.
-
Sorry, I made it a bit confusing.
Yes, the alignment score I'm referring to is self-calculated. It is not the mapping quality in column 5.
-
Originally posted by Haneko:
Hmm, I guess it depends on how you see it. I noted that the column gives a value of 255 for spliced reads, so it's not really helpful when it comes to spliced alignments. And we've always been dependent on the alignment score (using alignment length and mismatches) since WTAP1.2, so I tend to favor that.
But I really want to point out that 255 means the mapping quality is not available (see the SAM manual: http://samtools.sourceforge.net/SAM1.pdf).
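To check in your own file whether 255 really shows up only for spliced reads, something like the sketch below could tally the column 5 values separately for spliced and unspliced records. The function name is invented for illustration, and it assumes tab-delimited SAM text in which an `N` operation in the CIGAR string marks a spliced alignment.

```python
from collections import Counter


def mapq_by_splice_status(sam_lines):
    """Count MAPQ values (column 5) separately for spliced and unspliced records."""
    counts = {"spliced": Counter(), "unspliced": Counter()}
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.split("\t")
        mapq, cigar = int(fields[4]), fields[5]
        # An N in the CIGAR string is a skipped reference region, i.e. an intron gap.
        key = "spliced" if "N" in cigar else "unspliced"
        counts[key][mapq] += 1
    return counts
```

If `counts["unspliced"][255]` comes back as zero while `counts["spliced"]` contains only 255, the observation above holds for that file.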
-
Hmm, I guess it depends on how you see it. I noted that the column gives a value of 255 for spliced reads, so it's not really helpful when it comes to spliced alignments. And we've always been dependent on the alignment score (using alignment length and mismatches) since WTAP1.2, so I tend to favor that.
-
I'm not sure about that. We've never really looked into mapping quality. The score of 10 I was referring to was actually the alignment score.
-
Hi there,
That is actually not the score, but the mapping quality (I'm assuming you're referring to column 5 for the 255 "score"). To calculate the score, you will have to take the alignment length in colorspace (XL:Z) and the number of colorspace mismatches (XU:Z), then use SOLiD's formula.
A score of 10 shouldn't be appearing in the output. A 25bp seed mapping with at most 2 mismatches gives you the lowest possible score for an alignment to be reported, which in my case is 18.
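As a starting point for a self-calculated score, the XL:Z and XU:Z values first have to be pulled out of each record. The sketch below does only that part. The function name is hypothetical, and pairing the comma-separated XL and XU values position by position (for example `XL:Z:39,39` with `XU:Z:3,1`) is an assumption about how Bioscope lays them out. SOLiD's scoring formula itself is not given in this thread, so it is deliberately left out.

```python
def colorspace_lengths_and_mismatches(sam_line):
    """Return (colorspace length, colorspace mismatches) pairs from XL:Z and XU:Z."""
    # split() accepts tabs as well as the spaces seen in pasted forum records
    fields = sam_line.split()
    tags = {}
    for field in fields[11:]:  # optional fields start at column 12
        name, _type, value = field.split(":", 2)  # maxsplit keeps ':' inside values
        tags[name] = value
    lengths = [int(x) for x in tags["XL"].split(",")]
    mismatches = [int(x) for x in tags["XU"].split(",")]
    # Assumption: the i-th XL value belongs with the i-th XU value.
    return list(zip(lengths, mismatches))
```

Each returned pair can then be passed to whatever scoring function you use.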
-
From the lines below, the score of these alignments is the same for all of them: 255. But in my Bioscope output, most of the alignments have a score below 10. Should I filter out these alignments?
Originally posted by Haneko:
I'm getting the following using your code:
1206_912_423 16 chrX 148852770 255 10H10M101N30M * 0 0 CTCCCGTAGCCTTGATGGTCTGCTGCTTCCGTCTGTCACT ,GA%%:IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII CS:Z:T32112112213020231231221013210203231320221310310031 XJ:Z:K CQ:Z:<<::9@9=:?==;:=>>>=:>9>695;;773:885&%*80,/&7&())6( XL:Z:39,39 XU:Z:3,1 IH:i:2 HI:i:2 MD:Z:40 XS:A:-
922_1240_1515 16 chrX 119563391 255 10H10M1029N30M * 0 0 TGATCATGATCATTTGTCTGCAATGGTTTTGCCAGCATCT "C?H?'';?&&A?"""IIIIIIIIIIIIIIIIIIIIIIII CS:Z:T32231321031000101301312213103133211123213222112001 XJ:Z:K CQ:Z::>>:>:?<>;==::<9=;>>9><:&4,6&.2*',45+9()50)'&*&2 XL:Z:39 XU:Z:4 IH:i:1 HI:i:1 MD:Z:40 XS:A:-
1297_662_654 0 chrX 153279920 255 10H10M102N26M4H * 0 0 CTTCGGTGTGCCACTGAAGATCCTGGTGTCGCCATG 1IIEIIIIC?III&&III&&4?I:;BDI=+.;I=3% CS:Z:T20331231203202301111301111202132021011123301313032 XJ:Z:K CQ:Z:@@96564=5/919428;7>&:78=&:585&+*66%7,98&&)38&.%8,+ XL:Z:30,35 XU:Z:2,2 IH:i:2 HI:i:2 MD:Z:36 XS:A:+
1289_854_1683 16 chrX 153666617 255 10H10M1046N30M * 0 0 TGCCACTCGCCATTCCTGCAGCTCAGGGGAAGGGATCAAT '<A;5<IB9;@IDH((IIHIIIIGGIIIIIIIIIIIIIII CS:Z:T33012320020200021223213122203103322110313223332222 XJ:Z:K CQ:Z:AA;A>;9>>?;?6:3:.:;4872:7(=,98)3'<7&0,6')1'5/1.)4/ XL:Z:39 XU:Z:1 IH:i:1 HI:i:1 MD:Z:40 XS:A:-
1409_132_757 16 chrX 153666617 255 10H10M1046N30M * 0 0 TGCCACTCTACATTCCTGCAGCTCAGGGGAAGGGATCAAT "9:<?;G%###"IF##GFIIIAGIECIIIIIIIIIIIIII CS:Z:T33012320020200021223213121203213222110311113332022 XJ:Z:K CQ:Z:?><@;9<>?8:>5<31553/<7526#619&#/%71+5(3'$&%:4&&-44 XL:Z:39 XU:Z:4 IH:i:1 HI:i:1 MD:Z:8TA30 XS:A:-
1125_1188_1449 16 chrX 53458535 255 10H10M110N30M * 0 0 GAAGAACCTCCTACAATGACACGGGCAAAGGTACGGTCCT &-<I<?##E@)/<>"""/:?IIIIIIIIIIIIIIIIIIII CS:Z:T32021031310200130031112113113102231022021112101031 XJ:Z:K CQ:Z:;=:<=?A:?>@<=8=5:==<.2)'/*7(5/)8.#:&75(&*6#)9$8$#8 XL:Z:39 XU:Z:4 IH:i:1 HI:i:1 MD:Z:40 XS:A:-
.
.
.
-
My mistake. Bioscope does output a spliced alignment as one record. The pre-Bioscope WT pipeline generates two records in the GFF file for reads mapped to a splice junction.
Thanks for the reply.
-
Hi,
I don't think BioScope outputs 2 entries. It should output only 1 entry for each alignment (continuous or spliced), unless there is more than one alignment for that read. It shouldn't be necessary to merge any 2 lines.
Did you find any such cases in your data?
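One quick way to look for such cases is to count how many records each read name has. The sketch below is only an illustration with an invented function name. It assumes tab-delimited SAM text from a single-end run, so any read with more than one record is either multiply aligned or split across lines.

```python
from collections import Counter


def reads_with_multiple_records(sam_lines):
    """Map each read name that appears on more than one SAM line to its record count."""
    counts = Counter(
        line.split("\t")[0]  # column 1 is the read name (QNAME)
        for line in sam_lines
        if not line.startswith("@")  # ignore header lines
    )
    return {name: n for name, n in counts.items() if n > 1}
```

Comparing the counts with each read's IH:i value would show whether the extra lines are simply multiple alignments.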
-
Originally posted by damiankao:
I am using Bioscope mapping output .bam files as input to Cufflinks. You first have to convert them to .sam, clean the file up, and add the strand information by parsing the bitwise flag.
I was able to run this cleaned-up .sam file through Cufflinks with pretty good results. The only problem I am having is that most of the output does not show any strand information.
I think Cufflinks only uses strand information from spliced reads and ignores the strand of unspliced reads? So all the genes assembled from spliced reads have strand information, but the others don't?
Any thoughts on this issue?
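The exact cleanup script is not shown in this thread, so the following is only a rough sketch of the "add strand from the bitwise flag" step. It assumes tab-delimited SAM text (for example from `samtools view -h`), that the strand goes into an `XS:A:` tag, and that the library is strand-specific so the read strand can stand in for the transcript strand. The function name is made up for illustration.

```python
REVERSE_STRAND_FLAG = 0x10  # SAM flag bit set when the read maps to the reverse strand


def add_strand_tag(sam_line):
    """Return the SAM record with an XS:A strand tag appended if it has none."""
    line = sam_line.rstrip("\n")
    if line.startswith("@"):
        return line  # header lines pass through unchanged
    fields = line.split("\t")
    if any(field.startswith("XS:A:") for field in fields[11:]):
        return line  # keep an existing strand tag as it is
    flag = int(fields[1])  # column 2 is the bitwise flag
    # Assumption: read strand equals transcript strand (strand-specific library).
    strand = "-" if flag & REVERSE_STRAND_FLAG else "+"
    fields.append("XS:A:" + strand)
    return "\t".join(fields)
```

Applied to every line of the converted .sam file, this gives unspliced reads a strand tag too. Whether Cufflinks then uses it for unspliced reads is the open question above.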