**rgregor** · 11-12-2010, 12:48 AM

I am looking at accepted_hits.bam (output from TopHat):

read_id_0 16 1 2803 0 36M * 0 0 ACACATACACTGCGCTATTAAACAAGACACTTGTAC ffdfffefdfefffffffffffffffffffffffff NM:i:0 NH:i:14 CC:Z:= CP:i:7210

Does this file contain only alignments that map across splice sites? How can I tell how a read was spliced (i.e. both locations it maps to)?

Perhaps from the last part (the SAM tags?): NM:i:0 NH:i:14 CC:Z:= CP:i:7210

Thanks,
Gregor

**fhb** · 11-14-2010, 08:11 AM

Hi Gregor,
Take a look at this file: http://samtools.sourceforge.net/SAM1.pdf

TopHat prints both spliced and non-spliced alignments. In this case you do not have a splice (36M).

You will see a spliced alignment as a CIGAR string like XXMXXNXXM, where each XX is a number: the number of bases matched on one exon, the length of the skipped intron (N), and the number of bases matched on the other exon.
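To make that concrete, here is a minimal Python sketch, not taken from TopHat or samtools. In SAM the skipped region (the intron) is the N operation, so the sketch checks the CIGAR for an N and, given the 1-based POS field, lists the reference blocks on either side of each skipped region. The function names `parse_cigar`, `is_spliced` and `exon_blocks` are made up for this example, and it assumes only the seven operations M, I, D, N, S, H and P.

```python
import re

# One CIGAR element: a length followed by a single operation letter.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP])")


def parse_cigar(cigar):
    """Split a CIGAR string into a list of (length, op) tuples."""
    return [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]


def is_spliced(cigar):
    """True if the alignment skips a reference region (an N operation)."""
    return any(op == "N" for _, op in parse_cigar(cigar))


def exon_blocks(pos, cigar):
    """Return 1-based inclusive (start, end) reference blocks split at each N."""
    blocks = []
    start = cur = pos
    for length, op in parse_cigar(cigar):
        if op in "MD":          # M and D advance along the reference
            cur += length
        elif op == "N":         # skipped region: close the block, jump ahead
            if cur > start:
                blocks.append((start, cur - 1))
            cur += length
            start = cur
        # I, S, H and P do not advance along the reference
    if cur > start:
        blocks.append((start, cur - 1))
    return blocks
```

For the read in the question, `exon_blocks(2803, "36M")` gives a single block `[(2803, 2838)]`, so there is no junction. A made-up spliced CIGAR such as `20M500N16M` starting at position 100 gives `[(100, 119), (620, 635)]`.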

I hope it helps.
Fernando

In item 2.2.3 of that file you have:

2.2.3. Extended CIGAR format

A CIGAR string is comprised of a series of operation lengths plus the operations. The conventional CIGAR format allows for three types of operations: M for match or mismatch, I for insertion and D for deletion. The extended CIGAR format further allows four more operations, as is shown in the following table, to describe clipping, padding and splicing:

| op | Description |
|----|-------------|
| M | Alignment match (can be a sequence match or mismatch) |
| I | Insertion to the reference |
| D | Deletion from the reference |
| N | Skipped region from the reference |
| S | Soft clip on the read (clipped sequence present in <seq>) |
| H | Hard clip on the read (clipped sequence NOT present in <seq>) |
| P | Padding (silent deletion from the padded reference sequence) |
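One practical way to read the table above is to note which operations use up bases of the read and which use up positions on the reference. The Python sketch below is only an illustration under that reading. The `CONSUMES` mapping and the function name `cigar_lengths` are made up for this example. It adds up both lengths for a CIGAR string, so you can check that the read length matches the sequence field and see how far the alignment spans on the reference.

```python
import re

# op -> (consumes read bases, consumes reference positions)
CONSUMES = {
    "M": (True, True),    # match or mismatch
    "I": (True, False),   # insertion to the reference
    "D": (False, True),   # deletion from the reference
    "N": (False, True),   # skipped region (e.g. an intron)
    "S": (True, False),   # soft clip: bases still present in the sequence
    "H": (False, False),  # hard clip: bases not present in the sequence
    "P": (False, False),  # padding
}


def cigar_lengths(cigar):
    """Return (read_length, reference_span) implied by a CIGAR string."""
    read_len = 0
    ref_len = 0
    for n, op in re.findall(r"(\d+)([MIDNSHP])", cigar):
        uses_read, uses_ref = CONSUMES[op]
        if uses_read:
            read_len += int(n)
        if uses_ref:
            ref_len += int(n)
    return read_len, ref_len
```

For `36M` this returns `(36, 36)`, which matches the 36-base sequence in the question. A made-up spliced CIGAR such as `20M500N16M` returns `(36, 536)`: the same 36 read bases spread over 536 reference positions.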

**lpachter** · 11-15-2010, 09:08 AM

Greg, to answer your last question: TopHat uses Bowtie as the engine for its read-to-genome mapping, as part of its algorithm for finding spliced reads. Cufflinks in turn can use the TopHat alignments. The programs are modular, so you can also run Cufflinks on (spliced) read alignments made with other programs.
