I have two RNA-seq data sets. When I run one through Cufflinks I see isoforms, but when I run the other I see no isoforms, only single-exon transcripts, none with multiple exons. Investigating further in IGV, I see that the alignments in the second data set show no reads spanning splice junctions. I have attached two PNG files to show what it should look like (normal.png) and what it does look like (broad.png). Any insight into why this may be happening?
-
Cufflinks won't be remapping anything, just estimating abundances and looking for new transcripts (among other related things). Have you tried looking at the same area in both alignments? Also, what about just running "samtools view alignments_lacking_apparent_splicing.bam | cut -f 6 | grep N" to double check things, as that'll check for splicing in the whole file instead of you needing to spot-check things.
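For anyone who wants counts rather than a raw list of matching CIGAR strings, the same check can be sketched in Python. This is only an illustrative sketch: the function names `is_spliced` and `count_spliced` are made up for this example, and it assumes plain-text SAM lines (for instance the output of `samtools view`), not the binary BAM itself. An `N` operation in the CIGAR field (column 6) marks a skipped reference region, which is how spliced alignments are recorded.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def is_spliced(cigar):
    """Return True if the CIGAR string contains an N (skipped region) operation."""
    return any(op == "N" for _, op in CIGAR_OP.findall(cigar))


def count_spliced(sam_lines):
    """Count mapped and spliced records in an iterable of SAM text lines.

    Returns a tuple (mapped_records, spliced_records).
    """
    mapped = spliced = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # not a complete alignment record
        flag = int(fields[1])
        if flag & 0x4:
            continue  # read is unmapped, so it has no meaningful CIGAR
        mapped += 1
        if is_spliced(fields[5]):
            spliced += 1
    return mapped, spliced


# Tiny made-up example: one unspliced read and one read spanning a 500 bp gap.
example = [
    "@HD\tVN:1.0\n",
    "r1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*\n",
    "r2\t0\tchr1\t100\t60\t20M500N30M\t*\t0\t0\t*\t*\n",
]
mapped, spliced = count_spliced(example)
print(f"{spliced} of {mapped} mapped records are spliced")
```

On a real file, pass `sys.stdin` to `count_spliced` and pipe the output of `samtools view` into the script. A count of zero spliced records across the whole file would suggest the aligner was not splice-aware.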
-
dpryan, I just contacted the institute that aligned the files, and they didn't use a splice junction aligner like TopHat or PASTA, which I had foolishly assumed. Since they won't send me the read files, which are "too large" in their words, what is the best way to extract sequences from the .bam files so that I can run them through TopHat?
-
Wow, they sound like assholes. I bet the original reads had crap alignment rates or something, suggesting that the data may be bad to begin with (though you wouldn't be able to see this from the alignments). You can just use the SamToFastq command from Picard tools to extract the raw reads for realignment. The bad news is that, since they didn't use a splice-aware aligner (if these really are RNA-seq reads, then these people are morons), a lot of the spliced reads probably just didn't align, so they probably aren't included in the BAM file.
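To make clear what "extracting the raw reads" involves, here is a rough Python sketch of the core conversion. It is not Picard's implementation, and the function names are invented for this example. It assumes plain-text SAM records as input and ignores things a real tool handles, such as writing paired mates to separate files. The key detail is that reads aligned to the reverse strand are stored reverse-complemented in SAM/BAM (flag bit 0x10), so both the sequence and the quality string have to be flipped back to recover the read as sequenced.

```python
# Translation table for complementing nucleotides (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def sam_record_to_fastq(line):
    """Convert one SAM alignment line to a FASTQ record string.

    Returns None for secondary/supplementary alignments and for records
    without a stored sequence or qualities, so each read is emitted once.
    """
    fields = line.rstrip("\n").split("\t")
    name, flag = fields[0], int(fields[1])
    seq, qual = fields[9], fields[10]

    if flag & 0x900:  # 0x100 secondary or 0x800 supplementary alignment
        return None
    if seq == "*" or qual == "*":  # sequence or qualities not stored
        return None

    if flag & 0x10:  # aligned to the reverse strand: restore original orientation
        seq = reverse_complement(seq)
        qual = qual[::-1]

    if flag & 0x40:  # first read of a pair
        name += "/1"
    elif flag & 0x80:  # second read of a pair
        name += "/2"

    return f"@{name}\n{seq}\n+\n{qual}\n"


# Made-up reverse-strand record: the stored sequence is AACG, the original read was CGTT.
record = "r2\t16\tchr1\t100\t60\t4M\t*\t0\t0\tAACG\tABCD\n"
print(sam_record_to_fastq(record), end="")
```

Note that this can only recover reads that are actually present in the BAM file. If unaligned reads were dropped, as suspected above, no conversion tool can bring them back.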
-
I have found the culprit. It seemed that the sequencing institute had aligned the reads to the genome, which made us think that they had received the reads and then aligned them to the genome themselves. In actuality the sequencer is automated to align the reads and report a BAM file indicating possible alignment sites, thereby consolidating the data. I am now using Picard to extract the sequences so I can run them through the proper TopHat-Cufflinks protocol. Thanks for all the help, everyone.