Why low mapping rates for RNA-seq?

[email protected] replied

09-09-2014, 12:34 PM
I will have to ask my bioinformatician for the stats after adapter trimming. I will get back to you on this. Thanks for your time.
Brian Bushnell replied

09-09-2014, 12:04 PM
The whole thing as a PDF is a good way to start. In particular, though: the quality histograms, anything that failed the tests, and anything related to mapping. Also the stats from adapter trimming, the command lines you used for each program, the top hits and mapping rate you got from BLAST, etc.
[email protected] replied

09-09-2014, 11:57 AM
OK. I have looked at the FastQC report for the samples and it looks fine to me. What kind of information should I give from the FastQC report?
Brian Bushnell replied

09-08-2014, 02:47 PM
It would be helpful if you posted some basic quality metrics, such as you get from FastQC. There's not enough information to determine what the problem is or even if there is a problem.
[email protected] replied

09-08-2014, 01:05 PM
Our bioinformatician is handling all of that. She trimmed the adapters and tried aligning the reads to the genome using Tophat, which gave 40% alignment. We tried blasting some of the unaligned reads and realized that something went wrong with the Tophat run, as some of them aligned to chromosome M, chr 1, chr 4, etc. She will redo the alignment with STAR this time, but to save time she also ran RSEM alongside and got these low alignment percentages to the transcriptome, so I wanted to know: are we missing anything?
dpryan replied

09-08-2014, 12:48 PM
Replying to a ~year old thread is not normally the most efficient route to get help.

Did you adapter trim your data? Have you tried aligning to the genome? Have you tried blasting a few unaligned reads?
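
One way to act on the last suggestion is to pull a small random sample of unaligned reads into FASTA and paste it into BLAST. The following is a minimal sketch, assuming the unaligned reads are already available as a plain 4-line-per-record FASTQ file (how they get extracted depends on the aligner and is not covered here). The function names and the in-memory demo data are hypothetical.

```python
import random


def fastq_records(lines):
    """Yield (name, sequence) tuples from 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.strip()
        seq = next(it, "").strip()
        next(it, None)  # '+' separator line, ignored
        next(it, None)  # quality line, ignored
        if header.startswith("@") and len(header) > 1 and seq:
            yield header[1:].split()[0], seq


def reservoir_sample(records, k, seed=0):
    """Uniformly sample up to k records in one pass (reservoir sampling)."""
    rng = random.Random(seed)
    sample = []
    for n, rec in enumerate(records):
        if n < k:
            sample.append(rec)
        else:
            j = rng.randint(0, n)  # inclusive on both ends
            if j < k:
                sample[j] = rec
    return sample


def to_fasta(records):
    """Format (name, sequence) tuples as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


# Demo on made-up in-memory data; for a real file, pass an open file handle.
demo = []
for i in range(20):
    demo += [f"@read{i} 1:N:0", "ACGTACGTAC", "+", "IIIIIIIIII"]
print(to_fasta(reservoir_sample(fastq_records(demo), k=3)))
```

A handful of reads is usually enough to see whether the top hits are rRNA, mitochondrial, adapter dimers, or another organism.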
[email protected] replied

09-08-2014, 12:45 PM
Hi, I have run RNA-seq on human samples and got very low alignment percentages in Tophat and RSEM. I used the Illumina Ribo-Zero TruSeq kit for library prep. What could be the reason for the low alignment? Right now only 11% of my reads align to the transcriptome in RSEM. Can I do something to fix this?
Lizex replied

08-29-2013, 05:37 AM
Originally posted by dpryan

Depending on exactly what you want to do with the reads, you can either map read1 as single-ended with tophat or just ignore them (the read2 file will mostly be crap in my experience). Given how many of your pairs became singletons, you might want to go ahead and align read1 just so you have a bit more data (I haven't ever lost many reads).

Thanks for the advice.
dpryan replied

08-29-2013, 04:49 AM
Depending on exactly what you want to do with the reads, you can either map read1 as single-ended with tophat or just ignore them (the read2 file will mostly be crap in my experience). Given how many of your pairs became singletons, you might want to go ahead and align read1 just so you have a bit more data (I haven't ever lost many reads).
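
To put rough numbers on "how many of your pairs became singletons", the counts reported in the thread can be tallied as below. This is only a sketch: pairs where both mates were discarded by Trimmomatic are not reported anywhere in the thread, so the fraction is computed over the accounted-for pairs only, and the function name is hypothetical.

```python
# Counts as reported in the thread (Trimmomatic output).
both_surviving = 1_492_345  # pairs kept intact (read1.fq / read2.fq)
forward_only = 896_804      # unpaired read1 (mate was discarded)
reverse_only = 13_476       # unpaired read2 (mate was discarded)


def singleton_summary(both, fwd_only, rev_only):
    """Summarise how many pairs lost exactly one mate during trimming."""
    singles = fwd_only + rev_only
    # Assumption: pairs with both mates dropped are unknown and left out.
    accounted = both + singles
    return {
        "singleton_pairs": singles,
        "singleton_fraction": singles / accounted,
        "read1_share_of_singletons": fwd_only / singles,
    }


summary = singleton_summary(both_surviving, forward_only, reverse_only)
for key, value in summary.items():
    print(key, round(value, 4) if isinstance(value, float) else value)
```

With these inputs, read1 accounts for nearly all of the singletons, which is why aligning only the unpaired read1 file as single-end recovers most of the extra data.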
Lizex replied

08-29-2013, 02:23 AM
Originally posted by dpryan

That looks pretty reasonable. You started with ~1.5 million reads and aligned ~1.4 million, of which ~85% were properly paired. That's certainly a vast improvement over the original 12% mapping rate that you reported!

Thanks for the reply. This result was for the paired reads (output from Trimmomatic). What should I do with the unpaired reads (also output from Trimmomatic), which don't have equal read counts: read1 has 896 804 reads and read2 has 13 476. Should I map them with Tophat as well?
dpryan replied

08-29-2013, 02:09 AM
That looks pretty reasonable. You started with ~1.5 million reads and aligned ~1.4 million, of which ~85% were properly paired. That's certainly a vast improvement over the original 12% mapping rate that you reported!
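
For anyone wanting to redo this arithmetic, here is a minimal sketch that parses the flagstat output quoted further down the thread and expresses the counts as fractions. Two assumptions are labeled in the code: whether the 1 492 345 figure is used as a count of pairs or doubled to a count of individual reads (both denominators are printed), and that flagstat lines correspond to reads, which may not hold if multi-mapped reads appear more than once in accepted_hits.bam. Function names are hypothetical.

```python
import re

# flagstat output copied from the thread (Tophat 1.4.0, accepted_hits.bam)
FLAGSTAT = """\
1404454 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
1404454 + 0 mapped (100.00%:nan%)
1404454 + 0 paired in sequencing
682904 + 0 read1
721550 + 0 read2
1200618 + 0 properly paired (85.49%:nan%)
1243330 + 0 with itself and mate mapped
161124 + 0 singletons (11.47%:nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
"""


def parse_flagstat(text):
    """Return {label: QC-passed count} for each flagstat line."""
    counts = {}
    # Trailing "(...)" notes are dropped from the label, except "(mapQ>=5)",
    # which is kept so the last two lines do not collide.
    pattern = re.compile(r"(\d+) \+ (\d+) (.+?)(?: \((?!mapQ).*\))?$")
    for line in text.splitlines():
        m = pattern.match(line.strip())
        if m:
            counts[m.group(3)] = int(m.group(1))
    return counts


def mapping_summary(counts, input_reads):
    """Express key flagstat counts as fractions."""
    mapped = counts["mapped"]
    return {
        "mapped_of_input": mapped / input_reads,
        "properly_paired_of_mapped": counts["properly paired"] / mapped,
        "singletons_of_mapped": counts["singletons"] / mapped,
    }


counts = parse_flagstat(FLAGSTAT)
per_file = 1_492_345  # reads in each of read1.fq and read2.fq, per the thread
# Assumption: the right denominator depends on whether input is counted
# as pairs or as individual reads, so both are shown.
for label, denom in [("pairs", per_file), ("individual reads", 2 * per_file)]:
    s = mapping_summary(counts, denom)
    print(f"input counted as {label}: mapped {s['mapped_of_input']:.1%}, "
          f"properly paired {s['properly_paired_of_mapped']:.1%} of mapped")
```

The 100% "mapped" figure in flagstat is expected here, since accepted_hits.bam contains only aligned reads; the overall rate has to be computed against the input count.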
Lizex replied

08-29-2013, 01:43 AM
Originally posted by Lizex

Thanks. I'll give it a try.

Hi dpryan

I've tried Trimmomatic. The paired output files, read1.fq and read2.fq, contain 1 492 345 reads each. After mapping with Tophat 1.4.0, the stats for the accepted_hits.bam file look like this:

samtools flagstat /Data_Analysis/E0.2.3/E0_tophat/accepted_hits.bam
1404454 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
1404454 + 0 mapped (100.00%:nan%)
1404454 + 0 paired in sequencing
682904 + 0 read1
721550 + 0 read2
1200618 + 0 properly paired (85.49%:nan%)
1243330 + 0 with itself and mate mapped
161124 + 0 singletons (11.47%:nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

Is this a good mapping or a bad one? How should I interpret this result?
Lizex replied

08-27-2013, 01:15 PM
Originally posted by dpryan

trim_galore or trimmomatic are common suggestions. I've had good luck in the past with trim_galore, which is also quite flexible.

Thanks. I'll give it a try.
dpryan replied

08-27-2013, 01:11 PM
Originally posted by Lizex

Thanks, dpryan. What do you suggest I use?

trim_galore or trimmomatic are common suggestions. I've had good luck in the past with trim_galore, which is also quite flexible.
Lizex replied

08-27-2013, 12:24 PM
Originally posted by dpryan

Since you have paired-end data, be careful using fastx_toolkit. I've seen a lot of people desyncing their paired-end reads with it.

Thanks, dpryan. What do you suggest I use?
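
The desyncing concern quoted above can be checked directly before alignment. The sketch below walks two FASTQ files in parallel and reports the first record whose read IDs differ. It assumes plain 4-line records and that mate IDs match once a trailing /1 or /2 and any header comment are removed. The function names and demo reads are hypothetical.

```python
from itertools import zip_longest


def read_ids(lines):
    """Yield the read ID of each 4-line FASTQ record, without /1 or /2."""
    for i, line in enumerate(lines):
        if i % 4 != 0:
            continue  # only header lines carry the ID
        fields = line.split()
        if not fields:
            continue  # tolerate a trailing blank line
        name = fields[0].lstrip("@")
        if name.endswith("/1") or name.endswith("/2"):
            name = name[:-2]
        yield name


def first_desync(r1_lines, r2_lines):
    """Return (record_index, id1, id2) at the first mismatch, else None."""
    pairs = zip_longest(read_ids(r1_lines), read_ids(r2_lines))
    for n, (a, b) in enumerate(pairs):
        if a != b:  # also catches one file ending early (ID vs None)
            return n, a, b
    return None


# Made-up demo: the second record of read2 belongs to a different read.
r1 = ["@readA/1", "ACGT", "+", "IIII", "@readB/1", "ACGT", "+", "IIII"]
r2 = ["@readA/2", "TGCA", "+", "IIII", "@readC/2", "TGCA", "+", "IIII"]
print(first_desync(r1, r2))  # (1, 'readB', 'readC')
```

For real data, open file handles can be passed in place of the lists, for example the read1.fq and read2.fq files mentioned elsewhere in the thread.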