Unconfigured Ad

**chadn737** · 11-01-2012, 07:08 AM

How have you trimmed your reads? Have you looked for adaptor sequence in your reads?

**nr23** · 11-01-2012, 07:18 AM

I've trimmed the reads using fastq_quality_trimmer & filter and fastx_trimmer.

One of the problems I've had is that the RNA fragment size is ~130 bp (post adapter removal) and my 100bp reads therefore overlap considerably. I've been using fastx_trimmer to cut the reads to 65bp to ensure no overlap - but they don't seem to be pairing properly in mapping.

I haven't checked for adapters - I ran the .txt files through fastqc and there were no over-represented sequences.

N

**chadn737** · 11-01-2012, 07:37 AM

Thats what I thought.

Even at 65 bp you may still have overlap and/or adaptor sequence.

Is it critical that you have paired end data? I had a similar situation with some paired end data. I simply dispensed with the second set of reads and treated it as single end reads. With that amount of overlap, its probably going to be impossible for tophat to get the insert size right.

Also try adaptor trimming with a trimmer that can handle variable lengths of adaptor sequence, I have used cutadapt with great success. Then try realigning without your paired end and you should have better results.

Otherwise....make a new library.

**GenoMax** · 11-01-2012, 08:01 AM

If your reads are overlapping significantly you may want to try this as an alternative to stitch the two ends together.

http://bioinformatics.oxfordjournals.org/content/early/2011/09/07/bioinformatics.btr507.full.pdf

Updated citation:

Tanja Magoč and Steven L. Salzberg

FLASH: fast length adjustment of short reads to improve genome assemblies Bioinformatics (2011) 27(21): 2957-2963

**nr23** · 11-01-2012, 08:04 AM

I ran the 65bp trimmed reads through FLASH (http://genomics.jhu.edu/software/FLASH/index.shtml) to confirm that, post trim, there's no overlap.

As I understand it bowtie and tophat map the pairs independently, so I would expect that dispensing of 1/2 of my reads would result in the same % mapped reads, maybe I'm wrong though?

My primary concern is that the % of reads mapped is so low, I'm less concerned about the pairing of the reads (I'm interested in differential expression rather than resolving isoforms etc) but can't help but feel that the two are linked...

Topics	Statistics	Last Post
Large-Scale Protein Screen Uncovers Hidden Regulators of Alternative Polyadenylation by SEQadmin2 Started by SEQadmin2, 06-26-2026, 11:10 AM	0 responses 12 views 0 reactions	Last Post by SEQadmin2 06-26-2026, 11:10 AM
Whole-Genome Sequencing Traces Faroe Islands Ancestry to a North Atlantic Founder Population by SEQadmin2 Started by SEQadmin2, 06-17-2026, 06:09 AM	0 responses 48 views 0 reactions	Last Post by SEQadmin2 06-17-2026, 06:09 AM
Sequencing the Two-Toed Sloth Genome Reveals Jumping Genes Tied to Its Extreme Metabolism by SEQadmin2 Started by SEQadmin2, 06-09-2026, 11:58 AM	0 responses 107 views 0 reactions	Last Post by SEQadmin2 06-09-2026, 11:58 AM
A New Method Makes Hantavirus Genome Analysis Faster and More Accessible by SEQadmin2 Started by SEQadmin2, 06-05-2026, 10:09 AM	0 responses 125 views 0 reactions	Last Post by SEQadmin2 06-05-2026, 10:09 AM

Unconfigured Ad

Samtools flagstat - low % reads mapping

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News