Tophat2 output bam size discrepancies

Sentinel156

Junior Member

Join Date: Oct 2014

Posts: 5
#1


05-19-2015, 12:19 AM

Hi all,

I'm seeing TopHat2 output BAM files of very different sizes (~800 MB vs. ~300 MB) for two FASTQ files from the same strain (different biological replicates) with similar total read counts (~14 million reads each, from a 100 bp single-end Illumina HiSeq 2000 run). Both FASTQ files are mapped against the same reference genome, but the samples were sequenced at different times and on different lanes. I don't believe this is a problem, but I would like to understand the size discrepancy.

Cheers,
Tags: bam, discrepancy, file size, tophat2
Brian Bushnell

Super Moderator

Join Date: Jan 2014

Posts: 2709
#2

05-19-2015, 09:34 AM

Metrics like mapping rate and error rate are much more informative than BAM file size, which depends on factors such as compression level, read names, how smooth the read quality scores are, and aligner-specific parameters such as which tags to generate and whether to output unmapped reads. Do you know the alignment rates and the command-line parameters used?
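To illustrate the point about quality score smoothness and compression level, here is a rough Python sketch. It does not read real BAM files; it builds two made-up blocks of Phred+33 quality strings with the same number of reads and the same read length, then compresses each with zlib (DEFLATE, the same family of compression that BAM's BGZF blocks use). The function names, the score distributions, and the read counts are all assumptions chosen for illustration, not values from the runs discussed here.

```python
import random
import zlib


def synthetic_quality_block(n_reads, read_len, noisy, seed=0):
    """Build a block of made-up Phred+33 quality strings, one per line.

    noisy=False: scores drawn from two values only (a "smooth" profile).
    noisy=True:  scores drawn uniformly from 2..40 (a "rough" profile).
    Both distributions are illustrative assumptions, not real run data.
    """
    rng = random.Random(seed)
    if noisy:
        scores = list(range(2, 41))
        weights = [1] * len(scores)
    else:
        scores = [30, 40]
        weights = [1, 9]
    lines = []
    for _ in range(n_reads):
        quals = rng.choices(scores, weights=weights, k=read_len)
        lines.append("".join(chr(33 + q) for q in quals))
    return "\n".join(lines).encode("ascii")


def compressed_size(data, level=6):
    """Return the number of bytes after zlib compression at the given level."""
    return len(zlib.compress(data, level))


if __name__ == "__main__":
    smooth = synthetic_quality_block(2000, 100, noisy=False)
    noisy = synthetic_quality_block(2000, 100, noisy=True)
    print("uncompressed bytes:", len(smooth), len(noisy))
    for level in (1, 6, 9):
        print("level", level,
              "smooth:", compressed_size(smooth, level),
              "noisy:", compressed_size(noisy, level))
```

Both blocks have identical uncompressed size, yet the noisy one compresses to a much larger output. This is one way two samples with similar read counts can end up with quite different BAM sizes without anything being wrong with the alignment.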