**Brian Bushnell** · 08-11-2014, 10:31 AM

The optimal insert size depends on various factors...

1) Read length and sequencing platform
2) Gene and exon length distribution in the target organism
3) Use of data - assembly vs quantification

I don't think you can derive a useful number without specifying these things. I like long insert sizes, particularly in organisms with differential splicing, as they are more informative about the source isoform. But it's really experiment-specific.

**luc** · 08-11-2014, 12:49 PM

Thanks, Brian.

Yes, I should have specified that. I was thinking of Illumina HiSeq systems with transcript quantification as the purpose (typically single-end 50 bp reads).

I imagine random priming will cause some bias against smaller fragments. Illumina flowcell clustering, on the other hand, is more efficient for smaller fragments. The chemical fragmentation is very likely approximately random; nevertheless, is there likely to be some bias in how transcripts of a given length (let's say about 400 bp transcripts compared to 3 kb transcripts) show up as fragments of a specific size range (e.g. 150 bp inserts or 300 bp inserts)?
Very likely it would be best to look at some ERCC spike-in data.

**Brian Bushnell** · 08-11-2014, 04:01 PM

With 50bp single-end reads, there is no reason to shoot for a long insert size, and for quantification, short inserts will be less biased anyway. I don't know what kind of biases are introduced by the different fragmentation methods, though I understand that "random hexamer priming" is actually pretty non-random, so it seems like something to avoid for accurate quantification of small transcripts.

Also, the shorter your insert sizes, the less genetic material or amplification you will need. So it seems like you should go as short as possible; maybe 100bp.

**turnersd** · 08-13-2014, 04:17 AM

Don't mean to side-track this discussion too much, but I'm seeing very poor coverage of a relatively small transcript (1200bp) after rRNA reduction and 2x100 sequencing; I still need to check the insert size. Which of the upstream library prep steps discussed here could result in this poor coverage? That is, could you help me understand why random hexamer priming biases against coverage of small transcripts? And how does the insert size affect small-transcript coverage?

Thanks.

**Brian Bushnell** · 08-13-2014, 08:48 AM

If the random hexamers are not completely random (in terms of their concentration or binding affinity), then transcripts rich in the more concentrated/better-binding hexamers will be overrepresented and those poor in them will be underrepresented. The shorter a transcript is, down to a limit of 6bp, the more highly skewed the abundance distribution of its hexamers is likely to be. 1200bp is probably fairly long for that to play a major role.
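To make the hexamer argument concrete, here is a minimal Python sketch that counts the hexamers available in a sequence. The function names are made up for illustration, and it only considers one strand with exact 6-mer matches, which is a simplification of how priming actually works.

```python
from collections import Counter


def hexamer_sites(transcript_len: int) -> int:
    """Number of 6bp windows in a transcript of the given length."""
    return max(0, transcript_len - 6 + 1)


def hexamer_counts(seq: str) -> Counter:
    """Count every overlapping hexamer in seq (one strand, exact match)."""
    seq = seq.upper()
    return Counter(seq[i:i + 6] for i in range(len(seq) - 5))


# There are 4**6 = 4096 possible hexamers in total.
# A 6bp transcript offers exactly one of them; a 1200bp transcript
# offers 1195 windows, so many different hexamers get sampled.
print(hexamer_sites(6), hexamer_sites(1200))
```

The fewer windows a transcript has, the more its priming efficiency depends on whichever few hexamers it happens to contain; with over a thousand windows, individual hexamer preferences tend to average out.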

Also, the longer the insert relative to the transcript, the fewer available start/stop positions there are. Considering a 600bp transcript, there's no longer any place an 800bp insert fragment can originate. But assuming you kept 600bp and smaller fragments, the majority of fragments from that transcript would be expected to be the whole unsheared transcript, starting at one end and ending at the other with no coverage in the middle (since only the 2 outermost 100bp sections would be sequenced).
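The arithmetic behind the 600bp example can be sketched as follows. This is only a back-of-the-envelope illustration with hypothetical function names, assuming fragments can start at any base with equal probability and that paired reads cover exactly the two ends of a fragment.

```python
def fragment_start_positions(transcript_len: int, insert_len: int) -> int:
    """How many places a fragment of insert_len can start within a transcript."""
    return max(0, transcript_len - insert_len + 1)


def uncovered_middle(fragment_len: int, read_len: int) -> int:
    """Bases in the middle of a fragment that paired reads never touch."""
    return max(0, fragment_len - 2 * read_len)


# 600bp transcript: an 800bp insert cannot originate anywhere,
# a 600bp insert has a single possible position (the whole transcript),
# and a 100bp insert has 501 possible positions.
for insert_len in (800, 600, 100):
    print(insert_len, fragment_start_positions(600, insert_len))

# A whole-transcript 600bp fragment sequenced 2x100 leaves the middle unread.
print(uncovered_middle(600, 100))
```

With 2x100 reads on an unsheared 600bp fragment, only the outer 100bp at each end are sequenced, so 400bp in the middle get no coverage from that fragment.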

**turnersd** · 08-13-2014, 10:07 AM

Thanks for the helpful explanation, Brian.

insert sizes for RNA-seq
