
**fkrueger** · 07-31-2011, 11:58 PM

The insert is normally the stretch of sequence between the paired-end adapters, so in your case the insert size would be 250 bp (2x75 bp reads + 100 bp unsequenced middle piece). The fragment size (which you need to select for during a gel purification for example) would be the insert size + length of both adapters (around 120 bp extra for both Illumina adapters).

**louis7781x** · 08-01-2011, 12:57 AM

Originally posted by fkrueger:

The insert is normally the stretch of sequence between the paired-end adapters, so in your case the insert size would be 250 bp (2x75 bp reads + 100 bp unsequenced middle piece). The fragment size (which you need to select for during a gel purification for example) would be the insert size + length of both adapters (around 120 bp extra for both Illumina adapters).

Hi, does the adapter get sequenced too? I mean, does the raw data contain adapter sequence?

**fkrueger** · 08-01-2011, 01:02 AM

Normally sequencing starts right after the adapter but does not include adapter sequence.

**kmcarr** · 08-01-2011, 05:17 AM

Originally posted by louis7781x:

Hi,

I have a question about the term "insert size".

If |-----75----|------------------------100-----------------|-----75-----|

the paired-end reads are both 75-mers, is the insert size in this example 100 or 250?

I am confusing it with fragment size.

Thanks

Best regards!

In your example I would say the insert size is 250bp. But as fkrueger noted above there is more than one way to describe things. When the wet lab sends data to me they report the library fragment size which includes the ligated Illumina adapters; continuing with your example the fragment size in this case would have been 320bp. Certain software may use different measurements. For example TopHat requests the mate inner distance, the length between the two sequence reads, which in your example is 100bp.

The lesson is to be very clear about what is being asked or reported.
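To make the three conventions concrete, here is a minimal Python sketch using the numbers from this thread. The function name `library_sizes` is hypothetical, and the default `adapter_total=120` is only fkrueger's approximate figure for both Illumina adapters combined. With that assumption the fragment comes out near 370 bp; a lab working with a different adapter length will report a different fragment size.

```python
def library_sizes(read_len, inner_distance, adapter_total=120):
    """Return the three commonly confused lengths for one read pair.

    adapter_total is the combined length of both adapters. The 120 bp
    default is an assumption (the approximate figure quoted in this
    thread), not a property of any particular kit.
    """
    insert_size = 2 * read_len + inner_distance  # adapter-free fragment
    fragment_size = insert_size + adapter_total  # insert plus both adapters
    return {
        "inner_distance": inner_distance,  # gap between the reads (TopHat's mate inner distance)
        "insert_size": insert_size,
        "fragment_size": fragment_size,
    }


sizes = library_sizes(read_len=75, inner_distance=100)
print(sizes)
```

Whichever number a tool or a wet lab quotes, the other two follow from the read length and the adapter length, so it is worth writing down which of the three keys is meant.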

**arkilis** · 09-23-2013, 09:09 PM

Originally posted by louis7781x:

Hi,

I have a question about the term "insert size".

If |-----75----|------------------------100-----------------|-----75-----|

the paired-end reads are both 75-mers, is the insert size in this example 100 or 250?

I am confusing it with fragment size.

Thanks

Best regards!

As far as I know, it is 250.

(The 150 I wrote earlier was a typo, sorry.)

**mcnelson.phd** · 09-24-2013, 03:53 AM

As noted, when most analysis programs ask for an insert size, they are referring to the size of your fragment with the adapters excluded, 250bp in your case. However, some programs use the term insert size to mean the gap distance between the 3' ends of the two reads (assuming standard forward/reverse orientation), which in your case is 100bp. Most programs are documented well enough to state which version they mean when they say insert size, but you shouldn't assume that the two are interchangeable. The term pair-distance is also used, and just like insert size it has been taken to mean either the size of the fragment minus the adapters (250bp) or the gap distance (100bp).

For assemblies, interchanging the two values won't cause huge problems, but for read mapping methods where you want to look for insertions/deletions or splice variation, inputting the correct value can be very important.

**Yue Xu** · 09-24-2013, 04:12 AM

Expression quintiles

Sorry, I have recently been studying transcript assembly. Can you tell me the meaning of "expression quintiles"? Thank you very much.

**Yue Xu** · 09-24-2013, 04:13 AM

Originally posted by Yue Xu:

Sorry, I have recently been studying transcript assembly. Can you tell me the meaning of "expression quintiles"? Thank you very much.

Oh, sorry, I posted this in the wrong thread.

**thomasblomquist** · 09-24-2013, 04:21 AM

P5 --- Index/Barcode1 --- Read 1 Primer --- Insert/TargetFragment --- Read 2 Primer --- Index/Barcode2 --- P7

The Insert/TargetFragment region needs to be less than the size of the base length sequencing kit you're using. For example if you use a 2 x 100 PE kit, and you require at least 20 bases of overlap from Read 1 and Read 2, your insert fragments cannot be larger than 180 bases in length.

As stated above, the P5 --- Index/Barcode1 --- Read 1 Primer, and Read 2 Primer --- Index/Barcode2 --- P7 add about 120-130 bases of length onto your insert fragment (depending on the size of the index barcodes and type of read 1 and read 2 primers you have chosen).

Keep in mind that others, myself included, have observed that the efficiency and success rate of the clustering step is significantly reduced when a final library template molecule is <250 or >800 bases. Thus, make sure the sum of the lengths falls within this range if possible. Quantitation between deletion/insertion alleles that straddle the upper or lower limit cannot be trusted for reproducibility between different library preps (just an FYI, personal observation).

-Tom
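The 2 x 100 arithmetic above can be written out as a small Python sketch. Both function names are hypothetical, and the overlap requirement only matters for applications that actually need the two reads to overlap (for example, merging reads from an amplicon).

```python
def max_insert_for_overlap(read_len, min_overlap):
    """Largest insert for which two reads of read_len share at least min_overlap bases."""
    return 2 * read_len - min_overlap


def read_overlap(read_len, insert_size):
    """Number of insert bases covered by both reads.

    A negative value is the size of the unsequenced gap between the reads.
    If the insert is shorter than one read, both reads cover the whole
    insert, so the overlap is capped at insert_size.
    """
    return min(insert_size, 2 * read_len - insert_size)


print(max_insert_for_overlap(100, 20))  # 2 x 100 kit, 20 bases of overlap
print(read_overlap(100, 180))           # overlap at that maximum insert
print(read_overlap(75, 250))            # the 2 x 75 example: negative, i.e. a gap
```

For the original 2 x 75 example with a 250 bp insert, the result is a 100-base gap, which is the mate inner distance discussed earlier in the thread.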

**kmcarr** · 09-24-2013, 09:45 AM

Originally posted by thomasblomquist:

P5 --- Index/Barcode1 --- Read 1 Primer --- Insert/TargetFragment --- Read 2 Primer --- Index/Barcode2 --- P7

The Insert/TargetFragment region needs to be less than the size of the base length sequencing kit you're using. For example if you use a 2 x 100 PE kit, and you require at least 20 bases of overlap from Read 1 and Read 2, your insert fragments cannot be larger than 180 bases in length.

It is not required that the two reads overlap. For most applications you do not, in fact, want them to overlap and thus want an insert size larger than 2x read length.

Keep in mind that others, myself included, have observed that the efficiency and success rate of the clustering step is significantly reduced when a final library template molecule is <250 or >800 bases. Thus, make sure the sum of the lengths falls within this range if possible.
-Tom

Having found over the years a metric crap-ton of adapter dimers (120 bp fragment size) in read data where none were visible in the Bioanalyzer trace of the library, I would say that fragments ≤ 150bp cluster and amplify efficiently as hell.

**thomasblomquist** · 09-24-2013, 10:21 AM

Originally posted by kmcarr:

It is not required that the two reads overlap. For most applications you do not, in fact, want them to overlap and thus want an insert size larger than 2x read length.

Correct, I did not place the "if you need overlap" qualifier.

Originally posted by kmcarr:

Having found over the years a metric crap-ton of adapter dimers (120 bp fragment size) in read data where none were visible in the Bioanalyzer trace of the library, I would say that fragments ≤ 150bp cluster and amplify efficiently as hell.

LMAO. Yes, they do indeed cluster. I think, and I'm just surmising here, that the adapter dimers (ssDNA) heterodimerize with actual target template (dsDNA). My evidence comes from my amplicon libraries: when I stop the library-prep PCR in early cycles, just as the target-size peak starts to appear on the Bioanalyzer electropherogram, and then size-extract that peak, I get virtually no primer/adapter dimers sequenced. However, as the target peak approaches plateau in PCR, the dimer peak starts to diminish a bit, and my thought is that the adapter dimers are annealing non-specifically to target-specific templates. These complexes electrophorese on the Bioanalyzer at or around the target size and, in a non-denaturing size-based extraction, get pulled into the final library. In these latter cases, where the PCR-based library prep overshoots the cycle number, I see a ton of adapter or read 1/2 primer dimer products formed.

As for ligation-type approaches, my assumption is that it is fairly easy to accidentally denature and reanneal a complex library afterwards, so that the adapter/read primer dimers become heterodimerized with other large complexes.

The key then is to pull out the ssDNA that is the target length. PAGE purification? But yield tends to be too low.

Thus, I tend to aim for a minimal number of PCR cycles and keep the prepped library cool to minimize this issue.

Good point to bring up! :-)

-Tom

**mohiuddinbdfh** · 10-01-2013, 08:16 AM

Hi,
I am a newbie in metagenomics. I just sequenced my soil DNA samples on an Illumina HiSeq 2000 (2x151 bp). Now I need to assemble my sequences, and for that I need the insert size, i.e. the minimum and maximum distance between the paired reads. I asked the sequencing facility about this, but they sent me the Bioanalyzer result, which looks complicated to me. I have attached the Bioanalyzer result here. I would appreciate it if anyone could explain it.

Thanks

**ymc** · 10-11-2014, 12:19 AM

Is this a case of running 2x250 on a MiSeq but getting paired-end reads of 150-350 bp?

Child with meningoencephalitis: untreated CSF - SRA - NCBI

http://www.ncbi.nlm.nih.gov/sra/?term=SRR1145846

**sunguk** · 02-05-2017, 10:36 PM

Originally posted by mohiuddinbdfh:

Hi,
I am a newbie in metagenomics. I just sequenced my soil DNA samples on an Illumina HiSeq 2000 (2x151 bp). Now I need to assemble my sequences, and for that I need the insert size, i.e. the minimum and maximum distance between the paired reads. I asked the sequencing facility about this, but they sent me the Bioanalyzer result, which looks complicated to me. I have attached the Bioanalyzer result here. I would appreciate it if anyone could explain it.

Thanks

It may show the size of your DNA before sequencing. By looking at the DNA sizes, we can check for contaminants. The facility then fragments the DNA and sequences it, and later you get the sequencing data.
By the way, your data looks strange.
Also, this result has nothing to do with the Illumina library insert size.
Generally, the insert size is around 180-350 bp.
You had better BLAST both reads of a pair (same read ID) and check the insert size manually.
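One alternative to checking pairs by hand, assuming you can map the reads to a reference or a draft assembly, is to read the observed template length from the alignments. In SAM format this is column 9 (TLEN), which for genomic DNA approximates the adapter-free insert size (the 250 bp convention discussed earlier). Below is a minimal sketch in plain Python; the function name `template_lengths` is hypothetical and the SAM records are made up for illustration.

```python
import statistics


def template_lengths(sam_lines):
    """Yield |TLEN| once per properly paired template from SAM text lines."""
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        tlen = int(fields[8])  # column 9: signed observed template length
        proper_pair = flag & 0x2
        first_in_pair = flag & 0x40  # count each pair only once
        secondary_or_supplementary = flag & (0x100 | 0x800)
        if proper_pair and first_in_pair and not secondary_or_supplementary and tlen != 0:
            yield abs(tlen)


# Made-up alignments, for illustration only (SEQ and QUAL given as "*").
example_sam = [
    "@HD\tVN:1.6\tSO:unsorted",
    "r1\t99\tcontig1\t100\t60\t75M\t=\t275\t250\t*\t*",
    "r1\t147\tcontig1\t275\t60\t75M\t=\t100\t-250\t*\t*",
    "r2\t83\tcontig1\t485\t60\t75M\t=\t260\t-300\t*\t*",
    "r2\t163\tcontig1\t260\t60\t75M\t=\t485\t300\t*\t*",
    "r3\t99\tcontig1\t900\t60\t75M\t=\t1095\t270\t*\t*",
    "r3\t147\tcontig1\t1095\t60\t75M\t=\t900\t-270\t*\t*",
]

lengths = list(template_lengths(example_sam))
print("median insert size:", statistics.median(lengths))
print("min/max:", min(lengths), max(lengths))
```

On real data you would feed in the lines of a SAM file and look at the whole distribution rather than three pairs. For spliced RNA-seq alignments TLEN also spans introns, so it would overestimate the insert size there.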


The insert-size in paired-end data
