**mastal** · 06-17-2013, 07:32 AM

RNASeq: Read length different from expected

The read length of 101 bp instead of 100 bp is probably because the operator thought there were enough reagents left to sequence the extra base.

Presumably the run that gave the 101 bp reads had some problems that resulted in poorer base qualities than the run or lane that gave the 100 bp reads.

**GenoMax** · 06-17-2013, 07:34 AM

That is odd. I do not think one can set up lanes on a single flowcell to run for a different number of cycles.

Have you checked with your provider to confirm whether all the samples ran together (were they multiplexed?) or whether there was a batch difference (2 separate runs?) that could account for your observation? I think it would probably be the latter case.

**mastal** · 06-17-2013, 07:41 AM

The read identifiers for each sample should tell you if they were on the same flow cell or lane.
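As a rough sketch of that check, the snippet below pulls the run, flow cell, and lane out of a FASTQ header. It assumes the seven-field, colon-separated layout written by CASAVA 1.8 and later (`instrument:run:flowcell:lane:tile:x:y`); older pipelines wrote a different, shorter identifier that this function deliberately rejects. The function name, the `ReadOrigin` type, and the example header are made up for illustration.

```python
from typing import NamedTuple, Optional


class ReadOrigin(NamedTuple):
    instrument: str
    run_number: str
    flowcell: str
    lane: str


def parse_read_id(header: str) -> Optional[ReadOrigin]:
    """Extract run, flow cell, and lane from a CASAVA 1.8+ style FASTQ header.

    Returns None when the header does not have the seven-field layout
    assumed here (for example, older Illumina identifiers).
    """
    # Drop the leading '@' and anything after the first space.
    name = header.lstrip("@").split(" ", 1)[0]
    fields = name.split(":")
    if len(fields) != 7:
        return None
    instrument, run_number, flowcell, lane = fields[:4]
    return ReadOrigin(instrument, run_number, flowcell, lane)


# Made-up header, for illustration only.
example = "@SIM:1:FCX:1:15:6329:1045 1:N:0:2"
origin = parse_read_id(example)
print(origin.flowcell, origin.lane)
```

Comparing the `flowcell` and `lane` values from the first read of each sample file should be enough to tell whether two samples shared a run.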

**GenoMax** · 06-17-2013, 07:49 AM

Originally posted by mastal

The read identifiers for each sample should tell you if they were on the same flow cell or lane.

Great point. This should be easy to check.

**kmcarr** · 06-17-2013, 01:24 PM

It is somewhat standard practice for Illumina sequencing that if you want read lengths of N bases you run N+1 cycles. This has to do with the way base calling works on Illumina; to properly call the base at position n in a read you need data from cycle n+1. The last base in a read will always have a lower Q-score, reflecting the added uncertainty in the base call. To mitigate this you run N+1 cycles but just report N bases per read, dropping the last, low-quality base. This practice is ingrained in the Illumina run recipes; the standard PE100 recipe (with indexing) on the HiSeq runs 209 cycles (101 + 7 + 101), adding an extra cycle each to read 1, the index read and read 2. (Interestingly, the MiSeq recipes still add the extra cycle to reads 1 and 2 but not to the index read.)
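The cycle arithmetic above can be written out as a small helper. This is only a sketch: the function name is hypothetical, and the 6-base index length is inferred from the 7 index cycles quoted for the HiSeq recipe rather than stated in the post. The second call applies the "no extra index cycle" rule to the same read and index lengths purely for comparison.

```python
def total_cycles(read_length: int, index_length: int, extra_index_cycle: bool) -> int:
    """Total cycles for a paired-end run with a single index read.

    Each of the two reads gets one extra cycle; the index read gets one
    only when extra_index_cycle is True.
    """
    per_read = read_length + 1
    index = index_length + (1 if extra_index_cycle else 0)
    return per_read + index + per_read


# PE100 with an assumed 6-base index, extra cycle on the index read too.
print(total_cycles(100, 6, extra_index_cycle=True))   # 209
# Same lengths, but no extra cycle on the index read.
print(total_cycles(100, 6, extra_index_cycle=False))  # 208
```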

Some core labs simply call and report all 101 cycles, some stick to the original practice of clipping the last base. It may be that your samples were all run together on the same flow cell for 2x101 cycles but for one set of 20 they reported all 101 cycles and the other they clipped the last base. You need to check the IDs of your samples to identify the flow cell and lanes used for each.
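If it turns out that one set was simply reported with all 101 cycles, the two sets can be brought to the same length by clipping the final base yourself. The following is a minimal sketch, assuming plain four-line FASTQ records with no wrapped sequence lines; all function names are hypothetical, and the short reads in the example are invented to keep it readable.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple

FastqRecord = Tuple[str, str, str, str]  # header, sequence, '+' line, qualities


def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield four-line FASTQ records; an incomplete trailing record is dropped."""
    stripped = (line.rstrip("\n") for line in lines)
    # zip over the same iterator four times groups consecutive lines in fours.
    yield from zip(stripped, stripped, stripped, stripped)


def clip_to_length(record: FastqRecord, max_length: int) -> FastqRecord:
    """Trim sequence and quality strings to at most max_length bases."""
    header, seq, plus, qual = record
    return (header, seq[:max_length], plus, qual[:max_length])


def length_counts(records: Iterable[FastqRecord]) -> Counter:
    """Count how many reads there are of each length."""
    return Counter(len(seq) for _, seq, _, _ in records)


# Invented toy reads: one with 5 bases, one with 4.
toy = ["@r1\n", "ACGTA\n", "+\n", "IIIII\n", "@r2\n", "ACGT\n", "+\n", "IIII\n"]
records = list(read_fastq(toy))
clipped = [clip_to_length(r, 4) for r in records]
print(length_counts(records))
print(length_counts(clipped))
```

For real data, `max_length` would be 100, and tallying lengths before and after clipping is a quick way to confirm that both sets now match.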

**gogodidi** · 06-17-2013, 10:30 PM

Originally posted by mastal

The read identifiers for each sample should tell you if they were on the same flow cell or lane.

Indeed, they were!
Thank you (Don't know why it didn't occur to me).

**gogodidi** · 06-17-2013, 10:31 PM

Originally posted by kmcarr

It is somewhat standard practice for Illumina sequencing that if you want read lengths of N bases you run N+1 cycles. This has to do with the way base calling works on Illumina; to properly call the base at position n in a read you need data from cycle n+1. The last base in a read will always have a lower Q-score, reflecting the added uncertainty in the base call. To mitigate this you run N+1 cycles but just report N bases per read, dropping the last, low-quality base. This practice is ingrained in the Illumina run recipes; the standard PE100 recipe (with indexing) on the HiSeq runs 209 cycles (101 + 7 + 101), adding an extra cycle each to read 1, the index read and read 2. (Interestingly, the MiSeq recipes still add the extra cycle to reads 1 and 2 but not to the index read.)

Some core labs simply call and report all 101 cycles, some stick to the original practice of clipping the last base. It may be that your samples were all run together on the same flow cell for 2x101 cycles but for one set of 20 they reported all 101 cycles and the other they clipped the last base. You need to check the IDs of your samples to identify the flow cell and lanes used for each.

Thank you for this information.
It is strange, then, that they cut it for one lane and not for the other.
I'll have to call them to ask about that.

Thank you so much, all of you!
