**GERALD** · 01-29-2011, 05:19 PM

bareback

Actually, I have tried this myself and found it to be true. I wrote a Perl script to copy all the files and rename them (called it goatfooler). Then I ran CASAVA and used another script to copy the tags and qscores back to the front of each call. After re-calling them, my base calling went from utter failure to complete success. I also tried the undocumented --image-flags option and, just as you described, it didn't work very well. My Illumina rep was utterly baffled by my results. It would be really nice if Illumina provided more documentation of how they do their basecalling. I'm glad to hear that someone else obtained similar results from their analysis.

**C.R.** · 04-05-2011, 03:53 AM

I strongly agree. This is a big problem and Illumina does not pay attention to it. In general my libraries are OK, since they worked in one test run on a Genome Analyzer. Now I have had 5 RRBS samples sequenced on a HiScanSQ, but all reads are trash due to the problem that is nicely described in your paper. Illumina tech support has not helped so far. For more than a week now they have only kept telling us that there was no technical problem during sequencing. Well, this is true, because the control lane and the 2 ChIP-Seq lanes are OK. Unfortunately, it seems that no high-resolution images were recorded, so I cannot use your software. Thank you very much for your helpful comments so far, Felix!
Is there anybody else who can tell me what needs to be considered for a successful Illumina HiSeq / HiScanSQ sequencing of RRBS libraries?

**NextGenSeq** · 04-27-2011, 10:28 AM

We just had this same issue with our HiSeq 2000. How can we reanalyze this without the image files? Can this be done using the CIF files?

**fkrueger** · 04-27-2011, 10:31 AM

I am afraid this won't work if you don't have the saved images. Did you lose entire lanes or just a certain fraction of them?

**NextGenSeq** · 04-27-2011, 10:56 AM

A fraction; the data quality drops off quickly after the barcode.

It's infuriating that Illumina has done nothing about this when they've known about this for years.

**HESmith** · 04-27-2011, 02:30 PM

I'll be the first to admit that Illumina has made some mistakes (for example, generating a file format that its aligner cannot read), and they could do a better job of advertising the issue, but the decision not to save the image files seems a reasonable trade-off (although it would be nice to have the option to save). Transferring the images to the server had become the bottleneck for sequencing runs, and the problem was exacerbated when they rolled out the HiSeq. There are a couple of straightforward non-computational solutions: use custom sequencing primers if there's no diversity, or design multiple balanced barcodes for each sample to introduce diversity.
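To make the "balanced barcodes" idea concrete, here is a minimal sketch that reports the base composition at each barcode position and flags positions dominated by a single base. The function names and the 0.5 cutoff are assumptions chosen for illustration; they are not an Illumina specification.

```python
from collections import Counter


def per_cycle_composition(barcodes):
    """Return, for each barcode position, the fraction of A/C/G/T across the set."""
    length = len(barcodes[0])
    if any(len(b) != length for b in barcodes):
        raise ValueError("all barcodes must have the same length")
    composition = []
    for pos in range(length):
        counts = Counter(b[pos] for b in barcodes)
        composition.append(
            {base: counts.get(base, 0) / len(barcodes) for base in "ACGT"}
        )
    return composition


def low_diversity_cycles(barcodes, max_fraction=0.5):
    """List 1-based positions where one base exceeds max_fraction.

    max_fraction is an arbitrary illustrative cutoff, not a vendor threshold.
    """
    return [
        i + 1
        for i, fractions in enumerate(per_cycle_composition(barcodes))
        if max(fractions.values()) > max_fraction
    ]


balanced = ["ACGT", "CGTA", "GTAC", "TACG"]   # every base at every position
unbalanced = ["ACGT", "ACGA"]                 # identical in the first 3 positions
print(low_diversity_cycles(balanced))
print(low_diversity_cycles(unbalanced))
```

This treats every barcode as equally abundant; if samples are pooled unevenly, the counts would need to be weighted by the expected read share of each barcode.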

**protist** · 04-28-2011, 03:42 AM

Has anyone tried the "Configurable Template Generation Cycles" option in the new SCS2.9/RTA1.9 when running indexed samples on a GAIIx? It allows deferred cluster calling for low-complexity or in-adapter barcoded samples. We have got the script from our FAS but have not tried it yet... wondering if there is anyone out there who has?

From SCS2.9/RTA1.9 Release notes:
Configurable Template Generation Cycles: The SCS CIF file generation feature cannot start until RTA has generated the tile templates. This takes 5 cycles after the declared template generation cycle.
Normally template generation begins on cycle 1 and ends on cycle 5. However, template generation requires a diversity of bases in the clusters of the template generation cycles. Some users have custom sample preparation procedures that place arbitrary sequences on the clusters, adapters or indexing "spikes", etc. The required diversity of bases may not be present in this case, and it is possible to delay template generation until the actual sample is being sequenced.
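As a rough illustration of what "delaying template generation" means in terms of cycle numbers, the sketch below finds the first cycle from which a full 5-cycle window avoids all known low-diversity cycles. The function name is hypothetical and this is not how SCS/RTA is actually configured; it only works through the arithmetic implied by the release notes.

```python
def first_diverse_window(low_diversity_cycles, total_cycles, window=5):
    """Return the first 1-based cycle where `window` consecutive cycles
    contain no low-diversity cycle, or None if no such window exists.

    window=5 follows the release notes (template generation spans 5 cycles).
    """
    low = set(low_diversity_cycles)
    for start in range(1, total_cycles - window + 2):
        if all(cycle not in low for cycle in range(start, start + window)):
            return start
    return None


# Assumed example: a 7-cycle in-line barcode at the start of an 81-cycle read.
print(first_diverse_window(range(1, 8), total_cycles=81))
```

With no low-diversity cycles the function returns cycle 1, which matches the default behaviour described in the notes.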

**fkrueger** · 04-28-2011, 04:30 AM

Originally posted by protist

Has anyone tried the "Configurable Template Generation Cycles" option in the new SCS2.9/RTA1.9 when running indexed samples on a GAIIx? It allows deferred cluster calling for low-complexity or in-adapter barcoded samples. We have got the script from our FAS but have not tried it yet... wondering if there is anyone out there who has?

I would also be interested if anyone had used this "new" option. After talking to our Illumina rep we don't have any reason to believe that the "Configurable Template Generation Cycles" option is any different from the previous unofficial option "--image-flags". Thus, I would imagine that the basecalls would still suffer from mysteriously bad qualities, see the Supplementary Figures linked in the first post of this thread.

Not quite but I also think that this option can only be applied to the entire flowcell and not on a per-lane basis, right?

**DNAANDDAN** · 06-06-2011, 07:43 PM

how about PE data

Hi, I have the same issue with my data. However, my data are paired-end Solexa reads: cycles 1-81 are read 1 and cycles 82-162 are read 2, and cycles 1-7 and 82-88 are barcodes with low diversity.
Could bareback handle this kind of data?

**fkrueger** · 06-06-2011, 11:05 PM

Hi Lan,

Yes, in theory bareback-processing should be able to handle this kind of data. Cluster coordinates are determined for read 1 only, so it will be sufficient if you shuffle the first 7 bp of read 1 towards the back and leave read 2 untouched (the bareback-script will do just that).
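The shuffle itself is simple bookkeeping. The sketch below is not the bareback script; it is a minimal illustration with hypothetical function names, showing the cycle order after moving the first low-diversity cycles of read 1 to the back, and how a base-called read would be put back into its original order afterwards.

```python
def shuffled_cycle_order(read1_len, n_low_div):
    """Return the 1-based read 1 cycles with the first n_low_div moved to the back."""
    cycles = list(range(1, read1_len + 1))
    return cycles[n_low_div:] + cycles[:n_low_div]


def restore_read(called_seq, called_qual, n_low_div):
    """Undo the shuffle for one base-called read 1.

    The last n_low_div bases (and their qualities) were originally the
    first cycles, so they are moved back to the front.
    """
    if n_low_div == 0:
        return called_seq, called_qual
    seq = called_seq[-n_low_div:] + called_seq[:-n_low_div]
    qual = called_qual[-n_low_div:] + called_qual[:-n_low_div]
    return seq, qual


# Layout from the question above: 81-cycle read 1 with a 7-cycle barcode.
order = shuffled_cycle_order(81, 7)
print(order[:3], order[-7:])

# Toy read: original "ACG" prefix was called last after shuffling.
print(restore_read("TTTGGGGACG", "IIIIIIIHHH", 3))
```

Read 2 cycles (82-162 in the example) are left in place, as described above, because cluster coordinates come from read 1 only.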

Good luck!

**Horacio G** · 09-21-2011, 12:54 PM

First try on low-diversity libraries

Hi guys,

I'm trying to run my first flow cell on a GAIIx with low-diversity libraries. I'm still not sure whether to go ahead and save the images and do the post-analysis with Bareback (my Illumina rep does not encourage that alternative) or to use the delayed template generation. However, with the latter I don't know if I'll get an early report about the quality of the run (i.e. focusing, intensities).
Any suggestions would be greatly appreciated.

Horacio

**fkrueger** · 09-21-2011, 02:35 PM

Hi Horacio,

Why am I not surprised that your rep does not recommend anything but using the standard pipeline... If you've got the option to save the images I would definitely vote for that. If you still have the images you can choose to use the standard pipeline, use --image-flags (which is the Illumina deferred cluster calling option) or even bareback processing. However, if you don't save images you will have to go with whatever the standard analysis pipeline gives you, and this can be shocking: 0 sequences in the worst-case scenario, which we experienced several times. This depends heavily on your experimental setup, the number of low-diversity sequences, the cluster density and so on.

If you have further questions don't hesitate to ask via email.

Best,
Felix

**pmiguel** · 09-22-2011, 05:10 AM

Originally posted by HESmith

[...] the decision not to save the image files seems a reasonable trade off (although it would be nice to have the option to save). Transferring the images to the server had become the bottleneck for sequencing runs, and the problem was exacerbated when they rolled out the HiSeq. [...]

There is an option to save the images. We tried it out on a recent run, using the standard HiSeq run software and v3 chemistry: 6.24 TB of TIFFs for a 2x101+7 run (PE + index). That was only 1 surface of one flow cell though, so it would be 2x or 4x more for a HiSeq 1000 or HiSeq 2000. Also, we save the runs to an offsite server during the run -- not the console machine itself.

What? You don't have 25 TBs handy to store image data?
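For anyone budgeting disk space, the arithmetic behind that 25 TB figure can be sketched as below. It assumes image volume scales linearly with the number of imaged surfaces and with cycle count, which is my reading of the numbers above rather than a measured result; the variable names are made up for the example.

```python
# Figures taken from the run described above.
TB_PER_SURFACE = 6.24          # TIFFs for one surface of one flow cell
CYCLES = 2 * 101 + 7           # paired-end 101 + 7-cycle index read

# Assumption: storage scales linearly with surfaces imaged.
for label, surfaces in [("1 surface", 1), ("2x", 2), ("4x", 4)]:
    print(f"{label}: {TB_PER_SURFACE * surfaces:.2f} TB")

# Rough per-cycle cost for one surface, in GB (1 TB taken as 1000 GB).
gb_per_cycle = TB_PER_SURFACE * 1000 / CYCLES
print(f"{CYCLES} cycles, about {gb_per_cycle:.1f} GB per cycle per surface")
```

The 4x case comes to just under 25 TB, which is where the number in the post comes from.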

What are you going to do with it? You can tell the instrument console (a Dell server running Windows Vista) to reprocess the image data. But that is going to be a slow process. You probably don't want to tie up your instrument that long reprocessing a run. Maybe clone the console server into a virtual machine and run it off-site?

--
Phillip

**fkrueger** · 09-22-2011, 05:21 AM

Thanks for this piece of information, Phillip. So far the general consensus seemed to be that it is absolutely impossible to store image data (apart from thumbnails) from the HiSeq (and probably also the HiScan then) at all. Storing this amount of data, let alone reprocessing a whole flowcell (which would likely take a couple of days), is a whole different matter, though...

Loss of data in low-diversity libraries can be recovered by deferred cluster calling
