**NGSfan** · 07-26-2010, 07:20 AM

Originally posted by cbrennan:

Hello all,

Has anyone out there done multiplexing on HiSeq without using the Illumina Multiplex kits? We're getting ready to upgrade to HiSeq & I'm a little concerned about how to handle multiplex lanes.

For GAIIx data we use the standard kits (no indexing read), I run OLB from images on a subset of tiles past the barcode, generate a crosstalk matrix & then use that to run basecalling.

However, after reviewing the HiSeq docs, I don't see any way to do that with the .bcl files or BCL converter. Anyone have experience with this?

Hi cbrennan!

Thanks for making me aware of the changes in the new HiSeq/HiScanSQ software. I have also been wondering how to handle self-barcoded samples in the new software as we wait for the new machine.

Btw - your approach to generating a crosstalk matrix outside of the barcode with a few tiles is something I would like to try. I run many barcoded samples and would like to drop using an unbarcoded control lane.

Could you share your approach? For example, how many tiles / cycles do you use? Do you just have RTA output a few tiles from your run?

I am sure many of us would find it useful!!

**kmcarr** · 07-26-2010, 01:54 PM

Christine,

There is new information that affects the ability to run OLB in this thread.

**NGSfan** · 07-28-2010, 07:18 AM

I looked up the HiSeq and HiScanSQ manuals and both machines will output BCL and CIF files - so offline basecalling is still possible.

The manuals mention that images are deleted off the machine by *default*. They are stored and processed on a local hard drive (1.5TB). I'm not sure how many cycles it holds before it starts deleting. My guess is you can probably change the default setting to save images to a network drive? Does anyone have first-hand experience with a HiSeq/HiScanSQ? We are still waiting to get ours. Illumina will probably not provide any technical support if you play around with images (and Firecrest will probably no longer be included in OLB), but that is just my guess!! If anyone else knows, please chime in.

Speaking of crosstalk and phasing matrices, is it possible to change, from Bustard.py, the estimateCrosstalk -c <cycle> value so that it uses a cycle outside of the barcode? (Say my barcode is 5bp long, and I want to use cycle 6 for crosstalk matrix estimation.) Also, does that mean that estimateCrosstalk uses only one cycle, or is the -c option just the starting cycle?

Do I need to change the estimatePhasing as well?

**cbrennan** · 07-30-2010, 10:20 AM

I think the options you are looking for are included in the --image_flags options of bustard.py. If you run ./bustard.py --image_flags=help you should get a printout of the available flags for that version of the pipeline. Those are undocumented options that Illumina doesn't advertise. Our tech support person at Illumina didn't even know they existed, so use at your own risk.

Perhaps what you are looking for is --image_flags='--first-detection-cycle x'. I think that moves all the decision making to start at cycle 'x', but I'm not sure. Perhaps your tech support person knows a little more about it than ours.

As for HiSeq - from what I've heard, HiSeq never takes pictures, it scans the flowcell. I've been told 4 scans per lane, so that's 30 tiles per scan. Unless you write your own software to process the scans, I'm not sure saving would be any help. I am a bit concerned about that, but I'm not sure there's anything we can do about it --- ah progress.

Our HiSeqs have been on order for about a month. Hopefully they will get here soon and we can start testing some of the possibilities.

**NGSfan** · 08-02-2010, 05:25 AM

Originally posted by cbrennan:

I think the options you are looking for are included in the --image_flags options of bustard.py. If you run ./bustard.py --image_flags=help you should get a printout of the available flags for that version of the pipeline. Those are undocumented options that Illumina doesn't advertise. Our tech support person at Illumina didn't even know they existed, so use at your own risk.
Perhaps what you are looking for is --image_flags='--first-detection-cycle x'. I think that moves all the decision making to start at cycle 'x', but I'm not sure. Perhaps your tech support person knows a little more about it than ours.

Thanks for the tip! I am familiar with that one. It is great when you are saving the images; however, I found that it only helps with cluster identification (identifying the spots on the tile) during image analysis (Firecrest), not with the basecalling parameterization (Bustard).

This is the command I use:
goat_pipeline.py --image_flags="--nd=4 --first-detection-cycle=5"

When I have more than 4 barcodes, then I only see a 6% increase in clusters when I do this - so there is not much return given the extra effort.

But my question is about changing the cycle at which *basecalling parameterization* starts: that is currently not something that is easily passed on to the downstream programs like estimateCrosstalk and estimatePhasing.

In theory, you shouldn't have to do any image analysis to handle barcoded fragments. The user should be allowed to train their basecalling matrices on cycles "around/after" the barcodes and then apply them back to all cycles later.

The question is, how can I do this easily? Do I just change some lines in the Makefile.basecalling.config that bustard produces?

Originally posted by cbrennan:

As for HiSeq - from what I've heard, HiSeq never takes pictures, it scans the flowcell. I've been told 4 scans per lane, so that's 30 tiles per scan. Unless you write your own software to process the scans, I'm not sure saving would be any help. I am a bit concerned about that, but I'm not sure there's anything we can do about it --- ah progress.

Technically, it still "takes pictures", in the sense that it stores them and analyzes them while the machine continues to sequence. The RTA software is practically the same as the offline software with a few exceptions. It looks to me like they basically figured out how to have the offline software "run in the background" during the image capturing and flow cell chemistry cycles, processing the image data as it is made and deleting it after analysis. It is not truly done "on the fly", in the sense that the images are not held in RAM and processed near-instantaneously.

Btw, I found that the HiSeq manual mentions that the images are actually stored transiently on local hard drives of the HiSeq machine (and the HiScanSQ, which is the machine our facility will get soon):

The output from a sequencing run is a set of quality-scored base call files
(*.bcl files), which are generated from the raw image files and contain the
base calls per cycle. By default, images are deleted from the instrument
computer after image analysis. The raw image data are not needed for
downstream analysis, and in fact the two 1.5 TB hard drives can only store
images from approximately 20 cycles.
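
As a rough back-of-the-envelope reading of that passage, the implied image volume per cycle can be worked out directly. This is only a sketch: it assumes the "approximately 20 cycles" figure refers to both drives combined and that decimal units apply (1 TB = 1000 GB); neither is stated in the excerpt.

```python
# Figures taken from the manual excerpt quoted above.
DRIVES = 2
DRIVE_CAPACITY_TB = 1.5
CYCLES_STORED = 20  # "approximately 20 cycles"

total_tb = DRIVES * DRIVE_CAPACITY_TB
tb_per_cycle = total_tb / CYCLES_STORED
gb_per_cycle = tb_per_cycle * 1000  # assumption: decimal units, 1 TB = 1000 GB

print(f"{total_tb} TB total, roughly {gb_per_cycle:.0f} GB of images per cycle")
```

So under those assumptions a full run's images would not fit locally, which is consistent with the default of deleting them after analysis.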

I know in the RTA config file, one can tell RTA to save images, even by lane and tile number:

<add key="ImageFilter" value="1_5,1_15,1_25,2_5,2_15,2_25" />

So perhaps a few tiles could be saved and analyzed... maybe...
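
If someone wants to try this on more than a handful of tiles, the value string could be generated instead of typed by hand. A minimal sketch, assuming the format is simply comma-separated lane_tile pairs as in the line above (inferred from that one example, not checked against the RTA documentation); the function names are hypothetical:

```python
def image_filter_value(lanes, tiles):
    """Build a comma-separated list of lane_tile entries, e.g. '1_5,1_15'."""
    return ",".join(f"{lane}_{tile}" for lane in lanes for tile in tiles)


def image_filter_element(lanes, tiles):
    """Wrap the value in the same <add key=... /> form shown in the post above."""
    value = image_filter_value(lanes, tiles)
    return f'<add key="ImageFilter" value="{value}" />'


# Reproduces the example line: tiles 5, 15 and 25 from lanes 1 and 2.
print(image_filter_element(lanes=[1, 2], tiles=[5, 15, 25]))
```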

**zero** · 08-26-2010, 08:47 PM

If you experience difficulties with the base calling using Bustard due to barcoding, you could try using BayesCall instead. We had a NimbleGen array capture tag of 20bp on our samples and Bustard totally messed up the base calling. Using BayesCall on the intensity text files (GAIIx, pipeline 1.5.1), the runs could be salvaged and we actually got excellent data (even better after removing the first 20 intensity values). So you could do the base calling on the full reads to find the barcode for each cluster and then chop off the barcode from the intensity files and call only those with BayesCall, which would give best results. You will get more reads, better quality and mapping scores, as well as reduced error rates.
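
The first half of that workflow (find the barcode for each cluster, then chop it off) can be sketched at the sequence level. This is only an illustration under assumptions: the barcode sequences and sample names are made up, all barcodes are assumed to have the same length and sit at the start of the read, and nothing here touches the intensity files or BayesCall itself. The same barcode length would then tell you how many leading cycles to drop from each intensity record.

```python
def split_by_barcode(reads, barcodes, max_mismatches=0):
    """Assign each read to a sample by its leading bases and trim the barcode.

    reads: iterable of sequence strings
    barcodes: dict mapping barcode sequence -> sample name (equal lengths)
    Returns a dict sample -> list of trimmed reads; reads that match no
    barcode, or more than one, are kept untrimmed under the key None.
    """
    lengths = {len(bc) for bc in barcodes}
    if len(lengths) != 1:
        raise ValueError("all barcodes must have the same length")
    k = lengths.pop()

    assigned = {name: [] for name in barcodes.values()}
    assigned[None] = []
    for read in reads:
        prefix = read[:k]
        hits = [
            name
            for bc, name in barcodes.items()
            if len(prefix) == k
            and sum(a != b for a, b in zip(prefix, bc)) <= max_mismatches
        ]
        if len(hits) == 1:
            assigned[hits[0]].append(read[k:])  # drop the barcode bases
        else:
            assigned[None].append(read)
    return assigned


# Hypothetical 5 bp barcodes and reads, for illustration only.
example = split_by_barcode(
    ["ACGTAGGGTTT", "TGCATCCCAAA", "NNNNNAAAA"],
    {"ACGTA": "sample1", "TGCAT": "sample2"},
)
print(example)
```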

So I am all for keeping those images!

**BGould** · 03-23-2011, 11:49 AM

Related HiSeq barcode issues?

Hi All,
I'm attempting to run barcoded libraries multiplexed on the HiSeq platform, and I have had multiple failed runs. I have been trying to follow your thread here, but I'm relatively new to NGS and am not sure if the issue I am having might be related.

I am running RNA-seq paired-end libraries constructed with NEB reagents and 3 bp barcodes. The barcodes are as different as possible at every base. So far I have tried multiplexing just two libraries in one lane. A lot of raw reads are generated but filter down to almost nothing when going through the pre-filter on the platform.
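
One quick check on a barcode set like this is to look at both the pairwise differences and the base mix at each barcode cycle. The sketch below uses two made-up 3 bp barcodes (the actual sequences are not given in this post) and assumes equal pooling. It illustrates that with only two libraries in a lane, at most two of the four bases can appear at any barcode cycle, however different the barcodes are from each other. Whether that is what is causing the failed runs here is not established; it is just something that is easy to rule in or out.

```python
from collections import Counter
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


def per_cycle_composition(barcodes):
    """Base counts at each barcode cycle, assuming libraries are pooled equally."""
    return [Counter(column) for column in zip(*barcodes)]


barcodes = ["ACG", "TGA"]  # hypothetical barcodes, not the poster's

print("minimum pairwise distance:", min_pairwise_distance(barcodes))
for cycle, counts in enumerate(per_cycle_composition(barcodes), start=1):
    print(f"cycle {cycle}: {dict(counts)}")
```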

If any glaring errors or incompatibilities jump out at you here I'd love to hear back. At this point I'm not sure if I need new libraries, new barcodes, a new platform, or what.

Any ideas would be much appreciated!!
Thanks,
BG

**NextGenSeq** · 03-23-2011, 01:57 PM

I've seen absolutely horrible libraries made with self-made custom barcodes.

Buy the Epicentre multiplex RNA-Seq kit and save yourself aggravation.

**greigite** · 03-24-2011, 10:51 AM

Originally posted by NextGenSeq:

I've seen absolutely horrible libraries made with self-made custom barcodes.

Buy the Epicentre multiplex RNA-Seq kit and save yourself aggravation.

NGS, what do you think is the main reason self-made barcoded libraries fail? I just had a crappy HiSeq run using the UCD Genome Core set of 12 inline barcodes. I quantified all libraries post-PCR with PicoGreen and Bioanalyzer prior to pooling, but still got quite uneven representation and very low read yields despite fairly even base representation in the first 4 cycles.

The problem with Illumina's kits is that you can only multiplex 12 at a time. Do you think the solution is to incorporate custom barcodes into the TruSeq adapters to make use of the indexing read concept with more samples per lane?

Multiplex on HiSeq?