
  • #16
    We have had great success using the NuGen library prep. Their adapters have inline barcodes, which add diversity in the first cycles and allow the sequences to pass filter. Once clusters have passed filter, the HiSeq can sequence the no- or low-diversity samples without any problems.



    • #17
      Just as another thought, if you could afford to spike in 90-95% gDNA, couldn't you also find an external sequencing facility that still runs GAIIx instruments and use the methods that work well on those?



      • #18
        I have seen bad batches of phiX that had fairly high levels of adapter dimers (a few percent, I think). Maybe you should make your own genomic DNA library to make sure your "diluent" is of high quality.

        You could also obtain your "diluent" by sub-contracting a sequencing job. Send an email to a prospective department (maybe one doing a lot of plant or fungal science) and offer a one-time-only discounted genome sequence. Our diluent was a sorghum genomic DNA library.

        --
        Phillip



        • #19
          Originally posted by HESmith:
          There's an alternative approach, assuming that you have not yet constructed the libraries. Design them so the junction is at the opposite end of the insert, and perform paired-end sequencing. Cluster calling is based only on the first five cycles of read one, so you'll avoid the low-complexity issue.

          I have a sample of 96-plex low-diversity amplicon libraries running now, and clusters were found just fine, but the low diversity is causing a tremendous discrepancy between the blue and the green box-and-whisker plots (raw clusters and clusters passing filter). I hope those data are recoverable at the end. Nothing in my primer design, barcoding, or indexing scheme can change the fact that it's "low complexity". First four bases were completely random, followed by eight different in-line barcodes.

          This is PE sequencing.

          Yet I know labs are making this work.



          • #20
            If the first four bases are random, then subsequent low complexity should not adversely affect cluster calling or data quality. Excessive cluster density is a possible culprit: what are your raw and PF values?



            • #21
              We do similar things to what you are describing all the time. A 20-30% PhiX spike (or any other library) should do the trick. PhiX is easy since it can be easily removed without an index and you can monitor the percent alignment as the run is going.

              Our most common case is a HiC or 5C library where we need to get through some T3 and T7 sequences that are common to all of the samples. We have used both ChIPseq libraries and PhiX spikes with very good results. Using ChIPseq libraries just means those reads serve some purpose, whereas PhiX reads are data thrown away.

              Add in a spike and lower cluster density on the HiSeq to the 500k to 600k range and you should be fine. If you want to avoid the spike altogether, lowering clusters to about 200k also works but with more variable results.
              HudsonAlpha Institute for Biotechnology
              http://www.hudsonalpha.org/gsl
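To make the effect of the spike-in suggested above concrete, here is a minimal sketch of the arithmetic. It assumes the spike-in library is perfectly balanced (25% of each base at every cycle), which real PhiX only approximates, and it models the worst case where every sample read shows the same base at a given cycle. The function and variable names are hypothetical, chosen for illustration only.

```python
def mixed_base_fractions(sample_fractions, spike_fractions, spike_share):
    """Base fractions at one cycle when a spike-in makes up `spike_share` of clusters.

    Each *_fractions argument maps a base (A, C, G, T) to its fraction at that
    cycle; missing bases count as 0.
    """
    if not 0.0 <= spike_share <= 1.0:
        raise ValueError("spike_share must be between 0 and 1")
    return {
        base: (1.0 - spike_share) * sample_fractions.get(base, 0.0)
        + spike_share * spike_fractions.get(base, 0.0)
        for base in "ACGT"
    }


# Assumption: an idealised, perfectly balanced spike-in library.
BALANCED = {base: 0.25 for base in "ACGT"}
# Worst case for the sample: every read shows T at this cycle.
ALL_T = {"T": 1.0}

for share in (0.0, 0.05, 0.20, 0.30):
    mix = mixed_base_fractions(ALL_T, BALANCED, share)
    summary = ", ".join(f"{base}={frac:.3f}" for base, frac in mix.items())
    print(f"spike {share:.0%}: {summary}")
```

Under these assumptions, a 30% spike brings the dominant base down from 100% to 77.5% and raises each of the other three bases from 0% to 7.5%. Whether that is enough for a given instrument and software version is an empirical question that this sketch does not answer.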



              • #22
                For what it's worth, we sequenced two lanes on a HiSeq (v2 flow cell) containing eleven 7 bp inline barcodes with a 5% phiX spike-in. Despite the low diversity visible in the FastQC plots for the first 7 bp, our cluster density and percentage of clusters passing filter were comparable to other lanes on the same run that did not have any low-complexity issues.



                • #23
                  Originally posted by HESmith:
                  If the first four bases are random, then subsequent low complexity should not adversely affect cluster calling or data quality. Excessive cluster density is a possible culprit: what are your raw and PF values?
                  Semantically "random" does not mean "high complexity". See.

                  The point being that we presume "random" here means each of the four positions was a random mix of ACGT, and therefore an even mixture of all 256 possible sequences. But we would need to know how these were generated to be sure.

                  --
                  Phillip



                  • #24
                    random (adj.) - lacking a definite plan, purpose, or pattern (emphasis added).

                    The poster stated that the "first four bases were completely random". I assumed (not "presumed") he meant what he wrote :-).

                    Semantically, "high complexity" does not mean base composition diversity, which is the relevant issue for cluster calling. A library that consists solely of AAAAA, CCCCC, GGGGG, or TTTTT starts (in roughly equal amounts) would suffice, yet (almost) no one would argue that this constitutes high complexity.

                    Apologies if this message comes across as cranky, Phillip. I was just trying my best to help the poster, and don't see how your comments contribute to the solution.

                    -Harold
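The distinction drawn above between sequence complexity and per-cycle base composition can be checked with a few lines of code. This is only an illustrative sketch: the two toy libraries and the helper name `per_cycle_fractions` are made up for the example, and it says nothing about how the instrument software actually evaluates diversity.

```python
from collections import Counter


def per_cycle_fractions(reads, n_cycles):
    """Return one dict per cycle mapping each base to its fraction across reads."""
    fractions = []
    for cycle in range(n_cycles):
        counts = Counter(read[cycle] for read in reads)
        total = sum(counts.values())
        fractions.append({base: counts.get(base, 0) / total for base in "ACGT"})
    return fractions


# Toy library 1: only four distinct sequences, in equal amounts.
homopolymer_lib = ["AAAAA", "CCCCC", "GGGGG", "TTTTT"] * 250
# Toy library 2: a single sequence shared by every read.
single_seq_lib = ["ACGTA"] * 1000

for name, lib in (("homopolymer", homopolymer_lib), ("single", single_seq_lib)):
    print(f"{name}: {len(set(lib))} distinct sequence(s)")
    for cycle, fracs in enumerate(per_cycle_fractions(lib, 5), start=1):
        print(f"  cycle {cycle}: {fracs}")
```

The homopolymer library has only four distinct sequences, yet every cycle shows 25% of each base. The single-sequence library shows one base at 100% in every cycle, which is the situation that matters for cluster calling.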



                    • #25
                      Hi Harold,
                      Despite the simplicity of what we are discussing here, I think there are ambiguities. I agree your interpretation is likely the correct one. But randomly choosing 5 bases once and prefixing all the reads in a lane with that random sequence would lead to failure of the cluster calling software. That is all I meant. That might seem ludicrous, but I have seen experiments fail because of misunderstandings just as ludicrous.

                      But, yeah, that might be sufficiently unlikely that my bringing it up was just distracting, not illuminating. (Also, it could be the malign influence of xkcd forcing my hand to create that link back to it...)

                      --
                      Phillip
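The two readings of "random" discussed above can be compared with a small simulation. This is a hedged sketch, not a model of the sequencer: it assumes each base at a degenerate position is drawn uniformly and independently, which real oligo synthesis only approximates, and the names and read count are arbitrary choices for illustration.

```python
import random
from collections import Counter


def max_base_fraction_per_cycle(prefixes):
    """For each cycle, return the fraction of reads showing the most common base."""
    n_reads = len(prefixes)
    length = len(prefixes[0])
    return [
        max(Counter(prefix[cycle] for prefix in prefixes).values()) / n_reads
        for cycle in range(length)
    ]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_reads = 10_000
prefix_len = 5

# Reading 1: one random sequence is chosen once and put on every read.
shared = "".join(rng.choice("ACGT") for _ in range(prefix_len))
shared_prefixes = [shared] * n_reads

# Reading 2: every molecule gets its own random prefix (an "N" mix at each position).
per_molecule_prefixes = [
    "".join(rng.choice("ACGT") for _ in range(prefix_len)) for _ in range(n_reads)
]

print("shared prefix:      ", max_base_fraction_per_cycle(shared_prefixes))
print("per-molecule prefix:", max_base_fraction_per_cycle(per_molecule_prefixes))
print("possible 4-base prefixes:", 4 ** 4)
```

With a single shared prefix the most common base is at 100% in every cycle, however "randomly" that prefix was picked. With per-molecule random prefixes the most common base stays close to 25% in each cycle.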



                      • #26
                        Originally posted by pmiguel:
                        Semantically "random" does not mean "high complexity". See.

                        The point being that we presume "random" here means each of the four positions was a random mix of ACGT, and therefore an even mixture of all 256 possible sequences. But we would need to know how these were generated to be sure.

                        --
                        Phillip
                        Harold and Phillip,
                        I am the poster ("she", not "he", btw) who used the phrase "First four bases were completely random". The 4 random bases were generated by ordering my oligos with "NNNN" where the read is supposed to begin. I did not generate a single "random" sequence to use. Back when I used to synthesize oligos myself, we achieved randomness by mixing reagents into a single bottle that went on the instrument along with A, C, G, T. I don't know what Invitrogen or IDT do these days. (Does anybody else go back to the Maxam-Gilbert sequencing days, pre-PCR?)

                        To update, it appears that I got a reasonable number of reads surviving up until the HiSeq lost focus partway through read 3. I'll try the lower cluster density and phiX or shotgun library spike in next time.

                        Thanks, all.

                        Hilary



                        • #27
                          Hi Hilary,

                          Apologies for using the incorrect gender, and sorry to hear about the focusing error. Better luck next time.

                          Harold



                          • #28
                            Would running low diversity libraries at a low concentration not help solve the problem?
                            If you are not looking for a large number of reads, then running at a low concentration should mean less chance of overlapping clusters and more reads passing filter.



                            • #29
                              Yes it helps. But from my limited experience, the catastrophic failures come from focusing issues -- where the instrument sees a blank flow cell surface and de-focuses as it attempts to "find" the clusters it expects.

                              Again, recent firmware upgrades may have mitigated this particular issue. I am particularly paranoid about it because we only recently got an Illumina sequencer and our particular model is an outlier. So problems probably get solved for the HiSeqs first -- those particular to a HiScanSQ would be noticed and fixed later in most cases.

                              --
                              Phillip



                              • #30
                                Originally posted by HMorrison:
                                Harold and Phillip,
                                To update, it appears that I got a reasonable number of reads surviving up until the HiSeq lost focus partway through read 3. I'll try the lower cluster density and phiX or shotgun library spike in next time.

                                Thanks, all.

                                Hilary
                                The loss of focus during read 3 is more likely due to a bubble in the fluidics than to a diversity problem. If you got that far with good PF clusters, good base quality, and a good, flat FWHM metric, the diversity of the library isn't the problem. It is very likely a fluidics issue, and one you should raise with your FAS, as the affected lane would potentially be eligible for a warranty replacement.

                                It is also worth checking whether your HiSeq (assuming it is a HiSeq instrument; if not, this may not apply) has had the new solenoid valves installed. They help prevent, but do not eliminate, the bubble issues. I don't know what Illumina is calling the new valves, but your FSE or FAS will know.
                                HudsonAlpha Institute for Biotechnology
                                http://www.hudsonalpha.org/gsl
