  • yog77
    Member
    • Jun 2011
    • 18

    Bisulphite sequencing on Illumina Paired End 100bp reads

    Hi all, I'm new on here. I'm after advice concerning 100bp PE reads for bisulphite sequencing:

    I would like to sequence directly from 200bp PCR fragments as I am hoping to index 96 samples in a single lane.

    Is it possible to sequence many (~150) PCR amplicons that are all 200bp long using PE 100bp? I was hoping to generate similarly sized bisulphite PCR amplicons (~200bp) so there would be no need for size selection, and this is a reliably obtainable size for bisulphite PCR.

    Also, I really have no other option but to use PCR products, as I plan to do bisulphite sequencing and my source material would be bisulphite PCR products. These bisulphite PCR amplicon sequences are to be aligned to a small genomic region (a large gene) where all my amplicons will come from (a 300kb region of bisulphite-converted sequence, which will be used specifically for the alignment), and some of these 200bp amplicons will overlap with one another (say, where I was interested in a stretch of 3kb or so and covered it with a number of 200bp PCR amplicons).
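A bisulphite-converted reference of the kind described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `bisulfite_convert` and the example sequence are made up, conversion is assumed to be complete (every C read as T), and the bottom strand is represented in top-strand coordinates as a G-to-A change.

```python
def bisulfite_convert(seq: str, strand: str = "top") -> str:
    """Fully convert a reference in silico.

    top    -> every C becomes T
    bottom -> every G becomes A (bottom-strand conversion written
              in top-strand coordinates)
    """
    seq = seq.upper()
    if strand == "top":
        return seq.replace("C", "T")
    if strand == "bottom":
        return seq.replace("G", "A")
    raise ValueError("strand must be 'top' or 'bottom'")


reference = "ACGTTCGGACCA"  # hypothetical sequence, not from the thread
print(bisulfite_convert(reference, "top"))     # ATGTTTGGATTA
print(bisulfite_convert(reference, "bottom"))  # ACATTCAAACCA
```

Reads would be converted the same way before alignment, and methylation is then called by comparing the unconverted read against the unconverted reference at each cytosine.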

    In essence, I want as much read length (2 x 100bp) from the 200bp PCR amplicons as possible, to look at methylated cytosines and SNPs throughout the full length of the same amplicon. So my thinking is that there will be no unsequenced gap between the two 100bp reads, as I want data on the whole (or as much as possible) of the 200bp PCR amplicon. Is this possible to achieve?

    Also, I have heard it is a problem bioinformatically if, with 100bp PE reads, the reads from either end overlap (i.e. when sequencing from either end inwards), because then you can't align them easily or you get over-represented sequences?

    Thanks guys
  • volks
    Member
    • Jun 2010
    • 80

    #2
    so you want to sequence 150 amplicons in 96 samples? you would have to do a lot of primer design, optimization and pipetting.

    why don't you go for 454 sequencing? you would obtain fewer (but still enough) and longer reads at a cheaper price.

    alignment is not influenced by overlapping PE reads. and if you are concerned about possible biases you can soft-clip the overlap later on.
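To see how much overlap there would actually be to soft-clip, here is a rough sketch of the arithmetic. The function name `pair_overlap` is made up for illustration, and it assumes both mates have the same length and that any read-through into adapter has already been trimmed.

```python
def pair_overlap(fragment_len: int, read_len: int) -> int:
    """Number of bases covered by both mates of a read pair.

    Assumes equal-length mates sequenced inwards from each end of the
    fragment, with adapter read-through already trimmed away.
    """
    effective = min(read_len, fragment_len)  # a read cannot be longer than the fragment
    return max(0, 2 * effective - fragment_len)


for frag in (250, 200, 150, 80):
    print(frag, pair_overlap(frag, 100))
# 250 0
# 200 0
# 150 50
# 80 80
```

For a 200bp amplicon and 2 x 100bp reads the mates just meet in the middle, so there is nothing to clip. Shorter amplicons would overlap, and counting methylation calls from both mates in that overlap would count the same molecule twice.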


    • frozenlyse
      Senior Member
      • Sep 2008
      • 135

      #3
      300kb target / 200bp amplicon == 1500 primer pairs just for one strand of bisulfite treated DNA (you need both strands to cover all possible SNPs). So 3000 primer pairs to cover both strands, times 96 samples == 288,000 unique amplicons... At 1 lane of a GAIIx (let's be generous and say 40M PF clusters) that is ~140x coverage per amplicon, which is a good ballpark to be in. But how are you going to do those 3000x96 PCRs, normalise the amount of products, index and mix?

      Have you considered doing sequence capture followed by bisulfite instead?

      Also aligning 2x100bp reads to a whole (human?) genome isn't really much of a problem, though there will always be holes in coverage due to regions of poor mappability.
      Last edited by frozenlyse; 07-06-2011, 04:55 PM.
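The back-of-envelope numbers in this post can be reproduced with a short sketch. The function name `reads_per_amplicon` is made up, and the calculation assumes non-overlapping amplicons tiled end to end and clusters spread perfectly evenly over every amplicon in every sample, which real PCR pools will not achieve.

```python
def reads_per_amplicon(target_bp, amplicon_bp, samples, clusters, strands=2):
    """Return (primer pairs, sample-specific amplicons, clusters per amplicon).

    Assumes non-overlapping tiling and perfectly even representation.
    """
    primer_pairs = (target_bp // amplicon_bp) * strands
    amplicons = primer_pairs * samples
    return primer_pairs, amplicons, clusters / amplicons


pairs, amplicons, depth = reads_per_amplicon(
    target_bp=300_000, amplicon_bp=200, samples=96, clusters=40_000_000
)
print(pairs, amplicons, round(depth))  # 3000 288000 139
```

Changing `target_bp` or `clusters` shows how quickly the per-amplicon depth moves with the size of the region or the yield of the lane.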


      • volks
        Member
        • Jun 2010
        • 80

        #4
        Originally posted by frozenlyse View Post
        Have you considered doing sequence capture followed by bisulfite instead?
        .. not so easy to retain complexity.

        i would suggest whole-genome MeDIP/MCIp or the like. on one HiSeq lane you could multiplex a couple of samples, and you would only have to sequence SE 50bp.


        • yog77
          Member
          • Jun 2011
          • 18

          #5
          Originally posted by frozenlyse View Post
          300kb target / 200bp amplicon == 1500 primer pairs just for one strand of bisulfite treated DNA (you need both strands to cover all possible SNPs). So 3000 primer pairs to cover both strands, times 96 samples == 288,000 unique amplicons... At 1 lane of a GAIIx (let's be generous and say 40M PF clusters) that is ~140x coverage per amplicon, which is a good ballpark to be in. But how are you going to do those 3000x96 PCRs, normalise the amount of products, index and mix?

          Have you considered doing sequence capture followed by bisulfite instead?

          Also aligning 2x100bp reads to a whole (human?) genome isn't really much of a problem, though there will always be holes in coverage due to regions of poor mappability.
          Thanks for the comments. I just realised I'd made a mistake in saying 300kb, sorry, an extra zero. I meant 30kb, so ~150 amplicons/primer pairs for one strand, which is manageable I would think.

          Capture isn't really an option, as I think there would be issues with complexity?

          Thanks


          • yog77
            Member
            • Jun 2011
            • 18

            #6
            Originally posted by volks View Post
            .. not so easy to retain complexity.

            i would suggest whole-genome MeDIP/MCIp or the like. on one HiSeq lane you could multiplex a couple of samples, and you would only have to sequence SE 50bp.
            We would really like to get individual CpG resolution data. MeDIP is going to be carried out by a colleague, so I want to do something different, with greater resolution, at a single candidate gene.


            • yog77
              Member
              • Jun 2011
              • 18

              #7
              Hi all, I have another question regarding this approach:

              Q2) A concern some of my colleagues have raised is that with a PCR approach we would get over-representation of the start and end of the reads (i.e. the start and end of the desired 200bp amplicon) and much lower, if not absent, coverage of the middle portion. Does anyone have any comments as to whether this would be the case and, if so, whether there are ways around it?


              • volks
                Member
                • Jun 2010
                • 80

                #8
                i thought you were going to sequence the whole pcr amplicon without fragmentation. why should this then happen?


                • yog77
                  Member
                  • Jun 2011
                  • 18

                  #9
                  Originally posted by volks View Post
                  i thought you were going to sequence the whole pcr amplicon without fragmentation. why should this then happen?
                  I guess we are worried that not all the amplicons are going to sequence at the same efficiency, and that there will be variable lengths of good quality data (i.e. not all reads passing QC up to base 100), leading to over-representation at either end of the 2 x 100bp?

                  Sorry, it's just that I'm not sure if any of that is likely, hence the question to the forum.


                  • frozenlyse
                    Senior Member
                    • Sep 2008
                    • 135

                    #10
                    Yeah, you are going to get more sequencing errors/lower quality bases at the 3' ends of your reads. However, with only a 30kb region in 96 samples you are going to have ~1000x coverage, so it probably won't be too much of an issue!

                    A couple of things to think about:
                    * which polymerase will you use in your PCRs? Phusion won't amplify from bisulfite DNA as it contains uracil. I think there is a low error rate enzyme out there which doesn't mind bisulfite, but I can't remember what it is off the top of my head
                    * has your sequencing facility handled bisulfite sequencing before? It really skews the base composition (obviously), so I'm not sure what (if any) adjustments have to be made for proper basecalling etc.
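To give a feel for how strongly conversion skews base composition, here is a toy sketch. The sequence is made up, `base_fractions` is a hypothetical helper, and the conversion assumes a completely unmethylated top-strand read where every C is read as T.

```python
from collections import Counter


def base_fractions(seq):
    """Fraction of A, C, G and T in a sequence (other symbols ignored)."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    return {b: counts[b] / total for b in "ACGT"}


original = "ACGTCCGATCGGCATC"            # hypothetical read, 16 bases
converted = original.replace("C", "T")   # assumes complete conversion, no methylation

print(base_fractions(original))   # C is 6/16 of the bases
print(base_fractions(converted))  # C drops to 0, T rises to 9/16
```

Real libraries keep some C at methylated positions, but the read as a whole is still very C-poor, which is why it is worth asking the facility about it before the run.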


                    • yog77
                      Member
                      • Jun 2011
                      • 18

                      #11
                      Originally posted by frozenlyse View Post
                      Yeah, you are going to get more sequencing errors/lower quality bases at the 3' ends of your reads. However, with only a 30kb region in 96 samples you are going to have ~1000x coverage, so it probably won't be too much of an issue!

                      A couple of things to think about:
                      * which polymerase will you use in your PCRs? Phusion won't amplify from bisulfite DNA as it contains uracil. I think there is a low error rate enzyme out there which doesn't mind bisulfite, but I can't remember what it is off the top of my head
                      * has your sequencing facility handled bisulfite sequencing before? It really skews the base composition (obviously), so I'm not sure what (if any) adjustments have to be made for proper basecalling etc.
                      Thanks for that, I will keep it in mind, as I was thinking of using Phusion. Maybe another option is the PfuTurbo Cx Hotstart DNA Polymerase or ZymoTaq. I will look into these, but if you have a suggestion please do let me know.


                      • giugiu81
                        Junior Member
                        • May 2012
                        • 1

                        #12
                        Hi,
                        I have a question: how did you design the 200 primer pairs for your study?
                        Thank you so much.
