
  • Agilent Methyl-Seq - enrichment + BS + sequencing

    Dear all, long time listener first time caller here.
    I've got some Agilent Methyl-Seq data and I'm looking for feedback from anyone with experience with this kind of data, i.e. enrichment for regions of interest, then bisulfite (BS) conversion, then sequencing. I've been using Bismark (great, thanks very much Babraham!) and have just given Trim Galore! a whirl. Any tips on parameters, trimming etc.? I've only got about 66% mapping efficiency and would like to improve on that.

  • #2
    Hi Elsie,

    66% mapping efficiency doesn't sound too bad, but could you give us some more information about your data? For example: the read length, whether it was single-end or paired-end, the mapping parameters used, and the number of sequences that did not map at all or mapped ambiguously (this should be stated in the mapping report).


    • #3
      Hi fkrueger,
      thanks for the reply.
      100bp reads, paired-end, default Bismark + Trim Galore settings other than specifying paired-end, and I used the default directional mode.
      Sequence pairs with no alignments under any condition: 13154097
      Sequence pairs did not map uniquely: 871370
      Maybe this is as good as it gets? First time with NGS data so I don't have any feeling yet for good/bad data.


      • #4
        Compared to the sequences that did not align at all, only a small number of sequences did not map uniquely, which suggests that you don't have a problem with highly repetitive sequences (which you probably wouldn't expect from a sequence capture anyway). For shotgun BS-Seq data one can typically expect around 75-80% mapping efficiency with 100bp paired-end reads; I'm not quite sure how this figure is affected by the Agilent sequence capture though.
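
To put rough numbers on that comparison, here is a minimal sketch using the two counts from post #3. It assumes that the mapping efficiency is the fraction of uniquely aligned pairs out of all pairs analysed, so that everything else is either "no alignment" or "not unique" (any other small categories in the report are ignored), and it treats the quoted "about 66%" as exactly 0.66. The function name and dictionary keys are made up for illustration.

```python
def summarize_mapping(no_alignment, ambiguous, efficiency):
    """Estimate the total number of pairs and per-category fractions.

    Assumes: efficiency = unique pairs / total pairs, so the remaining
    (1 - efficiency) is made up of no-alignment and ambiguous pairs only.
    """
    not_unique = no_alignment + ambiguous
    total = not_unique / (1.0 - efficiency)
    return {
        "estimated_total_pairs": round(total),
        "no_alignment_fraction": no_alignment / total,
        "ambiguous_fraction": ambiguous / total,
        "ambiguous_share_of_failures": ambiguous / not_unique,
    }


# Counts quoted in post #3; 0.66 is the approximate efficiency from post #1.
summary = summarize_mapping(13154097, 871370, 0.66)
for name, value in summary.items():
    print(f"{name}: {value:,.4f}" if isinstance(value, float) else f"{name}: {value:,}")
```

Under those assumptions, ambiguously mapping pairs are only a few percent of the pairs that failed to map uniquely; the bulk simply found no alignment at all.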

        The most common problems with low mapping efficiency for paired-end sequencing (apart from quality and adapter issues) are either that the sequenced fragments are getting so short that both reads of the pair completely contain each other or that the specified insert size is too small (controlled by the -X parameter, 500bp is the default). Trim Galore has an option '--trim1' to avoid the former case:

        -t/--trim1           Trims 1 bp off every read from its 3' end. 
                             This may be needed for FastQ files that 
                             are to be aligned as paired-end data with 
                             Bowtie. This is because Bowtie (1) regards
                             alignments like this:
                             R1 --------------------------->
                             R2 <---------------------------
                             as invalid (whenever a start/end coordinate
                             is contained within the other read).
        We have seen in the past that simply including '--trim1' for 100bp paired-end reads increased the mapping efficiency of a sample with fairly small insert sizes from ~49% to 78%!
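
As a toy illustration of why removing a single base can matter, the sketch below models a read pair as two intervals on the fragment and checks whether one read lies entirely within the other. This is a simplified model, not Bowtie's actual validity check: it assumes adapter trimming has already cut each read down to the fragment length, and that '--trim1' removes one base from the 3' end of each read. All function names are hypothetical.

```python
def read_intervals(fragment_len, read_len, trim1=False):
    """Return half-open (start, end) fragment coordinates covered by R1 and R2.

    R1 starts at the left end of the fragment and reads rightwards;
    R2 starts at the right end and reads leftwards.
    """
    covered = min(read_len, fragment_len)  # reads cannot extend past the fragment
    if trim1:
        covered -= 1  # one base removed from each read's 3' end
    r1 = (0, covered)                            # 3' end of R1 is its right edge
    r2 = (fragment_len - covered, fragment_len)  # 3' end of R2 is its left edge
    return r1, r2


def pair_is_contained(fragment_len, read_len, trim1=False):
    """True if either read's interval lies entirely within the other's."""
    (s1, e1), (s2, e2) = read_intervals(fragment_len, read_len, trim1)
    return (s2 <= s1 and e1 <= e2) or (s1 <= s2 and e2 <= e1)


for fragment_len in (80, 100, 150):
    print(
        fragment_len,
        pair_is_contained(fragment_len, 100),
        pair_is_contained(fragment_len, 100, trim1=True),
    )
```

In this model, fragments no longer than the read length give two reads covering identical coordinates, and trimming one base from each 3' end staggers them so neither contains the other.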

        Just as a side note: if you find that a lot of your fragments are overlapping in the middle you should use the methylation_extractor option '--no_overlap' to avoid having a coverage bias in overlapping parts of the read. I hope this helps.
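
Putting the suggestions in this post together, a rough sketch of the three command lines might look like the following. It only assembles and prints the commands, it does not run them. The file names and genome folder are placeholders, the intermediate file names depend on the Trim Galore and Bismark versions in use, and the flags should be checked against the help output of your installed versions.

```python
import shlex


def trim_galore_cmd(read1, read2, trim1=True):
    """Paired-end trimming, optionally removing 1 bp from every 3' end."""
    cmd = ["trim_galore", "--paired"]
    if trim1:
        cmd.append("--trim1")
    return cmd + [read1, read2]


def bismark_cmd(genome_folder, read1, read2, max_insert=500):
    """Paired-end Bismark alignment; -X sets the maximum insert size."""
    return ["bismark", "-X", str(max_insert), genome_folder, "-1", read1, "-2", read2]


def extractor_cmd(alignment_file, no_overlap=True):
    """Paired-end methylation extraction, skipping overlapping read parts."""
    cmd = ["bismark_methylation_extractor", "-p"]
    if no_overlap:
        cmd.append("--no_overlap")
    return cmd + [alignment_file]


# All paths below are hypothetical placeholders.
commands = [
    trim_galore_cmd("sample_R1.fastq", "sample_R2.fastq"),
    bismark_cmd("/path/to/genome_folder", "trimmed_R1.fq", "trimmed_R2.fq"),
    extractor_cmd("bismark_paired_end_output.sam"),
]
for cmd in commands:
    print(shlex.join(cmd))
```

Raising `max_insert` above 500 would be the place to experiment if the library is suspected to contain longer fragments.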


        • #5
          Hi Felix,
          thanks very much for your suggestions.
          Just tried --trim1 and now have 0% mapping efficiency, argh!
          I emailed Agilent to ask what they do with this data (their datasheet suggests Bismark). They said to try GeneSpring, even though GeneSpring cannot yet cope with this sort of data!
          thanks for your help.


          • #6
            Then there seems to be something going very wrong... Could you maybe email me the details (precise commands) of what you have done so I can try and assist you further?


            • #7
              Methyl-Seq Analysis

              Hi all,

              I too am having trouble with Methyl-Seq analysis.

              I was wondering how many samples you multiplexed for Methyl-Seq? I have been trying to pool 4 samples and thought this may be the cause of my problems.

              I am using Bismark but my initial % on target is 7% which I hope cannot be correct!

              Any help would be greatly appreciated,



              • #8
                Hi mariebreen

                Sorry, I can't help with the pooling question; I just received the data with minimal background information (it is more a test of the technology than anything else). I found the comments from Felix extremely helpful and basically followed his suggestions and the Bismark manual. FYI, GeneSpring can now cope with this sort of data, so maybe you could give that a whirl and see what your results look like?


