  • Strong CpG methylation bias between R1 and R2

    Hello,
    I'm new to bioinformatics, and as a first training exercise I was given a set of WGBS 100 bp PE reads from a few human cancer tissues.
    I filtered the reads with prinseq, sorted them, and aligned them with Bismark in PE mode to hg38 from UCSC (prepared with Bismark).
    Mapping efficiency is ~20%, with ~80% of C's methylated in CpG context.
    OK, the low mappability of reads from BS-treated DNA has been mentioned many times.
    Then I tried mapping reads 1 and 2 separately in SE mode.
    Read 1: mapping efficiency ~60%, with ~80% of C's methylated in CpG context.
    Read 2: mapping efficiency ~50%, with ~40% of C's methylated in CpG context.
    Additional trimming of 10-20 nt from either end of read 2 slightly increases mappability but doesn't affect the methylation rate.
    This result seems extremely odd to me.
    If the DNA was treated with BS, how can it be that only read 2 of the pair shows 2X less methylation in CpG context?
    Can anybody take a fresh look?
    Thank you in advance.
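
    For reference, here is a toy sketch of why this looks odd. It assumes a directional library and a fragment from the original top strand, with both mates spanning the whole fragment and already placed on the reference (no aligner involved). The sequence, the methylated positions, and all function names are made up for illustration. Under those assumptions, read 1 reports methylation as C vs. T and read 2 as G vs. A, and both should give the same CpG methylation level.

```python
# Toy model of a directional bisulfite library (illustrative assumptions only).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def bisulfite_convert(strand, methylated_positions):
    """Unmethylated C becomes T; methylated C is protected and stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(strand)
    )


def cpg_calls_read1(read1, reference):
    """Read 1 is in reference orientation: C = methylated, T = unmethylated."""
    meth = unmeth = 0
    for i in range(len(read1) - 1):
        if reference[i:i + 2] == "CG":
            if read1[i] == "C":
                meth += 1
            elif read1[i] == "T":
                unmeth += 1
    return meth, unmeth


def cpg_calls_read2(read2, reference_rc):
    """Read 2 is the reverse complement: the G of each CG in the
    reverse-complemented reference pairs with the top-strand C,
    so G = methylated and A = unmethylated."""
    meth = unmeth = 0
    for i in range(1, len(read2)):
        if reference_rc[i - 1:i + 1] == "CG":
            if read2[i] == "G":
                meth += 1
            elif read2[i] == "A":
                unmeth += 1
    return meth, unmeth


# Hypothetical fragment: four CpGs (C at 2, 6, 12, 16), three of them methylated.
fragment = "TACGATCGTTCACGGACGT"
methylated = {2, 6, 12}

converted = bisulfite_convert(fragment, methylated)
read1 = converted
read2 = reverse_complement(converted)

calls_r1 = cpg_calls_read1(read1, fragment)
calls_r2 = cpg_calls_read2(read2, reverse_complement(fragment))
print("read 1 (methylated, unmethylated):", calls_r1)
print("read 2 (methylated, unmethylated):", calls_r2)
```

    In this toy case both mates report 3 methylated and 1 unmethylated CpG, so a two-fold difference between read 1 and read 2 is not what simple conversion chemistry alone would predict.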

  • #2
    Do you have the following information?
    1- Kit or method used for library prep
    2- Read length
    3- Library peak size
    4- FastQC output for reads


    • #3
      This is what I could extract from core lab personnel:

      1- Kit or method used for library prep

      Genomic DNA was extracted from tissue, BS-treated, sonicated, end-repaired, and dA-tailed. Then standard Illumina adaptors were used for PE sequencing.

      2- Read length

      100 bases (adaptors already trimmed)

      3- Library peak size

      ~200 nt

      4- FastQC output for reads

      Sorry, I can't attach a picture right now, but the FastQC report is good for all reads: median quality is 30 at the 5' end and ~15 at the 3' end. I also performed quality trimming with a quality threshold of 15.


      • #4
        Generally, there are three WGBS library prep methods:
        1- Post-ligation bisulfite conversion: DNA fragmentation and standard library preparation with methylated adapters, followed by bisulfite conversion and amplification.
        2- Post-bisulfite conversion library preparation by second-strand synthesis of converted ssDNA, followed by standard end repair, A-tailing, adapter ligation, and PCR amplification of the double-stranded DNA.
        3- Post-bisulfite conversion library preparation by synthesis of the second strand with random primers carrying one partial Illumina adapter sequence, then tagging the 3’ end of the new strand with a Terminal Tagging Oligo carrying the other partial Illumina adapter, followed by PCR amplification.

        I assume your library was prepared with method 1. A library peak size of 200 would, on average, correspond to an insert size of about 75 nt, so I would expect that a large number of reads have been trimmed at the 5’ end.
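
        To spell out the arithmetic behind that estimate, here is a minimal sketch. The combined adapter length of 125 nt is an assumption implied by the 200 to 75 figure above, not a value given in this thread, and the function names are only for illustration.

```python
def insert_size(library_peak_nt, adapter_total_nt):
    """Insert length = library fragment length minus both adapters."""
    return library_peak_nt - adapter_total_nt


def read_through(read_length_nt, insert_nt):
    """Bases of a read that run past the insert into the adapter."""
    return max(0, read_length_nt - insert_nt)


def mate_overlap(read_length_nt, insert_nt):
    """Insert bases covered by both mates of a pair."""
    return max(0, min(insert_nt, 2 * read_length_nt - insert_nt))


ADAPTER_TOTAL_NT = 125  # assumption: both adapters together
READ_LENGTH_NT = 100    # from the thread
LIBRARY_PEAK_NT = 200   # from the thread

insert = insert_size(LIBRARY_PEAK_NT, ADAPTER_TOTAL_NT)
print("insert size:", insert)
print("adapter read-through per read:", read_through(READ_LENGTH_NT, insert))
print("bases covered by both mates:", mate_overlap(READ_LENGTH_NT, insert))
```

        With these numbers each 100-base read would extend about 25 bases beyond a 75 nt insert, and the two mates would cover the same insert bases.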

        It would be interesting to see the FastQC “per base sequence content” plot for the reads; it should show a similar proportion of converted Cs in both. For an example, see the following plots for a low-diversity RRBS library, which show low %C in R1 and correspondingly low %G in R2. If your plots show similar C and G levels, then the issue could be in the analysis step.
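
        If the FastQC plots are not at hand, the same numbers can be approximated directly from the reads. This is a rough sketch, not a FastQC replacement: it assumes plain four-line FASTQ records, and the function names and toy reads are made up for illustration.

```python
from collections import Counter


def read_sequences(fastq_lines):
    """Yield the sequence line of each record (assumes 4 lines per record)."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:
            yield line.strip().upper()


def per_base_content(sequences):
    """Return one dict per read position with the percentage of each base."""
    counts = []
    for seq in sequences:
        for pos, base in enumerate(seq):
            if pos == len(counts):
                counts.append(Counter())
            counts[pos][base] += 1
    content = []
    for c in counts:
        total = sum(c.values())
        content.append({b: 100.0 * c[b] / total for b in "ACGTN"})
    return content


def mean_percent(content, base):
    """Average percentage of one base over all read positions."""
    if not content:
        return 0.0
    return sum(p[base] for p in content) / len(content)


# Toy R1 records; with real data, pass an open file handle instead, e.g.
#   with open("reads_R1.fastq") as fh:   # hypothetical file name
#       content = per_base_content(read_sequences(fh))
toy_r1 = [
    "@read1", "TTAGGTTAGG", "+", "IIIIIIIIII",
    "@read2", "TTGGATTCGG", "+", "IIIIIIIIII",
]
content_r1 = per_base_content(read_sequences(toy_r1))
print("mean %C in R1:", mean_percent(content_r1, "C"))
print("mean %G in R1:", mean_percent(content_r1, "G"))
```

        Running the same functions on R2 and comparing mean %C in R1 with mean %G in R2 gives a quick view of whether conversion looks similar in both mates.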

        RRBS.pdf


        • #5
          Something in this description seems wrong. After bisulfite conversion the DNA should be (mostly) single-stranded, since bisulfite conversion requires single-stranded DNA. Thus standard end repair, A-tailing, and ligation of Illumina Y-adapters will not work.

          Originally posted by zubr
          This is what I could extract from core lab personnel:

          1- Kit or method used for library prep

          Genomic DNA was extracted from tissue, BS-treated, sonicated, end-repaired, and dA-tailed. Then standard Illumina adaptors were used for PE sequencing.
