  • BSmoothed data analysis

    Hello,
    I am a bioinformatician working at the University of Pennsylvania and I recently started analyzing bisulfite-seq data. I have a library for condition A that was paired-end sequenced in 2 different lanes, and the same for condition B (i.e. 4 FASTQ files per condition).
    I used Bismark to align the reads and make the methylation calls. I then used BSmooth to smooth the data and call differentially methylated regions (DMRs). When I plot the smoothed data, I see a few discrepancies that I am not sure how to interpret.
    1) I see straight lines (i.e. constant methylation %) in the smoothed plots of a lot of DMRs, leading me to believe they are outliers of some kind. How do I know if these are real?
    2) I see a lot of variability in the methylation % between the same library sequenced in different lanes. Is this a consequence of BSmooth/Bismark? Has anyone seen something similar happen?
    3) Are there defined stringencies/cutoffs used for calling intergenic vs promoter DMRs at any point in the BSmooth pipeline?
    Any help would be much appreciated!

  • #2
    1) You'd need to show an example, though straight lines typically occur where there's no data. Always plot raw signals along with smoothed signals so you know how realistic the results are.
    2) You should really merge these before extracting methylation calls. Lower coverage leads to increased variability, and treating the lanes as separate samples leads to an improperly inflated N.
    3) Not that I know of, though perhaps someone else will reply with some.
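
To illustrate point 2, here is a minimal sketch of what merging lanes does to per-CpG counts. It is not Bismark or BSmooth code: the function names, the position-to-counts dictionaries, and the read counts are all hypothetical, and the binomial standard error is only a rough stand-in for sampling variability.

```python
import math


def merge_lanes(lanes):
    """Sum methylated and total read counts per CpG position across lanes.

    `lanes` is a list of dicts mapping position -> (methylated, coverage).
    This layout is an assumption made for the example only.
    """
    merged = {}
    for lane in lanes:
        for pos, (meth, cov) in lane.items():
            m, c = merged.get(pos, (0, 0))
            merged[pos] = (m + meth, c + cov)
    return merged


def binomial_se(meth, cov):
    """Approximate standard error of a methylation proportion at one CpG."""
    p = meth / cov
    return math.sqrt(p * (1 - p) / cov)


# Hypothetical counts for the same library sequenced on two lanes.
lane1 = {1000: (3, 4), 1050: (1, 5)}
lane2 = {1000: (5, 8), 1050: (2, 3), 1100: (6, 6)}

merged = merge_lanes([lane1, lane2])
for pos in sorted(merged):
    meth, cov = merged[pos]
    print(pos, meth, cov, round(meth / cov, 3), round(binomial_se(meth, cov), 3))
```

With these made-up numbers, the merged estimate at each shared position rests on more reads than either lane alone, so its standard error is smaller. That is the sense in which per-lane estimates from the same library look more variable than they should.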

    • #3
      1) I have attached an image file with one of the examples. If straight lines occur where there is no data, why do they show a high methylation %? Does that mean only a few points are driving this plot? Thanks for the raw signal advice, I will do that and see how it looks.

      2) I did think of this, but if I merge the lanes then BSmooth gives me errors when computing its t-statistic. After merging I end up with only 2 files (one per condition), which is not enough for BSmooth to compute the t-statistic.

      Thanks!

      • #4
        If you only have two files then you shouldn't be using BSmooth. As is, any results you get are unreliable, and if you try to publish the results they should be rejected. Your experiment can only provide pilot data for future experiments and nothing else.
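
The reason two files are not enough can be sketched with a generic two-sample t-statistic. This is Welch's formula from a textbook, not BSmooth's own implementation, and the methylation values are invented for illustration. The point is only that a t-statistic needs a within-group variance, which cannot be estimated from a single sample per group.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Welch's t-statistic for two groups of per-sample methylation values."""
    # statistics.variance raises StatisticsError with fewer than two values.
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    se = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / se


# Hypothetical methylation levels at one CpG, one sample per condition.
try:
    welch_t([0.80], [0.55])
except statistics.StatisticsError as err:
    print("cannot compute a t-statistic:", err)

# With (hypothetical) replicates in each condition the statistic is defined.
print(welch_t([0.80, 0.78, 0.83], [0.55, 0.60, 0.52]))
```

Splitting one library across lanes would make the first call run, but the variance it estimates would reflect only lane-to-lane sampling noise rather than variation between samples.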


        • #5
          Well, we do have 3 more replicates on the way, but here is the problem with that. All the replicates were pooled into a single library, which was then sequenced in multiple lanes (the reason for that is beyond me, but so it is), and this experiment was to make sure I could get the bioinformatics to work. So right now I have (or will soon have) 5 datasets which all came from the SAME pooled library. Does it make sense to use BSmooth then? Or is this an exercise in futility, since I lost all the replicate information at the pooling step?


          • #6
            This is an exercise in futility.


            • #7
              Yeah I thought the same but they wanted me to try anyway. Thanks a lot!
