
  • #16
    In Group A, I have 12 samples: the first 4 (embryonic time points) can be categorised into one batch, the next 4 (day 3.5) into a second batch, and the last 4 (day 5.5) into a third.

    E.g.: X1, X2, X3, X4, X5, X6, X7, X8, X9, X10, X11, X12

    In Group B, I have 4 samples: 2 samples (day 3.5) and another 2 (day 5.5).
    E.g.: S1, S2, S3, S4

    The samples S1 and S3 (wild types) from Group B can act as replicates of samples X5 and X9 (wild types) in Group A.



    • #17
      Be prepared, there are probably going to be a number of semantic questions back and forth.

      So, when you say, for example, that samples X1 through X4 can be categorised as a batch, you mean not only that they're from a single developmental time point but also that the library creation/sequencing/etc. was done in a single batch (separate from the other samples). Correct? Also, were all of the red colored samples processed as a single batch or only those within each group (I assume the answer will apply to the green colored samples as well)? It's too bad there's no obvious way to make tables on this forum, that'd be really helpful here!

      The next question is which conditions you want to compare. There are quite a few possible comparisons that one could make with your experimental design; it's just a question of determining which of them are confounded by a batch effect.



      • #18
        I have tried to answer more carefully and clearly this time!

        In Group A, the 12 samples were processed as a single batch (same library-preparation protocol, paired-end sequencing on the same Illumina platform, and mapping with the same parameters). Group B was mapped with the same parameters as Group A, but its library preparation and Illumina sequencing (single-end) differ.

        I colored the 12 samples (X1..X12) with 3 different colors only to distinguish the time points (embryonic time points, day 3.5 and day 5.5).
        The first four samples in Group A, i.e. X1..X4, are from different time points:
        X1 - day 0, X2 - day 1, X3 - day 2, X4 - day 3
        X5 - Wt_d3.5, X6 - aa_mutant (day 3.5), X7 - bb_mutant (day 3.5), X8 - cc_mutant (day 3.5)
        X9 - Wt_d5.5, X10 - aa_mutant (day 5.5), X11 - bb_mutant (day 5.5), X12 - cc_mutant (day 5.5)

        In Group B:
        S1 - Wt_d3.5, S2 - dd_mutant (day 3.5)
        S3 - Wt_d5.5, S4 - ee_mutant (day 5.5)

        However, I don't want to deal with samples X1-X4. I am interested in samples X5-X12 in Group A
        and samples S1-S4 in Group B.

        I want to do pairwise comparisons of samples X5, X6, X9 and X10 against samples S1-S4
        (where X5 and X9 should, in theory, behave like S1 and S3).



        • #19
          Ah, everything is clear then. You're in a difficult spot, unfortunately. In the direct comparisons against anything from group B, you won't be able to tell the difference between real and batch effects (aside from the problem of not having replicates). Aside from just starting over, the only other recommendation I can give is to do a few pair-wise comparisons (use the "blind" method of DESeq) and then do some validations (assuming you have access to more samples at least for qPCR) so you have an idea how many candidates might be real. It may prove cheaper, faster, and less stressful to just do more sequencing.
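To make the confounding concrete, here is a minimal Python sketch (not part of the original reply). It encodes the sample sheet as described in #18 and lists which factors differ between any two samples. The dictionary layout and the helper name `differing_factors` are illustrative assumptions; "batch" here simply means the sequencing group (A or B).

```python
# Sample sheet transcribed from post #18 (X1-X4 omitted, as the poster
# is not interested in them). "batch" is the sequencing group: A or B.
SAMPLES = {
    "X5":  {"batch": "A", "genotype": "wt", "day": 3.5},
    "X6":  {"batch": "A", "genotype": "aa", "day": 3.5},
    "X7":  {"batch": "A", "genotype": "bb", "day": 3.5},
    "X8":  {"batch": "A", "genotype": "cc", "day": 3.5},
    "X9":  {"batch": "A", "genotype": "wt", "day": 5.5},
    "X10": {"batch": "A", "genotype": "aa", "day": 5.5},
    "X11": {"batch": "A", "genotype": "bb", "day": 5.5},
    "X12": {"batch": "A", "genotype": "cc", "day": 5.5},
    "S1":  {"batch": "B", "genotype": "wt", "day": 3.5},
    "S2":  {"batch": "B", "genotype": "dd", "day": 3.5},
    "S3":  {"batch": "B", "genotype": "wt", "day": 5.5},
    "S4":  {"batch": "B", "genotype": "ee", "day": 5.5},
}


def differing_factors(a, b):
    """Return the factors (batch, genotype, day) on which two samples differ."""
    return [
        factor
        for factor in ("batch", "genotype", "day")
        if SAMPLES[a][factor] != SAMPLES[b][factor]
    ]


if __name__ == "__main__":
    for pair in [("X5", "X6"), ("X5", "S1"), ("X6", "S2"), ("X10", "S4")]:
        print(pair, "differ in:", differing_factors(*pair))
```

Any pair whose list contains "batch" together with another factor (for example X6 vs S2) cannot separate the biological difference from the batch difference. A pair that differs only in "batch" (X5 vs S1, X9 vs S3) gives a rough, unreplicated look at the batch effect itself.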



          • #20
            Originally posted by dpryan
            Ah, everything is clear then. You're in a difficult spot, unfortunately. In the direct comparisons against anything from group B, you won't be able to tell the difference between real and batch effects (aside from the problem of not having replicates). Aside from just starting over, the only other recommendation I can give is to do a few pair-wise comparisons (use the "blind" method of DESeq) and then do some validations (assuming you have access to more samples at least for qPCR) so you have an idea how many candidates might be real. It may prove cheaper, faster, and less stressful to just do more sequencing.

            So it is very difficult to compare samples that were sequenced at different times when there are no replicates, and batch effects cannot be removed if the samples being analysed are not present in both groups (if I understand your points correctly)?

            With the DESeq "blind" method, can I pick out candidate genes by performing pairwise comparisons between samples and then filtering on p-values, fold changes and other statistics?



            • #21
              Yup. You're running into two problems. The first is the lack of biological replicates, which makes it impossible to accurately measure variance in each group. The second problem is that there will often be a difference between samples processed at different dates and using paired-end versus single-end reads (the DESeq vignette has a nice example of this using the pasilla dataset). You want to be able to separate biologically meaningful differences from these "batch effects", otherwise you'll end up just wasting a lot of time and resources.

              Regarding DESeq (there are similar methods with edgeR and likely other programs), you'll get p-values, but they're not very meaningful in your situation (read through the DESeq vignette, paying particular attention to the section talking about comparisons without replicates). This is a convenient way to get fold-changes and such. You're best off filtering on fold-change and average expression level. These will probably correlate with the p-value, but I really need to stress that the p-value is not otherwise meaningful in any normal sense (i.e., do not ever try to publish it).
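As a rough illustration of "filter on fold-change and average expression", here is a minimal Python sketch (not from the original reply, and not DESeq itself). It scales two unreplicated samples to a common depth using simple total-count scaling, which is an assumption made for brevity (DESeq estimates size factors differently). It then keeps genes that pass two thresholds. The function name, the pseudocount and both cutoffs are hypothetical choices, and the counts in the usage example are made up.

```python
import math


def filter_candidates(counts_a, counts_b, min_abs_log2fc=2.0,
                      min_mean=50.0, pseudocount=1.0):
    """Rank genes by |log2 fold change| between two unreplicated samples.

    counts_a, counts_b: dicts mapping gene name -> raw read count.
    Returns a list of (gene, log2fc, mean_expression) tuples for genes
    passing both thresholds, sorted by decreasing |log2fc|.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    # Scale both samples to the average library size (crude normalisation).
    target = (total_a + total_b) / 2.0

    hits = []
    for gene, raw_a in counts_a.items():
        norm_a = raw_a * target / total_a
        norm_b = counts_b.get(gene, 0) * target / total_b
        mean_expr = (norm_a + norm_b) / 2.0
        # The pseudocount avoids division by zero and damps low-count noise.
        log2fc = math.log2((norm_b + pseudocount) / (norm_a + pseudocount))
        if mean_expr >= min_mean and abs(log2fc) >= min_abs_log2fc:
            hits.append((gene, log2fc, mean_expr))

    return sorted(hits, key=lambda hit: abs(hit[1]), reverse=True)


if __name__ == "__main__":
    # Made-up toy counts, purely to show the call.
    wt = {"g1": 100, "g2": 100, "g3": 2, "g4": 798}
    mutant = {"g1": 800, "g2": 110, "g3": 40, "g4": 50}
    for gene, log2fc, mean_expr in filter_candidates(wt, mutant):
        print(f"{gene}\tlog2FC={log2fc:.2f}\tmean={mean_expr:.1f}")
```

The resulting list is only a set of candidates for validation (e.g. by qPCR); no significance is attached to it, for the reasons given above.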



              • #22
                Originally posted by dpryan
                Yup. You're running into two problems. The first is the lack of biological replicates, which makes it impossible to accurately measure variance in each group. The second problem is that there will often be a difference between samples processed at different dates and using paired-end versus single-end reads (the DESeq vignette has a nice example of this using the pasilla dataset). You want to be able to separate biologically meaningful differences from these "batch effects", otherwise you'll end up just wasting a lot of time and resources.

                Regarding DESeq (there are similar methods with edgeR and likely other programs), you'll get p-values, but they're not very meaningful in your situation (read through the DESeq vignette, paying particular attention to the section talking about comparisons without replicates). This is a convenient way to get fold-changes and such. You're best off filtering on fold-change and average expression level. These will probably correlate with the p-value, but I really need to stress that the p-value is not otherwise meaningful in any normal sense (i.e., do not ever try to publish it).

                Thank you for your time and your valuable suggestions. I now have an idea of how to deal with my data. I will try filtering some genes from the list and see if something interesting comes up.
