  • beta diversity depth

    Hi all,

    I am using QIIME to analyse my 16S metagenomic samples.
    I want to compare the bacterial communities of 2 groups of samples, each group consisting of 10 samples.

    I have merged, trimmed and filtered my sequences, and the numbers of reads for the 10 samples in each group are:

    Group 1: 88322, 131727, 150013, 169207, 177499, 193288, 197006, 200491, 201732, 229860
    Group 2: 115444, 127776, 172511, 172573, 179295, 181659, 186582, 200619, 201387, 212047

    I have subsampled the reads at multiple rarefaction depths (sample sizes from 1000 to 88000 with a step size of 2000) to calculate alpha diversity with parallel_multiple_rarefactions.
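
As a quick check on which depths such a sweep covers, the sketch below enumerates them. It assumes the minimum, maximum and step behave like an inclusive Python range; that is an assumption about how the rarefaction script interprets them, not something confirmed in this thread, and `rarefaction_depths` is an illustrative helper.

```python
def rarefaction_depths(min_depth, max_depth, step):
    """Depths visited from min_depth up to max_depth (inclusive), in increments of step.

    Assumption: the sweep behaves like an inclusive Python range.
    """
    return list(range(min_depth, max_depth + 1, step))


depths = rarefaction_depths(1000, 88000, 2000)
print(len(depths), depths[0], depths[-1])
```

Under that assumption the deepest point is 87000 rather than 88000, because 88000 is not 1000 plus a whole number of 2000-read steps.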

    I am going to use jackknifed_beta_diversity to see whether there is a difference between the 2 groups of samples. Unlike alpha diversity, it seems that only a single depth is allowed when computing beta diversity. What is the best sequencing depth for computing beta diversity and plotting the principal coordinates analysis for this data set? By the way, is it a good idea to remove the sample with only 88322 reads, which is relatively few?

    Thanks for answering the long question.
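
One way to frame the choice of a single depth is to compare how many samples and reads survive at each candidate depth. The sketch below uses the read counts from the post; the `reads_kept` helper and the idea of comparing only the two smallest library sizes are illustrative assumptions, not a QIIME feature or a recommendation from this thread.

```python
# Read counts per sample, copied from the post above.
group1 = [88322, 131727, 150013, 169207, 177499, 193288, 197006, 200491, 201732, 229860]
group2 = [115444, 127776, 172511, 172573, 179295, 181659, 186582, 200619, 201387, 212047]


def reads_kept(counts, depth):
    """Total reads retained if every sample with at least `depth` reads
    is subsampled to exactly `depth` and smaller samples are dropped."""
    return depth * sum(1 for c in counts if c >= depth)


all_counts = sorted(group1 + group2)
depth_all = all_counts[0]       # smallest library: keeps all 20 samples
depth_drop_one = all_counts[1]  # second smallest: drops the 88322-read sample

for depth in (depth_all, depth_drop_one):
    samples = sum(1 for c in all_counts if c >= depth)
    print(depth, samples, reads_kept(all_counts, depth))
```

Rarefying to 88322 keeps all 20 samples, while raising the depth to 115444 costs one sample from Group 1. Whether the extra depth is worth an unbalanced design depends on whether the rarefaction curves are still climbing at 88322.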

  • #2
    88k reads is generally more than sufficient to saturate each sample. Look at your alpha diversity rarefaction plots - do they plateau way before 88k? If so, then you probably have a good representation of your population at lower read counts.
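
A rough way to judge a plateau numerically is to ask how many new OTUs the last rarefaction step adds. The sketch below is illustrative only: the `has_plateaued` helper and its 1% tolerance are assumptions, and the two example curves are made-up numbers, not data from this thread.

```python
def has_plateaued(depths, observed_otus, tolerance=0.01):
    """Return True if the final rarefaction step adds fewer new OTUs than
    `tolerance` times the final OTU count (hypothetical criterion)."""
    if len(depths) < 2 or len(depths) != len(observed_otus):
        raise ValueError("need at least two matching depth/OTU points")
    gain = observed_otus[-1] - observed_otus[-2]
    return gain < tolerance * observed_otus[-1]


# Made-up curves for illustration only.
example_depths = [1000, 3000, 5000, 7000]
flat_curve = [200, 380, 400, 402]    # last step adds 2 OTUs
rising_curve = [200, 400, 600, 800]  # last step adds 200 OTUs

print(has_plateaued(example_depths, flat_curve))
print(has_plateaued(example_depths, rising_curve))
```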

    • #3
      I have attached the rarefaction plot. The increase in observed OTUs slows down as more sequences are added, but it does not seem to reach a plateau.

      Is the number of observed species overestimated? The observed OTU counts I see in other papers are usually below 1k.

      • #4
        Yeah, those do seem high but it depends on your sample (e.g. bacteria-rich soil). How are you doing the OTU picking? Are you filtering the OTU table afterwards, for example to remove really low abundance species? We typically remove anything at < 0.005% abundance and this leaves us with a few hundred OTUs.

        You want to run the rarefaction plot out to >88k to see if you want to remove the sample with fewest reads, right?
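
The 0.005% filter mentioned above can be sketched as follows. It assumes the threshold applies to each OTU's share of all reads in the whole table; the post does not say whether it is table-wide or per-sample. The dict-based table layout, the `filter_low_abundance` name and the toy counts are all hypothetical.

```python
def filter_low_abundance(otu_counts, min_fraction=0.00005):
    """Keep OTUs whose total count is at least `min_fraction` of all reads.

    otu_counts maps an OTU id to a list of per-sample counts.
    0.00005 corresponds to 0.005% (assumed to be table-wide).
    """
    total = sum(sum(counts) for counts in otu_counts.values())
    threshold = min_fraction * total
    return {otu: counts for otu, counts in otu_counts.items()
            if sum(counts) >= threshold}


# Toy table with 100000 reads in total, so the cutoff is about 5 reads.
toy_table = {
    "otu_a": [60000, 39986],
    "otu_b": [2, 2],    # 4 reads in total: below the cutoff
    "otu_c": [10, 0],   # 10 reads in total: above the cutoff
}
filtered = filter_low_abundance(toy_table)
print(sorted(filtered))
```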


        • #5
          I picked OTUs with pick_open_reference_otus against the Greengenes 97% clustered 16S reference set (singletons are removed by default):

          pick_open_reference_otus.py -i input.fasta -o otus -r 97_otus.fasta

          I took your advice and removed everything below 0.005% abundance, but still got over 4000 OTUs.




          Originally posted by fanli:
          Yeah, those do seem high but it depends on your sample (e.g. bacteria-rich soil). How are you doing the OTU picking? Are you filtering the OTU table afterwards, for example to remove really low abundance species? We typically remove anything at < 0.005% abundance and this leaves us with a few hundred OTUs.

          You want to run the rarefaction plot out to >88k to see if you want to remove the sample with fewest reads, right?

          • #6
            Hello Fanli, I like the results of your 16S V4 MiSeq run. How much library did you load, and what is the size of your library?

            Thanks.
