
  • How to combine reads from different samples after pipeline?

    Hi everyone,

    This is more of a statistical than a bioinformatics problem, but I thought this was the best forum to post it in, as everyone here probably knows their way around sequencing data.

    My project:
    I want to look at the bacterial composition of the noses of cow farmers and cows, and see how much the nasal microbiome of the cow farmers is influenced by contact with cows (compared to a non-exposed control group).

    My samples:
    Nasal swabs from cow farmers and cows:
    Number of farms: 30
    two cows sampled per farm.
    Varying number of farmers sampled on each farm: range, 1-4

    What I did so far:
    I performed amplicon sequencing using the 16S V4 region
    Platform: Illumina MiSeq 2*250 bp
    pipeline: mothur

    What I have now:
    A shared file with the number of reads per OTU with a total number of 550 OTUs and 300 samples.

    My problem:
    My PI insists on combining the farmers into one "meta-farmer" whenever I have more than one farmer per farm, and on creating one "meta-cow" out of the two cows per farm.
    He says that it only makes sense to look at beta diversity with these pooled samples. So far I have not been able to talk him out of wanting to see the pooled data.
    I am uneasy about pooling different samples because, in my opinion, it distorts the results.
    My PI's suggestion is to take all the farmers from one farm, add up the reads for each OTU, and then divide that read number by the number of farmers (i.e., take the mean number of reads per OTU).
    However, I think this will leave me with a much higher richness than those farmers individually have.
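
    To make the concern concrete, here is a minimal sketch of the averaging described above, using a toy count table. The farmer names and read counts are invented for illustration and are not from my data; `richness` and `mean_pool` are hypothetical helper names, not mothur functions.

```python
# Toy OTU count table for one hypothetical farm.
# Rows are farmers, columns are OTUs. All numbers are made up for illustration.
farm_counts = {
    "farmer_A": [120, 0, 30, 0, 0],
    "farmer_B": [0, 80, 0, 45, 0],
    "farmer_C": [60, 0, 0, 0, 15],
}


def richness(counts):
    """Observed richness: number of OTUs with at least one read."""
    return sum(1 for c in counts if c > 0)


def mean_pool(samples):
    """Sum the reads per OTU across samples, then divide by the sample count."""
    n = len(samples)
    return [sum(otu_column) / n for otu_column in zip(*samples)]


# Richness of each farmer on their own.
individual_richness = {name: richness(c) for name, c in farm_counts.items()}

# The pooled "meta-farmer" built by averaging raw read counts.
meta_farmer = mean_pool(list(farm_counts.values()))
pooled_richness = richness(meta_farmer)

print(individual_richness)  # {'farmer_A': 2, 'farmer_B': 2, 'farmer_C': 2}
print(pooled_richness)      # 5
```

    In this toy case each farmer carries 2 OTUs, but the meta-farmer carries 5, because any OTU present in at least one farmer stays non-zero after averaging. The pooled richness is therefore the union of the OTUs across farmers and can never be lower than that of the richest individual. Averaging raw counts also appears to weight farmers with more reads more heavily, unless the samples are first brought to the same depth or converted to relative abundances.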

    My questions:
    Does anyone have a good (statistical) reason why it is wrong to pool your samples like that? (something that convinces PIs)

    Does anyone know a publication where samples have been pooled after sequencing? (Usually they get pooled right in the beginning)

    How should I combine those samples instead if taking the mean number of reads is not the way to do it?

    Thanks, everyone, for reading this far. I greatly appreciate any input.
