Hello,
I'm looking for demo data to assess how well (or badly) human read removal works in our microbiome pipeline. That is, I need input data for which the correct outcome is known, so that I can compare the results we obtain with the expected results.
I assume this would work best if the input data had been engineered so that (1) it contains a mix of human and non-human (microbial) reads and (2) the ground truth (which reads are human and which are not) is known. I tried building my own synthetic (Illumina-style) FASTQ file of 15 human reads and 5 non-human reads, but the software we are using at the moment (BMTagger) failed to identify the human reads.
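To make concrete what I mean by "known ground truth", here is a minimal Python sketch of the idea, not what I actually ran. All function names are my own (hypothetical), nothing here comes from BMTagger, and the two "genomes" in the usage example are random placeholder strings. Real test data would have to sample from an actual human reference and actual microbial genomes, since random sequence will not match any human index. The source of each read is encoded in its read ID, so the output of any removal tool can be scored by ID alone.

```python
import random

def sample_reads(genome, label, n, read_len, rng):
    """Cut n substrings of length read_len from genome.
    The source label is stored in the read ID (e.g. 'human_3')."""
    reads = []
    for i in range(n):
        start = rng.randrange(len(genome) - read_len + 1)
        reads.append((f"{label}_{i}", genome[start:start + read_len]))
    return reads

def to_fastq(reads, qual_char="I"):
    """Render (read_id, sequence) pairs as FASTQ text.
    Quality is a constant placeholder character (assumption, not real error profiles)."""
    return "".join(
        f"@{rid}\n{seq}\n+\n{qual_char * len(seq)}\n" for rid, seq in reads
    )

def fastq_ids(text):
    """Return the set of read IDs from well-formed 4-line-per-record FASTQ text."""
    lines = text.splitlines()
    return {lines[i][1:].split()[0] for i in range(0, len(lines), 4)}

def score_removal(input_ids, kept_ids, human_label="human"):
    """Compare the reads a tool kept against the labels encoded in the read IDs."""
    def is_human(rid):
        return rid.startswith(human_label + "_")
    removed = input_ids - kept_ids
    human_removed = sum(is_human(r) for r in removed)
    human_kept = sum(is_human(r) for r in kept_ids)
    return {
        "human_removed": human_removed,                     # correct removals
        "microbial_removed": len(removed) - human_removed,  # false removals
        "human_kept": human_kept,                           # missed human reads
        "microbial_kept": len(kept_ids) - human_kept,       # correct keeps
    }

if __name__ == "__main__":
    rng = random.Random(0)
    # Placeholder sequences only; substitute real reference sequences here.
    human_like = "".join(rng.choice("ACGT") for _ in range(1000))
    microbe_like = "".join(rng.choice("ACGT") for _ in range(1000))
    reads = (sample_reads(human_like, "human", 15, 100, rng)
             + sample_reads(microbe_like, "microbe", 5, 100, rng))
    fastq_text = to_fastq(reads)
    # Pretend a tool kept every microbial read and missed one human read.
    kept = {rid for rid, _ in reads if rid.startswith("microbe_")} | {"human_0"}
    print(score_removal(fastq_ids(fastq_text), kept))
```

With a layout like this, the cleaned FASTQ produced by any tool could be passed through `fastq_ids` and scored the same way, which is the kind of comparison I would like to run on a properly validated data set.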
Someone must have benchmarked the existing human read removal tools, but I am unable to find any demo data sets. Could anyone advise and/or point me towards existing demo files?
Thank you!
M