Hi All,
I have a large number of sequences in a FASTA file, and each read contains both a forward and a reverse inline barcode. The combination of the two barcodes is what identifies the sample. I only need 1000 reads per sample for my downstream application, but I currently have many times that. I have a working Perl script for demultiplexing the reads, but it takes upwards of 5 days to run on the full file. I am therefore looking for a way to subset the FASTA file by forward and reverse barcode before demultiplexing. Any suggestions would be welcome.
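To make the goal concrete, here is a rough Python sketch of the kind of subsetting I have in mind. It is not my actual pipeline and it rests on assumptions that may not hold for my data: the forward barcode sits at the very start of each read, the reverse barcode sits at the very end in the orientation listed in the barcode table, and matches are exact (no mismatches allowed). The names `read_fasta`, `subset_by_barcode`, `samples` and `cap` are made up for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs one at a time from FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def subset_by_barcode(
    records: Iterable[Record],
    samples: Dict[Tuple[str, str], str],
    cap: int = 1000,
) -> Dict[str, List[Record]]:
    """Keep at most `cap` reads per sample.

    `samples` maps (forward_barcode, reverse_barcode) -> sample name.
    ASSUMPTION: barcodes are non-empty, sit at the exact read ends,
    and must match exactly.
    """
    fwd_lens = sorted({len(f) for f, _ in samples})
    rev_lens = sorted({len(r) for _, r in samples})
    kept: Dict[str, List[Record]] = {name: [] for name in samples.values()}
    remaining = len(kept)  # samples that have not reached the cap yet

    for header, seq in records:
        name = None
        # Try every barcode-length combination present in the table.
        for fl in fwd_lens:
            for rl in rev_lens:
                if len(seq) >= fl + rl:
                    name = samples.get((seq[:fl], seq[-rl:]))
                if name is not None:
                    break
            if name is not None:
                break
        if name is None or len(kept[name]) >= cap:
            continue
        kept[name].append((header, seq))
        if len(kept[name]) == cap:
            remaining -= 1
            if remaining == 0:
                break  # every sample is full, so stop reading early
    return kept
```

Something like `subset_by_barcode(read_fasta(open("reads.fasta")), samples)` would then stream the file once and stop as soon as every sample has its 1000 reads ("reads.fasta" is a placeholder name). Note that this keeps the first 1000 matching reads per sample rather than a random 1000, which may or may not be acceptable depending on how the reads are ordered in the file.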
Thanks in advance!