A friend of mine wrote a script that will hopefully be helpful to others.
The script identifies inline barcodes located at the 5' end of the read.
It takes the raw forward-read and mate-read FASTQ files as input.
It determines the barcode from the read file and writes each read and its mate, in their original order, to two files (read1 and read2) according to the barcode. It does not trim the barcode, though. Thus, it preserves read order and splits the reads by barcode in a single pass.
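For anyone who wants to see the idea in code, here is a minimal Python sketch of the splitting step described above. It is not the Perl script itself. The function names (`hamming`, `assign_barcode`, `read_fastq`, `split_pairs`) are made up for illustration, and two behaviours are my own assumptions: reads that match no barcode, or more than one, go into an "unmatched" bin, and only the forward read is checked for the barcode.

```python
from itertools import islice


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def assign_barcode(seq, barcodes, length, max_mismatches):
    """Return the name of the single barcode matching the 5' end, else None.

    Assumption: a read matching several barcodes is treated as unassigned.
    """
    prefix = seq[:length]
    if len(prefix) < length:
        return None
    hits = [name for name, bc in barcodes.items()
            if hamming(prefix, bc[:length]) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None


def read_fastq(handle):
    """Yield FASTQ records as lists of four lines."""
    while True:
        record = list(islice(handle, 4))
        if len(record) < 4:
            return
        yield record


def split_pairs(handle1, handle2, barcodes, length, max_mismatches):
    """Group (read, mate) pairs by barcode, keeping the input order.

    The barcode is looked up in the forward read only (an assumption)
    and is left untrimmed in the stored records.
    """
    bins = {}
    for rec1, rec2 in zip(read_fastq(handle1), read_fastq(handle2)):
        name = assign_barcode(rec1[1].strip(), barcodes, length, max_mismatches)
        bins.setdefault(name or "unmatched", []).append((rec1, rec2))
    return bins
```

Writing each bin out to its own read1/read2 file pair is then a simple loop over `bins`. For large FASTQ files you would want to write the records as you go rather than hold them in memory as this sketch does.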
The barcode file should be tab-delimited, e.g.:
BC1 AGTCGAG
BC2 GCTGACG
... .....
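If you want to sanity-check a barcode file before running anything, a small Python sketch like the one below can help. `load_barcodes` is a hypothetical helper, not part of the Perl script. It assumes one "name, tab, sequence" pair per line. Upper-casing the sequences and rejecting lines without exactly two fields are my own choices.

```python
import io


def load_barcodes(handle):
    """Parse 'name<TAB>sequence' lines into a dict, skipping blank lines."""
    barcodes = {}
    for line_no, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue
        fields = line.split("\t")
        if len(fields) != 2:
            # Catches files that use spaces instead of a tab.
            raise ValueError(f"line {line_no}: expected 2 tab-separated fields")
        name, seq = fields
        barcodes[name] = seq.upper()  # assumption: barcodes compared in upper case
    return barcodes


# In-memory example using the two barcodes shown above.
example = io.StringIO("BC1\tAGTCGAG\nBC2\tGCTGACG\n")
print(load_barcodes(example))
```

A file that was saved with spaces instead of tabs raises a `ValueError` here, which is an easy mistake to make when copying barcodes out of a post like this one.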
Usage of this Perl script (the script name contains an apostrophe, so the path is quoted for the shell):
perl "[Script path]/5'bc_splitter_for_paired_end_sequence.pl" -b [Barcode Filepath] -l [Barcode Length, integer] -m [Allowed number of mismatches, integer] -o [output suffix.fastq] -1 [READ1 Filepath] -2 [READ2 Filepath]
Hope this helps someone.