Originally posted by GenoMax
For context: we are converting FastQ files into unmapped CRAMs for storage, and the intermediate FastQ-to-SAM conversion is done with reformat.sh. My main issue here is keeping the QC vendor flag in place.
I already have a workaround, since I also have to keep other things such as the UMIs (if present) and the barcode. Those pieces of information require adding tags, which are optional, so I'm not complaining about them. But the QC vendor flag is not optional. The bit is always there, and leaving it unset means you are assigning a "QC vendor pass" regardless of what the input says.
In any case, if someone wants to take a look at how to carry all the information from a FastQ file over into a SAM file, the four fields of the comment in the ID line are the candidates:
- The read end, which could be deduced later if you agree to standardize the output
- The QC vendor flag, which should be encoded in the FLAG field (bit 0x200)
- The control bits, which should be zero and can potentially be ignored (there is no clear place to store them if they are needed)
- The index barcode, which should be stored as the BC:Z: tag
The rest of the data in the FastQ file already has a place in the SAM record (read name, sequence and qualities).
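To make the mapping above concrete, here is a minimal Python sketch of how the four comment fields could be turned into a FLAG value and a BC:Z: tag for an unmapped record. It assumes the Illumina-style comment `<read>:<is filtered>:<control bits>:<barcode>`, where `Y` in the second field means the read failed the vendor filter. The function name `header_to_sam_fields` and the `SamFields` type are made up for illustration, and this is not how reformat.sh does it.

```python
from typing import List, NamedTuple

# FLAG bits as defined in the SAM specification.
FLAG_PAIRED = 0x1
FLAG_UNMAPPED = 0x4
FLAG_MATE_UNMAPPED = 0x8
FLAG_READ1 = 0x40
FLAG_READ2 = 0x80
FLAG_QC_FAIL = 0x200


class SamFields(NamedTuple):
    """Hypothetical container for the SAM fields derived from the ID line."""
    qname: str
    flag: int
    tags: List[str]


def header_to_sam_fields(header: str, paired: bool = True) -> SamFields:
    """Map a FastQ ID line to QNAME, FLAG and optional tags (unmapped read)."""
    if header.startswith("@"):
        header = header[1:]
    name, _, comment = header.partition(" ")

    parts = comment.split(":")
    if len(parts) != 4:
        raise ValueError(f"unexpected comment in ID line: {comment!r}")
    # The control bits are parsed but not stored: no obvious SAM field for them.
    read_end, is_filtered, _control_bits, barcode = parts

    flag = FLAG_UNMAPPED
    if paired:
        flag |= FLAG_PAIRED | FLAG_MATE_UNMAPPED
        flag |= FLAG_READ1 if read_end == "1" else FLAG_READ2
    if is_filtered == "Y":
        # "Y" means the read was filtered, i.e. it did NOT pass vendor QC.
        flag |= FLAG_QC_FAIL

    tags = [f"BC:Z:{barcode}"] if barcode else []
    return SamFields(name, flag, tags)
```

For example, `header_to_sam_fields("@M01:1:FC:1:1101:1000:2000 1:Y:0:ATCACG")` gives FLAG 589 (77 for an unmapped first-in-pair read, plus 512 for the QC failure) together with the tag `BC:Z:ATCACG`.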