We used sffToCA to convert sff files from a number of mate pair libraries into .frg files. We are wondering why the number of input reads reported in the sffToCA .stats output file (in the numReadsInSFF field, which also matches the total for each filtering/analysis step) differs so much from the number of fragments in the generated .frg files. For one library, the .stats file reports 584072 input reads, while the .frg file contains 699213 fragments.
We counted the fragments in the .frg files either by counting FRG records (i.e. blocks starting with {FRG and ending in }) or by grepping for "seq:" in the file.
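In case the counting method matters, it amounts to roughly the following Python sketch. The function name count_frg_records is only illustrative, and the sketch assumes that each fragment record opens with a line beginning with {FRG and contains exactly one line beginning with seq:, which is how our files look.

```python
from typing import Iterable, Tuple


def count_frg_records(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (number of '{FRG' record openers, number of 'seq:' lines).

    Assumption: every fragment record starts with a line beginning '{FRG'
    and holds one 'seq:' line, so the two counts should agree.
    """
    n_frg = 0
    n_seq = 0
    for line in lines:
        if line.startswith("{FRG"):
            n_frg += 1
        elif line.startswith("seq:"):
            n_seq += 1
    return n_frg, n_seq


# Tiny made-up example with two fragment records and one link record.
example = [
    "{FRG\n", "act:A\n", "seq:\n", "ACGT\n", ".\n", "}\n",
    "{FRG\n", "act:A\n", "seq:\n", "TTGA\n", ".\n", "}\n",
    "{LKG\n", "act:A\n", "}\n",
]
frg_count, seq_count = count_frg_records(example)
print(frg_count, seq_count)
```

On a real file, the same function can be passed an open file handle, e.g. `count_frg_records(open("library.frg"))`, where library.frg is a placeholder name.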
Which filtering or processing step accounts for this difference?
Thanks