I am trying to find a tool that merges/overlaps PE reads when the fragment is shorter than the reads (so the fragment is fully contained within each read), without having to know the adapters ahead of time. PANDA and FLASH (and others) will merge PE reads into a single read; however, they are geared towards cases where the fragment is longer than the reads, so that each read covers only part of the fragment (first diagram below).
However, I am thinking of the situation where the reads are longer than the fragment (second diagram below).
Both PANDA and FLASH can remove adapters before merging; however, if the adapter is short (say 4 bases), I am not confident that the programs will be able to do so. Perhaps a better program would be one that matches the first bases of R1 to the region close to the end of R2, and vice versa, and then outputs the merged read only where both R1 and R2 match. In other words, a merge where the adapter does not need to be known a priori.
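To make the idea concrete, here is a rough Python sketch of what I have in mind. It is not how PANDA or FLASH work; the function names, `min_len`, and `max_mismatch_rate` are all made up for illustration. It assumes uppercase ACGTN reads, ignores quality scores, and simply takes R1's bases as the merged sequence. For each candidate fragment length it compares the start of R1 with the end of the reverse-complemented R2, which checks the R1-start and R2-start conditions at the same time.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse-complement an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_contained(r1, r2, min_len=10, max_mismatch_rate=0.1):
    """Merge a read pair whose fragment is shorter than the reads.

    Returns the inferred fragment (taken from R1), or None when no
    candidate length passes the mismatch threshold.
    """
    rc2 = revcomp(r2)
    # Try the longest candidate fragment length first.
    for frag_len in range(min(len(r1), len(rc2)), min_len - 1, -1):
        head = r1[:frag_len]              # fragment as read by R1
        tail = rc2[len(rc2) - frag_len:]  # fragment as read by R2
        mismatches = sum(a != b for a, b in zip(head, tail))
        if mismatches <= max_mismatch_rate * frag_len:
            return head  # whatever follows in each read is adapter
    return None
```

For example, with a 24-base fragment followed by a 4-base adapter in each read, `merge_contained(frag + "AGAT", revcomp(frag) + "CTGT")` gives back `frag` without the adapter ever being specified. A low-complexity or repetitive fragment could pass the threshold at the wrong length, so a real tool would need something more careful than a fixed mismatch rate.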
Hope that this makes sense. Any suggestions? Thanks.
Code:
R1:   ---------->
R2:        <----------
Frag: ----------------
Code:
R1:      ------------->
R2:   <--------------
Frag:    ------------