I'm looking for a way to detect positions in the genome where there is a pileup of soft-clipped reads. I've attached an image with an example of what this situation looks like in IGV. Essentially, I'm looking for a tool similar to samtools mpileup, but with the important difference that I want to see soft-clipped reads. I think the issue is that soft-clipped bases are not technically aligned at these positions. I know that I could write a script to parse the CIGAR string for each read and detect locations like these, but is there a tool that can quickly report the locations where reads start getting soft-clipped?
I'm imagining a version of samtools mpileup that would report something like this:
10 141352 N 105 a$A$A$a$aSASASaSaaaaAAaAAaAaAaaaAaAaAAAaAAaaAAaAAaaAaAaAAaAAAaAAaAaAAAAAAaAAaaAaaaaaaaaaAaaaAaAAaAaaAaaAAaaaaAaA^]a @<@?;?>[email protected]@[email protected]????A>@@[email protected]@[email protected]@@?>@:>[email protected]@[email protected][email protected]@>[email protected]@@[email protected][email protected][email protected][email protected]@?>[email protected][email protected]>>@>[email protected]@@?A>@>?A>[email protected]=?>=??>=?=C>9>
where the "$", as usual, means that reads are ending at this position, whereas the "S" would mean that bases are "aligned" and soft-clipped at this position.
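For reference, the script fallback I mentioned could look roughly like the sketch below. It is only a minimal illustration, not a tested tool: it assumes plain SAM text as input (for example from `samtools view`), the function name `soft_clip_sites` is hypothetical, and a clip is reported at the first aligned base for a leading clip and at the last aligned base for a trailing clip.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference.
REF_CONSUMING = set("MDN=X")


def soft_clip_sites(sam_lines):
    """Count soft-clip breakpoints per (chrom, 1-based position).

    Hypothetical helper: sam_lines is any iterable of SAM text lines.
    """
    sites = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue
        flag, chrom, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
        if flag & 0x4 or cigar == "*":  # unmapped or no alignment
            continue
        # Ignore hard clips so that a soft clip next to them is still seen.
        ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
        if not ops:
            continue
        ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
        if ops[0][1] == "S":
            # Leading clip: breakpoint at the first aligned base (POS).
            sites[(chrom, pos)] += 1
        if len(ops) > 1 and ops[-1][1] == "S":
            # Trailing clip: breakpoint at the last aligned base.
            sites[(chrom, pos + ref_len - 1)] += 1
    return sites
```

Calling `soft_clip_sites(sys.stdin)` on the output of `samtools view` and keeping only positions whose count exceeds some threshold would give the list of candidate sites, but this is slow on large files, which is why I'm hoping an existing tool does it.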