I have reads that contain repeats of a known, conserved 10 nt sequence. I want to split each read into subunits, using the 10 nt sequence as a marker for where to split.
For example (the conserved sequence is cccgggttta):
>
acagtacccgggtttaatcgatcgatcgtacccgggtttagtacgtacgatcgtcccgggtttatgctgtcgtc
I would like to get the following, with each split made right after the marker:
>
acagtacccgggttta
>
atcgatcgatcgtacccgggttta
>
gtacgtacgatcgtcccgggttta
>
tgctgtcgtc
Any help is appreciated, thank you.
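For reference, here is a minimal Python sketch of the splitting I have in mind. It is one possible approach, not a tested pipeline. The function name `split_after_marker` is made up for illustration, and the sketch assumes the marker always occurs as an exact match in the same case as the read (no mismatches or ambiguous bases).

```python
def split_after_marker(read, marker):
    """Split `read` right after each exact occurrence of `marker`.

    Hypothetical helper: each subunit keeps the marker at its end,
    and any trailing sequence without a marker is returned last.
    """
    subunits = []
    start = 0
    while True:
        hit = read.find(marker, start)  # -1 when no further marker exists
        if hit == -1:
            break
        end = hit + len(marker)
        subunits.append(read[start:end])
        start = end
    # Keep whatever follows the last marker, if anything.
    if start < len(read):
        subunits.append(read[start:])
    return subunits


if __name__ == "__main__":
    read = ("acagtacccgggtttaatcgatcgatcgtacccgggttta"
            "gtacgtacgatcgtcccgggtttatgctgtcgtc")
    for subunit in split_after_marker(read, "cccgggttta"):
        print(">")
        print(subunit)
```

Reads with sequencing errors inside the marker would not be split by this exact-match search, so an approximate-matching step would be needed for those.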