Hi,
I haven't posted before but have used the forum a lot during the construction of my libraries.
I recently received a second library back from a sequencing facility, and after a brief look through it seems that the sequence data is almost all random rather than the targeted RAD markers I was aiming for. The FastQC reports suggest that the restriction cut sites are to blame: they show much lower quality scores than the rest of the read and in most cases do not match the expected cut sequence for MseI or PstI. If I run the data through the processing pipeline, the number of stacks assembled is drastically lower than last time and the mean coverage has dropped from 30x to around 5x.
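To put a number on the cut-site mismatch, something like the following minimal Python sketch could be used. It is only an illustration, not part of my actual pipeline: the function names, the `offset` parameter (for an inline barcode preceding the cut site) and the example remnant `TGCAG` for PstI are assumptions, and the remnant your reads should start with depends on your adapter design.

```python
import gzip
from collections import Counter


def read_fastq_sequences(path):
    """Yield the sequence line of every record in a FASTQ file (plain or .gz)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # second line of each 4-line FASTQ record
                yield line.strip()


def cut_site_stats(reads, site, offset=0):
    """Return (fraction of reads carrying `site` at `offset`, Counter of observed k-mers).

    `site` is the expected cut-site remnant (assumed here, e.g. "TGCAG" for PstI);
    `offset` is the number of bases before it (e.g. an inline barcode length).
    Reads too short to cover the window are skipped.
    """
    site = site.upper()
    k = len(site)
    observed = Counter()
    total = 0
    for seq in reads:
        kmer = seq[offset:offset + k].upper()
        if len(kmer) < k:
            continue
        observed[kmer] += 1
        total += 1
    fraction = observed[site] / total if total else 0.0
    return fraction, observed


if __name__ == "__main__":
    # Toy reads standing in for real data; swap in read_fastq_sequences("your_file.fastq.gz").
    toy_reads = ["TGCAGAAA", "TGCAGCCC", "ACGTTGGG"]
    fraction, observed = cut_site_stats(toy_reads, "TGCAG")
    print(f"fraction with expected site: {fraction:.2f}")
    print("most common starts:", observed.most_common(3))
```

Looking at the most common k-mers at the cut-site position, rather than only the matching fraction, should show whether the mismatches are scattered at random or dominated by a few recurring sequences.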
My best guess is that, for some reason, the adapters (with the restriction-site-specific overhang) have ligated to pretty much anything and everything in the digest reaction rather than targeting the restriction fragments, and as a result I have sequenced a much more diverse pool of fragments at a much lower coverage. Unfortunately, that means most of it is useless.
A few other details. I have checked for adapter contamination in the reads and there is very little (I checked and double-checked this throughout the library prep too), so I don't think it's adapter dimerization. This is the second library using the same method and the first one worked fine. To further confuse matters, the sequencing facility had to resequence the library as there were issues with overclustering. They had the same issues again the second time but reckon the data is fine to use.
It may be a case of degraded oligos used to make up the adapters (I used the same ones for both libraries), but if so, why is it just the cut site that is low quality (the rest of the adapter quality is high)? And if so, I don't understand how the ligation and ligation QC during prep could have been so successful with degraded adapters or overhangs. And even if this were the case, I would have thought that most free ends in a pool of purified digested DNA would be RADtag ends anyway, so I would expect something in my sequence data?
Sorry for so many questions. My heart sank when I found this out and I am still digging through the data for answers. Any help in figuring out what has gone wrong would be much appreciated.
Many thanks
Alex