Originally posted by fwessely
It is true that reads prepared by the Cokus protocol contain sequence tags at the start; however, there are also other protocols out there which produce non-directional libraries without any tags. Examples can be found in reads from Smallwood et al., 2011, Hansen et al., 2011, or all kinds of target-amplified regions. Bismark does not exploit tags internally, so data from Cokus et al. or Popp et al. need to have the first 5 bp removed before performing alignments.
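As a rough illustration of removing the first 5 bp, here is a minimal Python sketch. It assumes plain four-line FASTQ records, and the function name `trim_fastq_lines` is made up for this example (it is not part of Bismark). The quality string has to be trimmed together with the sequence so both stay the same length.

```python
def trim_fastq_lines(lines, n=5):
    """Remove the first n bases (and quality values) from each FASTQ record.

    Assumes strict four-line records: header, sequence, '+', quality.
    Hypothetical helper for illustration only.
    """
    lines = [line.rstrip("\n") for line in lines]
    trimmed = []
    # Walk the input one complete four-line record at a time.
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = lines[i:i + 4]
        trimmed.extend([header, seq[n:], plus, qual[n:]])
    return trimmed


record = ["@read1", "ACGTACGTAC", "+", "IIIIIHHHHH"]
print("\n".join(trim_fastq_lines(record)))
```

For real data a dedicated trimming tool is usually the more convenient choice; the sketch only shows what the operation does to each read.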
If you are not entirely sure about the nature of a library, you might want to run just the first 100000 reads or so (using -u 100000). This should finish in under a minute, and you can guess the type of library from the strand alignment ratio. If a library was directional, OT:OB:CTOT:CTOB should have a ratio of roughly 1:1:0:0 (with perhaps 1% of alignments going to the complementary strands). A non-directional library produces roughly the same number of alignments for each strand, but due to the way the alignments work for reads that contain either no C at all or only methylated Cs, the ratio for non-directional libraries typically looks like 2:2:1:1.
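To make the rule of thumb concrete, here is a small Python sketch that takes the four strand counts from such a test run and guesses the library type. The function names and the 5% cut-off are assumptions chosen for illustration, not anything Bismark computes itself; the counts in the usage lines are invented numbers.

```python
def strand_fractions(ot, ob, ctot, ctob):
    """Return the share of alignments on each strand (OT, OB, CTOT, CTOB)."""
    total = ot + ob + ctot + ctob
    if total == 0:
        raise ValueError("no alignments to classify")
    return tuple(count / total for count in (ot, ob, ctot, ctob))


def guess_library_type(ot, ob, ctot, ctob, max_complementary=0.05):
    """Guess 'directional' or 'non-directional' from strand counts.

    max_complementary is an assumed cut-off: a directional library should
    put only around 1% of alignments on CTOT/CTOB, whereas a 2:2:1:1
    non-directional library puts about a third there.
    """
    fractions = strand_fractions(ot, ob, ctot, ctob)
    complementary = fractions[2] + fractions[3]
    if complementary <= max_complementary:
        return "directional"
    return "non-directional"


# Invented example counts, roughly 1:1:0:0 and 2:2:1:1.
print(guess_library_type(4950, 4950, 50, 50))
print(guess_library_type(2000, 2000, 1000, 1000))
```

Libraries that fall between the two patterns are worth a closer look by eye rather than trusting a single threshold.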
I hope this helps,
Best wishes,
Felix