alignment of bisulfite treated reads


  • sci_guy
    replied
    Originally posted by ondovb View Post
    We just released SOCS version 2, which has a mode that is fully bisulfite-tolerant for SOLiD data.
    Thanks! I'll take a look. I have more SOLiD data coming my way soon.


  • bioinfosm
    replied
    BSMAP is another tool. I have used it on bisulphite reads and it seems to work well.


  • ondovb
    replied
    We just released SOCS version 2, which has a mode that is fully bisulfite-tolerant for SOLiD data. It's available at:

    http://solidsoftwaretools.com/gf/project/socs/

    It will take longer than using a standard algorithm with converted genomes (due to the complexity of the problem), but there won't be any bias in the results.


  • frozenlyse
    replied
    Originally posted by nilshomer View Post
    I don't share your disdain for colorspace since it is quite powerful. For example the false positive rate for SNPs is a lot lower since you need two specific errors next to each other to get a SNP.

    Anyhow, under the current chemistry, bisulphite sequencing would be difficult from a bioinformatic perspective for longer reads (>=50bp) on the SOLiD platform. You could consider a targeted pulldown on the Illumina platform to make up for the lower capacity (per dollar).
    have you compared the same sample sequenced by illumina vs solid? personally i am quite platform agnostic now that they have comparable levels of throughput and read length, however unless anybody has sequenced the same sample on both platforms i am still not decided as to which gives the best combination of cost vs read length vs throughput

    however, i definitely agree with sci_guy - solid colorspace is currently quite useless for bisulfite sequencing... this can be overcome bioinformatically (computationally expensive) however no-one has attempted this as yet.


  • sci_guy
    replied
    Bis-Seq relies upon counting the C's vs T's in aligned reads so for an unbiased statistic you want alignment potential of a bisulphite-treated DNA read to be equivalent regardless of C density.
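    The C-versus-T counting described above can be sketched as follows. This is only an illustration under stated assumptions: reads are forward-strand, ungapped, and already placed at a known start position, and the function name `methylation_levels` is hypothetical rather than part of any tool mentioned in this thread.

```python
from collections import Counter


def methylation_levels(reference, alignments):
    """Tally C and T read bases over each reference cytosine.

    alignments is an iterable of (start, read) pairs, assumed ungapped
    and forward-strand. Returns {ref_pos: (c_count, t_count, fraction)}.
    """
    c_counts = Counter()
    t_counts = Counter()
    for start, read in alignments:
        for offset, base in enumerate(read.upper()):
            pos = start + offset
            # Only reference cytosines are informative for methylation.
            if pos >= len(reference) or reference[pos].upper() != "C":
                continue
            if base == "C":
                c_counts[pos] += 1  # protected from conversion (methylated)
            elif base == "T":
                t_counts[pos] += 1  # converted by bisulfite (unmethylated)
    levels = {}
    for pos in sorted(set(c_counts) | set(t_counts)):
        c, t = c_counts[pos], t_counts[pos]
        levels[pos] = (c, t, c / (c + t))
    return levels
```

    The fraction is only unbiased if reads carrying C and reads carrying T at a site were equally likely to align in the first place, which is the point made above.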

    With SOLiD you really want to align to both a hypomethylated genome (no C's) and a hypermethylated genome (C's remain at CpG sites), since preprocessing the reads to convert C's to T's in colorspace is not possible. Reads with intermediate levels of methylation will be regarded as having SNPs in the alignment pipeline (two colorspace changes in a row). So, if your read has a fair number of CpG sites (say, a read at a CpG island) and it goes over your alignment mismatch threshold, it won't align even though it is a perfectly good read. This creates a confounder: alignment potential is lowered in high-density CpG regions of the genome and at CpG sites near high-population-frequency SNPs or INDELs. You can counter this by relaxing the number of mismatches allowed (which introduces false positive alignments) or by aligning to a number of permuted bisulphite references. Preprocessed Illumina reads have none of these issues. If you have a plant genome with CNG and CNN methylation then SOLiD is not a wise choice at all.
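    To illustrate the "two colorspace changes in a row" point, here is a small sketch of di-base encoding. It assumes the standard SOLiD scheme in which each color is the XOR of the 2-bit codes of two adjacent bases (A=0, C=1, G=2, T=3); the helper names `to_colorspace` and `changed_colors` are made up for this example.

```python
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def to_colorspace(seq):
    """Encode a base sequence as colors, one per adjacent base pair."""
    seq = seq.upper()
    return [BASE_CODE[a] ^ BASE_CODE[b] for a, b in zip(seq, seq[1:])]


def changed_colors(seq_a, seq_b):
    """Return the color indices at which two equal-length sequences differ."""
    colors_a = to_colorspace(seq_a)
    colors_b = to_colorspace(seq_b)
    return [i for i, (x, y) in enumerate(zip(colors_a, colors_b)) if x != y]
```

    For example, `changed_colors("ACGTA", "ATGTA")` returns `[0, 1]`: a single interior C-to-T conversion alters both colors that overlap the converted base, so each unmethylated C costs the aligner two adjacent color mismatches.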

    I'm not some sort of Illumina fan-boy. I originally chose SOLiD owing to error checking built into colorspace and the increased number of reads per dollar. However for a second experiment I've swapped to Illumina owing to the potential alignment bias issue and Illumina's increases in bandwidth later this year.


  • nilshomer
    replied
    Originally posted by sci_guy View Post
    If you're using Illumina the easiest (bias-free) way is to preprocess your bisulphite reads to convert C's to T's (remembering where they are) and align them to a reference with all C's changed to T's. Then write a script to introduce the C's back in, or relate these as tables in a database.

    As for SOLiD, all this is horrible in colorspace. If you're trying to avoid alignment bias due to methylation differences SOLiD has some bioinformatic issues. You're required to permute the reference a great deal or slacken up the mismatches allowed, sorting out the noise later down the track. If you convert SOLiD reads back into basespace you'll pay a fairly reasonable price - any errors in the read will frameshift base calls 3' to the error <Grumble> <Grumble>

    Nils, have you tried aligning SOLiD bisulphite reads?
    I don't share your disdain for colorspace since it is quite powerful. For example the false positive rate for SNPs is a lot lower since you need two specific errors next to each other to get a SNP.

    Anyhow, under the current chemistry, bisulphite sequencing would be difficult from a bioinformatic perspective for longer reads (>=50bp) on the SOLiD platform. You could consider a targeted pulldown on the Illumina platform to make up for the lower capacity (per dollar).


  • sci_guy
    replied
    If you're using Illumina the easiest (bias-free) way is to preprocess your bisulphite reads to convert C's to T's (remembering where they are) and align them to a reference with all C's changed to T's. Then write a script to introduce the C's back in, or relate these as tables in a database.
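    A minimal sketch of that convert-align-restore workflow, assuming forward-strand reads and using a plain exact substring search in place of a real aligner. The names `c_to_t` and `align_bisulfite_read` are hypothetical; a production pipeline would also handle the reverse strand (G-to-A conversion), mismatches, and multiple hits.

```python
def c_to_t(seq):
    """In silico bisulfite conversion: every C becomes T."""
    return seq.upper().replace("C", "T")


def align_bisulfite_read(reference, read):
    """Align in converted space, then recover methylation calls.

    Returns (position, calls), where calls maps each reference C covered
    by the read to True (read kept the C) or False (read shows a T).
    Returns None if the converted read is not found.
    """
    pos = c_to_t(reference).find(c_to_t(read))  # stand-in for a real aligner
    if pos == -1:
        return None
    calls = {}
    for offset, read_base in enumerate(read.upper()):
        if reference[pos + offset].upper() == "C":
            # Look back at the ORIGINAL read base to restore the C/T state.
            calls[pos + offset] = read_base == "C"
    return pos, calls
```

    With reference `AACGTTCGA` and read `ACGTTTGA`, the read lands at position 1, the C at reference position 2 is called methylated, and the C at position 6 is called unmethylated. Because both read and reference lose their C's before alignment, the methylation state of the read cannot influence whether it aligns.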

    As for SOLiD, all this is horrible in colorspace. If you're trying to avoid alignment bias due to methylation differences SOLiD has some bioinformatic issues. You're required to permute the reference a great deal or slacken up the mismatches allowed, sorting out the noise later down the track. If you convert SOLiD reads back into basespace you'll pay a fairly reasonable price - any errors in the read will frameshift base calls 3' to the error <Grumble> <Grumble>
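    The cost of converting colorspace reads back into basespace can be shown with a short decoding sketch. It assumes the standard di-base code (A=0, C=1, G=2, T=3, color = XOR of adjacent base codes) and a known leading base; `from_colorspace` is a hypothetical helper name.

```python
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE = "ACGT"


def from_colorspace(first_base, colors):
    """Decode colors into bases, starting from a known first base."""
    bases = [first_base.upper()]
    for color in colors:
        # Each base is determined by the previous base and the current color.
        bases.append(CODE_BASE[BASE_CODE[bases[-1]] ^ color])
    return "".join(bases)
```

    Decoding `[1, 3, 1, 3]` from a leading `A` gives `ACGTA`, but a single miscalled color, `[1, 0, 1, 3]`, gives `ACCAT`: every base 3' of the error is wrong, because each decoded base depends on the one before it.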

    Nils, have you tried aligning SOLiD bisulphite reads?


  • totalnew
    replied
    novoalign can align bisulfite sequencing reads, but it is not free of charge.


  • nilshomer
    replied
    Originally posted by fadista View Post
    Hi,

    I would like to know if any of the available next-gen alignment algorithms like maq, bwa, bowtie or others are able to align bisulfite treated reads from a methylation-seq experiment.

    This is a rather tricky alignment because it requires that C's in the reference sequence be allowed to align against T's in the bisulfite-treated reads, without a penalty.

    Maybe one possibility is to use alignment algorithms with a custom scoring matrix?
    I have tried BFAST and MAQ (and BWA) to do this. For BFAST, there are details in the reference manual.


  • xwu
    replied
    I am aware that novoalign has a bisulphite sequencing alignment function built in, but I am not sure about its performance.


  • fadista
    started a topic alignment of bisulfite treated reads

    Hi,

    I would like to know if any of the available next-gen alignment algorithms like maq, bwa, bowtie or others are able to align bisulfite treated reads from a methylation-seq experiment.

    This is a rather tricky alignment because it requires that C's in the reference sequence be allowed to align against T's in the bisulfite-treated reads, without a penalty.

    Maybe one possibility is to use alignment algorithms with a custom scoring matrix?
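    As a rough illustration of the custom-scoring idea, the sketch below counts mismatches between a read and a same-length reference window while letting a reference C pair with a read T at no cost. It assumes ungapped, forward-strand comparison only, and the function name `bisulfite_mismatches` is hypothetical; it is not taken from maq, bwa, bowtie, or any other aligner.

```python
def bisulfite_mismatches(ref_window, read):
    """Count mismatches, treating a reference C under a read T as a match."""
    if len(ref_window) != len(read):
        raise ValueError("window and read must have the same length")
    mismatches = 0
    for ref_base, read_base in zip(ref_window.upper(), read.upper()):
        if ref_base == read_base:
            continue
        if ref_base == "C" and read_base == "T":
            continue  # unmethylated C converted by bisulfite: no penalty
        mismatches += 1
    return mismatches
```

    The scoring is deliberately asymmetric: a reference T under a read C still counts as a mismatch, since bisulfite treatment does not turn T into C.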