alignment of bisulfite treated reads

sci_guy replied

03-12-2010, 09:25 PM
Originally posted by ondovb View Post

We just released SOCS version 2, which has a mode that is fully bisulfite-tolerant for SOLiD data.

Thanks! I'll take a look. I have more SOLiD data coming my way soon.
bioinfosm replied

03-12-2010, 10:05 AM
BSMAP is another tool. I have used it on bisulphite reads and it seems to work well.
ondovb replied

03-12-2010, 10:01 AM
We just released SOCS version 2, which has a mode that is fully bisulfite-tolerant for SOLiD data. It's available at:

http://solidsoftwaretools.com/gf/project/socs/

It will take longer than using a standard algorithm with converted genomes (due to the complexity of the problem), but there won't be any bias in the results.
frozenlyse replied

07-08-2009, 04:35 AM
Originally posted by nilshomer View Post

I don't share your disdain for colorspace since it is quite powerful. For example the false positive rate for SNPs is a lot lower since you need two specific errors next to each other to get a SNP.

Anyhow, under the current chemistry, bisulphite sequencing would be difficult from a bioinformatic perspective for longer reads (>=50bp) on the SOLiD platform. You could consider a targeted pulldown on the Illumina platform to make up for the lower capacity (per dollar).

have you compared the same sample sequenced by illumina vs solid? personally i am quite platform agnostic now that they have comparable levels of throughput and read length, however unless anybody has sequenced the same sample on both platforms i am still not decided as to which gives the best combination of cost vs read length vs throughput

however, i definitely agree with sci_guy - solid colorspace is currently quite useless for bisulfite sequencing... this can be overcome bioinformatically (computationally expensive) however no-one has attempted this as yet.
sci_guy replied

07-07-2009, 04:32 PM
Bis-Seq relies upon counting the C's vs T's in aligned reads so for an unbiased statistic you want alignment potential of a bisulphite-treated DNA read to be equivalent regardless of C density.

With SOLiD you really want to align to a hypomethylated genome (no C's) and a hypermethylated genome (C's remain at CpG sites), since preprocessing the reads to convert C's to T's in colorspace is not possible. Reads with intermediate levels of methylation will be regarded as having SNPs in the alignment pipeline (two colorspace changes in a row). So, if your read has a fair number of CpG sites (say a read at a CpG island) and it goes over your alignment mismatch threshold, it won't align even though it is a perfectly good read. This creates a confounder where there is lowered alignment potential to high-density CpG regions within the genome and to CpG sites near high population frequency SNPs or INDELs. You can counter this by relaxing the number of mismatches allowed (and introducing false positive alignments) or by aligning to a number of permuted bisulphite references. Preprocessed reads with Illumina have none of these issues. If you have a plant genome with CNG and CNN methylation then SOLiD is not a wise choice at all.
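To illustrate the "two colorspace changes in a row" point, here is a rough Python sketch. It assumes the usual SOLiD di-base scheme, in which each color is the XOR of the 2-bit codes of two adjacent bases (A=0, C=1, G=2, T=3). The function names are made up for the example and are not part of any aligner.

```python
# Assumed 2-bit base codes for the SOLiD di-base (color) encoding.
_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}

def encode_colors(seq):
    """Return one color (0-3) per pair of adjacent bases."""
    seq = seq.upper()
    return [_BITS[a] ^ _BITS[b] for a, b in zip(seq, seq[1:])]

def color_diffs(seq_a, seq_b):
    """Return the color positions at which two equal-length sequences differ."""
    colors_a = encode_colors(seq_a)
    colors_b = encode_colors(seq_b)
    return [i for i, (x, y) in enumerate(zip(colors_a, colors_b)) if x != y]

methylated = "ACGTA"   # C protected from conversion
converted = "ATGTA"    # same read with the C read as T

print(encode_colors(methylated))            # [1, 3, 1, 3]
print(encode_colors(converted))             # [3, 1, 1, 3]
print(color_diffs(methylated, converted))   # [0, 1]
```

One converted C changes the two colors that flank it, which is exactly what a SNP looks like to a colorspace aligner. It also shows why a simple C-to-T substitution cannot be applied to the color string itself: the color depends on the neighbouring base, not on the C alone.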

I'm not some sort of Illumina fan-boy. I originally chose SOLiD owing to error checking built into colorspace and the increased number of reads per dollar. However for a second experiment I've swapped to Illumina owing to the potential alignment bias issue and Illumina's increases in bandwidth later this year.
nilshomer replied

07-07-2009, 06:35 AM
Originally posted by sci_guy View Post

If you're using Illumina the easiest (bias-free) way is to preprocess your bisulphite reads to convert C's to T's (remembering where they are) and align them to a reference with all C's changed to T's. Then write a script to introduce the C's back in, or relate these as tables in a database.

As for SOLiD, all this is horrible in colorspace. If you're trying to avoid alignment bias due to methylation differences SOLiD has some bioinformatic issues. You're required to permute the reference a great deal or slacken up the mismatches allowed, sorting out the noise later down the track. If you convert SOLiD reads back into basespace you'll pay a fairly reasonable price - any errors in the read will frameshift base calls 3' to the error <Grumble> <Grumble>

Nils, have you tried aligning SOLiD bisulphite reads?

I don't share your disdain for colorspace since it is quite powerful. For example the false positive rate for SNPs is a lot lower since you need two specific errors next to each other to get a SNP.

Anyhow, under the current chemistry, bisulphite sequencing would be difficult from a bioinformatic perspective for longer reads (>=50bp) on the SOLiD platform. You could consider a targeted pulldown on the Illumina platform to make up for the lower capacity (per dollar).
sci_guy replied

07-07-2009, 05:14 AM
If you're using Illumina the easiest (bias-free) way is to preprocess your bisulphite reads to convert C's to T's (remembering where they are) and align them to a reference with all C's changed to T's. Then write a script to introduce the C's back in, or relate these as tables in a database.
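A minimal Python sketch of that preprocessing step, for base-space reads only. The function names are hypothetical, the alignment itself is left to whatever aligner you use, and only the C-to-T direction is handled here (reads from the opposite strand would need the matching G-to-A treatment).

```python
def convert_read(read):
    """Replace every C with T and remember where the C's were."""
    read = read.upper()
    c_positions = [i for i, base in enumerate(read) if base == "C"]
    return read.replace("C", "T"), c_positions

def restore_read(converted, c_positions):
    """Put the original C's back into a converted read."""
    bases = list(converted)
    for i in c_positions:
        bases[i] = "C"
    return "".join(bases)

def convert_reference(ref):
    """Build the fully C-to-T converted reference to align against."""
    return ref.upper().replace("C", "T")

read = "ACGTTCGA"
converted, c_positions = convert_read(read)
print(converted)                              # ATGTTTGA
print(c_positions)                            # [1, 5]
print(restore_read(converted, c_positions))   # ACGTTCGA
```

After alignment, the restored C positions can be compared against the original (unconverted) reference to count C's vs T's at each site.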

As for SOLiD, all this is horrible in colorspace. If you're trying to avoid alignment bias due to methylation differences SOLiD has some bioinformatic issues. You're required to permute the reference a great deal or slacken up the mismatches allowed, sorting out the noise later down the track. If you convert SOLiD reads back into basespace you'll pay a fairly reasonable price - any errors in the read will frameshift base calls 3' to the error <Grumble> <Grumble>
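A small sketch of why converting colorspace back to basespace is costly, again assuming the XOR-based di-base encoding (A=0, C=1, G=2, T=3) and a known first base. `decode_colors` is a name invented for this example.

```python
# Assumed 2-bit base codes for the SOLiD di-base (color) encoding.
_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
_BASES = "ACGT"

def decode_colors(first_base, colors):
    """Decode a color list into bases, starting from a known first base."""
    bases = [first_base.upper()]
    for color in colors:
        # Each base is derived from the previous base and the current color.
        bases.append(_BASES[_BITS[bases[-1]] ^ color])
    return "".join(bases)

true_colors = [1, 3, 1, 3]
bad_colors = [1, 0, 1, 3]    # a single miscalled color at index 1

print(decode_colors("A", true_colors))   # ACGTA
print(decode_colors("A", bad_colors))    # ACCAT
```

Because every base is decoded from the one before it, a single wrong color corrupts every base call 3' of the error, not just one position.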

Nils, have you tried aligning SOLiD bisulphite reads?
totalnew replied

07-06-2009, 11:27 AM
novoalign can align bisulfite sequencing reads, but it is not free of charge.
nilshomer replied

07-06-2009, 07:44 AM
Originally posted by fadista View Post

Hi,

I would like to know if any of the available next-gen alignment algorithms like maq, bwa, bowtie or others are able to align bisulfite treated reads from a methylation-seq experiment.

This is a rather tricky alignment because it requires that C's in the reference sequence be allowed to align against T's in the bisulfite-treated reads, without a penalty.

Maybe one possibility is to use an alignment algorithm with a custom scoring matrix?

I have tried BFAST and MAQ (and BWA) to do this. For BFAST, there are details in the reference manual.
xwu replied

07-06-2009, 07:14 AM
I am aware that novoalign has a bisulphite sequencing alignment function built in, but I am not sure about its performance.
fadista started a topic alignment of bisulfite treated reads

07-06-2009, 01:37 AM
alignment of bisulfite treated reads

Hi,

I would like to know if any of the available next-gen alignment algorithms like maq, bwa, bowtie or others are able to align bisulfite treated reads from a methylation-seq experiment.

This is a rather tricky alignment because it requires that C's in the reference sequence be allowed to align against T's in the bisulfite-treated reads, without a penalty.

Maybe one possibility is to use an alignment algorithm with a custom scoring matrix?
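To make the scoring idea concrete, here is a rough Python sketch of an asymmetric mismatch count for an ungapped read placed against a reference window. This is only an illustration of the penalty rule and not how maq, bwa or bowtie work internally; `bisulfite_mismatches` is a hypothetical name, and only the C-to-T direction (one strand) is covered.

```python
def bisulfite_mismatches(ref, read):
    """Count mismatches, letting a reference C pair with a read T for free."""
    if len(ref) != len(read):
        raise ValueError("ref and read must have the same length")
    mismatches = 0
    for ref_base, read_base in zip(ref.upper(), read.upper()):
        if ref_base == read_base:
            continue
        if ref_base == "C" and read_base == "T":
            continue  # unmethylated C read as T: no penalty
        mismatches += 1
    return mismatches

print(bisulfite_mismatches("ACGTC", "ATGTT"))   # 0
print(bisulfite_mismatches("ACGTC", "ATGTA"))   # 1
print(bisulfite_mismatches("ATGT", "ACGT"))     # 1
```

The rule is deliberately one-way: a T in the reference against a C in the read is still penalized, since bisulfite treatment cannot produce it.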
Tags: alignment, bisulfite, methylation
