  • lh3
    replied
    Yes, this makes sense. Thank you, ondovb.

  • ondovb
    replied
    Originally posted by lh3
    Suppose the original genomic sequence is ACGTTCA and another position has sequence ATGTTCA. The 2nd C is unmethylated. One of the possible reads you can get is ACGTTtA. According to sci_guy's description, BSMAP prefers ACGTTCA in mapping, but gsnap regards the alignment ambiguous.
    I tried this in GSNAP (you have to pad everything to reach min lengths) and it chose ACGTTCA.

    I think the confusion is coming from that last sentence of the intro that sci_guy quoted...when they say "explicit detection", I think they just intended that to mean it can tell T->C apart from C->T, and treat T->C appropriately as an error.

  • lh3
    replied
    Suppose the original genomic sequence is ACGTTCA and another position has sequence ATGTTCA. The 2nd C is unmethylated. One of the possible reads you can get is ACGTTtA. According to sci_guy's description, BSMAP prefers ACGTTCA in mapping, but gsnap regards the alignment ambiguous.
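A minimal sketch of the example above, assuming forward-strand reads only and a toy scoring rule (neither BSMAP's nor GSNAP's actual code). The helper names `collapse_c_to_t` and `bisulfite_mismatches` are hypothetical. It shows that both candidates look identical once every C is collapsed to T, while an asymmetric comparison (read T over reference C is allowed, read C over reference T is an error) separates them.

```python
def collapse_c_to_t(seq):
    """Reduce a sequence to the three-letter alphabet by turning every C into T."""
    return seq.upper().replace("C", "T")


def bisulfite_mismatches(read, ref):
    """Count mismatches, treating read T over reference C as a bisulfite
    conversion rather than an error. Read C over reference T still counts."""
    if len(read) != len(ref):
        raise ValueError("read and reference window must have equal length")
    mismatches = 0
    for r, g in zip(read.upper(), ref.upper()):
        if r == g:
            continue
        if g == "C" and r == "T":
            continue  # converted (unmethylated) cytosine
        mismatches += 1
    return mismatches


read = "ACGTTtA"                      # lower-case t marks the converted C
candidates = ["ACGTTCA", "ATGTTCA"]

# In three-letter space both candidate loci match the read exactly.
collapsed_hits = [c for c in candidates
                  if collapse_c_to_t(c) == collapse_c_to_t(read)]

# With the asymmetric rule, only the true origin is mismatch-free.
scores = {c: bisulfite_mismatches(read, c) for c in candidates}

print(collapsed_hits)
print(scores)
```

Whether a given aligner reports this read as unique or ambiguous depends on whether it applies such a check after seeding.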

  • ondovb
    replied
    Originally posted by sci_guy
    From my interpretation GSNAP will penalise improperly converted bisulfite reads, but will not make use of the "C" information present in the read, while BSMAP will happily align improperly converted reads but can make use of "C" information.
    The way I read it, they both function similarly in this respect. GSNAP hashes with a reduced alphabet, but will only allow C->T changes when it actually assesses the alignments. So they are both making use of reference C information, but neither of them will know the difference between methylation and incomplete conversion.

    As far as I can tell from the papers, they should theoretically have the same sensitivity and specificity with respect to bisulfite changes.

  • sci_guy
    replied
    I'm reading the GSNAP paper more thoroughly now as it looks really good for a project I'm involved with - variant detection in a region of linkage.

    The last sentence of the introduction is: "The data structures in GSNAP allow it to align BS-seq reads with explicit detection of genomic-T to read-C mismatches, against either a reference sequence or a SNP-tolerant reference space."

    From my interpretation GSNAP will penalise improperly converted bisulfite reads, but will not make use of the "C" information present in the read, while BSMAP will happily align improperly converted reads but can make use of "C" information.

  • sci_guy
    replied
    Hua used an interesting recursive strategy to map more reads back to the Arabidopsis genome. After the first alignment she took the unmapped reads and chopped off the first base and the last few bases. Then, with recursive rounds of aligning and progressively chopping off more 3' end bases, she got 90% of reads to map. It seems the reads mapped in the 2nd and later rounds were actually meaningful. Quite impressive.
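The trimming rounds described above could be sketched roughly as follows. This is an illustration only, not the code used in the talk: `map_with_trimming`, its parameters, and the exact-match `toy_align` stand-in for a real bisulfite aligner are all hypothetical, and the number of bases removed per round is an assumption.

```python
def map_with_trimming(reads, align, trim_3p_step=3, min_len=20):
    """Align reads; re-try unmapped ones after trimming.

    reads: dict of name -> sequence.
    align: callable returning a hit (anything but None) or None.
    After the first round the first base and `trim_3p_step` 3' bases are
    removed; later rounds remove a further `trim_3p_step` 3' bases each.
    Returns name -> (round_number, sequence_that_mapped, hit).
    """
    if trim_3p_step < 1:
        raise ValueError("trim_3p_step must be at least 1")
    mapped = {}
    pending = dict(reads)
    round_no = 0
    while pending:
        still_unmapped = {}
        for name, seq in pending.items():
            hit = align(seq)
            if hit is not None:
                mapped[name] = (round_no, seq, hit)
                continue
            # Drop the first base only once, then keep shortening the 3' end.
            trimmed = seq[1:] if round_no == 0 else seq
            trimmed = trimmed[:-trim_3p_step]
            if len(trimmed) >= min_len:
                still_unmapped[name] = trimmed
        pending = still_unmapped
        round_no += 1
    return mapped


# Toy stand-in for an aligner: exact substring search in a tiny reference.
reference = "TTGACCGATAGGCATTACGGATCCA"

def toy_align(seq):
    pos = reference.find(seq)
    return pos if pos >= 0 else None

reads = {
    "r1": "GACCGATAGGCATT",                   # maps untrimmed
    "r2": "N" + "CGATAGGCATTACG" + "NNN",     # maps after one trimming round
    "r3": "NNNNNNNNNNNN",                     # never maps
}
mapped = map_with_trimming(reads, toy_align, trim_3p_step=3, min_len=8)
print(mapped)
```

Recording the round number alongside each hit makes it easy to check later whether the late-round alignments are as trustworthy as the first-round ones.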

    I also found out that Stuart Stephen from the CSIRO plant industry group has baked up a really nice aligner that is robust to bisulfite. The paper is coming soon...

  • bioinfosm
    replied
    thanks sci_guy

  • sci_guy
    replied
    @lh3. I'm going to a workshop over the next couple of days. It seems somebody else in my organisation has been using BSMAP with Arabidopsis bisulphite-Seq data. Below is their talk abstract. BSMAP would be particularly good for plant genomes considering all the CNG and CNN methylation. I'll see if I can get any slides.

    "Hua Ying (CSIRO)
    Approaches to mapping high-throughput bisulfite sequencing reads: High-throughput bisulfite sequencing is an attractive approach for analyzing genome-wide methylation patterns at a single-base-pair resolution. Although combining bisulfite conversion and high-throughput sequencing is increasingly widespread, its analysis is still problematic and limited to a few publications. A major challenge is the alignment of bisulfite-converted short reads to the reference genome due to increased search space and reduced sequence complexity as a result of the bisulfite conversion. Here, we took advantage of a recently published mapping algorithm BSMAP and demonstrated that BSMAP is more effective than previously used methods. By applying a two-step mapping strategy, we successfully mapped more than 90% of bisulfite short reads to the Arabidopsis genome."

  • lh3
    replied
    @sci_guy

    Yes, BSMAP is better in mapping strategy, although I do not know how much practical improvement this may lead to. It would be good to see a head-to-head comparison. Thanks for the information.

  • sci_guy
    replied
    Originally posted by lh3
    From the gsnap paper, it seems also a decent open-source tool. I have not tried, though.

    Thanks for the heads-up on GSNAP. I just had a look at the paper. It looks very nice, particularly if they release a colorspace version. I am stuck with SOLiD colorspace data at present, so I ended up using SHRiMP with a hypermethylated genome (so C's in CpG context are retained) to match on.

    Re: GSNAP bisulfite seq
    In bisulfite mode the program produces two new hash tables, one with C-to-T substitutions and the other having G-to-A substitutions. From the paper: "When GSNAP processes a bisulfite read, it performs a C-to-T substitution of each 12-mer in the read to check against the C-to-T hash table, and a G-to-A substitution of each 12-mer in the reverse complement of the read to check against the G-to-A hash table."
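As a rough illustration of the quoted scheme (a sketch only; GSNAP's real hash tables are more elaborate), the two converted 12-mer indexes and the read lookup might look like this. The function names, the toy genome, and the example read are all made up for the illustration.

```python
from collections import defaultdict

K = 12  # seed length quoted from the paper


def build_index(genome, src, dst, k=K):
    """Index every k-mer of the genome after replacing base `src` with `dst`."""
    converted = genome.upper().replace(src, dst)
    index = defaultdict(list)
    for i in range(len(converted) - k + 1):
        index[converted[i:i + k]].append(i)
    return index


def revcomp(seq):
    """Reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def seed_hits(read, ct_index, ga_index, k=K):
    """Candidate start positions implied by seed matches.

    The read is C-to-T converted and looked up in the C-to-T table; its
    reverse complement is G-to-A converted and looked up in the G-to-A table.
    """
    hits = {"ct": set(), "ga": set()}
    fwd = read.upper().replace("C", "T")
    rev = revcomp(read).replace("G", "A")
    for off in range(len(fwd) - k + 1):
        for pos in ct_index.get(fwd[off:off + k], []):
            hits["ct"].add(pos - off)
        for pos in ga_index.get(rev[off:off + k], []):
            hits["ga"].add(pos - off)
    return hits


genome = "GATTACAGGCTTACCGTAGCATCGGATACCTGA"   # toy reference
ct_index = build_index(genome, "C", "T")
ga_index = build_index(genome, "G", "A")

# genome[5:21] is CAGGCTTACCGTAGCA; here every C is converted except the
# C of the CpG at reference position 14, which is left as if methylated.
read = "TAGGTTTATCGTAGTA"
hits = seed_hits(read, ct_index, ga_index)
print(hits)
```

Because both the table and the read are converted before lookup, the C that survived in the read plays no part in seeding; it can only matter in a later verification step.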

    So, essentially it creates a bisulfite hypomethylated genome and then looks for seed matches within in silico "hypomethylated reads". All seed matching is therefore in a three-base space with no C's present at all. BSMAP is a little cannier. Reads don't have C's removed. Instead, C's in the read are matched only to C's in the reference, while T's in the read can be matched to either C's or T's in the reference. Another way of thinking about this is that Illumina reads have T's converted to Y's and are matched against a standard (not in silico bisulfite converted) reference genome. In this respect the C's present in the read help to eliminate more dubious alignment candidates, so it is a slightly more information-dense match than purely three-base matching. An interesting effect is that improperly bisulfite converted material (that containing many unconverted C's) will align equally as well as properly converted material. That means more work in downstream filtering perhaps, but a better estimate of bisulfite conversion instead of just adding up all the C's in reads mapped to mitochondrial DNA.
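The "T becomes Y" view can be sketched with a regular expression, where each read T is allowed to match C or T in an unconverted reference. This is only an illustration of the idea under toy assumptions (forward strand, exact matching), not BSMAP's seeding code; `find_positions` and `conversion_counts` are hypothetical helpers.

```python
import re


def find_positions(read, genome):
    """Start positions where the read matches the unconverted genome,
    letting each read T pair with genomic C or T (IUPAC Y)."""
    pattern = read.upper().replace("T", "[CT]")
    # A lookahead is used so that overlapping matches are all reported.
    return [m.start() for m in re.finditer("(?=" + pattern + ")", genome.upper())]


def conversion_counts(read, ref_window):
    """Count (converted, unconverted) cytosines at reference C positions."""
    converted = unconverted = 0
    for r, g in zip(read.upper(), ref_window.upper()):
        if g == "C":
            if r == "T":
                converted += 1
            elif r == "C":
                unconverted += 1
    return converted, unconverted


genome = "GATTACAGGCTTACCGTAGCATCGGATACCTGA"   # toy reference

fully_converted = "TAGGTTTATTGTAGTA"   # every C read as T
unconverted = "CAGGCTTACCGTAGCA"       # no C converted at all
partly_converted = "TAGGTTTATCGTAGTA"  # one C retained

for read in (fully_converted, unconverted, partly_converted):
    print(read, find_positions(read, genome))

window = genome[5:21]
counts = conversion_counts(partly_converted, window)
print(counts)
```

All three reads land on the same position, which is the "improperly converted material aligns equally well" effect. Note that the retained C's counted here could reflect methylation just as well as incomplete conversion, so the counts alone do not distinguish the two.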
    Last edited by sci_guy; 03-22-2010, 03:02 PM.

  • lh3
    replied
    From the gsnap paper, it seems also a decent open-source tool. I have not tried, though.

  • sci_guy
    replied
    I don't have access to the slides but the material is covered essentially in their BSMAP paper.

    lh3 - Yes, I forgot about Novoalign. I should qualify my statement and suggest that BSMAP is perhaps the best free bisulfite aligner out there at present.

  • lh3
    replied
    novoalign and gsnap (http://www.gene.com/share/gmap/) also do bisulfite alignment. So far as I know, all existing programs for bisulfite alignment take a very similar strategy.

  • bioinfosm
    replied
    Originally posted by sci_guy
    I saw Wei Li talk about BSMAP at the AACR 2010 Cancer Epigenetics meeting. It was a nice talk. I like their use of what cytosines are present in the read to extract as much information as possible without creating bias.

    It's probably the best Illumina bisulfite aligner out there at the moment.
    That's interesting to know. Is it possible for you to share that talk/slides?

  • sci_guy
    replied
    Originally posted by bioinfosm
    bsmap is another tool. I have used it on bisulphite reads and it seems to work well
    I saw Wei Li talk about BSMAP at the AACR 2010 Cancer Epigenetics meeting. It was a nice talk. I like their use of what cytosines are present in the read to extract as much information as possible without creating bias.

    It's probably the best Illumina bisulfite aligner out there at the moment.
