  • Bowtie (Bismark) 100% of reads fail to align

    I am using Bismark to observe the progression of methylation at specific CpG sites of different viral variants. One of the steps of Bismark uses Bowtie to align paired-end reads to a CT-converted reference genome. At this step all reads fail to align. This is strange, since the issue is isolated to this single variant. I also aligned all of the reads for this variant to a CT-converted version of the reference genome with BWA, and all of the reads align at the expected position. Is this due to Bowtie masking the reads because of low complexity? Or is there some other difference between BWA and Bowtie that is causing the issue?

    Using Bowtie on its own to align the reads to the converted reference yields the same alignment failure. I also tried adjusting the Bowtie options to allow more mismatches and to increase the maximum insert size, but the same failure persists.

  • #2
    How many of the reads get soft-clipped when you use BWA (I'm assuming via bwa-meth)? I think bismark is using end-to-end alignment, which won't work well if you need some soft-clipping (you could alternatively use Trim Galore! beforehand).
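One way to answer the soft-clipping question is to count how many aligned records in the BWA output carry an `S` operation in their CIGAR string. The following is a minimal sketch, assuming the alignments are available as plain-text SAM lines; the function name `soft_clipped_counts` is hypothetical and not part of BWA, bwa-meth, or Bismark.

```python
import re

# Matches one CIGAR operation, e.g. "90M" or "10S" (operations per the SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def soft_clipped_counts(sam_lines):
    """Return (soft_clipped, aligned) record counts from an iterable of SAM lines."""
    aligned = 0
    clipped = 0
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:               # not a valid alignment record
            continue
        flag = int(fields[1])
        cigar = fields[5]
        if flag & 0x4 or cigar == "*":    # read is unmapped
            continue
        aligned += 1
        if any(op == "S" for _, op in CIGAR_OP.findall(cigar)):
            clipped += 1
    return clipped, aligned
```

For a file on disk, something like `with open("aln.sam") as fh: print(soft_clipped_counts(fh))` would work, where `aln.sam` is a placeholder name. A high soft-clipped fraction would suggest that an end-to-end aligner is rejecting reads because of their ends.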


    • #3
      Thank you for your response, Ryan. I quality trim the reads beforehand with print-seq. I did think that misalignment in the tails could be causing the issue, and actually tried over-trimming by 20+ bases. This did not help. I made the alignment with vanilla BWA and used the bisulfite-converted reference from Bismark.

      Thank you for pointing me towards bwa-meth. It looks promising and I will likely just switch to it if I cannot get the bowtie to work properly.


      • #4
        Hi Mykhaylo,

        I just ran a few tests with your data, and it looks like the reason for the poor alignment rates is that your data is riddled with insertions between bases 120-150.

        The general quality towards the 3' end is poor but not shocking (see the attached FastQC profile).

        There are however lots and lots of insertions towards the 3' end (up to 80% for certain positions, see the attached BamQC plot), which is the reason for the poor mapping efficiency. I suspect that something weird might have happened during the run, or maybe it is just some kind of artefact due to the sequence composition? Just briefly looking over it, there are at least 10 CTTs and other CCCTTT repeats in the region in question... Alternatively, it could of course be the case that the reference genome in that very region is simply wrong.

        Hard trimming the reads to 110 bp and using Bismark defaults (which are quite strict) already brought the mapping efficiency up to > 80%; allowing more InDels with --score_min L,0,-0.4 brought it up to almost 97%. Just allowing more mismatches on the file as you provided it (--score_min L,0,-0.6) also yielded 96% mapping efficiency.
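To make the two fixes above concrete, here is a minimal sketch of what they amount to. It assumes the Bowtie 2 convention that `--score_min L,a,b` sets the minimum alignment score to `a + b * read_length`, and that the "Bismark defaults" correspond to `L,0,-0.2`. The function names and the 150 bp untrimmed length are hypothetical, chosen only for illustration.

```python
def min_score_linear(read_length, intercept, slope):
    """Minimum alignment score for a linear function L,intercept,slope."""
    return intercept + slope * read_length

def hard_trim(records, max_len=110):
    """Truncate each FASTQ record (header, seq, plus, qual) to max_len bases."""
    for header, seq, plus, qual in records:
        yield header, seq[:max_len], plus, qual[:max_len]

# Compare how much total penalty each setting tolerates.
# 110 bp is the trimmed length from the post; 150 bp is an assumed raw length.
for length in (110, 150):
    for slope in (-0.2, -0.4, -0.6):
        threshold = min_score_linear(length, 0, slope)
        print(f"{length} bp, L,0,{slope}: minimum score {threshold:.1f}")
```

The more negative the slope, the more penalty from mismatches and gaps a read can accumulate before it is discarded, which is consistent with the mapping efficiency rising as the threshold is relaxed.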

        Switching tools is one thing and fine (you can only hope that the data will be clipped), but you need to understand that the data provided (or potentially the genome for the region in question) is flawed.

        Cheers, Felix

        • #5
          Bisulfighter for mapping of bisulfite-converted reads

          Hi everyone!
          I am trying to use Bisulfighter instead of Bismark for mapping and mC detection of bisulfite-converted samples. I used bsf-call and the mapping appears to work. But then, when parsing the .maf file it produced, it reports "ERROR Exception has occured". Does anyone have an idea of what the problem could be? Second question: does anyone know what the "blocks" in the .maf file mean?
          Thank you very much.
          Roberta
