  • lukas1848
    Member
    • Jun 2011
    • 54

    reference based assembly from bam?

    Hi,

    I have mapped 100 bp Illumina reads to my reference genome and I wondered if there is any tool that can create an assembly of my reads from the bam file I created or if I have to start with the fastq files again using a reference based assembler.

    Maybe I am missing something, but shouldn't it be fairly easy to assemble scaffolds from BAM files?
  • swbarnes2
    Senior Member
    • May 2008
    • 910

    #2
    Velvet will take .bams as input, but it will treat them like fastqs. It won't use alignment data in the assembly, just the reads.
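A minimal sketch of that, assuming a Velvet version that accepts `-bam`. The function name `assemble_from_bam`, the output directory, and the k-mer length of 31 are placeholders, not recommendations. Reads are passed as unpaired (`-short`) to keep the example simple, and the mapping coordinates stored in the BAM are ignored.

```shell
# Hypothetical helper: de novo assembly with Velvet, reading reads from a BAM.
# $1 = BAM file, $2 = output directory, $3 = k-mer length (placeholder default: 31)
assemble_from_bam() {
    # velveth builds the k-mer hash table; -bam only tells it the input format
    velveth "$2" "${3:-31}" -short -bam "$1" &&
    # velvetg builds the graph and writes contigs into the output directory
    velvetg "$2"
}
```

Usage would look like `assemble_from_bam aln.bam velvet_out 31`.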


    • Wallysb01
      Senior Member
      • Feb 2011
      • 286

      #3
      There are a few programs that do reference-guided genome assembly, if that's what you're looking for. I'm not that familiar with these tools, but a quick Google search found a few. Mosaik, which supports BAMs: http://bioinformatics.bc.edu/marthlab/Mosaik

      Also AMOScmp (or AMOScmp-shortread): http://sourceforge.net/apps/mediawik...?title=AMOScmp

      I know there are others, but again, this isn't something I do.

      What are you trying to do anyway? Are you reassembling the genome of a specific individual for calling variants, or is this a transcriptome? Are you using a related species as the reference, or is this the same species?


      • lukas1848
        Member
        • Jun 2011
        • 54

        #4
        Thanks.
        I sequenced individuals from a different population than the one used for the reference genome. The two populations have distinct phenotypes, and I primarily want to call SNPs. But since I have the aligned reads, I thought it would be nice and easy to assemble them into scaffolds as well.

        I found this pipe

        samtools mpileup -uf ref.fa aln.bam | bcftools view -cg - | vcfutils.pl vcf2fq > cns.fq
        from http://samtools.sourceforge.net/mpileup.shtml

        and I wondered if the consensus sequence created here is just what I was looking for. If not, what does the consensus sequence provide?
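For readers on newer releases, where pileup generation and calling both live in bcftools, the same idea can be sketched as below. This is an assumption about a current install, not part of the original samtools page. The function name `consensus_fastq` and the `VCFUTILS` variable (used only to point at wherever `vcfutils.pl` lives) are hypothetical.

```shell
# Hypothetical helper: reference-anchored consensus from an existing alignment.
# $1 = reference FASTA, $2 = sorted BAM, $3 = output FASTQ
consensus_fastq() {
    # 1) compute genotype likelihoods per reference position
    # 2) call genotypes with the consensus caller (-c)
    # 3) convert the calls into a consensus sequence in FASTQ form
    bcftools mpileup -f "$1" "$2" |
        bcftools call -c |
        "${VCFUTILS:-vcfutils.pl}" vcf2fq > "$3"
}
```

Usage would look like `consensus_fastq ref.fa aln.bam cns.fq`. The result stays in the coordinate system of the reference, so it is a consensus rather than an independent assembly.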


        • Wallysb01
          Senior Member
          • Feb 2011
          • 286

          #5
          Well, it will recreate a genome for that individual, I believe filling in any gaps based on the reference, though I'm not sure about that. So, it just depends on what you want.


          • lukas1848
            Member
            • Jun 2011
            • 54

            #6
            Originally posted by Wallysb01:
            I believe filling in any gaps based on the reference, though I'm not sure about that.
            So if gaps are filled with reference sequence, the consensus would basically conceal any larger deletions, and larger insertions would just end up in the bin as unmapped reads, right?

            So I guess to end up with a proper assembly for my sample, I would need to start a reference based assembly from scratch using something like Velvet.


            • Wallysb01
              Senior Member
              • Feb 2011
              • 286

              #7
              I'm guessing that if it sees big gaps in the aligned reads, it will not assume they are actually large indels. So yes, de novo assembly or reference-guided assembly is probably your better option if you're looking for larger structural variation.
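One way to look at the reads that never made it into the alignment is to pull them back out of the BAM and hand them to an assembler. A rough sketch, assuming a samtools release that has the `fastq` subcommand (older releases called it `bam2fq`); the function name `unmapped_to_fastq` is hypothetical.

```shell
# Hypothetical helper: write the unmapped reads of a BAM to a FASTQ file.
# $1 = input BAM, $2 = output FASTQ
unmapped_to_fastq() {
    # -f 4 keeps only records with the "read unmapped" flag (0x4) set;
    # -b keeps the stream in BAM format for the next step
    samtools view -b -f 4 "$1" |
        samtools fastq - > "$2"
}
```

Usage would look like `unmapped_to_fastq aln.bam unmapped.fq`. Whether those reads come from real insertions, contamination, or low-quality sequence is something only a follow-up assembly or alignment can tell.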
