  • RockChalkJayhawk
    Senior Member
    • Mar 2009
    • 192

    pileup2bam

    Is anyone aware of a tool that can convert a pileup file to a BAM file?
  • maubp
    Peter (Biopython etc)
    • Jul 2009
    • 1544

    #2
    That doesn't make sense. BAM files (like SAM files) describe where individual reads are placed, whereas a pileup file is a coverage-based summary of many reads.


    • RockChalkJayhawk
      Senior Member
      • Mar 2009
      • 192

      #3
      Yes, but theoretically you can re-create the BAM information from the pileup because all the information is stored in the pileup file. Or am I mistaken?


      • Alex Renwick
        Member
        • Jul 2011
        • 44

        #4
        Going from pileup to bam is a theoretical possibility, but I doubt anyone has taken the trouble to make it a reality. Even the best you could do would miss out on some information, like pairing. Why do you want to do this?


        • RockChalkJayhawk
          Senior Member
          • Mar 2009
          • 192

          #5
          The problem is I am using an RNA-Seq workflow that produces a pileup rather than a BAM, so I can't annotate it with GATK.


          • maubp
            Peter (Biopython etc)
            • Jul 2009
            • 1544

            #6
            What kind of workflow? If you are using Galaxy it probably is making a BAM file but hiding it by default - check the hidden entries.


            • Alex Renwick
              Member
              • Jul 2011
              • 44

              #7
              There must be an easier way to get what you want than resurrecting a bam from pileup. You could look at the files upstream of the pileup and see if they can be translated to bam format, or, on the other hand, go forward to create a vcf (or some other bed/gff file) from the pileup and annotate that.


              • Peterlorre
                Junior Member
                • Oct 2015
                • 2

                #8
                I find myself in this situation using legacy data (e.g., original sequencing files from an old project were lost but pileups are still available). I wanted to bump this thread to see if anyone has come up with a good solution to this problem. Are there any utilities out there that can convert a pileup to a BAM or something like it?


                • gringer
                  David Eccles (gringer)
                  • May 2011
                  • 845

                  #9
                  You need to be more specific about the pileup format that is used. A text-based format like the output of 'samtools mpileup' would be easier to convert to SAM than a graphical image, for example. But even given the huge amount of information that could be extracted from 'samtools mpileup', it doesn't store everything. For example, the sequence names are not preserved, and no pairing information is retained.

                  Any pileup output which is nothing more than coverage values without linking between different bases will not be sufficient for generating a source SAM file.
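
To make that concrete, here is a minimal sketch, assuming the six-column, single-sample text output of 'samtools mpileup' (chromosome, 1-based position, reference base, depth, read bases, base qualities). The names `PileupRow` and `parse_mpileup_line` and the sample line are made up for illustration and do not come from any tool. Note that no column carries a read name or any mate information.

```python
from typing import NamedTuple


class PileupRow(NamedTuple):
    """One line of single-sample mpileup text output (hypothetical container)."""
    chrom: str
    pos: int      # 1-based reference position
    ref: str      # reference base
    depth: int    # number of reads covering this position
    bases: str    # read-bases column ('.', ',', letters, '^', '$', indels)
    quals: str    # one base-quality character per read


def parse_mpileup_line(line: str) -> PileupRow:
    """Split one tab-separated mpileup line into its first six columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 6:
        raise ValueError(f"expected at least 6 columns, got {len(fields)}")
    chrom, pos, ref, depth, bases, quals = fields[:6]
    return PileupRow(chrom, int(pos), ref, int(depth), bases, quals)


# Illustrative line: three reads cover chr1:100, one starting and one ending.
row = parse_mpileup_line("chr1\t100\tA\t3\t^I.,$T\tIII")
print(row)
```

Everything a converter could use has to come from those six fields; read names and pairing would have to be invented.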


                  • Peterlorre
                    Junior Member
                    • Oct 2015
                    • 2

                    #10
                    That's a good point. I'm talking about the text-based format, and the experiment was run single-end, so pairing isn't important. It's true that I wouldn't have the original read names, but naively that doesn't seem very important. Is there software that actually uses those, beyond read pairing?

                    That said, I suppose that I hadn't considered whether the pileup format explicitly maintains the individual bases within reads. I had always assumed that the base calls at each position were ordered readwise, but it occurs to me that this may not be the case.

                    Anyway, it sounds like a tool to do this conversion probably doesn't exist. Thank you for your input.


                    • gringer
                      David Eccles (gringer)
                      • May 2011
                      • 845

                      #11
                      mpileup does retain read order information in the per-base coverage, but I'm not aware of any other text-based formats that do... actually, I'm not aware of any other text-based formats that do anything more than coverage alone.
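
As a rough illustration of how that read order could be exploited, here is a sketch rather than a tested converter. It assumes single-sample 'samtools mpileup' text for one chromosome, with contiguous positions and every read present at every position it spans (so no base-quality filtering and no depth cap). In the read-bases column a new read is announced by `^` plus a mapping-quality character, and a read's last base is followed by `$`; reads without `^` are taken to continue in the same relative order as at the previous position. The function names `tokenize`, `decode` and `rebuild_reads` are hypothetical.

```python
def tokenize(bases):
    """Split the read-bases column into (is_start, base, indel, is_end) per read."""
    i, n, out = 0, len(bases), []
    while i < n:
        is_start = bases[i] == "^"
        if is_start:
            i += 2                      # skip '^' and the mapping-quality character
        base = bases[i]
        i += 1
        indel = ""
        if i < n and bases[i] in "+-":  # e.g. '+2ag' or '-1c'
            j = i + 1
            while j < n and bases[j].isdigit():
                j += 1
            length = int(bases[i + 1:j])
            indel = bases[i] + bases[j:j + length]
            i = j + length
        is_end = i < n and bases[i] == "$"
        if is_end:
            i += 1
        out.append((is_start, base, indel, is_end))
    return out


def decode(base, ref):
    """Turn one pileup symbol into the read base it stands for."""
    if base in ".,":
        return ref.upper()              # match on forward / reverse strand
    if base in "*#<>":
        return ""                       # deleted or skipped reference base
    return base.upper()                 # mismatch


def rebuild_reads(rows):
    """rows: (pos, ref, bases) tuples sorted by position. Returns read dicts."""
    active, finished = [], []
    for pos, ref, bases in rows:
        carried = iter(active)          # reads continuing from the previous position
        active = []
        for is_start, base, indel, is_end in tokenize(bases):
            if is_start:
                reverse = base == "," or base.islower()
                read = {"start": pos, "seq": "", "reverse": reverse}
            else:
                read = next(carried)
            read["seq"] += decode(base, ref)
            if indel.startswith("+"):   # inserted bases follow this position
                read["seq"] += indel[1:].upper()
            (finished if is_end else active).append(read)
    return finished + active


rows = [(1, "A", "^I.^I,"), (2, "C", ".$t+2ag"), (3, "G", ",$")]
reads = rebuild_reads(rows)
print([r["seq"] for r in reads])        # ['AC', 'ATAGG']
```

The result is still only a start position, a strand and a sequence per read. Qualities could be carried along the same way from the sixth column, but CIGAR strings, read names and flags would have to be built or invented before anything could be written out as SAM/BAM.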
