  • jnfass
    Member
    • Aug 2008
    • 88

    samtools pileup for multiple diploid individuals?

    Is it appropriate to use samtools pileup (which uses maq's consensus- and SNP-calling model) on pooled reads from multiple diploid individuals? I'm looking for SNPs both within the read population and between the reads and the reference. I've got 6 individuals, sequenced as separate samples, from a species close to the reference. Depth is low enough that I should probably forget about mapping each individual separately (in other words, I can forget about calling a genotype for each individual). But I'd still like to call the most likely consensus and SNP for this population ...

    Comment

    • bekkari
      Member
      • Oct 2009
      • 10

      Does anyone know of a tool/program to convert SAM output to BED format? If so, please let me know.

      Comment

      • zlu
        Member
        • Nov 2008
        • 34

        Originally posted by zee View Post
        Is there a way to convert a SAM consensus output (using -c option for pileup) to the old maq-style .cns consensus?

        I have some maq-based pipelines I would like to use on my BWA results.
        Has anyone had any luck with this? In addition, MAQ can dump unmapped reads into a separate file; is there a similar function/tool in samtools? Thank you.

        Comment

        • apfejes
          Senior Member
          • Feb 2008
          • 236

          bekkari:

          I believe ConvertToBed.jar in the Vancouver Short Read Analysis Package can do it.

          Anthony
          The more you know, the more you know you don't know. —Aristotle
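For readers who only need a quick conversion, here is a minimal Python sketch of the SAM-to-BED idea. It is not ConvertToBed.jar and makes no claim about how that tool works. It assumes tab-separated SAM records with a numeric FLAG field, and the function names `reference_span` and `sam_to_bed` are made up for illustration.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_span(cigar):
    """Number of reference bases an alignment covers (M, D, N, = and X consume reference)."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MDN=X")


def sam_to_bed(line):
    """Return a BED6 line for one SAM record, or None for header lines and unmapped reads."""
    if line.startswith("@"):
        return None
    f = line.rstrip("\n").split("\t")
    flag = int(f[1])  # assumes a numeric FLAG, not a letter string
    if flag & 0x4 or f[5] == "*":
        return None
    start = int(f[3]) - 1  # SAM POS is 1-based; BED start is 0-based
    end = start + reference_span(f[5])  # BED end is exclusive
    strand = "-" if flag & 0x10 else "+"
    return "\t".join([f[2], str(start), str(end), f[0], f[4], strand])
```

The MAPQ value is reused as the BED score column here, which is a choice of this sketch rather than a convention stated in the thread.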

          Comment

          • bekkari
            Member
            • Oct 2009
            • 10

            if you use BOWTIE as an alignment algorithm, there is an option (--un) to dump all unmapped reads into a file.

            Comment

            • zlu
              Member
              • Nov 2008
              • 34

              Originally posted by bekkari View Post
              if you use BOWTIE as an alignment algorithm, there is an option (--un) to dump all unmapped reads into a file.

              But I'm using BWA. Is there an unmapped flag in the sam file?
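On the flag question: in the SAM format, bit 0x4 of the FLAG field marks a read as unmapped, so unmapped reads can be pulled out of any aligner's SAM output. The sketch below assumes tab-separated records with a numeric FLAG, and `split_unmapped` is a hypothetical helper name.

```python
def is_unmapped(sam_line):
    """True if bit 0x4 (read unmapped) is set in the FLAG field."""
    flag = int(sam_line.split("\t")[1])
    return bool(flag & 0x4)


def split_unmapped(sam_lines):
    """Separate alignment records into (mapped, unmapped); header lines are skipped."""
    mapped, unmapped = [], []
    for line in sam_lines:
        if line.startswith("@"):
            continue
        if is_unmapped(line):
            unmapped.append(line)
        else:
            mapped.append(line)
    return mapped, unmapped
```

On the command line, `samtools view -f 4` applies the same required-flag test.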

              Comment

              • xiang
                Member
                • Mar 2009
                • 13

                Partial pileup for samtools?

                samtools pileup takes ages, so does varfilter.

                Can samtools pileup work on one chromosome at a time? That would make
                parallelization much easier.
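One possible workaround, independent of any samtools option, is to split the alignments by reference sequence first and process each chunk separately. This is only a sketch under the assumption that the input is text SAM with tab-separated fields; `split_by_reference` is a hypothetical name.

```python
from collections import defaultdict


def split_by_reference(sam_lines):
    """Group SAM records by RNAME (column 3); returns (header_lines, {rname: records})."""
    header = []
    by_ref = defaultdict(list)
    for line in sam_lines:
        if line.startswith("@"):
            header.append(line)
            continue
        rname = line.split("\t")[2]
        if rname != "*":  # "*" means the read has no reference assigned
            by_ref[rname].append(line)
    return header, dict(by_ref)
```

Each group, written out together with the header lines, can then be handed to its own worker process.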

                Comment

                • lh3
                  Senior Member
                  • Feb 2008
                  • 686

                  Comment

                  • zlu
                    Member
                    • Nov 2008
                    • 34

                    properly mapped Flag

                    Perhaps I have misunderstood, but isn't it right that the properly mapped flag (P in the flag string) is only set when both reads of a pair map to the same chromosome with the correct insert size? About 6% of the reads flagged as properly mapped (P in the flag string) have their mate mapped to a different chromosome with an insert size of 0, as shown below. Has anyone seen this before?

                    EBRI093151:1:90:555:299#0 pPR1 Chr11 107308221 23 36M Chr12 49 0 TATCCTATTCGAAAGTCGCCATGACCGTGGACATGA BCCBCBBACCB?CBA@BBACCBCAAB<6<?BABBBB XT:A:U NM:i:0 SM:i:23 AM:i:23 X0:i:1 X1:i:1 XM:i:0 XO:i:0 XG:i:0 MD:Z:36

                    EBRI093151:1:90:555:299#0 pPr2 Chr12 49 23 11S7M1D12M6S Chr11 107308221 0 CTACCGCTTGGGTGGTCATGAATGATTAGCACGCCC AB99@B=BBA>ACCBCCBBCBBCBBBBCBCBB@B@A XT:A:M NM:i:4 SM:i:23 AM:i:23 XM:i:3 XO:i:1 XG:i:1 MD:Z:3A3^T2T5C3
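To count how many records show this pattern, a small Python check along these lines may help. It is a sketch with assumptions: the letter-to-bit mapping follows the order "pPuUrR12sfd" starting at bit 0x1 (so P is the proper-pair bit 0x2), fields are whitespace-separated as in the records above, and both function names are made up.

```python
# Assumed order of flag letters, lowest bit first: p=0x1, P=0x2, u=0x4, U=0x8,
# r=0x10, R=0x20, 1=0x40, 2=0x80, s=0x100, f=0x200, d=0x400.
FLAG_LETTERS = "pPuUrR12sfd"


def decode_flag(text):
    """Accept a numeric FLAG ('99') or a letter string ('pPR1') and return an int.

    A string made only of digits is treated as numeric, so the bare letter
    flags '1' and '2' cannot be told apart from numbers here.
    """
    if text.isdigit():
        return int(text)
    return sum(1 << FLAG_LETTERS.index(ch) for ch in text)


def suspicious_proper_pair(sam_line):
    """True if the proper-pair bit (0x2) is set but the mate is on another reference."""
    f = sam_line.split()
    flag = decode_flag(f[1])
    rname, mate_ref = f[2], f[6]
    return bool(flag & 0x2) and mate_ref not in ("=", "*", rname)
```

Under that mapping the two records above decode to 99 and 147, and both are reported by `suspicious_proper_pair`.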

                    Comment

                    • lh3
                      Senior Member
                      • Feb 2008
                      • 686

                      Try the latest version of bwa.

                      Comment

                      • zlu
                        Member
                        • Nov 2008
                        • 34

                        Originally posted by lh3 View Post
                        Try the latest version of bwa.
                        Heng,

                        This was done with BWA 0.5.4.

                        For a resequencing project, does it really matter if the mates are not properly mapped? Can I instead just filter out the reads with low mapping quality?

                        Thank you.
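Mechanically, filtering on mapping quality is a one-column test on MAPQ (column 5 of a SAM record). The sketch below only shows the mechanics and says nothing about whether this is the right QC for a given project. It assumes tab-separated SAM text; the default threshold of 10 and the name `filter_by_mapq` are illustrative choices.

```python
def filter_by_mapq(sam_lines, min_mapq=10):
    """Yield header lines unchanged and only those records with MAPQ >= min_mapq."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line
        elif int(line.split("\t")[4]) >= min_mapq:
            yield line
```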

                        Comment

                        • nilshomer
                          Nils Homer
                          • Nov 2008
                          • 1283

                          Originally posted by zlu View Post
                          Perhaps I have misunderstood, but isn't it right that the properly mapped flag (P in the flag string) is only set when both reads of a pair map to the same chromosome with the correct insert size? About 6% of the reads flagged as properly mapped (P in the flag string) have their mate mapped to a different chromosome with an insert size of 0, as shown below. Has anyone seen this before?

                          EBRI093151:1:90:555:299#0 pPR1 Chr11 107308221 23 36M Chr12 49 0 TATCCTATTCGAAAGTCGCCATGACCGTGGACATGA BCCBCBBACCB?CBA@BBACCBCAAB<6<?BABBBB XT:A:U NM:i:0 SM:i:23 AM:i:23 X0:i:1 X1:i:1 XM:i:0 XO:i:0 XG:i:0 MD:Z:36

                          EBRI093151:1:90:555:299#0 pPr2 Chr12 49 23 11S7M1D12M6S Chr11 107308221 0 CTACCGCTTGGGTGGTCATGAATGATTAGCACGCCC AB99@B=BBA>ACCBCCBBCBBCBBBBCBCBB@B@A XT:A:M NM:i:4 SM:i:23 AM:i:23 XM:i:3 XO:i:1 XG:i:1 MD:Z:3A3^T2T5C3
                          I don't see anywhere in the specification how to set the "properly paired" bit. I would guess this is aligner dependent.

                          Comment

                          • suseq
                            Junior Member
                            • Sep 2009
                            • 3

                            varFilter output

                            Hi,

                            I have used samtools with varFilter to analyse variation. I imported an alignment file from BWA in SAM format, sorted it, and ran:
                            1. samtools pileup -vcf ...
                            2. samtools.pl varFilter...| awk '$6>=20' ...

                            It ran, but I have trouble interpreting the columns. This is what I think they are:
                            column 1: chromosome
                            column 2: first base coordinate from the ref.
                            column 3: ref. base
                            column 4: consensus base
                            column 5: ???
                            column 6: mapping quality
                            column 7: ???
                            column 8: read depth
                            column 9: read base column
                            column 10: ???

                            Does somebody know which values are in column 5, 7 and 10? I could not find this information.
                            Last edited by suseq; 11-23-2009, 01:17 AM.

                            Comment

                            • lh3
                              Senior Member
                              • Feb 2008
                              • 686



                              and



                              the pileup command.
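As far as the samtools manual describes the consensus pileup (`pileup -c`), a SNP line has ten columns: chromosome, 1-based position, reference base, consensus base, consensus quality, SNP quality, RMS mapping quality, read depth, read bases, and base qualities. On that reading, column 6 is the SNP quality rather than the mapping quality. The Python sketch below encodes this layout; the field names are labels chosen here, and indel lines (reference base `*`), which use a different layout, are skipped.

```python
from collections import namedtuple

# Column labels follow the manual's description; the names themselves are illustrative.
PileupSite = namedtuple(
    "PileupSite",
    "chrom pos ref consensus consensus_qual snp_qual rms_mapq depth read_bases base_quals",
)


def parse_pileup_line(line):
    """Parse one 10-column consensus pileup line; return None for indel lines."""
    f = line.rstrip("\n").split("\t")
    if f[2] == "*":
        return None  # indel lines carry extra columns and are not handled here
    if len(f) != 10:
        raise ValueError("expected 10 tab-separated columns, got %d" % len(f))
    return PileupSite(
        f[0], int(f[1]), f[2], f[3],
        int(f[4]), int(f[5]), int(f[6]), int(f[7]),
        f[8], f[9],
    )
```

With this layout, a filter such as `awk '$6>=20'` corresponds to `site.snp_qual >= 20`.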

                              Comment

                              • zlu
                                Member
                                • Nov 2008
                                • 34

                                I'm wondering: for a genome assembly project, will duplicate removal (with samtools rmdup) and filtering out reads with low mapping quality (e.g. mapQ < 10) improve my assembly? What do people usually do for QC after mapping with, e.g., bwa?

                                Another, slightly different issue: does it matter if two fastq files (from two Solexa runs) have exactly identical headers? How does samtools sort the BAM file? Does it take the header IDs into consideration?

                                Thank you.

                                Comment
