
  • Phylogeny on SNP

    Hello,

    I'm looking for a tool that infers phylogenies of different species from SNP calls. Any suggestions? I have a table containing positions and SNP calls, and I want to infer the phylogeny of the species from those SNPs. At the moment, I really don't know how to tackle the problem other than to binarize the calls and cluster them with different algorithms.


    Thanks,


    Phil

  • #2
    Most phylogeny estimation tools (phylip, phyml, paup*, MrBayes, *BEAST etc) require their input to be in fasta or phylip format. SNPs alone are tricky for those tools since there's a lot of ignored data (everything in between the SNPs), which makes estimating branch lengths difficult.
    Also keep in mind that there might not actually *be* a simple tree underlying your data - recombination and incomplete lineage sorting will make the ancestry of the sequences a potentially complex network, not a simple tree.
    With those caveats, I think making a fasta-formatted input file is your best bet.
    good luck!



    • #3
      Originally posted by brofallon
      With those caveats, I think making a fasta-formatted input file is your best bet.
      good luck!
      So one possibility is to just concatenate the SNP calls into a complete sequence and run the analysis on that. Getting a network isn't such a bad thing...



      • #4
        If you concat the SNPs, and therefore ignore all invariant sites, you'll probably get approximately the correct tree topology, but branch lengths that are much too long. Some programs may break under these conditions, I'm not entirely sure. I'd be curious to hear what the results look like if you do it...
        B
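The point about overly long branch lengths can be illustrated with a little arithmetic. This is only a sketch: the counts below (50 differences, 100 SNP sites, a 100,000 bp region) are invented for illustration and are not from anyone's data in this thread.

```python
def per_site_distance(n_differences, n_sites):
    """Proportion of differing sites (uncorrected p-distance)."""
    return n_differences / n_sites

# Hypothetical numbers, chosen only to show the effect.
n_diff = 50            # sites where two samples differ
n_snp_sites = 100      # variable sites kept after dropping invariant ones
region_length = 100_000  # total sites, including invariant ones

snp_only = per_site_distance(n_diff, n_snp_sites)
full_length = per_site_distance(n_diff, region_length)

print(f"SNP-only distance:    {snp_only}")
print(f"Full-length distance: {full_length}")
print(f"Inflation factor:     {snp_only / full_length:.0f}x")
```

The same number of differences is divided by far fewer sites when only SNPs are kept, so per-site distances (and hence branch lengths) are inflated by roughly the ratio of total sites to variable sites.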



        • #5
          Originally posted by brofallon
          If you concat the SNPs, and therefore ignore all invariant sites, you'll probably get approximately the correct tree topology, but branch lengths that are much too long. Some programs may break under these conditions, I'm not entirely sure. I'd be curious to hear what the results look like if you do it...
          B
          Does anybody know how to concatenate the SNPs to make the fasta sequence? I have the same problem now.
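One way to concatenate SNP calls into fasta sequences is sketched below. It assumes the calls are already available as one dictionary per SNP position mapping sample name to base; the function name `snps_to_fasta`, the sample names and the use of `N` for missing calls are all illustrative choices, not something taken from the thread.

```python
def snps_to_fasta(sites, samples, missing="N"):
    """Concatenate per-site SNP calls into one sequence per sample.

    sites   -- list of dicts, one per SNP position, mapping sample -> base
    samples -- sample names, in the order they should appear in the output
    missing -- character written when a sample has no call at a site
    """
    seqs = {name: [] for name in samples}
    for site in sites:
        for name in samples:
            seqs[name].append(site.get(name, missing))

    lines = []
    for name in samples:
        lines.append(">" + name)
        lines.append("".join(seqs[name]))
    return "\n".join(lines) + "\n"


# Made-up example: three SNP positions, sp2 has no call at the third one.
example_sites = [
    {"sp1": "A", "sp2": "G", "sp3": "A"},
    {"sp1": "C", "sp2": "C", "sp3": "T"},
    {"sp1": "G", "sp3": "T"},
]
fasta = snps_to_fasta(example_sites, ["sp1", "sp2", "sp3"])
print(fasta)
```

Every sample ends up with a sequence of the same length, which is what alignment-based tree programs expect. The caveats from the earlier replies still apply, since invariant sites are left out.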



          • #6
            Hi everybody,
            I really need an answer too!

            I have reads from 4 diploid DNA strains and one well-annotated reference genome.
            I did SNP calling with CLC and made a Venn diagram to represent the similarities and differences between my 4 strains.

            And now I'm stuck!

            I would like to build a phylogenetic tree from the SNP data, not from the number of SNPs but from the nucleotide information at the SNPs (indels, mutations, mutation rate).

            There should be software that encodes each SNP position as something like a diploid code (AA, A- or --) and builds a tree from this information.

            Can you help me, please?

            Thank you

            Marie-Mathilde



            • #7
              Keep in mind that it's unlikely that a single phylogenetic tree underlies the data. Recombination is likely to make the trees differ from SNP to SNP, so taking a bunch of SNPs and forcing them into a non-recombining tree may not be that helpful.
              You can try ACG (arup.utah.edu/acg) - it can make recombining trees from SNPs from a VCF (or multiple VCFs) and a reference.



              • #8
                I assume that these programs really only need the
                numbers of pairwise differences between the
                samples, so you should be able to input this
                difference matrix directly.
                (This works better for few samples with long DNA and many differences.)

                Making a fasta from the VCF is also straightforward;
                I just wrote a program for that (SNPs only), handling the chromosomes
                separately. You could also merge the chromosomes,
                but that gives long fastas and you'd be back to the difference-matrix
                option.
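The difference matrix described above could be computed along these lines. This is a minimal sketch, not the program mentioned in the post: the function name `difference_matrix`, the toy sequences and the choice to skip sites with an `N` are assumptions made for illustration.

```python
from itertools import combinations


def difference_matrix(seqs, missing="N"):
    """Count pairwise differences between equal-length sequences.

    seqs -- dict mapping sample name -> sequence string
    Sites where either sample has the `missing` character are skipped.
    Returns a nested dict: matrix[a][b] = number of differing sites.
    """
    names = list(seqs)
    matrix = {a: {b: 0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        diffs = sum(
            1
            for x, y in zip(seqs[a], seqs[b])
            if x != y and missing not in (x, y)
        )
        matrix[a][b] = matrix[b][a] = diffs
    return matrix


# Toy sequences, one character per SNP position.
toy = {"s1": "ACGT", "s2": "ACGA", "s3": "TCGA"}
dist = difference_matrix(toy)
for name in toy:
    print(name, [dist[name][other] for other in toy])
```

Distance-based tree programs generally want such a matrix in their own input format, so the nested dict would still need to be written out accordingly.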

                ----------edit-------------------------------

                just use mtDNA and the non-recombining region of the Y chromosome
                for separate maternal and paternal phylogenetic trees (primates?)

                ---------edit------------------------------------

                hmm, there should be a program that filters the recombined chunks
                and computes the distance in the closely-related areas only

                ---------edit--------------------------------

                take one of the 2 phases/alleles/haplotypes/zygotes at random
                (e.g. hapmap has them sorted alphabetically so taking the
                first one can give bias)

                -------------------------------------
                Last edited by gsgs; 12-19-2012, 05:45 PM.



                • #9
                  I have an Excel file of SNPs (both mutational and recombinant). How can I extract only the mutational SNPs into a new fasta file?



                  • #10
                    Save the Excel file as a text file and post a few lines as an example.



                    • #11
                      Still need help

                      Hello everybody,
                      I really need to build a phylogenetic tree from my SNPs.
                      Because I am not a bioinformatician, I used CLC Genomics to "map and call" my SNPs.
                      Now I have a file that looks like this:

                      Chromosome    Region  Reference  Allele  Strain
                      contig_1      145     A          G       d
                      contig_1      487     G          A       a, d, f
                      contig_1      682     C          G       b, d
                      contig_333    1156    T          G       a
                      contig_1234   566     C          T       b
                      contig_1234   612     C          G       b, d

                      So I have 4 strains (a, b, d and f) and 1 reference genome with a lot of contigs.
                      Can somebody help me?

                      Thank you very much
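A table like the one in this post could be turned into concatenated SNP sequences roughly as follows. This is a sketch under two loud assumptions: a strain that is not listed for a position carries the reference base there (which ignores coverage gaps), and heterozygous calls are ignored. The function name `table_to_sequences` is hypothetical.

```python
TABLE = """\
Chromosome Region Reference Allele Strain
contig_1 145 A G d
contig_1 487 G A a, d, f
contig_1 682 C G b, d
contig_333 1156 T G a
contig_1234 566 C T b
contig_1234 612 C G b, d
"""


def table_to_sequences(table_text):
    """Build one concatenated SNP sequence per strain, plus the reference.

    Assumption: strains not listed on a row carry the reference base.
    """
    lines = [ln for ln in table_text.strip().splitlines() if ln.strip()]
    rows, strains = [], set()
    for line in lines[1:]:  # skip the header row
        # Split into 5 fields; the last one keeps "a, d, f" together.
        chrom, pos, ref, alt, strain_field = line.split(None, 4)
        carriers = {s.strip() for s in strain_field.split(",")}
        strains |= carriers
        rows.append((chrom, int(pos), ref, alt, carriers))

    seqs = {"reference": ""}
    seqs.update({s: "" for s in sorted(strains)})
    for _chrom, _pos, ref, alt, carriers in rows:
        seqs["reference"] += ref
        for s in sorted(strains):
            seqs[s] += alt if s in carriers else ref
    return seqs


sequences = table_to_sequences(TABLE)
for name, seq in sequences.items():
    print(">" + name)
    print(seq)
```

The printed output is a small fasta alignment with one column per SNP, which can be given to the tree programs named earlier in the thread, keeping in mind the caveats about missing invariant sites and recombination.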
