  • sphil
    Senior Member
    • Apr 2010
    • 192

    Phylogeny on SNP

    Hello,

    I'm looking for a tool that infers phylogenies of different species from SNP calls. Any suggestions? I have a table of positions and SNP calls and want to infer the phylogeny of the species from those SNPs. At the moment I don't really know how to tackle the problem, other than binarizing the calls and clustering them with different algorithms.


    Thanks,


    Phil
  • brofallon
    Member
    • May 2011
    • 26

    #2
    Most phylogeny estimation tools (phylip, phyml, paup*, MrBayes, *BEAST etc) require a sequence alignment as input, in fasta, phylip or nexus format depending on the tool. SNPs alone are tricky for those tools since there's a lot of ignored data (everything in between the SNPs), which makes estimating branch lengths difficult.
    Also keep in mind that there might not actually *be* a simple tree underlying your data - recombination and incomplete lineage sorting will make the ancestry of the sequences a potentially complex network, not a simple tree.
    With those caveats, I think making a fasta-formatted input file is your best bet.
    good luck!

    Comment

    • sphil
      Senior Member
      • Apr 2010
      • 192

      #3
      Originally posted by brofallon View Post
      .
      With those caveats, I think making a fasta-formatted input file is your best bet.
      good luck!
      So one possibility is to just concatenate the SNP calls into one sequence and run the analysis on that. Getting a network isn't such a bad thing...

      Comment

      • brofallon
        Member
        • May 2011
        • 26

        #4
        If you concat the SNPs, and therefore ignore all invariant sites, you'll probably get approximately the correct tree topology, but branch lengths that are much too long. Some programs may break under these conditions, I'm not entirely sure. I'd be curious to hear what the results look like if you do it...
        B
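To make the branch-length point concrete, here is a minimal Python sketch. The two SNP strings and the alignment length of 1000 sites are made-up illustration values, and `p_distance` is a hypothetical helper, not a function from any phylogenetics package. It only shows why a distance measured per SNP site is much larger than the same distance measured per genome site.

```python
def p_distance(seq_a, seq_b):
    """Proportion of sites that differ between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    diffs = sum(1 for x, y in zip(seq_a, seq_b) if x != y)
    return diffs / len(seq_a)

# Hypothetical data: 5 variable sites taken from an alignment of 1000 sites.
snps_a = "AGCTA"
snps_b = "AGGTC"
total_sites = 1000  # assumed full alignment length, invariant sites included

# Distance per SNP site: every site in the input is variable, so this is large.
snp_only = p_distance(snps_a, snps_b)

# Distance per genome site: the same differences spread over all sites.
rescaled = snp_only * len(snps_a) / total_sites

print(snp_only, rescaled)
```

The simple rescaling is only meaningful if the SNP set contains every variable site in the region and the number of invariant sites is known. It is a crude correction, not a substitute for running the tree program on the full alignment.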

        Comment

        • campy
          Junior Member
          • May 2011
          • 1

          #5
          Originally posted by brofallon View Post
          If you concat the SNPs, and therefore ignore all invariant sites, you'll probably get approximately the correct tree topology, but branch lengths that are much too long. Some programs may break under these conditions, I'm not entirely sure. I'd be curious to hear what the results look like if you do it...
          B
          Does anybody know how to concatenate the SNPs to make the fasta sequence? I have the same problem now.

          Comment

          • mm.perrineau
            Member
            • Sep 2012
            • 12

            #6
            Hi everybody,
            I really need an answer too!

            I have reads from 4 diploid strains and one well-annotated reference genome.
            I did the SNP calling with CLC and made a Venn diagram to show the similarities and differences between my 4 strains.

            And now I'm stuck!

            I would like to build a phylogenetic tree from the SNP data, not from the number of SNPs but from the nucleotide information at the SNPs (indel, mutation, mutation rate).

            There should be software that encodes each SNP position with something like a diploid code (AA, A- or --) and builds a tree from that information.

            Can you help me, please?

            Thank you

            Marie-Mathilde

            Comment

            • brofallon
              Member
              • May 2011
              • 26

              #7
              Keep in mind that it's unlikely that there is a single phylogenetic tree underlying the data. Recombination is likely to make the trees differ from SNP to SNP, so taking a bunch of SNPs and forcing them into a non-recombining tree may not be that helpful.
              You can try ACG (arup.utah.edu/acg) - it can make recombining trees from SNPs from a VCF (or multiple VCFs) and a reference.

              Comment

              • gsgs
                Senior Member
                • Oct 2009
                • 139

                #8
                I assume these programs really only need the number of
                pairwise differences between the samples, so you should
                be able to input this difference matrix directly
                (better for few samples with long DNA and many differences).
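A difference matrix of this kind can be sketched in a few lines of Python. The sample names and SNP strings below are invented for illustration, and `difference_matrix` is a hypothetical helper. Whether a given tree program accepts raw difference counts, and in which file format, has to be checked in that program's documentation.

```python
from itertools import combinations

def difference_matrix(seqs):
    """Count pairwise differences between equal-length SNP strings.

    seqs maps a sample name to its string of SNP alleles.
    Returns a nested dict: matrix[a][b] = number of differing positions.
    """
    names = sorted(seqs)
    matrix = {n: {m: 0 for m in names} for n in names}
    for a, b in combinations(names, 2):
        d = sum(1 for x, y in zip(seqs[a], seqs[b]) if x != y)
        matrix[a][b] = matrix[b][a] = d  # the matrix is symmetric
    return matrix

# Hypothetical samples, one character per SNP position.
samples = {"s1": "AGCT", "s2": "AGGT", "s3": "TGGA"}
matrix = difference_matrix(samples)

for name in sorted(matrix):
    print(name, [matrix[name][other] for other in sorted(matrix)])
```

Distance-based methods such as neighbor-joining start from a matrix like this one, usually after converting the counts to per-site distances.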

                Making a FASTA from the VCF is also straightforward.
                I just wrote a program for that (SNPs only), handling the chromosomes
                separately. You could also merge the chromosomes,
                but that gives long FASTAs and you'd be back to the difference-matrix
                option.
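The program mentioned above is not shown in the thread. The following is an independent minimal sketch of the same idea in Python, assuming a standard tab-separated VCF with a `GT` entry in the FORMAT column. The function name `vcf_to_snp_strings` and the demo lines are made up. For simplicity it keeps only the first allele of each genotype, skips anything that is not a single-base substitution, and writes `N` for missing calls.

```python
from collections import defaultdict

def vcf_to_snp_strings(vcf_lines):
    """Return {chrom: {sample: concatenated SNP alleles}} from VCF text lines."""
    samples = []
    out = defaultdict(lambda: defaultdict(list))
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        chrom, ref, alts = fields[0], fields[3], fields[4].split(",")
        alleles = [ref] + alts
        if any(len(a) != 1 or a not in "ACGT" for a in alleles):
            continue  # keep single-base substitutions only
        gt_index = fields[8].split(":").index("GT")
        for sample, call in zip(samples, fields[9:]):
            # First allele of the genotype only (a simplification).
            gt = call.split(":")[gt_index].replace("|", "/").split("/")[0]
            out[chrom][sample].append(alleles[int(gt)] if gt.isdigit() else "N")
    return {c: {s: "".join(b) for s, b in per.items()} for c, per in out.items()}

# Hypothetical three-record VCF; the chr2 record is an insertion and is skipped.
demo = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2",
    "chr1\t145\t.\tA\tG\t.\tPASS\t.\tGT\t0/0\t1/1",
    "chr1\t487\t.\tG\tA\t.\tPASS\t.\tGT\t0/1\t./.",
    "chr2\t30\t.\tC\tCT\t.\tPASS\t.\tGT\t0/0\t1/1",
]
per_chrom = vcf_to_snp_strings(demo)

for chrom, seqs in per_chrom.items():
    for sample, seq in seqs.items():
        print(f">{sample}_{chrom}\n{seq}")
```

Keeping one dictionary per chromosome mirrors the "chromosomes separately" approach. Merging them would just mean joining each sample's strings in a fixed chromosome order.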

                ----------edit-------------------------------

                Just use mtDNA and the non-recombining region of the Y chromosome
                for separate maternal and paternal trees (primates?).

                ---------edit------------------------------------

                Hmm, there should be a program that filters out the recombined chunks
                and computes the distance in the closely related regions only.

                ---------edit--------------------------------

                Take one of the 2 phases/alleles/haplotypes at random
                (e.g. HapMap has them sorted alphabetically, so always taking the
                first one can introduce bias).
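Picking one allele at random per site could look like the Python sketch below. The genotype strings are invented, `pick_random_allele` is a hypothetical helper, and the `A/G` style notation is an assumption about how the diploid calls are stored.

```python
import random

def pick_random_allele(genotype, rng):
    """Return one allele, chosen at random, from a call like 'A/G' or 'A|G'."""
    alleles = genotype.replace("|", "/").split("/")
    return rng.choice(alleles)

# A seeded generator makes the random choice reproducible between runs.
rng = random.Random(42)

# Hypothetical diploid calls at three SNP positions for one sample.
genotypes = ["A/G", "C/C", "T/G"]
haploid = "".join(pick_random_allele(g, rng) for g in genotypes)
print(haploid)
```

Because the choice is made independently at every site, the resulting string is a mosaic of both haplotypes rather than a real phased haplotype.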

                -------------------------------------
                Last edited by gsgs; 12-19-2012, 05:45 PM.

                Comment

                • mmmm
                  Senior Member
                  • Jul 2013
                  • 131

                  #9
                  I have an Excel file of SNPs (mutational and recombinant). How can I extract only the mutational SNPs into a new FASTA file?

                  Comment

                  • gsgs
                    Senior Member
                    • Oct 2009
                    • 139

                    #10
                    Save the Excel file as a text file and post a few lines as an example.

                    Comment

                    • mm.perrineau
                      Member
                      • Sep 2012
                      • 12

                      #11
                      Still need help

                      Hello everybody,
                      I really need to build a phylogenetic tree from my SNPs.
                      Because I am not a bioinformatician, I used CLC Genomics to "map and call" my SNPs.
                      Now I have a file which looks like this:

                      Chromosome   Region  Reference  Allele  Strain
                      contig_1     145     A          G       d
                      contig_1     487     G          A       a, d, f
                      contig_1     682     C          G       b, d
                      contig_333   1156    T          G       a
                      contig_1234  566     C          T       b
                      contig_1234  612     C          G       b, d

                      So I have 4 strains (a, b, d and f) and 1 reference genome with a lot of contigs.
                      Can somebody help me?

                      Thank you very much
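A table in this layout can be turned into a concatenated-SNP FASTA with a short Python script. This is only a sketch under strong assumptions: a strain that is not listed for a position is taken to carry the reference allele (which ignores positions with no coverage), heterozygous diploid calls are ignored, and every row is a single-base substitution. The function name `table_to_fasta` is made up.

```python
table = """\
Chromosome Region Reference Allele Strain
contig_1 145 A G d
contig_1 487 G A a, d, f
contig_1 682 C G b, d
contig_333 1156 T G a
contig_1234 566 C T b
contig_1234 612 C G b, d
"""

def table_to_fasta(text, strains):
    """Build one concatenated SNP sequence per strain, plus the reference."""
    seqs = {name: [] for name in ["reference"] + list(strains)}
    for line in text.strip().splitlines()[1:]:  # skip the header row
        # The last column may contain spaces ("a, d, f"), so split only 4 times.
        contig, pos, ref, alt, carriers = line.split(None, 4)
        carriers = {c.strip() for c in carriers.split(",")}
        seqs["reference"].append(ref)
        for s in strains:
            # Assumption: strains not listed carry the reference allele.
            seqs[s].append(alt if s in carriers else ref)
    return "".join(f">{name}\n{''.join(bases)}\n" for name, bases in seqs.items())

fasta = table_to_fasta(table, ["a", "b", "d", "f"])
print(fasta)
```

For the six rows above this gives `AGCTCC` for the reference and `GAGTCG` for strain d. The output contains variable sites only, so the earlier caveats in this thread about inflated branch lengths and recombination still apply.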
