
  • Phylogeny on SNP

    Hello,

    I'm searching for a tool that infers phylogenies of different species from SNP calls. Any suggestions? I have a table containing different positions and SNP calls, and I want to infer the phylogeny of the species from those SNPs. At the moment, I really don't know how to tackle the problem other than binarizing the calls and clustering them with different algorithms.


    Thanks,


    Phil

  • #2
    Most phylogeny estimation tools (PHYLIP, PhyML, PAUP*, MrBayes, *BEAST etc.) expect aligned sequences as input, typically in FASTA, PHYLIP, or NEXUS format depending on the tool. SNPs alone are tricky for those tools since there's a lot of ignored data (everything in between the SNPs), which makes estimating branch lengths difficult.
    Also keep in mind that there might not actually *be* a simple tree underlying your data - recombination and incomplete lineage sorting will make the ancestry of the sequences a potentially complex network, not a simple tree.
    With those caveats, I think making a fasta-formatted input file is your best bet.
    good luck!



    • #3
      Originally posted by brofallon
      With those caveats, I think making a fasta-formatted input file is your best bet.
      good luck!
      So one possibility is to just concatenate the SNP calls into a single sequence and run the analysis on that. Ending up with a network isn't such a bad thing...



      • #4
        If you concat the SNPs, and therefore ignore all invariant sites, you'll probably get approximately the correct tree topology, but branch lengths that are much too long. Some programs may break under these conditions, I'm not entirely sure. I'd be curious to hear what the results look like if you do it...
        B



        • #5
          Originally posted by brofallon
          If you concat the SNPs, and therefore ignore all invariant sites, you'll probably get approximately the correct tree topology, but branch lengths that are much too long. Some programs may break under these conditions, I'm not entirely sure. I'd be curious to hear what the results look like if you do it...
          B
          Does anybody know how to concatenate the SNPs to make the fasta sequence? I have the same problem now.



          • #6
            Hi everybody,
            Me too, I really need an answer!

            I have reads from 4 diploid DNA strains and one well-annotated reference genome.
            I did SNP calling with CLC and made a Venn diagram to represent the similarities and differences between my 4 strains.

            And now I am stuck!

            I would like to build a phylogenetic tree from the SNP data, using not just the number of SNPs but the nucleotide information at each SNP (indel, mutation, mutation rate).

            There should be software that codes each SNP position as something like a diploid code (AA, A- or --) and creates a tree from this information!

            Can you help me, please?

            Thank you

            Marie-Mathilde



            • #7
              Keep in mind that it's unlikely that there is a single phylogenetic tree underlying the data. Recombination is likely to make the trees differ from SNP to SNP, so taking a bunch of SNPs and forcing them into a non-recombining tree may not be that helpful.
              You can try ACG (arup.utah.edu/acg) - it can make recombining trees from SNPs in a VCF (or multiple VCFs) and a reference.



              • #8
                I assume that these programs really only need the
                numbers of mutual differences between the
                samples. So you should be able to input this
                differences-matrix directly.
                (better for few samples with long DNA, many differences)
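A minimal sketch of the differences-matrix idea above, assuming each sample has already been reduced to a string with one base per SNP position. The function name `pairwise_differences` and the toy strings are made up for illustration; whether a given tree program accepts such a matrix directly depends on the program.

```python
from itertools import combinations


def pairwise_differences(seqs):
    """Count mismatching positions between every pair of equal-length SNP strings."""
    names = sorted(seqs)
    if len({len(s) for s in seqs.values()}) > 1:
        raise ValueError("all SNP strings must have the same length")
    # Start from an all-zero square matrix stored as nested dicts.
    matrix = {a: {b: 0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        d = sum(1 for x, y in zip(seqs[a], seqs[b]) if x != y)
        matrix[a][b] = matrix[b][a] = d
    return matrix


# Hypothetical toy data: three samples, four SNP positions.
snps = {"a": "AAGT", "b": "AGGT", "d": "GGCT"}
dist = pairwise_differences(snps)
for name in sorted(dist):
    print(name, [dist[name][other] for other in sorted(dist)])
```

These are raw counts over variable sites only, so they say nothing about the invariant sites in between.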

                Making a fasta from the vcf is also straightforward,
                I just wrote a program for that (SNPs only), handling the chromosomes
                separately. You could also merge the chromosomes ...
                but that gives long fastas and you'd be back to the differences-matrix
                option
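The program mentioned above is not shown in the thread, so the following is only a rough sketch of what a VCF-to-FASTA step for SNPs could look like. It assumes a tab-separated VCF with a `#CHROM` header line and `GT` as the first FORMAT field, skips indels, and simply takes the first allele of each genotype. The function name `vcf_to_snp_sequences` and the toy records are hypothetical.

```python
def vcf_to_snp_sequences(vcf_lines):
    """Build one SNP string per sample from VCF text lines (single-base alleles only)."""
    samples, seqs = [], {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            seqs = {s: [] for s in samples}
            continue
        alleles = [fields[3]] + fields[4].split(",")  # REF, then ALT alleles
        if any(len(a) != 1 or a not in "ACGT" for a in alleles):
            continue  # skip indels, symbolic alleles and sites without an ALT
        for sample, call in zip(samples, fields[9:]):
            # Assumption: GT is the first colon-separated field of the call.
            gt = call.split(":")[0].replace("|", "/").split("/")
            first = gt[0]  # simplification: always the first allele
            seqs[sample].append("N" if first == "." else alleles[int(first)])
    return {s: "".join(bases) for s, bases in seqs.items()}


# Hypothetical toy VCF with two samples; the second record is an indel.
vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2",
    "chr1\t145\t.\tA\tG\t.\tPASS\t.\tGT\t0/0\t1/1",
    "chr1\t200\t.\tAT\tA\t.\tPASS\t.\tGT\t0/1\t1/1",
    "chr1\t487\t.\tG\tA,T\t.\tPASS\t.\tGT:DP\t2/2:10\t./.:0",
]
sequences = vcf_to_snp_sequences(vcf)
fasta = "".join(">" + name + "\n" + seq + "\n" for name, seq in sequences.items())
print(fasta)
```

Missing genotypes become `N`, and all chromosomes end up in one concatenated string per sample; handling chromosomes separately would mean keying the output by the first column as well.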

                ----------edit-------------------------------

                just use mtDNA and the non-recombining region of the Y chromosome for
                separate maternal and paternal phylo-trees (primates?)

                ---------edit------------------------------------

                hmm, there should be a program that filters the recombined chunks
                and computes the distance in the closely-related areas only

                ---------edit--------------------------------

                take one of the 2 phases/alleles/haplotypes/zygotes at random
                (e.g. hapmap has them sorted alphabetically so taking the
                first one can give bias)
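A small sketch of the random-allele suggestion, assuming the diploid calls for one sample are available as a list of two-allele pairs. The name `pick_random_alleles` and the `seed` parameter are illustrative only; the seed just makes the draw repeatable.

```python
import random


def pick_random_alleles(genotypes, seed=None):
    """Pick one allele per site at random from a list of (allele1, allele2) pairs."""
    rng = random.Random(seed)  # private generator so the global state is untouched
    return "".join(rng.choice(pair) for pair in genotypes)


# Hypothetical diploid calls for one sample at three SNP positions.
genotypes = [("A", "G"), ("C", "C"), ("A", "T")]
haploid = pick_random_alleles(genotypes, seed=1)
print(haploid)
```

Homozygous sites are unaffected; only heterozygous sites vary between draws, so repeating the draw with different seeds gives a feel for how much the choice matters.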

                -------------------------------------
                Last edited by gsgs; 12-19-2012, 05:45 PM.



                • #9
                  I have an Excel file including SNPs (mutational and recombinant). How can I extract only the mutational SNPs into a new FASTA file?



                  • #10
                    Save the Excel file as a text file and post some lines as an example.



                    • #11
                      Still need help

                      Hello everybody,
                      I really need to build a phylogenetic tree from my SNPs.
                      Because I am not a bioinformatician, I used CLC Genomics to "map and call" my SNPs.
                      Now I have a file that looks like this:

                      Chromosome   Region  Reference  Allele  Strain
                      contig_1     145     A          G       d
                      contig_1     487     G          A       a, d, f
                      contig_1     682     C          G       b, d
                      contig_333   1156    T          G       a
                      contig_1234  566     C          T       b
                      contig_1234  612     C          G       b, d

                      So I have 4 strains (a, b, d and f) and 1 reference genome with a lot of contigs.
                      Can somebody help me?

                      Thank you very much
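One way to turn a table like the one above into the concatenated-SNP FASTA discussed earlier in the thread is sketched below. It rests on two loud assumptions: a strain not listed for a position carries the reference base (it might really just lack coverage there), and heterozygous calls are ignored. The function name `table_to_fasta` is made up for illustration.

```python
def table_to_fasta(rows, strains):
    """Concatenate SNP calls into one FASTA record per strain plus the reference."""
    seqs = {name: [] for name in ["reference"] + list(strains)}
    for row in rows:
        # Columns: contig, position, reference base, variant base, carrier list.
        contig, pos, ref, alt, carriers = row.split(None, 4)
        if len(ref) != 1 or len(alt) != 1:
            continue  # keep single-base substitutions only
        carrying = {s.strip() for s in carriers.split(",")}
        seqs["reference"].append(ref)
        for strain in strains:
            # Assumption: unlisted strains have the reference base here.
            seqs[strain].append(alt if strain in carrying else ref)
    return "".join(">" + name + "\n" + "".join(bases) + "\n"
                   for name, bases in seqs.items())


# Rows copied from the table in the post, without the header line.
rows = [
    "contig_1 145 A G d",
    "contig_1 487 G A a, d, f",
    "contig_1 682 C G b, d",
    "contig_333 1156 T G a",
    "contig_1234 566 C T b",
    "contig_1234 612 C G b, d",
]
fasta = table_to_fasta(rows, ["a", "b", "d", "f"])
print(fasta)
```

As noted earlier in the thread, a tree built from variable sites alone may recover roughly the right topology but with branch lengths that are much too long.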

