
  • phastCons download from UCSC

    Hi, there,
    I want to analyze a list of genomic regions in mouse and assess their conservation using phastCons scores. I'm a bit confused by the different tables available in the UCSC Table Browser. For mm9, there are

    1. Vertebrate Cons (phastCons30way) and

    2. Vertebrate El (phastConsElements30way)

    What's the difference between the two? Their table schema is also different.

    The Vertebrate Cons table has 14 fields and I'm not sure which field holds the phastCons score...

    The Vertebrate El table has only 6 fields, with the last field being "score". I guess this is the phastCons score?

    Also, can I add or average scores from different regions?

    Thanks so much!

  • #2
    Output "data points" from the Table Browser to get conservation score data from the phastCons30way table.

    However, I think that you are better off downloading the mm9 phastCons scores directly from UCSC's website. Compression of data presented by the UCSC browser can introduce errors, particularly if you are aggregating scores over multiple regions. Data obtained directly from file downloads are not compressed or modified.

    Note also that phastCons and phyloP conservation scores are not generated for alignment gaps and unaligned nucleotides. You may want to filter those regions before aggregation.

    Another good place to ask these sorts of questions is the UCSC Genome mailing list.
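
On the question of averaging scores over regions, here is a minimal sketch in Python. It assumes the downloaded per-chromosome files are plain text in fixedStep wiggle format with one score per base (span of 1), and the names `parse_fixed_step` and `mean_score` are made up for this example. Positions that fall in alignment gaps simply have no score, so they are left out of the average rather than counted as zero.

```python
from typing import Dict, Iterable, Optional, Tuple

Scores = Dict[Tuple[str, int], float]


def parse_fixed_step(lines: Iterable[str]) -> Scores:
    """Map (chrom, 1-based position) -> score from fixedStep wiggle text.

    Assumes one value per line and span=1 (hypothetical simplification).
    """
    scores: Scores = {}
    chrom: Optional[str] = None
    pos, step = 0, 1
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(("track", "#")):
            continue
        if line.startswith("fixedStep"):
            # Header such as: fixedStep chrom=chr1 start=101 step=1
            fields = dict(tok.split("=", 1) for tok in line.split()[1:])
            chrom = fields["chrom"]
            pos = int(fields["start"])
            step = int(fields.get("step", "1"))
            continue
        if chrom is None:
            raise ValueError("score line found before any fixedStep header")
        scores[(chrom, pos)] = float(line)
        pos += step
    return scores


def mean_score(scores: Scores, chrom: str, start: int, end: int) -> Optional[float]:
    """Mean over the 1-based inclusive range, ignoring unscored positions."""
    values = [scores[(chrom, p)] for p in range(start, end + 1) if (chrom, p) in scores]
    return sum(values) / len(values) if values else None


# Toy data: two scored blocks separated by an unscored gap (positions 103-109).
demo = [
    "fixedStep chrom=chr1 start=101 step=1",
    "0.25",
    "0.75",
    "fixedStep chrom=chr1 start=110 step=1",
    "0.5",
]
demo_scores = parse_fixed_step(demo)
print(mean_score(demo_scores, "chr1", 101, 110))
print(mean_score(demo_scores, "chr1", 103, 109))
```

A dictionary keyed by position is fine for a handful of regions but will use a lot of memory for a whole chromosome, so for real data you would probably stream the file or use an indexed format instead.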



    • #3
      I downloaded the scores directly from the site you mentioned. It's something I was looking for. Thanks a lot for that!

      I was looking at the data points from the Table Browser, and the only relevant column I saw was the second-to-last one, sumData, which is described as "sum of the data points, for average and stddev calc". I think it's the sum of the individual nucleotide scores in that region. Do you know how they decide on the range over which sumData is calculated? I think the Table Browser stores the data in wiggle format, so my question is probably the same as asking how they decide where to split the genome into blocks when making the wiggle file.

      Thanks so much!!!
      Last edited by gene_x; 03-07-2013, 12:45 PM.
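
For what it's worth, here is a rough sketch of how a per-row sum can be turned into an average and standard deviation, and how rows can be pooled. It assumes each row also carries a count of scored bases and a sum of squared scores; the names `valid_count` and `sum_squares` are my assumptions about the table schema (only sumData is confirmed above), so check the schema page before relying on them. The population form of the standard deviation is also an assumption.

```python
import math
from typing import Iterable, Optional, Tuple

# (valid_count, sum_data, sum_squares) for one table row; names are assumed.
Block = Tuple[int, float, float]


def summarize_block(block: Block) -> Optional[Tuple[float, float]]:
    """Return (mean, population stddev) for one row, or None if it has no data."""
    valid_count, sum_data, sum_squares = block
    if valid_count <= 0:
        return None
    mean = sum_data / valid_count
    # Var = E[x^2] - (E[x])^2; clamp at zero to absorb rounding error.
    variance = max(sum_squares / valid_count - mean * mean, 0.0)
    return mean, math.sqrt(variance)


def combine_blocks(blocks: Iterable[Block]) -> Block:
    """Pool several rows by adding their counts, sums, and sums of squares."""
    total_n, total_sum, total_sq = 0, 0.0, 0.0
    for n, s, sq in blocks:
        total_n += n
        total_sum += s
        total_sq += sq
    return total_n, total_sum, total_sq


# Toy rows: each holds the two scores 0.0 and 1.0.
row = (2, 1.0, 1.0)
print(summarize_block(row))
print(summarize_block(combine_blocks([row, row])))
```

Pooling the sums this way weights every scored base equally, which is not the same as averaging the per-region means when the regions differ in length.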



      • #4
        Also, what's the difference between Vertebrate Cons (phastCons30way) and Vertebrate El (phastConsElements30way)?



        • #5
          I'd recommend that you put these questions up on the UCSC Genome mailing list, which is run by UCSC staff who are more familiar with how the data are prepared before posting on the genome browser.



          • #6
            OK, I'll do that.



            • #7
              I figured it out: the Vertebrate El track contains the elements predicted to be conserved, which are basically regions with continuously high phastCons scores.
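
To illustrate the intuition only, here is a small Python sketch that collects runs of consecutive bases whose score stays at or above a cutoff. This is not how the elements track is actually produced (the elements are predicted by the phastCons model itself, not by thresholding), and the function name `high_score_runs`, the cutoff, and the minimum length are all made up for this example.

```python
from typing import List, Tuple


def high_score_runs(
    scores: List[float], start: int, threshold: float, min_len: int
) -> List[Tuple[int, int]]:
    """Return inclusive (first, last) coordinates of runs with score >= threshold.

    `scores[i]` is taken to be the score at coordinate `start + i`.
    Runs shorter than `min_len` bases are discarded.
    """
    runs: List[Tuple[int, int]] = []
    run_start = None  # index where the current run began, if any
    for i, score in enumerate(scores):
        if score >= threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_len:
                runs.append((start + run_start, start + i - 1))
            run_start = None
    # Close a run that extends to the end of the list.
    if run_start is not None and len(scores) - run_start >= min_len:
        runs.append((start + run_start, start + len(scores) - 1))
    return runs


toy = [0.1, 0.9, 0.95, 0.2, 0.99, 0.98, 0.97]
print(high_score_runs(toy, start=100, threshold=0.9, min_len=2))
```

With the toy scores above, the single low value at coordinate 103 splits the high-scoring stretch into two separate runs.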
