  • SV reads in whole genome

    I'm looking at reads from genomes that definitely contain structural variation (based on Breakdancer and SVDetect), and I'm running into something that I suspect is an error.

    I'm looking at the distribution of distances between read pairs that are not properly paired, in regions across the genome. I'm consistently finding that the distribution of mean distances for pairs where either read maps within the centromere differs from the distribution of mean distances in the arms. There are also more of these reads within the centromere, even though the read depth there is lower than in other regions.

    I suspect that many more of these reads could be errors, since the centromere is poorly mapped to begin with. If so, I'm not sure how to filter them out. Is there a standard analysis step I may be missing in assessing these reads? As far as I can tell, each of the SV detection tools assesses the reads differently.
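For readers who want to reproduce the comparison described above, here is a minimal sketch, not the poster's actual pipeline. It assumes plain-text SAM records as input and a hypothetical `CENTROMERES` table with made-up coordinates; real intervals would come from an annotation for the reference build in use. It collects the absolute template length (`TLEN`) of improperly paired, same-chromosome pairs and groups them by whether either mate starts inside a centromere.

```python
# Hypothetical centromere intervals (1-based, inclusive). These numbers are
# placeholders for illustration, not real genome coordinates.
CENTROMERES = {"chr1": (1000, 2000)}


def in_centromere(chrom, pos):
    """Return True if pos falls inside the (assumed) centromere of chrom."""
    span = CENTROMERES.get(chrom)
    return span is not None and span[0] <= pos <= span[1]


def discordant_distances(sam_lines):
    """Group |TLEN| of improperly paired, same-chromosome pairs by region."""
    groups = {"centromere": [], "arm": []}
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        f = line.rstrip("\n").split("\t")
        flag, chrom, pos = int(f[1]), f[2], int(f[3])
        rnext, pnext, tlen = f[6], int(f[7]), int(f[8])
        if not flag & 0x1 or flag & 0x2:
            continue  # unpaired, or flagged as properly paired
        if flag & (0x4 | 0x8 | 0x100 | 0x800):
            continue  # unmapped read/mate, secondary or supplementary
        if rnext not in ("=", chrom):
            continue  # mates on different chromosomes: no distance
        if not flag & 0x40:
            continue  # count each pair once, via its first mate
        either_in = in_centromere(chrom, pos) or in_centromere(chrom, pnext)
        groups["centromere" if either_in else "arm"].append(abs(tlen))
    return groups
```

The two lists can then be compared with whatever summary or test is appropriate (mean, median, a histogram). Pairs on different chromosomes are skipped here only because they have no defined distance; they may still be informative for translocations.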
    Last edited by saraki; 08-31-2014, 08:19 AM.

  • #2
    Assuming the centromere is very low complexity and the part in the published genome is mainly collapsed repeats, the reads mapping there will be uninformative. You might be able to resolve this by filtering out reads that have low mapping scores or that map to multiple locations. If that does not help, you could mask the genome to remove low-complexity or highly repetitive areas prior to mapping, to avoid false-positive signals.
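The filtering step suggested above could be sketched roughly as follows. This is an illustration under assumptions, not a recommended setting: the `MIN_MAPQ` threshold of 20 is arbitrary, and the way multi-mapping is reported depends on the aligner. The sketch treats a BWA-style `XA:Z:` tag or an `NH:i:` value above 1 as a sign of multiple hits; other aligners may signal this differently.

```python
MIN_MAPQ = 20  # arbitrary illustrative threshold, not a recommendation


def keep_read(sam_line, min_mapq=MIN_MAPQ):
    """Return True if a SAM record passes the MAPQ and multi-mapping filters."""
    f = sam_line.rstrip("\n").split("\t")
    flag, mapq = int(f[1]), int(f[4])
    if flag & 0x4:
        return False  # read is unmapped
    if flag & 0x100:
        return False  # secondary alignment
    if mapq < min_mapq:
        return False  # low mapping quality
    for tag in f[11:]:  # optional fields start at column 12
        if tag.startswith("XA:Z:"):
            return False  # alternative hits reported (assumed BWA convention)
        if tag.startswith("NH:i:") and int(tag[5:]) > 1:
            return False  # more than one reported alignment
    return True
```

Usage would be along the lines of `kept = [l for l in sam_lines if not l.startswith("@") and keep_read(l)]`. Note that for discordant pairs the aligner may already have lowered MAPQ because of the pairing, so a strict threshold can remove real SV-supporting reads along with the noise.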

    P.S. Note my use of the word "Assuming". I'm not really sure about the structure of the centromere.
    Last edited by Brian Bushnell; 08-31-2014, 10:45 AM.


    • #3
      At this point I appear to be looking at unique reads. However, many of them have poor mapping quality scores because of the insert size, so that isn't informative. I didn't look at the per-base quality, though. Is that what you meant?

      I failed to mention that this is human data. I know centromeres contain more repeat regions, but they also contain mapped genes. According to UCSC, these regions are reasonably mappable as well.
