
  • 2QC or !2QC and whether to check for adapter pollution?

    Hi guys,

    When mapping Illumina reads to a reference genome, should QC be performed on the raw fastq files or not? I have been discussing this with various people, and I get the impression that there is a 'for' and an 'against' standpoint amongst NGS scientists. As I understand it, there are basically two points of view:

    1) NO, do not QC! The aligner will take care of the bad quality reads and 3' quality deterioration and adapter contamination by soft clipping. The aligner STAR was mentioned in this context

    2) YES, do QC! There is no reason to 'confuse' the aligner by feeding it reads, bases and adapters that are known to be of bad quality or technical artefacts. Basically, why keep noise if it is easily identified and removed?

    I'm interested in hearing your take on this matter.

    Cheers,
    Leon

  • #2
    In the case of RNAseq it probably doesn't matter too terribly much. The general best practice is to trim adapters (you can be pretty liberal about leaving contamination in) and then optionally perform minimal quality trimming (not removing anything above phred=5). This is assuming that you're using something like STAR that uses local alignment.

    For other *seq experiments the situation can be completely different. With bisulfite sequencing, for example, you'd just decrease the quality of your data by not aggressively trimming adapters (though even there I don't think aggressive quality trimming is particularly useful). It's all a question of how much error you can tolerate in downstream analyses.
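
    To make "minimal quality trimming" concrete, here is a rough sketch of a 3' trimmer that only removes trailing bases at or below phred=5. It assumes Phred+33 encoded quality strings, and the function name `trim_3prime` and its parameters are made up for illustration. Real trimmers generally use more elaborate algorithms, so treat this as the simplest possible version of the idea rather than what any particular tool does.

```python
def trim_3prime(seq, qual, max_trim_phred=5, offset=33):
    """Drop trailing bases whose Phred score is <= max_trim_phred.

    seq and qual are the sequence and quality strings of one FASTQ record.
    offset=33 assumes Phred+33 (Sanger / Illumina 1.8+) encoding.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    end = len(seq)
    # Walk back from the 3' end while the base is at or below the cutoff.
    while end > 0 and ord(qual[end - 1]) - offset <= max_trim_phred:
        end -= 1
    return seq[:end], qual[:end]


# 'I' is Phred 40, '&' is Phred 5, '#' is Phred 2 in Phred+33.
trimmed_seq, trimmed_qual = trim_3prime("ACGTACGT", "IIIII&##")
untouched_seq, untouched_qual = trim_3prime("ACGTACGT", "IIII#III")
```

    Note that a low-quality base in the middle of a read is left alone; only the 3' tail is touched, which is in keeping with the "minimal" part.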


    • #3
      Whether an aligner is local or global, adapter sequence in the data is never going to improve results, and always poses a risk of spurious placement and bad alignments. If you can remove contaminant sequence with high precision, you should always do so prior to any analysis. Even basic questions like "What percent of my DNA came from this organism?" can't be answered usefully if, say, 50% of your data is adapter-dimers. Which you would never know without QC.
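
      As a rough illustration of the kind of check that catches the adapter-dimer case, the sketch below counts the fraction of reads that begin directly with adapter sequence (i.e. there is no insert at all). The function name `adapter_dimer_fraction` is hypothetical, and the default prefix is the commonly cited start of the Illumina TruSeq adapter; substitute whatever adapter your library prep actually used. It only does exact prefix matching, so it will undercount reads with sequencing errors in the adapter.

```python
def adapter_dimer_fraction(reads, adapter_prefix="AGATCGGAAGAGC"):
    """Return the fraction of reads that start with the adapter prefix.

    reads is any iterable of read sequences (strings).
    adapter_prefix is an assumption; use the adapter for your own kit.
    """
    reads = list(reads)
    if not reads:
        return 0.0
    # A read that starts with adapter has essentially no insert: a dimer.
    dimers = sum(1 for read in reads if read.startswith(adapter_prefix))
    return dimers / len(reads)


example_reads = [
    "AGATCGGAAGAGCACACGTC",  # starts with adapter: dimer
    "ACGTACGTACGTAGATCGG",   # real insert, adapter only near the 3' end
    "AGATCGGAAGAGCGTCGTGT",  # starts with adapter: dimer
    "TTTTGGGGCCCCAAAA",      # no adapter
]
fraction = adapter_dimer_fraction(example_reads)
```

      With the four toy reads above, half of the "data" is adapter-dimer, which is exactly the situation where a percent-of-organism estimate would be meaningless.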

      Quality-trimming is another matter, and as Devon said, it depends on how much error can be tolerated downstream. Mapping, for example, can be more accurate when including low-quality bases; if you have a 120bp repeat element, a 150bp read with 50 low-quality bases may map uniquely while a 100bp trimmed read will not. But assemblers may be less tolerant.
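
      The repeat-element argument can be checked with a toy example. The sketch below builds a random "genome" containing two copies of a 120bp repeat, takes a 150bp read that starts at the beginning of one copy and runs 30bp into the unique flank, and compares it with the same read trimmed to 100bp. Everything here (the helper names, the genome layout, exact-match counting in place of a real aligner) is an assumption made for illustration; real low-quality bases would contain some errors, and a real aligner would tolerate mismatches.

```python
import random


def count_hits(genome, read):
    """Count exact (possibly overlapping) occurrences of read in genome."""
    hits = 0
    start = genome.find(read)
    while start != -1:
        hits += 1
        start = genome.find(read, start + 1)
    return hits


rng = random.Random(0)  # fixed seed so the toy genome is reproducible


def rand_dna(n):
    return "".join(rng.choice("ACGT") for _ in range(n))


repeat = rand_dna(120)
# Two copies of the repeat, each surrounded by unique random sequence.
genome = rand_dna(500) + repeat + rand_dna(500) + repeat + rand_dna(500)

copy_start = 500  # first repeat copy begins here
read_150 = genome[copy_start:copy_start + 150]  # 120bp repeat + 30bp flank
read_100 = read_150[:100]  # 50 bases trimmed from the 3' end: repeat only

hits_150 = count_hits(genome, read_150)
hits_100 = count_hits(genome, read_100)
```

      The untrimmed read is anchored by its 30bp of unique flank and has a single placement, while the trimmed read sits entirely inside the repeat and matches both copies equally well.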
      Last edited by Brian Bushnell; 10-06-2014, 08:38 AM.
