  • arcolombo698
    Senior Member
    • Nov 2013
    • 142

    Using BLAST and Quality Control

    Hello.

    I am trying to decide whether to trim out the duplicated sequences that FastQC reports. It has flagged a lot of duplicated sequences in my RNA-Seq project.

    I have already trimmed the adapters, and I expect duplication levels to be higher for an RNA-Seq experiment.

    Now that FastQC has listed the overrepresented sequences, how do I determine from their BLAST hits whether I should trim the duplicated sequences out or keep them?

    I could paste the duplicated sequences into the adapter.fa file and have Trimmomatic remove them.

    However, some of the duplicated sequences are mitochondrial RNA, which is interesting to look at.

    Other sequences correspond to RNA from chromosomes 16, 2, X, etc. Should I remove these?

    I also found a repeated sequence whose BLAST hit is "Homo sapiens unplaced genomic contig, GRCh37.p13 Primary Assembly", and I am not sure whether I should cut it out.
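For readers who want to see which sequences dominate a file before BLASTing them, here is a minimal sketch that tallies identical read sequences in a FASTQ stream. It is not FastQC's own method. It assumes plain four-line FASTQ records, and the function names `count_sequences` and `overrepresented`, the 0.5 cutoff, and the toy reads are hypothetical, chosen only for illustration.

```python
from collections import Counter
from io import StringIO
from typing import List, TextIO, Tuple


def count_sequences(fastq: TextIO) -> Counter:
    """Count identical read sequences, assuming 4 lines per FASTQ record."""
    counts: Counter = Counter()
    for i, line in enumerate(fastq):
        if i % 4 == 1:  # the second line of each record holds the bases
            counts[line.strip()] += 1
    return counts


def overrepresented(counts: Counter, min_fraction: float) -> List[Tuple[str, float]]:
    """Return (sequence, fraction of all reads) pairs at or above min_fraction."""
    total = sum(counts.values())
    return [
        (seq, n / total)
        for seq, n in counts.most_common()
        if n / total >= min_fraction
    ]


# Toy input: four made-up reads, three of them identical.
toy_fastq = StringIO(
    "@r1\nACGTACGT\n+\nIIIIIIII\n"
    "@r2\nACGTACGT\n+\nIIIIIIII\n"
    "@r3\nTTTTGGGG\n+\nIIIIIIII\n"
    "@r4\nACGTACGT\n+\nIIIIIIII\n"
)

counts = count_sequences(toy_fastq)
for seq, fraction in overrepresented(counts, min_fraction=0.5):
    print(seq, fraction)
```

A tally like this only says which sequences are frequent. Whether they should be kept is a separate question.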
  • kmcarr
    Senior Member
    • May 2008
    • 1181

    #2
    Short answer: do not do any duplicate removal on RNA-Seq data.

    Somewhat longer answer: the purpose of duplicate removal is to eliminate reads duplicated during the PCR amplification step. If you are sequencing genomic DNA and see two identical reads (or read pairs), the probability is extremely high that they are the result of PCR duplication, so you should remove all but one copy of the read. For RNA-Seq data you cannot assume PCR duplication. A highly expressed transcript yields many fragments from the same short stretch of sequence, so the opposite assumption, that identical reads are truly independent rather than PCR duplicates, is the more likely explanation. You should therefore keep all of these identical reads.
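The reasoning above can be illustrated with a rough back-of-the-envelope sketch. It assumes single-end reads whose start positions are drawn uniformly at random, with no PCR duplication at all, and asks what fraction of reads would still share a start site with another read. The function name `expected_duplicate_fraction` and the numbers (10,000 reads, a 2,000-base transcript, a 3-billion-base genome) are hypothetical, chosen only for illustration.

```python
import math


def expected_duplicate_fraction(n_reads: int, n_start_sites: int) -> float:
    """Expected fraction of reads that repeat an already-used start site.

    Assumes each read independently picks one of n_start_sites positions
    uniformly at random (an idealisation, not a model of real libraries).
    """
    if n_reads <= 0 or n_start_sites <= 0:
        raise ValueError("n_reads and n_start_sites must be positive")
    # Probability that a given site is hit at least once:
    # 1 - (1 - 1/L)^N, computed with log1p/expm1 for numerical stability.
    p_hit = -math.expm1(n_reads * math.log1p(-1.0 / n_start_sites))
    expected_distinct_sites = n_start_sites * p_hit
    return 1.0 - expected_distinct_sites / n_reads


# Hypothetical scenario 1: 10,000 reads from one 2,000-base transcript.
print(expected_duplicate_fraction(10_000, 2_000))

# Hypothetical scenario 2: 10,000 reads spread over a 3-billion-base genome.
print(expected_duplicate_fraction(10_000, 3_000_000_000))
```

Under these assumptions, roughly 80% of the reads from the short transcript coincide with another read purely by chance, while coincidences across the genome are vanishingly rare. That is why identical reads point to PCR duplication in genomic DNA but not in RNA-Seq.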


    • arcolombo698
      Senior Member
      • Nov 2013
      • 142

      #3
      Thank you so much. You are correct.
