  • Problems with QC results of Illumina Body Map 2.0

    Dear all:

    I'm new in the sequencing analysis.

    Recently, I've downloaded the RNA-seq data of Illumina Body Map 2.0.
    (The accession number is GSE30611.)

    The data were processed as follows:
    (1) All the SRA files were converted to Sanger FASTQ.
    (2) The Sanger FASTQ files were uploaded to Galaxy without any manipulation.
    (3) The FASTX-Toolkit was used to draw the quality plots of these sequences.
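For readers who want to reproduce the kind of per-position summary that a quality plot displays, here is a minimal sketch in plain Python. It assumes Sanger (Phred+33) encoding and strict 4-line FASTQ records; the function names are made up for illustration and this is not the FASTX-Toolkit's own implementation.

```python
import statistics
from collections import defaultdict

PHRED_OFFSET = 33  # assumption: Sanger / Phred+33 encoded qualities


def read_fastq_qualities(lines):
    """Yield a list of integer Phred scores for each 4-line FASTQ record."""
    lines = [ln.rstrip("\n") for ln in lines]
    for i in range(0, len(lines) - 3, 4):
        qual = lines[i + 3]  # 4th line of each record holds the quality string
        yield [ord(ch) - PHRED_OFFSET for ch in qual]


def per_position_quartiles(quality_lists):
    """Return {1-based position: (Q1, median, Q3)} computed across all reads.

    Needs at least two reads covering each position.
    """
    by_pos = defaultdict(list)
    for quals in quality_lists:
        for pos, q in enumerate(quals, start=1):
            by_pos[pos].append(q)
    summary = {}
    for pos, scores in sorted(by_pos.items()):
        q1, med, q3 = statistics.quantiles(scores, n=4, method="inclusive")
        summary[pos] = (q1, med, q3)
    return summary


# Tiny hand-made example: five 2-base reads.
example = [
    "@r1", "AC", "+", "+I",
    "@r2", "AC", "+", "5I",
    "@r3", "AC", "+", "?I",
    "@r4", "AC", "+", "II",
    "@r5", "AC", "+", "II",
]
summary = per_position_quartiles(read_fastq_qualities(example))
for pos, (q1, med, q3) in summary.items():
    print(f"position {pos}: Q1={q1} median={med} Q3={q3}")
```

For a real file, the same functions could be fed with `open("reads.fastq")` (a placeholder path), keeping in mind that the parser reads all lines into memory.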

    In the end, I got some strange quality control results.
    For example, both the single-end and paired-end reads from brain tissue show poor quality scores from the first base through the 10th base (shown in the following figures). This is not a typical QC plot for Illumina sequencing.

    Single-end:

    Paired-end:


    I think there may have been a problem in the library construction step.
    Can anyone give me an idea of why I got these results?

    Thanks!

    Best,
    Yi
    Yi John Huang (PhD student)
    886-3-2118800 ext. 3731
    Graduate Institute of Biomedical Science, Chang Gung University

  • #2
    I believe the pattern of lower quality scores in the beginning is quite common in Illumina data. Overall, these quality scores are remarkably good. The median never drops below 30!
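To put the Q30 figure in context: a Phred score Q corresponds to an estimated base-call error probability of 10^(-Q/10), so Q30 is roughly one error in 1,000 calls. A minimal sketch of that conversion follows; the function name is made up for illustration.

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score Q to the estimated base-call error probability."""
    return 10 ** (-q / 10)


for q in (10, 20, 30, 40):
    p = phred_to_error_probability(q)
    print(f"Q{q}: error probability {p:g}, accuracy {100 * (1 - p):.2f}%")
```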


    • #3
      Originally posted by kopi-o
      I believe the pattern of lower quality scores in the beginning is quite common in Illumina data. Overall, these quality scores are remarkably good. The median never drops below 30!
      Thanks for replying!

      In fact, I've never seen this kind of data before...
      The data I processed previously had lower quality only at the end of the reads...

      So I'm confused...

      Why, or how, does Illumina sequencing produce lower quality scores at the beginning of the reads?

      Thanks again!
      Yi John Huang (PhD student)
      886-3-2118800 ext. 3731
      Graduate Institute of Biomedical Science, Chang Gung University


      • #4
        See this thread:

        Bridged amplification & clustering followed by sequencing by synthesis. (Genome Analyzer / HiSeq / MiSeq)


        • #5
          Excuse me, I have another question:

          The first quartile (25th percentile) for the first 10 bases in this data is below Q30, and even below Q20. That means 25% of the reads have poor quality scores in the first 10 bases, doesn't it?

          Should I trim the first 10 low-quality bases from all sequences at the beginning of data processing? Or should I remove only the sequences that contain low quality scores?
          Yi John Huang (PhD student)
          886-3-2118800 ext. 3731
          Graduate Institute of Biomedical Science, Chang Gung University
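The two options raised in the question above can be sketched as follows. This is only an illustration under stated assumptions: Sanger (Phred+33) encoding, and hypothetical function names and thresholds (`n=10`, `min_mean_q=20`) that are not taken from the thread or from any particular tool. Neither option is presented here as the recommended one.

```python
PHRED_OFFSET = 33  # assumption: Sanger / Phred+33 encoded qualities


def trim_leading_bases(seq, qual, n=10):
    """Option A: hard-trim the first n bases (and their qualities) from a read."""
    return seq[n:], qual[n:]


def mean_quality(qual, start=0, end=None):
    """Mean Phred score over qual[start:end]; 0.0 for an empty slice."""
    scores = [ord(ch) - PHRED_OFFSET for ch in qual[start:end]]
    return sum(scores) / len(scores) if scores else 0.0


def keep_read(qual, n=10, min_mean_q=20):
    """Option B: keep a read only if the mean quality of its first n bases
    reaches min_mean_q (a hypothetical threshold)."""
    return mean_quality(qual, 0, n) >= min_mean_q


# One read whose first 10 bases are Q10 and whose last 5 bases are Q40.
seq = "ACGTACGTACGTACG"
qual = "+" * 10 + "I" * 5

trimmed_seq, trimmed_qual = trim_leading_bases(seq, qual)
print("Option A keeps:", trimmed_seq, trimmed_qual)
print("Option B keeps the read:", keep_read(qual))
```

Option A shortens every read, including reads whose first bases are fine; Option B keeps full-length reads but discards some entirely. Note also that a per-position quartile does not say whether it is the same reads that score poorly at every one of the first 10 positions.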
