  • Help with FastQC results

    Hi,
    I have two sets of Illumina single-end 50 bp RNA-Seq data (from two different days of mammalian cell culture). The kit used was the KAPA Stranded RNA-Seq Kit with RiboErase.

    Unfortunately the results from FastQC are not as expected. But the problem is that I am not exactly sure how to interpret the data and what to say about the plots.

    Both datasets show the same results. The per base sequence quality plot looks OK (I think), as does the adapter content plot, but the GC content and Kmer content plots look very weird, and so do the duplication levels.

    I would be happy to get any advice about what is wrong with this data, or possible explanations for these results.

    Thanks for any help

    Ileana

    First three entries under "Overrepresented sequences":

    Sequence: CGACGGGGGGCCCCGCGGGGCCGAGAAGAAGAGGAGGGGGAGGCGAGGAGG
    Count: 187325
    Percentage: 1.0857026079582217
    Possible Source: No Hit

    Sequence: GGACAGGAGAGCGGTCGCGCCGTGGGAGGGGCGGCCCGGCCCCCACCGCGG
    Count: 98598
    Percentage: 0.571456590094567
    Possible Source: No Hit

    Sequence: CCCGAGACGAGTGGCTCTCCGCACCGGTCCCCGGTCCCGACGCGCGGCGGG
    Count: 95732
    Percentage: 0.5548457603899987
    Possible Source: No Hit
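
As a quick sanity check on the sequences listed above, their GC fraction can be computed directly and compared with the roughly 40% genome-wide figure mentioned later in the thread. This is only a minimal sketch: the helper name `gc_fraction` is made up for illustration and is not part of FastQC.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# The three overrepresented sequences reported by FastQC in the post above.
overrepresented = [
    "CGACGGGGGGCCCCGCGGGGCCGAGAAGAAGAGGAGGGGGAGGCGAGGAGG",
    "GGACAGGAGAGCGGTCGCGCCGTGGGAGGGGCGGCCCGGCCCCCACCGCGG",
    "CCCGAGACGAGTGGCTCTCCGCACCGGTCCCCGGTCCCGACGCGCGGCGGG",
]

for seq in overrepresented:
    # Print a short prefix of each sequence next to its GC percentage.
    print(f"{seq[:12]}...  GC = {gc_fraction(seq):.1%}")
```

If these values come out far above the expected genome-wide GC content, that is consistent with a small number of very GC-rich sequences pulling the per-read GC distribution upward.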

  • #2
    Out of the graphs you attached the GC content one looks rather strange. Is this known to be an extremely GC rich organism?


    • #3
      No, the GC content is around 40%


      • #4
        Have you checked a few sequences (e.g. with BLAST) to see whether they are from the right organism (and are not some kind of contamination)?


        • #5
          I did a quick search and didn't find any likely contamination. Could it be rRNA and/or mitochondrial RNA? I did find some of these.


          • #6
            Neither of those should skew GC content that way. Perhaps someone else will have further suggestions.

            You should go ahead and start analyzing the data.


            • #7
              According to the plots, the GC content of the reads is about 60%. This seems to be the result of some GC-rich reads with high duplication rates (20% have a duplication level over 10k). I would check the reads with a duplication level over 1k to see what they are and whether they make biological sense. If they are not rRNA or from repetitive regions, I would suspect library prep issues.
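
One way to pull out the highly duplicated reads suggested above is to count exact-duplicate sequences in the FASTQ file and list those above a threshold together with their GC fraction. The following is a rough sketch, not FastQC's own method: the function names `read_sequences` and `highly_duplicated` are hypothetical, it assumes a plain four-line-per-record FASTQ, and it holds every distinct sequence in memory.

```python
import io
from collections import Counter
from typing import Iterable, Iterator, TextIO


def read_sequences(handle: TextIO) -> Iterator[str]:
    """Yield the sequence line of each record, assuming 4 lines per FASTQ record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()


def highly_duplicated(
    seqs: Iterable[str], min_count: int = 1000
) -> list[tuple[str, int, float]]:
    """Return (sequence, count, GC fraction) for sequences seen at least min_count times."""
    counts = Counter(seqs)
    rows = []
    # most_common() is sorted by descending count, so we can stop at the first miss.
    for seq, n in counts.most_common():
        if n < min_count:
            break
        gc = (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0
        rows.append((seq, n, gc))
    return rows


# Tiny in-memory FASTQ used only to demonstrate the call; real data would be
# opened with open("reads.fastq") (a placeholder file name).
demo = io.StringIO(
    "@r1\nGGCC\n+\nIIII\n"
    "@r2\nGGCC\n+\nIIII\n"
    "@r3\nATAT\n+\nIIII\n"
)
for seq, n, gc in highly_duplicated(read_sequences(demo), min_count=2):
    print(seq, n, f"{gc:.0%}")
```

The sequences this reports can then be searched against rRNA, mitochondrial, and repeat references to see whether the duplication makes biological sense.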


              • #8
                Dear Ileanadrt,

                I am a member of the Kapa Biosystems Technical Support Team.

                We would love to help you troubleshoot this further. Would you be willing to share the type of mammalian cell line you are using?

                Thanks and best regards,
                Adriana
