  • hamcan
    Member
    • Nov 2016
    • 19

    FastQC Report

    I sequenced environmental samples on a HiSeq. The purpose of the run was to BLAST my reads against the NCBI nr database to see which species they match. I am not doing a de novo assembly or genome assembly.

    My FastQC report passes every module except per base sequence content, per sequence GC content, and Kmer content.
    Should I be worried? How much should I rely on a FastQC report?

    Thank you in advance!
  • mastal
    Senior Member
    • Mar 2009
    • 666

    #2
    I would worry most about the per base sequence content, depending on what the plot looks like and why it didn't pass.

    • hamcan
      Member
      • Nov 2016
      • 19

      #3
      Originally posted by mastal:
      I would worry most about the per base sequence content, depending on what the plot looks like and why it didn't pass.
      Below are the plots from the failed FastQC modules:
      Attached Files

      • mastal
        Senior Member
        • Mar 2009
        • 666

        #4
        The per base sequence content looks OK, though you might need to trim the last few bases at the 3' ends of the reads. The Kmer plot suggests there might also be adapters at the 3' ends of the reads.
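
To look at the numbers behind a per base sequence content plot, a minimal Python sketch like the one below could help. It is not FastQC's own code: the function names are made up for illustration, and the FASTQ reader assumes an uncompressed file with exactly four lines per record. The 10% and 20% figures in the comments are the warn and fail thresholds described in the FastQC documentation for this module.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List


def read_fastq_sequences(path: str) -> Iterator[str]:
    """Yield the sequence line of each record (assumes 4 lines per record)."""
    with open(path) as handle:
        for line_no, line in enumerate(handle):
            if line_no % 4 == 1:
                yield line.strip()


def per_base_content(reads: Iterable[str]) -> List[Dict[str, float]]:
    """For each read position, return the percentage of A, C, G, T and N."""
    counts: List[Counter] = []
    for read in reads:
        for i, base in enumerate(read.upper()):
            if i == len(counts):
                counts.append(Counter())
            counts[i][base] += 1
    content = []
    for position_counts in counts:
        total = sum(position_counts.values())
        content.append({b: 100.0 * position_counts[b] / total for b in "ACGTN"})
    return content


def max_pair_skew(content: List[Dict[str, float]]) -> float:
    """Largest |A - T| or |G - C| difference (in percentage points) at any position.

    FastQC warns above 10 and fails above 20 for this module.
    """
    return max(max(abs(p["A"] - p["T"]), abs(p["G"] - p["C"])) for p in content)
```

For example, `max_pair_skew(per_base_content(read_fastq_sequences("reads.fastq")))` gives a single number to compare against those thresholds (`reads.fastq` is a placeholder file name). Printing the per-position dictionaries shows whether the skew sits at the 5' end, the 3' end, or both.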

        • Zapages
          Member
          • Oct 2012
          • 98

          #5
          I would trim the first 19 bp at the 5' end (which are probably adapters) and the last 50 bp at the 3' end.

          I would also suggest increasing the k-mer length to k=10 in FastQC to get a better idea of how much to trim at the 3' end.

          All the best with your project.

          -Zapages

          • Brian Bushnell
            Super Moderator
            • Jan 2014
            • 2709

            #6
            Originally posted by Zapages:
            I would trim the first 19 bp at the 5' end (which are probably adapters) and the last 50 bp at the 3' end.
            I think not; Nextera libraries normally look like that at the beginning due to shearing bias, but the bases are correct. The 3' end looks like adapter sequence, though, and should be adapter-trimmed.

            • GenoMax
              Senior Member
              • Feb 2008
              • 7142

              #7
              Originally posted by Zapages:
              I would trim the first 19 bp at the 5' end (which are probably adapters) and the last 50 bp at the 3' end.

              I would also suggest increasing the k-mer length to k=10 in FastQC to get a better idea of how much to trim at the 3' end.
              No trimming necessary. Refer to this post by Dr. Simon Andrews, author of FastQC.

              • Zapages
                Member
                • Oct 2012
                • 98

                #8
                Originally posted by GenoMax:
                No trimming necessary. Refer to this post by Dr. Simon Andrews, author of FastQC.
                A very interesting development, and something I always wondered about too when I was working on my data sets.

                "Since the biased composition is created by the selection of sequencing fragments and not by base call errors the only effect of trimming would be to change from having a library which starts over biased positions, to having a library which starts slightly downstream of biased positions."

                Thank you for sharing.

                I did a lot of RNA-Seq analysis last year and earlier this year, and this was not known at the time. When I have free time, I will definitely go back and check some of my old results to see if there is any improvement in my differential expression results.

                "Whilst the warnings generated by this problem reflect a real issue it’s not something which can be fixed, and doesn’t seem to have any serious consequences for downstream analysis. Ironically if you are producing RNA-Seq libraries it would make for better QC if you were to focus on libraries which didn’t have this artefact in them, as they would be the ones which were truly suspicious."
                I guess we should go with more expensive PCR-free approaches: https://konradpaszkiewicz.wordpress....biased-genome/

                Thoughts?

                Would you recommend this approach for older data generated with TruSeq library prep kits, or for data with really messy 5' ends? Thinking back, I remember dealing with some pretty messy RNA-Seq data from Illumina HiSeq 2500 machines that had to be cleaned up. I will give my old results another look when I am free.

                • hamcan
                  Member
                  • Nov 2016
                  • 19

                  #9
                  Originally posted by Brian Bushnell:
                  I think not; Nextera libraries normally look like that at the beginning due to shearing bias, but the bases are correct. The 3' end looks like adapter sequence, though, and should be adapter-trimmed.
                  Hey, it was adapter-trimmed at the 3' end! So I'm not sure what is going on. Any suggestions?

                  • hamcan
                    Member
                    • Nov 2016
                    • 19

                    #10
                    Originally posted by GenoMax:
                    No trimming necessary. Refer to this post by Dr. Simon Andrews, author of FastQC.
                    This would explain the 5' end, but how about the 3' end?

                    • Brian Bushnell
                      Super Moderator
                      • Jan 2014
                      • 2709

                      #11
                      You may have used the wrong adapter sequences, or simply had incomplete trimming. I suggest starting with the raw reads and performing adapter-trimming as in the post I linked, then looking at the results.
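
To illustrate what 3' adapter trimming does, and how a wrong adapter sequence leaves reads untouched, here is a minimal Python sketch. It is an illustration only, not how any particular trimming tool works: the function name is hypothetical, matching is exact (real trimmers tolerate mismatches), and the adapter in the usage note assumes a Nextera library, which nobody in this thread has confirmed.

```python
def trim_3prime_adapter(read: str, adapter: str, min_overlap: int = 3) -> str:
    """Remove an adapter from the 3' end of a read using exact matching.

    Two cases are handled:
      1. The full adapter occurs somewhere in the read: cut from its start.
      2. Only the beginning of the adapter fits before the read ends:
         cut the longest adapter prefix (>= min_overlap) found at the 3' end.
    """
    # Case 1: complete adapter inside the read.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]

    # Case 2: partial adapter at the very end of the read, longest prefix first.
    longest = min(len(adapter) - 1, len(read))
    for overlap in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[: len(read) - overlap]

    # No adapter found: return the read unchanged.
    return read
```

With the Nextera transposase adapter `CTGTCTCTTATACACATCT`, a read ending in `...CTGTCTC` loses those last seven bases, while the same read passed with a different adapter sequence comes back unchanged. That is one way leftover adapter can survive a trimming step and still show up in the Kmer plot.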
