Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • Vinn
    Member
    • Nov 2014
    • 21

    K-mer content failed on 5' end - advice needed

    Hi folks,

    I am trying to do adapter and low quality trimming of a fungal genome (prepared with Illumina DNA nano kit and sequenced with HiSeq 2000 100PE). After using BBduk to trim adapters and low quality reads as following

    >./bbduk.sh in1=R1.fastq.gz in2=R2.fastq.gz out1=R1_q25.fastq.gz out2=R2_q25.fastq.gz ktrim=r k=21 mink=11 hdist=2 tpe tbo ref=resources/adapters.fa qtrim=rl trimq=25

    Still FASTQC showed a K-mer content warning for both R1 and R2 reads [ https://goo.gl/photos/Lsyt7YJeQnjB8HQq5 ]. Can I have your opinion how shall I handle my data? Shall I just remove the first 20 bases to be on a safe side? Or it is normal behavior for a library prepared with the nano kit?

    Thanks in advance and have a great day!
    Last edited by Vinn; 04-21-2017, 06:47 AM.
  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #2
    What kind of analysis are you trying to do? In general I have never worried about k-mer warnings from FastQC.

    Comment

    • Vinn
      Member
      • Nov 2014
      • 21

      #3
      Originally posted by GenoMax View Post
      What kind of analysis are you trying to do? In general I have never worried about k-mer warnings from FastQC.
      Hi GenoMax, thanks for your reply. I would like to do de novo assembly.

      Comment

      • GenoMax
        Senior Member
        • Feb 2008
        • 7142

        #4
        Take a look at @Brian's suggestions in this thread. I have provided a link for a specific post but take a look at the whole thread. He should be along with more later.

        Comment

        • Vinn
          Member
          • Nov 2014
          • 21

          #5
          Thank you, I will read the thread through.

          Comment

          • Brian Bushnell
            Super Moderator
            • Jan 2014
            • 2709

            #6
            Kmer-content spikiness at the beginning of the read is normal for many fragmentation methodologies and should not be removed. I'm not sure what's going on at the end, though...

            Comment

            • Vinn
              Member
              • Nov 2014
              • 21

              #7
              Thanks for your reply Brian. Just to be on a safe side, do you think it is better to trim the end off?

              Comment

              • Brian Bushnell
                Super Moderator
                • Jan 2014
                • 2709

                #8
                Excessive trimming reduces accuracy, and will degrade the results of any experiment. If you want to be confident that bases are genomic rather than artificial, I suggest you follow this methodology:

                1) Map the reads to the reference (if you don't have a reference, you can make a quick assembly with Tadpole) with BBMap like this:

                Code:
                bbmap.sh in=reads.fq ref=ref.fa mhist=mhist.txt qhist=qhist.txt
                2) Plot mhist with R or Excel with a log-scale Y-axis to look at the positional error rates.

                If there is not an increased error rate in a region of the read, there is no reason to trim it. And conversely, it is prudent to trim if there is a high error rate at one end or the other.

                Comment

                • Vinn
                  Member
                  • Nov 2014
                  • 21

                  #9
                  Thanks so much Brian for your advice. I will try as you suggested.

                  Comment

                  Latest Articles

                  Collapse

                  • SEQadmin2
                    From Collection to Sequencing: Why Sample Preparation and Preservation Define Sequencing Data
                    by SEQadmin2


                    Data variability is still an issue in sequencing technologies despite the advances in reproducibility and accuracy of these platforms. But the problem does not originate in the sequencing itself, but in the previous steps, before the sample reaches the sequencer.


                    The first step is collection, followed by preservation and sample preparation for analysis. Most scientists overlook those steps, but not being careful might just be skewing the experiment’s results.
                    ...
                    06-02-2026, 10:05 AM
                  • SEQadmin2
                    Single-Cell Sequencing at an Inflection Point: Early Impacts of New Platforms and Emerging Trends
                    by SEQadmin2


                    With the launch of new single-cell sequencing platforms in 2026, the field stands at an exciting inflection point. This article surveys the most impactful advances in the field and discusses how they’re reshaping research in cancer, immunology, and beyond.


                    Introduction

                    Single-cell sequencing technologies have undergone remarkable advances over the past decade, transitioning from low-throughput experimental approaches to highly scalable platforms capable of...
                    05-22-2026, 06:42 AM
                  • SEQadmin2
                    Environmental Genomics in the Age of NGS: From Microbes to Conservation Strategies
                    by SEQadmin2

                    Studying ecosystems means dealing with complex, multi-species communities that are hard to observe at scale. This complexity, however, hides many important questions to be answered, from how biogeochemical cycles work and how climate change can affect species distribution to how conservation strategies can work best.


                    Genomics, particularly since the expansion of NGS, has transformed ecosystem ecology. By sequencing environmental DNA, we can now assess biodiversity without direct...
                    05-06-2026, 09:04 AM

                  ad_right_rmr

                  Collapse

                  News

                  Collapse

                  Topics Statistics Last Post
                  Started by SEQadmin2, 06-02-2026, 12:03 PM
                  0 responses
                  19 views
                  0 reactions
                  Last Post SEQadmin2  
                  Started by SEQadmin2, 06-02-2026, 11:40 AM
                  0 responses
                  14 views
                  0 reactions
                  Last Post SEQadmin2  
                  Started by SEQadmin2, 05-28-2026, 11:40 AM
                  0 responses
                  29 views
                  0 reactions
                  Last Post SEQadmin2  
                  Started by SEQadmin2, 05-26-2026, 10:12 AM
                  0 responses
                  31 views
                  0 reactions
                  Last Post SEQadmin2  
                  Working...