  • gspirito
    Junior Member
    • Nov 2019
    • 2

    Extract reads from paired-end fastq based on specific adapters with bbduk

    Hello everyone, I am using bbduk.sh (from the BBMap toolkit) to extract reads from paired-end FASTQ files based on the presence of a specific adapter at the 5' end of the sequence in the "_1" FASTQ file.

    I am using this command:

    Code:
    ./bbmap/bbduk.sh -Xmx1g in1=reads_1.fastq.gz in2=reads_2.fastq.gz outm1=matched1.fastq.gz outm2=matched2.fastq.gz literal=AAACCTGAGAAACCTA k=16 hdist=0 -rcomp=f
    The problem is that, other than the correct reads, the output file also contains reads which do not include the adapter sequence, e.g.:

    # from reads_1.fastq.gz
    @SRR9262917.232075 232075/1
    GCATGCGAGTAGCGGTGGTTCTTATA
    +
    FFFFFFFFFFFFFFFFFFFFFFFFFF

    # from reads_2.fastq.gz
    @SRR9262917.232075 232075/2
    AAGCAGTGGTATCAACGCAGAGTACATGGGATTCCATAGCCCTGTGGTTTTTATAGATCTTGTAAACCCCAAACCTGGGAAACCTAGTGGC
    +
    FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFF,FFFFFFFFF

    Does anyone know why this may be happening and how to avoid this?

    Thanks in advance.
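For the example pair above, a quick check suggests where the hit may come from. The sketch below slides the 16-base literal along read 2 and reports the closest window; `best_window` is a hypothetical helper written for illustration (plain awk, not part of BBTools). Read 2 contains AAACCTGGGAAACCTA, which differs from the literal only at base 8 of 16. BBDuk has a `maskmiddle` option which, as far as I know, is on by default and treats the middle base of each k-mer as a wildcard, so this may be why the pair is kept even with hdist=0. Treat that as a possible explanation rather than a confirmed diagnosis.

```shell
#!/bin/sh
# Literal and read 2 copied from the post above.
ADAPTER="AAACCTGAGAAACCTA"
READ2="AAGCAGTGGTATCAACGCAGAGTACATGGGATTCCATAGCCCTGTGGTTTTTATAGATCTTGTAAACCCCAAACCTGGGAAACCTAGTGGC"

# Hypothetical helper: prints "<mismatches> <1-based offset> <window>"
# for the window of the read that is closest to the adapter.
best_window() {
  awk -v a="$1" -v r="$2" 'BEGIN {
    k = length(a); best = k + 1
    for (i = 1; i + k - 1 <= length(r); i++) {
      d = 0
      for (j = 1; j <= k; j++)
        if (substr(r, i + j - 1, 1) != substr(a, j, 1)) d++
      # strict "<" keeps the first window on ties
      if (d < best) { best = d; pos = i; win = substr(r, i, k) }
    }
    print best, pos, win
  }'
}

best_window "$ADAPTER" "$READ2"
```

The first field of the output is the number of mismatches in the closest window; a value of 1 here means read 2 carries a near-copy of the adapter, far from its 5' end.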
  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #2
    You could add "restrictleft=N", where N is the number of bases at the left end of the read to search, so that matches are only looked for in that region. Adding "minlength=N" will also exclude short reads like the first example. Also try setting k to something smaller (e.g. k=8) so it has a better chance of matching correctly.

    I hope "-rcomp=f" is a typo. There should be no "-" at the beginning.
    Last edited by GenoMax; 11-08-2019, 06:44 AM.
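Put together, the suggestions above might look like the sketch below. The values are assumptions for illustration: restrictleft is set to the length of the literal (16) on the assumption that the adapter starts at the first base of the read, and k=8 is the value mentioned in the reply. minlength is left out because a sensible N depends on the read lengths in your data. Note that restrictleft applies to both reads of a pair, so the first 16 bases of read 2 are searched as well.

```shell
#!/bin/sh
# Sketch only: values are assumptions, not tested settings.
ADAPTER="AAACCTGAGAAACCTA"      # literal from the original post
RESTRICT_LEFT=${#ADAPTER}       # assumed: search only the first 16 bases
K=8                             # smaller k, as suggested in the reply

# rcomp=f is written without a leading dash.
CMD="./bbmap/bbduk.sh -Xmx1g in1=reads_1.fastq.gz in2=reads_2.fastq.gz outm1=matched1.fastq.gz outm2=matched2.fastq.gz literal=$ADAPTER k=$K hdist=0 rcomp=f restrictleft=$RESTRICT_LEFT"

# Show the command, and run it only if bbduk.sh is present at that path.
echo "$CMD"
if [ -x ./bbmap/bbduk.sh ]; then
  $CMD
fi
```

Afterwards it is worth spot-checking matched1.fastq.gz, for example by looking at the first 16 bases of a few reads, to confirm that the filter behaves as intended.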

    • gspirito
      Junior Member
      • Nov 2019
      • 2

      #3
      Thank you! That worked
