  • Adapter trimming with BBduk

    Hi, I am working with some MiSeq 16S and ITS2 amplicon sequence data generated by JGI. Previously I used the data after it had been quality-controlled and merged, but now I am going back to the original raw interleaved files to learn how to do the initial steps myself. The first question I have is about adapter trimming. I am using BBDuk and its adapters.fa reference file to trim adapters. This is a relatively simple (possibly silly) question, but the example in the BBDuk manual trims the right end (3' adapter) by specifying "ktrim=r", with no left trimming. Is there a reason trimming on the 5' end is not necessary, or should it be done as well (with "ktrim=l")?

    Additionally, it seems like most of the discussion is about trimming adapters and PCR primers. However, is there any need to look for artifacts associated with the forward and reverse primer pads?

    Thanks

  • #2
    Originally posted by PeatMaster
    This is a relatively simple (possibly silly) question, but the example in the BBDuk manual trims the right end (3' adapter) by specifying "ktrim=r", with no left trimming. Is there a reason trimming on the 5' end is not necessary, or should it be done as well (with "ktrim=l")?
    For good libraries (ones that use standard protocols, without inline barcodes at the beginning of reads, etc.), one expects contaminants (e.g. adapters) to show up at the end of a read. This is especially true if the insert turns out to be shorter than you expect (so you are sequencing beyond the length of the fragment). Once the insert is completely sequenced, the read will run into the adapter at the 3'-end (and even beyond it into the void; you will see AAAAA etc. if that happens).

    If you expect to have contaminants present on the left (5'-end) of the reads you can certainly run ktrim=l.
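
    The read-through described above can be illustrated with a toy Python sketch. It is not BBDuk's implementation: the insert sequences, the read length, and the helper names are made up for illustration, the adapter is an example sequence, and only exact k-mer matches are considered. It shows a short insert running into the adapter and then into poly-A, and a right trim that cuts at the first adapter k-mer, in the spirit of ktrim=r.

```python
# Toy model of adapter read-through (illustrative assumptions only).
ADAPTER = "AGATCGGAAGAGC"  # example adapter sequence for this sketch


def sequence_read(insert, adapter, read_len):
    """Return the first read_len bases reported for a fragment.

    Past the insert the read runs into the adapter; past the adapter
    this toy model emits 'A' (the poly-A tail mentioned above).
    """
    template = insert + adapter
    padding = "A" * max(0, read_len - len(template))
    return (template + padding)[:read_len]


def trim_right(read, adapter, k):
    """Cut the read at the leftmost k-mer shared with the adapter.

    The matching k-mer and everything to its right are removed.
    Exact matches only; no mismatches or shorter end k-mers are handled.
    """
    adapter_kmers = {adapter[i:i + k] for i in range(len(adapter) - k + 1)}
    for i in range(len(read) - k + 1):
        if read[i:i + k] in adapter_kmers:
            return read[:i]
    return read


short_insert = "ACGTTGCATGCCTAGGATCC"   # 20 bases, shorter than the read
long_insert = short_insert * 3          # 60 bases, longer than the read
read_len = 40

short_read = sequence_read(short_insert, ADAPTER, read_len)
long_read = sequence_read(long_insert, ADAPTER, read_len)

print(short_read)                           # insert + adapter + poly-A
print(trim_right(short_read, ADAPTER, 8))   # insert only
print(trim_right(long_read, ADAPTER, 8))    # unchanged, no adapter reached
```

    In this model the adapter can only appear once the insert has been fully read, which is why it shows up toward the 3' end of the read.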

    Originally posted by PeatMaster
    Additionally, it seems like most of the discussion is about trimming adapters and PCR primers. However, is there any need to look for artifacts associated with the forward and reverse primer pads?
    I am not sure what you are referring to by "primer pads". BBDuk will scan for and trim any sequence you provide (you can add it as a FASTA record to the adapters file, or provide it on the command line with the literal=ACTGGT,TTTGGTG option).
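
    As a rough Python sketch of what trimming a user-supplied sequence from the left involves: the pad and insert sequences and the function names below are hypothetical, matching is exact only, and the sketch assumes left trimming removes the rightmost matching k-mer together with everything to its left. It is meant to show the idea, not to reproduce BBDuk's behaviour in detail.

```python
def kmers(seq, k):
    """All k-mers of seq as a set (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def trim_left(read, references, k):
    """Remove the rightmost reference k-mer match and all bases left of it.

    Assumption for this sketch: exact matches only, every reference
    sequence is treated the same way, and reads without a match are
    returned unchanged.
    """
    ref_kmers = set()
    for ref in references:
        ref_kmers |= kmers(ref, k)
    cut = 0
    for i in range(len(read) - k + 1):
        if read[i:i + k] in ref_kmers:
            cut = i + k  # keep moving the cut point to the last match
    return read[cut:]


# Hypothetical sequences, comma-separated the way a literal= value is written.
literal = "TTGACCAAGG"
references = literal.split(",")

insert = "ACGTACGTAGCTAGCTAGGCATCG"
read_with_pad = references[0] + insert

print(trim_left(read_with_pad, references, 8))  # pad removed
print(trim_left(insert, references, 8))         # nothing to trim
```

    A match in the middle of a read would remove everything before it, so with real data the sequences and k need to be chosen with care.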



    • #3
      GenoMax, thanks for the reply. The adapter trimming on the 3' end makes sense to me now. Thanks.

      What I mean by the primer pads is best shown in the supplement from Tremblay et al. 2015 (see link below). They are attached to the primers (between the adapter and the primer on the 5' end, and between the primer and barcode on the 3' end). I admit that I am not totally sure what their role is, but I assume they are used as part of the primer construct for all JGI MiSeq 16S and ITS2 runs. Since they are located closer to the fragment being sequenced than the adapter is, they would be even more likely to be present as an artifact. Is this true, or do I have something incorrect?




      Thanks for the help



      • #4
        In the example you attached, the resulting sequence files don't have any extra sequence at the 5'-end.

        Some sequencing runs (e.g. amplicon) involve the same sequence across multiple samples. As a result, an effort is made to increase the nucleotide diversity of the sequence (for Illumina sequencing it is not good to have the same base at a given cycle for every cluster) by adding base padding of varying length. If some of the extra sequence does appear at the 5'-end, it can be trimmed with BBDuk.
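
        The effect of variable-length padding on per-cycle diversity can be sketched in Python. The amplicon start and the pad sequences below are invented for illustration, and the "fraction of reads sharing the most common base" is just one simple way to express diversity.

```python
from collections import Counter


def max_base_fraction_per_cycle(reads, cycles):
    """For each cycle, the fraction of reads that share the most common base."""
    fractions = []
    for c in range(cycles):
        counts = Counter(read[c] for read in reads)
        fractions.append(max(counts.values()) / len(reads))
    return fractions


AMPLICON_START = "GATTACAGGCTC"   # made-up start shared by every amplicon
PADS = ["", "T", "CA", "GTC"]     # hypothetical pads of 0 to 3 bases

unpadded_reads = [AMPLICON_START for _ in PADS]
padded_reads = [pad + AMPLICON_START for pad in PADS]

unpadded = max_base_fraction_per_cycle(unpadded_reads, 6)
padded = max_base_fraction_per_cycle(padded_reads, 6)

print(unpadded)  # every cycle shows the same base in all reads
print(padded)    # staggered starts mix the bases seen at each cycle
```

        Without padding every cluster reports the same base at every cycle; with staggered pads the amplicon is shifted by a different amount in each read, so no single base dominates a cycle in this example.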



        • #5
          Thanks

          OK, I see. Thanks for the help.
