
  • finding adapters for trimming

    Hi All,

    I am a total newbies in this field. I want to know the trend of the community for adapter trimming steps.

    I have got 50bp single end reads (Sanger / Illumina 1.9). Primary goal is to align the reads using bismark, and then extract methylation scores using 'methylkit'. There were three overrepresented sequences in FastQC report. Then I ran trim_galore using the default settings. trim_galore(which basically uses 'cutadapt') trimmed the universal adapter but still there are two overrepresented sequences left in the fastQC report.

    I have read so many posts related to trimming last 3-4 days but still I am so confused. The summary I have got is that FastQC tells us about adapter contamination, but it may not tell about the actual adapter sequence.

    1. Is it a MUST to trim all the overrepresented sequences, or is trimming just the universal adapter fine?
    2. What is the easiest way to find the sequences that need to be trimmed?

    Any help/suggestion is greatly appreciated.

  • #2
    1) The best practice is to trim the actual adapter sequences used in your library.
    2) The best way to find that is to ask the people who made the library.

    But, if you have paired reads, you can also find your adapter sequences with BBMerge like this:

    bbmerge.sh in1=read1.fq in2=read2.fq outa=adapters.fa

    BBDuk includes all standard Illumina adapters in "/resources/adapters.fa". If you do not know which adapters were used, and are unable to find out, I recommend using that as the reference.

    Since you are using single-end reads, it's difficult to determine the adapter sequences empirically in an automated way. So, unless you can get them from the people who made the library, I suggest using that reference.
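As a rough sketch of the suggestion above for single-end reads, the function below assembles a BBDuk adapter-trimming call against the bundled adapters.fa. The install location `BBMAP_DIR`, the function name `trim_single_end`, and the file names are hypothetical placeholders. The parameters `ktrim=r k=23 mink=11 hdist=1` are commonly quoted starting values for BBDuk adapter trimming and are an assumption here, not settings given in this thread.

```shell
# Sketch only: adapter-trim single-end reads with BBDuk.
# ASSUMPTION: BBTools is unpacked under $BBMAP_DIR (hypothetical location).
BBMAP_DIR="${BBMAP_DIR:-$HOME/bbmap}"
ADAPTERS="$BBMAP_DIR/resources/adapters.fa"

trim_single_end() {
    # $1 = input FASTQ, $2 = output FASTQ
    # ktrim=r trims the adapter match and everything to its right (3' side);
    # k/mink/hdist are assumed starting values, not tuned for this data set.
    set -- "in=$1" "out=$2" "ref=$ADAPTERS" ktrim=r k=23 mink=11 hdist=1
    echo "bbduk.sh $*"                      # show the command for review
    if command -v bbduk.sh >/dev/null 2>&1; then
        bbduk.sh "$@"                       # run it only if BBDuk is installed
    else
        echo "bbduk.sh not found on PATH; command printed only" >&2
    fi
}

# Example (placeholder file names):
#   trim_single_end reads.fq trimmed.fq
```

Printing the command before running it makes it easy to check the reference path and parameters first.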
    Last edited by Brian Bushnell; 11-22-2015, 10:43 PM.


    • #3
      Brian - thanks for parsing Illumina's PDF and making the adapters available. It looks like as of Nov 9 2015 Illumina updated their adapter sequence document. Are there any notable changes that aren't present in the BBDuk adapter sequence fasta?

      Oligonucleotide (oligo) sequences of Illumina adapters used in AmpliSeq, Nextera, TruSeq, and TruSight library prep kits.


      • #4
        Ah, thanks for notifying me... I'll look at it.



        • #5
          Thanks a lot for the response Brian.

          I have single-end reads this time. Do you have any suggestions for overrepresented sequences that do not match any known adapter ("No Hit", as FastQC describes them)?
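One way to look at such sequences directly is to count the most frequent reads in the FASTQ file itself. This is a minimal sketch, assuming an uncompressed FASTQ with exactly four lines per record. The function name `top_sequences` and the file name are hypothetical. The sequences it lists are only candidates for inspection and are not necessarily adapters.

```shell
# Sketch only: list the most frequent read sequences in a FASTQ file.
# ASSUMPTION: uncompressed FASTQ with exactly 4 lines per record.
top_sequences() {
    # $1 = FASTQ file, $2 = how many sequences to show (default 10)
    awk 'NR % 4 == 2' "$1" |   # keep only the sequence line of each record
        sort |                 # group identical sequences together
        uniq -c |              # count each distinct sequence
        sort -k1,1nr |         # most frequent first
        head -n "${2:-10}"
}

# Example (placeholder file name):
#   top_sequences trimmed.fq 5
```

Each output line is a count followed by the sequence, so the top entries can be compared against the adapter reference or the library prep documentation.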



          • #6
            @bluepoison: I suggest you try adapter removal using BBDuk and adapters.fa, and see whether FastQC still detects overrepresented sequences. If not, everything should be fine! But if it does, you may have a new adapter sequence, so please reply in that case.
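To make that re-check easier, the sketch below pulls the overrepresented-sequence rows out of a FastQC report after trimming. It assumes the usual tab-separated layout of `fastqc_data.txt` in the unzipped FastQC output, where the module starts with `>>Overrepresented sequences` and ends with `>>END_MODULE`. The function name `overrepresented_rows` and the example path are hypothetical.

```shell
# Sketch only: print sequence and possible source for each overrepresented
# sequence listed in a FastQC report.
# ASSUMPTION: fastqc_data.txt is tab-separated, with columns
# Sequence, Count, Percentage, Possible Source inside the module.
overrepresented_rows() {
    # $1 = path to fastqc_data.txt
    awk -F '\t' '
        /^>>Overrepresented sequences/ { in_module = 1; next }
        /^>>END_MODULE/                { in_module = 0 }
        in_module && !/^#/             { print $1 "\t" $4 }
    ' "$1"
}

# Example (placeholder path):
#   overrepresented_rows trimmed_fastqc/fastqc_data.txt
```

Empty output means the report lists no overrepresented sequences. Rows still marked "No Hit" are the ones worth reporting back.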

            @turnersd: Unfortunately... there are a lot of new adapter indexes in the latest Illumina letter that you linked - dozens. They are for human-specific tests, like autism, cancer, and other possibly genetic disorders. And as always, Illumina makes no effort to indicate which indexes go with which adapters. So it now looks like a huge amount of effort to compile a full set of Illumina adapter sequences complete with indexes.

            JGI does not do any human sequencing, so none of that is relevant to us. But for everyone else out there - I really hope Illumina, or someone in the community, compiles a full list of the new human-specific adapter sequences. Because there are so many, and I have no way to empirically determine whether the new sequences are correct (since we don't use them), it's not really possible for me to generate them. Illumina would provide the full, indexed adapter-sequences for trimming if they had the slightest concern for their end users, which they unfortunately do not appear to have.

            So far, it's not clear to me which adapters go with the new indexes, or why they even need new indexes for cancer versus autism, etc. It seems like a marketing ploy. But the new indexes probably only affect amplicon sequencing and are irrelevant to randomly-sheared libraries.
