  • Help a newbie: adapter filtering

    Hi all,

    I'm new here; I was referred to this forum by a wonderful post-doc. I'm a first-year PhD student, so go easy on me...

    So I'm working with NGS data (duh) and I have a couple of questions.

    A little background: I have ~350 paired-end FASTQ read files, from an Illumina HiSeq I'm guessing.

    1. Is the adapter sequence the same for each file?

    2. How do I determine the adapter sequence? I've been using FastQC, and under overrepresented sequences I get this: "GATCGGAAGAGCGTCGTGTAGGGAAAGAGGGTAGATCTCGGTGGTCGCCG"

    Now when I run my adapter filtering program, cutadapt for example, is the whole sequence my adapter? The post-doc truncated this sequence to "AGATCGGAAGAGC". Is he right?

    I'm a little confused about the reasoning behind adapter filtering (I understand why you do it: to remove the adapter region, which is not part of the query sequence).

    Thanks for all the help.

  • #2
    Here is a nice primer on TruSeq adapters: http://tucf-genomics.tufts.edu/docum...q_Adapters.pdf

    You could end up with adapter contamination if you have adapter dimers or smaller-than-expected inserts. Since adapters are not part of the real sequence, you want to remove them (especially if you are going to do any de novo assembly).

    Ways to filter them:

    1. http://www.ark-genomics.org/events-o...-illumina-data

    2. http://onetipperday.blogspot.com/201...torprimer.html

    3. Trim Galore (http://www.bioinformatics.babraham.a...s/trim_galore/) is a wrapper for cutadapt (mentioned in both blog posts) that makes using cutadapt easy.

    4. Entire set of various Illumina adapter sequences: http://support.illumina.com/download...es_letter.ilmn
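
    To make the idea concrete, here is a minimal Python sketch of what a 3' adapter trimmer does. It assumes exact matching only (cutadapt tolerates mismatches, so use that on real data), and the function name and `min_overlap` parameter are made up for illustration. The adapter is the truncated sequence quoted in the first post; note that the FastQC sequence from that post begins with this adapter minus its leading A.

```python
# Truncated adapter and FastQC overrepresented sequence, both copied from the first post.
ADAPTER = "AGATCGGAAGAGC"
FASTQC_SEQ = "GATCGGAAGAGCGTCGTGTAGGGAAAGAGGGTAGATCTCGGTGGTCGCCG"


def trim_3prime_adapter(read, adapter=ADAPTER, min_overlap=3):
    """Return `read` with the adapter and everything after it removed.

    Illustrative only: exact matches, no quality handling, no error tolerance.
    `min_overlap` is a hypothetical parameter: the shortest adapter prefix
    at the end of the read that still counts as adapter.
    """
    # Case 1: the full adapter occurs somewhere inside the read
    # (the insert was shorter than the read length).
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]

    # Case 2: only the start of the adapter fits at the end of the read.
    longest = min(len(adapter), len(read)) - 1
    for overlap in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:len(read) - overlap]

    # No adapter found: leave the read untouched.
    return read


if __name__ == "__main__":
    insert = "ACGTACGTTTGCA"  # made-up insert sequence
    print(trim_3prime_adapter(insert + ADAPTER + "GTCG"))  # read-through into adapter
    print(trim_3prime_adapter(insert + "AGATC"))           # partial adapter at the end
    print(FASTQC_SEQ.startswith(ADAPTER[1:]))              # adapter minus its first base
```

    An adapter dimer (a read that starts with the adapter) comes back as an empty string, which is why trimmers usually also discard reads below a minimum length.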
    Last edited by GenoMax; 01-17-2014, 05:48 PM.



    • #3
      Quick question: I'm working with a viral sequence of ~10 kb. Should I be less aggressive with the filtering because of the small size?

      On the other hand, the coverage is quite good, ~10,000x. (DEEP SEQ HOMIE)



      • #4
        What is the aim of your experiment? Sometimes coverage that is too deep may actually be detrimental, e.g. if you were trying to call SNVs.

        What % of your reads contain adapters?
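
        If you want a rough number before running a trimmer, something like the sketch below would do. It assumes standard 4-line FASTQ records and counts only exact, full-length matches to the truncated adapter from the first post, so it will underestimate reads that carry a partial or mismatched adapter. The function name is made up for illustration.

```python
import io
from itertools import islice

ADAPTER = "AGATCGGAAGAGC"  # truncated adapter quoted in the first post


def adapter_fraction(fastq_lines, adapter=ADAPTER):
    """Return the fraction of reads whose sequence contains `adapter`.

    `fastq_lines` is any iterable of lines (an open file works).
    Assumes 4 lines per record, with the sequence on the 2nd line.
    """
    total = 0
    hits = 0
    # Take lines 2, 6, 10, ... which are the sequence lines.
    for seq in islice(fastq_lines, 1, None, 4):
        total += 1
        if adapter in seq:
            hits += 1
    return hits / total if total else 0.0


if __name__ == "__main__":
    # Tiny made-up FASTQ with two reads; only the first contains the adapter.
    demo = io.StringIO(
        "@read1\nACGTAGATCGGAAGAGCGT\n+\nIIIIIIIIIIIIIIIIIII\n"
        "@read2\nACGTACGTACGTACGTACG\n+\nIIIIIIIIIIIIIIIIIII\n"
    )
    print(f"{adapter_fraction(demo):.1%} of reads contain the adapter")
```

        For a real file, pass an open handle instead of the in-memory demo (use `gzip.open(path, "rt")` for .gz files).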



        • #5
          In a nutshell, we want to compare the viral sequences from people whose treatment succeeded vs. those whose treatment failed.

          Eventually we would want to refine our techniques, both in vitro and in silico, in order to capture the viral population within each patient.
