  • arcolombo698
    replied
    Hi. Okay you are using Trimmomatic.

    You first need to know which prep kit was used on the data. For my experiment we used an Illumina prep kit. The covariate file has the prep kit name, so you can easily download the list of adapters for that kit from Illumina's website. We used the TruSeq2 prep kit (if I remember correctly).

    The key is to understand how trimming works.

    There are 3' and 5' adapter sequences that attach to both ends of the fragment. The universal adapter attaches to the 5' end of read 1, and read 1 also has the indexed adapter on its 3' end.

    When read 1 is sequenced, the machine detects the universal adapter (because a primer is attached to the universal adapter) and skips it, so the actual read 1 is everything in the flow cell lane that comes after the universal adapter (i.e. <read 1 content><adapter region>).

    Then, since this is paired-end data, read 2 is sequenced, and it ends up with the reverse complement of the universal adapter. So if you know the universal adapter used in the experiment, just calculate its reverse complement and enter that into the TruSeq-2.fa if it is not already there.
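
A minimal sketch of the reverse-complement step, assuming the adapter is a plain A/C/G/T/N string. The function name `reverse_complement` and the 10-base input are illustrative only (the input is the first ten bases of the P5 sequence from the original post in this thread), not part of Trimmomatic.

```python
# Translation table mapping each base to its complement (both cases, N kept).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    # Complement every base, then reverse the whole string.
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # First ten bases of the P5 sequence quoted in the original post.
    print(reverse_complement("AATGATACGG"))  # CCGTATCATT
```

The returned string is what would go into the adapter FASTA file as its own record.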


    Now, how do you remove the universal adapter?
    Read 2 is generated by reading in the opposite direction (5' --> 3'); this time the machine detects the indexed adapter and skips it. So read 2 contains the fragment content and also the reverse complement of the universal adapter.

    So all you need to do when using Trimmomatic is:
    1) make sure that Trimmomatic removes all the content that FOLLOWS the match, and not just the exact match itself
    2) find the common index for all the indexed adapters and enter that into the adapter.fa file
    3) enter the reverse complement of the universal adapter into the adapter.fa file.
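
Steps 2 and 3 above come down to writing two FASTA records. Here is a rough sketch of that, not a definitive recipe: the helper `format_adapter_fasta`, the record names, and both sequences are hypothetical placeholders, so substitute the sequences that match your own prep kit.

```python
def format_adapter_fasta(adapters):
    """Return FASTA text for a {record_name: sequence} mapping."""
    lines = []
    for name, seq in adapters.items():
        lines.append(f">{name}")           # FASTA header line
        lines.append(seq.strip().upper())  # sequence on a single line
    return "\n".join(lines) + "\n"


# Placeholder sequences, NOT real adapters: replace them with the common
# part of your indexed adapters and the reverse complement of your
# universal adapter.
example_adapters = {
    "indexed_adapter_common_part": "ACGTACGTACGT",
    "universal_adapter_rc": "TTGCAATGCA",
}

if __name__ == "__main__":
    print(format_adapter_fasta(example_adapters), end="")
```

Saving that text as adapter.fa gives a file you can name in the ILLUMINACLIP step. As far as I know, Trimmomatic treats records with ordinary names like these as simple-mode adapters, but check the manual for the naming rules before relying on that.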

    Check the alignment files after trimming.



  • MalcolmHoutz
    replied
    Trimmomatic: Which supplied illumina adapter file do I use?

    Trimmomatic includes Illumina-supplied adapter fasta files:
    NexteraPE-PE.fa
    TruSeq2-SE.fa
    TruSeq3-PE.fa
    TruSeq2-PE.fa
    TruSeq3-PE-2.fa
    TruSeq3-SE.fa

    I don't know which one to use. My data is paired-end. When I asked the Principal Investigator, she gave me this response:

    I'm not sure which of the adapter fa files it is. The index sequences are from Epicenter: http://www.epibio.com/docs/default-s...s.pdf?sfvrsn=8 all are from set 1. As for the adapter sequences, they are from the "scriptseq kit".


    I have been using TruSeq3-PE.fa, but only because I read that it is common for recently sequenced data. I read in another forum that TruSeq2-PE.fa is pretty generic and should work. I am not sure what to do, and would appreciate some guidance. Thanks.



  • kcchan
    replied
    It's a feature that's been in CASAVA and BCL2FASTQ for a few years, but it's never worked really well.



  • blancha
    replied
    Good to know that the built-in software can do the trimming. I'd still rather have the raw data, and set the trimming parameters myself though.



  • cement_head
    replied
    Ok, thanks. I called Illumina and the HiSeq 2000 machine can do trimming - it is a CLI flag on the FASTQ generation.

    It turns out the adaptors were not trimmed.

    - Regards



  • blancha
    replied
    To my knowledge, no trimming is performed by the HiSeq 2000. The HiSeq 2000 only calls the bases. Trimming the adapter sequences, if present, is a downstream step.

    Our local sequencing centre, with many HiSeq 2000 machines, never trims the adapters at the level of the HiSeq 2000. They do the trimming later, if necessary, with Trimmomatic.



  • cement_head
    replied
    Hello,

    With the HiSeq 2000, what is the default for adaptor trimming? Is it "on" or "off" when generating FASTQ files?

    Thanks



  • microgirl123
    replied
    The MiSeq has adapter trimming built in if you include it on the sample sheet. We generally do.



  • exo
    replied
    I have a relatively dumb question. Doesn't the MiSeq have an integrated adaptor trimming option?



  • arcolombo698
    replied
    Adapter Trimming

    Hello.

    I have the same question.

    FastQC can report which sequences are overrepresented. Does this mean they need to be removed?

    How do you trim the adapters? You can use ILLUMINACLIP, but I don't know how to create the adapter.fa file.

    Advice?

    But another thread on this forum says that if you align with TopHat you don't need to cut the adapters:

    "If you ignore the adapters, the alignment in TopHat actually filters them out: because they are not in the transcriptome, the adapters will not get aligned when you align your reads to a transcriptome."



  • figo1019
    replied
    Originally posted by dpryan:
    You're pretty unlikely to find the entire adapter sequence in any of the reads. You'll want to look into something like cutadapt or trim_galore to make your life easier.

    Hey, thanks dpryan. I tried trim_galore today, but in the FastQC report I am still getting these overrepresented sequences:

    ATGACACTCAAACAGGCATGCTCCACGGAATACCATGGAGCGCAAGGTGC 1155666 2.5956349017221085 No Hit
    AATGACGCTCGAACAGGCATGCCCCTCGGAATACCAAGGGGCGCAATGTG 225179 0.5057538004361837 No Hit
    AAGACACTCAAACAGGCATGCCTCTCGGAATACCAAGAGGCGCAAGGTGC 218636 0.4910581711090531 No Hit
    GATCGTCGGACTGTAGAACTCTGAACGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAA 119619 0.2686652123616139 Illumina RNA PCR Primer (100% over 50bp)
    GATCGTCGGACTGTAGAACTCTGAACGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAA 111925 0.251384428005364 Illumina RNA PCR Primer (100% over 50bp)
    AAATGACGCTCAAACAGGCATGCCCTTTGGAATACCAAAGGGCGCAATGT 104210 0.2340564774843778 No Hit
    ACAAACCCTTGTGTCGAGGGCTGACTTTCAATAGATCGCAGCGAGGGAGC 71881 0.16144528987673504 No Hit
    GATCGTCGGACTGTAGAACTCTGAACGTGTAGATCTCGGTGGTCGCCGTATCATTAAA 46463 0.10435626248303084 Illumina RNA PCR Primer (100% over 50bp)

    So, do I also need to remove all of these from my sequences? I hope I am not bugging you too much.

    Regards
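
One quick way to see where the "Illumina RNA PCR Primer" rows come from is to compare them with the P5 adapter from the original post in this thread. The sketch below is only a check on those two strings, with a hypothetical helper name (`adapter_orientation`); it says nothing about the "No Hit" rows beyond the fact that they do not contain that adapter.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# P5 adapter as given in the original post of this thread.
P5_ADAPTER = "AATGATACGGCGACCACCGAGATCTACACGTTCAGAGTTCTACAGTCCGACGATC"

# Two rows copied from the FastQC table above.
FLAGGED_HIT = "GATCGTCGGACTGTAGAACTCTGAACGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAA"
NO_HIT = "ATGACACTCAAACAGGCATGCTCCACGGAATACCATGGAGCGCAAGGTGC"


def adapter_orientation(sequence, adapter):
    """Return how `adapter` occurs in `sequence`, or None if it does not."""
    if adapter in sequence:
        return "forward"
    # Reverse complement: complement each base, then reverse the string.
    if adapter.translate(_COMPLEMENT)[::-1] in sequence:
        return "reverse_complement"
    return None


if __name__ == "__main__":
    print(adapter_orientation(FLAGGED_HIT, P5_ADAPTER))  # reverse_complement
    print(adapter_orientation(NO_HIT, P5_ADAPTER))       # None
```

So the flagged rows begin with the reverse complement of the P5 adapter followed by a run of A's, which suggests the reverse-complemented sequence is the one a trimmer would need to be given for those reads.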



  • dpryan
    replied
    You're pretty unlikely to find the entire adapter sequence in any of the reads. You'll want to look into something like cutadapt or trim_galore to make your life easier.
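
To illustrate why an exact search for the whole adapter tends to come up empty, here is a small sketch that looks only for the first few bases of an adapter instead. The function name, the seed length of 12, and the two toy reads are assumptions made up for the example. The adapter string is the reverse-complemented P5 sequence from this thread. Dedicated tools such as cutadapt also tolerate mismatches and partial matches at the read end, which this does not.

```python
from itertools import islice


def count_adapter_prefix_hits(fastq_lines, adapter, k=12):
    """Count reads that contain the first k bases of `adapter`.

    `fastq_lines` is any iterable of FASTQ lines (4 lines per record).
    Returns a (hits, total_reads) tuple.
    """
    seed = adapter[:k].upper()
    hits = total = 0
    # The sequence is the 2nd line of each 4-line FASTQ record.
    for seq in islice(fastq_lines, 1, None, 4):
        total += 1
        if seed in seq.strip().upper():
            hits += 1
    return hits, total


# Toy data (made up): one read runs 20 bases into the adapter, one does not.
ADAPTER = "GATCGTCGGACTGTAGAACTCTGAACGTGTAGATCTCGGTGGTCGCCGTATCATT"
READ_WITH = "ACGTTGCAACGTTGCA" + ADAPTER[:20]
READ_WITHOUT = "ACGTTGCAACGTTGCAACGTTGCAACGTTGCA"
TOY_FASTQ = [
    "@r1", READ_WITH, "+", "I" * len(READ_WITH),
    "@r2", READ_WITHOUT, "+", "I" * len(READ_WITHOUT),
]

if __name__ == "__main__":
    print(ADAPTER in READ_WITH)                           # False
    print(count_adapter_prefix_hits(TOY_FASTQ, ADAPTER))  # (1, 2)
```

For a real file, pass an open file handle (for example `open("reads_1.fastq")`, a hypothetical file name) instead of the toy list.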



  • figo1019
    started a topic Illumina adapter trimming

    Illumina adapter trimming

    Hi All,

    I am a total newbie in this field. I have to assemble RNA-seq data. Before that I need to trim the sequences. I have 100 bp Illumina paired-end reads in two files. I was also given the adaptor sequences P5 and P7.
    5-AATGATACGGCGACCACCGAGATCTACACGTTCAGAGTTCTACAGTCCGACGATC-(insert)-ACCTTAAGAGCCCACGGTTCCTTGAGGTCAGTGXXXXXXTAGAGCATACGGCAGAAGACGAAC-3

    But when, for example, I use grep -c 'AATGATACGGCGACCACCGAGATCTACACGTTCAGAGTTCTACAGTCCGACGATC' file_name to count the adapters, I cannot find a single one. I am a complete beginner, so I would appreciate it if anyone could help me out in detail. I have tried reading the different answers on the forums, but I am confused.

    regards
