Clip truseq adapter sequence from fastq

uniportdb

#1

01-28-2013, 12:55 PM

Here's my public Galaxy history: https://main.g2.bx.psu.edu/u/davidkim/h/ngsworkshop

I am trying to learn NGS analysis using this public dataset: https://www.ncbi.nlm.nih.gov/geo/que...i?acc=GSE39083

FastQC on EBI_SRA__SRP014008_File__SRR518493.fastq.gz__1 found TruSeq adapters in the reads:

Sequence                                    Count   Percentage   Possible Source
GATCGGAAGAGCACACGTCTGAACTCCAGTCACCCATCATAT  347588  3.340743581  TruSeq Adapter, Index 2 (97% over 37bp)
CGGCAGTCCACTCCGGTACGCTATCCCACTACTGCCTACCAC  159445  1.532460443  No Hit (possibly ncRNA?)

I ran 'clip sequence' (with default settings, outputting only non-clipped sequences) and got back only 83% of the reads. But FastQC showed that only 3.3% of the reads were the TruSeq adapter sequence. Why did the 'clip sequence' tool discard 17% of the reads?

Clipping Adapter GATCGGAAGAGCACACGTCTGAACTCCAGTCACCCATCATAT
Input 10404510 reads.
Output 8683051 reads.
discarded 451106 too-short reads.
discarded 386260 adapter-only reads.
discarded 868903 clipped reads.
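
To put the log above in proportion, here is a rough sketch of the arithmetic in Python. The counts are copied straight from the log; the variable names are made up for illustration. Note that the three discard lines shown do not quite add up to input minus output, and the log excerpt does not say where the remaining reads went.

```python
# Counts copied from the clipper log above.
total_in = 10404510
reported_out = 8683051
discarded = {
    "too-short": 451106,
    "adapter-only": 386260,
    "clipped": 868903,
}


def pct(n, total=total_in):
    """Return n as a percentage of the input read count."""
    return 100.0 * n / total


# Share of the input removed by each discard category.
for label, n in discarded.items():
    print(f"{label:>12}: {n:>9} reads ({pct(n):.2f}% of input)")

total_discarded = sum(discarded.values())
print(f"listed discards: {total_discarded} reads ({pct(total_discarded):.2f}% of input)")
print(f"reported output: {reported_out} reads ({pct(reported_out):.2f}% of input)")

# Reads that are neither in the output nor in any listed discard line.
unaccounted = total_in - reported_out - total_discarded
print(f"not accounted for by the lines shown: {unaccounted} reads")
```

Read this way, roughly half of the loss is the "clipped reads" line, which presumably covers reads where the tool found some adapter and which were dropped because of the "output only non-clipped sequences" setting.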

Thanks,
Tags: adapter trimming, clip, trueseq
fkrueger

#2

01-29-2013, 01:28 AM

FastQC flags up sequences that make up more than 0.1% of your total library, and it does so by looking for exact matches of the entire sequence. If the adapter contamination starts at various positions in your reads, the full read sequence will be different nearly every time, or any one sequence may fall just short of 0.1%. The k-mer plot can sometimes help you identify adapter contamination that starts at various positions in the read. If you use tools like Cutadapt or Trim Galore, you will get a histogram-type output that shows you exactly which part of the adapter was found and removed.
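
The point about exact matches can be illustrated with a small simulation. This is only a toy sketch, not how FastQC or any clipping tool is actually implemented: the reads are random, the 20% contamination rate, the 42 bp read length and the 8-base search seed are all assumptions chosen for illustration, and only the adapter prefix is taken from the original post.

```python
import random
from collections import Counter

# First 33 bases of the adapter reported in the original post.
ADAPTER = "GATCGGAAGAGCACACGTCTGAACTCCAGTCAC"
READ_LEN = 42        # assumed read length for this toy example
N_READS = 20000      # assumed library size for this toy example
SEED_LEN = 8         # assumed minimum adapter overlap to search for

random.seed(0)


def random_dna(n):
    """Return a random DNA string of length n."""
    return "".join(random.choice("ACGT") for _ in range(n))


def make_read(with_adapter):
    """Build one read; contaminated reads carry the adapter at a random offset."""
    if not with_adapter:
        return random_dna(READ_LEN)
    insert_len = random.randint(10, READ_LEN - SEED_LEN)
    return (random_dna(insert_len) + ADAPTER)[:READ_LEN]


# Every fifth read is contaminated (20%, an assumption).
reads = [make_read(i % 5 == 0) for i in range(N_READS)]

# Exact-match view: how often does the single most common full read occur?
top_seq, top_count = Counter(reads).most_common(1)[0]
threshold = 0.001 * N_READS  # the 0.1% cutoff mentioned above

# Substring view: how many reads contain the start of the adapter anywhere?
hits = sum(ADAPTER[:SEED_LEN] in r for r in reads)

print(f"most common full read occurs {top_count} time(s); cutoff is {threshold:.0f}")
print(f"reads containing the adapter start: {hits} ({100.0 * hits / N_READS:.1f}%)")
```

In this setup no individual read sequence comes anywhere near the 0.1% cutoff, yet about a fifth of the reads contain adapter, which is why a clipping tool can remove far more reads than the overrepresented-sequences table suggests.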