**Torst** · 09-20-2010, 11:40 PM

Originally posted by mmartin

I'm pleased to announce the tool 'cutadapt':
http://cutadapt.googlecode.com/

It seems your code only runs under Python 2.6?

For CentOS 5.x, which is a bit behind, I had to install the "python26" packages and change the shebang line from #!/usr/bin/python to #!/usr/bin/python26.

**mmartin** · 09-20-2010, 11:54 PM

Yes, Python 2.6 is needed, thanks for the pointer. It wouldn't be hard to support Python 2.5, but some 2.6 features make the transition to the Python 3 syntax easier, so I would like to stick to it. I have updated the homepage to reflect the requirement of Python 2.6.

**HiroMishima** · 11-16-2010, 08:16 PM

3'-end partial match of adapters

Hi,

I have a question about Cutadapt version 0.3.

Does Cutadapt cut partial sequences of adapters?

According to the "Statistics for adapter" messages, Cutadapt seems to recognize 3'-end partial matches of adapters. However, only fully matched adapter sequences are removed in the output files.

**mmartin** · 11-17-2010, 03:35 AM

Yes, cutadapt recognizes partial adapters. That is, if your adapter is ADAPTER and your read is MYSEQUENCEADAP, then the resulting sequence is MYSEQUENCE. In fact, these are some examples of input sequences that will result in MYSEQUENCE:
MYSEQUENCEADAPTER
MYSEQUENCEADAP
MYSEQUENCEADAPTERSOMETHINGELSE
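
To make the partial-match behaviour concrete, here is a minimal sketch in Python 3. It is not cutadapt's actual algorithm: it only handles exact matches (no error tolerance), and the function name `trim_3prime_adapter` and the `min_overlap` parameter are made up for illustration.

```python
def trim_3prime_adapter(read, adapter, min_overlap=1):
    """Remove an exact 3' adapter match (full or partial) from a read.

    Illustrative only: exact matching, no mismatches or indels allowed.
    """
    # Case 1: the full adapter occurs somewhere in the read.
    # Cut the adapter and everything that follows it.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]

    # Case 2: only a prefix of the adapter is present at the read's 3' end.
    # Try the longest possible overlap first.
    longest = min(len(adapter), len(read)) - 1
    for length in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:length]):
            return read[:len(read) - length]

    # No adapter found: return the read unchanged.
    return read


for read in ("MYSEQUENCEADAPTER", "MYSEQUENCEADAP", "MYSEQUENCEADAPTERSOMETHINGELSE"):
    print(read, "->", trim_3prime_adapter(read, "ADAPTER"))
```

All three reads come out as MYSEQUENCE. Note that with `min_overlap=1` any read that happens to end in "A" would also lose its last base, so a real tool needs some minimum overlap or a statistical criterion.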

Could you give an example of the problematic read you encounter and the output of cutadapt for that read?

**HiroMishima** · 11-17-2010, 04:32 AM

Originally posted by mmartin

Could you give an example of the problematic read you encounter and the output of cutadapt for that read?

I found that I had used two -a options, and that the adapter sequences I used were almost reverse complements of each other. Probably I do not have to use two -a options in this case. Hopefully, these examples clarify the situation.

sample.fastq:

Code:

@read1
GATCCTCCTGGAGCTGGCTGATACCAGTATACCAGTGCTGATTGTTGAATTTCAGGAATTTCTCAAGCTCGGTAGC
+
hhhhhhhhhhahhhhhehhffhghhehdgghhheddggfhfhhgffhddhhfffhhffhfgggffddfdfffcdfb
@read2
CTCGAGAATTCTGGATCCTCTCTTCTGCTACCTTTGGGATTTGCTTGCTCTTGGTTCTCTAGTTCTTGTAGTGGTG
+
hhhhhhhhhhhhhhhhhhhhhhhhhhgghghhhhhhhhgaddeeadaa^dadaa_aaaaababca_aa__^[T^[Z

The next result is OK:

Code:

$ python cutadapt -a CTCGAGAATTCTGGATCCTC sample.fastq

@read1
CTGGAGCTGGCTGATACCAGTATACCAGTGCTGATTGTTGAATTTCAGGAATTTCTCAAGCTCGGTAGC
+
hhhahhhhhehhffhghhehdgghhheddggfhfhhgffhddhhfffhhffhfgggffddfdfffcdfb
@read2
TCTTCTGCTACCTTTGGGATTTGCTTGCTCTTGGTTCTCTAGTTCTTGTAGTGGTG
+
hhhhhhgghghhhhhhhhgaddeeadaa^dadaa_aaaaababca_aa__^[T^[Z

However, in the next result, read1 still contains "GATCCTC" at the 5' end:

Code:

$ python cutadapt -a CTCGAGAATTCTGGATCCTC -a GAGGATCCAGAATTCTCGAGTT sample.fastq

@read1
GATCCTCCTGGAGCTGGCTGATACCAGTATACCAGTGCTGATTGTTGAATTTCAGGAATTTCTCAAGCTCGGTAGC
+
hhhhhhhhhhahhhhhehhffhghhehdgghhheddggfhfhhgffhddhhfffhhffhfgggffddfdfffcdfb
@read2
TCTTCTGCTACCTTTGGGATTTGCTTGCTCTTGGTTCTCTAGTTCTTGTAGTGGTG
+
hhhhhhgghghhhhhhhhgaddeeadaa^dadaa_aaaaababca_aa__^[T^[Z

**mmartin** · 11-17-2010, 08:51 AM

Hi, actually, you do have to use two -a options since currently reverse complements are not automatically searched for.
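
For anyone who wants to derive the second -a sequence themselves, a reverse complement can be computed with a few lines of Python 3. This is just a sketch and not part of cutadapt; the helper name `reverse_complement` is made up, and only the plain bases A, C, G, T are handled.

```python
# Translation table for the four standard bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    # Complement each base, then reverse the whole string.
    return seq.translate(_COMPLEMENT)[::-1]


first = "CTCGAGAATTCTGGATCCTC"
second = "GAGGATCCAGAATTCTCGAGTT"
print(reverse_complement(first))                      # GAGGATCCAGAATTCTCGAG
print(second.startswith(reverse_complement(first)))   # True
```

So the second adapter in the example above is the reverse complement of the first with two extra T bases appended.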

I managed to reproduce the problem you encountered and I have prepared a new release that hopefully fixes it. You can download v0.4 from the homepage and see whether the bug is actually fixed. Thanks for reporting this!

**gaffa** · 11-17-2010, 09:09 AM

I haven't looked into the details of the program, but I wonder how straightforward it would be to use it to filter out and discard entire reads that match an adapter, rather than just removing that part and re-using the trimmed read?

**mmartin** · 11-17-2010, 10:12 AM

Since this isn't too hard, I just added that feature. cutadapt now has the option "--discard", which does exactly that: If an adapter is found in the read, then the read is discarded and not trimmed.
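
Roughly, the difference between trimming and discarding looks like the following Python 3 sketch. This is not the cutadapt source: the function `filter_reads` and its `discard` argument are hypothetical names, reads are assumed to be (name, sequence, qualities) tuples, and only an exact, full-length adapter match is considered.

```python
def filter_reads(records, adapter, discard=False):
    """Yield (name, seq, qual) tuples, trimming or discarding adapter reads.

    Illustrative only: looks for an exact, full-length adapter occurrence.
    """
    for name, seq, qual in records:
        pos = seq.find(adapter)
        if pos == -1:
            # No adapter: pass the read through unchanged.
            yield name, seq, qual
        elif not discard:
            # Default behaviour: cut the adapter and everything after it,
            # keeping the quality string in sync with the sequence.
            yield name, seq[:pos], qual[:pos]
        # Otherwise the adapter was found and discard was requested,
        # so the read is dropped entirely.


reads = [("read1", "MYSEQUENCEADAPTER", "h" * 17), ("read2", "OTHERREAD", "h" * 9)]
print(list(filter_reads(reads, "ADAPTER")))
print(list(filter_reads(reads, "ADAPTER", discard=True)))
```

With `discard=True` only read2 survives; without it, read1 is kept but shortened to MYSEQUENCE.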

**HiroMishima** · 11-17-2010, 05:30 PM

Originally posted by mmartin

Hi, actually, you do have to use two -a options since currently reverse complements are not automatically searched for.

I managed to reproduce the problem you encountered and I have prepared a new release that hopefully fixes it. You can download v0.4 from the homepage and see whether the bug is actually fixed. Thanks for reporting this!

Everything's perfect! cutadapt 0.5.1 worked well with two -a options.

I believe that cutadapt is one of the best adapter sequence trimmers, especially in terms of simplicity and speed.

Thanks again for the prompt update.

**sdavis** · 11-18-2010, 05:24 AM

This looks like a very useful tool. Could I suggest that you accept gzipped fastq files as an alternative input format, as a simple convenience?

**mmartin** · 11-18-2010, 08:57 AM

Good idea. Since this was on my to do list as well, I have just implemented this feature and released cutadapt 0.6.
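
For illustration, transparent handling of gzipped input can be done along these lines in Python 3. This is a sketch under my own assumptions, not the code in cutadapt 0.6: the helper names `open_fastq` and `read_fastq` are invented, compression is detected from the ".gz" file extension only, and each FASTQ record is assumed to span exactly four lines.

```python
import gzip


def open_fastq(path):
    """Open a FASTQ file as text, decompressing it if it ends in .gz."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def read_fastq(handle):
    """Yield (name, sequence, qualities) from a four-line-per-record FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of file
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual  # drop the leading '@'
```

The rest of the program can then iterate over `read_fastq(open_fastq(path))` without caring whether the file was compressed.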

**bioinfosm** · 11-18-2010, 11:35 AM

cool, that was fast!

**thinkRNA** · 11-18-2010, 11:51 AM

Can you please add an option to remove all N's or C's, etc.? I think this would be helpful. Also, can you describe in detail how the error rate is calculated?

**gaffa** · 11-18-2010, 12:21 PM

Originally posted by mmartin

Since this isn't too hard, I just added that feature. cutadapt now has the option "--discard", which does exactly that: If an adapter is found in the read, then the read is discarded and not trimmed.

Fantastic!

cutadapt: A tool that removes adapter sequences
