**proteasome** · 05-25-2011, 01:19 PM

If you're looking to clean up 454 data in general, I would suggest using Galaxy (http://main.g2.bx.psu.edu/) to convert your SFF files to FASTQ, and then using the FASTQ filters and tools to remove short or low-quality reads. You can also mask low-quality bases (such as those in homopolymers) to Ns without losing reads.
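
To make the two operations concrete (dropping reads versus masking bases), here is a minimal standalone sketch in Python. It is not Galaxy's code. The function names, the default thresholds, and the Phred+33 quality encoding are all assumptions chosen for illustration.

```python
def decode_phred33(qual_string):
    """Turn a FASTQ quality string into integer scores (assumes Phred+33)."""
    return [ord(char) - 33 for char in qual_string]


def mean_quality(quals):
    """Average quality of a read; an empty read scores 0."""
    return sum(quals) / len(quals) if quals else 0.0


def keep_read(seq, quals, min_len=50, min_mean_qual=20):
    """Filtering: keep the read only if it is long enough and good enough.

    min_len and min_mean_qual are illustrative defaults, not Galaxy's.
    """
    return len(seq) >= min_len and mean_quality(quals) >= min_mean_qual


def mask_low_quality(seq, quals, min_qual=20):
    """Masking: replace each low-quality base with N, keeping the read length."""
    return "".join(
        base if qual >= min_qual else "N" for base, qual in zip(seq, quals)
    )


if __name__ == "__main__":
    seq = "ACGTTTTA"
    quals = decode_phred33("IIII##II")  # 'I' is Q40, '#' is Q2
    print(keep_read(seq, quals, min_len=5))  # the read passes the filter
    print(mask_low_quality(seq, quals))      # the two Q2 bases become N
```

Masking keeps every read and its length, so it suits cases where you do not want to lose coverage; filtering discards the whole read.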

**Himalaya** · 05-26-2011, 12:09 AM

Thanks, proteasome, for the reply. Galaxy is a great online tool; the problem is uploading very large files.

**essvee** · 05-27-2011, 04:42 AM

I suggest trying SeqTrim.
You can set minimum quality based on a defined window size, minimum length, etc.
You can also run it from the command line or online.

https://www.scbi.uma.es/seqtrim/

**DZhang** · 05-28-2011, 06:40 PM

Hi,

Check out the FASTX-Toolkit (http://hannonlab.cshl.edu/fastx_toolkit/) and SolexaQA (http://solexaqa.sourceforge.net/). Both have simple but neat scripts for read trimming.

Douglas

https://www.contigexpress.com

**Jose Blanca** · 05-31-2011, 10:54 PM

We have built our own read-cleaning pipeline. It works for us, so we have made it available in case it is of use to other people. It is called clean_reads.

**robs** · 06-06-2011, 02:07 PM

I like PRINSEQ (http://prinseq.sourceforge.net/). It comes in web and standalone versions and does all the QC and data preprocessing that you need.

The application note also contains a short comparison with similar tools (http://bioinformatics.oxfordjournals.../27/6/863.long).

**Himalaya** · 06-08-2011, 02:39 AM

Originally posted by Jose Blanca

We have built our own read-cleaning pipeline. It works for us, so we have made it available in case it is of use to other people. It is called clean_reads.

Hi Jose Blanca, I installed clean_reads with Biopython and psubprocess preinstalled, as the requirements specify, but running it resulted in a segmentation fault. Have you run the program yourself? Please advise me about the fault if it runs cleanly for you. Thank you.

**Jose Blanca** · 06-08-2011, 02:53 AM

I would need more information. A segmentation fault is quite a strange error in a Python program. Could you send me the output?

**Himalaya** · 06-08-2011, 03:30 AM

Originally posted by Jose Blanca

I would need more information. A segmentation fault is quite a strange error in a Python program. Could you send me the output?

Hi Jose,
I am using Mac OS X Snow Leopard. My command line is: clean_reads -i Pair01.fastq -o ./clean_reads/output_q20_len50_only3end -p 454 -f fastq -g fastq -qual_threshold 20 -only_3_end True -min_len 50. It only gave me a one-line error, 'segmentation fault', and a separate window reported that Python quit unexpectedly, with a long error report. The last part of the error report is below:
0x7fff8507b000 - 0x7fff85131fff libobjc.A.dylib 227.0.0 (compatibility 1.0.0) <1960E662-D35C-5D98-EB16-D43166AE6A22> /usr/lib/libobjc.A.dylib
0x7fff85288000 - 0x7fff85446fff libicucore.A.dylib 40.0.0 (compatibility 1.0.0) <3D9313BF-97A4-6B65-E583-F6173E64C3C2> /usr/lib/libicucore.A.dylib
0x7fff8643f000 - 0x7fff86461ff7 libexpat.1.dylib 7.2.0 (compatibility 7.0.0) <7D173736-CBDF-F02F-2D07-B38F565D5ED4> /usr/lib/libexpat.1.dylib
0x7fff86462000 - 0x7fff864aaff7 libvDSP.dylib 268.0.1 (compatibility 1.0.0) <98FC4457-F405-0262-00F7-56119CA107B6> /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libvDSP.dylib
0x7fff87df1000 - 0x7fff87df1ff7 com.apple.Accelerate 1.6 (Accelerate 1.6) <15DF8B4A-96B2-CB4E-368D-DEC7DF6B62BB> /System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate
0x7fff8846a000 - 0x7fff88544fff com.apple.vImage 4.0 (4.0) <B5A8B93B-D302-BC30-5A18-922645DB2F56> /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vImage.framework/Versions/A/vImage
0x7fff88545000 - 0x7fff88d4ffe7 libBLAS.dylib 219.0.0 (compatibility 1.0.0) <2F26CDC7-DAE9-9ABE-6806-93BBBDA20DA0> /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
0x7fffffe00000 - 0x7fffffe01fff libSystem.B.dylib ??? (???) <40DA878D-6D69-FEA3-398B-BBD80C9BFF46> /usr/lib/libSystem.B.dylib

Then I tried to run clean_reads on Ubuntu with the same command, and it gave me this error:
IOError: [Errno 2] No such file or directory: 'ual_threshold'
I did not specify any file 'ual_threshold'. The option I specified was -qual_threshold.
Any advice, please?

**Jose Blanca** · 06-08-2011, 04:41 AM

On a Mac it won't work, because the binaries shipped inside clean_reads are only for Linux.
Regarding the Linux problem, it's a malformed command line. It should be:
--qual_threshold
instead of:
-qual_threshold
Regards.
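
The odd file name 'ual_threshold' in the error is consistent with ordinary getopt-style parsing, where a single dash introduces a one-letter option and the rest of the word becomes its value. The sketch below uses Python's standard `getopt` module to show the effect. It assumes the tool has a short `-q` option that takes a value; that is a guess based on the error message, not something taken from the clean_reads source, and clean_reads may use a different parser.

```python
import getopt

# Hypothetical option table: "-q" takes a value (assumed, not from clean_reads),
# and "--qual_threshold" is the long option named in the thread.
SHORT_OPTS = "q:"
LONG_OPTS = ["qual_threshold="]


def parse(argv):
    """Parse argv with getopt and return (options, leftover_arguments)."""
    return getopt.getopt(argv, SHORT_OPTS, LONG_OPTS)


if __name__ == "__main__":
    # Single dash: read as "-q" with the value "ual_threshold".
    print(parse(["-qual_threshold", "20"]))
    # Double dash: read as the long option with the value "20".
    print(parse(["--qual_threshold", "20"]))
```

With a single dash the parser never sees a quality threshold at all, which is why the program then tries to open a file called 'ual_threshold'.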

**Himalaya** · 06-08-2011, 04:59 AM

Originally posted by Jose Blanca

On a Mac it won't work, because the binaries shipped inside clean_reads are only for Linux.
Regarding the Linux problem, it's a malformed command line. It should be:
--qual_threshold
instead of:
-qual_threshold
Regards.

Hi Jose, thanks a lot. On Linux it seems to work now. With the same command, though, it now gives me the output "parameter qual_threshold is incompatible with platform long_with_quality". I tested --qual_threshold values from 10 to 100 and it gave the same output every time.
Any advice on this? Thanks for helping me get the program running.

**Jose Blanca** · 06-08-2011, 10:31 PM

You cannot use the --qual_threshold parameter for long reads (Sanger or 454). I need to explain that better in the documentation. For long reads we trim the bad-quality regions using lucy, so the parameter to change would be lucy_error. qual_threshold is used by the short-read quality trimmer.

**Himalaya** · 06-09-2011, 04:54 AM

Originally posted by Jose Blanca

You cannot use the --qual_threshold parameter for long reads (Sanger or 454). I need to explain that better in the documentation. For long reads we trim the bad-quality regions using lucy, so the parameter to change would be lucy_error. qual_threshold is used by the short-read quality trimmer.

Hi Jose,
I am trying to quality-trim and filter 454 reads. The adaptor, primer and barcode sequences have already been removed. I am not allowed to specify a minimum quality threshold to clean bad-quality reads, and I don't understand why. How does it do the quality trimming? Sorry, I could not find the documentation for clean_reads. Also, when I specify the option -only_3_end True, it gives me an error saying it is not compatible with the platform. Does that mean it trims from both the 5' and 3' ends?

Thanks

**Jose Blanca** · 06-09-2011, 05:22 AM

Sorry, I have not explained myself well enough.
clean_reads uses two different algorithms for quality trimming: one for long reads (lucy) and a different one for short reads. If you're cleaning long reads, the applicable parameters are the lucy parameters: lucy_error, lucy_window and lucy_bracket. These are the parameters that you should tweak to modify the cleaning behaviour when dealing with 454 and Sanger reads.

For Illumina and SOLiD we didn't manage to use lucy, so we implemented a sliding-window trimming function. Its parameters are qual_window, qual_threshold, and only_3_end. That's why these parameters can only be used with short reads.
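
For readers unfamiliar with the idea, here is a minimal sketch of one common form of sliding-window quality trimming. It is not the clean_reads implementation: only the parameter names qual_window, qual_threshold and only_3_end come from this thread, while the rule used here (keep everything between the first and last window whose mean quality reaches the threshold) and the default values are assumptions.

```python
def trim_by_window(quals, qual_window=5, qual_threshold=20, only_3_end=False):
    """Return the (start, end) slice of the read to keep.

    Assumed rule: a window passes when its mean quality is at least
    qual_threshold. The read is cut after the last passing window and,
    unless only_3_end is set, before the first passing window.
    Returns (0, 0) when no window passes.
    """
    n = len(quals)
    if n < qual_window:
        return (0, 0)

    def window_passes(i):
        return sum(quals[i:i + qual_window]) / qual_window >= qual_threshold

    # 3' end: walk windows from the right until one passes.
    end = 0
    for i in range(n - qual_window, -1, -1):
        if window_passes(i):
            end = i + qual_window
            break
    if end == 0:
        return (0, 0)

    # 5' end: walk windows from the left, unless trimming is 3'-only.
    start = 0
    if not only_3_end:
        for i in range(0, end - qual_window + 1):
            if window_passes(i):
                start = i
                break
    return (start, end)


if __name__ == "__main__":
    quals = [2, 2, 2, 40, 40, 40, 40, 2, 2, 2]
    print(trim_by_window(quals, qual_window=2))                   # both ends
    print(trim_by_window(quals, qual_window=2, only_3_end=True))  # 3' end only
```

Because a window is judged on its mean, a single poor base next to good ones can survive at the edge of the kept region; a smaller window or a higher threshold trims more aggressively.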

454 Data cleaning
