**hypatia** · 10-04-2010, 12:55 PM

Hi, I have the same question as you. Did you figure it out?

**jdanderson** · 10-04-2010, 05:57 PM

Hello Swarbre and Hypatia,

I think Bowtie/TopHat may do some quality trimming during processing, but if you want more control over which reads pass on quality, you can use the FASTX-Toolkit:

FASTX-Toolkit

http://hannonlab.cshl.edu/fastx_toolkit/index.html
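To make the idea of a per-read quality filter concrete, here is a minimal Python sketch of the kind of rule such a filter applies: keep a read only if at least a given percentage of its bases reach a minimum quality score. This is not the FASTX-Toolkit's own code. The function names, the default thresholds (`min_q=20`, `min_percent=80.0`), and the quality offset are all assumptions for illustration. In particular, the offset depends on how your file encodes qualities (33 for Sanger-style FASTQ, 64 for some Illumina pipeline versions), so check your data before relying on it.

```python
from typing import Iterable, Iterator, List, Tuple

# (header, sequence, quality string) for one read
FastqRecord = Tuple[str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from simple 4-line FASTQ input (no wrapped sequences)."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line, ignored
        qual = next(it).rstrip("\n")
        yield header, seq, qual


def passes_quality(qual: str, min_q: int = 20,
                   min_percent: float = 80.0, offset: int = 33) -> bool:
    """True if at least min_percent of bases have quality >= min_q.

    offset is the ASCII offset of the quality encoding (assumed, not detected).
    """
    if not qual:
        return False
    good = sum(1 for ch in qual if ord(ch) - offset >= min_q)
    return 100.0 * good / len(qual) >= min_percent


def filter_fastq(lines: Iterable[str], min_q: int = 20,
                 min_percent: float = 80.0, offset: int = 33) -> List[FastqRecord]:
    """Return only the records that pass the quality rule."""
    return [rec for rec in parse_fastq(lines)
            if passes_quality(rec[2], min_q, min_percent, offset)]


if __name__ == "__main__":
    # Two toy reads: the first has uniformly high qualities, the second low.
    demo = ["@read1\n", "ACGT\n", "+\n", "IIII\n",
            "@read2\n", "ACGT\n", "+\n", "!!!!\n"]
    for header, seq, qual in filter_fastq(demo):
        print(header, seq, qual)
```

For real data you would pass an open file handle instead of the toy list, and the toolkit or Galaxy tools will be faster and better tested than a hand-rolled script like this.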

I just finished using its Barcode Splitter module with good results, and it's fairly user friendly. Also, there are some useful tools on Galaxy, hosted by Penn State:

Galaxy

http://main.g2.bx.psu.edu/

Galaxy is a community-driven web-based analysis platform for life science research.

Hope this helps.

Regards,
Johnathon

**hypatia** · 10-05-2010, 05:41 AM

Thanks jdanderson,

I am using Galaxy and I want to follow the quality steps suggested in the Wilhelm Nature Protocols paper for Illumina data, but I cannot figure out how to do them in Galaxy. (I am new to this field; I am a statistician with a lot of experience with other omics data.)

**jdanderson** · 10-05-2010, 06:48 AM

Hello Hypatia,

I just gave the Wilhelm paper (Defining transcribed regions using RNA-seq) a quick look, and it appears they are talking about the quality filtering that occurs in Gerald, which is part of Illumina's CASAVA software package. If you use Gerald you should get four different output files: export.txt, extended.txt, sequence.txt, sorted.txt.

The sequence.txt file is the one that has already been quality filtered by Gerald (and most people seem to use this file). Also, I believe there is a video tutorial on quality trimming of reads on the Galaxy site (although I can't remember how detailed it was).

As for specific SRA files, you may want to look over the protocol for the reformatting procedure, as it may give some insight into which file type people are advised to start with. I have not looked over their protocol yet, but it should be on their site.

Hope this helps.

Regards,
Johnathon


Filtering RNA-seq data
