
**StevenW** · 08-20-2015, 02:39 AM

Hi,
I am responsible for developing FastQ Screen.

The standard way to remove contamination is:
Run FastQ Screen (latest version) with --subset 0 (to process the entire dataset) and --nohits. In the config file, include the Bowtie or Bowtie 2 indices of all the potential contaminants (human genome indices should not be included).

A FastQ file should then be produced containing all the reads that did not map to any of the contaminants.
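The run described above might look like the following sketch. The database names, index paths, and file names are placeholders rather than values from this thread, and option names can differ between releases, so check `fastq_screen --help` for your version. The script only prints the command unless `fastq_screen` is installed and the input file exists.

```shell
# Sketch only: database names, index paths, and file names are placeholders.
conf="contaminants.conf"
reads="sample.fastq.gz"

# One tab-separated DATABASE line per potential contaminant:
# keyword, database name, index basename.
# The host (e.g. human) genome is deliberately left out.
printf 'DATABASE\tPhiX\t/path/to/indices/PhiX/phix\n'              >  "$conf"
printf 'DATABASE\tEcoli\t/path/to/indices/Ecoli/ecoli\n'           >> "$conf"
printf 'DATABASE\tAdapters\t/path/to/indices/Adapters/adapters\n'  >> "$conf"

# --subset 0 screens every read; --nohits writes out the reads
# that did not map to any database in the config file.
cmd="fastq_screen --conf $conf --subset 0 --nohits $reads"

if command -v fastq_screen >/dev/null 2>&1 && [ -f "$reads" ]; then
    $cmd
else
    echo "dry run: $cmd"
fi
```

The unmapped reads end up in a new FastQ file alongside the usual screening report; the exact output file name depends on the FastQ Screen version.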

I am wondering, why do you need to only remove the hits that are classified as 'one-hit/one-library' AND 'multiple-hits/one-library'? Also is this single-end data?

Please feel free to contact me directly to discuss this further.

Kindest regards,
Steven W

**touchsk** · 09-02-2015, 10:14 AM

Thanks Steven

I apologize for the delayed reply.

Yes, I am doing something similar to what you mentioned. I have two config files in place:
- one with just the contaminants, for filtering the FastQ file as you described
- and one with contaminants and mammalian genomes, to generate a figure with your tool that shows the level of contamination in comparison to the real hits in a mammal (similar to the example plot on the FastQ Screen webpage).

This is single-end data.

The question about single hits came up because we have some contaminant-like sequences (custom to our study) that we want to quantify but not remove if they have homologs in mammalian genomes. But again, using a similar stepwise approach to the one above, I am able to tackle that.

Thanks for a very useful tool.

**StevenW** · 09-07-2015, 04:42 AM

Excellent, so everything is fine?

PS I'm releasing a new version of the software today.

**turnersd** · 09-21-2015, 06:32 AM

Hey, whatever happened to the --paired option? Is it still possible to screen paired-end data?

**StevenW** · 09-28-2015, 07:05 AM

Hi,
Thanks for your message; I am part of the team responsible for developing FastQ Screen.
We removed the --paired option from the script in a recent update as we felt it was unnecessary and was causing confusion. Mapping forward or reverse reads independently should be perfectly adequate to ascertain whether there is contamination, and will also show the user whether the forward reads are generally more prone to contamination than the reverse reads (or vice versa). Also, some users reported that the script sometimes failed to detect contamination in --paired mode, for example if the read pair did not constitute a contiguous region of DNA, or if the paired reads were separated by a large distance (as in RNA-seq).
So we now recommend that you screen both read files independently.
Is there any particular reason you would have to use the --paired mode?
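Screening each mate file on its own can be sketched as a simple loop. The file names are placeholders, and the config file is assumed to be found in its default location. The loop only prints the commands unless `fastq_screen` is installed and the files exist.

```shell
# Sketch only: sample_R1/sample_R2 are placeholder file names.
screen_cmd() {
    # Print the fastq_screen command for a single FASTQ file.
    printf 'fastq_screen %s\n' "$1"
}

# Forward and reverse reads are screened as two independent single-end runs.
for reads in sample_R1.fastq.gz sample_R2.fastq.gz; do
    if command -v fastq_screen >/dev/null 2>&1 && [ -f "$reads" ]; then
        fastq_screen "$reads"
    else
        echo "dry run: $(screen_cmd "$reads")"
    fi
done
```

Each run produces its own report, so the forward and reverse results can be compared side by side.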

**jaas** · 10-02-2017, 04:31 AM

FastQ Screen for bisulfite samples

Hello,

I'm trying to use FastQ Screen for bisulfite sequencing samples. I've run the test data and that works fine. However, I get a file handle error when running my bisulfite samples.

My code:

```
fastq_screen --bisulfite G3.S22.fastq.gz
```

Output:

```
Using fastq_screen v0.11.2
Defaulting to Bowtie 2 for --bisulfite mode
Reading configuration from '/data/Bismark/fastq_screen_v0.11.2/fastq_screen.conf'
Using '/usr/lib/bowtie2/bin/bowtie2' as Bowtie 2 path
Using '/data/Bismark/bismark' as Bismark path
Adding database Daphnia
Using 8 threads for searches
Option --subset set to 100000 reads
Processing G3.S22.fastq.gz
Counting sequences in G3.S22.fastq.gz
Making reduced sequence file with ratio 69:1
Searching G3.S22.fastq.gz_temp_subset.fastq against Daphnia
open: No such file or directory
[main_samview] fail to open "/data/Bismark/fastq_screen_v0.11.2/Daphnia.G3.S22.fastq.gz_temp_subset_bismark_bt2.bam" for reading.
Cannot close filehandle on '/data/Bismark/fastq_screen_v0.11.2/Daphnia.G3.S22.fastq.gz_temp_subset_bismark_bt2.bam' :  at fastq_screen line 1059.
```

I do get the output file and mapping report of the subsample against the first database, so it seems that the mapping did work. The error occurs regardless of which databases I use. However, when I run my samples in non-bisulfite mode and map them against the regular genome indices, it does not happen, so I do not think my sample file is the issue. I also know my Bismark genome indices are fine, as I have used them with Bismark directly.

Any ideas on what is wrong or why this is happening?

Thanks!

**StevenW** · 10-02-2017, 04:54 AM

FastQ Screen Bisulfite Problem

Hi jaas,

I am one of the developers of FastQ Screen. Hopefully we can get this problem resolved quickly.

Would you be able to send me the configuration file you used when running FastQ Screen? This will help me resolve the problem.

Many thanks,

Steven

**jaas** · 10-02-2017, 05:16 AM

Here's the config file. I can't figure out how to send it to you alone. I have changed the extension to a txt file to be able to upload it.

Thanks in advance for your help

Attached Files

fastq_screen.conf.txt (3.2 KB, 58 views)
