**zee** · 05-05-2009, 06:12 AM

Layla,

If you have tried that command by now, you know it does not work.

At most 2 bfq files can be given to maq map, and these are assumed to contain paired-end reads. If one file is given, the reads are treated as single-end.

So, if you have all your *sequence.txt files, use a for loop over these fastq files:

for file in *sequence.txt
do
maq fastq2bfq "$file" "$file.bfq"
maq map "$file.map" genome.bfa "$file.bfq"
done

**dakl** · 05-05-2009, 06:24 AM

Zee,

Would it be possible to parallelize this over several CPU cores in a simple manner? Kind of like PBS for cluster jobs, but locally.

cheers
D

**zee** · 05-05-2009, 06:37 AM

Dakl,

I use novoalign for most of my multi-core jobs, but it should be possible to do something similar with maq.

If you have a large database to search you might run into some problems. I would split all my files into batches of N-1, where N = number of CPUs; I like to keep one free for system, IO, etc.

ls *.fastq | split -l <N-1> BATCH

Then for each batch run a loop

for file in `cat BATCH...`
do
maq fastq2bfq "$file" "$file.bfq"
maq map "$file.map" genome.bfa "$file.bfq"
done &

And don't forget the "&", which places each loop in the background.
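
A note on the batching: split -l <N-1> puts N-1 file names into each batch file, so the number of background loops depends on how many files there are. A minimal sketch of one alternative, which deals the file names into a fixed number of batches and waits for all of them to finish. The names map_one and run_batches are made up for illustration, and the reference is assumed to be genome.bfa as in the loop above:

```shell
# Hypothetical helper: convert and map one FASTQ file
# (the same two maq commands as in the loop above).
map_one() {
    maq fastq2bfq "$1" "$1.bfq" && maq map "$1.map" genome.bfa "$1.bfq"
}

# Hypothetical helper: read file names from stdin, deal them round-robin
# into <njobs> batch lists, and run <command> on each batch in the background.
# Usage: ls *.fastq | run_batches <njobs> <command>
run_batches() {
    njobs=$1
    cmd=$2
    tmpdir=$(mktemp -d) || return 1
    # One list per background job: line NR goes to list NR mod njobs.
    awk -v n="$njobs" -v d="$tmpdir" '{ print > (d "/BATCH" (NR % n)) }'
    for batch in "$tmpdir"/BATCH*; do
        [ -e "$batch" ] || continue      # no input at all
        while IFS= read -r file; do
            "$cmd" "$file"
        done < "$batch" &                # the "&" backgrounds each loop
    done
    wait                                 # block until every batch has finished
    rm -rf "$tmpdir"
}
```

On a 4-core machine this could be called as: ls *.fastq | run_batches 3 map_one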

**Layla** · 05-13-2009, 07:35 AM

Hi zee,

Thank you for your reply, but I have been receiving a bizarre error: Assertion failed: (fp_bfa), function ma_match, file match.cc, line 516

for file in `ls *.bfq`
do
./maq map $file.map genome.bfa $file.bfq
done

From the paired-end experiment I have a total of 3 pairs stored in one folder:
s_1.bfq s_2.bfq
t_1.bfq t_2.bfq
u_1.bfq u_2.bfq

I don't understand how/where the ./maq map loop picks up s_1.bfq and s_2.bfq, then t_1.bfq and t_2.bfq, then u_1.bfq and u_2.bfq.

Thank you for your help

L

**zee** · 05-13-2009, 07:42 AM

OK, it is a simple change to the following:

Code:

for base in s t u; do
  ./maq map "$base.map" genome.bfa "${base}_1.bfq" "${base}_2.bfq"
done
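
If the list of pairs grows, the base names could also be derived from the files rather than typed out. A minimal sketch, assuming the pairs follow the <base>_1.bfq / <base>_2.bfq naming used in this thread; list_pair_bases is a made-up helper name, not part of maq:

```shell
# Hypothetical helper (not part of maq): print the base name of every
# complete pair <base>_1.bfq / <base>_2.bfq in the current directory.
list_pair_bases() {
    for first in *_1.bfq; do
        [ -e "$first" ] || continue      # the glob matched nothing
        base=${first%_1.bfq}             # strip the _1.bfq suffix
        if [ -e "${base}_2.bfq" ]; then
            echo "$base"
        else
            echo "skipping $base: ${base}_2.bfq not found" >&2
        fi
    done
}
```

The loop could then start with: for base in $(list_pair_bases); do ... (this relies on base names without spaces).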

**Layla** · 05-13-2009, 08:23 AM

Thanks, Zee. Since multiple runs are carried out to increase the number of reads, why is a separate .map file being created for each pair (so a total of 3)? Isn't the purpose to merge all the pairs and generate a single .map file to increase genome coverage?

Actually, while writing this: is this where I can use the ./maq merge command?

Cheers
L

**jnfass** · 05-13-2009, 10:34 AM

You got it ... after you've generated all the maps, use maq merge to combine them into one map, from which you can generate a pileup, consensus, etc ...
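
A minimal sketch of the map-then-merge sequence for the three pairs in this thread. The Maq manual lists the merge command as mapmerge, so that is what the sketch calls; whether a given maq build also accepts merge is not checked here. map_and_merge and the MAQ variable are made up for illustration:

```shell
MAQ=${MAQ:-maq}    # assumption: set MAQ=./maq if the binary sits in the current directory

# Hypothetical wrapper: map each pair, then merge the per-pair maps into one.
# Usage: map_and_merge <reference.bfa> <merged.map> <base>...
map_and_merge() {
    ref=$1
    merged=$2
    shift 2
    maps=""
    for base in "$@"; do
        "$MAQ" map "$base.map" "$ref" "${base}_1.bfq" "${base}_2.bfq" || return 1
        maps="$maps $base.map"
    done
    # $maps is left unquoted on purpose so that each map name becomes its
    # own argument (this assumes base names without spaces).
    "$MAQ" mapmerge "$merged" $maps
}
```

For the files above this would be called as: map_and_merge genome.bfa all.map s t u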

**dakl** · 05-14-2009, 12:18 AM

Hi all,

Since Maq is optimized for ~2M reads as input, I managed to do the following:

Code:

time maq fastq2bfq -n 2000000 ../50a_fastq.single.fastq 50a

to create several bfq files containing the reads, and then used the Perl module Parallel::ForkManager to fork one mapping process per bfq file. See the script below for details.

Code:

#!/usr/bin/perl -w

use strict;
use Parallel::ForkManager;

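my $pm = Parallel::ForkManager->new(4); # number of parallel processes is 4

# Read one bfq file name per line (from stdin or from files given as arguments)
while (<>) {
    chomp;

    # Forks and returns the pid for the child:
    my $pid = $pm->start and next;

    qx/ maq match -c $_.map ~\/hg18\/hg18.bfa $_/;

    $pm->finish; # Terminates the child process
}
$pm->wait_all_children; # Wait for the remaining children before exiting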

multiple runs and maq