Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • Layla
    Member
    • Sep 2008
    • 58

    multiple runs and maq

    How do you do an alignment (./ maq map command) in maq when you have run more than one lane on a Solexa machine on the same sample?
    Can you simply append the reads,

    ./maq map n.read1.bfq in.read2.bfq in.read3.bfq etc etc?


    Cheers
    L
  • zee
    NGS specialist
    • Apr 2008
    • 249

    #2
    Layla,

    If you have tried that command by now you know it would never work.

    At most 2 bfq files can be given and these are assumed to contain paired-end reads. If one file is given then just single-lane.

    So, if you have all your *sequence.txt files use a for loop on these astq files:

    for file in `ls *sequence.txt`
    do
    maq fastq2bfq $file $file.bfq
    maq map $file.map genome.bfa $file.bfq
    done

    Comment

    • dakl
      Member
      • May 2009
      • 15

      #3
      Zee,

      Would it be possible to parallelize this over several CPU cores in a simple manner? Kind of like PBS for cluster jobs, but locally.

      cheers
      D

      Comment

      • zee
        NGS specialist
        • Apr 2008
        • 249

        #4
        Dakl,

        I use novoalign for most of my mutli-core jobs, but it should be possible to do something similar with maq.

        If you have a large database to search you might run into some problems. I would split all my files into batches of N-1, N= no. CPUs - I like to keep one free for system, IO,etc.


        ls *.fastq | split -l <N-1> BATCH


        Then for each batch run a loop

        for file in `cat BATCH...`
        do
        maq fastq2bfq $file $file.bfq
        maq map $file.map genome.bfa $file.bfq
        done &
        And dont forget the "&" which places each loop in the background.

        Comment

        • Layla
          Member
          • Sep 2008
          • 58

          #5
          Hi zee,

          Thankyou for your reply but I have been receiving a bizarre error:Assertion failed: (fp_bfa), function ma_match, file match.cc, line 516

          for file in `ls *.bfq`
          do
          ./maq map $file.map genome.bfa $file.bfq
          done

          From the paired end experiment I have a total of 3 pairs stored in one folder:
          s_1.bfq s_2.bfq
          t_1.bfq t_2.bfq
          u_1.bfq u_2.bfq

          I didnt understand how/where the ./maq map loop looks at s_1.bfq s_2.bfq and then t_1.bfq t_2.bfq, u_1.bfq u_2.bfq.

          Thank you for your help

          L

          Comment

          • zee
            NGS specialist
            • Apr 2008
            • 249

            #6
            OK, it is a simple change to the ff:

            Code:
            for base in `echo s t u`; do
              ./maq map $base.map genome.bfa $base"_1.bfq" $base"_2.bfq"
            done

            Originally posted by Layla View Post
            Hi zee,

            Thankyou for your reply but I have been receiving a bizarre error:Assertion failed: (fp_bfa), function ma_match, file match.cc, line 516

            for file in `ls *.bfq`
            do
            ./maq map $file.map genome.bfa $file.bfq
            done

            From the paired end experiment I have a total of 3 pairs stored in one folder:
            s_1.bfq s_2.bfq
            t_1.bfq t_2.bfq
            u_1.bfq u_2.bfq

            I didnt understand how/where the ./maq map loop looks at s_1.bfq s_2.bfq and then t_1.bfq t_2.bfq, u_1.bfq u_2.bfq.

            Thank you for your help

            L

            Comment

            • Layla
              Member
              • Sep 2008
              • 58

              #7
              Thanx Zee. Since multiple runs are carried out to increase the number of reads, why is it that a separate .map file is being created for each pair (so a total of 3)? Is the purpose not to merge all the pairs and generate a single .map file to increase genome coverage?

              Actually whilst writing, is this where I can use the ./maq merge command?

              Cheers
              L

              Comment

              • jnfass
                Member
                • Aug 2008
                • 88

                #8
                You got it ... after you've generated all the maps, use maq merge to combine them into one map, from which you can generate a pileup, consensus, etc ...

                Comment

                • dakl
                  Member
                  • May 2009
                  • 15

                  #9
                  Hi all,

                  Since Maq is optimized for ~2M reads as input, I managed to do the following:

                  Code:
                  time maq fastq2bfq -n 2000000 ../50a_fastq.single.fastq 50a
                  to create several bfq-files containing the reads, and then use the perl module Parallel::ForkManager to fork the process. See the script below for details.

                  Code:
                  #!/usr/bin/perl -w
                  
                  use strict;
                  use Parallel::ForkManager;
                  
                  my $pm = new Parallel::ForkManager(4); # number of parallel processes is 4
                  while(<>){
                          chomp;
                  
                          # Forks and returns the pid for the child:
                          my $pid = $pm->start and next; 
                          
                          qx/ maq match -c $_.map ~\/hg18\/hg18.bfa $_/; 
                          
                      $pm->finish; # Terminates the child process
                  }

                  Comment

                  Latest Articles

                  Collapse

                  • GATTACAT
                    Reply to Nine Things a Sample Prep Scientist Thinks About Before Sequencing
                    by GATTACAT
                    Love this - good data definitely starts from good input, and poor input can only give relatively poor data. I particularly like the mention of Nanodrop/absorbance based methods for quantification. It's such a toss up if you'll get an accurate reading or what amounts to a randomly generated number, and a lot of library/sequencing related issues can be traced back to poor quant.
                    07-01-2026, 11:43 AM
                  • SEQadmin2
                    Nine Things a Sample Prep Scientist Thinks About Before Sequencing
                    by SEQadmin2


                    I’m not a sequencing expert. I’m a purification scientist who uses NGS to evaluate workflows my group develops. With this perspective, we think about the sample first and the NGS workflow second. The sequencer is an exceptionally honest reporter, but it can only report on what you give it, so whether you get clean, interpretable data from an NGS workflow is largely determined before you begin.

                    Here are nine questions we think about, in roughly the order they matter, before...
                    06-18-2026, 07:11 AM

                  ad_right_rmr

                  Collapse

                  News

                  Collapse

                  Topics Statistics Last Post
                  Started by SEQadmin2, 07-02-2026, 11:08 AM
                  0 responses
                  11 views
                  0 reactions
                  Last Post SEQadmin2  
                  Started by SEQadmin2, 06-30-2026, 05:37 AM
                  0 responses
                  13 views
                  0 reactions
                  Last Post SEQadmin2  
                  Started by SEQadmin2, 06-26-2026, 11:10 AM
                  0 responses
                  20 views
                  0 reactions
                  Last Post SEQadmin2  
                  Started by SEQadmin2, 06-17-2026, 06:09 AM
                  0 responses
                  54 views
                  0 reactions
                  Last Post SEQadmin2  
                  Working...