  • bowtie and maq questions

    Hi all,

    OK, after spending quite some time reading about aligners, I am still almost as much of a novice as I was before, but I am able to run bowtie and maq on my computer (well, at least to some extent).

    To help me learn more about how those aligners work, I ran two experiments:

    1. Run bowtie as instructed in the tutorial with e_coli_1000.fq, which is included in the bowtie package.
    Code:
    ~/Desktop/genome/HTdata/e_coli_536$ ../../Bowtie/bowtie-0.9.9.2/bowtie e_coli ../../Bowtie/bowtie-0.9.9.2/reads/e_coli_1000.fq > e_coli.bowtie.txt
    The output of this command is
    Code:
    Reported 699 alignments to 1 output stream(s)
    together with a nice output file, e_coli.bowtie.txt, in which I can check which sequences aligned, where they aligned, and what the errors in each sequence are.

    2. Run maq easyrun with the same e_coli_1000.fq file. Since the e_coli index included with bowtie is E. coli 536, I went to the NCBI FTP site and downloaded NC_008253.fna. The MAQ run command is:
    Code:
    ~/Desktop/genome/HTdata/e_coli_536$ maq.pl easyrun -d e_coli NC_008253 ../../Bowtie/bowtie-0.9.9.2/reads/e_coli_1000.fq > e_coli.maq.txt
    The e_coli.maq.txt file shows
    Code:
    -- == statmap report ==
    
    -- # single end (SE) reads: 1000
    -- # mapped SE reads: 745 (/ 1000 = 74.5%)
    -- # paired end (PE) reads: 0
    -- # mapped PE reads: 0 (/ 0 = NA%)
    -- # reads that are mapped in pairs: 0 (/ 0 = NA%)
    -- # Q>=30 reads that are moved to meet mate-pair requirement: 0 (/ 0 = NA%)
    -- # Q<30 reads that are moved to meet mate-pair requirement: 0 (NA%)
    So I have some questions:

    a. Why did bowtie and MAQ give different results on the same data set (MAQ reported 745 mapped reads and bowtie reported 699 alignments)? How can I set the parameters of both bowtie and maq to get the same results?

    b. e_coli.bowtie.txt is a nice text file that lists the mapped reads and their errors. How can I get an equivalent summary from the MAQ output files, i.e. a file listing the mapped reads and their errors?

    c. What software can I use for post-alignment analysis? I tried maq mapview, but I can only see one mapped read at a time. Is there software that shows a nice alignment view like BLAT, with the errors as well as the coordinates of each read on the reference?

    Sorry for such a long post and thank you all in advance. Any input will be greatly appreciated.

    D.

  • #2
    For question A:


    • #3
      Question C:
      Have you tried consed viewer ( http://www.phrap.org/consed/consed.html#howToGet ) ?


      • #4
        Further elaboration on A:

        My fault. The reads/e_coli_1000.fq file I include with Bowtie has instances where an N in the read lines up with a non-zero quality value. The Illumina pipeline (AFAIK) doesn't do this, and Maq automatically rounds quality values corresponding to Ns down to 0. Bowtie doesn't, hence the difference. You can fix the .fq file with this script:

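        #!/usr/bin/perl -w
        # Set the quality to "!" (Phred+33 zero) wherever the read has an N.
        # Reads 4-line FASTQ records from STDIN or the named files; writes to STDOUT.
        use strict;

        while(<>) {
            my $name = $_;                      # "@..." header line, newline kept
            my $seq = <>; chomp($seq);          # read sequence
            my @seqa = split(//, $seq);
            my $name2 = <>;                     # "+..." separator line, newline kept
            my $quals = <>; chomp($quals);      # quality string
            my @qualsa = split(//, $quals);
            # Zero out the quality under every N
            for(my $i = 0; $i <= $#seqa; $i++) {
                $qualsa[$i] = "!" if($seqa[$i] eq 'N');
            }
            print "$name$seq\n$name2";
            print join("", @qualsa) . "\n";
        }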
        Then, if you run bowtie with its default parameters on the fastq output by the script, you should see it report 753 alignments.

        Sorry for the confusion,
        Ben
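
        To check the explanation above against a given FASTQ file before rewriting it, a short Python sketch (not part of the original reply) can count how many reads carry an N with a non-zero quality. It assumes strict 4-line records and Phred+33 qualities, where "!" is quality 0, as in the Perl script; the function names read_fastq and count_n_with_quality are made up for this illustration.

```python
from typing import Iterable, Iterator, Tuple

ZERO_QUAL = "!"  # Phred+33 character for quality 0 (assumed encoding, as in the Perl script)


def read_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str, str]]:
    """Yield (name, seq, name2, quals) from strictly 4-line FASTQ records."""
    it = iter(lines)
    for name in it:
        # Pull the remaining three lines of the record; missing lines become "".
        seq = next(it, "")
        name2 = next(it, "")
        quals = next(it, "")
        yield (name.rstrip("\n"), seq.rstrip("\n"),
               name2.rstrip("\n"), quals.rstrip("\n"))


def count_n_with_quality(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (reads_affected, bases_affected) for N bases whose quality is not zero."""
    reads = 0
    bases = 0
    for _, seq, _, quals in read_fastq(lines):
        # Count positions where the base is N but the quality character is not "!".
        hits = sum(1 for base, qual in zip(seq, quals)
                   if base == "N" and qual != ZERO_QUAL)
        if hits:
            reads += 1
            bases += hits
    return reads, bases
```

        For example, count_n_with_quality(open("reads/e_coli_1000.fq")) returns a (reads, bases) pair; a result of (0, 0) would mean the file has no N bases with non-zero quality and needs no fixing.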


        • #5
          Originally posted by Ben Langmead
          Further elaboration on A:

          My fault. The reads/e_coli_1000.fq file I include with Bowtie has instances where an N in the read lines up with a non-zero quality value. The Illumina pipeline (AFAIK) doesn't do this, and Maq automatically rounds quality values corresponding to Ns down to 0. Bowtie doesn't, hence the difference. You can fix the .fq file with this script:


          Then, if you run bowtie with its default parameters on the fastq output by the script, you should see it report 753 alignments.

          Sorry for the confusion,
          Ben
          I got it! Now 753 vs 745 is close enough. Thanks, Ben, for your script. Thanks also to Pepe for the paper; it is really helpful to a novice like me.

          Any other input about MAQ? I think there must be many MAQ users here in the forum.

          Thanks,

          D.


          • #6
            Originally posted by Coffeebean
            Question C:
            Have you tried consed viewer ( http://www.phrap.org/consed/consed.html#howToGet ) ?
            No, I haven't. But Consed seems... picky to me, since it requires other software such as phred in order to run. Anyway, I will give it a try. Thanks, Coffeebean.
