Header Leaderboard Ad

Collapse

bfast localalign error

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • bfast localalign error

    Hi,

    I am trying to map exome data using bfast. I received *.sam files from our collaborates. Here's a summary of my method:

    1. Converted *.sam to *.fastq using PICARD
    2. Converted hg19.fa to color space - hg19.fa.cs.brg
    3. Created index for reference - hg19.fa.cs.1.1.bif
    4. Search indexes successful - *.bmf
    5. Perform local alignment - gave me an error! Thanks to Seqanswers forums, I realized that I have to convert reference to nucleotide space as well. So, now I have hg19.fa.nt.brg
    6. Tried to run local alignment again.....but got "AlignColorSpaceGappedConstrained" error!!!

    ************************************************************
    Checking input parameters supplied by the user ...
    Validating fastaFileName chr_all.fa.
    Validating matchFileName03C14605A_bfastMatches_hg19.bmf.
    **** Input arguments look good! *****
    ************************************************************
    ************************************************************
    Printing Program Parameters:
    programMode: [ExecuteProgram]
    fastaFileName: chr_all.fa
    matchFileName: 03C14605A_bfastMatches_hg19.bmf
    scoringMatrixFileName: [Not Using]
    ungapped: [Not Using]
    unconstrained: [Not Using]
    space: [Color Space]
    startReadNum: 1
    endReadNum: 2147483647
    offsetLength: 20
    maxNumMatches: 384
    avgMismatchQuality: 10
    numThreads: 1
    queueLength: 10000
    timing: [Not Using]
    ************************************************************
    ************************************************************
    Reading in reference genome from chr_all.fa.nt.brg.
    In total read 25 contigs for a total of 3095693983 bases
    ************************************************************
    ************************************************************
    Reading match file from 03C14605A_bfastMatches_hg19.bmf.
    ************************************************************
    Performing alignment...
    Reads processed: 0************************************************************
    In function "AlignColorSpaceGappedConstrained": Fatal Error[OutOfRange]. Message: read and reference did not match.
    ***** Exiting due to errors *****
    ************************************************************


    Thank you in advance for your help.

    Nimmi

  • #2
    Can you show the files from within the reference directory as well as the exact commands you run?

    Is it possible you forgot to use -A1 to tell bfast to compute the local alignments in CS?
    -drd

    Comment


    • #3
      Here are the list of files in my directory:

      03C14605A.fastq
      chr_all.fa
      chr_all.fa.cs.brg
      chr_all.fa.cs.1.1.bif
      chr_all.fa.nt.brg
      03C14605A_bfastMatches_hg19.bmf

      Here's the command I used for local alignment:

      $/usr/local/bfast/bin/bfast localalign -f chr_all.fa -m 03C14605A_bfastMatches_hg19.bmf -A 1
      > 03C14605A_bfastMatches_hg19_localalign.baf

      Thank you very much

      Comment


      • #4
        Looks like the fastq is not in the proper format. You need to extract the color sequence and color qualities from the SAM file in the CS/CQ optional tags.

        Comment


        • #5
          Thanks Nils for your response. Here's a few lines from my *.sam file:

          1462_1719_1204_F3 0 chr1 10006 0 50M * 0 0 CTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCT
          [email protected][email protected]<BBB;@=BB?;[email protected]><7=7B>;>@[email protected]<97+B84 NH:i:508 CS:Z:T22301002301002301002301002301002301002301002301002
          1649_965_405_F3 16 chr1 10014 0 50M * 0 0 AACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAA +9==%'4587'1+:[email protected]<5=7;[email protected]:=BAB:<[email protected]<B<[email protected]@ NH:i:511 CS:Z:T00320010320010320010320010320010320010320010300010
          1433_953_927_F3 16 chr1 10026 0 50M * 0 0 AACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAA '%:7+(6-;>%:4(<<8:=-:@268%>>644)::7>9;>[email protected]@[email protected]@B?A NH:i:511 CS:Z:T00320010320010320010320010320010320010320010320010
          1326_1211_1996_F3 0 chr1 10029 0 50M * 0 0 CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCC
          <96BA6>7<?>/;6<?1482:41-)+1+466-+4752/6/8!;11/-)>1 NH:i:517 CS:Z:T20230100230100230100230100230100230100230100230100
          1379_1501_1111_F3 16 chr1 10047 0 50M * 0 0 CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCC
          :%6=%!-2+7!%==<)!:1:4+)--8B6914:>1:/46:+;=:[email protected]<</'> NH:i:517 CS:Z:T10010320010320010320010320010320010320010320010320
          1608_1356_225_F3 0 chr1 10048 0 50M * 0 0 CTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCT
          ?>B>@@:AB9>=?><@6B?8<><B;<;;<B</<[email protected]:5929B56/2)@@) NH:i:508 CS:Z:T22301002301002301002301002301002301002301002301002

          Comment


          • #6
            If you have fragment reads, then the following perl code should do it. Pipe the SAM file into it (or use "samtools view <in.bam>" for a BAM file). For paired end or multi-end reads, it does get trickier and the below will not keep the pairings. Note that your input SAM file does not have the CQ tag, and thus color qualities are lost.


            Code:
            #!/bin/perl
            use strict;
            use warnings;
            while(defined(my $line = <STDIN>)) {
                if($line =~ m/^@/) {
                    next;
                }
                chomp($line);
                my $NAME = $line;
                my $CS = $line;
                my $CQ = "";
                $NAME =~ s/^([^\t]+)\t.*/$1/;
                $CS =~ s/.*CS:Z:([^\t]+).*/$1/;
                $CS =~ tr/ /4/;
                if($line =~ m/CQ:Z:([^\t]+)/) {
                    $CQ = $1;
                }   
                else {
                    $CQ = "!" x (length($CS)-1); 
                }   
                print STDOUT "\@$NAME\n$CS\n+\n$CQ\n";
            }
            Nils

            Comment


            • #7
              Thanks Nils for the perl script.

              Just to summarize........I converted *.sam file to *.fastq using PICARD and this fastq file was in nucleotide space. When I ran localalign in bfast I got the "AlignColorSpaceGappedConstrained" error. Then I used perl script posted by Nils to convert *.sam to *.fastq in colorspace. Then re-ran the match command (*.bmf) followed by localalign and everything works fine!

              Thank you all for your suggestions and responses

              Comment

              Working...
              X