  • mnfuser
    Junior Member
    • Apr 2012
    • 6

    BFAST localalign SOLiD data

    Hi all, I am using BFAST to align SOLiD PE data.

    I had the following error:

    ************************************************************
    Checking input parameters supplied by the user ...
    Validating fastaFileName /state/partition1/genome/bfast/ucsc.hg19.fasta.
    Validating matchFileName/share/GAMES/data/test20120501/solid0121_20100616_PEcllSureSelect_CLL_11.matched.bmf.
    **** Input arguments look good! *****
    ************************************************************
    ************************************************************
    Printing Program Parameters:
    programMode: [ExecuteProgram]
    fastaFileName: /state/partition1/genome/bfast/ucsc.hg19.fasta
    matchFileName: /share/GAMES/data/test20120501/solid0121_20100616_PEcllSureSelect_CLL_11.matched.bmf
    scoringMatrixFileName: [Not Using]
    ungapped: [Not Using]
    unconstrained: [Not Using]
    space: [Color Space]
    startReadNum: 1
    endReadNum: 2147483647
    offsetLength: 20
    maxNumMatches: 384
    avgMismatchQuality: 10
    numThreads: 1
    queueLength: 25000
    timing: [Not Using]
    ************************************************************
    ************************************************************
    Reading in reference genome from /state/partition1/genome/bfast/ucsc.hg19.fasta.nt.brg.
    In total read 93 contigs for a total of 3137161264 bases
    ************************************************************
    ************************************************************
    Reading match file from /share/GAMES/data/test20120501/solid0121_20100616_PEcllSureSelect_CLL_11.matched.bmf.
    ************************************************************
    Performing alignment...
    Reads processed: 0************************************************************

    In function "AlignColorSpaceGappedConstrained": Fatal Error[OutOfRange]. Message: read and reference did not match.


    Before alignment I built the references with fasta2brg in both nt and cs space. I created the masks and the indexes as described in the manual (SOLiD section). I matched the .fastq file (obtained with solid2fastq from 2 .csfasta files + 2 .qual files) against the references and got a 3.5 Gb .bmf file. When I started the local alignment, I got the error shown above.

    What am I doing wrong? I hope someone can help me.

    Many thanks.
  • nilshomer
    Nils Homer
    • Nov 2008
    • 1283

    #2
    What were your match commands?


    • mnfuser
      Junior Member
      • Apr 2012
      • 6

      #3
      Originally posted by nilshomer:
      What were your match commands?
      i) Index creation using 10 masks (as in the BFAST manual):

      $BFASTDIR/bfast/bfast index -n 8 -f $REFERENCEDIR/$REFERENCEFILE -m $MASK<1:10> -w 14 -i 7 -A 1

      10 .bif files created successfully (13 Gb each)
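
The `$MASK<1:10>` shorthand above stands for ten separate `bfast index` runs. Here is a minimal sketch of how those command lines could be generated, assuming each mask gets its own index number through `-i` (the post shows `-i 7`, which may only be shorthand). The mask strings, the `bfast` executable name, and the helper `build_index_commands` are placeholders for illustration, not the masks listed in the manual.

```python
def build_index_commands(bfast, reference, masks, hash_width=14, threads=8):
    """Return one `bfast index` argument list per mask, numbering indexes from 1."""
    commands = []
    for index_number, mask in enumerate(masks, start=1):
        commands.append([
            bfast, "index",
            "-n", str(threads),        # threads, as in the post
            "-f", reference,           # reference FASTA
            "-m", mask,                # one mask per index
            "-w", str(hash_width),     # hash width, as in the post
            "-i", str(index_number),   # assumption: a distinct number per mask
            "-A", "1",                 # color space
        ])
    return commands


# Placeholder masks; substitute the ten real masks from the manual.
masks = ["<mask %d>" % i for i in range(1, 11)]
commands = build_index_commands("bfast", "ucsc.hg19.fa", masks)
for argv in commands:
    print(" ".join(argv))
```

Printing the commands first, rather than running them, makes it easy to confirm that no two runs share the same `-i` value before committing hours of cluster time.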

      ii) Matching step

      $BFASTDIR/bfast/bfast match -f $REFERENCEDIR/$REFERENCEFILE -A 1 -r $OUTDIR/$READSFILE.fastq > $OUTDIR/$READSFILE.matched.bmf

      REFERENCEFILE=ucsc.hg19.fa
      READSFILE=solid0121_20100616_PEcllSureSelect_CLL_11 (obtained by running solid2fastq with the .csfasta F3 and F5 files and the 2 related .qual files)

      Anything wrong?

      Thanks


      • nilshomer
        Nils Homer
        • Nov 2008
        • 1283

        #4
        I would take a look at the bfast+bwa branch for paired ends. To debug, I would need a small test case.


        • mnfuser
          Junior Member
          • Apr 2012
          • 6

          #5
          I'm running localalign with the -U option and it's working. I will take a look at bwaaln and the related PE pipeline to compare results.

          I should also report that when I tested subsets of reads with -s/-e in localalign, those debug intervals gave good results (for example, the interval 1:3000 worked). The error arose only when localalign ran on the entire dataset, right at the beginning of the computation (0 reads processed).
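
Since sub-intervals succeed while the full run fails, the same `-s`/`-e` options could be used to narrow the failure down to a small test case. Below is a minimal sketch of that bisection, assuming a single read triggers the error and that a run fails exactly when its interval contains that read. `fails` is a hypothetical callback that would wrap a `bfast localalign` run restricted with `-s`/`-e` and report whether it hit the error; here it is replaced by a stand-in. The upper bound of 25000 is also an assumption, taken from `queueLength: 25000` together with the failure at "Reads processed: 0".

```python
def find_failing_read(first, last, fails):
    """Bisect [first, last] for the one read that makes fails(start, end) true.

    Returns None if the whole interval runs cleanly.
    """
    if not fails(first, last):
        return None
    lo, hi = first, last
    while lo < hi:
        mid = (lo + hi) // 2
        if fails(lo, mid):
            hi = mid          # culprit is in the lower half
        else:
            lo = mid + 1      # culprit is in the upper half
    return lo


def make_fake_fails(bad_read):
    """Stand-in for a real localalign run: fails iff the interval holds bad_read."""
    def fails(start, end):
        return start <= bad_read <= end
    return fails


fails = make_fake_fails(bad_read=17342)   # made-up culprit for the demo
culprit = find_failing_read(1, 25000, fails)
print(culprit)
```

With a real callback, each probe is one localalign run, so about 15 runs would cover 25000 reads. The resulting single-read interval is the kind of small test case asked for above.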
          I don't want to bother you any further.

          Let me know what you think.


          Thanks for your interest

          Cu,
          Marco


          • mnfuser
            Junior Member
            • Apr 2012
            • 6

            #6
            The postprocess step, run after alignment with the -U option (which worked fine, I guess, and produced a 4.5 Gb .baf file), gave me a segmentation fault.

            See the error message below:

            ************************************************************
            Checking input parameters supplied by the user ...
            Validating fastaFileName /state/partition1/genome/bfast/ucsc.hg19.fasta.
            Validating alignFileName /share/GAMES/data/test20120501/solid0121_20100616_PEcllSureSelect_CLL_11.matched.bmf.aln.baf.
            Input arguments look good!
            ************************************************************
            ************************************************************
            Printing Program Parameters:
            programMode: [ExecuteProgram]
            fastaFileName: /state/partition1/genome/bfast/ucsc.hg19.fasta
            alignFileName: /share/GAMES/data/test20120501/solid0121_20100616_PEcllSureSelect_CLL_11.matched.bmf.aln.baf
            algorithm: [Best Score]
            space: [Color Space]
            strandedness: [Opposite strand]
            positioning: [Read one first]
            pairing: [Paired End]
            avgMismatchQuality: 10
            scoringMatrixFileName: [Not Using]
            randomBest: [Not Using]
            minMappingQuality: -2147483648
            minNormalizedScore: -2147483648
            insertSizeAvg: 0.000000
            insertSizeStdDev: 0.000000
            numThreads: 8
            queueLength: 100000
            outputFormat: [SAM]
            outputID: [Not Using]
            RGFileName: [Not Using]
            baseQualityType: [MAQ-style]
            timing: [Not Using]
            ************************************************************
            ************************************************************
            Reading in reference genome from /state/partition1/genome/bfast/ucsc.hg19.fasta.nt.brg.
            In total read 93 contigs for a total of 3137161264 bases
            ************************************************************
            Postprocessing...
            ************************************************************
            Estimating paired end distance...
            Found only 0 distances to infer the insert size distribution
            ************************************************************

            In function "GetPEDBins": Warning[OutOfRange]. Variable/Value: b->numDistances.
            Message: Not enough distances to infer insert size distribution.
            ***** Warning *****
            ************************************************************
            /opt/torque/mom_priv/jobs/878.deepseq.unife.it.SC: line 30: 28843 Segmentation fault $BFASTDIR/bfast/bfast postprocess -n 8 -f $REFERENCEDIR/$REFERENCEFILE -i $OUTDIR/$ALNFILE.baf -A 1 -Y 0 > $OUTDIR/$ALNFILE.sam
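
The line "Found only 0 distances to infer the insert size distribution" suggests that postprocess saw no usable read pairs. One possible cause (a guess, not something confirmed in this thread) is that the F3 and F5 ends were not paired up in the FASTQ given to `bfast match`. Here is a rough sketch of a check, assuming that the ends of one read appear as consecutive records sharing the same read name and that every record is exactly four lines. `count_ends_per_read` is a made-up helper name, and the example records are invented.

```python
from collections import Counter
from itertools import groupby, islice


def read_names(lines):
    """Yield the read name from each 4-line FASTQ record."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if len(record) < 4:
            return
        header = record[0].rstrip("\n")
        if not header.startswith("@"):
            raise ValueError("unexpected FASTQ header: %r" % header)
        yield header[1:]


def count_ends_per_read(lines):
    """Histogram of ends per read: {number_of_ends: number_of_reads}."""
    histogram = Counter()
    for _name, group in groupby(read_names(lines)):
        histogram[sum(1 for _ in group)] += 1
    return histogram


# Invented records: one read with two ends, one read with a single end.
example = [
    "@1_10_20", "T0123", "+", "!!!!",
    "@1_10_20", "G3210", "+", "!!!!",
    "@1_10_21", "T0000", "+", "!!!!",
]
print(dict(count_ends_per_read(example)))
```

On the real file this would be called as `count_ends_per_read(open(path))`. If almost every read reports a single end, pairing was lost before matching, and that would be worth ruling out before looking at the segmentation fault itself.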


            • nilshomer
              Nils Homer
              • Nov 2008
              • 1283

              #7
              Very interesting, can you create a small test case to debug?


              • mnfuser
                Junior Member
                • Apr 2012
                • 6

                #8
                Hi, I ran bfast+bwa on a CentOS cluster and got a segmentation fault again (during index creation). It ran fine on my Debian laptop. Are there any known issues related to CentOS?

                Thanks


                • nilshomer
                  Nils Homer
                  • Nov 2008
                  • 1283

                  #9
                  I am sorry, there is not enough information to debug.

