bfast for analyzing AB's SOLiD data

  • bfast for analyzing AB's SOLiD data

    Dear Bfast experts and users,

    I have 4 questions and hope to seek some answers from the community. My questions are interspersed in the bfast workflow described in two parts below.

    PART-1
    1- convert experimental data to bfast input (these are for mate-pair library preps).
    solid2fastq -n 500000 -o reads *.csfasta *.qual
    ("reads.j.fastq" , j=1...,N files created)

    Q1- Is this command right for mate-pair library preps?

    2- convert the reference sequence to nucleotide space and color space
    bfast fasta2brg -f ref_genome.fa
    bfast fasta2brg -f ref_genome.fa -A 1

    3- create 10 indexes (one per mask, M=10) using the masks from the manual; this generates 10 bif files
    bfast index -f ref_genome.fa -m <mask> -w 14 -i <index number> -A 1
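
A minimal sketch of step 3 as a loop, assuming the ten masks from the manual have been saved one per line in a hypothetical file (called `masks.txt` here). The function name and the file are illustrative only; the flags are the ones from the command above. It only prints the commands (a dry run), so they can be checked before anything is executed.

```shell
#!/bin/sh
# Print one "bfast index" command per mask (dry run; nothing is executed).
# The masks file is a hypothetical helper: one mask per line, copied from the manual.
print_index_cmds() {
    ref="$1"         # reference FASTA, e.g. ref_genome.fa
    masks_file="$2"  # hypothetical file holding the masks
    i=1              # index number passed to -i
    while IFS= read -r mask; do
        # skip blank lines so they do not consume an index number
        [ -z "$mask" ] && continue
        echo "bfast index -f $ref -m $mask -w 14 -i $i -A 1"
        i=$((i + 1))
    done < "$masks_file"
}
```

Once the printed commands look right, something like `print_index_cmds ref_genome.fa masks.txt | sh` would run them one after another.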

    10 is optimal for analyzing the human genome, as suggested by the authors of bfast.
    Q2- Can anyone please suggest a number for the mouse genome? An approximate value would be good enough - say 15 or 50?

    I prefer not to compromise on sensitivity, so I would like to use all indexes to map the short reads (PART-2, summarized below).

    PART-2
    1- bfast match
    bfast match -f ref_genome.fa -A 1 -r reads.<N>.fastq > bfast.matches.file.ref_genome.<N>.bmf

    2- bfast localalign
    bfast localalign -f ref_genome.fa -m bfast.matches.file.ref_genome.<N>.bmf -A 1 > bfast.aligned.file.ref_genome.<N>.baf

    3- bfast postprocess
    bfast postprocess -f ref_genome.fa -i bfast.aligned.file.ref_genome.<N>.baf -A 1 > bfast.reported.file.ref_genome.<N>.sam
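
The three PART-2 steps could be chained per read chunk along these lines. This is only a sketch: the function names are made up for illustration, the file names and flags are copied from the commands above, and the commands are printed rather than run so they can be inspected first.

```shell
#!/bin/sh
# Print the match -> localalign -> postprocess commands for one read chunk.
# Dry run: commands are printed, not executed. File names follow the post above.
print_chunk_cmds() {
    ref="$1"    # reference FASTA
    chunk="$2"  # chunk number N, as in reads.N.fastq
    bmf="bfast.matches.file.ref_genome.$chunk.bmf"
    baf="bfast.aligned.file.ref_genome.$chunk.baf"
    sam="bfast.reported.file.ref_genome.$chunk.sam"
    echo "bfast match -f $ref -A 1 -r reads.$chunk.fastq > $bmf"
    echo "bfast localalign -f $ref -m $bmf -A 1 > $baf"
    echo "bfast postprocess -f $ref -i $baf -A 1 > $sam"
}

# Print the commands for chunks 1..last (last is supplied by the caller).
print_all_chunks() {
    ref_all="$1"
    last="$2"
    n=1
    while [ "$n" -le "$last" ]; do
        print_chunk_cmds "$ref_all" "$n"
        n=$((n + 1))
    done
}
```

For example, `print_all_chunks ref_genome.fa 3` prints nine commands; piping the output to `sh`, or writing each chunk's three lines to its own job script, would actually run them.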

    Q3- If I choose to split the PART-2 jobs by index, do I use bmfmerge after #1 and before #2?

    Q4- With bmfmerge, what would be a reasonable cutoff/flag for the mouse genome? An example command would help, please. It has been suggested that a value of "-M 500" may be useful when aligning to the human genome. I would welcome any suggestions for the mouse genome.

    Hope you can please help,
    Thanks very much in advance,
    cheers,
    another new bfast analyzer.
    ---------------

  • #2
    Q1 - Looks good.
    Q2 - Use the ten provided; they should be sufficient.
    Q3 - Yes.
    Q4 - The "-M" option should match the value in "bfast match" and "bfast localalign". The defaults should match across commands.
