Breakway: Identify Structural Variations in Genomic Data

  • Breakway: Identify Structural Variations in Genomic Data

    I would like to announce the release of Breakway, a program for identifying structural variations in genomic data!

    http://breakway.sourceforge.net

    Breakway is a suite of programs (written in Perl) that take aligned genomic data and report structural variation breakpoints. Features include:
    • Takes in BAM formatted input, the current standard for genomic alignments.
    • Compatible with standard output from major alignment algorithms such as BFAST, BWA, MAQ, et cetera.
    • Capable of analyzing data from any major platform--Solexa, SOLiD, 454, et cetera.
    • Empirically identifies structural variation breakpoints.
    • Highly specific analysis generates very few false positives.
    • Includes a suite of downstream tools for annotating identified breakpoints and reducing false positives.
    • Intuitive output tells you the type of event (INT, DEL, or INS), scores, inversion status, and more.


    I've also made Breakway compatible with pipelines, so it can potentially be plugged into your genome analysis pipeline to generate a Breakway report automatically.

    Development of Breakway started during analysis of the U87MG whole genome sequence and continued throughout subsequent genome sequencing projects in the Stanley F. Nelson Lab at UCLA. Since that first project, Breakway has become significantly more powerful, and I feel it has evolved (through concerted effort!) into something that the community would benefit from.

    I hope that Breakway can help others easily identify structural variation breakpoints in their genomic data. Please try it out!
    Last edited by Michael.James.Clark; 04-28-2010, 09:39 AM.
    Mendelian Disorder: A blogshare of random useful information for general public consumption. [Blog]
    Breakway: A Program to Identify Structural Variations in Genomic Data [Website] [Forum Post]
    Projects: U87MG whole genome sequence [Website] [Paper]

  • #2
    Note, the URL should be http://breakway.sourceforge.net



    • #3
      Oh, thanks! Serves me right for posting at 3:30AM right after I finished setting up the webpage.


      • #4
        Nice-looking application. Do you think it can be used on mRNA-seq and exon capture datasets, or just on whole genome sequencing?



        • #5
          I'd like to try Breakway, but it needs both BFAST and DNAA in the path, and I run into a problem when installing DNAA. Could you help me fix it?
          The error looks like this:

          $ make
          make all-recursive
          make[1]: Entering directory `/gs1/users/tangwei/dnaa-0.1.1/dnaa-0.1.1'
          Making all in dkbaseencoding
          make[2]: Entering directory `/gs1/users/tangwei/dnaa-0.1.1/dnaa-0.1.1/dkbaseencoding'
          if gcc -DHAVE_CONFIG_H -I. -I. -I.. -Wall -g -O2 -pthread -D_IOLIB=2 -D_FILE_OFFSET_BITS=64 -m64 -MT RGIndex.o -MD -MP -MF ".deps/RGIndex.Tpo" -c -o RGIndex.o `test -f '../bfast/bfast/RGIndex.c' || echo './'`../bfast/bfast/RGIndex.c; \
          then mv -f ".deps/RGIndex.Tpo" ".deps/RGIndex.Po"; else rm -f ".deps/RGIndex.Tpo"; exit 1; fi
          ../bfast/bfast/RGIndex.c:20:26: error: RGIndexExons.h: No such file or directory
          make[2]: *** [RGIndex.o] Error 1
          make[2]: Leaving directory `/gs1/users/tangwei/dnaa-0.1.1/dnaa-0.1.1/dkbaseencoding'
          make[1]: *** [all-recursive] Error 1
          make[1]: Leaving directory `/gs1/users/tangwei/dnaa-0.1.1/dnaa-0.1.1'
          make: *** [all] Error 2



          • #6
            Originally posted by townway
            I'd like to try Breakway, but it needs both BFAST and DNAA in the path, and I run into a problem when installing DNAA.

            ../bfast/bfast/RGIndex.c:20:26: error: RGIndexExons.h: No such file or directory
            make[2]: *** [RGIndex.o] Error 1
            That's my fault. I was in the middle of trying to put a tarball up for distribution and it was not being created correctly. Please try again.

            Nils
            Last edited by nilshomer; 04-28-2010, 08:51 PM. Reason: esl



            • #7
              Originally posted by Jon_Keats
              Nice-looking application. Do you think it can be used on mRNA-seq and exon capture datasets, or just on whole genome sequencing?
              While I haven't tested it on such datasets, it ought to work on them. The key will be in the reference genome used.

              Breakway functions by looking for clusters of aberrantly spaced paired reads, so the key is to have an appropriate reference genome for it to compare to.
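To make the idea of "clusters of aberrantly spaced paired reads" concrete, here is a minimal Python sketch. It is not Breakway's actual algorithm: the median/MAD cutoff, the `max_gap` and `min_support` parameters, and the `(chrom, left_pos, insert_size)` tuple layout are all assumptions chosen for illustration.

```python
from statistics import median


def find_aberrant_clusters(pairs, n_mads=5.0, max_gap=500, min_support=3):
    """Group read pairs with unusual insert sizes into positional clusters.

    pairs: list of (chrom, left_pos, insert_size) tuples (hypothetical layout).
    """
    sizes = [p[2] for p in pairs]
    med = median(sizes)
    # Median absolute deviation as a robust spread estimate; fall back to 1
    # so a perfectly uniform library does not give a zero threshold.
    mad = median(abs(s - med) for s in sizes) or 1
    aberrant = sorted(p for p in pairs if abs(p[2] - med) > n_mads * mad)

    clusters, current = [], []
    for pair in aberrant:
        # Start a new cluster on a chromosome change or a large positional gap.
        if current and (pair[0] != current[-1][0]
                        or pair[1] - current[-1][1] > max_gap):
            if len(current) >= min_support:
                clusters.append(current)
            current = []
        current.append(pair)
    if len(current) >= min_support:
        clusters.append(current)
    return clusters
```

Requiring several supporting pairs per cluster is what keeps a single chimeric or mis-aligned pair from being reported as an event, which is also why the choice of reference matters so much.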

              For exon capture, it should work with the normal reference genome just as well as it will with whole genomes.

              For RNA-seq (I'm not an expert, so I welcome other suggestions), the transcriptome is probably the best choice of reference.


              • #8
                Originally posted by nilshomer
                That's my fault. I was in the middle of trying to put a tarball up for distribution and it was not being created correctly. Please try again.

                Nils
                Thanks, Nils.

                townway, I just installed DNAA from the current tarball on SourceForge without any problems by following the directions in the INSTALL file, so please try again and hopefully it'll work for you.


                • #9
                  Does this work with mate pair data generated by the SOLiD platform? All I see in the manual are references to paired-end data, and the DNAA manual seems a little sparse on handling mate pair data too.

                  cheers



                  • #10
                    Sorry, I'm the idiot. I just found it after a closer look.



                    • #11
                      Breakway should work with any paired data--paired-end or mate pair or even split long reads.


                      • #12
                        Hi all,
                        A significant bug fix was just implemented such that breakway.run.pl will now function properly. Please update to Breakway 0.5.1!

                        Please let me know if you find any more!
                        MJ
                        Last edited by Michael.James.Clark; 05-05-2010, 11:27 AM.


                        • #13
                          A request was made for a filtering script that allows one to use another Breakway file in order to cross-check the Breakway file being created for events present in both. This is particularly useful for two things (and maybe more):

                          1) Comparing a tumor with its germline genome when both have been aligned to the same reference. The germline will often contain variants relative to the reference, as the reference is unrelated. Since the tumor is derived from the germline, we expect the tumor to carry those same variants unless a further mutation has occurred. Filtering them out should allow one to identify tumor-specific mutations.

                          2) Removing native events that are detected in the reference from the genome in question. Some structural events can be detected in the reference itself (for example, segmental duplications) and therefore may be worth marking in the sequenced genome.

                          If you want to use this function, please go to the Breakway website and download Breakway 0.6. The script is in the scripts folder and is called "breakway.bwfilter.pl". You can also use the usual breakway.run.pl with the --bwfile option and it will work.
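As a rough illustration of the cross-check described above, the following Python sketch drops events from one call set that also appear in another. It is not the logic of breakway.bwfilter.pl: the `Event` record, its fields, and the `tolerance` window are hypothetical, and the real Breakway output format is not assumed here.

```python
from collections import namedtuple

# Hypothetical event record; Breakway's real output has more columns.
Event = namedtuple("Event", "kind chrom pos")


def filter_shared_events(sample, background, tolerance=100):
    """Return events in `sample` that have no counterpart in `background`.

    Two events count as the same if they share type and chromosome and
    their breakpoints lie within `tolerance` bases of each other.
    """
    def matches(a, b):
        return (a.kind == b.kind and a.chrom == b.chrom
                and abs(a.pos - b.pos) <= tolerance)

    return [e for e in sample if not any(matches(e, b) for b in background)]


tumor = [Event("DEL", "chr1", 1000), Event("INS", "chr2", 5000), Event("DEL", "chr3", 700)]
germline = [Event("DEL", "chr1", 1040), Event("DEL", "chr2", 5000)]
tumor_specific = filter_shared_events(tumor, germline)
```

In this toy case the chr1 deletion is removed because the germline has a matching deletion 40 bases away, while the chr2 insertion stays because the germline event at that position is a different type.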
                          Last edited by Michael.James.Clark; 05-10-2010, 10:24 PM.


                          • #14
                            I'm getting errors on BAM files that have string flag fields instead of numerical flag fields.


                            i.e., a read starting with this:

                            Code:
                            1155_400_505   pP1     chr1    2571    255     8M6D17M =
                            results in a "problem processing reads. See reads file:" error

                            Is this a known bug, or something that can be worked around?

                            cheers



                            • #15
                              Originally posted by orcy
                              I'm getting errors on BAM files that have string flag fields instead of numerical flag fields.


                              i.e., a read starting with this:

                              Code:
                              1155_400_505   pP1     chr1    2571    255     8M6D17M =
                              results in a "problem processing reads. See reads file:" error

                              Is this a known bug, or something that can be worked around?

                              cheers
                              The string flag field is not up to the SAM spec, and is only meant for viewing. Try using the numerical flag field since your usage is currently non-standard.
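If regenerating the file with numeric flags is not convenient, a string flag such as `pP1` can be translated back to the numeric FLAG value. The sketch below assumes the one-letter codes used by older samtools string-flag output, where each letter stands for one bit of the SAM FLAG field; the letter table is given from memory and should be checked against the samtools version that produced the file.

```python
# Assumed letter-to-bit table for old samtools string flags.
FLAG_BITS = {
    "p": 0x1,    # read is paired
    "P": 0x2,    # mapped in a proper pair
    "u": 0x4,    # read unmapped
    "U": 0x8,    # mate unmapped
    "r": 0x10,   # read on reverse strand
    "R": 0x20,   # mate on reverse strand
    "1": 0x40,   # first read in pair
    "2": 0x80,   # second read in pair
    "s": 0x100,  # secondary alignment
    "f": 0x200,  # failed quality checks
    "d": 0x400,  # PCR or optical duplicate
}


def string_flag_to_int(flag):
    """Convert a string flag like 'pP1' into the numeric SAM FLAG value."""
    value = 0
    for ch in flag:
        if ch not in FLAG_BITS:
            raise ValueError(f"unknown flag character: {ch!r}")
        value |= FLAG_BITS[ch]
    return value


numeric_flag = string_flag_to_int("pP1")  # paired + proper pair + first in pair
```

For the read quoted above, `pP1` corresponds to 0x1 + 0x2 + 0x40, i.e. a numeric flag of 67.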
