  • ntn12
    Junior Member
    • May 2014
    • 7

    #16
    Originally posted by ecSeq Bioinformatics View Post
    Dear ntn12,

    thanks for your comments and questions.

    segemehl itself is not a fusion-finder. It is a mapping tool that can detect split-reads and its resulting set of these split-reads can be used to call fusion genes. But it has to be done in a separate downstream analysis and is not included in the segemehl algorithm. I hope that makes things clearer.

    OK. I understand now that SEGEMEHL is not a fusion gene finder and that it has never been used as one. It has the same potential to serve as a fusion finder as, for example, BLAT, BOWTIE or BWA.

    I got confused because the authors of SEGEMEHL claim in the title of their paper:

    Hoffmann et al. A multi-split mapping algorithm for circular RNA, splicing, trans-splicing, and FUSION DETECTION, Genome Biol. 2014.



    that SEGEMEHL does FUSION DETECTION when actually it does not.


    • ntn12
      Junior Member
      • May 2014
      • 7

      #17
      Originally posted by Paul Newport View Post
      Sorry, but I don't understand the list shown on the linked page.

      My questions would be:
      1. Where do these 40 fusion genes come from?
      2. Why does only FusionCatcher find all of these?
      3. Why is this list on the FusionCatcher website?
      I do not know. We have not used FusionCatcher yet. We have been testing TopHat-fusion, FusionMap, ChimeraScan, and FusionFinder. We found it puzzling that all four report thousands of candidate fusion genes per sample (some even hundreds of thousands) when we know from the medical literature that there should not be more than 1-3 fusion genes per sample! That means at least 99% of the candidates are false positives.
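For readers who want to check the arithmetic behind that estimate, here is a minimal sketch. The candidate counts are illustrative round numbers standing in for "thousands" and "hundreds of thousands", not measured output from any of the tools, and the function name `false_positive_fraction` is made up for this example.

```python
def false_positive_fraction(n_candidates: int, max_true: int) -> float:
    """Lower bound on the fraction of candidates that must be false,
    given that at most `max_true` of them can be real fusion genes."""
    if n_candidates <= 0:
        raise ValueError("n_candidates must be positive")
    # There cannot be more true fusions than reported candidates.
    true_hits = min(max_true, n_candidates)
    return (n_candidates - true_hits) / n_candidates


# Illustrative candidate counts, assuming at most 3 real fusions per sample.
for n in (1_000, 100_000):
    print(f"{n} candidates: at least {false_positive_fraction(n, 3):.3%} false positives")
```

Under these assumptions, even the smallest candidate list is dominated by false positives, which is the point made above.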

      UPDATE: We have started testing SOAPfuse and we are starting to like it!
      Last edited by ntn12; 05-19-2014, 05:48 AM.


      • ecSeq Bioinformatics
        Senior Member
        • May 2012
        • 490

        #18
        Originally posted by ntn12 View Post
        I got confused because the authors of SEGEMEHL claim in the title of their paper:

        Hoffmann et al. A multi-split mapping algorithm for circular RNA, splicing, trans-splicing, and FUSION DETECTION, Genome Biol. 2014.



        that SEGEMEHL does FUSION DETECTION when actually it does not.
        Dear ntn12,

        please tread carefully here. The title of the paper is very clear and all of its claims are met. Before reading something into the title, you should actually read the paper. Everything is written very clearly, and all claims are supported by publicly available data.

        Nevertheless, I do not understand your frustrations here. Perhaps you should directly contact the developers of the algorithm and seek a dialogue.
        ecSeq Bioinformatics is Europe’s leading provider of hands-on bioinformatics workshops and professional data analysis in the field of Next-Generation Sequencing (NGS).


        • ntn12
          Junior Member
          • May 2014
          • 7

          #19
          Originally posted by ecSeq Bioinformatics View Post
          Dear ntn12,

          please tread carefully here. The title of the paper is very clear and all of its claims are met. Before reading something into the title, you should actually read the paper. Everything is written very clearly, and all claims are supported by publicly available data.
          I am still confused about SEGEMEHL, even after reading the paper.

          The authors of this paper:

          Hoffmann et al. A multi-split mapping algorithm for circular RNA, splicing, trans-splicing, and FUSION DETECTION, Genome Biol. 2014. http://www.ncbi.nlm.nih.gov/pubmed/24512684

          clearly state in the title and in three other places throughout their article that:

          "Here, we present a unified unbiased algorithm to detect splicing, trans-splicing and gene fusion events from single-end read data..."

          "The algorithmic strategy to identify splicing, trans-splicing or gene fusion sites is based on a greedy, score-based seed chaining followed by a Smith-Waterman-like transition alignment."

          "Implemented in the segemehl mapping tool, it readily identifies conventional splice junctions, collinear and non-collinear fusion transcripts, and trans-spliced RNAs, without the need for separate post-processing or an extensive computational overhead."


          Also, I did not find even one fusion gene or fusion transcript identified by SEGEMEHL in that article. According to the last statement, SEGEMEHL should readily identify fusion transcripts without the need for separate post-processing.

          We will use SOAPfuse for finding fusion genes because it performed really well in our tests.
          Last edited by ntn12; 05-19-2014, 06:04 AM.


          • ecSeq Bioinformatics
            Senior Member
            • May 2012
            • 490

            #20
            Dear ntn12,

            I herewith take notice of your assumption that the segemehl developers wrote some statements which are confusing for you, so you will use SOAPfuse.
            Last edited by ecSeq Bioinformatics; 05-19-2014, 06:59 AM.

            • ntn12
              Junior Member
              • May 2014
              • 7

              #21
              Originally posted by ecSeq Bioinformatics View Post
              Dear ntn12,

              I herewith take notice of your assumption that the segemehl developers wrote some statements which are confusing for you, so you will use SOAPfuse.
              That is not an assumption. It is a fact.
              Indeed the authors of "Hoffmann et al. A multi-split mapping algorithm for circular RNA, splicing, trans-splicing, and FUSION DETECTION, Genome Biol. 2014. http://www.ncbi.nlm.nih.gov/pubmed/24512684"

              clearly state in their article that:

              "Implemented in the segemehl mapping tool, it readily identifies conventional splice junctions, collinear and non-collinear fusion transcripts, and trans-spliced RNAs, without the need for separate post-processing or an extensive computational overhead."

              I did not write that. The authors wrote that! Anybody can check this! Please, check here:


              Originally posted by ecSeq Bioinformatics View Post
              I herewith take notice of your assumption that the segemehl developers wrote some statements which are confusing for you, so you will use SOAPfuse.
              I am not the only one who got confused about SEGEMEHL. There are at least two others who are confused about SEGEMEHL and finding fusion genes here:
              https://www.biostars.org/p/45986/
              Last edited by ntn12; 05-19-2014, 06:54 AM.


              • Paul Newport
                Member
                • May 2014
                • 10

                #22
                Originally posted by ntn12 View Post
                I am not the only one who got confused about SEGEMEHL. There are at least two others who are confused about SEGEMEHL and finding fusion genes here:
                https://www.biostars.org/p/45986/
                Oh, please! Give me a break! Same statements, same time stamp! Too obvious, man!


                • ntn12
                  Junior Member
                  • May 2014
                  • 7

                  #23
                  Originally posted by Paul Newport View Post
                  Oh, please! Give me a break! Same statements, same time stamp! Too obvious, man!
                  ???


                  • ecSeq Bioinformatics
                    Senior Member
                    • May 2012
                    • 490

                    #24
                    As already mentioned before in this thread:

                    If any of you are interested in learning how to use segemehl to detect fusion transcripts and/or circularized RNAs, I can recommend the following hands-on course:

                    Discovering standard and non-standard RNA transcripts - How to detect canonical splicing, circular RNAs, trans-splicing, and fusion transcripts

                    Developers of the algorithm will explain, step by step, how you can use segemehl to detect standard and non-standard transcripts. They will make sure that all of you understand the difference between 'fusion-junctions' and 'fusion-genes' and what exactly you can do with segemehl and its downstream analysis tools (like lack or haarz). You will understand the implications of splicing or fusion events and the concept of split-reads, learn how to detect splice sites using split-read information, and in the end be able to find circularized RNAs or fusion transcripts.
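To make the 'fusion-junction' versus 'fusion-gene' distinction concrete, here is a toy sketch. It is not how segemehl, lack or haarz work internally; the gene annotation, coordinates, read data and the `min_support` threshold are all hypothetical. A junction is what a single split read reports (two segments mapped to different places); a fusion-gene candidate only appears once several junctions connecting the same pair of annotated genes are aggregated in a downstream step.

```python
from collections import Counter
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    """One mapped part of a split read."""
    chrom: str
    start: int
    end: int


# Hypothetical gene annotation: name -> (chromosome, start, end).
GENES = {
    "GENE_A": ("chr1", 1_000, 5_000),
    "GENE_B": ("chr2", 20_000, 30_000),
}


def gene_at(seg: Segment) -> Optional[str]:
    """Return the annotated gene that fully contains the segment, if any."""
    for name, (chrom, start, end) in GENES.items():
        if seg.chrom == chrom and start <= seg.start and seg.end <= end:
            return name
    return None


def candidate_fusions(split_reads, min_support=2):
    """Aggregate junctions (split reads) into gene-pair candidates.

    A gene pair is kept only if at least `min_support` split reads join
    two different annotated genes.
    """
    support = Counter()
    for left, right in split_reads:
        g1, g2 = gene_at(left), gene_at(right)
        if g1 and g2 and g1 != g2:
            support[(g1, g2)] += 1
    return {pair: n for pair, n in support.items() if n >= min_support}


# Made-up split reads: two join GENE_A to GENE_B, one is an ordinary
# splice within GENE_A, and one lands in unannotated sequence.
reads = [
    (Segment("chr1", 1_200, 1_250), Segment("chr2", 21_000, 21_050)),
    (Segment("chr1", 1_210, 1_250), Segment("chr2", 21_000, 21_060)),
    (Segment("chr1", 1_200, 1_250), Segment("chr1", 3_000, 3_050)),
    (Segment("chr1", 1_200, 1_250), Segment("chr3", 500, 550)),
]
print(candidate_fusions(reads))
```

A real analysis would also have to handle strand, breakpoint clustering, paralogs and read-through transcripts, which this sketch ignores.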

                    The cool thing about this course: you will not just use (and trust) a tool with pre-defined parameters (like SOAPfuse, etc.), but understand everything from scratch!
                    Last edited by ecSeq Bioinformatics; 05-20-2014, 12:14 AM.

                    • NKAkers
                      Member
                      • Sep 2011
                      • 26

                      #25
                      I'm interested in giving segemehl a shot, but so far it's taking prohibitively long to run. In my cluster-computer environment I reserved 60 nodes for 24 hours to run:

                      segemehl.x -q 8Gb_single_end.fastq -t 60 -d chromosome1.fa -i chr1.idx -S -s -o chr1.sam

                      It ran for over 24 hours without completing. No errors were reported, and it did create a SAM file, although an incomplete one. Do you have any tips to make the software run more quickly?


                      • ecSeq Bioinformatics
                        Senior Member
                        • May 2012
                        • 490

                        #26
                        Originally posted by NKAkers View Post
                        I'm interested in giving segemehl a shot, but so far it's taking prohibitively long to run. In my cluster-computer environment I reserved 60 nodes for 24 hours to run:

                        segemehl.x -q 8Gb_single_end.fastq -t 60 -d chromosome1.fa -i chr1.idx -S -s -o chr1.sam

                        It ran for over 24 hours without completing. No errors were reported, and it did create a SAM file, although an incomplete one. Do you have any tips to make the software run more quickly?
                        This extremely long runtime is probably due to the common mapping strategy of RNA aligners, which first attempt to map reads contiguously (i.e. without splits) and then apply a more expensive split-read mapping strategy to the reads that remain unmapped. Because you are mapping your data to only one chromosome instead of the entire genome, most of your reads cannot be mapped contiguously and are passed on to split-read mapping instead, which results in this huge runtime.

                        We therefore recommend using the entire genome as the database. This should give a faster runtime and also more reliable hits, since by default segemehl reports only the best ones.
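As a rough illustration of that recommendation, the sketch below only assembles and prints the two command lines; it does not run them. The mapping flags are copied from the command in the question. The `-x` flag for building the index is taken from the segemehl manual as I remember it, so please check it against your installed version. The file names `genome.fa`, `genome.idx` and `genome.sam`, the thread count of 16, and the helper name `segemehl_commands` are assumptions made for this example.

```python
import shlex


def segemehl_commands(genome_fa, index, reads, out_sam, threads):
    """Build argument lists for indexing a whole genome and mapping to it."""
    # Index the complete genome once (assumed flag: -x writes the index).
    build = ["segemehl.x", "-x", index, "-d", genome_fa]
    # Map with the same options as in the question, but against the
    # whole-genome FASTA and index instead of a single chromosome.
    mapping = [
        "segemehl.x",
        "-q", reads,
        "-t", str(threads),
        "-d", genome_fa,
        "-i", index,
        "-S", "-s",
        "-o", out_sam,
    ]
    return build, mapping


build, mapping = segemehl_commands(
    "genome.fa", "genome.idx", "8Gb_single_end.fastq", "genome.sam", 16
)
print(shlex.join(build))
print(shlex.join(mapping))
```

The printed lines can be pasted into a job script. The thread count is best matched to the cores actually available to the job.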

                        • ninni
                          Junior Member
                          • Jun 2012
                          • 8

                          #27
                          Originally posted by ntn12 View Post
                          I do not know. We have not used FusionCatcher yet. We have been testing TopHat-fusion, FusionMap, ChimeraScan, and FusionFinder. We found it puzzling that all four report thousands of candidate fusion genes per sample (some even hundreds of thousands) when we know from the medical literature that there should not be more than 1-3 fusion genes per sample! That means at least 99% of the candidates are false positives.

                          UPDATE: We have started testing SOAPfuse and we are starting to like it!
                          Hi!
                          Is it possible to use SOAPfuse with hg38? If so, how would I do this? I am a bit lost.

                          Thanks in advance!
