
  • Best pipeline for DE and fusion gene using tophat?

    I have RNA-seq data on which I want to run differential expression (DE) analysis and find fusion genes. I care more about accuracy than speed. My reads are about 78 bp long. Can you tell me which of the pipelines below is better? Or is there an even better way to do this?

    1) bowtie2 only pipeline
    ===============

    1. Run tophat2 --fusion-search with bowtie2 for two samples.

    2. Run tophat-fusion-post on the two outputs for fusion genes.

    3. Run cuffdiff on the two outputs for DE.

    2) bowtie2 for DE and bowtie1 for fusion
    =========================

    1. Run tophat2 --fusion-search --bowtie1 for two samples

    2. Run tophat-fusion-post on the two outputs for fusion genes.

    3. Run tophat2 with bowtie2 for two samples.

    4. Run cuffdiff for the two outputs.

    Which pipeline will provide more accurate results? Thanks!
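The two options above can be written out as explicit command lists, which makes the difference between them easier to compare. This is only a dry-run sketch: the index paths, GTF name, read files and `tophat_*`/`de_*` output directory names are hypothetical, and which index `tophat-fusion-post` expects should be checked against the manual for your TopHat version. Only flags named in the post (`--fusion-search`, `--bowtie1`) plus `-o` are used.

```python
from typing import Dict, List


def tophat_cmd(index: str, reads: List[str], out_dir: str,
               fusion: bool = False, bowtie1: bool = False) -> List[str]:
    """Build a tophat2 command line using only the flags named in the post."""
    cmd = ["tophat2", "-o", out_dir]
    if fusion:
        cmd.append("--fusion-search")
    if bowtie1:
        cmd.append("--bowtie1")
    return cmd + [index] + reads


def cuffdiff_cmd(gtf: str, tophat_dirs: List[str],
                 out_dir: str = "cuffdiff_out") -> List[str]:
    """cuffdiff takes an annotation GTF followed by one alignment file per sample."""
    bams = [f"{d}/accepted_hits.bam" for d in tophat_dirs]
    return ["cuffdiff", "-o", out_dir, gtf] + bams


def build_pipeline(option: int, samples: Dict[str, List[str]],
                   bt2_index: str, bt1_index: str, gtf: str) -> List[List[str]]:
    """Return the commands for option 1 or option 2, in the order listed in the post."""
    if option not in (1, 2):
        raise ValueError("option must be 1 or 2")
    use_bt1 = option == 2
    fusion_index = bt1_index if use_bt1 else bt2_index
    cmds: List[List[str]] = []

    # Fusion-search alignments. The tophat_ prefix is an assumption: tophat-fusion-post
    # is described as scanning the working directory for tophat_* folders.
    fusion_dirs = []
    for name, reads in samples.items():
        out = f"tophat_{name}"
        cmds.append(tophat_cmd(fusion_index, reads, out, fusion=True, bowtie1=use_bt1))
        fusion_dirs.append(out)
    cmds.append(["tophat-fusion-post", fusion_index])  # index choice is an assumption

    if option == 1:
        # Option 1 reuses the fusion-search alignments for DE.
        de_dirs = fusion_dirs
    else:
        # Option 2 realigns with bowtie2 (no fusion search) for DE.
        de_dirs = []
        for name, reads in samples.items():
            out = f"de_{name}"
            cmds.append(tophat_cmd(bt2_index, reads, out))
            de_dirs.append(out)
    cmds.append(cuffdiff_cmd(gtf, de_dirs))
    return cmds


if __name__ == "__main__":
    demo = {"tumor": ["t_1.fq", "t_2.fq"], "normal": ["n_1.fq", "n_2.fq"]}
    for cmd in build_pipeline(2, demo, "bt2/hg19", "bt1/hg19", "genes.gtf"):
        print(" ".join(cmd))
```

Printing the commands before running them shows that option 2 costs two extra alignment runs; whether that buys accuracy is the open question of this thread.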

  • #2
    Options 3 and 4 have been left out:

    3) use SOAPfuse http://soap.genomics.org.cn/soapfuse.html for finding fusion genes

    or

    4) use FusionCatcher https://code.google.com/p/fusioncatcher/ for finding fusion genes

    TopHat-Fusion has a fairly high false-positive rate for fusion genes compared with other fusion finders (for example SOAPfuse).
    I would say the answer to your question is option 3 or option 4.


    (In reply to the original post by ymc.)


    • #3
      Dear Senior Member,
      I am asking for help with TopHat-Fusion. It ran successfully with hg19, but I got an error with mm10.
      Q1. BLAST Database error: No alias or index file found for nucleotide database [blast/nt] in search path
      Q2. I continued anyway. tophat-fusion-post took much longer than normal and reported fusions, but the other samples give 0 fusions within 10 seconds. Why?
      I have also posted here: http://seqanswers.com/forums/showthr...613#post112613
      I would really appreciate an answer.


      (In reply to the original post by ymc.)


      • #4
        (In reply to jp., post #3.)
        For your Q1, did you download mouse blast db file?



        • #5
          I downloaded BLAST from ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/ as follows:
          1. Downloaded ncbi-blast-2.2.28+
          2. Extracted est_mouse.tar.gz and mouse_genomic_transcript.tar.gz into ncbi-blast-2.2.28+/ and ncbi-blast-2.2.28+/bin
          I am not sure about other_genomic* and nt*.
          I added bowtie1, tophat2, blast and samtools to my PATH.
          I have all the Mus_musculus RefSeq index files from UCSC and Ensembl.
          Am I missing something?
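One quick way to rule out a PATH problem is to check that every tool resolves to an executable. A minimal sketch follows; the executable names are assumptions (`bowtie` as the usual Bowtie 1 binary, `blastn` from the BLAST+ package), so adjust them to match your installation.

```python
import shutil
from typing import Iterable, List

# Assumed executable names; change these if your installation uses different ones.
DEFAULT_TOOLS = ("bowtie", "tophat2", "tophat-fusion-post", "blastn", "samtools")


def missing_tools(tools: Iterable[str] = DEFAULT_TOOLS) -> List[str]:
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All tools found on PATH.")
```

Note that this only confirms the programs are reachable. It says nothing about whether the BLAST database files are where the pipeline expects them.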

          (In reply to ymc, post #4.)


          • #6
            I don't see any mouse-related files here:

            ftp://ftp.ncbi.nlm.nih.gov/blast/db/

            The manual doesn't say whether other organisms need these files. It only says you need to add the --non-human option.



            • #7
              Yes, I used the --non-human option, but my fusion results are the same with or without it.
              I used the recommended reference files.
              1. The error was still there, but when I continued despite it, I got fusions.
              2. For the other samples, there is no error but 0 fusions.

              Do you have any idea what might be going on?


              (In reply to ymc, post #6.)


              • #8
                (In reply to jp., post #7.)
                I am not working with non-human data, so I am afraid I am not the right person to ask. Maybe you should email the TopHat authors directly, or try other tools that have better documentation for non-human applications.

                Good luck!



                • #9
                  I am curious: why do you need to find fusion genes in mouse? Are you studying a mouse-specific cancer?

                  If you are working on a human cancer xenograft, then I suppose the RNA-seq data you generated is still human, not mouse.



                  • #10
                    These are from cancer models, so I need to do it.
                    What do you think is causing the problem with blast/nt?

                    (In reply to ymc, post #9.)


                    • #11
                      Did you download all these files?

                      nt.00.nhd nt.02.nhr nt.04.nnd nt.06.nog nt.08.nsi nt.11.nhd nt.13.nhr
                      nt.00.nhi nt.02.nin nt.04.nni nt.06.nsd nt.08.nsq nt.11.nhi nt.13.nin
                      nt.00.nhr nt.02.nnd nt.04.nog nt.06.nsi nt.09.nhd nt.11.nhr nt.13.nnd
                      nt.00.nin nt.02.nni nt.04.nsd nt.06.nsq nt.09.nhi nt.11.nin nt.13.nni
                      nt.00.nnd nt.02.nog nt.04.nsi nt.07.nhd nt.09.nhr nt.11.nnd nt.13.nog
                      nt.00.nni nt.02.nsd nt.04.nsq nt.07.nhi nt.09.nin nt.11.nni nt.13.nsd
                      nt.00.nog nt.02.nsi nt.05.nhd nt.07.nhr nt.09.nnd nt.11.nog nt.13.nsi
                      nt.00.nsd nt.02.nsq nt.05.nhi nt.07.nin nt.09.nni nt.11.nsd nt.13.nsq
                      nt.00.nsi nt.03.nhd nt.05.nhr nt.07.nnd nt.09.nog nt.11.nsi nt.14.nhd
                      nt.00.nsq nt.03.nhi nt.05.nin nt.07.nni nt.09.nsd nt.11.nsq nt.14.nhi
                      nt.01.nhd nt.03.nhr nt.05.nnd nt.07.nog nt.09.nsi nt.12.nhd nt.14.nhr
                      nt.01.nhi nt.03.nin nt.05.nni nt.07.nsd nt.09.nsq nt.12.nhi nt.14.nin
                      nt.01.nhr nt.03.nnd nt.05.nog nt.07.nsi nt.10.nhd nt.12.nhr nt.14.nnd
                      nt.01.nin nt.03.nni nt.05.nsd nt.07.nsq nt.10.nhi nt.12.nin nt.14.nni
                      nt.01.nnd nt.03.nog nt.05.nsi nt.08.nhd nt.10.nhr nt.12.nnd nt.14.nog
                      nt.01.nni nt.03.nsd nt.05.nsq nt.08.nhi nt.10.nin nt.12.nni nt.14.nsd
                      nt.01.nog nt.03.nsi nt.06.nhd nt.08.nhr nt.10.nnd nt.12.nog nt.14.nsi
                      nt.01.nsd nt.03.nsq nt.06.nhi nt.08.nin nt.10.nni nt.12.nsd nt.14.nsq
                      nt.01.nsi nt.04.nhd nt.06.nhr nt.08.nnd nt.10.nog nt.12.nsi nt.nal
                      nt.01.nsq nt.04.nhi nt.06.nin nt.08.nni nt.10.nsd nt.12.nsq
                      nt.02.nhd nt.04.nhr nt.06.nnd nt.08.nog nt.10.nsi nt.13.nhd
                      nt.02.nhi nt.04.nin nt.06.nni nt.08.nsd nt.10.nsq nt.13.nhi
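With this many files, a partial download is easy to miss. The sketch below groups a directory listing by volume (`nt.00`, `nt.01`, ...) and reports any volume lacking one of the three core nucleotide database files: `.nhr` (headers), `.nin` (index) and `.nsq` (sequences). The function name is made up for illustration, and it ignores the auxiliary files (`.nhd`, `.nhi`, `.nnd`, `.nni`, `.nog`, `.nsd`, `.nsi`) on the assumption that they are not strictly required.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set

# Core files of one nucleotide BLAST database volume: header, index, sequence.
CORE_EXTS = {"nhr", "nin", "nsq"}


def incomplete_volumes(filenames: Iterable[str], db: str = "nt") -> Dict[str, List[str]]:
    """Map each volume (e.g. 'nt.01') to the core extensions it is missing."""
    seen: Dict[str, Set[str]] = defaultdict(set)
    for name in filenames:
        parts = name.split(".")
        # Volume files look like nt.00.nhr; the alias file nt.nal has only two parts.
        if len(parts) == 3 and parts[0] == db:
            seen[f"{parts[0]}.{parts[1]}"].add(parts[2])
    return {
        volume: sorted(CORE_EXTS - exts)
        for volume, exts in sorted(seen.items())
        if CORE_EXTS - exts
    }
```

For a real check, pass it the directory contents, for example `incomplete_volumes(os.listdir("blast"))`, where `blast` is whatever directory holds the downloaded files. An empty result means every volume that is present has its core files.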



                      • #12
                        Oh, no. I didn't, because I thought they were not related to mouse.
                        Do I need these for mouse?
                        I am downloading them now. Please reply if these are the ones required for my analysis (I will then add them to ./ncbi-blast-2.2.28+/bin).
                        Thank you very much for your guidance.

                        (In reply to ymc, post #11.)


                        • #13
                          (In reply to jp., post #12.)
                          I am not sure, but your error message says you don't have these files. Good luck!
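The error text itself hints at what BLAST is looking for: an alias file (`nt.nal`) or an index file (`nt.nin`) for a database called `blast/nt`, somewhere in its search path. The sketch below mimics that lookup so you can see which directory, if any, would satisfy it. Treating the search path as the current working directory plus the `BLASTDB` environment variable is an assumption here (BLAST can also read a configuration file), and the function names are made up for illustration.

```python
import os
from pathlib import Path
from typing import Iterable, List, Optional


def locate_nucleotide_db(db: str, search_dirs: Iterable[str]) -> Optional[Path]:
    """Return the first alias (.nal) or index (.nin) file found for db, else None."""
    for directory in search_dirs:
        base = Path(directory) / db
        for ext in (".nal", ".nin"):
            candidate = base.parent / (base.name + ext)
            if candidate.is_file():
                return candidate
    return None


def default_search_dirs() -> List[str]:
    """Assumed search path: the working directory, then any BLASTDB entries."""
    dirs = [os.getcwd()]
    dirs += [p for p in os.environ.get("BLASTDB", "").split(os.pathsep) if p]
    return dirs


if __name__ == "__main__":
    hit = locate_nucleotide_db("blast/nt", default_search_dirs())
    print(hit if hit else "No alias or index file found for blast/nt")
```

If this prints nothing useful when run from the directory where tophat-fusion-post is launched, the nt files are probably in a location the pipeline does not look at, such as the BLAST `bin` directory.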



                          • #14
                            Well, I finally tried soapfuse-1.26 on my tumor-normal RNA-seq data, which has 100 million PE reads each.

                            SOAPfuse ran for 7 days and crashed at step 8. At one point, the program was using 300 GB of my disk space. In contrast, TopHat generated only about 80 GB of temp files and, of course, ran to completion.

                            Therefore, I think I have to give up on SOAPfuse.

                            I will give FusionCatcher a try later...



                            • #15
                              For anyone who is interested, my SOAPfuse error at s08 is:

                              RILP-004/503/SCARF1-001/152 encounters wrong start codon: CTG

                              That doesn't seem like something I know how to fix.
