  • Looking for software tools similar to Vmatch

    Hello All,

    Maybe this is a redundant question and has been asked previously, but I could not find a suitable post/thread addressing it. So I am writing a new one.

    I want to cluster ESTs and long read (>100 bp) data into unique transcripts. I came across a template protocol for doing so on the PlantGDB website, which uses Vmatch. Due to budget cuts in my organization, I am unable to use the licensed version of Vmatch. Therefore I am looking for alternatives to Vmatch. I found an article on ClustDB, but I am getting a "page not found" message.

    Can anyone suggest some alternatives to Vmatch that I can try? I would like something that is command-line driven, so that I can automate the process.

    Best Wishes
    Abhijit

  • #2
    Try using EST/DNA assemblers to do de novo cDNA assembly.

    I assume that you would like to make a good de novo assembly of the cDNA dataset.

    So do the following:
    1. Start with the longest reads possible (for Illumina, use 2x250 reads), or use PacBio Iso-Seq.
    2. Merge overlapping read pairs with FLASH (if using Illumina paired-end data).
    3. Subsample (start with 100K reads), assemble, and curate. For the next round, eliminate ("vector" screen) reads matching the curated contigs from the previous round(s), then repeat step 3 with 10X more data.

    Once you have processed all the data or reached saturation, create a reference from all steps combined and map the reads to it.

    This approach can cut the computational resources (RAM) and time required by orders of magnitude, and it usually works quite well with most DNA assembly programs (increase the minmatch/k-mer length to 31 or more if it is in the 12-22 bp range).
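The iterative scheme described above can be sketched roughly in Python. This is only an illustration under stated assumptions: `assemble` is a hypothetical placeholder for whichever assembler you run (plus any manual curation), and the "vector" screen is approximated by dropping reads that share an exact k-mer with a curated contig, forward strand only. A real pipeline would more likely use an aligner or a dedicated screening tool for that step.

```python
import random
from typing import Callable, Iterable, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all k-mers in seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def screen_reads(reads: Iterable[str], reference: List[str], k: int) -> List[str]:
    """Keep only reads that share no exact k-mer with any curated contig.

    Simplification: only the forward strand is checked.
    """
    ref_kmers: Set[str] = set()
    for contig in reference:
        ref_kmers |= kmers(contig, k)
    return [r for r in reads if not (kmers(r, k) & ref_kmers)]


def iterative_assembly(
    reads: Iterable[str],
    assemble: Callable[[List[str]], List[str]],  # hypothetical assembler wrapper
    start: int = 100_000,  # first subsample size (100K reads, as in the post)
    factor: int = 10,      # 10X more data each round
    k: int = 31,           # k-mer length used for screening
    seed: int = 0,
) -> List[str]:
    """Subsample, assemble, screen out explained reads, repeat with more data."""
    rng = random.Random(seed)
    remaining = list(reads)
    reference: List[str] = []
    sample_size = start
    while remaining:
        sample = rng.sample(remaining, min(sample_size, len(remaining)))
        contigs = assemble(sample)
        if not contigs:
            break  # nothing new assembled: treat as saturation
        reference.extend(contigs)
        screened = screen_reads(remaining, reference, k)
        if len(screened) == len(remaining):
            break  # screening removed nothing: stop rather than loop forever
        remaining = screened
        sample_size *= factor
    return reference
```

The returned list plays the role of the combined reference from all rounds, to which the full read set would then be mapped with a mapper of your choice.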


    • #3
      @Abhijit: Take a look at CD-HIT.


      • #4
        @Markiyan: Will any EST assembler work? I was thinking of IDBA or SOAPdenovo. Are there any you would suggest?

        @Genomax: How does CD-HIT-EST compare to the other RNA-Seq assemblers?


        • #5
          Originally posted by gen2prot
          @Genomax: How does CD-Hit-EST compare to any of the other RNA-Seq assemblers?
          That suggestion was not for assembly. I thought you just wanted to cluster ESTs you already have. You could cluster reads, but that would not be efficient.

          If you are looking to deduplicate your data, then bbduk.sh/dedupe.sh from the BBMap suite may be options. See this thread.
          Last edited by GenoMax; 06-08-2016, 08:45 AM.
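To illustrate what deduplication means here, the sketch below removes exact duplicate sequences in Python, treating a sequence and its reverse complement as the same. It is not how dedupe.sh is implemented, only a minimal illustration of the idea; the function names are made up for this example.

```python
from typing import Iterable, List

# Translation table for complementing DNA (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def dedupe_exact(seqs: Iterable[str]) -> List[str]:
    """Keep the first occurrence of each sequence, ignoring case and strand."""
    seen = set()
    unique = []
    for s in seqs:
        upper = s.upper()
        # Canonical key: the lexicographically smaller of the two strands.
        key = min(upper, reverse_complement(upper))
        if key not in seen:
            seen.add(key)
            unique.append(s)
    return unique
```

For example, `dedupe_exact(["AAAC", "GTTT"])` keeps only the first entry, because `GTTT` is the reverse complement of `AAAC`.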


          • #6
            Will any EST assembler work? - Yes, any assembler should work.

            @Markiyan: Will any EST assembler work? I was thinking of IDBA or SOAP-de novo. Any you would suggest?

            Using the "divide and conquer" approach described above, you can work with any DNA/RNA sequence assembler, even one that was not originally intended for EST assembly by its authors. For example, in 2009 I used PHRAP on ESTs from a few molluscs, sequenced from cDNA with 454 Titanium, and after 4 iterations got far better results than Newbler 2.0 in cDNA mode.

            But it is better to use the tools specifically designed for EST assembly - so feel free to try any assemblers you like, starting from a smaller subset of the reads.

            You can try MIRA, or Newbler in DNA mode. Ideally, you want to be able to check your assembly results in Consed or a similar assembly editor.

            You also want your EST reads to be as long as reasonably possible to simplify the assembly (PacBio Iso-Seq or Illumina in 2x250 mode would be my tools of choice).
