  • perencia
    Junior Member
    • Jun 2010
    • 6

    Large RNA sequences? Does it make any sense?

    Hi!

    First, I'm a computer scientist who has recently started exploring the bioinformatics field, so please forgive me if I say something really stupid.

    Basically, I'm studying the possibility of implementing the Nussinov-Jacobson algorithm on GPUs, speeding it up by orders of magnitude if possible. For that to pay off, the RNA sequence has to be very large. I was wondering whether this makes sense, since most RNA sequences I've seen are about 200 bases long.

    Thanks!
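For readers unfamiliar with the algorithm mentioned above, here is a minimal CPU sketch of Nussinov-style base-pair maximization, not the poster's implementation. The function name `nussinov`, the `min_loop` parameter, and the choice to allow G-U wobble pairs are assumptions made for illustration. The table is filled by increasing span, so every cell within one span depends only on shorter spans; that independence is the usual place to look for GPU parallelism.

```python
def nussinov(seq, min_loop=3):
    """Return the maximum number of nested base pairs in an RNA sequence.

    min_loop is the minimum number of unpaired bases enclosed by a pair
    (an illustrative assumption; the classic formulation often uses 0).
    """
    # Watson-Crick pairs plus G-U wobble pairs (assumed scoring: 1 per pair).
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j], inclusive.
    dp = [[0] * n for _ in range(n)]
    # Outer loop over span length: cells with the same span are independent
    # of each other, so they could be computed in parallel.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with some base k, splitting the interval
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    print(nussinov("GGGAAAUCC"))
```

The three nested loops give cubic running time and the table needs quadratic memory, which is why sequence length matters so much for any speedup.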
  • raela
    Member
    • Apr 2010
    • 39

    #2
    How long do you mean by 'very large'? It depends on the sequencing technology used and the length ordered. Even 200 is somewhat in the 'long' range for NGS (I believe).


    • epigen
      Senior Member
      • May 2010
      • 101

      #3
      What kind of RNA do you mean? There are entire RNA genomes, for example in some viruses. And normal mRNAs are some 100s to 1000s of nucleotides long.
      By quickly googling Nussinov-Jacobson I learned that you can do RNA folding prediction with it. That only makes sense for small RNAs.


      • krobison
        Senior Member
        • Nov 2007
        • 734

        #4
        454 has 400+ base reads & PacBio is promising reads that long or much longer.

        Folding of longer RNAs could be interesting, as secondary structure is sometimes involved in the stability, localization or utilization of an RNA.

        It's a niche, but that doesn't mean it isn't interesting.


        • mrawlins
          Member
          • Apr 2010
          • 63

          #5
          Many RNAs are long, but the current sequencing technologies fragment them prior to sequencing, since they perform better on shorter sequences. SOLiD works up to 50 bp, Illumina works up to about 100 bp, and 454 can get a few hundred bp. If you want something larger you'll have to piece together multiple reads into a longer consensus sequence.
          It seems to me, though, that if you need a longer consensus sequence you could just use the complement of the genomic sequence (which is the RNA sequence) for some interesting genes. If your goal is to demonstrate an algorithmic speedup using a GPU-based approach it seems that it wouldn't be important to have cutting-edge RNA data, but it would be better to use a well studied RNA (like ribosomal RNA or tRNA) for your comparison.
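As a rough illustration of deriving a test RNA from a genomic DNA sequence, the sketch below covers both strand cases. The function names are hypothetical, and which strand applies to a given gene has to come from its annotation. The RNA matches the coding strand with T replaced by U, which is the same as the reverse complement of the template strand.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def rna_from_coding_strand(dna):
    """RNA has the same sequence as the coding strand, with U instead of T."""
    return dna.upper().replace("T", "U")


def rna_from_template_strand(dna):
    """RNA is the reverse complement of the template strand, with U for T."""
    reverse_complement = dna.upper().translate(COMPLEMENT)[::-1]
    return reverse_complement.replace("T", "U")


if __name__ == "__main__":
    print(rna_from_coding_strand("ATGC"))
    print(rna_from_template_strand("GCAT"))
```

Both calls in the usage example describe the same double-stranded fragment, so they yield the same RNA.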


          • perencia
            Junior Member
            • Jun 2010
            • 6

            #6
            Ok.

            And how many nucleotides can a ribosomal RNA or tRNA have?


            • mrawlins
              Member
              • Apr 2010
              • 63

              #7
              In Shewanella the longest ribosomal sequence is about 2900 bases long. Human ribosome sequences may be a bit longer. The tRNAs (in Shewanella) are about 76 bases long. I seem to recall that tRNAs have some of the best documented secondary structure, though, so while they may not make a good test of your algorithm's speed, they might be a good test of accuracy.
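To put those lengths in perspective for a speed test, here is a back-of-the-envelope sketch. It assumes a folding algorithm with cubic time and quadratic memory in the sequence length, as for the Nussinov recurrence; the function names and the 4-byte cell size are illustrative assumptions.

```python
def relative_cost(n, reference_n):
    """Ratio of cubic-time work for length n versus length reference_n."""
    return (n / reference_n) ** 3


def dp_table_bytes(n, bytes_per_cell=4):
    """Memory for a full n x n table (assumed 4-byte integer cells)."""
    return n * n * bytes_per_cell


if __name__ == "__main__":
    # Lengths quoted in the post: ~2900 nt rRNA versus ~76 nt tRNA.
    print(f"work ratio: {relative_cost(2900, 76):,.0f}x")
    print(f"table size: {dp_table_bytes(2900) / 1e6:.1f} MB")
```

Under these assumptions the rRNA needs roughly 55,000 times the work of the tRNA, so the tRNA finishes too quickly to say much about speed.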


              • perencia
                Junior Member
                • Jun 2010
                • 6

                #8
                Thanks!

                I wonder, what about searching for structures across an entire genome?


                • maubp
                  Peter (Biopython etc)
                  • Jul 2009
                  • 1544

                  #9
                  Originally posted by perencia
                  I wonder, what about searching for structures across an entire genome?
                  Only some viruses have an RNA genome (most organisms use DNA), and their genomes tend not to be very big (not big enough to worry about GPU optimisations I would guess).


                  • Bruins
                    Member
                    • Feb 2010
                    • 78

                    #10
                    Have you had a chance to take a look at Rfam and Sean Eddy's Infernal?


                    • perencia
                      Junior Member
                      • Jun 2010
                      • 6

                      #11
                      Originally posted by Bruins
                      Have you had a chance to take a look at Rfam and Sean Eddy's Infernal?
                      No, but I'll look for them now.

                      I've been searching a little more, and found this report



                      and this implementation



                      The former is a GPU implementation of the UNAFold algorithm
                      ( http://mfold.bioinfo.rpi.edu/ )

                      It seems that GPU optimisation may have a place in structure prediction for multiple RNAs, as in the first report (a set of 11 picornaviral sequences, 7124 to 8214 nucleotides).
                      I'll post what I find.

                      Anyway, I'd like to know what the benefits of such procedures are and where they have an impact. Bioinformatics is a large field, I guess.
                      Last edited by perencia; 07-29-2010, 07:12 AM.

