  • adaptivegenome
    Super Moderator
    • Nov 2009
    • 436

    FastG format?

    I saw some posters at Biology of Genomes this year mentioning the new FastG format for assemblers. I was wondering if anyone has heard about this and if a spec was available yet.
    Last edited by adaptivegenome; 06-12-2012, 12:38 PM. Reason: typo
  • maubp
    Peter (Biopython etc)
    • Jul 2009
    • 1544

    #2
    It is being discussed for the next Assemblathon competition, and the mailing list was recently made public. See:
    An offshoot of the Genome 10K project, and primarily organized by the UC Davis Genome Center, Assemblathons are contests to assess state-of-the-art methods in the field of genome assembly....

    • adaptivegenome
      Super Moderator
      • Nov 2009
      • 436

      #3
      Thanks, this is great. I think the format might also be useful in mapping. However, I do realize that at some point we won't be mapping to a reference anymore.

      • nilshomer
        Nils Homer
        • Nov 2008
        • 1283

        #4
        Originally posted by genericforms
        Thanks, this is great. I think the format might also be useful in mapping. However, I do realize that at some point we won't be mapping to a reference anymore.
        It will be quite difficult to adapt the FM-index (BWT) based aligners. My prediction would be full-on assembly being the norm in about 2 years.

        • adaptivegenome
          Super Moderator
          • Nov 2009
          • 436

          #5
          Originally posted by nilshomer
          It will be quite difficult to adapt the FM-index (BWT) based aligners. My prediction would be full-on assembly being the norm in about 2 years.
          You would know better than me how fast the technology is progressing, so I can't say for sure that it would be worth it, but I think FastG could be useful in specifying alternate reference sequences during mapping. I am not sure it would require significant alteration to existing methods.

          • nilshomer
            Nils Homer
            • Nov 2008
            • 1283

            #6
            Originally posted by genericforms
            You would know better than me how fast the technology is progressing, so I can't say for sure that it would be worth it, but I think FastG could be useful in specifying alternate reference sequences during mapping. I am not sure it would require significant alteration to existing methods.
            I am not saying that mapping to FastG is impossible; I am asserting that FM-indexes are not (yet) suitable for multiple haplotypes in the same reference sequence.
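
For readers unfamiliar with the FM-index point, here is a minimal sketch of backward search over a single linear string, assuming a toy-sized text and a naive suffix array. The names `bwt_index` and `count_and_locate` are made up for illustration and are not taken from BWA or any other aligner. The sketch only shows that a plain FM-index is built over one linear text, which is why alternative haplotypes do not fit in without extra work.

```python
def bwt_index(text):
    """Build suffix array, C table and Occ table for text (a '$' sentinel is appended)."""
    text += "$"
    # Naive suffix array: fine for a toy example, far too slow for a genome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT character of a row is the character preceding that suffix.
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # c_table[ch] = number of characters in text strictly smaller than ch.
    c_table, total = {}, 0
    for ch in alphabet:
        c_table[ch] = total
        total += text.count(ch)
    # occ[ch][i] = number of occurrences of ch in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}
    for i, b in enumerate(bwt):
        for ch in alphabet:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return sa, c_table, occ


def count_and_locate(pattern, sa, c_table, occ):
    """Backward search: return the sorted start positions of pattern in the text."""
    lo, hi = 0, len(sa)  # half-open interval of suffix-array rows
    for ch in reversed(pattern):
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))
```

With `sa, c, occ = bwt_index("ACGTACGA")`, calling `count_and_locate("ACG", sa, c, occ)` returns `[0, 4]`. Every hit is a position in that one string; a second haplotype would have to be indexed as a separate sequence or handled by an extension of the index.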

            • lh3
              Senior Member
              • Feb 2008
              • 686

              #7
              The reference genome will still be relevant even if we could get a very good assembly. After all, the annotations are in reference coordinates. To annotate a new assembly, we need to map the assembly to the reference genome.

              We all wish to map data to a graph, but few have a clear definition of the problem, let alone the solution. Adopting graph alignment is likely to take longer than we hope. For now, my vague vision is that a graph alone is not enough; we also need the alignment between the graph and the reference.

              As to FastG, you can read from the archive that I worry a little about its scope (final scaffold only or generic sequence graph?), technical complexity (simpler and easier to parse format?) and mathematical clarity (more straightforward graph interpretation?), but it is probably me who has the wrong opinions.

              • adaptivegenome
                Super Moderator
                • Nov 2009
                • 436

                #8
                Originally posted by lh3
                The reference genome will still be relevant even if we could get a very good assembly. After all, the annotations are in reference coordinates. To annotate a new assembly, we need to map the assembly to the reference genome.

                We all wish to map data to a graph, but few have a clear definition of the problem, let alone the solution. Adopting graph alignment is likely to take longer than we hope. For now, my vague vision is that a graph alone is not enough; we also need the alignment between the graph and the reference.

                As to FastG, you can read from the archive that I worry a little about its scope (final scaffold only or generic sequence graph?), technical complexity (simpler and easier to parse format?) and mathematical clarity (more straightforward graph interpretation?), but it is probably me who has the wrong opinions.
                I would be the first to admit that I am probably underestimating the complexity here, but a graph approach would be really nice.

                I suppose the final specs are not released yet; however, from the conference it seems that the format is very easy to parse and represents an obvious advance over an IUPAC-coded reference (you can explicitly define indels, repeats, etc.).
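
To illustrate the limitation of an IUPAC-coded reference mentioned here, the sketch below expands the standard IUPAC nucleotide ambiguity codes into the concrete sequences they stand for. The helper name `expand_iupac` is hypothetical, and the example says nothing about FastG syntax itself; it only shows that every expansion has the same length, so IUPAC codes can encode substitutions but not insertions or deletions.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_iupac(seq):
    """Return every concrete sequence an IUPAC-coded string can represent."""
    return ["".join(p) for p in product(*(IUPAC[base] for base in seq))]


variants = expand_iupac("ARG")
# Each ambiguity code replaces exactly one base, so all variants share one length.
same_length = len({len(v) for v in variants}) == 1
```

Here `variants` is `["AAG", "AGG"]` and `same_length` is `True`: a length-changing variant such as an indel cannot be written this way, which is the gap a graph-style format aims to fill.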

                • vmakinen
                  Junior Member
                  • Feb 2013
                  • 1

                  #9
                  Originally posted by nilshomer
                  I am not saying that mapping to FastG is impossible; I am asserting that FM-indexes are not (yet) suitable for multiple haplotypes in the same reference sequence.
                  Actually they are already suitable. A slight modification to BWT is enough:

                  • kbradnam
                    Member
                    • May 2011
                    • 54

                    #10
                    FASTG v1.0 spec is now available from here:

                    Compare the best free open source System Software at SourceForge. Free, secure and fast System Software downloads from the largest Open Source applications and software directory
