  • digitonin
    Junior Member
    • Mar 2009
    • 6

    Warning: Encountered reference sequence with only gaps

    Does anyone know what this output from TopHat means? How much of a problem is it?
  • scozza
    Member
    • Jan 2009
    • 16

    #2
    I am curious too...

    I am also curious what this means and whether it should be a concern. I encountered it on my latest runs, in which I am using the Ensembl version of the human genome assembly. An example of the warning displayed is:

    [Sun Aug 08 23:20:23 2010] Beginning TopHat run (v1.0.14)
    -----------------------------------------------
    [Sun Aug 08 23:20:23 2010] Preparing output location brain/
    [Sun Aug 08 23:20:23 2010] Checking for Bowtie index files
    [Sun Aug 08 23:20:23 2010] Checking for reference FASTA file
    [Sun Aug 08 23:20:23 2010] Checking for Bowtie
    Bowtie version: 0.12.5.0
    [Sun Aug 08 23:20:23 2010] Checking reads
    seed length: 36bp
    format: fastq
    quality scale: phred33 (default)
    [Sun Aug 08 23:23:07 2010] Mapping reads against Homo_sapiens.GRCh37.59 with Bowtie
    [Sun Aug 08 23:26:43 2010] Joining segment hits
    [Sun Aug 08 23:27:09 2010] Mapping reads against Homo_sapiens.GRCh37.59 with Bowtie
    [Sun Aug 08 23:31:02 2010] Joining segment hits
    [Sun Aug 08 23:31:26 2010] Searching for junctions via segment mapping
    [Sun Aug 08 23:46:58 2010] Retrieving sequences for splices
    [Sun Aug 08 23:52:48 2010] Indexing splices
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    [Sun Aug 08 23:54:58 2010] Mapping reads against segment_juncs with Bowtie
    [Sun Aug 08 23:58:35 2010] Joining segment hits
    [Sun Aug 08 23:59:00 2010] Mapping reads against segment_juncs with Bowtie
    [Mon Aug 09 00:02:49 2010] Joining segment hits
    [Mon Aug 09 00:03:13 2010] Reporting output tracks
    -----------------------------------------------
    Run complete [00:45:40 elapsed]
    Should we be concerned?

    -steve


    • ataheri
      Junior Member
      • Nov 2011
      • 7

      #3
      Did you find the answer to this warning message? I have a similar problem with my data and would appreciate it if you could let me know how to deal with it.


      • digitonin
        Junior Member
        • Mar 2009
        • 6

        #4
        I never did find an answer to this. I didn't see any problems associated with it either, though.


        • DunderChief
          Junior Member
          • Aug 2012
          • 6

          #5
          -------bump-------

          I also get this using the Ensembl human reference (v. 68).


          • Wallysb01
            Senior Member
            • Feb 2011
            • 286

            #6
            Is it because you are using a repeat masked genome and possibly some of the sequences have been completely masked? I've seen this in other, less complete genomes, so I'm unsure if this would be the case with the human genome.


            • DunderChief
              Junior Member
              • Aug 2012
              • 6

              #7
              Thanks Wallysb01, that's probably it. I am using a soft-masked reference (low-complexity regions are lowercase).


              • Tuinhof
                Member
                • Jul 2012
                • 10

                #8
                I also get the same warning message when I use the Homo_sapiens.GRCh37.71.dna.toplevel reference.
                Does this upper/lower case issue influence the mapping or counting outcome?


                • xfh
                  Member
                  • Jan 2011
                  • 26

                  #9
                  I also get the same warning message with the Homo_sapiens.GRCh37.71.dna.toplevel reference. Can anyone explain it?


                  • Hugo A F Santos
                    Junior Member
                    • Jan 2015
                    • 2

                    #10
                    Originally posted by digitonin
                    I never did find an answer to this. I didn't see any problems associated with it either, though.
                    -----------BUMP-----------

                    First of all, sorry for bumping such an old post, but this problem still persists...
                    Has anyone found a reason why this happens?
                    I got this same warning message when using the current Drosophila melanogaster reference (r6.03; BDGP5.78).


                    • digitonin
                      Junior Member
                      • Mar 2009
                      • 6

                      #11
                      Hugo,

                      This probably has to do with the lowercase or the use of N's in the reference. Either way, it is not a problem.


                      • Sow
                        Member
                        • Feb 2016
                        • 16

                        #12
                        Hi,
                        I am using Bowtie 1.1.2 and I get the same message:
                        encountered reference sequence with gaps
                        My genome is not repeat-masked and has the following type of header:
                        >ta_IWGSC_CSSassembly_1as_v2_44039

                        Does such a warning interfere with bowtie-build in any way?


                        • dpryan
                          Devon Ryan
                          • Jul 2011
                          • 3478

                          #13
                          This will happen whenever you have cases like this:
                          Code:
                          >chr1
                          >chr2
                          ACAGCTACT
                          or this

                          Code:
                          >chr3
                          NNNNNNN
                          >chr4
                          ACGTAGCTGACT
                          It's just a warning, which means that there won't be an issue with index creation, but you should really fix your FASTA files.
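
To find which records trigger the warning, something along these lines may help. It is only a rough sketch, not what Bowtie or TopHat do internally: it reads a FASTA file and lists the headers whose sequence is either missing or made up entirely of gap characters, as in the two cases above. The function names and the `GAP_CHARS` set are my own assumptions for illustration; adjust the set if your reference uses other placeholder characters.

```python
import io

# Assumption: characters counted as "gaps" for this check (not Bowtie's own definition).
GAP_CHARS = set("Nn-.")


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def is_gap_only(sequence):
    """True if the sequence is empty or consists only of gap characters."""
    return all(base in GAP_CHARS for base in sequence)


def find_gap_only_records(handle):
    """Return the headers of records that contain no real bases."""
    return [header for header, seq in read_fasta(handle) if is_gap_only(seq)]


if __name__ == "__main__":
    # The two examples from this post, combined into one small in-memory FASTA.
    example = io.StringIO(
        ">chr1\n>chr2\nACAGCTACT\n>chr3\nNNNNNNN\n>chr4\nACGTAGCTGACT\n"
    )
    print(find_gap_only_records(example))  # prints ['chr1', 'chr3']
```

For a real reference, open the file and pass the handle instead, e.g. `with open("genome.fa") as fh: print(find_gap_only_records(fh))`, where `genome.fa` is a placeholder name. Whether you then drop those records or leave them in place is up to you, since the index still builds either way.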
