  • Warning: Encountered reference sequence with only gaps

    Does anyone know what this output in TopHat means? How much of a problem is it?

  • #2
    I am curious too...

    I am also curious as to what this really means and whether it should be a concern. I encountered it on my latest runs, in which I am using the Ensembl version of the human genome assembly. An example of the warning displayed is:

    [Sun Aug 08 23:20:23 2010] Beginning TopHat run (v1.0.14)
    -----------------------------------------------
    [Sun Aug 08 23:20:23 2010] Preparing output location brain/
    [Sun Aug 08 23:20:23 2010] Checking for Bowtie index files
    [Sun Aug 08 23:20:23 2010] Checking for reference FASTA file
    [Sun Aug 08 23:20:23 2010] Checking for Bowtie
    Bowtie version: 0.12.5.0
    [Sun Aug 08 23:20:23 2010] Checking reads
    seed length: 36bp
    format: fastq
    quality scale: phred33 (default)
    [Sun Aug 08 23:23:07 2010] Mapping reads against Homo_sapiens.GRCh37.59 with Bowtie
    [Sun Aug 08 23:26:43 2010] Joining segment hits
    [Sun Aug 08 23:27:09 2010] Mapping reads against Homo_sapiens.GRCh37.59 with Bowtie
    [Sun Aug 08 23:31:02 2010] Joining segment hits
    [Sun Aug 08 23:31:26 2010] Searching for junctions via segment mapping
    [Sun Aug 08 23:46:58 2010] Retrieving sequences for splices
    [Sun Aug 08 23:52:48 2010] Indexing splices
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    Warning: Encountered reference sequence with only gaps
    [Sun Aug 08 23:54:58 2010] Mapping reads against segment_juncs with Bowtie
    [Sun Aug 08 23:58:35 2010] Joining segment hits
    [Sun Aug 08 23:59:00 2010] Mapping reads against segment_juncs with Bowtie
    [Mon Aug 09 00:02:49 2010] Joining segment hits
    [Mon Aug 09 00:03:13 2010] Reporting output tracks
    -----------------------------------------------
    Run complete [00:45:40 elapsed]
    Should we be concerned?

    -steve


    • #3
      Did you find the answer to this warning message? I have a similar problem with my data and would appreciate it if you could let me know how to deal with it.


      • #4
        I never did find an answer to this. I didn't see any problems associated with it either, though.


        • #5
          -------bump-------

          I also get this using the Ensembl human reference (v. 68).


          • #6
            Is it because you are using a repeat masked genome and possibly some of the sequences have been completely masked? I've seen this in other, less complete genomes, so I'm unsure if this would be the case with the human genome.


            • #7
              Thanks Wallysb01, that's probably it. I am using a soft-masked reference (low-complexity regions are lowercase).


              • #8
                I also get the same warning message when I use the Homo_sapiens.GRCh37.71.dna.toplevel reference.
                Does this upper/lower case issue influence the mapping/counting outcome?


                • #9
                  I also get the same warning message with the Homo_sapiens.GRCh37.71.dna.toplevel reference. Can anyone explain it?


                  • #10
                    Originally posted by digitonin
                    I never did find an answer to this. I didn't see any problems associated with it either, though.
                    -----------BUMP-----------

                    First of all, sorry for bumping such an old post, but this problem still persists...
                    Has anyone found out why this happens?
                    I got this same warning message when using the current Drosophila melanogaster reference (r6.03; BDGP5.78).


                    • #11
                      Hugo,

                      This probably has to do with the lowercase or the use of N's in the reference. Either way, it is not a problem.


                      • #12
                        Hi,
                        I am using Bowtie 1.1.2 and I get the same message:
                        encountered reference sequence with gaps
                        My genome is not repeat-masked and has the following type of header:
                        >ta_IWGSC_CSSassembly_1as_v2_44039

                        Does such a warning interfere with bowtie-build in any way?


                        • #13
                          This will happen whenever you have a record with a header but no sequence, like this:
                          Code:
                          >chr1
                          >chr2
                          ACAGCTACT
                          or a record whose sequence consists only of N's, like this:

                          Code:
                          >chr3
                          NNNNNNN
                          >chr4
                          ACGTAGCTGACT
                          It's just a warning, which means that there won't be an issue with index creation, but you should really fix your FASTA files.
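
To find which records are likely to trigger the warning, a short script along these lines may help. This is only a sketch, not how TopHat or Bowtie detect the condition internally: the function names `read_fasta` and `gap_only_records` are made up for illustration, and the script assumes that a "gap-only" record is one with no A, C, G or T at all (upper or lower case), which covers both cases shown above.

```python
from typing import Iterable, Iterator, List, Tuple

# Assumption: any character other than these counts as a gap/ambiguous base.
UNAMBIGUOUS = set("ACGTacgt")


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) per record, including records with no sequence."""
    name = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""  # first word of the header
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gap_only_records(lines: Iterable[str]) -> List[str]:
    """Return names of records that are empty or contain no A/C/G/T."""
    return [
        name
        for name, seq in read_fasta(lines)
        if not any(base in UNAMBIGUOUS for base in seq)
    ]


# The two cases from the post above, combined into one toy FASTA.
example = """>chr1
>chr2
ACAGCTACT
>chr3
NNNNNNN
>chr4
ACGTAGCTGACT
""".splitlines()

print(gap_only_records(example))  # prints ['chr1', 'chr3']
```

For a real reference, pass an open file handle instead of the example list, e.g. `gap_only_records(open("genome.fa"))`, where `genome.fa` is a placeholder path. The number of names returned can then be compared with the number of warnings in the log.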
