  • super0925
    Senior Member
    • Feb 2014
    • 206

    Cufflinks/Cuffmerge/Cuffdiff error 'subsequence cannot be larger than 16338'

    Hi all,
    I'm doing DE analysis on cow samples.
    I use the Ensembl UMD3.1 reference genome, which I thought was the latest version.
    However, when I ran the Tophat2-Cuffdiff2 pipeline (default parameter settings), I still got this warning:

    Warning: couldn't find fasta record for 'GJ058256.1'!
    This contig will not be bias corrected.
    Warning: couldn't find fasta record for 'GJ058424.1'!
    ......



    (1) What does it mean? Is it a serious problem for my downstream analysis?

    And when I ran Tophat2-Cufflinks/Cuffmerge-Cuffdiff2 (default parameter settings), I got this error:
    Warning: couldn't find fasta record for 'GJ060129.1'!
    This contig will not be bias corrected.
    ......

    Error (GFaSeqGet): subsequence cannot be larger than 16338
    Error getting subseq for TCONS_00062149 (1..16448)!


    (2) Why do I get this result? How can I continue with Tophat2-Cufflinks/Cuffmerge-Cuffdiff2?


    Thank you!
    Last edited by super0925; 03-05-2015, 08:39 AM.

    • GenoMax
      Senior Member
      • Feb 2008
      • 7142

      #3
      I don't have the cow iGenomes set, but my guess is that "GJ058424.1" is in the GTF file but not in the genome sequence file. It appears to be the SH3YL1 gene now.

      Someone else will need to comment on the other error.
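One way to test that guess is to compare the sequence names in column 1 of the GTF with the record names in the FASTA headers. The following is only a minimal sketch: the function names are made up for illustration, and it assumes the FASTA record name is the first whitespace-delimited word after ">".

```python
import io


def gtf_seqnames(handle):
    """Collect the distinct values of column 1 (sequence name) in a GTF stream."""
    names = set()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip comment and blank lines
        names.add(line.split("\t", 1)[0])
    return names


def fasta_seqnames(handle):
    """Collect record names from FASTA headers (first word after '>')."""
    names = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def missing_from_fasta(gtf_handle, fasta_handle):
    """Return the sorted GTF sequence names that have no FASTA record."""
    return sorted(gtf_seqnames(gtf_handle) - fasta_seqnames(fasta_handle))


if __name__ == "__main__":
    # Tiny in-memory stand-ins for genes.gtf and genome.fa (illustrative data only).
    gtf = io.StringIO(
        '1\tensembl\texon\t1\t10\t.\t+\t.\tgene_id "a";\n'
        'GJ058424.1\tensembl\texon\t1\t10\t.\t+\t.\tgene_id "b";\n'
    )
    fasta = io.StringIO(">1 dna:chromosome\nACGTACGTAC\n>MT\nACGT\n")
    print(missing_from_fasta(gtf, fasta))
```

With real files, the two handles would come from `open("genes.gtf")` and `open("genome.fa")`. Any name this reports is a candidate for the "couldn't find fasta record" warning.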


      • super0925
        Senior Member
        • Feb 2014
        • 206

        #4
        Originally posted by GenoMax:
        I don't have the cow iGenomes set, but my guess is that "GJ058424.1" is in the GTF file but not in the genome sequence file. It appears to be the SH3YL1 gene now.

        Someone else will need to comment on the other error.

        Why do I get the warning in (1) and the error in (2)?
        Is the warning in (1) critical for downstream analysis?
        How can I solve it?
        Cheers


        • super0925
          Senior Member
          • Feb 2014
          • 206

          #5
          Could anyone help?

          The GTF file is genes.gtf from UMD3.1.

          The first column of genes.gtf (I think it is the chromosome) contains:
          1
          10
          11
          12
          13
          14
          15
          16
          17
          18
          19
          2
          20
          21
          22
          23
          24
          25
          26
          27
          28
          29
          3
          4
          5
          6
          7
          8
          9
          GJ058256.1
          GJ058424.1
          GJ058425.1
          GJ058430.1
          GJ058433.1
          GJ058437.1
          GJ058729.1
          GJ059463.1
          GJ059486.1
          GJ059509.1
          GJ059556.1
          GJ059670.1
          GJ060027.1
          GJ060032.1
          GJ060118.1
          GJ060120.1
          GJ060129.1
          MT
          X
          Last edited by super0925; 05-12-2015, 11:17 AM.


            • GenoMax
              Senior Member
              • Feb 2008
              • 7142

              #7
              Did you get these files from iGenomes or is this something you put together by getting files (seq, annotation etc) from individual sources?


              • super0925
                Senior Member
                • Feb 2014
                • 206

                #8
                Originally posted by GenoMax:
                Did you get these files from iGenomes or is this something you put together by getting files (seq, annotation etc) from individual sources?

                I downloaded them from iGenomes.
                Do you mean my files are abnormal?


                • GenoMax
                  Senior Member
                  • Feb 2008
                  • 7142

                  #9
                  Originally posted by super0925:
                  I downloaded them from iGenomes.
                  Do you mean my files are abnormal?

                  No. One of the reasons to get this data from iGenomes is it has (supposedly) been checked for consistency so the kind of thing you have run into does not happen. It is possible that you may have downloaded a flawed version that has since been fixed (you could download a new copy and compare).

                  I hesitate to recommend that you get the sequences of the missing FASTA records from NCBI and append them to your genome.fa file (you will likely need to re-index it afterwards). But this may get you past one of the errors.
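If you do go down that road, the append step could look roughly like the sketch below. It is only an illustration under assumptions: the helper names and the file names in the demo are hypothetical, and it assumes the headers of the downloaded records have already been renamed so that the first word after ">" matches the name used in the GTF (e.g. GJ058424.1). It writes a new file rather than modifying genome.fa in place.

```python
import tempfile
from pathlib import Path


def fasta_ids(path):
    """Return the set of record IDs (first word after '>') in a FASTA file."""
    ids = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    ids.add(fields[0])
    return ids


def append_missing_records(genome_path, extra_path, out_path):
    """Copy genome_path to out_path, then append the records from extra_path
    whose ID is not already present. Returns the list of appended IDs."""
    present = fasta_ids(genome_path)
    added = []
    with open(out_path, "w") as out:
        last = "\n"
        with open(genome_path) as genome:
            for line in genome:
                out.write(line)
                last = line
        if not last.endswith("\n"):
            out.write("\n")  # keep the next header on its own line
        keep = False
        with open(extra_path) as extra:
            for line in extra:
                if line.startswith(">"):
                    fields = line[1:].split()
                    keep = bool(fields) and fields[0] not in present
                    if keep:
                        present.add(fields[0])
                        added.append(fields[0])
                if keep:
                    out.write(line)
    return added


if __name__ == "__main__":
    # Toy files standing in for genome.fa and the extra download (made-up data).
    with tempfile.TemporaryDirectory() as tmp:
        genome = Path(tmp, "genome.fa")
        extra = Path(tmp, "extra.fa")
        merged = Path(tmp, "genome_plus.fa")
        genome.write_text(">1\nACGT\n>MT\nGGCC")
        extra.write_text(">MT\nGGCC\n>GJ058424.1\nTTAA\n")
        print(append_missing_records(genome, extra, merged))
        print(merged.read_text())
```

After this, the aligner index built from the old genome.fa no longer matches the new file, so it would have to be rebuilt (for TopHat2 that means the Bowtie2 index).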

                  I am not sure how much work you have put into this already but if the new download from iGenomes does have these sequences then you could use that genome.fa file.

                  As for your second error this thread seems to have some options: https://www.biostars.org/p/57249/
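Independently of that thread, the message itself reads as if TCONS_00062149 asks for bases 1..16448 from a sequence that is only 16338 bases long. That is just my reading of the error text. A rough way to look for such records is to compare the end coordinate of every GTF feature with the length of its sequence in the FASTA. The sketch below uses made-up function names and assumes standard 9-column GTF lines.

```python
import io


def fasta_lengths(handle):
    """Map each FASTA record name (first word after '>') to its sequence length."""
    lengths = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else None
            if name is not None:
                lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def features_past_end(gtf_handle, lengths):
    """Return (seqname, start, end, seq_length) for GTF features whose end
    coordinate lies beyond the end of a known reference sequence."""
    hits = []
    for line in gtf_handle:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete GTF record
        seqname, start, end = cols[0], int(cols[3]), int(cols[4])
        if seqname in lengths and end > lengths[seqname]:
            hits.append((seqname, start, end, lengths[seqname]))
    return hits


if __name__ == "__main__":
    # Toy data: a 10-base sequence and a feature that runs to position 12.
    lengths = fasta_lengths(io.StringIO(">MT\nACGTACGT\nAC\n"))
    gtf = io.StringIO('MT\tCufflinks\texon\t1\t12\t.\t+\t.\tgene_id "X";\n')
    for hit in features_past_end(gtf, lengths):
        print(hit)
```

Running it on the GTF produced by Cuffmerge together with genome.fa would show whether any assembled transcript runs past the end of its reference sequence. Sequences missing from the FASTA are skipped here, since they are a separate problem.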


                  • karimhasanpur@yahoo.com
                    Junior Member
                    • Nov 2013
                    • 4

                    #10
                    cufflinks warnings: could not find fasta records

                    Dear friends,

                    I am getting the same warnings. I downloaded the Galgal4 reference files from iGenomes. When running cufflinks, I get "warning: couldn't find fasta record for LGE64 ...". I think these are contigs that are present in genes.gtf but not in the genome FASTA file. My question is: could these warnings affect my downstream analyses? If so, what should I do to resolve this problem?

                    Any comment would be appreciated.
                    Karim
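One way to gauge how much is at stake is to count how many annotation records sit on the sequences that have no FASTA record. This is a minimal sketch with a hypothetical function name. It does not say whether the warning matters, only how much of genes.gtf lies on the affected contigs.

```python
import io
from collections import Counter


def records_on_missing_contigs(gtf_handle, fasta_names):
    """Return (total_records, Counter of records per sequence name that is
    absent from fasta_names) for a GTF stream."""
    counts = Counter()
    total = 0
    for line in gtf_handle:
        if line.startswith("#") or not line.strip():
            continue
        total += 1
        seqname = line.split("\t", 1)[0]
        if seqname not in fasta_names:
            counts[seqname] += 1
    return total, counts


if __name__ == "__main__":
    # Toy annotation; in practice fasta_names would be read from the genome file.
    gtf = io.StringIO(
        '1\tensembl\texon\t1\t10\t.\t+\t.\tgene_id "a";\n'
        'LGE64\tensembl\texon\t1\t10\t.\t+\t.\tgene_id "b";\n'
        'LGE64\tensembl\texon\t20\t30\t.\t+\t.\tgene_id "b";\n'
    )
    total, counts = records_on_missing_contigs(gtf, {"1", "MT"})
    for name, n in counts.most_common():
        print(f"{name}: {n} of {total} records")
```

If only a handful of records fall on those contigs, the affected genes can at least be listed and checked by hand.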
