  • fabrice
    Member
    • Oct 2009
    • 86

    cufflinks, Warning: Skipping large bundle.

    Hi All,
    When I used cufflinks,

    cufflinks-1.0.3.Linux_x86_64/cufflinks -p 4 -I 5000000 -G genome_index/annotation/Homo_sapiens.GRCh37.63/Homo_sapiens.GRCh37.63.gtf --output-dir mapping/7124 mapping/7124/accepted_hits.bam

    I always get this warning,

    Warning: Skipping large bundle.

    What does this mean?

    Thank you very much.
  • Joseph Dougherty
    Junior Member
    • Aug 2011
    • 2

    #2
    Hi Fabrice,

    Did you ever hear anything back on this? I get the same warning. It seems to occur during the steps involved in running multi-read correction, which it does not appear you are doing.

    I am running:

    cufflinks -I 500000 -p 3 -b /srv/cgs/data/jdougherty/indexes/mm9_mcherry.fa -u -g ../../mm9_flat_dsred.gtf accepted_hits.bam


    and in the output I get:


    [05:56:12] Inspecting reads and determining fragment length distribution.
    > Processed 106047 loci. [*************************] 100%
    > Map Properties:
    > Total Map Mass: 5751822.83
    > Number of Multi-Reads: 640779 (with 3399112 total hits)
    > Read Type: 104bp x 104bp
    > Fragment Length Distribution: Empirical (learned)
    > Estimated Mean: 126.83
    > Estimated Std Dev: 21.21
    [05:58:36] Assembling transcripts and initializing abundances for multi-read correction.
    > Processing Locus chr14:3030401-3030502 [******* ] 28%
    chr14:3032091-7220792 Warning: Skipping large bundle.
    > Processed 106046 loci. [*************************] 100%
    [06:11:45] Loading reference annotation and sequence.
    [06:12:12] Learning bias parameters.
    > Processed 22579 loci. [*************************] 100%
    [06:13:53] Re-estimating abundances with bias and multi-read correction.
    > Processed 22579 loci. [*************************] 100%

    real 25m33.584s
    user 62m53.950s
    sys 0m44.890s
    Finished: Sat Aug 13 06:21:42 CDT 2011

    Any idea of what this is?

    Thanks
    Joe


    • fabrice
      Member
      • Oct 2009
      • 86

      #3
      Joe ,

      I did not get an answer.


      • kmcarr
        Senior Member
        • May 2008
        • 1181

        #4
        Joe & Fabrice,

        Cufflinks groups overlapping reads into what it refers to as 'bundles', the assumption being that each of these bundles represents a gene locus. It then processes each of the bundles separately to assemble a gene model. If the length of genome spanned by all the reads in a bundle is too large (larger than reasonably expected for a gene) cufflinks will not attempt to process that bundle further and will move on. When this happens it produces the warning message you see. No models will be built from this group of aligned reads nor any expression values reported.
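As a rough illustration of the bundling idea described above: the sketch below merges overlapping read intervals on one chromosome into bundles and then drops any bundle whose span exceeds a maximum length. This is a toy model, not Cufflinks' actual implementation. The names `bundle_reads` and `keep_bundle`, the `(start, end)` interval representation, and the exact comparison used for the limit are all assumptions made for this example.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end) on a single chromosome


def bundle_reads(reads: List[Interval]) -> List[Interval]:
    """Merge overlapping read intervals into bundles (toy model)."""
    bundles: List[Interval] = []
    for start, end in sorted(reads):
        if bundles and start <= bundles[-1][1]:
            # The read overlaps the current bundle, so extend that bundle.
            prev_start, prev_end = bundles[-1]
            bundles[-1] = (prev_start, max(prev_end, end))
        else:
            # No overlap with the previous bundle: start a new one.
            bundles.append((start, end))
    return bundles


def keep_bundle(bundle: Interval, max_bundle_length: int) -> bool:
    """Return False for a bundle that would be skipped as too large."""
    start, end = bundle
    return (end - start) <= max_bundle_length


# Hypothetical reads: three overlap into one bundle, one stands alone.
reads = [(100, 200), (150, 300), (1000, 1100), (250, 400)]
bundles = bundle_reads(reads)
kept = [b for b in bundles if keep_bundle(b, max_bundle_length=250)]
print("bundles:", bundles)
print("kept:   ", kept)
```

In this toy run the first three reads chain together into a single 300 bp bundle, which is dropped by the artificially small limit of 250, while the isolated read survives as its own bundle.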

        The default length which triggers this skipping is 3.5 million base pairs. In Joe's example the bundle which was skipped spanned chr14 from 3032091-7220792 which is ~4.2 million bp. You can increase (or decrease) the maximum bundle length by passing the "--max-bundle-length <int>" parameter to cufflinks. <int> can be any integer >= 1.
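The span arithmetic above can be checked directly against a log. The following is a minimal sketch, assuming the warning lines look exactly like the ones quoted in this thread; the regular expression, the function name `skipped_bundles`, and the constant name are illustrative choices, not part of Cufflinks.

```python
import re
from typing import List, Tuple

DEFAULT_MAX_BUNDLE_LENGTH = 3_500_000  # default value quoted in this thread

# Matches lines such as "chr14:3032091-7220792 Warning: Skipping large bundle."
WARNING_RE = re.compile(r"(\S+):(\d+)-(\d+)[ \t]+Warning: Skipping large bundle\.")


def skipped_bundles(log_text: str) -> List[Tuple[str, int, int, int]]:
    """Return (chrom, start, end, span) for each skipped-bundle warning."""
    found = []
    for match in WARNING_RE.finditer(log_text):
        chrom = match.group(1)
        start, end = int(match.group(2)), int(match.group(3))
        found.append((chrom, start, end, end - start))
    return found


# Log excerpt assembled from lines posted in this thread.
log = """\
> Processing Locus chr14:3030401-3030502 [******* ] 28%
chr14:3032091-7220792 Warning: Skipping large bundle.
21:38435145-45747259 Warning: Skipping large bundle.
"""

for chrom, start, end, span in skipped_bundles(log):
    ratio = span / DEFAULT_MAX_BUNDLE_LENGTH
    print(f"{chrom}:{start}-{end} spans {span:,} bp ({ratio:.2f}x the default limit)")
```

Running this over a full Cufflinks stderr capture gives a quick list of every region that was skipped and how far each one exceeds the default.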


        • fabrice
          Member
          • Oct 2009
          • 86

          #5
          kmcarr,

          Thank you for your reply.

          If I set the --max-bundle-length value too large, will that cause any problems?


          > Processing Locus 1:11868-31109 [ ] 0%
          > Processing Locus 1:34553-36081
          21:38435145-45747259 Warning: Skipping large bundle.

          Here the bundle is ~7.3 million bp.
          Last edited by fabrice; 08-16-2011, 03:10 PM.


          • Joseph Dougherty
            Junior Member
            • Aug 2011
            • 2

            #6
            Thanks much!


            • kmcarr
              Senior Member
              • May 2008
              • 1181

              #7
              Originally posted by fabrice:
              kmcarr,

              Thank you for your reply.

              If I set the --max-bundle-length value too large, will that cause any problems?


              > Processing Locus 1:11868-31109 [ ] 0%
              > Processing Locus 1:34553-36081
              21:38435145-45747259 Warning: Skipping large bundle.

              Here the bundle is ~7.3 million bp.

              The purpose of the --max-bundle-length parameter is to prevent cufflinks from trying to assemble a gene model from a read group spanning a genomic region which is clearly too large to represent a single gene. An appropriate value for this parameter is very much dependent upon the species you are working in. The default value of 3,500,000bp is (I believe) set to be appropriate for humans or other mammals. You could increase the size of this value but is it likely that a gene in your organism of interest would span 7.3 million bp? I can't answer that; this is where your knowledge of the organism you are studying comes into play.


              • fabrice
                Member
                • Oct 2009
                • 86

                #8
                I am working on human samples.

                Originally posted by kmcarr:
                The purpose of the --max-bundle-length parameter is to prevent cufflinks from trying to assemble a gene model from a read group spanning a genomic region which is clearly too large to represent a single gene. An appropriate value for this parameter is very much dependent upon the species you are working in. The default value of 3,500,000bp is (I believe) set to be appropriate for humans or other mammals. You could increase the size of this value but is it likely that a gene in your organism of interest would span 7.3 million bp? I can't answer that; this is where your knowledge of the organism you are studying comes into play.


                • Kcornelius
                  Member
                  • Apr 2012
                  • 14

                  #9
                  I am working on human samples as well.
                  A colleague and I got two regions larger than 3.5 million bp:

                  chr21:38435145-45760353 Warning: Skipping large bundle.

                  chr6:126102278-130463972 Warning: Skipping large bundle.


                  So I think this is quite normal for human samples.

                  Marc


                  • caddymob
                    Member
                    • Apr 2009
                    • 36

                    #10
                    I consistently see it skipping these in humans:

                    21:38435145-45747259 Warning: Skipping large bundle.
                    6:126102306-130463972 Warning: Skipping large bundle.

                    The chr21 locus is huge and is part of the Down syndrome critical region (DSCR). I run --max-bundle-length 10000000 to get past this warning. It seems to work, calling FPKMs across the DSCR.
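One way to pick a limit like the one above is to round up from the largest skipped span and then pass the result to cufflinks. The sketch below is an assumption-laden illustration: `suggest_max_bundle_length` and `cufflinks_command` are hypothetical helper names, the file paths are placeholders, and only flags that already appear in this thread (`-p`, `-G`, `--output-dir`, `--max-bundle-length`) are used. It prints the command rather than running it. Whether a larger limit makes biological sense for a given region is a separate question, as discussed earlier in the thread.

```python
import shlex
from typing import List


def suggest_max_bundle_length(spans: List[int], step: int = 1_000_000) -> int:
    """Smallest multiple of `step` strictly larger than every skipped span."""
    return (max(spans) // step + 1) * step


def cufflinks_command(bam: str, gtf: str, out_dir: str,
                      max_bundle_length: int, threads: int = 4) -> List[str]:
    """Build (but do not run) a cufflinks invocation with a raised limit."""
    return [
        "cufflinks",
        "-p", str(threads),
        "-G", gtf,
        "--max-bundle-length", str(max_bundle_length),
        "--output-dir", out_dir,
        bam,
    ]


# Spans of the two regions reported as skipped in human samples:
# 21:38435145-45747259 and 6:126102306-130463972.
spans = [45747259 - 38435145, 130463972 - 126102306]
limit = suggest_max_bundle_length(spans)

# Placeholder paths for illustration only.
cmd = cufflinks_command("accepted_hits.bam", "annotation.gtf", "cufflinks_out", limit)
print(shlex.join(cmd))
```

With these two spans the helper suggests a smaller value than the 10000000 used above; either one is large enough to cover both regions.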
