  • Tomnl
    Junior Member
    • Jun 2013
    • 6

    rDiff - error while getting gene expression

    Hi everybody,

    I was wondering if anybody could help with an error message I am receiving for the differential isoform analysis software rDiff.



    The error message is:
    Getting gene expression for: /NGS/users/Thomas/rDiff//wt_7.bam
    error: convert_reads_to_region_indicators: A(I,J): column index out of bounds; value 11054 out of bound 7483
    error: called from:
    error: /NGS/Software/rDiff-master/src/tools/convert_reads_to_region_indicators.m at line 16, column 19
    error: /NGS/Software/rDiff-master/src/get_reads_caller.m at line 71, column 48
    error: /NGS/Software/rDiff-master/src/get_read_counts.m at line 32, column 17
    error: /NGS/Software/rDiff-master/src/rdiff.m at line 38, column 5



    Any help would be greatly appreciated.

    Additional info:

    The command given was as follows:
    ./rdiff -o output/ -d files/ -a wt_7.bam,wt_8.bam,wt_203.bam -b mut_2.bam,mut_201.bam,mut_204.bam -g genes_mm10.gff3 -m param -L 51 -m 30

    The same error occurs whether I use the param or the non-param method.

    The bam files were generated by TopHat.

    The .gff3 file was generated by converting the .gtf file for Mus musculus (NCBI) provided by TopHat: http://tophat.cbcb.umd.edu/igenomes.shtml
  • philippd
    Junior Member
    • Oct 2012
    • 4

    #2
    The command seems to be right, although the path to the BAM file looks strange. Is the BAM file located at /NGS/users/Thomas/rDiff//wt_7.bam?

    Could you also send me the complete output of the rDiff run, as well as the first 1000 lines of your GFF3 file (or the part where you believe the problem is)?


    • Tomnl
      Junior Member
      • Jun 2013
      • 6

      #3
      Hi philippd,

      Thank you for the quick response.

      I have attached a text file of the output, the first 1000 lines of the GFF3 file and the first 1000 lines of one of the BAM files. I have also attached the output of the first example (used in the make example command) in case that may be of any use.

      I note that the example BAM files contain no quality or sequence strings. Do the BAM files need to be processed in any specific way?

      The BAM location /NGS/users/Thomas/rDiff//wt_7.bam is correct, except that the // should be a single /.

      The command I showed previously was shortened. The full command is shown below:
      /NGS/Software/rDiff-master/bin/rdiff -o /NGS/users/Thomas/rDiff/output/ -d /NGS/users/Thomas/rDiff/ -a wt_7.bam,wt_8.bam,wt_203.bam -b mut_2.bam,mut_201.bam,mut_204.bam -g /NGS/users/Thomas/Transcripts/genes_mm10.gff3 -m param -L 51 -m 30


      I look forward to hearing your reply.

      Kind regards

      Tom
      Attached Files


      • philippd
        Junior Member
        • Oct 2012
        • 4

        #4
        Hi Tom,

        I think that your GFF3 file is not formatted correctly. I saw that the exon and mRNA coordinates sometimes lie outside the gene coordinates (e.g. for some genes the exons end after the gene), which should not normally happen.
        What you could do is either download a GFF3 file where this is not the case, or replace the start and stop of each gene with its smallest and largest exon position, respectively.

        Kind regards,
        Philipp
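A minimal sketch of the second option suggested above, assuming a standard 9-column, tab-separated GFF3 file in which top-level features (normally genes) have an `ID` but no `Parent`, and child features point upward through `Parent`. The function names are hypothetical and this is not part of rDiff. It replaces each top-level feature's start and end with the smallest and largest coordinate found among its descendants.

```python
def parse_attrs(field):
    """Parse a GFF3 attribute column such as "ID=g1;Name=x" into a dict."""
    attrs = {}
    for part in field.strip().split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def fix_gene_bounds(lines):
    """Return GFF3 lines with each top-level feature resized to span its descendants."""
    rows = []        # (columns, attributes) for features, raw strings otherwise
    parent_of = {}   # feature ID -> list of parent IDs
    for line in lines:
        line = line.rstrip("\n")
        cols = line.split("\t")
        if line.startswith("#") or len(cols) != 9:
            rows.append(line)  # comments, directives, malformed rows: pass through
            continue
        attrs = parse_attrs(cols[8])
        if "ID" in attrs and "Parent" in attrs:
            parent_of[attrs["ID"]] = attrs["Parent"].split(",")
        rows.append((cols, attrs))

    def top_level(fid, seen=()):
        """Yield the parentless ancestors of fid (normally gene IDs)."""
        if fid in seen:
            return  # guard against cyclic Parent references
        parents = parent_of.get(fid)
        if not parents:
            yield fid
            return
        for parent in parents:
            yield from top_level(parent, seen + (fid,))

    # Collect the smallest start and largest end seen under each top-level ID.
    extent = {}
    for row in rows:
        if isinstance(row, str):
            continue
        cols, attrs = row
        start, end = int(cols[3]), int(cols[4])
        for parent in attrs.get("Parent", "").split(","):
            if not parent:
                continue
            for root in top_level(parent):
                lo, hi = extent.get(root, (start, end))
                extent[root] = (min(lo, start), max(hi, end))

    # Rewrite only the top-level rows; everything else is emitted unchanged.
    out = []
    for row in rows:
        if isinstance(row, str):
            out.append(row)
            continue
        cols, attrs = row
        fid = attrs.get("ID")
        if "Parent" not in attrs and fid in extent:
            lo, hi = extent[fid]
            cols = cols[:3] + [str(lo), str(hi)] + cols[5:]
        out.append("\t".join(cols))
    return out
```

To apply it to a file, read the lines with `open(path)` and write the returned list back out joined by newlines. Only top-level rows are changed, so an mRNA row that is narrower than its own exons would still need separate attention.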


        • Tomnl
          Junior Member
          • Jun 2013
          • 6

          #5
          Hi Philipp

          Thanks again for your reply. I tried a number of different GFF3 files and a number of different GTF2/GFF3 converters (see below), but still no luck.

          Would you recommend any specific GFF3/GTF files for the mm10 mouse genome?

          I have used the following GFF3 files:

          ftp://ftp.ncbi.nlm.nih.gov/genomes/M..._level.gff3.gz

          ftp://ftp.ncbi.nlm.nih.gov/genomes/M...ffolds.gff3.gz


          I have used the following GTF2/GFF3 converters:

          The GFF toolkit from the mskcc galaxy webserver linked from the rDiff website https://galaxy.cbio.mskcc.org/

          The python script which comes with SpliceGrapher-0.2.2 (gtf2gff.py)

          The gffread tool which comes with cufflinks

          Converter tools used with the following GTF files:

          NCBI GTF file provided by cufflinks http://cufflinks.cbcb.umd.edu/igenomes.html
          Ensembl genes downloaded from UCSC http://genome.ucsc.edu/cgi-bin/hgTab...mblGenes.fasta

          I have attached a file which contains some of the error codes associated with some of the attempts I have made.

          I have tried to avoid editing the GFF file to replace the start and stop of each gene with the smallest and largest exon position, respectively, since needing to do so seems to indicate that the GFF file is not correct. If there is no other option, though, that is what I will do.

          I should note that when I use the GFF toolkit conversion tool I always get exon and mRNA coordinates that lie outside the gene coordinates. When I use the gffread conversion tool I get the following rDiff error: "child may be mapped to multiple parents ex: Parent=AT01,AT01-1-Protein."

          Kind regards

          Tom
          Attached Files
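The "child may be mapped to multiple parents" message quoted above suggests that rows whose `Parent` attribute lists several comma-separated IDs are the ones being rejected (GFF3 itself permits this; that rDiff does not accept it is an assumption based only on the error text). A short sketch for locating such rows, with a hypothetical function name:

```python
def multi_parent_features(lines):
    """Yield (line_number, parent_ids) for features whose Parent lists several IDs."""
    for number, line in enumerate(lines, start=1):
        cols = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(cols) != 9:
            continue  # skip comments, directives and malformed rows
        for part in cols[8].split(";"):
            key, _, value = part.strip().partition("=")
            if key == "Parent" and "," in value:
                yield number, value.split(",")
```

Running `list(multi_parent_features(open(path)))` on a converted file would show how many rows are affected and which parent IDs they name, which may help decide whether to switch converter or split those rows.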


          • vipints
            Junior Member
            • Dec 2009
            • 1

            #6
            Hello Tom,

            Can you please post the first 5 (uncommented) lines from the GTF/GFF file?

            Thanks, Vipin
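One way to pull out the lines requested here, sketched in Python under the assumption that comment and directive lines start with `#` (the function name is hypothetical):

```python
from itertools import islice


def first_uncommented(lines, n=5):
    """Return the first n lines that are neither blank nor '#' comments."""
    kept = (line.rstrip("\n") for line in lines
            if line.strip() and not line.startswith("#"))
    return list(islice(kept, n))
```

For example, `first_uncommented(open("genes_mm10.gff3"))` reads only as far into the file as it needs to, so it stays quick on large annotation files.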


            • Tomnl
              Junior Member
              • Jun 2013
              • 6

              #7
              Hi Vipin

              Sorry for the delay in replying. I have attached a file of the first few lines of the GTF/GFF files I have used.

              I should mention that I have managed to get the program to run successfully using very small test files (~50 KB per BAM file). This used the Ensembl GTF downloaded from UCSC, followed by GTF/GFF conversion using the GFF converter.

              When I attempt it with larger files (e.g. over 2 GB per BAM file), I get the following error message:
              error: memory exhausted or requested size too large for range of Octave's index type -- eval failed

              Best regards
              Attached Files

