  • Sbamo
    Junior Member
    • Jan 2016
    • 7

    RSEM with HISAT2

    Hello guys,

    I have RNA-sequencing data of around 250 patients with leukaemia.
    I have built a basic pipeline using HISAT2 as my aligner, with satisfactory results. I want to test differential transcript expression between my samples using RSEM, but I can't get it to work with HISAT2.

    I am running the tools on the Galaxy platform. Following the instructions provided, I used RSEM prepare reference to create the reference files to provide to the aligner. My outputs are the following:

    rsem ref name.log
    rsem ref name.grp
    rsem ref name.ti
    rsem ref name.chrlist
    rsem ref name.transcripts.fa
    rsem ref name.seq
    rsem ref name.idx.fa
    rsem_ref name.3.ebwt
    rsem_ref name.4.ebwt
    rsem_ref name.1.ebwt
    rsem_ref name.2.ebwt
    rsem_ref name.rev.1.ebwt
    rsem_ref name.rev.2.ebwt

    From what I gather from the RSEM README, I now have to align my reads using rsem ref name.idx.fa as the reference file. When I try this, I get the following error:

    (ERR): hisat2-align died with signal 11 (SEGV)
    [W::sam_read1] parse error at line 43388
    [main_samview] truncated file.

    Does anyone have experience using HISAT2 with the RSEM reference file?
    It seems to me that the prepared reference only works with TopHat2, since it creates Bowtie index files. I would appreciate any response!

    Thanks!
    Sbamo
  • dpryan
    Devon Ryan
    • Jul 2011
    • 3478

    #2
    You can't use those files with HISAT2. You'll need to reindex either name.idx.fa or name.seq (I assume it's a fasta file) and use the resulting .ht2 files.

    • Sbamo
      Junior Member
      • Jan 2016
      • 7

      #3
      dpryan, thank you for your response! Since I am new to bioinformatics, is there any tool or method you can suggest to convert those files to .ht2 files?

      • dpryan
        Devon Ryan
        • Jul 2011
        • 3478

        #4
        There are no conversion programs. Just delete them and build the index with hisat2-build.
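
The re-indexing step described above can be sketched as follows. This is only a minimal sketch, assuming `hisat2-build` is installed and on the PATH. The file name `rsem_ref.idx.fa`, the index prefix `rsem_ref_hisat2`, and both helper function names are placeholders chosen for illustration and do not come from the thread.

```python
import shutil
import subprocess


def hisat2_build_command(fasta_path, index_prefix):
    """Return the argument list for building a HISAT2 index.

    hisat2-build takes the reference FASTA followed by the prefix for the
    output index files (<prefix>.1.ht2, <prefix>.2.ht2, ...).
    """
    return ["hisat2-build", fasta_path, index_prefix]


def build_index(fasta_path, index_prefix):
    """Run hisat2-build if it is installed; otherwise raise a clear error."""
    if shutil.which("hisat2-build") is None:
        raise FileNotFoundError("hisat2-build not found on PATH")
    subprocess.run(hisat2_build_command(fasta_path, index_prefix), check=True)


if __name__ == "__main__":
    # Placeholder file names (assumptions); substitute your own.
    print(" ".join(hisat2_build_command("rsem_ref.idx.fa", "rsem_ref_hisat2")))
```

The same prefix is then what `hisat2 -x` expects when aligning. Building the command as a list keeps it inspectable before anything is run, which may help when the tool is wrapped by a platform such as Galaxy.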

        • Sbamo
          Junior Member
          • Jan 2016
          • 7

          #5
          Hi, thanks again for your reply.
          It was possible to reindex the "rsem ref name.idx.fa" file (trying to reindex "rsem ref name.seq" threw an error). Unfortunately, when I used it as a reference, HISAT2 reported the following problem:

          [E::sam_parse1] CIGAR and query sequence are of different length
          [W::sam_read1] parse error at line 42905944
          [main_samview] truncated file.
          Error while flushing and closing output
          terminate called after throwing an instance of 'int'
          (ERR): hisat2-align died with signal 6 (ABRT)
          [bam_sort_core] merging from 19 files...

          I looked up this part: "Error while flushing and closing output / terminate called after throwing an instance of 'int'", and it seems this is a common problem with the Bowtie/TopHat2 aligners. Sadly, none of the suggested fixes worked, and there was no post about this issue in HISAT2.
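
The first line of the log above ("CIGAR and query sequence are of different length") refers to a consistency rule from the SAM format: the CIGAR operations that consume read bases (M, I, S, = and X) must add up to the length of the SEQ field. A minimal sketch of that check is shown here. The function names are hypothetical, and the log does not say which record was malformed or why.

```python
import re

# CIGAR operations that consume bases of the read (per the SAM specification).
QUERY_CONSUMING = set("MIS=X")
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_query_length(cigar):
    """Sum the lengths of CIGAR operations that consume query bases."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in QUERY_CONSUMING)


def cigar_matches_seq(cigar, seq):
    """Return True if the CIGAR is consistent with the read length.

    '*' means the field is absent, in which case nothing can be checked.
    """
    if cigar == "*" or seq == "*":
        return True
    return cigar_query_length(cigar) == len(seq)
```

A record with CIGAR `3M1D2M` and a five-base SEQ passes this check, because the deletion consumes reference bases only. A record with CIGAR `4M` and the same SEQ fails it.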

          Please note that the GTF file and FASTA file used to create the RSEM reference have been tested, and HISAT2 runs successfully with them prior to RSEM conversion (they are hg19 references).

          As always any help is appreciated!

          • dpryan
            Devon Ryan
            • Jul 2011
            • 3478

            #6
            Can't RSEM just be fed a BAM file? That'd be easier. Anyway, I'd personally just use salmon or kallisto and be done in a fraction of the time.

            • robp
              Member
              • Aug 2013
              • 13

              #7
              RSEM cannot handle alignments with gaps in them; it accepts only matches and mismatches. I would suggest that you use the alignment-based mode of salmon if you'd still like to use the HISAT alignments downstream. Otherwise, you could try using the quasi-mapping-based mode of salmon or sailfish. These programs can produce accurate quantification estimates very quickly without the need to first perform traditional alignment of the reads. Full disclosure: I am the main developer of both of these tools.
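
The two salmon modes mentioned above can be sketched as command lines. This is a hedged sketch, not a recommended configuration: all file names are placeholders, the helper names are hypothetical, and the default library type `A` (automatic detection) is an assumption that should be checked against the installed salmon version. Note that the alignment-based mode expects reads aligned to transcript sequences rather than to the genome.

```python
def salmon_alignment_mode(transcripts_fa, bam, out_dir, lib_type="A"):
    """Alignment-based mode: quantify from reads already aligned to the transcriptome.

    lib_type "A" asks salmon to infer the library type (assumption: supported
    by the installed version; otherwise pass an explicit type such as "IU").
    """
    return ["salmon", "quant", "-t", transcripts_fa, "-l", lib_type,
            "-a", bam, "-o", out_dir]


def salmon_mapping_mode(transcripts_fa, index_dir, reads_1, reads_2, out_dir,
                        lib_type="A"):
    """Mapping-based mode: build a salmon index, then quantify directly from FASTQ."""
    index_cmd = ["salmon", "index", "-t", transcripts_fa, "-i", index_dir]
    quant_cmd = ["salmon", "quant", "-i", index_dir, "-l", lib_type,
                 "-1", reads_1, "-2", reads_2, "-o", out_dir]
    return index_cmd, quant_cmd
```

Each list can be passed to `subprocess.run(..., check=True)` once the paths are filled in. The mapping-based route skips the separate alignment step entirely.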

              • Sbamo
                Junior Member
                • Jan 2016
                • 7

                #8
                Robp, thank you for your reply! I may try them out, since gapped alignment is mandatory for my analysis!

                • Sbamo
                  Junior Member
                  • Jan 2016
                  • 7

                  #9
                  P.S. dpryan, RSEM can be fed a BAM file directly, but I had no luck doing this either, probably because it was created using gapped alignment. Thanks for your quick replies!
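
One way to test the suspicion that gapped alignments are the problem is to count how many aligned records carry a gap in their CIGAR string. The sketch below works on SAM text lines. Treating I, D and N operations as "gaps" is an assumption made for this illustration, and the function names are hypothetical.

```python
import re

# Insertions (I), deletions (D) and skipped reference regions (N) are
# treated as gaps here; this choice is an assumption of the sketch.
GAP_OPS = re.compile(r"\d+[IDN]")


def has_gap(cigar):
    """Return True if the CIGAR string contains an I, D or N operation."""
    return GAP_OPS.search(cigar) is not None


def count_gapped(sam_lines):
    """Count aligned SAM records with a gapped CIGAR.

    Returns a tuple (gapped_records, aligned_records). Header lines and
    records without a CIGAR ('*') are skipped.
    """
    gapped = total = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11 or fields[5] == "*":
            continue
        total += 1
        if has_gap(fields[5]):
            gapped += 1
    return gapped, total
```

The lines could come from a SAM file opened in text mode, or from the output of `samtools view` for a BAM file.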
