  • TransAbyss-Analyze Error

    Although the paired reads were checked for unpaired reads with two different scripts before running TransAbyss-Analyze, the run aborts with the error "Paired-end accessions FCC6B7TACXX:7:1101:10524:57301# and FCC6B7TACXX:7:1101:9245:1994# do not match".

    Command:
    transabyss-analyze -a TrAb_Merged_1_1.fa -1 RG11_NR_P3_R1.fq.gz -2 RG11_NR_P3_R2.fq.gz --SS --ref Gh1 --cfg /work/satishg/transcriptome.cfg --annodir /work/satishg/Gh1.gtf --analyze fusion -o /work/satishg/TrAn_RG11_Gh1 -t 20

    Error Message:
    /work/satishg/TrAn_RG11_Gh1/reads_to_genome/cluster/transabyss.local.sh: line 3: ulimit: core file size: cannot modify limit: Operation not permitted
    GSNAP version 2014-12-29 called with args: gsnap --gunzip -d Gh1 -D /work/satishg -t 18 --format sam -N 1 -m 10 /work/satishg/RG11_NR_P3_R1.fq.gz /work/satishg/RG11_NR_P3_R2.fq.gz
    Checking compiler assumptions for popcnt: 6B8B4567 __builtin_clz=1 __builtin_ctz=0 _mm_popcnt_u32=17 __builtin_popcount=17
    Checking compiler assumptions for SSE2: 6B8B4567 327B23C6 xor=59F066A1
    Checking compiler assumptions for SSE4.1: -103 -58 max=-58 => compiler sign extends
    Finished checking compiler assumptions
    Novel splicing (-N) turned on => assume reads are RNA-Seq
    Paired-end accessions FCC6B7TACXX:7:1101:10524:57301# and FCC6B7TACXX:7:1101:9245:1994# do not match
    real 0m0.623s
    user 0m0.009s
    sys 0m0.056s
    ERROR: Execution of script ended with a non-zero exit-status.

    I tried running it with three different pairs of read files, but all runs terminate with the same error.

  • #2
    Have you done something to the original raw data (e.g. trimming) that could potentially have broken the read pairing?

    Have you checked whether the reads the program is complaining about are present and have no problems?

    Code:
    $ zgrep -A 3 7:1101:10524:57301 R1.fq.gz R2.fq.gz   # substitute your own R1/R2 file names


    • #3
      They show up:

      $ zgrep -A 3 7:1101:10524:57301 RG11_NR_P3_R1.fq.gz
      @FCC6B7TACXX:7:1101:10524:57301#/1
      CCTCATGGATACCAAGCTTGAGGTTCTTTGAGAATGCCTCATAAAACTTGTTGTAATCTTCCTTGTTCTCTGCTATTTCAAAGAAGAG
      +
      giiiiihiiihhiihiihiihiideghiiihihiiifghighhhhhhiihhiafhiihhhhiibgggggedgeeeeeebdddddb`bc

      $ zgrep -A 3 7:1101:9245:1994 RG11_NR_P3_R2.fq.gz
      @FCC6B7TACXX:7:1101:9245:1994#/2
      ACAAGACTCGGCCGCTTAAAAAAACCAGGGTGAAAGCCATGCCTTTCGTTAAAGCTCAAAAGACCAAGGCTTATTTCAAGAGATATCA
      +
      gihiiiiiiiiiiiiiiiiiiiiiiiiiiibggggeeeaedcddddccccccccccccbcccccccccccccccddddccbccccdcd

      The reads were trimmed and filtered to remove rRNA sequences.


      • #4
        Those are not reads from the same fragment (unless you are just showing examples from separate R1 and R2 files).

        Does this show a corresponding read from the R2 file?

        Code:
        $ zgrep -A 3 7:1101:10524:57301 RG11_NR_P3_R2.fq.gz


        • #5
          It does not return anything. What is that supposed to mean?


          • #6
            That means the corresponding read was eliminated from the R2 file during trimming, leaving you with an unmatched read in R1. That is why TransAbyss is now complaining.

            Use repair.sh from BBMap to remove singleton reads.

            Code:
            $ repair.sh in1=r1.fq in2=r2.fq out1=fixed1.fq out2=fixed2.fq outsingle=singletons.fq

            In future, you may want to use a trimming program that is paired-end aware (or trim the R1/R2 files together) to keep the read pairing intact in the R1/R2 files.
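
The error GSNAP reports is consistent with the two files falling out of step at some record. As a minimal sketch (not part of BBMap, GSNAP, or TransAbyss), the Python below walks both files in parallel and reports the first record whose IDs differ. The function names `open_fastq`, `read_ids`, and `first_mismatch` are hypothetical, and the code assumes well-formed four-line FASTQ records whose headers end in `/1` and `/2`, as in the reads shown earlier in the thread.

```python
import gzip
from itertools import zip_longest


def open_fastq(path):
    """Open a FASTQ file as text, handling .gz files transparently."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def read_ids(path):
    """Yield the ID of each record, without the leading '@' or a /1 or /2 suffix.

    Assumes strict four-line records: header, sequence, '+', qualities.
    """
    with open_fastq(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 != 0:
                continue  # only header lines carry the read ID
            fields = line.split()
            if not fields:
                continue  # tolerate a trailing blank line
            name = fields[0][1:]
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name


def first_mismatch(r1_path, r2_path):
    """Return (record_index, r1_id, r2_id) for the first out-of-step pair.

    Returns None if every record in R1 has the same ID as the record at the
    same position in R2. A missing mate at the end of a file shows up as None
    in the corresponding slot.
    """
    pairs = zip_longest(read_ids(r1_path), read_ids(r2_path))
    for index, (id1, id2) in enumerate(pairs):
        if id1 != id2:
            return index, id1, id2
    return None
```

Calling `first_mismatch("RG11_NR_P3_R1.fq.gz", "RG11_NR_P3_R2.fq.gz")` would return the zero-based record index and the two IDs at the first point where the files disagree, or `None` if they stay in step to the end. The check streams both files, so it does not need to hold all IDs in memory.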


            • #7
              Thanks! It seems to be clearing out unpaired reads. I used Trimmomatic to trim the reads and then ran TransAbyss-Analyze. When I got the unpaired reads error, I ran the paired reads through a Python script to remove unpaired reads, but it didn't find any. I will run the BBMap output reads through TransAbyss-Analyze and report back if I face any other issues.


              • #8
                BBMap also contains bbduk.sh, which is a paired-end-aware trimming program. Find the thread for bbduk to get additional information.


                • #9
                  Dear GenoMax,

                  Recently I got the same error when I tried to map my RNA-seq data to the mm9 genome:

                  START time : Wed Jul 8 17:43:56 CEST 2015
                  GSNAP version 2014-12-23 called with args: /usr/local/gmap/gmap-2014-12-23/bin/gsnap -t 8 -N 1 -n 1 -A sam -D /data/DIV5/HumGen/WHY/NEO_genome_alignm$
                  Checking compiler assumptions for popcnt: 6B8B4567 __builtin_clz=1 __builtin_ctz=0 _mm_popcnt_u32=17 __builtin_popcount=17
                  Checking compiler assumptions for SSE2: 6B8B4567 327B23C6 xor=59F066A1
                  Checking compiler assumptions for SSE4.1: -103 -58 max=198 => compiler zero extends
                  Finished checking compiler assumptions
                  Novel splicing (-N) turned on => assume reads are RNA-Seq
                  Paired-end accessions FCC6N7WACXX:8:1101:1142:17629# and FCC6N7WACXX:8:1101:1206:2090# do not match

                  I tried to grep both IDs in both the R1 and R2 FASTQ files, and both files return sequence information for them.

                  For example:
                  grep -A 3 FCC6N7WACXX:8:1101:1206:2090 FCC6N7WACXX-MOUeoqEAAFRAAPEI-207_L8_1.fq

                  @FCC6N7WACXX:8:1101:1206:2090#/1
                  TCTCCTTCAACAACATCAAACTCCACAGTCTCTCCATCGCCTACACTGCGAAGGTACTTCCTGGGGTTATTCTTCTTTATGGCAGTCTGG
                  +
                  bbbeeeeegggggihihiiiiiihiiiigfhiiiiihiiiiiiiiiiihiiiii_egfhihihgggX[adddddd`bcccccccc[bccc

                  grep -A 3 FCC6N7WACXX:8:1101:1206:2090 FCC6N7WACXX-MOUeoqEAAFRAAPEI-207_L8_2.fq
                  @FCC6N7WACXX:8:1101:1206:2090#/2
                  TTGGGAACAGTCAAATGGTTCAATGTAAGGAACGGATACGGTTTCATCAACAGGAATGACACCAAGGAAGACGTATTTGTACACCAGACT
                  +
                  _bbeeeeeggggbefefhghhhhhifgfihdhiiihhiiiieghihhhigfgfghihihiiiiihiggggeeeccccdcbcceccccccc

                  So do you have any clue about this? Many thanks~!


                  • #10
                    @whytcs: Can you try running repair.sh from BBMap on your files to see if there is a problem somewhere else?

                    Someone had previously reported this error with GSNAP but the context there may not be applicable in your case: http://seqanswers.com/forums/showthread.php?t=45718
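
When both IDs turn up in both files, as in post #9, it can help to separate two situations: reads that are missing from one file, and reads that are present in both files but in a different order. The thread does not establish which one applies here, so the Python below is only a diagnostic sketch under stated assumptions. The names `fastq_ids` and `pairing_report` are hypothetical, the code assumes four-line FASTQ records with `/1` and `/2` header suffixes, and it keeps every read ID in memory, which may be impractical for very large files.

```python
import gzip


def fastq_ids(path):
    """Return the read IDs of a FASTQ file in file order.

    The leading '@' and a trailing /1 or /2 are removed so that mates
    from the two files compare equal.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    ids = []
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 0 and line.strip():
                name = line.split()[0][1:]
                if name.endswith(("/1", "/2")):
                    name = name[:-2]
                ids.append(name)
    return ids


def pairing_report(r1_path, r2_path):
    """Summarise how the read IDs of two paired FASTQ files relate."""
    ids1 = fastq_ids(r1_path)
    ids2 = fastq_ids(r2_path)
    set1, set2 = set(ids1), set(ids2)
    shared = set1 & set2
    # Compare the order of only those reads that occur in both files.
    order1 = [name for name in ids1 if name in shared]
    order2 = [name for name in ids2 if name in shared]
    return {
        "only_in_r1": sorted(set1 - set2),
        "only_in_r2": sorted(set2 - set1),
        "shared_in_same_order": order1 == order2,
    }
```

If `only_in_r1` and `only_in_r2` are both empty but `shared_in_same_order` is `False`, the files contain the same reads in a different order. If either list is non-empty, some reads have lost their mates.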
