  • Is this segemehl error due to memory?

    Hi, I am fairly new to RNA-seq.
    I am trying to analyze my data using segemehl but am running into the following error. (I've copied and pasted the last part of the output.)
    [SEGEMEHL] Fri Jul 24 19:16:53 2015: 1637977 reads in thread 0.
    [SEGEMEHL] Fri Jul 24 19:16:53 2015: 1637824 reads in thread 1.
    [SEGEMEHL] Fri Jul 24 19:16:53 2015: 1637824 reads in thread 2.
    [SEGEMEHL] Fri Jul 24 19:16:53 2015: 1637824 reads in thread 3.
    segemehl.x: libs/biofiles.c:1160: bl_fastxAddMate: Assertion `bl_fastaCheckMateID(f, n, descr, descrlen)' failed.

    My job command is:
    segemehl.x --silent -i hg19.idx -d human_hg19.fa -q READ1 -p READ2 -O -o sege.sam -u unmap.sam -D 1 -t 4

    One of my questions: if I submit the job chromosome by chromosome to reduce the memory load, how can segemehl map reads that align to different chromosomes?

    I read in some posts that I should use the full reference file, but this leads to a significant increase in mapping time and memory requirements.
    How do I find the right balance?

    Thank you in advance

  • #2
    There is no "right balance". You need to map to the full reference if you want correct results.

    I can't advise you on that error message, but your command certainly looks strange. Is that the actual command, or are you substituting "READ1" and "READ2" for the filenames?
    Last edited by Brian Bushnell; 07-27-2015, 09:30 AM.



    • #3
      If this error occurs, segemehl cannot assign mate2 to mate1. Are the reads in both your files in correct order? Do they have matching read ids (at least the beginning of the id)? Do you have the same number of reads in the mate1 file and the mate2 file?
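
The three questions above can be checked with a small script before mapping. This is a minimal sketch, not part of segemehl: it assumes plain 4-line FASTQ records (optionally gzipped), compares only the part of each header before the first whitespace, and ignores a trailing '/1' or '/2'. The function names are made up for illustration.

```python
import gzip
from itertools import zip_longest


def read_ids(path):
    """Yield the read ID (header up to the first whitespace, without '@')
    of every record, assuming 4 lines per FASTQ record."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        non_blank = (line for line in handle if line.strip())
        for line_number, line in enumerate(non_blank):
            if line_number % 4 == 0:  # header line of each record
                yield line[1:].split()[0]


def strip_mate_suffix(read_id):
    """Drop a trailing '/1' or '/2' so both mates compare equal."""
    return read_id[:-2] if read_id.endswith(("/1", "/2")) else read_id


def first_pairing_problem(mate1_path, mate2_path):
    """Return None if the two files pair up record by record,
    otherwise (record_number, id1, id2) for the first problem found.
    A missing ID (None) means that file has fewer reads."""
    pairs = zip_longest(read_ids(mate1_path), read_ids(mate2_path))
    for record_number, (id1, id2) in enumerate(pairs, start=1):
        if id1 is None or id2 is None:
            return record_number, id1, id2  # read counts differ
        if strip_mate_suffix(id1) != strip_mate_suffix(id2):
            return record_number, id1, id2  # order or IDs differ
    return None
```

Calling `first_pairing_problem("READ1", "READ2")` with the real file names returns `None` when both files have the same number of reads in the same order, and otherwise points at the first record where the mates stop lining up.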

      If you did adapter clipping and/or quality trimming, make sure that you processed both files together and not separately in two calls. You can use bbduk to trim paired-end reads without losing the mate1-mate2 connection.
      ecSeq Bioinformatics is Europe’s leading provider of hands-on bioinformatics workshops and professional data analysis in the field of Next-Generation Sequencing (NGS).



      • #4
        Thank you for the reply.

        Brian Bushnell: Yes, READ1 and READ2 are being substituted with the actual fastq file names.
        Are there other strange things you can find in my command? Please let me know.

        ecSeq Bioinformatics: I was using AlienTrimmer and I believe it does not do read ID matching. I am fairly sure that is the problem.
        Thank you.



        • #5
          Hi Him26,
          Have you solved the problem? I'm using segemehl and am running into the same problem. I have not done any trimming on my fastq files, and I have checked that the reads in both files are in the correct order. I would really appreciate any help.
          Thank you.



          • #6
            Nope.

            I got caught up with other issues and have not followed up on this matter. Sorry about this. Do let me know if you find out anything.



            • #7
              Segemehl tries to find the two mates that belong together by checking the fastq identifiers.

              The identifiers of the two mates have to be one of the following:
              1. completely identical,
              2. identical in the substring before the first whitespace, or
              3. identical apart from a '/1' or a '/2' at their ends
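
The three rules above can be written down as a small check. This is only a sketch of one reading of those rules, not segemehl's actual implementation; the function names are made up, and applying the '/1' and '/2' rule to the part of the header before the first whitespace is an assumption.

```python
def first_token(header):
    """Return the header text before the first whitespace, without '@'."""
    fields = header.lstrip("@").split()
    return fields[0] if fields else ""


def mate_ids_match(header1, header2):
    """Check two FASTQ header lines against the three rules listed above."""
    # Rule 1: the headers are completely identical.
    if header1.strip() == header2.strip():
        return True
    token1, token2 = first_token(header1), first_token(header2)
    # Rule 2: everything before the first whitespace is identical.
    if token1 and token1 == token2:
        return True
    # Rule 3: identical apart from a trailing '/1' (mate 1) and '/2' (mate 2).
    return (
        token1.endswith("/1")
        and token2.endswith("/2")
        and token1[:-2] == token2[:-2]
    )
```

For example, `mate_ids_match("@r1 1:N:0:ATCACG", "@r1 2:N:0:ATCACG")` and `mate_ids_match("@r1/1", "@r1/2")` both hold, while `mate_ids_match("@r1/1", "@r2/2")` does not.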