  • samtools faidx Segmentation fault

    populus@Rust:~/samtools-0.1.5c_x86_64-linux$ ./samtools faidx /media/Poplar/baohua/genome/poplar_genome.fa
    [fai_build_core] different line length in sequence 'scaffold_28'.
    Segmentation fault

    What does "different line length" mean?

    It's a standard FASTA file.

    I downloaded it from:
    ftp://ftp.jgi-psf.org/pub/JGI_data/P...asked.fasta.gz

  • #2
    Can you verify that every sequence line in the FASTA has the same length?
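One way to do that check without extra software is a short Python script. This is only a sketch: the function name is made up for illustration, and the rule it applies (every sequence line in a record has the same length, except that the last one may be shorter, and no blank lines) is an assumption about what faidx expects, based on the error message and on reports in this thread.

```python
def find_fasta_line_problems(lines):
    """Return (line_number, sequence_name, problem) tuples for suspicious lines."""
    problems = []
    name = None      # name of the record currently being read
    width = None     # length of the first sequence line in this record
    ended = False    # True once a shorter line has been seen in this record
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            width, ended = None, False
        elif line == "":
            problems.append((number, name, "blank line"))
        elif width is None:
            width = len(line)
        elif ended or len(line) > width:
            problems.append((number, name, "different line length"))
        elif len(line) < width:
            ended = True  # acceptable only if it turns out to be the last line
    return problems
```

To use it on a real file, pass an open file handle, for example `find_fasta_line_problems(open("poplar_genome.fa"))`, and print whatever it returns.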


    • #3
      I am having the same issue. I have verified that the sequence file contains lines of different lengths, but why should this matter?
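A likely reason is that faidx exists to give random access. In the .fai format documented for current samtools, each row stores the sequence's starting byte offset, the number of bases per line, and the number of bytes per line (bases plus the line terminator), and the byte position of any base is computed from those three numbers. That arithmetic only holds if every line but the last has the same width. A minimal sketch follows; the function name and the toy data are invented for illustration and are not taken from samtools.

```python
def byte_offset(seq_offset, line_bases, line_width, position):
    """Byte offset in the file of a 0-based base position within one sequence."""
    full_lines, column = divmod(position, line_bases)
    return seq_offset + full_lines * line_width + column

# Toy file with uniform lines: 4 bases + 1 newline per line; sequence starts at byte 3.
even = b">s\nACGT\nTGCA\nGG\n"
# Same sequence (ACGTTGCAGG), but the second line is short.
uneven = b">s\nACGT\nTG\nCAGG\n"

pos = 6  # 0-based position of the base "C"
i = byte_offset(3, 4, 5, pos)
even_hit = even[i:i + 1]      # b"C": the arithmetic finds the right base
uneven_hit = uneven[i:i + 1]  # b"\n": the same arithmetic lands on a newline
```

With uneven lines the index would silently return the wrong bytes, so refusing to index such a file is reasonable, although crashing with a segmentation fault is not.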


      • #4
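        This BioPerl snippet fixes the FASTA file:

        Code:
        use strict;
        use warnings;
        use Bio::SeqIO;

        # Read every record and write it back out; Bio::SeqIO writes the
        # sequence lines at a uniform width.
        my $in  = Bio::SeqIO->new(-file => "inputfilename",
                                  -format => 'Fasta');
        my $out = Bio::SeqIO->new(-file => ">outputfilename",
                                  -format => 'Fasta');
        while ( my $seq = $in->next_seq() ) { $out->write_seq($seq); }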


        • #5
          I had this problem too. My solution was to use Unix commands to trim and fold the FASTA file. You have to cut off the header first, then concatenate it with the folded sequence. That solved the problem for me.

          Writing a shell script may be a good idea to make things easier.
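The same trim-and-fold idea can be sketched in plain Python, which handles multi-sequence files without cutting headers by hand. This is not the poster's exact commands; the function name and the default width of 60 are assumptions chosen for illustration.

```python
def rewrap_fasta(lines, width=60):
    """Yield FASTA lines with blank lines dropped and sequence wrapped at `width`."""
    chunks = []  # sequence pieces of the record currently being read

    def flush():
        # Join the collected pieces and emit them as fixed-width lines.
        sequence = "".join(chunks)
        chunks.clear()
        for start in range(0, len(sequence), width):
            yield sequence[start:start + width] + "\n"

    for raw in lines:
        line = raw.strip()
        if not line:
            continue              # skip blank lines
        if line.startswith(">"):
            yield from flush()    # write out the previous record's sequence
            yield line + "\n"
        else:
            chunks.append(line)
    yield from flush()            # write out the final record
```

To rewrite a file, open the input for reading and the output for writing, then call `dst.writelines(rewrap_fasta(src))`. Note that the sketch holds one whole sequence in memory at a time, which may matter for very large chromosomes.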


          • #6
            Originally posted by baohua100 View Post
            populus@Rust:~/samtools-0.1.5c_x86_64-linux$ ./samtools faidx /media/Poplar/baohua/genome/poplar_genome.fa
            [fai_build_core] different line length in sequence 'scaffold_28'.
            Segmentation fault

            What does "different line length" mean?
            I just found the same thing when there are blank lines in the FASTA file. The message "different line length" is very misleading in this case. I'll report this bug.


            • #7
              Thanks very much webbrewer for your bioperl fix, it worked perfectly.


              • #8
                Originally posted by webbrewer View Post
                This bioperl snippet fixes the fasta
                For anyone interested, here is the Biopython equivalent:

                Code:
                from Bio import SeqIO
                SeqIO.convert("inputfilename.fas", "fasta", "outputfilename.fas", "fasta")
                The convert function returns the number of records, if you want that information.


                • #9
                  I faced the same problem, but solved it with webbrewer's code!

                  Thanks a lot!


                  • #10
                    Hello, I have the same problem. The last lines of my .fa file look like this:
                    Code:
                    ggttagggtgtggtgtgtgggtgtgtgtgggtgtggtgtgtgtgggtgtg
                    gtgtgtgggtgtgggtgtgggtgtgggtgtgtgggtgtggtgtgtgggtg
                    tggT
                    That means the last line does not have the same length as the others.
                    My question is: can I fix this manually instead of using software?
                    I don't want to install a lot of software that I would rarely use.

                    Thanks.


                    • #11
                      The problem is that your FASTA file has blank lines in it.
                      You need to get rid of them!

                      You can use :g/^$/d in the vi/vim editor.
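If vi/vim is not an option, roughly the same cleanup can be sketched in a few lines of Python. The function name is hypothetical, and unlike `:g/^$/d` this version also removes lines that contain only spaces or tabs, which look blank but are not matched by `^$`.

```python
def drop_blank_lines(lines):
    """Yield only the lines that contain something other than whitespace."""
    for line in lines:
        if line.strip():
            yield line
```

To apply it, read from the original file and write to a new one, for example `dst.writelines(drop_blank_lines(src))`, so the original stays untouched until the result has been checked.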


                      • #12
                        Hello, I am running into the same error. I tried the Bio script posted above, but it failed with an error.

                        I then looked at my FASTA file in vim, and it does not appear to have any blank lines.

                        Does anyone have a suggestion for how to fix this problem so that I can use the samtools faidx command on my FASTA file?

                        Thank you in advance for your help.
                        jdjax
                        Ph.d. Student
                        Åarhus University


                        • #13
                          Figured it out on my own.
                          jdjax
                          Ph.d. Student
                          Åarhus University


                          • #14
                            Originally posted by jdjax View Post
                            Figured it out on my own.
                            For the benefit of future readers, what was it about your FASTA file that faidx was breaking on? You said it wasn't blank lines.


                            • #15
                              Originally posted by maubp View Post
                              For the benefit of future readers, what was it about your FASTA file that faidx was breaking on? You said it wasn't blank lines.
                              It actually was blank lines. There were two blank lines at the end of my file that caused the problem.
                              jdjax
                              Ph.d. Student
                              Åarhus University
