Convert .fna file from NCBI to .fa or .fasta file

  • Convert .fna file from NCBI to .fa or .fasta file

    Hello,
    I am totally new to this (I am a student following a course in bioinformatics). As an exercise, I wanted to use a genome found on NCBI (in .fna or GenBank format, with .gff annotation) as a reference genome in STAR, but I cannot find a way to convert the .fna file so that STAR can read the genome via --genomeFastaFiles. The genome is not available on the usual genome database sites (e.g. UCSC), since it is from a copepod and not much genomic work has been done on copepods...
    Is it even possible to use such a genome as a reference genome, or is this a bad idea from the start?
    Thank you in advance,
    kind regards,
    Josefien

  • #2
    As far as I know, .fna just means fasta nucleic acid (as opposed to .faa, fasta amino acid, for protein sequences), so the file is actually in fasta format.


    • #3
      The problem is that STAR is not recognizing this FASTA file (.fna): I am getting an error saying that it is impossible to read the file. That is why I wondered whether it is possible to convert from .fna to .fa. Or do you think the problem is with the file itself, and STAR is able to read .fna files?


      • #4
        Just rename the .fna extension to .fa (as long as the file is in fasta format). That should work.

        Code:
        $ cp file.fna file.fa
        If you are not sure about the format of the file, post the output of this command:
        Code:
        $ head -10 file.fna
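
The visual check with `head` can also be scripted. The following is only a minimal sketch: `looks_like_fasta` is a hypothetical helper name (not part of STAR or any NCBI tool), it inspects just the first record, and it assumes a nucleotide file whose sequence lines contain only IUPAC nucleotide codes.

```shell
# Hypothetical helper (not a STAR feature): quick sanity check that a file
# looks like nucleotide FASTA. Only the first record is inspected.
looks_like_fasta() {
    # Drop blank lines, then take the first two remaining lines.
    header=$(grep -v '^[[:space:]]*$' "$1" | sed -n '1p')
    seq=$(grep -v '^[[:space:]]*$' "$1" | sed -n '2p')
    case "$header" in
        '>'*) ;;                                  # header must start with ">"
        *) echo "no FASTA header"; return 1 ;;
    esac
    # Sequence line: IUPAC nucleotide codes only (upper or lower case).
    if printf '%s\n' "$seq" | grep -Eq '^[ACGTUNRYKMSWBDHVacgtunrykmswbdhv]+$'; then
        echo "looks like FASTA"
    else
        echo "unexpected sequence line"; return 1
    fi
}

# Example with a made-up two-line record:
demo=$(mktemp)
printf '>seq1 example record\nACGTNNACGT\n' > "$demo"
looks_like_fasta "$demo"
rm -f "$demo"
```

A file that passes this check can still be rejected for other reasons, so the actual error message from the aligner remains the most useful thing to post.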


        • #5
          Originally posted by GenoMax View Post
          Just rename the .fna extension to .fa (as long as the file is in fasta format). That should work.

          Code:
          $ cp file.fna file.fa
          If you are not sure about the format of the file, post the output of this command:
          Code:
          $ head -10 file.fna
          Thank you very much, it is working now!


          • #6
            Could you please explain whether changing the file extension affects the results when mapping RNA-seq reads against a reference genome with HISAT2?


            • #7
              No, the results will not be affected, since we are not changing the sequence or content of any data files. We are only renaming the file.


              • #8
                Will the content be the same in both files?


                • #9
                  Yes. As long as you only change the file name.
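
For anyone who wants to confirm this rather than take it on trust, a byte-for-byte comparison is enough. This is a minimal sketch using a made-up two-line record in a temporary directory; with a real genome, the same `cmp` call would be pointed at the original download and the copied or renamed file.

```shell
# Made-up example file standing in for a downloaded genome.
workdir=$(mktemp -d)
printf '>seq1 example record\nACGTACGTAC\n' > "$workdir/file.fna"

# Copy under the new extension, as in the thread.
cp "$workdir/file.fna" "$workdir/file.fa"

# cmp -s is silent and exits 0 only when the two files are byte-identical.
if cmp -s "$workdir/file.fna" "$workdir/file.fa"; then
    result="identical"
else
    result="different"
fi
echo "$result"

rm -rf "$workdir"
```

Since the bytes are the same, any tool reading the file sees the same sequences regardless of the extension.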


                  • #10
                    Thank you for your help
