  • IGV .genome files: where is the sequence?

    IGV offers a number of "built in" genomes to display your .bam file against. These are apparently packaged in .genome files. The .genome file is actually a zipped set of a few files. But none of them seem to be the actual genome sequence.

    Where does IGV keep the genome sequence?

    For example, here is the chicken genome reference file, "galGal3.genome", after unzipping, as listed by "ls -1s":

    3 Chicken_galGal3_cytoband.txt
    1514 chicken.refflat
    3 property.txt
    1755 refGene.txt

    None of them contain the chicken reference sequence.

    --
    Phillip
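For readers who want to repeat the listing above without unzipping by hand, here is a minimal Python sketch. It assumes only what the post states, namely that the .genome file is a zip archive; the function name `list_genome_members` is made up for illustration and is not part of IGV.

```python
import zipfile


def list_genome_members(path):
    """Return (name, uncompressed size in bytes) for each file in the archive."""
    with zipfile.ZipFile(path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


# Example use (the path is whatever .genome file you have locally):
# for name, size in list_genome_members("galGal3.genome"):
#     print(f"{size:>12}  {name}")
```

Note that the sizes are in bytes, whereas "ls -1s" reports allocated blocks, so the numbers will not match the listing in the post.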

  • #2
    Originally posted by pmiguel
    Where does IGV keep the genome sequence?
    I'm curious how you found out that galGal3.genome is a zip file.
    I don't know where the sequence is either.
    Waiting for a reply...


    • #3
      Originally posted by hanifk
      I'm curious how you found out that galGal3.genome is a zip file.
      I don't know where the sequence is either.
      Waiting for a reply...
      You would have to ask Rick Westerman; he determined the file was zipped. He looked at the first few characters of the file and saw "PK" there. That said, when I look now, I see no "PK".

      IGV has fairly extensive documentation, so it is probably mentioned somewhere.

      --
      Phillip


      • #4
        Not exactly an answer, but it should be suitable for my purposes. The Broad describes the source of each .genome file it has "built in" on its website (the link originally posted here no longer leads to that page).


        --
        Phillip


        • #5
          I believe built-in genomes are read/cached from the Broad website. I got an error message yesterday while using IGV, warning me that communication with the main server had been interrupted. As a result, the DNA sequences (and translated proteins) disappeared.
          If you build your own genome instead, you should find a directory layout like this:

          Code:
          IGV_Genomes/
          |
          +-MyGenome/
             |
             +-MyGenome.genome
             +-MyGenome.genome_seq/
                |
                +-OriginalSequenceFile
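One way to test the "read from the Broad website" idea is to look inside the archive's property.txt (the file name comes from the listing in the first post) and see whether any entry points at a remote location. The sketch below assumes the file holds plain key=value lines, possibly with Java-style escaped colons; that format, and the helper names, are assumptions rather than documented IGV behaviour.

```python
import zipfile


def read_genome_properties(path, member="property.txt"):
    """Parse key=value lines from the property file inside a .genome archive."""
    with zipfile.ZipFile(path) as archive:
        text = archive.read(member).decode("utf-8", errors="replace")
    props = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Assumption: colons may be escaped as "\:" (Java properties style).
        props[key.strip()] = value.strip().replace("\\:", ":")
    return props


def remote_values(props):
    """Return only the entries whose value looks like a URL."""
    return {key: value for key, value in props.items() if "://" in value}
```

If `remote_values(read_genome_properties("galGal3.genome"))` comes back non-empty, that would be consistent with the sequence being fetched over the network rather than stored in the .genome file.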


          • #6
            @pmiguel

            If you're on a Unix/Linux system, type:

            file galGal3.genome

            This will tell you (within certain limits) what type of file it is: ASCII, binary, zip, gzip, etc.

            Not sure about Windows. It should work on the command line on a Mac.

            -sf
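For Windows, or anywhere the "file" command is not available, roughly the same check can be done in Python by reading the first four bytes and comparing them with the standard zip signatures, all of which start with "PK". This is only a sketch of the "look for PK" trick mentioned earlier in the thread; `looks_like_zip` is an illustrative name, not an existing tool.

```python
# Standard zip signatures: local file header, empty archive, spanned archive.
ZIP_SIGNATURES = (b"PK\x03\x04", b"PK\x05\x06", b"PK\x07\x08")


def looks_like_zip(path):
    """Return True if the file starts with one of the zip signatures."""
    with open(path, "rb") as handle:
        return handle.read(4) in ZIP_SIGNATURES
```

The standard library also offers `zipfile.is_zipfile(path)`, which checks for the archive's end-of-central-directory record instead of the leading bytes.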


            • #7
              @seqfast,

              It works:
              file galGal3.genome
              galGal3.genome: Zip archive data, at least v2.0 to extract

              Thanks!

              --
              Phillip


              • #8
                This thread helps.
