
  • fasta sequence: 0 based or 1 based index

    When we refer to a position in a sequence, is it zero-based or one-based?
    I ask because I want to extract a subsequence from a long one.
    Code:
    AGCTTT
    012345
    OR
    Code:
    AGCTTT
    123456
    Thanks.

  • #2
    That is undefined if you don't specify which system (or language) you're working with.



    • #3
      Okay. Suppose the sequence is given, along with a position, 63150935. I want to extract a 1000 kb window around this point in C#.
      Then
      Code:
      string trunk = sequence.Substring(63150935-500000,1000000);
      or
      Code:
      string trunk = sequence.Substring(63150934-500000,1000000);
      Which one is correct?
      Last edited by ardmore; 11-15-2011, 08:26 AM.



      • #4
        In C#, string indices are 0-based, so the first one would be appropriate if your position is also defined as 0-based.

        If it is 1-based (for example, if it comes from Ensembl), you'll need the second one instead.
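
A rough sketch of the arithmetic above, not a definitive implementation: the position is converted to a 0-based index once, and the window is clamped so that `Substring` does not throw near either end of the sequence. The class `WindowSketch`, the helper `ExtractWindow`, and its parameters are hypothetical names introduced for illustration only.

```csharp
using System;

public static class WindowSketch
{
    // Hypothetical helper: returns up to `windowSize` characters around `position`.
    // `oneBased` states which convention `position` uses; C# strings are 0-based.
    public static string ExtractWindow(string sequence, long position, int windowSize, bool oneBased)
    {
        long index = oneBased ? position - 1 : position;   // convert to a 0-based index
        if (index < 0 || index >= sequence.Length)
            throw new ArgumentOutOfRangeException(nameof(position));

        long start = Math.Max(0, index - windowSize / 2);             // clamp at the left end
        long length = Math.Min(windowSize, sequence.Length - start);  // clamp at the right end
        return sequence.Substring((int)start, (int)length);
    }

    public static void Main()
    {
        // Toy sequence from the first post: the first T is index 3 (0-based)
        // or position 4 (1-based), so both calls return the same window.
        Console.WriteLine(ExtractWindow("AGCTTT", 3, 2, oneBased: false));
        Console.WriteLine(ExtractWindow("AGCTTT", 4, 2, oneBased: true));
    }
}
```

With `windowSize` set to 1000000, the unclamped case reduces to the two `Substring` calls quoted earlier in the thread: the first for a 0-based position, the second for a 1-based one.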



        • #5
          My question is not about C#. I mean that I am not sure whether the sequence itself is defined as 0-based or not. The sequence comes from a FASTA file or is extracted from a genome.



          • #6
            The sequence is not your issue. A sequence itself is not '0-based'; it's just a list of characters.
            Where does your position 63150935 come from?



            • #7
              It comes from the output of a BAM file. Suppose we define a region such as chr22:10000-20000
              and get the consensus sequence, but we are only interested in a small region around a specific position.
              How do we extract it?



              • #8
                If it's from a BAM/SAM file, then look at the BAM/SAM specification:



                For example, the fourth field of SAM files is 1-based:
                POS: 1-based leftmost mapping POSition of the first matching base. The first base in a reference
                sequence has coordinate 1. POS is set as 0 for an unmapped read without coordinate. If POS is
                0, no assumptions can be made about RNAME and CIGAR.
                whereas the internal BAM representation is 0-based:
                pos / 0-based leftmost coordinate (= POS − 1) / int32_t / [-1]
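
To tie the specification back to the region question, here is a minimal sketch under two stated assumptions: the position is a SAM `POS` value (1-based), and the consensus string starts at the first base of a region whose start coordinate is 1-based and inclusive, as in chr22:10000-20000. The names `CoordinateSketch`, `SamPosToZeroBased`, and `OffsetInRegion` are made up for illustration and do not come from any library.

```csharp
using System;

public static class CoordinateSketch
{
    // SAM text POS is 1-based; the BAM binary pos field stores POS - 1.
    public static int SamPosToZeroBased(int samPos)
    {
        if (samPos < 1)
            throw new ArgumentOutOfRangeException(nameof(samPos), "POS 0 means no coordinate");
        return samPos - 1;
    }

    // Assumption: regionStart is 1-based and inclusive, and the extracted
    // string begins with the base at regionStart. Returns the 0-based index
    // of the 1-based genomic position inside that string.
    public static int OffsetInRegion(int genomicPos, int regionStart)
    {
        if (genomicPos < regionStart)
            throw new ArgumentOutOfRangeException(nameof(genomicPos));
        return genomicPos - regionStart;
    }

    public static void Main()
    {
        Console.WriteLine(SamPosToZeroBased(1));          // first reference base, index 0
        Console.WriteLine(OffsetInRegion(10000, 10000));  // first base of the region, index 0
        Console.WriteLine(OffsetInRegion(15000, 10000));  // index 5000 in the region string
    }
}
```

The resulting index can then be passed to `Substring` on the region's consensus string, since both are 0-based at that point.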



                • #9
                  Thank you.
