  • SSPACE help

    Hi everyone,

    I was just wondering what an f_tig meant in the .evidence file produced by SSPACE. Is it the original contigs that are put into a scaffold or something else? Thanks!

  • #2
    These are the original contigs (numbered according to their order in the FASTA file). The 'f' indicates that the contig has forward orientation in the final scaffold; the 'r' indicates reverse orientation.


    • #3
      Hi boetsie, I have seen your reply. I am currently building a contig graph from the SSPACE output, because SSPACE produces a file that stores the contig link information. For example, f3545 has 23 links with r3245 and a gap of -68 bases. I take this to mean that there is a gap of 68 bases between f3545 and r3245, but what does the "-" in front of the 68 mean? I do not understand it.
      I also have another question about scaffolds. Below is a record. From it, I can see that there is a gap between r_tig3042 and f_tig3539 and that its size is 615, but why does it say merged15? How should I interpret that?
      scaffold6|size68841|tigs7
      f_tig3325|size6927|links7|gaps-694|merged25
      f_tig3146|size3331|links6|gaps-623
      r_tig3405|size10398|links5|gaps-621
      f_tig3266|size5457|links15|gaps-649
      f_tig3358|size8089|links8|gaps383
      r_tig3042|size2074|links5|gaps-615|merged15
      f_tig3539|size32219

      Thank you very much. I look forward to hearing from you.
      Best wishes,

      Yue Xu
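
A minimal sketch for reading a record like the one above into a structure that is easier to build a contig graph from. It assumes only the layout visible in this post: `|`-separated fields, a name first, then a keyword fused to an integer (`size`, `links`, `gaps`, `merged`, `tigs`). The function name `parse_evidence_line` is hypothetical and not part of SSPACE, and other evidence files may contain fields not handled here.

```python
import re

# One field looks like a lowercase keyword fused to a (possibly negative) integer.
FIELD = re.compile(r"([a-z]+)(-?\d+)$")

def parse_evidence_line(line):
    """Split one '|'-separated evidence line into a dict (hypothetical helper)."""
    name, *fields = line.strip().split("|")
    record = {"name": name}
    # Contig lines start with f_tig or r_tig; the scaffold header does not.
    if name.startswith(("f_tig", "r_tig")):
        record["orientation"] = "forward" if name[0] == "f" else "reverse"
    for field in fields:
        match = FIELD.match(field)
        if match:
            record[match.group(1)] = int(match.group(2))
    return record

RECORD = """\
scaffold6|size68841|tigs7
f_tig3325|size6927|links7|gaps-694|merged25
f_tig3146|size3331|links6|gaps-623
r_tig3405|size10398|links5|gaps-621
f_tig3266|size5457|links15|gaps-649
f_tig3358|size8089|links8|gaps383
r_tig3042|size2074|links5|gaps-615|merged15
f_tig3539|size32219
"""

rows = [parse_evidence_line(line) for line in RECORD.splitlines() if line.strip()]

# List the contigs whose estimated gap to the next contig is negative.
negative = [row["name"] for row in rows if row.get("gaps", 0) < 0]
print(negative)
```

The last contig in a scaffold carries no `links` or `gaps` field, so the sketch uses `row.get("gaps", 0)` rather than indexing directly.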


      • #4
        The negative gap indicates a potential overlap between the two contigs. However, a 615 bp overlap between the contigs seems unlikely, which suggests that the insert size you provided in the library file is not correct.

        To illustrate how this is estimated:

        Say you have two contigs, contig1 of 1000 bp and contig2 of 2000 bp. One read of a pair aligns at position 900 on contig1 and the other at position 100 on contig2.

        If you set the insert size to 210 bp, the estimated gap is:
        Provided insert size - ((size of contig1)-(position of read1 on contig1)) + (position of read2 on contig2). In this case it is:
        210 - (1000-900) + 100 = 10

        So a gap of 10 bp. If we change the insert size to 2000, it is:

        2000 - (1000-900) + 100 = 1800

        If we change the insert size to 100, it is:

        100 - (1000-900) + 100 = -100

        As you can see, the estimated gap depends directly on the insert size provided by the user.

        In your case I see a number of large negative gaps, which is highly unusual. You should probably lower your insert size by 600 bases.

        Regards,
        Boetsie
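
The worked example above can be sketched as a small function. This is only an illustration of the arithmetic in this post, not SSPACE's actual code: the name `estimate_gap` and its parameters are hypothetical, and the two read offsets are grouped together before subtraction because that grouping is what yields the posted results of 10, 1800 and -100.

```python
def estimate_gap(insert_size, contig1_size, read1_pos, read2_pos):
    """Estimate the gap between two contigs from one linking read pair."""
    # Bases between read 1 and the end of contig 1.
    tail_on_contig1 = contig1_size - read1_pos
    # Bases between the start of contig 2 and read 2.
    head_on_contig2 = read2_pos
    # Whatever part of the insert is not covered by the contigs is the gap;
    # a negative value suggests the contigs overlap.
    return insert_size - (tail_on_contig1 + head_on_contig2)

# Same read positions as in the post, three different insert sizes.
for insert_size in (210, 2000, 100):
    print(insert_size, estimate_gap(insert_size, 1000, 900, 100))
```

With the read positions held fixed, the estimate moves one-for-one with the insert size passed in.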


        • #5
          Hi, thank you for your detailed reply. Because of it, I now understand how the gap between contigs is calculated in SSPACE. Thank you very much.
          But regarding the formula you wrote:
          Provided insert size - (((size of contig1)-(position of read1 on contig1)) + (position of read2 on contig2))
          is it missing the outer pair of brackets, which I have added here and marked in bold and italic?

          Yours sincerely,
          Yue Xu
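
A quick check of the two readings, using the numbers from post #4. The symbols are shorthand introduced here, not SSPACE notation: I is the provided insert size, L_1 the size of contig1, p_1 the position of read1 on contig1, and p_2 the position of read2 on contig2. With the outer brackets the formula gives the result of 10 quoted in post #4; under standard operator precedence the version without them gives 210, so the bracketed form appears to be the intended one.

```latex
\begin{align*}
\text{with outer brackets:}\quad
  I - \bigl[(L_1 - p_1) + p_2\bigr]
  &= 210 - \bigl[(1000 - 900) + 100\bigr] = 10 \\
\text{without outer brackets:}\quad
  I - (L_1 - p_1) + p_2
  &= 210 - 100 + 100 = 210
\end{align*}
```

The same holds for the other two insert sizes in post #4: 2000 - 200 = 1800 and 100 - 200 = -100 both require the bracketed grouping.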
