SSPACE help


  • SSPACE help

    Hi everyone,

    I was just wondering what an f_tig meant in the .evidence file produced by SSPACE. Is it the original contigs that are put into a scaffold or something else? Thanks!

  • #2
    These are the original contigs (order is based on the order of the contig in the fasta file). The 'f' indicates that it has forward orientation in the final scaffold, the 'r' means the reverse orientation.
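To make the naming concrete, here is a minimal Python sketch that splits one contig line of the .evidence file into its parts. The function name `parse_evidence_line` and the returned dictionary keys are hypothetical, and the field layout (`size`, `links`, `gaps`, `merged` separated by `|`) is assumed from the record quoted later in this thread, not taken from the SSPACE documentation.

```python
import re

def parse_evidence_line(line):
    """Split a contig line such as 'r_tig3042|size2074|links5|gaps-615|merged15'.

    Assumed layout: '<f|r>_<contig>|<key><integer>|...'.
    """
    fields = line.strip().split("|")
    prefix, _, contig = fields[0].partition("_")
    if prefix not in ("f", "r"):
        raise ValueError(f"unexpected orientation prefix: {prefix!r}")
    record = {
        # 'f' = forward, 'r' = reverse orientation in the final scaffold
        "orientation": "forward" if prefix == "f" else "reverse",
        "contig": contig,
    }
    for field in fields[1:]:
        # Each remaining field is a lowercase key followed by a signed integer
        match = re.fullmatch(r"([a-z]+)(-?\d+)", field)
        if match:
            record[match.group(1)] = int(match.group(2))
    return record

print(parse_evidence_line("f_tig3325|size6927|links7|gaps-694|merged25"))
```

The last contig of a scaffold carries only a size field, so the keys `links`, `gaps` and `merged` are simply absent for it.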



    • #3
      Hi boetsie, I have seen your reply. I am trying to construct a contig graph from the output of SSPACE, which produces a file that stores the contig link information. For example, f3545 has 23 links with r3245 and a gap of -68 bases. I take this to mean that there is a gap of 68 bases between f3545 and r3245, but what does the "-" in front of 68 mean? I do not understand it.
      I also have another question about the scaffolds. Below is a record. From it, I understand that there is a gap between r_tig3042 and f_tig3539 and that its size is 615, but what does merged15 mean? How should I interpret it?
      scaffold6|size68841|tigs7
      f_tig3325|size6927|links7|gaps-694|merged25
      f_tig3146|size3331|links6|gaps-623
      r_tig3405|size10398|links5|gaps-621
      f_tig3266|size5457|links15|gaps-649
      f_tig3358|size8089|links8|gaps383
      r_tig3042|size2074|links5|gaps-615|merged15
      f_tig3539|size32219

      Thank you very much. I look forward to hearing from you.
      Best wishes,

      Yue Xu



      • #4
        A negative gap indicates a potential overlap between the two contigs. However, a 615 bp overlap between the contigs seems unlikely, which suggests that the insert size you provided in the library file is not correct.

        To illustrate how this is estimated:

        Say you have two contigs, contig1 of 1000 bp and contig2 of 2000 bp. One read of a pair aligns at position 900 on contig1 and the other at position 100 on contig2.

        If you set the insert size to 210 bp, the estimated gap is:
        Provided insert size - ((size of contig1)-(position of read1 on contig1)) + (position of read2 on contig2). In this case it is;
        210 - (1000-900) + 100 = 10

        So a gap of 10 bp. If we change the insert size to 2000, it is:

        2000 - (1000-900) + 100 = 1800

        If we change the insert size to 100, it is:

        100 - (1000-900) + 100 = -100

        As you can see, the estimated gap depends heavily on the insert size provided by the user.

        In your case I see a number of large negative gaps, which is highly unusual. You should probably lower your insert size by 600 bases.

        Regards,
        Boetsie



        • #5
          Hi, thank you for your detailed reply. Because of it, I now understand how SSPACE calculates the gap between contigs. Thank you very much.
          I have a question about the formula you wrote, though:
          Provided insert size - (((size of contig1)-(position of read1 on contig1)) + (position of read2 on contig2))
          Is it missing a pair of brackets? I have added them above; they are the outermost pair after the minus sign.

          Yours sincerely,
          Yue Xu
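
For reference, the three worked results in post #4 (10, 1800 and -100) only come out when both distances are subtracted from the insert size, that is, with the extra pair of brackets shown in this post. The following is a minimal Python sketch of that arithmetic. The function name `estimated_gap` and its parameter names are hypothetical, and the sketch only reproduces the single read-pair example from the thread; it does not claim to be how SSPACE itself is implemented.

```python
def estimated_gap(insert_size, contig1_size, read1_pos, read2_pos):
    """Estimate the gap between two contigs from one read pair.

    Bracketed form of the formula discussed in the thread:
    insert_size - ((contig1_size - read1_pos) + read2_pos)
    """
    # Part of the insert that lies on contig1, from read1 to the contig end
    distance_on_contig1 = contig1_size - read1_pos
    # Part of the insert that lies on contig2, from the contig start to read2
    distance_on_contig2 = read2_pos
    return insert_size - (distance_on_contig1 + distance_on_contig2)

# Example from post #4: contig1 is 1000 bp, reads at 900 (contig1) and 100 (contig2)
for insert_size in (210, 2000, 100):
    print(insert_size, estimated_gap(insert_size, 1000, 900, 100))
```

A negative result means the read pair spans less than the two on-contig distances combined, which is reported as a potential overlap.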
