  • Yue Xu
    replied
    Hi, thank you for your detailed reply. Because of it, I now understand how SSPACE calculates the gap between contigs. Thank you very much.
    However, I think the formula you wrote should read:
    Provided insert size - (((size of contig1)-(position of read1 on contig1)) + (position of read2 on contig2))
    Is the outer pair of brackets around the sum, which I have added here, missing from your version?

    Yours sincerely,
    Yue Xu


  • boetsie
    replied
    The negative gap indicates a potential overlap between the two contigs. However, a 615 bp overlap between the contigs seems unlikely, which suggests that the insert size you provided in the library file is not correct.

    To illustrate how this is estimated:

    Say you have two contigs, contig1 of 1000 bp and contig2 of 2000 bp. One read of a pair aligns at position 900 on contig1 and the other at position 100 on contig2.

    If you set the insert size to 210 bp, the estimated gap is:
    Provided insert size - ((size of contig1)-(position of read1 on contig1)) + (position of read2 on contig2). In this case it is:
    210 - (1000-900) + 100 = 10

    So a gap of 10 bp. If we change the insert size to 2000, it is:

    2000 - (1000-900) + 100 = 1800

    If we change the insert size to 100, it is:

    100 - (1000-900) + 100 = -100

    As you can see, the estimated gap depends directly on the insert size provided by the user.

    In your case I see a number of large negative gaps, which is highly unusual. You should probably lower your insert size by 600 bases.
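
As a minimal sketch of the arithmetic above (not SSPACE's own code), the function below computes the gap estimate for a single read pair. It assumes the two offsets are summed before being subtracted from the insert size, i.e. insert size - ((contig1 size - read1 position) + read2 position), since that grouping is the one that yields the results 10, 1800 and -100. The function and parameter names are hypothetical.

```python
def estimate_gap(insert_size: int, contig1_size: int,
                 read1_pos: int, read2_pos: int) -> int:
    """Estimate the gap between two contigs from one read pair.

    Assumption: the distance from read1 to the end of contig1 and the
    distance from the start of contig2 to read2 are added together, and
    that sum is subtracted from the provided insert size.
    """
    covered_on_contig1 = contig1_size - read1_pos  # read1 to end of contig1
    covered_on_contig2 = read2_pos                 # start of contig2 to read2
    return insert_size - (covered_on_contig1 + covered_on_contig2)


# The three insert sizes discussed in the post, same read positions each time.
for insert_size in (210, 2000, 100):
    print(insert_size, estimate_gap(insert_size, 1000, 900, 100))
```

A negative return value corresponds to a potential overlap between the contigs, as described at the start of the post.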

    Regards,
    Boetsie


  • Yue Xu
    replied
    Hi boetsie, I have seen your reply. I am about to construct a contig graph from the output of SSPACE, because SSPACE produces a file that stores the contig link information. For example, f3545 has 23 links with r3245 and a gap of -68 bases. I take this to mean that there is a gap of 68 bases between f3545 and r3245, but what does the "-" in front of 68 mean? I do not understand it.
    I also have another question about scaffolds. Below is a record. From it, I know that there is a gap between r_tig3042 and f_tig3539 and that its size is 615, but why "merged15"? How should I understand it?
    scaffold6|size68841|tigs7
    f_tig3325|size6927|links7|gaps-694|merged25
    f_tig3146|size3331|links6|gaps-623
    r_tig3405|size10398|links5|gaps-621
    f_tig3266|size5457|links15|gaps-649
    f_tig3358|size8089|links8|gaps383
    r_tig3042|size2074|links5|gaps-615|merged15
    f_tig3539|size32219

    Thank you very much. I look forward to hearing from you.
    Best wishes,

    Yue Xu


  • boetsie
    replied
    These are the original contigs (order is based on the order of the contig in the fasta file). The 'f' indicates that it has forward orientation in the final scaffold, the 'r' means the reverse orientation.
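
For anyone reading these records programmatically, here is a minimal sketch of a parser for one contig line such as `r_tig3042|size2074|links5|gaps-615|merged15`. It assumes the fields are separated by `|`, that the prefix before the first underscore is `f` (forward) or `r` (reverse), and that `gaps-615` means a gap value of -615. The function name and the returned dictionary layout are hypothetical, and the `merged` value is kept as a plain number without interpreting it.

```python
import re
from typing import Dict, Union

# Matches fields like "size2074", "links5", "gaps-615", "merged15".
FIELD = re.compile(r"^(size|links|gaps|merged)(-?\d+)$")
ORIENTATION = {"f": "forward", "r": "reverse"}


def parse_contig_line(line: str) -> Dict[str, Union[str, int]]:
    """Split one contig line of a scaffold record into named fields."""
    name, *fields = line.strip().split("|")
    prefix, _, contig = name.partition("_")  # "r_tig3042" -> "r", "tig3042"
    record: Dict[str, Union[str, int]] = {
        "orientation": ORIENTATION[prefix],
        "contig": contig,
    }
    for field in fields:
        match = FIELD.match(field)
        if match is None:
            raise ValueError(f"unrecognised field: {field!r}")
        record[match.group(1)] = int(match.group(2))
    return record


print(parse_contig_line("r_tig3042|size2074|links5|gaps-615|merged15"))
```

The last contig of a scaffold carries only a size (for example `f_tig3539|size32219`), so the optional fields are simply absent from the result in that case.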


  • szimmerman
    started a topic SSPACE help

    SSPACE help

    Hi everyone,

    I was just wondering what an f_tig means in the .evidence file produced by SSPACE. Are these the original contigs that are put into a scaffold, or something else? Thanks!
