  • Using SVDetect for finding rearrangements in a resequenced genome

    Dear all

    I have sequenced a CHO cell line descended from CHO K1, and I would like to investigate the chromosomal rearrangements that have occurred. I found a program called SVDetect that other people seem to like, and even a tutorial with a step-by-step explanation. I sequenced the genome to a theoretical depth of 38x, but for the steps below I used only a subset of the data: enough aligned reads for ~3x depth of the genome, not the full dataset.

    1. I made a list of the chromosomes and their lengths (called sortedlist.txt)
    1 NW_003613580.1 8905210
    2 NW_003613581.1 8197018
    3 NW_003613582.1 6761507

    4392 NW_003717424.1 213
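
A list like this can probably be generated rather than typed by hand. The following is a minimal sketch, assuming a FASTA index (`.fai`) created with `samtools faidx` on the reference, and assuming the three-column layout shown above (running number, sequence name, length) sorted by decreasing length. The function name `fai_to_cmap` and the tab-separated output are assumptions, not something taken from the SVDetect tutorial.

```shell
# Hypothetical helper: turn a FASTA index (.fai) into a numbered chromosome list.
# Columns 1 and 2 of a .fai file hold the sequence name and its length.
fai_to_cmap() {
    # Sort by length (column 2), largest first, then prepend a running number.
    sort -k2,2nr "$1" | awk '{ printf "%d\t%s\t%s\n", NR, $1, $2 }'
}
```

Usage would look like `fai_to_cmap reference.fa.fai > sortedlist.txt`, where `reference.fa.fai` is a placeholder name.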


    2. I aligned my reads to the CHO K1 genome using BWA, realigned using GATK and removed duplicates using Picard.

    3. I used the SVDetect BAM_preprocessingPairs script to preprocess my bam file
    perl ~/genome/SVDetect_r0.8b/scripts/BAM_preprocessingPairs.pl /novo/omicsmanager/processed01/1099/data/pipe2del/DXB1/4Picard.bam

    It gave the output:
    Total : 62889376 pairs analysed
    -- 251915 pairs whose one or both reads are unmapped
    -- 62637461 mapped pairs
    ---- 746850 abnormal mapped pairs
    ------ 542994 pairs mapped on two different chromosomes
    ------ 443784 pairs with incorrect strand orientation and/or pair order
    ------ 179904 pairs with incorrect insert size distance
    ---- 61890611 correct mapped pairs
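
As a quick sanity check, the counts in this report can be added up. The sketch below only uses the numbers printed above; the variable names are made up for illustration.

```shell
# Counts copied from the BAM_preprocessingPairs.pl output above.
total=62889376
unmapped=251915
mapped=62637461
abnormal=746850
correct=61890611

# The two top-level splits should add up exactly.
[ $((unmapped + mapped)) -eq "$total" ] && echo "unmapped + mapped = total"
[ $((abnormal + correct)) -eq "$mapped" ] && echo "abnormal + correct = mapped"

# The three abnormal categories sum to more than the abnormal total,
# which suggests that one pair can be counted in more than one category.
category_sum=$((542994 + 443784 + 179904))
echo "category sum: $category_sum (abnormal pairs: $abnormal)"

# Share of mapped pairs flagged as abnormal, in percent.
abnormal_pct=$(awk -v a="$abnormal" -v m="$mapped" 'BEGIN { printf "%.2f", 100 * a / m }')
echo "abnormal: ${abnormal_pct}% of mapped pairs"
```

Both top-level sums match, so the report is internally consistent, and roughly 1.2% of the mapped pairs end up in the abnormal set that is passed on to SVDetect.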

    4. I converted the bam to sam
    samtools view -h -o 4Picard.ab.sam 4Picard.ab.bam

    5. I made a config file more or less exactly as in the tutorial
    <general>
    input_format=sam
    sv_type=all
    mates_orientation=RF
    read1_length=86
    read2_length=86
    mates_file=/novo/omicsmanager/processed01/1099/data/pipe2del/SVdetect/4Picard.ab.sam
    cmap_file=/novo/omicsmanager/processed01/1099/data/pipe2del/SVdetect/sortedlist.txt
    num_threads=1
    </general>

    <detection>
    split_mate_file=0
    window_size=2000
    step_length=500
    </detection>

    <filtering>
    split_link_file=0
    nb_pairs_threshold=3
    strand_filtering=1
    </filtering>

    <bed>
    <colorcode>
    255,0,0=1,4
    0,255,0=5,10
    0,0,255=11,100000
    </colorcode>
    </bed>

    6. Finally I ran SVDetect
    perl ~/genome/SVDetect_r0.8b/bin/SVDetect linking -conf /novo/omicsmanager/processed01/1099/data/pipe2del/SVdetect/configfile.txt

    It gave the output:
    ls: cannot access ./mates/4Picard.ab.sam.all*: No such file or directory
    # Error: No splitted mate files already created at ./ : at /novo/users/csrk/genome/SVDetect_r0.8b/bin/SVDetect line 136.
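
The error text suggests that SVDetect expected files matching `./mates/4Picard.ab.sam.all*` to exist already. A small check like the one below could confirm that before running the linking step. It is only a sketch: the function name `have_split_mates` is hypothetical, and the pattern (base name of the mates file plus `.all*` inside a `mates` directory) is inferred from the `ls` message above, not from the SVDetect documentation.

```shell
# Hypothetical check: are there split mate files for a given mates file?
# The ./mates/<basename>.all* pattern is taken from the error message above.
have_split_mates() {
    mates_file=$1
    mates_dir=${2:-./mates}
    base=$(basename "$mates_file")
    # ls exits non-zero when the glob matches nothing.
    ls "$mates_dir"/"$base".all* >/dev/null 2>&1
}
```

For example, `have_split_mates 4Picard.ab.sam || echo "no split mate files under ./mates"` run from the working directory would reproduce the failing condition without starting SVDetect.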

    I could not find anybody else posting this error message. I would be very grateful if anyone could point me to another step-by-step manual for SVDetect, or help me put one together here for others to find via Google.

  • #2
    I think you're missing a specification of the BAM file containing all mapped reads. You should figure out why it's looking for the file "./mates/4Picard.ab.sam.all"; it's likely the original BAM file.

    • #3
      Dear YazBraimah

      You are right on the money. I continued the question in another thread (http://seqanswers.com/forums/showthread.php?t=41919), but it turned out that the program was not written to handle more than 255 chromosomes. The ./mates/4Picard.ab.sam.all had to be made before running the software.
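
Given that limit, it may help to count the entries in the chromosome list before running SVDetect. This is a minimal sketch: the function name `check_cmap_size` is hypothetical, the default of 255 comes from this thread and not from the SVDetect documentation, and the list is assumed to have at least three whitespace-separated columns per entry, as in sortedlist.txt.

```shell
# Hypothetical check: count entries in the chromosome list and warn above a limit.
# The default limit of 255 is the figure reported in this thread.
check_cmap_size() {
    cmap=$1
    limit=${2:-255}
    # Count lines that have at least number, name and length columns.
    n=$(awk 'NF >= 3 { c++ } END { print c + 0 }' "$cmap")
    if [ "$n" -gt "$limit" ]; then
        echo "WARNING: $n sequences in $cmap (more than $limit)"
        return 1
    fi
    echo "OK: $n sequences in $cmap"
}
```

Running `check_cmap_size sortedlist.txt` on the list from the first post, which ends at entry 4392, would print the warning.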

      -Christian
