Dear all,
I have sequenced a CHO cell line descended from CHO K1 and I would like to investigate the chromosomal rearrangements that have occurred. I found a program called SVDetect that other people seem to like, and even a tutorial with a step-by-step explanation. I sequenced the genome to a theoretical depth of 38x, but for the run below I used only a subset of the data: just enough aligned reads for ~3x depth of the genome, not the full dataset.
1. I made a list of the chromosomes and their lengths (called sortedlist.txt):
1 NW_003613580.1 8905210
2 NW_003613581.1 8197018
3 NW_003613582.1 6761507
…
4392 NW_003717424.1 213
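For anyone reproducing step 1, here is a minimal sketch of how such a list could be built from a FASTA index (.fai, as written by samtools faidx). It assumes the list is tab-separated with the columns number, sequence name and length, sorted longest first as in the excerpt above. The function name and the placeholder index columns in the example are illustrative, not taken from my actual files.

```python
def build_cmap_lines(fai_lines):
    """Convert FASTA index (.fai) lines into 'number<TAB>name<TAB>length' rows.

    Only the first two .fai columns (sequence name, length) are used.
    Rows are sorted by length, longest first, and numbered from 1.
    """
    records = []
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        records.append((fields[0], int(fields[1])))
    # sorted() is stable, so equal-length sequences keep their input order
    records = sorted(records, key=lambda rec: rec[1], reverse=True)
    return [
        f"{number}\t{name}\t{length}"
        for number, (name, length) in enumerate(records, start=1)
    ]


if __name__ == "__main__":
    # Names and lengths are from the post; the last three columns are placeholders.
    example_fai = [
        "NW_003613581.1\t8197018\t0\t60\t61\n",
        "NW_003613580.1\t8905210\t0\t60\t61\n",
        "NW_003613582.1\t6761507\t0\t60\t61\n",
    ]
    for row in build_cmap_lines(example_fai):
        print(row)
```

With a real index, the rows would be written to sortedlist.txt by passing the open .fai file to the function instead of the example list.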
2. I aligned my reads to the CHO K1 genome using BWA, realigned using GATK and removed duplicates using Picard.
3. I used the SVDetect BAM_preprocessingPairs script to preprocess my bam file
perl ~/genome/SVDetect_r0.8b/scripts/BAM_preprocessingPairs.pl /novo/omicsmanager/processed01/1099/data/pipe2del/DXB1/4Picard.bam
It gave the output:
Total : 62889376 pairs analysed
-- 251915 pairs whose one or both reads are unmapped
-- 62637461 mapped pairs
---- 746850 abnormal mapped pairs
------ 542994 pairs mapped on two different chromosomes
------ 443784 pairs with incorrect strand orientation and/or pair order
------ 179904 pairs with incorrect insert size distance
---- 61890611 correct mapped pairs
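As a quick sanity check on these numbers, the sketch below re-adds the counts reported above. The dictionary keys and the function name are my own labels; only the numbers come from the script output.

```python
# Pair counts copied from the BAM_preprocessingPairs output above.
counts = {
    "total": 62889376,
    "unmapped": 251915,
    "mapped": 62637461,
    "abnormal": 746850,
    "different_chromosomes": 542994,
    "orientation_or_order": 443784,
    "insert_size": 179904,
    "correct": 61890611,
}


def check_counts(c):
    """Check that the reported categories add up and summarise them."""
    subcategory_sum = (
        c["different_chromosomes"] + c["orientation_or_order"] + c["insert_size"]
    )
    return {
        # unmapped + mapped should give the total number of pairs
        "total_ok": c["unmapped"] + c["mapped"] == c["total"],
        # abnormal + correct should give the number of mapped pairs
        "mapped_ok": c["abnormal"] + c["correct"] == c["mapped"],
        # a sum above the abnormal count means some pairs fall in several categories
        "subcategory_sum": subcategory_sum,
        "subcategories_overlap": subcategory_sum > c["abnormal"],
        "abnormal_fraction": c["abnormal"] / c["mapped"],
    }


if __name__ == "__main__":
    for key, value in check_counts(counts).items():
        print(key, value)
```

Both totals add up, and about 1.2% of the mapped pairs are flagged as abnormal. The three sub-categories sum to more than the abnormal total, so a pair can evidently be counted in more than one of them.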
4. I converted the bam to sam
samtools view -h -o 4Picard.ab.sam 4Picard.ab.bam
5. I made a config file more or less exactly as in the tutorial
<general>
input_format=sam
sv_type=all
mates_orientation=RF
read1_length=86
read2_length=86
mates_file=/novo/omicsmanager/processed01/1099/data/pipe2del/SVdetect/4Picard.ab.sam
cmap_file=/novo/omicsmanager/processed01/1099/data/pipe2del/SVdetect/sortedlist.txt
num_threads=1
</general>
<detection>
split_mate_file=0
window_size=2000
step_length=500
</detection>
<filtering>
split_link_file=0
nb_pairs_threshold=3
strand_filtering=1
</filtering>
<bed>
<colorcode>
255,0,0=1,4
0,255,0=5,10
0,0,255=11,100000
</colorcode>
6. Finally I ran SVDetect
perl ~/genome/SVDetect_r0.8b/bin/SVDetect linking -conf /novo/omicsmanager/processed01/1099/data/pipe2del/SVdetect/configfile.txt
It gave the output:
ls: cannot access ./mates/4Picard.ab.sam.all*: No such file or directory
# Error: No splitted mate files already created at ./ : at /novo/users/csrk/genome/SVDetect_r0.8b/bin/SVDetect line 136.
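One way to narrow this down might be to check whether the files SVDetect is looking for actually exist. The sketch below reads mates_file and split_mate_file from the config text and builds the same pattern that appears in the error message (./mates/<mates file name>.all*). It rests on an assumption I have not confirmed: that split_mate_file=0 tells SVDetect to reuse mate files that were already split, rather than to create them. The helper names are hypothetical and not part of SVDetect.

```python
import glob
import os
import posixpath
import re


def read_setting(config_text, key):
    """Return the value of 'key=value' from the config text, or None if absent."""
    match = re.search(
        rf"^\s*{re.escape(key)}\s*=\s*(\S+)", config_text, flags=re.MULTILINE
    )
    return match.group(1) if match else None


def expected_split_pattern(config_text, output_dir="."):
    """Build the glob pattern seen in the error: <output_dir>/mates/<file>.all*"""
    mates_file = read_setting(config_text, "mates_file")
    if mates_file is None:
        raise ValueError("mates_file is not set in the config")
    return posixpath.join(output_dir, "mates", os.path.basename(mates_file) + ".all*")


def diagnose(config_text, output_dir="."):
    """Report whether any already-split mate files match the expected pattern."""
    pattern = expected_split_pattern(config_text, output_dir)
    found = glob.glob(pattern)
    split_flag = read_setting(config_text, "split_mate_file")
    if split_flag == "0" and not found:
        # Assumption: 0 means "reuse existing split files", so none existing is a problem.
        return f"split_mate_file=0 but nothing matches {pattern}"
    return f"{len(found)} file(s) match {pattern}"


if __name__ == "__main__":
    # Only the two settings the check needs, copied from the config above.
    config_text = """
mates_file=/novo/omicsmanager/processed01/1099/data/pipe2del/SVdetect/4Picard.ab.sam
split_mate_file=0
"""
    print(diagnose(config_text))
```

If the check reports that nothing matches, the next thing to try would be the other value of split_mate_file, or running from the directory that contains the mates folder.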
I could not find anybody else posting this error message. I would be very grateful if anyone could guide me to another step-by-step manual for using SVDetect, or help me make one here for others to find via Google.