Control-FREEC: a tool for assessing copy number and allelic content using NGS data

dganiewich replied

04-23-2020, 11:50 AM
Hi Valentina! How are you?
I was hoping you could help me with the following:
I have a human tumor sample from WES which has also been analyzed by cytogeneticists, and they concluded it is near-tetraploid. I also have its matching normal sample, with regular diploidy.
When running Control-FREEC, should I set ploidy to 4 or 2? In other words, does the ploidy entered refer to the tumor ploidy or the normal ploidy, or does it assume both samples have the same one?
Thank you very much in advance,
Best,
Daiana

chmay replied

03-05-2019, 11:42 AM

Hi, I am trying to run Control-FREEC on WGS cancer cell line data I received from a collaborator. However, I am hitting an error for all of the samples:

Code:

..failed to run segmentation on chrGL000207.1
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc

It is the same error in all cases, though not necessarily at the same stage - in most cases the last output generated is:

Code:

Will continue with contamination = 0
..Identified contamination by normal cells: 0%
Seeking eventual subclones...

But in some cases, this step runs to completion:

Code:

Seeking eventual subclones...-> Done!
Total proportion of unexplained regions: 165470 out of 4.47446e+06 = 0.036981

Attaching an example config file:

Code:

[general]

## parameters chrLenFile and ploidy are required.
BedGraphOutput=TRUE
breakPointType = 4
chrFiles = /projects/wtsspipeline/resources/Homo_sapiens/bfa_NCBI-37-TCGA/hg19a_per_chr_fastas/
chrLenFile = /projects/wtsspipeline/programs/code/Control-FREEC_1.0.0/resources/hg19_control_FreeC_chr_length.txt
coefficientOfVariation = 0.062
contaminationAdjustment=TRUE
forceGCcontentNormalization = 2
gemMappabilityFile = /projects/wtsspipeline/resources/Homo_sapiens/bfa_NCBI-37-TCGA/out100m1_hg19.gem
minCNAlength = 2
minimalSubclonePresence = 0.05
maxThreads=8
outputDir = /projects/ccg_capture_panel/cmay_dev/projects/CLINGEN-5870_LSARP/CLINGEN-6122_Run_CNV_analysis_on_cell_line_BAMS/freec_run1/A12438
ploidy = 2,3,4,5,6
sambamba = /gsc/software/linux-x86_64/sambamba-0.5.5/sambamba_v0.5.5
samtools = /home/rcorbett/aligners/samtools-1.2/samtools
telocentromeric = 75000


[sample]

mateFile = /projects/analysis/analysis19/A12438/merge_bwa-0.5.7/100nt/hg19a/A12438_3_lanes_dupsFlagged.bam
inputFormat = BAM
mateOrientation = FR

I had seen elsewhere that this could happen when the chromosome lengths file doesn't match the chromosome lengths in the BAM files; however, these all match. Any insight would be helpful! Thanks.
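For readers who want to repeat the length check mentioned above, here is a minimal sketch, not part of the original post. It compares the `@SQ` lines of a BAM header (the text printed by `samtools view -H file.bam`) with a length table. The function names, the two-column layout of the length table (name, then length, as in a `.fai` index) and the example values are assumptions for illustration; adjust `name_col` and `len_col` if your `chrLenFile` is laid out differently.

```python
def parse_sam_header_lengths(header_text):
    """Return {sequence name: length} from the @SQ lines of a SAM header."""
    lengths = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Every field after the record type has the form TAG:value.
        tags = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        lengths[tags["SN"]] = int(tags["LN"])
    return lengths


def parse_length_table(text, name_col=0, len_col=1):
    """Return {name: length} from a whitespace-delimited table.

    The column positions are an assumption (a .fai-like layout).
    """
    lengths = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts or line.startswith("#"):
            continue
        lengths[parts[name_col]] = int(parts[len_col])
    return lengths


def strip_chr(name):
    """Drop a leading 'chr' so that '1' and 'chr1' compare as equal."""
    return name[3:] if name.startswith("chr") else name


def compare_lengths(bam_lengths, file_lengths):
    """List (name, problem) pairs for every disagreement between the two maps."""
    bam = {strip_chr(k): v for k, v in bam_lengths.items()}
    ref = {strip_chr(k): v for k, v in file_lengths.items()}
    problems = []
    for name in sorted(set(bam) | set(ref)):
        if name not in ref:
            problems.append((name, "missing from length file"))
        elif name not in bam:
            problems.append((name, "missing from BAM header"))
        elif bam[name] != ref[name]:
            problems.append((name, f"{bam[name]} in BAM vs {ref[name]} in file"))
    return problems


# Illustrative inputs only; real ones come from your own BAM and chrLenFile.
EXAMPLE_HEADER = (
    "@HD\tVN:1.4\tSO:coordinate\n"
    "@SQ\tSN:1\tLN:249250621\n"
    "@SQ\tSN:GL000207.1\tLN:4262\n"
)
EXAMPLE_LEN_TABLE = "chr1\t249250621\n"

problems = compare_lengths(
    parse_sam_header_lengths(EXAMPLE_HEADER),
    parse_length_table(EXAMPLE_LEN_TABLE),
)
for name, problem in problems:
    print(name, problem)
```

With these inputs the sketch flags the unplaced contig, which is in the header but absent from the length table. Whether such a mismatch is related to the `std::bad_alloc` error is not established anywhere in this thread.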


valeu replied

09-22-2016, 01:16 AM
Please send me the generated .cnp files and your config: valentina . boeva at inserm . fr
Valentina

ruanys replied

09-21-2016, 04:18 AM

Segmentation fault (core dumped)

Hi,

I'm trying to run Control-FREEC on cancer whole genome sequencing data, but I get the error "Segmentation fault (core dumped)". Thanks for any help!

Config file:

Code:

[general]

chrLenFile = /public1/users/ruanys/human_genome/ref/b37.len
chrFiles= /public1/users/ruanys/human_genome/ref/b37_chrfa
ploidy = 2,3,4
BedGraphOutput=TRUE
coefficientOfVariation = 0.062
outputDir = ./
sex=XX

#minCNAlength=1


[sample]

mateFile = /public2/users/chenbj/called/HCC2.pileup
inputFormat = pileup
mateOrientation = FR


[control]

mateFile = /public2/users/chenbj/called/CRN.pileup
inputFormat = pileup
mateOrientation = FR


[BAF]

minimalCoveragePerPosition=0
SNPfile=/public1/users/ruanys/software/FREEC-9.5/download/hg19_snp142.SingleDiNucl.1based.txt.gz
shiftInQuality = 33
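A comment in another config quoted in this thread notes that `chrLenFile` and `ploidy` are required in `[general]`. As a hedged sketch (not something Control-FREEC ships), a config like the one above can be sanity-checked with Python's standard `configparser` before a long run. The function names are hypothetical, and the embedded text is an abbreviated copy of the config above.

```python
import configparser

# Required keys, taken from the comment in a config quoted in this thread.
REQUIRED_GENERAL = ("chrLenFile", "ploidy")

EXAMPLE_CONFIG = """\
[general]

chrLenFile = /public1/users/ruanys/human_genome/ref/b37.len
ploidy = 2,3,4
sex=XX

#minCNAlength=1

[sample]

mateFile = /public2/users/chenbj/called/HCC2.pileup
inputFormat = pileup
"""


def load_config(text):
    """Parse INI-style config text, keeping key case as written."""
    parser = configparser.ConfigParser(
        delimiters=("=",), comment_prefixes=("#",), interpolation=None
    )
    parser.optionxform = str  # configparser lower-cases keys by default
    parser.read_string(text)
    return parser


def check_general(parser):
    """Return (missing required keys, list of ploidies to be tried)."""
    general = parser["general"] if parser.has_section("general") else {}
    missing = [key for key in REQUIRED_GENERAL if key not in general]
    ploidies = [int(p) for p in general.get("ploidy", "").split(",") if p.strip()]
    return missing, ploidies


missing, ploidies = check_general(load_config(EXAMPLE_CONFIG))
print("missing:", missing, "ploidies:", ploidies)
```

This only checks that the keys are present and that `ploidy` parses as a comma-separated list of integers. It says nothing about whether the values suit the data.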

Full output while running Control-FREEC:

Code:

Control-FREEC v9.6 : a method for automatic detection of copy number alterations, subclones and for accurate estimation of contamination and main ploidy using deep-sequencing data
Non MT-mode
..consider the sample being female
..Breakpoint threshold for segmentation of copy number profiles is 0.8
..telocenromeric set to 50000
..FREEC is not going to adjust profiles for a possible contamination by normal cells
..Coefficient Of Variation set equal to 0.062
..it will be used to evaluate window size
..Output directory: ./
..Directory with files containing chromosome sequences: /public1/users/ruanys/human_genome/ref/b37_chrfa
..Sample file:  /public2/users/chenbj/called/HCC2.pileup
..Sample input format:  pileup
..Control file: /public2/users/chenbj/called/CRN.pileup
..Input format for the control file:    pileup
..minimal expected GC-content (general parameter "minExpectedGC") was set to 0.35
..maximal expected GC-content (general parameter "maxExpectedGC") was set to 0.55
..Polynomial degree for "ReadCount ~ GC-content" normalization is 3 or 4: will try both
..Minimal CNA length (in windows) is 1
..File with chromosome lengths: /public1/users/ruanys/human_genome/ref/b37.len
..Using the default minimal mappability value of 0.85
..uniqueMatch = FALSE
..FREEC will try to guess the correct ploidy(for each ploidy specified in 'ploidy' parameter)
..It will try ploidies: 2
3
4
..break-point type set to 2
..noisyData set to 0
..minimal number of reads per window in the control sample is set to 10
..Control-FREEC will not look for subclones
Warning: we recommend setting "window=0" for exome sequencing data
..will use SNP positions from /public1/users/ruanys/software/FREEC-9.5/download/hg19_snp142.SingleDiNucl.1based.txt.gz to calculate BAF profiles
..Starting reading /public1/users/ruanys/software/FREEC-9.5/download/hg19_snp142.SingleDiNucl.1based.txt.gz to get SNP positions
..read 101778434 SNP positions
PROFILING [tid=139713552041760]: /public1/users/ruanys/software/FREEC-9.5/download/hg19_snp142.SingleDiNucl.1based.txt.gz read in 1329 seconds [readSNPs]
..use "pileup" format of reads to calculate BAF profiles
..Starting reading /public2/users/chenbj/called/HCC2.pileup to calculate BAF profiles
will skip chrMT
will skip chrGL000207.1
will skip chrGL000226.1
will skip chrGL000229.1
will skip chrGL000231.1
will skip chrGL000210.1
will skip chrGL000239.1
will skip chrGL000235.1
will skip chrGL000201.1
will skip chrGL000247.1
will skip chrGL000245.1
will skip chrGL000197.1
will skip chrGL000203.1
will skip chrGL000246.1
will skip chrGL000249.1
will skip chrGL000196.1
will skip chrGL000248.1
will skip chrGL000244.1
will skip chrGL000238.1
will skip chrGL000202.1
will skip chrGL000234.1
will skip chrGL000232.1
will skip chrGL000206.1
will skip chrGL000240.1
will skip chrGL000236.1
will skip chrGL000241.1
will skip chrGL000243.1
will skip chrGL000242.1
will skip chrGL000230.1
will skip chrGL000237.1
will skip chrGL000233.1
will skip chrGL000204.1
will skip chrGL000198.1
will skip chrGL000208.1
will skip chrGL000191.1
will skip chrGL000227.1
will skip chrGL000228.1
will skip chrGL000214.1
will skip chrGL000221.1
will skip chrGL000209.1
will skip chrGL000218.1
will skip chrGL000220.1
will skip chrGL000213.1
will skip chrGL000211.1
will skip chrGL000199.1
will skip chrGL000217.1
will skip chrGL000216.1
will skip chrGL000215.1
will skip chrGL000205.1
will skip chrGL000219.1
will skip chrGL000224.1
will skip chrGL000223.1
will skip chrGL000195.1
will skip chrGL000212.1
will skip chrGL000222.1
will skip chrGL000200.1
will skip chrGL000193.1
will skip chrGL000194.1
will skip chrGL000225.1
will skip chrGL000192.1
2842829201 lines read
PROFILING [tid=139713552033536]: /public2/users/chenbj/called/HCC2.pileup read in 2400 seconds [assignValues]
..use "pileup" format of reads to calculate BAF profiles
..Starting reading /public2/users/chenbj/called/CRN.pileup to calculate BAF profiles
will skip chrMT
will skip chrGL000207.1
will skip chrGL000226.1
will skip chrGL000229.1
will skip chrGL000231.1
will skip chrGL000210.1
will skip chrGL000239.1
will skip chrGL000235.1
will skip chrGL000201.1
will skip chrGL000247.1
will skip chrGL000245.1
will skip chrGL000197.1
will skip chrGL000203.1
will skip chrGL000246.1
will skip chrGL000249.1
will skip chrGL000196.1
will skip chrGL000248.1
will skip chrGL000244.1
will skip chrGL000238.1
will skip chrGL000202.1
will skip chrGL000234.1
will skip chrGL000232.1
will skip chrGL000206.1
will skip chrGL000240.1
will skip chrGL000236.1
will skip chrGL000241.1
will skip chrGL000243.1
will skip chrGL000242.1
will skip chrGL000230.1
will skip chrGL000237.1
will skip chrGL000233.1
will skip chrGL000204.1
will skip chrGL000198.1
will skip chrGL000208.1
will skip chrGL000191.1
will skip chrGL000227.1
will skip chrGL000228.1
will skip chrGL000214.1
will skip chrGL000221.1
will skip chrGL000209.1
will skip chrGL000218.1
will skip chrGL000220.1
will skip chrGL000213.1
will skip chrGL000211.1
will skip chrGL000199.1
will skip chrGL000217.1
will skip chrGL000216.1
will skip chrGL000215.1
will skip chrGL000205.1
will skip chrGL000219.1
will skip chrGL000224.1
will skip chrGL000223.1
will skip chrGL000195.1
will skip chrGL000212.1
will skip chrGL000222.1
will skip chrGL000200.1
will skip chrGL000193.1
will skip chrGL000194.1
will skip chrGL000225.1
will skip chrGL000192.1
2841735157 lines read
PROFILING [tid=139713541543680]: /public2/users/chenbj/called/CRN.pileup read in 1730 seconds [assignValues]
..File /public1/users/ruanys/human_genome/ref/b37.len was read
total genome size: 3.1018e+09
PROFILING [tid=139713552041760]: /public2/users/chenbj/called/HCC2.pileup read in 10471 seconds [getReadNumberFromPileup]
read number:   1246830993
coefficientOfVariation:    0.062
 evaluated window size: 647
 ..Starting reading /public2/users/chenbj/called/HCC2.pileup
 PROFILING [tid=139713552041760]: /public2/users/chenbj/called/HCC2.pileup read in 3551 seconds [fillMyHash]
 2842829201 lines read..
 1246830993 reads used to compute copy number profile
 printing counts into ./HCC2.pileup_sample.cpn
 ..Window size: 647
 ..Will not consider chrY..
 ..Erased chrY from the list of chromosomes
 ..File /public1/users/ruanys/human_genome/ref/b37.len was read
 ..Starting reading /public2/users/chenbj/called/CRN.pileup
 PROFILING [tid=139713552041760]: /public2/users/chenbj/called/CRN.pileup read in 2893 seconds [fillMyHash]
 2841735157 lines read..
 880018556 reads used to compute copy number profile
 printing counts into ./CRN.pileup_control.cpn
 ..Will not consider chrY..
 ..Erased chrY from the list of chromosomes
 ..using GC-content to normalize copy number profiles
 CG-content printed into ./GC_profile.cnp
 ..using GC-content to normalize the control profile
 file ./GC_profile.cnp is read
 ..will remove all windows with read count in the control less than 10
 Warning: control length is not equal to the sample length for chromosome MT
 Warning: control length is not equal to the sample length for chromosome GL000207.1
 Warning: control length is not equal to the sample length for chromosome GL000226.1
 Warning: control length is not equal to the sample length for chromosome GL000229.1
 Warning: control length is not equal to the sample length for chromosome GL000210.1
 Warning: control length is not equal to the sample length for chromosome GL000239.1
 Warning: control length is not equal to the sample length for chromosome GL000235.1
 Warning: control length is not equal to the sample length for chromosome GL000201.1
 Warning: control length is not equal to the sample length for chromosome GL000245.1
 Warning: control length is not equal to the sample length for chromosome GL000203.1
 Warning: control length is not equal to the sample length for chromosome GL000246.1
 Warning: control length is not equal to the sample length for chromosome GL000249.1
 Warning: control length is not equal to the sample length for chromosome GL000196.1
 Warning: control length is not equal to the sample length for chromosome GL000202.1
 Warning: control length is not equal to the sample length for chromosome GL000232.1
 Warning: control length is not equal to the sample length for chromosome GL000206.1
 Warning: control length is not equal to the sample length for chromosome GL000236.1
 Warning: control length is not equal to the sample length for chromosome GL000241.1
 Warning: control length is not equal to the sample length for chromosome GL000243.1
 Warning: control length is not equal to the sample length for chromosome GL000230.1
 Warning: control length is not equal to the sample length for chromosome GL000237.1
 Warning: control length is not equal to the sample length for chromosome GL000233.1
 Warning: control length is not equal to the sample length for chromosome GL000204.1
 Warning: control length is not equal to the sample length for chromosome GL000198.1
 Warning: control length is not equal to the sample length for chromosome GL000208.1
 Warning: control length is not equal to the sample length for chromosome GL000191.1
 Warning: control length is not equal to the sample length for chromosome GL000227.1
 Warning: control length is not equal to the sample length for chromosome GL000228.1
 Warning: control length is not equal to the sample length for chromosome GL000214.1
 Warning: control length is not equal to the sample length for chromosome GL000221.1
 Warning: control length is not equal to the sample length for chromosome GL000209.1
 Warning: control length is not equal to the sample length for chromosome GL000218.1
 Warning: control length is not equal to the sample length for chromosome GL000220.1
 Warning: control length is not equal to the sample length for chromosome GL000213.1
 Warning: control length is not equal to the sample length for chromosome GL000211.1
 Warning: control length is not equal to the sample length for chromosome GL000199.1
 Warning: control length is not equal to the sample length for chromosome GL000215.1
 Warning: control length is not equal to the sample length for chromosome GL000205.1
 Warning: control length is not equal to the sample length for chromosome GL000219.1
 Warning: control length is not equal to the sample length for chromosome GL000224.1
 Warning: control length is not equal to the sample length for chromosome GL000223.1
 Warning: control length is not equal to the sample length for chromosome GL000195.1
 Warning: control length is not equal to the sample length for chromosome GL000222.1
 Warning: control length is not equal to the sample length for chromosome GL000200.1
 Warning: control length is not equal to the sample length for chromosome GL000193.1
 Warning: control length is not equal to the sample length for chromosome GL000194.1
 Warning: control length is not equal to the sample length for chromosome GL000225.1
 Warning: control length is not equal to the sample length for chromosome GL000192.1
 ..will process the control file as well: removing all windows with read count in the control less than 10
 ..Set ploidy for the control genome equal to 2
 ..Running FREEC with ploidy set to 2
 2645.86    -3376.64    1406.51 -189.551
 1256.67    -1598.41    661.752 -88.7931
 642.334    -822.065    341.75  -46.061
 334.722    -432.897    181.612 -24.6936
 172.978    -225.899    95.6776 -13.1404
 87.9735    -114.832    48.6472 -6.68335
 41.8822    -55.0312    23.4134 -3.22546
 20.4673    -27.3629    11.8648 -1.66627
 13.0201    -17.4558    7.57621 -1.0632
 3.97755    -5.10018    2.12297 -0.28691
 2.15448    -2.89124    1.26563 -0.180688
 0.620571   -0.747793   0.292975    -0.0372971
 0  0   0   0
 Number of EM iterations :12
 root mean square error = 27.3447
 -9154.09   19731.4 -15436.1    5205.91 -638.203
 -4604.52   9828.18 -7620.35    2547.45 -309.835
 -2104.57   4427.19 -3386.03    1116.97 -134.13
 -959.901   1981.09 -1487.84    482.197 -56.9187
 -448.968   902.816 -662.002    209.83  -24.2583
 -236.815   466.895 -336.011    104.598 -11.8814
 -116.556   225.078 -159.085    48.7493 -5.4619
 -82.6468   155.462 -106.854    31.7517 -3.43634
 -47.7307   87.0257 -58.1954    16.9145 -1.80325
 -36.084    63.2147 -40.7236    11.4445 -1.18504
 -28.2197   50.9987 -33.7152    9.6528  -1.0098
 -23.3177   42.3226 -28.1507    8.10278 -0.846576
 -8.77687   15.6185 -10.1197    2.82089 -0.284334
 -8.45886   14.7795 -9.41635    2.59205 -0.260067
 -4.2932    7.44233 -4.73822    1.31233 -0.133346
 -0.497625  0.879831    -0.552636   0.14733 -0.0141353
 -0.404366  0.650716    -0.360985   0.0819369   -0.00635136
 0  0   0   0   0
 Number of EM iterations :17
 root mean square error = 27.3093
 2645.86    -3376.64    1406.51 -189.551
 1256.67    -1598.41    661.752 -88.7931
 642.334    -822.065    341.75  -46.061
 334.722    -432.897    181.612 -24.6936
 172.978    -225.899    95.6776 -13.1404
 87.9735    -114.832    48.6472 -6.68335
 41.8822    -55.0312    23.4134 -3.22546
 20.4673    -27.3629    11.8648 -1.66627
 13.0201    -17.4558    7.57621 -1.0632
 3.97755    -5.10018    2.12297 -0.28691
 2.15448    -2.89124    1.26563 -0.180688
 0.620571   -0.747793   0.292975    -0.0372971
 0  0   0   0
 Number of EM iterations :12
 root mean square error = 27.3447
 Y = 110.664*x*x*x+-749.239*x*x+769.89*x+71.4395
 Segmentation fault (core dumped)
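As an aside on the log above: it reports a read number (1246830993), a coefficient of variation (0.062), a total genome size (3.1018e+09) and an evaluated window size (647). If read counts per window are assumed to be roughly Poisson, the coefficient of variation is 1/sqrt(mean reads per window), so a target CV needs about 1/CV² reads per window. The sketch below applies that reasoning. It reproduces the logged value, but treating this as the exact formula Control-FREEC uses is an assumption, and `window_from_cv` is a hypothetical helper name.

```python
def window_from_cv(genome_size, read_count, cv):
    """Estimate a window size (bp) giving about 1/cv**2 reads per window.

    Assumes Poisson-distributed read counts, so CV = 1 / sqrt(mean count).
    """
    reads_per_window = 1.0 / (cv * cv)
    bases_per_read = genome_size / read_count
    return int(reads_per_window * bases_per_read)


# Values copied from the log above.
logged = window_from_cv(3.1018e9, 1246830993, 0.062)
print(logged)

# Under the same assumption, half as many reads gives a window twice as large.
half_depth = window_from_cv(3.1018e9, 1246830993 // 2, 0.062)
print(half_depth)
```

Under this model the window scales inversely with read count, which is one way to think about choosing windows for samples sequenced at different depths.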


morrowliu replied

06-02-2015, 02:23 PM
Originally posted by smapdy

I ended up figuring out what was going on. I had some multiallelic variants in the .snp file that were causing it to fail to load, and my sex variable in the configuration file didn't match up with the actual sample sex which caused problems as well. I ended up dropping the sex argument and using the following general configuration file for my samples:
[general]
window = 8000
step = 2500
samtools = samtools
minCNAlength = 4
BedGraphOutput = TRUE
chrLenFile = NCBIM37_um.fa.len
chrFiles = chrfiles
outputDir = 31208T_31668N_FREEC_V1
printNA = FALSE
maxThreads = 6
ploidy = 2
breakPointType = 4
contaminationAdjustment = TRUE
noisyData = TRUE

[sample]
mateFile = 31208_EXOME.pileup.gz
inputFormat = pileup
mateOrientation = 0

[control]
mateFile = 31668_EXOME.pileup.gz
inputFormat = pileup
mateOrientation = 0

[target]
captureRegions = S0276129_Merged_Sorted_Probes.bed

[BAF]
SNPfile = snp128.singlebases.monoalleleic.freec_baf.txt
minimalCoveragePerPosition = 5

If anyone is interested I also have the commands I used to generate the pileups from the .bams, as well as the script I used to generate a working Mm9 and Mm10 .snp file.
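The script mentioned in the quoted post is not shown in this thread. As a rough sketch of the general idea only, the code below keeps the lines of a tab-delimited SNP table whose allele column describes a simple two-allele, single-base variant (for example `A/G`) and drops multiallelic or indel entries. The column position, the `A/G` allele notation and the example rows are assumptions for illustration and may not match the format Control-FREEC expects or the file smapdy built.

```python
BASES = {"A", "C", "G", "T"}


def is_biallelic_snv(observed):
    """True for entries like 'A/G': exactly two different single bases."""
    alleles = observed.upper().split("/")
    return (
        len(alleles) == 2
        and alleles[0] != alleles[1]
        and all(a in BASES for a in alleles)
    )


def filter_snp_lines(lines, observed_col):
    """Yield only lines whose allele column is a simple two-allele SNV.

    observed_col is the 0-based index of the allele column (an assumption).
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > observed_col and is_biallelic_snv(fields[observed_col]):
            yield line


# Hypothetical rows: chromosome, 1-based position, observed alleles.
EXAMPLE_ROWS = [
    "chr1\t3000185\tA/G",    # kept: two single-base alleles
    "chr1\t3000500\tA/C/T",  # dropped: multiallelic
    "chr1\t3001000\t-/A",    # dropped: insertion/deletion
]

kept = list(filter_snp_lines(EXAMPLE_ROWS, observed_col=2))
print(kept)
```

For a real file, the same generator can be fed an open file handle line by line, with the result written to a new file.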

Hi, Smapdy,

I am also working on a mouse project and want to use Control-FREEC to call CNVs. However, when I use the Snp137 file I get the same error message that you mentioned above.
I realize it has been 2 years, but I am still wondering whether you could send me the mm10 .snp file?

Thank you very much!
Best,
Yihua
vd4mindia replied

02-05-2015, 02:34 AM
What parameters should be used for normal/tumor clones with differing coverage?

I would like to discuss the samples I am using to infer CNVs from exome data with Control-FREEC. I am using WES tumor data: a polyclonal tumor sample at 70X coverage and its matched normal (blood) at the same coverage. I used window = 500 and step = 250 to infer the CNVs, and found 120 significant CNVs with a median called-region length of 42 kb.

I am now applying the same parameters to infer CNVs from my reprogrammed tumor clones (iPSCs), which were sequenced at 35X because each is a single clone; the normal control is again the 70X blood sample. Can you suggest a window length for this case? Should it be the same as for the tumor/normal pair? With the same window, the median number of bases per called region is higher for the single-clone iPSCs than for the tumor. Should I double the window and step size for the single clone, or halve them? Since the normal blood is at 70X while the iPSC clone is at 35X, won't the results be spurious if I use the same window and step as for the tumor/normal pair, where both are at 70X? What would be the ideal window and step when the control has double the coverage of the tumor sample?

Or is it preferable to use coefficientOfVariation? If so, what value would you suggest? Also, which breakPointType and breakPointThreshold should be used?

I am attaching the config file that I used for the normal/tumor pair (both at 70X coverage). I used the same config file for the normal/tumor-iPSC pair (70X/35X coverage). The results look promising, but I wonder whether I am compromising sensitivity. As far as I know, the read depths are normalized for both samples before the CNVs are calculated. Still, I would like some suggestions about which parameters I should change for differing normal/tumor depths.

Should I also use intercept = 0 and readCountThreshold >= 50, since it is WES data? I would appreciate suggestions on whether I am compromising sensitivity by keeping the same parameters for the normal/tumor and normal/iPSC pairs, which have different coverage.

Code:

[general]
chrLenFile = /scratch/GT/vdas/pietro/exome_seq/test_Control_FREEC/hs19_chr.len
window = 500
step = 250
ploidy = 2
outputDir = /scratch/GT/vdas/pietro/exome_seq/results/control_freec_out/output_S313_tumor/
BedGraphOutput=TRUE
breakPointType=4
gemMappabilityFile = /scratch/GT/vdas/pietro/exome_seq/test_Control_FREEC/out100m1_hg19.gem
chrFiles = /scratch/GT/vdas/test_exome/exome/
maxThreads=6
breakPointThreshold=1.5
noisyData=TRUE
printNA=FALSE
#breakPointThreshold = -.002;
#window = 50000
#chrFiles = hg18/hg18_per_chromosome
#outputDir = test
#degree=3
#intercept = 0

[sample]
mateFile = /scratch/GT/vdas/pietro/exome_seq/results/T_S7998/T_S7998.realigned.recal.bam
inputFormat = bam
mateOrientation = FR

[control]
mateFile = /scratch/GT/vdas/pietro/exome_seq/results/N_S8980/N_S8980.realigned.recal.bam
inputFormat = bam
mateOrientation = FR

[BAF]
SNPfile = /scratch/GT/vdas/pietro/exome_seq/test_Control_FREEC/hg19_snp137.SingleDiNucl.1based.txt
minimalCoveragePerPosition = 5

[target]
captureRegions = /scratch/GT/vdas/referenceBed/hg19/ss_v4/Exon_SSV4_clean.bed
AnweshaM7 replied

11-12-2014, 09:56 PM
Error while running Control-FREEC

Hi,
I have BAM files for my sample. I ran Control-FREEC (WGS) for all chromosomes and got _CNVs output for them.
However, for chromosomes X and Y I am getting the error:
'Unable to proceed..
Try to rerun the program with higher number of reads'

The data (tumor) has 27x coverage, on hg18.
I have tried window lengths of 1000, 1500 and 3000 but still get the same error.

I do not understand the reason for this error.

Thanks
Anwesha
shruti replied

11-12-2014, 06:56 AM
Hi,

I am running Control-FREEC on matched tumor/normal pairs from whole exome sequencing.
However, for one sample I always get the following error:

Initial guess for polynomial:
Error: variation in read count per window is too small.
Unable to proceed..
Wed Nov 12 14:41:11 GMT 2014

I have tried to increase the window size but still get the same problem. Last setting for window size was 1500.

The average coverage for the normal and tumor is 107x and 24x respectively.

I am a bit clueless here.. should I increase or decrease the window size?

Thanks

Regards
Shruti
valeu replied

10-05-2014, 04:56 AM
Originally posted by AnweshaM7

Hi, I would like to download all tracks > SNP130 (if you are using hg18; for hg19 it is 131) >.provide the hg18 snp 130 txt file. I checked UCSC but I am not able to understand which filters to select. Secondly, how do I change the order of the columns? I checked the tutorial but am not getting any option to do

I will try to add it. But I assure you that the results will be the same as if you use hg18_snp130.SingleDiNucl.1based.txt
AnweshaM7 replied

10-04-2014, 12:15 AM
Hi, I would like to download all tracks > SNP130 (if you are using hg18; for hg19 it is 131) >.provide the hg18 snp 130 txt file. I checked UCSC but I am not able to understand which filters to select. Secondly, how do I change the order of the columns? I checked the tutorial but am not getting any option to do

Thanks
Anwesha
valeu replied

09-23-2014, 08:43 AM
Originally posted by tatinhawk

I noticed that in the "_CNVs" output file there are overlapping CNVs.

FREEC uses overlapping windows to scan the genome (if step < window). This is why you may have overlapping predictions. The breakpoint should be located somewhere in the overlapping part.
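To see the overlap described here in the rows posted in the question, the following sketch lists each pair of neighbouring segments that share coordinates, along with the shared span. It is an illustration only, not part of Control-FREEC. The column meanings used in `Segment` (chromosome, start, end, copy number, status, genotype, uncertainty) are inferred from the question's wording and should be treated as an assumption.

```python
from collections import namedtuple

Segment = namedtuple(
    "Segment", "chrom start end copy_number status genotype uncertainty"
)

# Rows copied from the question in this thread.
ROWS = """\
1 2960000 3029999 2 normal AA 20.8697
1 2990000 3389999 8 gain AAAAABBB 5.57241
1 3350000 3499999 3 gain AAB 44.5596
1 3460000 3549999 11 gain AAAAAAAAABB 100
1 3510000 3739999 3 gain AAB 7.9066
1 11890000 12709999 3 gain AAB 2.14849
1 12670000 16909999 3 gain AAB 0.411016
"""


def parse_segments(text):
    """Parse whitespace-delimited rows into Segment tuples."""
    segments = []
    for line in text.splitlines():
        f = line.split()
        if len(f) < 7:
            continue  # skip blank or incomplete rows
        segments.append(
            Segment(f[0], int(f[1]), int(f[2]), int(f[3]), f[4], f[5], float(f[6]))
        )
    return segments


def find_overlaps(segments):
    """Return (left, right, overlap_start, overlap_end) for neighbouring segments.

    Only consecutive segments (after sorting by chromosome and start) are
    compared, which is enough when overlaps come from sliding windows.
    """
    ordered = sorted(segments, key=lambda s: (s.chrom, s.start))
    overlaps = []
    for left, right in zip(ordered, ordered[1:]):
        if left.chrom == right.chrom and right.start <= left.end:
            overlaps.append((left, right, right.start, min(left.end, right.end)))
    return overlaps


overlaps = find_overlaps(parse_segments(ROWS))
for left, right, start, end in overlaps:
    print(f"chr{left.chrom}:{start}-{end} ({end - start + 1} bp) "
          f"CN {left.copy_number} vs CN {right.copy_number}")
```

In these rows every overlap spans 40,000 bp, which is what a fixed difference between window and step would produce. The poster's actual window and step settings are not given, so that reading is a guess.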
tatinhawk replied

09-23-2014, 08:20 AM
Question about the _CNV output

Dear Valeu,

I would like to ask you something about the "_CNVs" output of Control-FREEC. I have a set of mouse cancer whole genomes that were sequenced at high depth (~45X) using Illumina. I used Control-FREEC to call CNVs on the samples, as well as the BAF (using the set of SNPs identified by the mouse resequencing project on the same mouse strain). I noticed that in the "_CNVs" output file there are overlapping CNVs. For instance (rows as reported in the _CNVs output file):

1 2960000 3029999 2 normal AA 20.8697
1 2990000 3389999 8 gain AAAAABBB 5.57241
1 3350000 3499999 3 gain AAB 44.5596
1 3460000 3549999 11 gain AAAAAAAAABB 100
1 3510000 3739999 3 gain AAB 7.9066
1 11890000 12709999 3 gain AAB 2.14849
1 12670000 16909999 3 gain AAB 0.411016

In most of the cases that I have encountered so far, the overlapping CNV windows have either different predicted genotypes and copy numbers (as in the first example) or only different percentages of uncertainty of the predicted genotype.

In the former case I assume the presence of the overlapping CNVs is due to the prediction of different genotypes (is this correct?), and a filter by percentage of uncertainty would remove them. However, in the latter case the predicted genotypes and copy numbers are the same, and the percentages of uncertainty are low as well.

Do you have any clues on why this might be occurring? Also, would you recommend filtering out CNVs based on the percentages of uncertainty up to the point where one ends up with non-overlapping CNVs?

Thanks and I hope that you have a good day!
valeu replied

08-12-2014, 12:17 AM
Originally posted by bhdavis1978

Hi Valeu,
What would be the consequences of this? More variability in the copy number estimation? More breakpoints? Less confidence in identifying break points?

More variability in the normalized read count signal => less confidence in breakpoints.

Anyway, you can try and then visually check the resulting profile.