  • dganiewich
    replied
    Hi Valentina! How are you?
    I was hoping you could help me with the following:
    I have a human tumor sample from WES which has also been analyzed by cytogeneticists, and they concluded it is near-tetraploid. I also have its matching normal sample, with regular diploidy.
    When running Control-FREEC, should I set ploidy to 3 or 2? In other words, does the ploidy entered refer to the tumor ploidy, the normal ploidy, or does it assume both samples have the same one?
    Thank you very much in advance,
    Best,
    Daiana


  • chmay
    replied
    Hi, I am trying to run Control-FREEC on WGS cancer cell line data I received from a collaborator. However, I am hitting an error for all of the samples:

    Code:
    ..failed to run segmentation on chrGL000207.1
    terminate called after throwing an instance of 'std::bad_alloc'
      what():  std::bad_alloc
    It is the same error in all cases, though not necessarily at the same stage - in most cases the last output generated is:

    Code:
    Will continue with contamination = 0
    ..Identified contamination by normal cells: 0%
    Seeking eventual subclones...
    But in some cases, this step runs to completion:
    Code:
    Seeking eventual subclones...-> Done!
    Total proportion of unexplained regions: 165470 out of 4.47446e+06 = 0.036981
    Attaching an example config file:
    Code:
    [general]
    
    ## parameters chrLenFile and ploidy are required.
    BedGraphOutput=TRUE
    breakPointType = 4
    chrFiles = /projects/wtsspipeline/resources/Homo_sapiens/bfa_NCBI-37-TCGA/hg19a_per_chr_fastas/
    chrLenFile = /projects/wtsspipeline/programs/code/Control-FREEC_1.0.0/resources/hg19_control_FreeC_chr_length.txt
    coefficientOfVariation = 0.062
    contaminationAdjustment=TRUE
    forceGCcontentNormalization = 2
    gemMappabilityFile = /projects/wtsspipeline/resources/Homo_sapiens/bfa_NCBI-37-TCGA/out100m1_hg19.gem
    minCNAlength = 2
    minimalSubclonePresence = 0.05
    maxThreads=8
    outputDir = /projects/ccg_capture_panel/cmay_dev/projects/CLINGEN-5870_LSARP/CLINGEN-6122_Run_CNV_analysis_on_cell_line_BAMS/freec_run1/A12438
    ploidy = 2,3,4,5,6
    sambamba = /gsc/software/linux-x86_64/sambamba-0.5.5/sambamba_v0.5.5
    samtools = /home/rcorbett/aligners/samtools-1.2/samtools
    telocentromeric = 75000
    
    
    [sample]
    
    mateFile = /projects/analysis/analysis19/A12438/merge_bwa-0.5.7/100nt/hg19a/A12438_3_lanes_dupsFlagged.bam
    inputFormat = BAM
    mateOrientation = FR
    I had seen elsewhere that this can happen when the chromosome lengths file doesn't match the chromosome lengths in the BAM files; however, these all match. Any insight would be helpful! Thanks.
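
One way to double-check the length comparison mentioned in this post is sketched below in Python. It is only a sketch under assumptions: the header text is taken to be what `samtools view -H` prints, the lengths file is assumed to be whitespace-separated with the name and length in configurable columns, and the helper names and the sample values in the usage part are hypothetical. Names are compared with any leading `chr` removed, so that unplaced contigs such as `GL000207.1` are included in the check.

```python
def parse_sq_lengths(header_text):
    """Collect {sequence name: length} from the @SQ lines of a SAM header."""
    lengths = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type looks like TAG:value
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        lengths[fields["SN"]] = int(fields["LN"])
    return lengths


def parse_len_file(text, name_col=0, len_col=1):
    """Collect {name: length} from a whitespace-separated lengths file.

    The column positions are assumptions; adjust them to the actual file.
    """
    lengths = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) > max(name_col, len_col):
            lengths[parts[name_col]] = int(parts[len_col])
    return lengths


def strip_chr(name):
    """Drop a leading 'chr' so '1' and 'chr1' compare as equal."""
    return name[3:] if name.startswith("chr") else name


def compare_lengths(bam_lengths, file_lengths):
    """Return (missing_in_file, missing_in_bam, mismatched) name lists."""
    bam = {strip_chr(k): v for k, v in bam_lengths.items()}
    ref = {strip_chr(k): v for k, v in file_lengths.items()}
    missing_in_file = sorted(set(bam) - set(ref))
    missing_in_bam = sorted(set(ref) - set(bam))
    mismatched = sorted(n for n in set(bam) & set(ref) if bam[n] != ref[n])
    return missing_in_file, missing_in_bam, mismatched


# Illustrative input only; real values come from the BAM header and chrLenFile.
header = "@HD\tVN:1.0\n@SQ\tSN:1\tLN:249250621\n@SQ\tSN:GL000207.1\tLN:4262"
len_file = "chr1\t249250621\n"
result = compare_lengths(parse_sq_lengths(header), parse_len_file(len_file))
print(result)
```

In this toy input the contig `GL000207.1` is present in the header but absent from the lengths file, so it is reported in the first list. Whether such a difference has anything to do with the `std::bad_alloc` error is not established in this thread.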


  • valeu
    replied
    Please send me the generated .cnp files and your config: valentina . boeva at inserm . fr
    Valentina


  • ruanys
    replied
    Segmentation fault (core dumped)

    Hi,

    I'm trying to run Control-FREEC on cancer whole genome sequencing data, but I get the error "Segmentation fault (core dumped)". Thanks for any help!


    config file :

    Code:
    [general]
    
    chrLenFile = /public1/users/ruanys/human_genome/ref/b37.len
    chrFiles= /public1/users/ruanys/human_genome/ref/b37_chrfa
    ploidy = 2,3,4
    BedGraphOutput=TRUE
    coefficientOfVariation = 0.062
    outputDir = ./
    sex=XX
    
    #minCNAlength=1
    
    
    [sample]
    
    mateFile = /public2/users/chenbj/called/HCC2.pileup
    inputFormat = pileup
    mateOrientation = FR
    
    
    [control]
    
    mateFile = /public2/users/chenbj/called/CRN.pileup
    inputFormat = pileup
    mateOrientation = FR
    
    
    [BAF]
    
    minimalCoveragePerPosition=0
    SNPfile=/public1/users/ruanys/software/FREEC-9.5/download/hg19_snp142.SingleDiNucl.1based.txt.gz
    shiftInQuality = 33
    full output while running Control-FREEC:
    Code:
    Control-FREEC v9.6 : a method for automatic detection of copy number alterations, subclones and for accurate estimation of contamination and main ploidy using deep-sequencing data
    Non MT-mode
    ..consider the sample being female
    ..Breakpoint threshold for segmentation of copy number profiles is 0.8
    ..telocenromeric set to 50000
    ..FREEC is not going to adjust profiles for a possible contamination by normal cells
    ..Coefficient Of Variation set equal to 0.062
    ..it will be used to evaluate window size
    ..Output directory: ./
    ..Directory with files containing chromosome sequences: /public1/users/ruanys/human_genome/ref/b37_chrfa
    ..Sample file:  /public2/users/chenbj/called/HCC2.pileup
    ..Sample input format:  pileup
    ..Control file: /public2/users/chenbj/called/CRN.pileup
    ..Input format for the control file:    pileup
    ..minimal expected GC-content (general parameter "minExpectedGC") was set to 0.35
    ..maximal expected GC-content (general parameter "maxExpectedGC") was set to 0.55
    ..Polynomial degree for "ReadCount ~ GC-content" normalization is 3 or 4: will try both
    ..Minimal CNA length (in windows) is 1
    ..File with chromosome lengths: /public1/users/ruanys/human_genome/ref/b37.len
    ..Using the default minimal mappability value of 0.85
    ..uniqueMatch = FALSE
    ..FREEC will try to guess the correct ploidy(for each ploidy specified in 'ploidy' parameter)
    ..It will try ploidies: 2
    3
    4
    ..break-point type set to 2
    ..noisyData set to 0
    ..minimal number of reads per window in the control sample is set to 10
    ..Control-FREEC will not look for subclones
    Warning: we recommend setting "window=0" for exome sequencing data
    ..will use SNP positions from /public1/users/ruanys/software/FREEC-9.5/download/hg19_snp142.SingleDiNucl.1based.txt.gz to calculate BAF profiles
    ..Starting reading /public1/users/ruanys/software/FREEC-9.5/download/hg19_snp142.SingleDiNucl.1based.txt.gz to get SNP positions
    ..read 101778434 SNP positions
    PROFILING [tid=139713552041760]: /public1/users/ruanys/software/FREEC-9.5/download/hg19_snp142.SingleDiNucl.1based.txt.gz read in 1329 seconds [readSNPs]
    ..use "pileup" format of reads to calculate BAF profiles
    ..Starting reading /public2/users/chenbj/called/HCC2.pileup to calculate BAF profiles
    will skip chrMT
    will skip chrGL000207.1
    will skip chrGL000226.1
    will skip chrGL000229.1
    will skip chrGL000231.1
    will skip chrGL000210.1
    will skip chrGL000239.1
    will skip chrGL000235.1
    will skip chrGL000201.1
    will skip chrGL000247.1
    will skip chrGL000245.1
    will skip chrGL000197.1
    will skip chrGL000203.1
    will skip chrGL000246.1
    will skip chrGL000249.1
    will skip chrGL000196.1
    will skip chrGL000248.1
    will skip chrGL000244.1
    will skip chrGL000238.1
    will skip chrGL000202.1
    will skip chrGL000234.1
    will skip chrGL000232.1
    will skip chrGL000206.1
    will skip chrGL000240.1
    will skip chrGL000236.1
    will skip chrGL000241.1
    will skip chrGL000243.1
    will skip chrGL000242.1
    will skip chrGL000230.1
    will skip chrGL000237.1
    will skip chrGL000233.1
    will skip chrGL000204.1
    will skip chrGL000198.1
    will skip chrGL000208.1
    will skip chrGL000191.1
    will skip chrGL000227.1
    will skip chrGL000228.1
    will skip chrGL000214.1
    will skip chrGL000221.1
    will skip chrGL000209.1
    will skip chrGL000218.1
    will skip chrGL000220.1
    will skip chrGL000213.1
    will skip chrGL000211.1
    will skip chrGL000199.1
    will skip chrGL000217.1
    will skip chrGL000216.1
    will skip chrGL000215.1
    will skip chrGL000205.1
    will skip chrGL000219.1
    will skip chrGL000224.1
    will skip chrGL000223.1
    will skip chrGL000195.1
    will skip chrGL000212.1
    will skip chrGL000222.1
    will skip chrGL000200.1
    will skip chrGL000193.1
    will skip chrGL000194.1
    will skip chrGL000225.1
    will skip chrGL000192.1
    2842829201 lines read
    PROFILING [tid=139713552033536]: /public2/users/chenbj/called/HCC2.pileup read in 2400 seconds [assignValues]
    ..use "pileup" format of reads to calculate BAF profiles
    ..Starting reading /public2/users/chenbj/called/CRN.pileup to calculate BAF profiles
    will skip chrMT
    will skip chrGL000207.1
    will skip chrGL000226.1
    will skip chrGL000229.1
    will skip chrGL000231.1
    will skip chrGL000210.1
    will skip chrGL000239.1
    will skip chrGL000235.1
    will skip chrGL000201.1
    will skip chrGL000247.1
    will skip chrGL000245.1
    will skip chrGL000197.1
    will skip chrGL000203.1
    will skip chrGL000246.1
    will skip chrGL000249.1
    will skip chrGL000196.1
    will skip chrGL000248.1
    will skip chrGL000244.1
    will skip chrGL000238.1
    will skip chrGL000202.1
    will skip chrGL000234.1
    will skip chrGL000232.1
    will skip chrGL000206.1
    will skip chrGL000240.1
    will skip chrGL000236.1
    will skip chrGL000241.1
    will skip chrGL000243.1
    will skip chrGL000242.1
    will skip chrGL000230.1
    will skip chrGL000237.1
    will skip chrGL000233.1
    will skip chrGL000204.1
    will skip chrGL000198.1
    will skip chrGL000208.1
    will skip chrGL000191.1
    will skip chrGL000227.1
    will skip chrGL000228.1
    will skip chrGL000214.1
    will skip chrGL000221.1
    will skip chrGL000209.1
    will skip chrGL000218.1
    will skip chrGL000220.1
    will skip chrGL000213.1
    will skip chrGL000211.1
    will skip chrGL000199.1
    will skip chrGL000217.1
    will skip chrGL000216.1
    will skip chrGL000215.1
    will skip chrGL000205.1
    will skip chrGL000219.1
    will skip chrGL000224.1
    will skip chrGL000223.1
    will skip chrGL000195.1
    will skip chrGL000212.1
    will skip chrGL000222.1
    will skip chrGL000200.1
    will skip chrGL000193.1
    will skip chrGL000194.1
    will skip chrGL000225.1
    will skip chrGL000192.1
    2841735157 lines read
    PROFILING [tid=139713541543680]: /public2/users/chenbj/called/CRN.pileup read in 1730 seconds [assignValues]
    ..File /public1/users/ruanys/human_genome/ref/b37.len was read
    total genome size: 3.1018e+09
    PROFILING [tid=139713552041760]: /public2/users/chenbj/called/HCC2.pileup read in 10471 seconds [getReadNumberFromPileup]
    read number:   1246830993
    coefficientOfVariation:    0.062
     evaluated window size: 647
     ..Starting reading /public2/users/chenbj/called/HCC2.pileup
     PROFILING [tid=139713552041760]: /public2/users/chenbj/called/HCC2.pileup read in 3551 seconds [fillMyHash]
     2842829201 lines read..
     1246830993 reads used to compute copy number profile
     printing counts into ./HCC2.pileup_sample.cpn
     ..Window size: 647
     ..Will not consider chrY..
     ..Erased chrY from the list of chromosomes
     ..File /public1/users/ruanys/human_genome/ref/b37.len was read
     ..Starting reading /public2/users/chenbj/called/CRN.pileup
     PROFILING [tid=139713552041760]: /public2/users/chenbj/called/CRN.pileup read in 2893 seconds [fillMyHash]
     2841735157 lines read..
     880018556 reads used to compute copy number profile
     printing counts into ./CRN.pileup_control.cpn
     ..Will not consider chrY..
     ..Erased chrY from the list of chromosomes
     ..using GC-content to normalize copy number profiles
     CG-content printed into ./GC_profile.cnp
     ..using GC-content to normalize the control profile
     file ./GC_profile.cnp is read
     ..will remove all windows with read count in the control less than 10
     Warning: control length is not equal to the sample length for chromosome MT
     Warning: control length is not equal to the sample length for chromosome GL000207.1
     Warning: control length is not equal to the sample length for chromosome GL000226.1
     Warning: control length is not equal to the sample length for chromosome GL000229.1
     Warning: control length is not equal to the sample length for chromosome GL000210.1
     Warning: control length is not equal to the sample length for chromosome GL000239.1
     Warning: control length is not equal to the sample length for chromosome GL000235.1
     Warning: control length is not equal to the sample length for chromosome GL000201.1
     Warning: control length is not equal to the sample length for chromosome GL000245.1
     Warning: control length is not equal to the sample length for chromosome GL000203.1
     Warning: control length is not equal to the sample length for chromosome GL000246.1
     Warning: control length is not equal to the sample length for chromosome GL000249.1
     Warning: control length is not equal to the sample length for chromosome GL000196.1
     Warning: control length is not equal to the sample length for chromosome GL000202.1
     Warning: control length is not equal to the sample length for chromosome GL000232.1
     Warning: control length is not equal to the sample length for chromosome GL000206.1
     Warning: control length is not equal to the sample length for chromosome GL000236.1
     Warning: control length is not equal to the sample length for chromosome GL000241.1
     Warning: control length is not equal to the sample length for chromosome GL000243.1
     Warning: control length is not equal to the sample length for chromosome GL000230.1
     Warning: control length is not equal to the sample length for chromosome GL000237.1
     Warning: control length is not equal to the sample length for chromosome GL000233.1
     Warning: control length is not equal to the sample length for chromosome GL000204.1
     Warning: control length is not equal to the sample length for chromosome GL000198.1
     Warning: control length is not equal to the sample length for chromosome GL000208.1
     Warning: control length is not equal to the sample length for chromosome GL000191.1
     Warning: control length is not equal to the sample length for chromosome GL000227.1
     Warning: control length is not equal to the sample length for chromosome GL000228.1
     Warning: control length is not equal to the sample length for chromosome GL000214.1
     Warning: control length is not equal to the sample length for chromosome GL000221.1
     Warning: control length is not equal to the sample length for chromosome GL000209.1
     Warning: control length is not equal to the sample length for chromosome GL000218.1
     Warning: control length is not equal to the sample length for chromosome GL000220.1
     Warning: control length is not equal to the sample length for chromosome GL000213.1
     Warning: control length is not equal to the sample length for chromosome GL000211.1
     Warning: control length is not equal to the sample length for chromosome GL000199.1
     Warning: control length is not equal to the sample length for chromosome GL000215.1
     Warning: control length is not equal to the sample length for chromosome GL000205.1
     Warning: control length is not equal to the sample length for chromosome GL000219.1
     Warning: control length is not equal to the sample length for chromosome GL000224.1
     Warning: control length is not equal to the sample length for chromosome GL000223.1
     Warning: control length is not equal to the sample length for chromosome GL000195.1
     Warning: control length is not equal to the sample length for chromosome GL000222.1
     Warning: control length is not equal to the sample length for chromosome GL000200.1
     Warning: control length is not equal to the sample length for chromosome GL000193.1
     Warning: control length is not equal to the sample length for chromosome GL000194.1
     Warning: control length is not equal to the sample length for chromosome GL000225.1
     Warning: control length is not equal to the sample length for chromosome GL000192.1
     ..will process the control file as well: removing all windows with read count in the control less than 10
     ..Set ploidy for the control genome equal to 2
     ..Running FREEC with ploidy set to 2
     2645.86    -3376.64    1406.51 -189.551
     1256.67    -1598.41    661.752 -88.7931
     642.334    -822.065    341.75  -46.061
     334.722    -432.897    181.612 -24.6936
     172.978    -225.899    95.6776 -13.1404
     87.9735    -114.832    48.6472 -6.68335
     41.8822    -55.0312    23.4134 -3.22546
     20.4673    -27.3629    11.8648 -1.66627
     13.0201    -17.4558    7.57621 -1.0632
     3.97755    -5.10018    2.12297 -0.28691
     2.15448    -2.89124    1.26563 -0.180688
     0.620571   -0.747793   0.292975    -0.0372971
     0  0   0   0
     Number of EM iterations :12
     root mean square error = 27.3447
     -9154.09   19731.4 -15436.1    5205.91 -638.203
     -4604.52   9828.18 -7620.35    2547.45 -309.835
     -2104.57   4427.19 -3386.03    1116.97 -134.13
     -959.901   1981.09 -1487.84    482.197 -56.9187
     -448.968   902.816 -662.002    209.83  -24.2583
     -236.815   466.895 -336.011    104.598 -11.8814
     -116.556   225.078 -159.085    48.7493 -5.4619
     -82.6468   155.462 -106.854    31.7517 -3.43634
     -47.7307   87.0257 -58.1954    16.9145 -1.80325
     -36.084    63.2147 -40.7236    11.4445 -1.18504
     -28.2197   50.9987 -33.7152    9.6528  -1.0098
     -23.3177   42.3226 -28.1507    8.10278 -0.846576
     -8.77687   15.6185 -10.1197    2.82089 -0.284334
     -8.45886   14.7795 -9.41635    2.59205 -0.260067
     -4.2932    7.44233 -4.73822    1.31233 -0.133346
     -0.497625  0.879831    -0.552636   0.14733 -0.0141353
     -0.404366  0.650716    -0.360985   0.0819369   -0.00635136
     0  0   0   0   0
     Number of EM iterations :17
     root mean square error = 27.3093
     2645.86    -3376.64    1406.51 -189.551
     1256.67    -1598.41    661.752 -88.7931
     642.334    -822.065    341.75  -46.061
     334.722    -432.897    181.612 -24.6936
     172.978    -225.899    95.6776 -13.1404
     87.9735    -114.832    48.6472 -6.68335
     41.8822    -55.0312    23.4134 -3.22546
     20.4673    -27.3629    11.8648 -1.66627
     13.0201    -17.4558    7.57621 -1.0632
     3.97755    -5.10018    2.12297 -0.28691
     2.15448    -2.89124    1.26563 -0.180688
     0.620571   -0.747793   0.292975    -0.0372971
     0  0   0   0
     Number of EM iterations :12
     root mean square error = 27.3447
     Y = 110.664*x*x*x+-749.239*x*x+769.89*x+71.4395
     Segmentation fault (core dumped)


  • valeu
    replied
    Hi, I do not see any evident mistake in the config file. If you want me to debug it, please share your config and corresponding files with me. Valentina.Boeva%at%inserm.fr


  • CLFougner
    replied
    Segmentation fault (core dumped)

    Hi Valeu,

    I'm trying to run Control-FREEC on mouse exome sequencing data, but I've run into an issue! It works fine when I run Control-FREEC without the BAF analysis, but when I enable it I get the error "Segmentation fault (core dumped)". I'm wondering if this is an issue you've run into before and if you know how to sort it out?

    The full output from when I run Control-FREEC:
    Code:
    Control-FREEC v9.1 : a method for automatic detection of copy number alterations, subclones and for accurate estimation of contamination and main ploidy using deep-sequencing data
    MT-mode using 4 threads
    ..Breakpoint threshold for segmentation of copy number profiles is 0.8
    ..telocenromeric set to 50000
    ..FREEC is not going to output normalized copy number profiles into a BedGraph file (for example, for visualization in the UCSC GB). Use "[general] BedGraphOutput=TRUE" if you want a BedGraph file
    ..FREEC is not going to adjust profiles for a possible contamination by normal cells
    ..Window = 0 was set
    ..Output directory:     /data2/christian/Sequencing/Output/
    ..Sample file:  /data2/christian/Sequencing/Output/DeduppedBams/123_14_6_correctRGs_mm10_BQSR.sorted.dedupped.bam
    ..Sample input format:  BAM
    ..will use this instance of samtools: 'samtools' to read BAM files
    ..Control file: /data2/christian/Sequencing/Output/DeduppedBams/123_14_8_correctRGs_mm10_BQSR.sorted.dedupped.bam
    ..Input format for the control file:    BAM
    FREEC will create a pileup to compute BAF profile! 
    ...File with SNPs : /data2/christian/Sequencing/ReferenceFiles/hg19_snp142.SingleDiNucl.1based.bed
    ..Polynomial degree for "Sample ReadCount ~ Control ReadCount" normalization is 1
    ..Minimal CNA length (in windows) is 5
    ..File with chromosome lengths: /data2/christian/Sequencing/ReferenceFiles/mm10_chrom_lengths.fa
    ..Mappability and GC-content won't be used
    ..Control-FREEC won't use minimal mappability. All windows overlaping capture regions will be considered
    ..Mappability file/data2/christian/Sequencing/ReferenceFiles/GEM_mapp_GRCm38_68_mm10.gem be used: all low mappability positions will be discarded
    ..uniqueMatch = FALSE
    ..average ploidy set to 2
    ..break-point type set to 4
    ..noisyData set to 1
    ..minimal number of reads per window in the control sample is set to 10
    Creating Pileup file to compute BAF profile...
    ..will increase flanking regions by 100 bp
    Segmentation fault (core dumped)

    My config file is as follows:
    Code:
    [general]
    chrLenFile = /data2/christian/Sequencing/ReferenceFiles/mm10_chrom_lengths.fa
    bedtools=/data2/christian/Sequencing/Frameworks/bedtools2/bedtools
    ploidy = 2
    gemMappabilityFile = /data2/christian/Sequencing/ReferenceFiles/GEM_mapp_GRCm38_68_mm10.gem
    noisyData=TRUE
    outputDir=/data2/christian/Sequencing/Output/
    printNA=FALSE
    samtools=samtools
    window=0
    telocentromeric=50000
    breakPointType=4
    breakpointThreshold=0.6
    minCNAlength=5
    maxThreads=4
    
    
    [sample]
    mateFile = /data2/christian/Sequencing/Output/DeduppedBams/123_14_6_correctRGs_mm10_BQSR.sorted.dedupped.bam
    inputFormat = BAM
    mateOrientation = FR
    
    
    [control]
    mateFile = /data2/christian/Sequencing/Output/DeduppedBams/123_14_8_correctRGs_mm10_BQSR.sorted.dedupped.bam
    inputFormat = BAM
    mateOrientation = FR
    
    [BAF]
    SNPfile=/data2/christian/Sequencing/ReferenceFiles/mm10_dbSNP137.ucsc.freec.txt
    fastaFile=/data2/christian/Sequencing/ReferenceFiles/mm10.fa
    makePileup=/data2/christian/Sequencing/ReferenceFiles/mm10_dbSNP137.ucsc.freec.bed
    minimalCoveragePerPosition=5
    
    [target]
    captureRegions=/data2/christian/Sequencing/ReferenceFiles/S0276129/S0276129_AllTracks.bed
    Specifically, the error disappears when I remove the 'makePileup=' line (although then the BAF analysis isn't performed). The file is generated according to the instructions on the FREEC website (awk-ing the SNP-file for mm10 that's posted on the website).

    I'm running the analysis on exome data from mouse tumors, sequenced on an Illumina HiSeq in paired-end mode using the Agilent Mouse All Exon kit. The files have been aligned to mm10 using BWA-MEM and deduplicated with Picard. I'm running the analysis on Ubuntu (64 bit). I downloaded the Control-FREEC framework and the relevant SNP and mappability files from your website 2-3 days ago.

    Any help is much appreciated!


  • morrowliu
    replied
    Originally posted by smapdy
    I ended up figuring out what was going on. I had some multiallelic variants in the .snp file that were causing it to fail to load, and my sex variable in the configuration file didn't match up with the actual sample sex which caused problems as well. I ended up dropping the sex argument and using the following general configuration file for my samples:
    [general]
    window = 8000
    step = 2500
    samtools = samtools
    minCNAlength = 4
    BedGraphOutput = TRUE
    chrLenFile = NCBIM37_um.fa.len
    chrFiles = chrfiles
    outputDir = 31208T_31668N_FREEC_V1
    printNA = FALSE
    maxThreads = 6
    ploidy = 2
    breakPointType = 4
    contaminationAdjustment = TRUE
    noisyData = TRUE

    [sample]
    mateFile = 31208_EXOME.pileup.gz
    inputFormat = pileup
    mateOrientation = 0

    [control]
    mateFile = 31668_EXOME.pileup.gz
    inputFormat = pileup
    mateOrientation = 0

    [target]
    captureRegions = S0276129_Merged_Sorted_Probes.bed

    [BAF]
    SNPfile = snp128.singlebases.monoalleleic.freec_baf.txt
    minimalCoveragePerPosition = 5

    If anyone is interested I also have the commands I used to generate the pileups from the .bams, as well as the script I used to generate a working Mm9 and Mm10 .snp file.
    Hi smapdy,

    I am also working on a mouse project and want to use FREEC to call CNVs. However, when I use the Snp137 file I get the same error message you mentioned above.
    I realize it has been 2 years, but I am still wondering whether you could send me the mm10 .snp file?

    Thank you very much!
    Best,
    Yihua
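
The multiallelic filtering that smapdy describes in the quoted post could be sketched roughly as follows. This is not smapdy's script (which was never posted in this thread); it is a minimal Python illustration that assumes a tab-separated SNP table in which one column holds the observed alleles in a form like `A/G`. The column index, the function names and the sample rows are all hypothetical and must be adapted to the real file.

```python
VALID_BASES = {"A", "C", "G", "T"}


def is_biallelic_snv(observed):
    """True for entries like 'A/G': exactly two alleles, each a single base."""
    alleles = observed.split("/")
    return len(alleles) == 2 and all(a in VALID_BASES for a in alleles)


def filter_snp_lines(lines, observed_col):
    """Keep only lines whose observed-alleles column is a biallelic SNV.

    observed_col is the 0-based index of the alleles column (an assumption
    about the file layout, not something documented in this thread).
    """
    kept = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and empty lines
        fields = line.rstrip("\n").split("\t")
        if observed_col < len(fields) and is_biallelic_snv(fields[observed_col]):
            kept.append(line)
    return kept


# Made-up rows: chromosome, start, end, id, observed alleles
rows = [
    "chr1\t3000018\t3000019\trsA\tA/G\n",
    "chr1\t3000100\t3000101\trsB\tA/C/T\n",  # multiallelic, dropped
    "chr1\t3000200\t3000201\trsC\t-/T\n",    # indel-like, dropped
]
kept = filter_snp_lines(rows, observed_col=4)
print(len(kept))
```

Only the first row survives in this toy input. The sex setting that smapdy also mentions is a separate issue and is not addressed by this filter.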


  • vd4mindia
    replied
    What parameters should be used for normal/tumor clone pairs with varying coverage?

    I would like to discuss the samples I am using to infer CNVs from exome data with Control-FREEC. I am using WES tumor data: a polyclonal tumor sample at 70X coverage and its matched normal (blood) at the same coverage. I used window = 500 and step = 250 to infer the CNVs and found 120 significant CNVs, with a median length of 42 kb per called region.

    I am applying the same parameters to infer CNVs from my reprogrammed tumor clones (iPSCs), which were sequenced at 35X since each is a single clone; the normal control in that case is again the 70X blood sample. Should the window length be the same as for the tumor/normal pair? I ran it with the same window and found that the median size (in bases) of the called regions is higher for the single-clone iPSCs than for the tumor. Should I double the window and step size for the single clone, or halve them? Since the normal blood is at 70X while the iPSC clone is at 35X, won't the results be spurious if I use the same window and step as for the tumor/normal pair, where both samples are at 70X? What would be the ideal window and step when the control has double the coverage of the tumor sample? Or is it preferable to use coefficientOfVariation? If so, what value would you suggest? I would also like advice on the breakPointType and breakPointThreshold to use.

    I am attaching the config file I used for the normal/tumor pair (both 70X); I used the same config for the normal/tumor-iPSC pair (70X/35X). The results look promising, but I wonder whether I am compromising sensitivity by keeping the parameters the same for normal/tumor and normal/iPSC pairs with different coverage. As far as I know, the read depths are normalized for both samples before the CNVs are calculated, but I would still like suggestions on which parameters to change for varying normal/tumor depth.
    Should I also use intercept = 0 and readCountThreshold >= 50, since this is WES data?

    Code:
    [general]
    
    chrLenFile = /scratch/GT/vdas/pietro/exome_seq/test_Control_FREEC/hs19_chr.len
    window = 500
    
    step = 250
    ploidy = 2
    
    outputDir = /scratch/GT/vdas/pietro/exome_seq/results/control_freec_out/output_S313_tumor/
    BedGraphOutput=TRUE
    breakPointType=4
    
    gemMappabilityFile = /scratch/GT/vdas/pietro/exome_seq/test_Control_FREEC/out100m1_hg19.gem
    
    chrFiles =  /scratch/GT/vdas/test_exome/exome/
    
    maxThreads=6
    
    breakPointThreshold=1.5
    noisyData=TRUE
    printNA=FALSE
    #breakPointThreshold = -.002;
    #window = 50000
    #chrFiles = hg18/hg18_per_chromosome
    #outputDir = test
    #degree=3
    #intercept = 0
    
    [sample]
    
    mateFile = /scratch/GT/vdas/pietro/exome_seq/results/T_S7998/T_S7998.realigned.recal.bam
    inputFormat = bam
    mateOrientation = FR
    
    [control]
    
    mateFile = /scratch/GT/vdas/pietro/exome_seq/results/N_S8980/N_S8980.realigned.recal.bam
    inputFormat = bam
    mateOrientation = FR
    
    [BAF]
    
    SNPfile = /scratch/GT/vdas/pietro/exome_seq/test_Control_FREEC/hg19_snp137.SingleDiNucl.1based.txt
    minimalCoveragePerPosition = 5
    
    [target]
    
    captureRegions = /scratch/GT/vdas/referenceBed/hg19/ss_v4/Exon_SSV4_clean.bed
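
As a rough aid for the window-size question above, the arithmetic can be sketched in Python. Everything here is a back-of-the-envelope approximation, not Control-FREEC's documented behaviour: read counts per window are treated as Poisson, the read length of 100 is a hypothetical value (the post does not state it), and the `window_for_cv` formula is an assumption that happens to reproduce the "evaluated window size: 647" printed in ruanys's log in this thread (read number 1246830993, genome size 3.1018e+09, coefficientOfVariation 0.062).

```python
import math


def reads_per_window(coverage, window, read_length):
    """Approximate number of reads starting in a fully covered window."""
    return coverage * window / read_length


def poisson_cv(expected_reads):
    """Coefficient of variation of a Poisson count with the given mean."""
    return 1.0 / math.sqrt(expected_reads)


def window_for_cv(genome_size, n_reads, cv):
    """Window size at which the Poisson CV of the read count equals cv.

    Assumed formula: the window must hold 1 / cv**2 reads on average.
    """
    return genome_size / (n_reads * cv ** 2)


READ_LENGTH = 100  # hypothetical; not given in the post

for coverage in (70, 35):
    n = reads_per_window(coverage, 500, READ_LENGTH)
    print(f"{coverage}X, window=500: ~{n:.0f} reads/window, CV ~{poisson_cv(n):.3f}")

# Window that gives the 35X sample the same expected count as 70X at window=500
print(reads_per_window(35, 1000, READ_LENGTH))

# Consistency check against the WGS log quoted earlier in the thread
print(int(window_for_cv(3.1018e9, 1246830993, 0.062)))
```

Under these assumptions, halving the coverage at a fixed window halves the expected read count per window, and doubling the window restores it. For exome data the estimate only applies inside captured regions, so it should be read as an order-of-magnitude guide rather than a recommendation.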


  • AnweshaM7
    replied
    Error while running Control-FREEC

    Hi,
    I have BAM files for my sample. I ran Control-FREEC (WGS) on all chromosomes and got _CNV files for them.
    However, for chromosomes X and Y I am getting the error:
    'Unable to proceed..
    Try to rerun the program with higher number of reads'

    The tumor data has 27x coverage (hg18).
    I have tried window lengths of 1000, 1500 and 3000 but still get the same error.

    I do not understand the reason for this error.

    Thanks
    Anwesha


  • shruti
    replied
    Hi,

    I am running Control-FREEC on matched tumor/normal pairs from whole exome sequencing.
    However, for one sample I always get this error:

    Initial guess for polynomial:
    Error: variation in read count per window is too small.
    Unable to proceed..
    Wed Nov 12 14:41:11 GMT 2014

    I have tried to increase the window size but still get the same problem. Last setting for window size was 1500.

    The average coverage for the normal and tumor is 107x and 24x respectively.

    I am a bit clueless here.. should I increase or decrease the window size?

    Thanks

    Regards
    Shruti


  • valeu
    replied
    Originally posted by AnweshaM7
    Hi, I would like to download the SNP track described as "all tracks > SNP130 (if you're using hg18; for hg19 it's 131)". Could you provide the hg18 SNP130 txt file? I checked UCSC but could not work out which filters to select. Secondly, how do I change the order of the columns? I checked the tutorial but could not find an option to do so.
    I will try to add it. But I assure you that the results will be the same as if you use hg18_snp130.SingleDiNucl.1based.txt


  • AnweshaM7
    replied
    Hi, I would like to download the SNP track described as "all tracks > SNP130 (if you're using hg18; for hg19 it's 131)". Could you provide the hg18 SNP130 txt file? I checked UCSC but could not work out which filters to select. Secondly, how do I change the order of the columns? I checked the tutorial but could not find an option to do so.

    Thanks
    Anwesha


  • valeu
    replied
    Originally posted by tatinhawk
    I noticed that in the "_CNVs" output file there are overlapping CNVs.
    FREEC uses overlapping windows to scan the genome (if step < window). This is why you may have overlapping predictions. The breakpoint should be located somewhere in the overlapping part.
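
To make the point about overlapping predictions concrete, here is a small Python sketch that lists the region shared by each pair of adjacent calls. The three records are copied from the `_CNVs` excerpt in tatinhawk's question; the function name and the tuple layout are hypothetical, and this is an illustration of the explanation above rather than anything FREEC itself provides.

```python
def adjacent_overlaps(calls):
    """Return the overlapping part of each pair of consecutive calls.

    calls: list of (chrom, start, end) tuples with inclusive coordinates,
    already sorted by chromosome and start position.
    """
    overlaps = []
    for (c1, s1, e1), (c2, s2, e2) in zip(calls, calls[1:]):
        if c1 == c2 and s2 <= e1:
            # The breakpoint is expected somewhere inside this shared span
            overlaps.append((c1, s2, min(e1, e2)))
    return overlaps


# First three records of the _CNVs excerpt from the question
calls = [
    ("1", 2960000, 3029999),
    ("1", 2990000, 3389999),
    ("1", 3350000, 3499999),
]

for chrom, start, end in adjacent_overlaps(calls):
    print(chrom, start, end, end - start + 1)
```

For these records the shared spans are 2990000-3029999 and 3350000-3389999 on chromosome 1, each 40000 bp long, which is where the corresponding breakpoints would be expected to lie.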


  • tatinhawk
    replied
    Question about the _CNV output

    Dear Valeu,

    I would like to ask you something about the "_CNVs" output of Control-FREEC. I have a set of mouse cancer whole genomes that have been sequenced at high depth (~45X) using Illumina. I have used Control-FREEC to call CNVs on the samples as well as the BAF (using the set of SNPs identified by the mouse resequencing project on the same mouse strain). I noticed that in the "_CNVs" output file there are overlapping CNVs. For instance (as reported in the _CNVs output file):

    1 2960000 3029999 2 normal AA 20.8697
    1 2990000 3389999 8 gain AAAAABBB 5.57241
    1 3350000 3499999 3 gain AAB 44.5596

    1 3460000 3549999 11 gain AAAAAAAAABB 100
    1 3510000 3739999 3 gain AAB 7.9066
    1 11890000 12709999 3 gain AAB 2.14849
    1 12670000 16909999 3 gain AAB 0.411016


    In most of the cases that I have encountered so far, the overlapping CNV windows either have different predicted genotypes and copy numbers (as in the first example) or differ only in the percentage of uncertainty of the predicted genotype.

    In the former case I assume the presence of the overlapping CNVs is due to the prediction of different genotypes (is this correct?) and a filter by percentage of uncertainty would remove them. However, in the latter the predicted genotypes and copy numbers are the same and the percentages of uncertainty are low as well.

    Do you have any clues on why this might be occurring? Also, would you recommend filtering out the CNVs based on the percentages of uncertainty up to the point where one ends up with non-overlapping CNVs?

    Thanks and I hope that you have a good day!


  • valeu
    replied
    Originally posted by bhdavis1978
    Hi Valeu,
    What would be the consequences of this? More variability in the copy number estimation? More breakpoints? Less confidence in identifying break points?
    More variability in the normalized read count signal => less confidence in breakpoints.

    Anyway, you can try and then visually check the resulting profile.
