Help with FREEC output
First off, I must say FREEC is a great tool for CNV detection in exome-seq data!
I have a few questions about the output files I obtained.
I have two sets of _CNV, _ratio, and _BAF files.
For instance, how is *_mpileup_CNV different from *mpileup_normal_CNV? Also, my R plots look very different depending on which file I use. Why would this be?
PS: I have paired-end tumor-normal Illumina data from exome sequencing.
~Thanks for your help,
Rini
-
Posted by fjrossello
Hi Valeu,
Thanks for your explanation. Regarding the R plots, I downloaded the latest makeGraph.R and it works perfectly.
Cheers,
Fernando
I'm having problems finding the most recent version of this.
Thanks in advance
-
Hi Valeu,
Sorry to be so insistent on this. I re-ran Control-FREEC on an mpileup file of one of my samples, with and without the BAF options, and found a few differences between the two runs. First, a simple and rather obvious question: if you have a matched control file, does the CNA-only analysis output only the somatic gain/loss regions of the sample? I ask because the CNA+BAF run outputs a CNVs file that reports genotype information and gain/loss/normal status alongside the predicted copy number. When I filter this file to keep only somatic gains/losses and compare it with the CNA-only output, the results are not quite the same.
Is this a fair comparison? Am I missing something that prevents me from understanding these results?
Thanks in advance.
Cheers,
Fernando
PS: Find below the parameters of my config file. As I said, I ran it with and without BAF, i.e., with the [BAF] section commented out.
[general]
chrLenFile = hg19.len
coefficientOfVariation = 0.05
outputDir = ./ch209_cnv_CNA_only
degree = 3
ploidy = 2
samtools = /usr/local/biotools/bin/samtools
sex = XY
chrFiles = /home/fernandr/biotools/references/iGenomes/Homo_sapiens/UCSC/hg19/Sequence/Chromosomes
# step = 5000
# window = 20000
[sample]
mateFile = /media/data/projects/wg_fr_20121024/sample_mpileup_files/sample_bwa_wg.mpileup
inputFormat = pileup
mateOrientation = FR
[control]
mateFile = /media/data/projects/wg_fr_20121024/sample_mpileup_files/control_bwa_wg.mpileup
inputFormat = pileup
mateOrientation = FR
# [BAF]
#
# SNPfile = /home/fernandr/biotools/references/freec/hg19/hg19_snp131.SingleDiNucl.1based.txt
# minimalCoveragePerPosition = 1
# minimalQualityPerPosition = 0
# shiftInQuality = 33
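The comparison described in this post (filter the CNA+BAF calls to gains/losses, then compare them with the CNA-only calls) can be sketched in Python. This is a minimal sketch under stated assumptions, not part of Control-FREEC: it assumes the _CNVs files are tab-separated with the columns chromosome, start, end, copy number and status, followed by optional genotype columns when [BAF] is used. The function names `parse_cnvs` and `compare_calls` are hypothetical.

```python
from collections import defaultdict


def parse_cnvs(lines, keep=("gain", "loss")):
    """Parse _CNVs lines, keeping only rows whose status is in `keep`.

    Assumed columns: chr, start, end, copy number, status, [genotype, ...].
    """
    calls = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5 or fields[4] not in keep:
            continue  # skips "normal" rows and malformed lines
        chrom, start, end, copies, status = fields[:5]
        calls[chrom].append((int(start), int(end), int(copies), status))
    return calls


def compare_calls(calls_a, calls_b):
    """Split calls in A into those overlapping a same-status call in B and the rest."""
    matched, only_a = [], []
    for chrom, segments in calls_a.items():
        for start, end, copies, status in segments:
            hits = [
                other for other in calls_b.get(chrom, [])
                if other[0] <= end and start <= other[1] and other[3] == status
            ]
            (matched if hits else only_a).append((chrom, start, end, copies, status))
    return matched, only_a
```

To use it on real output, pass open file handles, for example `parse_cnvs(open("sample_CNVs"))` (file name assumed). Overlap is used instead of exact coordinate equality because the two runs may place breakpoints slightly differently.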
-
No, mateOrientation is not relevant when you use pileup. Still, you need to set this parameter to something.
-
Originally posted by valeu:
Hi Fernando,
I think running FREEC on a pileup should be more or less identical to running it on a BAM file with "mateOrientation=0". In this case, all reads are taken into account when calculating the read count per window. When you select "mateOrientation=FR" for a BAM file, FREEC keeps only pairs mapped in the correct orientation and with the correct insert size.
Also, in some cases having BAF info can improve predictions (e.g., when the float copy number is 2.5 and FREEC hesitates between assigning 2 or 3 copies to the region).
Also, in version 5.9 and earlier there was a bug that prevented FREEC from getting the correct read count in windows with extremely high coverage (> 1000x per position) when using .pileup files. This bug is fixed in 6.0, which should be available next week. The new version also runs ~10x faster on an 8-core computer. It can process a 30x genome (with control and BAF, from pileup.gz) in one hour.
Just to be clear, when you use a pileup file, should the mateOrientation parameter be set to 0? Is that parameter relevant at all when using this format?
Thanks in advance.
Cheers,
Fernando
-
Hi Fernando,
I think running FREEC on a pileup should be more or less identical to running it on a BAM file with "mateOrientation=0". In this case, all reads are taken into account when calculating the read count per window. When you select "mateOrientation=FR" for a BAM file, FREEC keeps only pairs mapped in the correct orientation and with the correct insert size.
Also, in some cases having BAF info can improve predictions (e.g., when the float copy number is 2.5 and FREEC hesitates between assigning 2 or 3 copies to the region).
Also, in version 5.9 and earlier there was a bug that prevented FREEC from getting the correct read count in windows with extremely high coverage (> 1000x per position) when using .pileup files. This bug is fixed in 6.0, which should be available next week. The new version also runs ~10x faster on an 8-core computer. It can process a 30x genome (with control and BAF, from pileup.gz) in one hour.
-
Hi Valeu,
This is Fernando again. I have re-run FREEC on one of my samples, for which I previously ran the CNA analysis from a SAM file (unsorted, using the FR mateOrientation parameter). This time I wanted to run the CNA + BAF analyses. To run BAF, I first created a pileup from the sample SAM file and then ran FREEC with exactly the same parameters.
Although the results look the same graphically (R plots), the CNVs text files produced by the two analyses differ slightly. The differences are in the start and end positions (the regions are roughly the same) and in the predicted copy number.
Is there any reason why this could be happening? Which result should be more reliable?
Thanks in advance.
Cheers,
Fernando
-
You need to define the window size (window=1000), and you have to run it with a control dataset when you use the "target" option.
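The two requirements stated in this reply can be checked before launching a run. The following is a minimal Python sketch, not part of Control-FREEC: `check_target_config` is a hypothetical helper, and it assumes the config file can be read by the standard `configparser` module (sections in brackets, `key = value` lines, `#` comments).

```python
import configparser


def check_target_config(config_text):
    """Return a list of problems for a config that uses the [target] option."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names such as mateFile case-sensitive
    parser.read_string(config_text)

    problems = []
    if not parser.has_section("target"):
        return problems  # the rule only applies when [target] is used
    if not parser.has_option("general", "window"):
        problems.append("[general] window is not set")
    if not parser.has_option("control", "mateFile"):
        problems.append("[control] mateFile is not set")
    return problems
```

An empty list means both conditions from the reply are met; it says nothing about whether the rest of the config is valid.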
-
Error while specifying target BED file
Hello everyone,
I have been trying out Control-FREEC with some test data (exome samples), and I encountered an error when trying to specify a target BED file.
Basically, Control-FREEC seems to run fine, whether I use a control sample or not (I tried both options), but when I add these lines :
Code:
[target]
captureRegions = /home/volatile/swe/exomes/TruSeq-for-FREEC.bed

Code:
FREEC v5.9 (Control-FREEC v2.9) : calling copy number alterations and LOH regions using deep-sequencing data
..Using 1 process(es)
..Minimal CNA length (in windows) was set to 4
..consider the sample being male
..breakPointThreshold set to 0.8
..Polynomial degree for "ReadCount ~ GC-content" or "Sample ReadCount ~ Control ReadCount" is 3
..FREEC is not going to output normalized copy number profiles into a BedGraph file. Use "[general] BedGraphOutput=TRUE" if you want a BedGraph file
..FREEC is not going to adjust profiles for a possible contamination by normal cells
..Output directory: /home/volatile/swe/2013-01-10/Test-FREEC5
..Directory with files containing chromosome sequences: /home/genmol/genomes/homo_sapiens/hg19/chromosomes
..Sample file: /home/volatile/swe/exomes/exome2.bam
..Sample input format: BAM
..will use this instance of samtools: samtools to read BAM files
..Control file: /home/volatile/swe/exomes/exome1.bam
..Input format for the control file: BAM
..File with chromosome lengths: hg19.len
..Coefficient Of Variation set equal to 0.062
..Note, this coefficient won't be used if "window" is set
..File hg19.len was read
total genome size: 3.09568e+09
..samtools should be installed to be able to read BAM files
read number: 76963934
coefficientOfVariation: 0.062
evaluated window size: 10464
..Starting reading /home/volatile/swe/exomes/exome2.bam
..samtools should be installed to be able to read BAM files; will use the following command for samtools: samtools view /home/volatile/swe/exomes/exome2.bam
76963934 lines read..
75080830 reads used to compute copy number profile
printing counts into /home/volatile/swe/2013-01-10/Test-FREEC5/exome2.bam_sample.cpn
..Window size: 10464
..Will use hg19.len to calculate RC for the control sample
..File hg19.len was read
..Starting reading /home/volatile/swe/exomes/exome1.bam
..samtools should be installed to be able to read BAM files; will use the following command for samtools: samtools view /home/volatile/swe/exomes/exome1.bam
51311982 lines read..
50082356 reads used to compute copy number profile
printing counts into /home/volatile/swe/2013-01-10/Test-FREEC5/exome1.bam_control.cpn
..FREEC will take into account only regions from /home/volatile/swe/exomes/TruSeq-for-FREEC.bed
..Mappability and GC-content won't be used
..Control-FREEC won't use minimal mappability. All windows overlaping capture regions will be considered
..Reading /home/volatile/swe/exomes/TruSeq-for-FREEC.bed
..Your file must be in .BED format, and it must be sorted
..Reading capture for chromosome 1
..Reading capture for chromosome 2
..Reading capture for chromosome 3
..Reading capture for chromosome 4
..Reading capture for chromosome 5
..Reading capture for chromosome 6
..Reading capture for chromosome 7
..Reading capture for chromosome 8
..Reading capture for chromosome 9
..Reading capture for chromosome 10
..Reading capture for chromosome 11
..Reading capture for chromosome 12
..Reading capture for chromosome 13
..Reading capture for chromosome 14
..Reading capture for chromosome 15
..Reading capture for chromosome 16
..Reading capture for chromosome 17
..Reading capture for chromosome 18
..Reading capture for chromosome 19
..Reading capture for chromosome 20
..Reading capture for chromosome 21
..Reading capture for chromosome 22
..Reading capture for chromosome X
..Reading capture for chromosome Y
file /home/volatile/swe/exomes/TruSeq-for-FREEC.bed is read
..Setting read counts to Zero for all windows outside of capture
..Total size of captured regions 6.18842e+07bp
..processing chromosome 1
..processing chromosome 2
..processing chromosome 3
..processing chromosome 4
..processing chromosome 5
..processing chromosome 6
..processing chromosome 7
..processing chromosome 8
..processing chromosome 9
..processing chromosome 10
..processing chromosome 11
..processing chromosome 12
..processing chromoso..At this point you need to profide window size, option 'window' in group of parameters [general] in your config file me 13
..processing chromosome 14
..processing chromosome 15
..processing chromosome 16
..processing chromosome 17
..processing chromosome 18
..processing chromosome 19
..processing chromosome 20
..processing chromosome 21
..processing chromosome 22
..processing chromosome X
..processing chromosome Y
..telocenromeric set to 1 since it is a minimal capture region
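The "evaluated window size: 10464" in this log can be reproduced from the other numbers it prints. The sketch below is an interpretation, not taken from the FREEC source: it assumes Poisson-distributed read counts, so that the coefficient of variation is 1/sqrt(reads per window). The function name `estimate_window_size` is hypothetical.

```python
def estimate_window_size(genome_size, read_count, coefficient_of_variation):
    """Window size at which the expected read count per window gives the requested CoV.

    Assumes Poisson-distributed counts, so CoV = 1 / sqrt(reads per window).
    """
    reads_per_window = 1.0 / coefficient_of_variation ** 2
    return genome_size * reads_per_window / read_count


# Values copied from the log: total genome size, read number, coefficientOfVariation.
window = estimate_window_size(3.09568e9, 76963934, 0.062)
print(round(window))  # 10464, matching "evaluated window size" in the log
```

Note that this estimate spreads the reads over the whole genome, whereas exome reads are concentrated in about 6.2e7 bp of capture regions. That is consistent with the later message asking for an explicit `window` in [general] when [target] is used.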
I formatted my BED file as follows:
chr start end
(tab-delimited), and it's ordered by chr (chr1, chr2, ... chr22, chrX, chrY), and then by start position.
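The format described here (tab-delimited, sorted by chromosome and then by start) can be verified with a short Python check. This is a hypothetical helper for illustration, not a Control-FREEC tool, and it assumes the chromosome order chr1 to chr22, chrX, chrY given above.

```python
def chrom_rank(name):
    """Order chr1..chr22, chrX, chrY; unknown names sort last."""
    label = name[3:] if name.startswith("chr") else name
    if label.isdigit():
        return int(label)
    return {"X": 23, "Y": 24}.get(label, 25)


def check_bed(lines):
    """Return a list of (line_number, message) problems found in BED lines."""
    problems = []
    previous = None
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or not (fields[1].isdigit() and fields[2].isdigit()):
            problems.append((number, "expected tab-delimited chr, start, end"))
            continue
        key = (chrom_rank(fields[0]), int(fields[1]))
        if previous is not None and key < previous:
            problems.append((number, "not sorted by chromosome and start"))
        previous = key
    return problems
```

A clean result only rules out formatting and sorting problems in the BED file. The message in the log above points at the missing `window` option instead.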
Am I doing something wrong here?
Thanks in advance.
Regards,
Stephane
PS: Since samtools' pileup function is now deprecated, it's not possible to generate pileup files anymore. Do you plan on supporting BAM or VCF files as input for the BAF calculation function? Or do you know how I can work around this limitation? Thanks.
Last edited by stephwen; 01-10-2013, 05:08 AM. Reason: added question about BAM or VCF support for BAF calculation
-
Hi Valeu,
Thanks for your explanation. Regarding the R plots, I downloaded the latest makeGraph.R and it works perfectly.
Cheers,
Fernando
-
Hi Fernando,
Originally posted by fjrossello: Are they the output obtained when CNV and LOH were calculated on the control sample using the GC_profile.cnp?
Originally posted by fjrossello: Any idea why this is happening?
What does it write into the command line?
-
Hi Valeu,
I am using Control-FREEC to detect CNV and LOH in normal vs. tumor samples (low-pass whole genome).
I had no problems running it at all. However, I would like to ask you a couple of questions about the output files and the plotting process.
First, when I run CNV + LOH using SAM pileups, apart from creating the standard _CNVs, _ratio.txt, _BAF.txt, _sample.cpn, _control.cpn and GC_profile.cnp output files, it also generates three extra files with the suffixes _normal_CNVs, _normal_ratio.txt and _normal_BAF.txt. Are they the output obtained when CNV and LOH were calculated on the control sample using the GC_profile.cnp?
Second, although it works flawlessly for the CNV ratio data, I cannot get the makeGraph.R script to plot the LOH _BAF.txt file.
I used the following line:
cat /usr/local/biotools/freec/scripts/makeGraph.R | R --slave --args 2 sample_bwa_wg.mpileup_ratio.txt sample_bwa_wg.mpileup_BAF.txt
Any idea why this is happening?
Thanks in advance.
Cheers,
Fernando
-
Hi Hao,
You know, two cell lines for the same type of cancer can be very different, especially for "non-copy-number" tumors.
But even for "copy-number" tumors, such as neuroblastoma, CNA regions can differ. See, for example, the sequencing data for neuroblastoma samples in the supplementary figures of Molenaar et al., 2012.
-
Hi valeu,
I currently have data from two human cancer cell samples (the same cancer), with coverage depths of about 33x and 39x and depth statistics for each base. In this case, what is the best software for CNV detection? I used FREEC with window=3000 and step=1000 (other parameters as in the test config file provided on the website). My problem is how to view the CNVs and how to compare the two results. Besides listing all the CNVs with their type, start and end positions, and copy number, what other statistics are usually used to analyze CNVs?
I find that the CNVs detected in these two samples have nothing in common: the breakpoints are different and the copy numbers are different. It seems strange that two cancer cell samples from the same cancer would have completely different CNVs, so I am wondering whether something is wrong here.
Thank you !
-
Hi Hao,
Originally posted by yuhao: The output intervals have some overlaps, e.g., 58000, 8387999, 3, gain and 8386000, 9404999, 5, gain, so 8386000 < 8387999. How could this happen?
Originally posted by yuhao: What does control database mean here? Normally we just have a test genome and a reference genome.
Originally posted by yuhao: As far as I know, there are typically two different approaches to calling CNVs, segmentation-based and hidden Markov model-based. Is FREEC segmentation-based?
Pubmed links
Both papers are in open access. Have a look!
FREEC uses Lasso-based segmentation.
Originally posted by yuhao: How do we determine the window size and step parameters? Which parameters affect the accuracy of the result? This is crucial for the result, so I care a lot about it.
Using "step" will help to improve sensitivity and get prettier graphs, but it can be time consuming.
One of the most important parameters is "breakpoint threshold" (positive, default 0.8). Use smaller values to get more segments, if by eye you see that segmentation was not sensitive enough.
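The effect of "step" on window placement can be illustrated with a short Python sketch. It also offers one possible reading of the overlapping intervals quoted earlier in this reply. This is an assumption and not confirmed in the thread: it supposes that a segment is reported from the start of its first window to the end of its last window, and it uses window=3000 and step=1000 as in yuhao's run. The helper names are hypothetical.

```python
def sliding_windows(chrom_length, window, step):
    """Yield (start, end) for windows of size `window` placed every `step` bases."""
    for start in range(0, chrom_length, step):
        yield start, min(start + window, chrom_length) - 1


def segment_bounds(windows, first, last):
    """Coordinates of a segment spanning windows[first..last], inclusive."""
    return windows[first][0], windows[last][1]


windows = list(sliding_windows(chrom_length=10_000_000, window=3000, step=1000))
left = segment_bounds(windows, 58, 8385)     # windows starting at 58000 .. 8385000
right = segment_bounds(windows, 8386, 9402)  # windows starting at 8386000 .. 9402000
print(left, right)  # (58000, 8387999) (8386000, 9404999)
```

Under this reading, adjacent segments overlap by window - step = 2000 bp, which matches the 58000-8387999 and 8386000-9404999 intervals in the question.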
Originally posted by yuhao: Finally, aside from FREEC, can you recommend other software that is widely used for CNV detection? (I have many choices, but I don't know which ones are best.) I also tried CNVnator, but the result seems very different from FREEC's.