  • CNV + LOH detection on NGS data with Control FREEC

    Hello everybody !

    I am using Control-FREEC (FREEC v6.4 (Control-FREEC v3.4)) on paired (tumor-control) exome data from about 25 patients.
    I have several issues that I would like to share here in the hope of getting some help.

    First, for CNV of the control sample, I got a line at 2 for all my 25 patients (attached file "CNV.mpileup_normal_ratio.txt.png"). I wonder why there is no noise around 2 and why I cannot get points as I do for the tumor sample (attached file "CNV.mpileup_ratio.txt.png"). I changed the option noisyData but it doesn't seem to be the explanation.

    My second problem is for the BAF calculations. As you can see in the control ("baf.mpileup_normal_BAF.txt.png") sample (it's the same for tumor), there is something wrong on chromosome 7, for all patients. There is one point after the end of chromosome. I don't understand why. I can see indeed this point in the BAF.txt file :

    7 94855040 0.6 0.5 0.5 0.5 0.5 19.905
    7 94937446 0.47619 0.5 0.5 0.5 0.5 19.905
    7 94941038 0.5 -1 2 -1 2 -1
    7 892613944 1.07685e-08 0.5 0.5 0.324618 0.675382 0.5
    8 182807 0.333333 -1 2 -1 2 -1
    Finally, for some patients, I have anarchical results (see example2). All my patients have the same pathology (a leukemia), so I doubt that such karyotypes are true... I am wondering whether this kind of result is the consequence of altered DNA...

    Could you please tell me whether you have faced any of these problems and how you solved them?
    Thank you in advance,
    Jane
    Last edited by Jane M; 06-03-2013, 12:24 AM.

  • #2
    No Control-FREEC users among the members?



    • #3
      Hi Jane,

      First, for CNV of the control sample, I got a line at 2 for all my 25 patients (attached file "CNV.mpileup_normal_ratio.txt.png"). I wonder why there is no noise around 2 and why I cannot get points as I do for the tumor sample (attached file "CNV.mpileup_ratio.txt.png"). I changed the option noisyData but it doesn't seem to be the explanation.

      In exome data (unlike whole genome sequencing data) FREEC cannot call CNVs in the control sample. This is because capture bias is so strong that normalizing only with GC-content and mappability does not help. Thus, FREEC assumes that the whole control sample is present in two copies.
      To understand why ratios are not plotted, please check the format of the _ratio.txt of the control and tumor samples.

      My second problem is for the BAF calculations. As you can see in the control ("baf.mpileup_normal_BAF.txt.png") sample (it's the same for tumor), there is something wrong on chromosome 7, for all patients. There is one point after the end of chromosome. I don't understand why. I can see indeed this point in the BAF.txt file :

      Quote:
      7 94855040 0.6 0.5 0.5 0.5 0.5 19.905
      7 94937446 0.47619 0.5 0.5 0.5 0.5 19.905
      7 94941038 0.5 -1 2 -1 2 -1
      7 892613944 1.07685e-08 0.5 0.5 0.324618 0.675382 0.5
      8 182807 0.333333 -1 2 -1 2 -1


      Check the file with chromosome lengths you use to run FREEC.

      Finally, for some patients, I have anarchical results (see example2). All my patients have the same pathology (a leukemia), so I doubt that such karyotypes are true... I am wondering whether this kind of result is the consequence of altered DNA...


      Sometimes, exome profiles look very noisy. Try changing the "degree" of the polynomial (e.g., try "1" instead of "3") and use the noisyData option.
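To compare polynomial degrees side by side, one option is to generate one config per degree from a shared base. The following is a minimal sketch, not an official workflow: only the option names `degree` and `noisyData` come from this thread, while the `[general]` section name, the `TRUE` spelling, and the helper `make_variant` are assumptions made for illustration.

```python
import configparser
import io

# Hypothetical base config: only "degree" and "noisyData" are named in the
# thread; the [general] section and the value spelling are assumptions.
BASE_CONFIG = """[general]
degree = 3
noisyData = TRUE
"""


def make_variant(base_text, degree):
    """Return the config text with the polynomial degree replaced."""
    parser = configparser.ConfigParser()
    # Keep option names case-sensitive so "noisyData" is not lowercased.
    parser.optionxform = str
    parser.read_string(base_text)
    parser["general"]["degree"] = str(degree)
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# One config text per degree to try; each could be written to its own file.
variants = {degree: make_variant(BASE_CONFIG, degree) for degree in (1, 2, 3)}
for degree, text in variants.items():
    print(f"--- degree {degree} ---")
    print(text)
```

Running each variant into a separate output directory makes it easier to see which profiles change with the degree and which stay stable.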



      • #4
        Thank you a lot for your reply!

        Originally posted by valeu

        Quote:
        7 94855040 0.6 0.5 0.5 0.5 0.5 19.905
        7 94937446 0.47619 0.5 0.5 0.5 0.5 19.905
        7 94941038 0.5 -1 2 -1 2 -1
        7 892613944 1.07685e-08 0.5 0.5 0.324618 0.675382 0.5
        8 182807 0.333333 -1 2 -1 2 -1

        Check the file with chromosome lengths you use to run FREEC.



        My chromosome length file should be fine:
        1 chr1 249250621
        2 chr2 243199373
        3 chr3 198022430
        4 chr4 191154276
        5 chr5 180915260
        6 chr6 171115067
        7 chr7 159138663
        8 chr8 146364022
        9 chr9 141213431
        10 chr10 135534747
        11 chr11 135006516
        12 chr12 133851895
        13 chr13 115169878
        14 chr14 107349540
        15 chr15 102531392
        16 chr16 90354753
        17 chr17 81195210
        18 chr18 78077248
        19 chr19 59128983
        20 chr20 63025520
        21 chr21 48129895
        22 chr22 51304566
        23 chrX 155270560
        24 chrY 59373566
        I think that an outlier is generated during the computation, as shown in the output file: in hg19.len, I provided a length of 159138663 for chr7, but in the output there is a point at 892613944...
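A quick way to locate such points in every patient's output is to compare each position against the chromosome length file. This is a minimal sketch assuming the two layouts shown in this thread: a three-column length file (index, name, length) and a BAF file whose first two columns are chromosome and position. The function names are hypothetical, and the check only finds out-of-range rows; it does not explain where they come from.

```python
import io


def strip_chr(name):
    """Normalize 'chr7' and '7' to the same key."""
    return name[3:] if name.startswith("chr") else name


def load_chrom_lengths(handle):
    """Parse lines like '7 chr7 159138663' into {chromosome: length}."""
    lengths = {}
    for line in handle:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        lengths[strip_chr(fields[1])] = int(fields[2])
    return lengths


def find_out_of_range(baf_handle, lengths):
    """Return (chromosome, position) pairs lying past the chromosome end."""
    outliers = []
    for line in baf_handle:
        fields = line.split()
        if len(fields) < 2 or not fields[1].isdigit():
            continue  # skip a header line or malformed rows
        chrom = strip_chr(fields[0])
        position = int(fields[1])
        if chrom in lengths and position > lengths[chrom]:
            outliers.append((chrom, position))
    return outliers


# Sample rows copied from this thread; real use would open the files instead.
LEN_TEXT = """7 chr7 159138663
8 chr8 146364022
"""
BAF_TEXT = """7 94941038 0.5 -1 2 -1 2 -1
7 892613944 1.07685e-08 0.5 0.5 0.324618 0.675382 0.5
8 182807 0.333333 -1 2 -1 2 -1
"""

lengths = load_chrom_lengths(io.StringIO(LEN_TEXT))
bad = find_out_of_range(io.StringIO(BAF_TEXT), lengths)
print(bad)  # [('7', 892613944)]
```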


        Finally, for some patients, I have anarchical results (see example2). All my patients have the same pathology (a leukemia), so I doubt that such karyotypes are true... I am wondering whether this kind of result is the consequence of altered DNA...

        Sometimes, exome profiles look very noisy. Try changing the "degree" of the polynomial (e.g., try "1" instead of "3") and use the noisyData option.
        I tried degrees of 1 and 2 (and I always use the noisyData option). The results are very similar for these 2 values but very different from the ones with degree=3. It's weird to get such different results. I attached the new result. In addition, I will try increasing the coverage (to more than 10).



        • #5
          If I have a large number of control (i.e., germline) exome samples, can I pool them so that I can call CNVs in the individual germline exomes? I heard that ExomeCNV can do that; I wonder if it is also doable in Control-FREEC.



          • #6
            Originally posted by ymc
            If I have a large number of control (i.e., germline) exome samples, can I pool them so that I can call CNVs in the individual germline exomes? I heard that ExomeCNV can do that; I wonder if it is also doable in Control-FREEC.
            No, so far there is no such option, although it could greatly improve the results. I think CONTRA does it the right way. You can check. Otherwise, you can simply merge the BAM files and use FREEC.



            • #7
              Originally posted by valeu
              No, so far there is no such option, although it could greatly improve the results. I think CONTRA does it the right way. You can check. Otherwise, you can simply merge the BAM files and use FREEC.
              Merge them and treat the result as the normal, and then treat the sample we are interested in as the tumor, correct?



              • #8
                Originally posted by ymc
                Merge them and treat the result as the normal, and then treat the sample we are interested in as the tumor, correct?
                Yes, you are right.
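The merge step can be scripted. The following is a minimal sketch that assumes `samtools` is installed and that the BAM files share the same reference; the file names and both helper functions are hypothetical. It builds the `samtools merge` command and only runs it when `merge_controls` is called.

```python
import shutil
import subprocess


def build_merge_command(output_bam, input_bams):
    """Build a 'samtools merge' command pooling several control BAM files."""
    if len(input_bams) < 2:
        raise ValueError("need at least two BAM files to pool")
    return ["samtools", "merge", output_bam, *input_bams]


def merge_controls(output_bam, input_bams):
    """Run the merge; the pooled BAM is then given to FREEC as the control."""
    command = build_merge_command(output_bam, input_bams)
    if shutil.which("samtools") is None:
        raise RuntimeError("samtools not found on PATH")
    subprocess.run(command, check=True)


# Hypothetical file names, for illustration only.
controls = ["control_01.bam", "control_02.bam", "control_03.bam"]
command = build_merge_command("pooled_control.bam", controls)
print(" ".join(command))
```

The pooled file would then take the place of the control sample in the config, with the germline exome of interest entered as the tumor sample, as confirmed above.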
