
  • CNV + LOH detection on NGS data with Control FREEC

    Hello everybody !

    I am using Control-FREEC (FREEC v6.4 (Control-FREEC v3.4)) on paired (tumor-control) exome data from about 25 patients.
    I have several issues I would like to share here, hopefully to get some help.

    First, for CNV of the control sample, I got a line at 2 for all my 25 patients (attached file "CNV.mpileup_normal_ratio.txt.png"). I wonder why there is no noise around 2 and why I cannot get points as I do for the tumor sample (attached file "CNV.mpileup_ratio.txt.png"). I changed the option noisyData but it doesn't seem to be the explanation.

    My second problem is for the BAF calculations. As you can see in the control ("baf.mpileup_normal_BAF.txt.png") sample (it's the same for tumor), there is something wrong on chromosome 7, for all patients. There is one point after the end of chromosome. I don't understand why. I can see indeed this point in the BAF.txt file :

    7 94855040 0.6 0.5 0.5 0.5 0.5 19.905
    7 94937446 0.47619 0.5 0.5 0.5 0.5 19.905
    7 94941038 0.5 -1 2 -1 2 -1
    7 892613944 1.07685e-08 0.5 0.5 0.324618 0.675382 0.5
    8 182807 0.333333 -1 2 -1 2 -1

    Finally, for some patients, I have anarchical results (see example2). All my patients have the same pathology (a leukemia), so I doubt that such karyotypes are true... I am wondering whether this kind of result could be the consequence of altered DNA...

    Could you please tell me if you faced one of these problems and how did you solve it?
    Thank you in advance,
    Jane
    Last edited by Jane M; 06-03-2013, 12:24 AM.

  • #2
    Are there no Control-FREEC users among the members?



    • #3
      Hi Jane,

      Quote:
      First, for CNV of the control sample, I got a line at 2 for all my 25 patients (attached file "CNV.mpileup_normal_ratio.txt.png"). I wonder why there is no noise around 2 and why I cannot get points as I do for the tumor sample (attached file "CNV.mpileup_ratio.txt.png"). I changed the option noisyData but it doesn't seem to be the explanation.

      In exome data (unlike whole genome sequencing data) FREEC cannot call CNVs in the control sample. This is because capture bias is so strong that normalizing only with GC-content and mappability does not help. Thus, FREEC assumes that the whole control sample is present in two copies.
      To understand why ratios are not plotted, please check the format of the _ratio.txt of the control and tumor samples.
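      For anyone following this suggestion, here is a minimal sketch of one way to compare the two _ratio.txt files. It assumes the files are tab-delimited with a header row containing a column named "Ratio"; that layout is an assumption, not something stated in this thread, so check the header of your own files first. The function name and the toy rows are made up for illustration.

```python
import csv


def summarize_column(lines, column="Ratio"):
    """Summarize one numeric column of a tab-delimited file with a header row.

    `lines` can be an open file handle or any iterable of text lines.
    The column name "Ratio" is an assumption about the file layout.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    columns = reader.fieldnames  # accessing this reads the header row
    if not columns or column not in columns:
        raise ValueError(f"column {column!r} not found in header: {columns}")
    values = []
    for row in reader:
        cell = row.get(column)
        if cell:  # skip short or empty rows
            values.append(float(cell))
    return {
        "columns": list(columns),
        "rows": len(values),
        "distinct": len(set(values)),
        "min": min(values) if values else None,
        "max": max(values) if values else None,
    }


if __name__ == "__main__":
    # Toy rows, invented for illustration only (not real FREEC output).
    toy = [
        "Chromosome\tStart\tRatio\tMedianRatio\tCopyNumber",
        "1\t1000\t1.0\t1.0\t2",
        "1\t2000\t1.5\t1.0\t2",
    ]
    print(summarize_column(toy))
```

      Running it on the control file and on the tumor file (by passing an open file handle instead of the toy list) shows whether the two files have the same columns and whether the control values vary at all.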

      Quote:
      My second problem is for the BAF calculations. As you can see in the control ("baf.mpileup_normal_BAF.txt.png") sample (it's the same for tumor), there is something wrong on chromosome 7, for all patients. There is one point after the end of chromosome. I don't understand why. I can see indeed this point in the BAF.txt file:

      7 94855040 0.6 0.5 0.5 0.5 0.5 19.905
      7 94937446 0.47619 0.5 0.5 0.5 0.5 19.905
      7 94941038 0.5 -1 2 -1 2 -1
      7 892613944 1.07685e-08 0.5 0.5 0.324618 0.675382 0.5
      8 182807 0.333333 -1 2 -1 2 -1


      Check the file with chromosome lengths you use to run FREEC.

      Quote:
      Finally, for some patients, I have anarchical results (see example2). All my patients have the same pathology (a leukemia), so I doubt that such karyotypes are true... I am wondering whether this kind of result could be the consequence of altered DNA...


      Sometimes, exome profiles look very noisy. Try to play with the "degree" of the polynomial (e.g., try "1" instead of "3") and use the noisyData option.



      • #4
        Thank you a lot for your reply!

        Originally posted by valeu

        Quote:
        7 94855040 0.6 0.5 0.5 0.5 0.5 19.905
        7 94937446 0.47619 0.5 0.5 0.5 0.5 19.905
        7 94941038 0.5 -1 2 -1 2 -1
        7 892613944 1.07685e-08 0.5 0.5 0.324618 0.675382 0.5
        8 182807 0.333333 -1 2 -1 2 -1

        Check the file with chromosome lengths you use to run FREEC.



        My chromosome length file should be fine:
        1 chr1 249250621
        2 chr2 243199373
        3 chr3 198022430
        4 chr4 191154276
        5 chr5 180915260
        6 chr6 171115067
        7 chr7 159138663
        8 chr8 146364022
        9 chr9 141213431
        10 chr10 135534747
        11 chr11 135006516
        12 chr12 133851895
        13 chr13 115169878
        14 chr14 107349540
        15 chr15 102531392
        16 chr16 90354753
        17 chr17 81195210
        18 chr18 78077248
        19 chr19 59128983
        20 chr20 63025520
        21 chr21 48129895
        22 chr22 51304566
        23 chrX 155270560
        24 chrY 59373566

        I think an outlier is generated during the computation, as the output file shows: in hg19.len, I provided a length of 159138663 for chr7, but the output contains a point at 892613944...
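        A check like the one described above can be automated. The following is a minimal sketch, assuming the length file uses the three-column layout listed above (index, name, length) and that the BAF file is whitespace-separated with the chromosome in the first column and the position in the second, as in the lines quoted in this thread. The function names are made up for illustration.

```python
def strip_chr(name):
    """Normalize 'chr7' and '7' to the same key ('7')."""
    return name[3:] if name.lower().startswith("chr") else name


def load_chrom_lengths(lines):
    """Parse lines like '7 chr7 159138663' into {'7': 159138663}."""
    lengths = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        lengths[strip_chr(fields[1])] = int(fields[2])
    return lengths


def find_out_of_range(baf_lines, lengths):
    """Return (chromosome, position, length) for positions past the chromosome end."""
    outliers = []
    for line in baf_lines:
        fields = line.split()
        if len(fields) < 2 or not fields[1].isdigit():
            continue  # header line or malformed row
        chrom, pos = strip_chr(fields[0]), int(fields[1])
        limit = lengths.get(chrom)
        if limit is not None and pos > limit:
            outliers.append((chrom, pos, limit))
    return outliers


if __name__ == "__main__":
    # Values copied from the posts in this thread.
    len_lines = ["7 chr7 159138663", "8 chr8 146364022"]
    baf_lines = [
        "7 94941038 0.5 -1 2 -1 2 -1",
        "7 892613944 1.07685e-08 0.5 0.5 0.324618 0.675382 0.5",
        "8 182807 0.333333 -1 2 -1 2 -1",
    ]
    for chrom, pos, limit in find_out_of_range(baf_lines, load_chrom_lengths(len_lines)):
        print(f"chr{chrom}: position {pos} exceeds length {limit}")
```

        This only locates such points; it does not explain where the out-of-range position comes from.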


        Originally posted by valeu
        Quote:
        Finally, for some patients, I have anarchical results (see example2). All my patients have the same pathology (a leukemia), so I doubt that such karyotypes are true... I am wondering whether this kind of result could be the consequence of altered DNA...

        Sometimes, exome profiles look very noisy. Try to play with the "degree" of the polynomial (e.g., try "1" instead of "3") and use the noisyData option.

        I tried degrees of 1 and 2 (and I always use the noisyData option). The results are very similar for these two values but very different from those with degree=3. It's odd to get such different results. I attached the new result. I will also try to increase the coverage (to more than 10).


        • #5
          If I have a large number of control (i.e., germline) exome samples, can I pool them so that I can call CNVs in the individual germline exomes? I heard that ExomeCNV can do that; I wonder if it is also doable in Control-FREEC.



          • #6
            Originally posted by ymc
            If I have a large number of control (i.e., germline) exome samples, can I pool them so that I can call CNVs in the individual germline exomes? I heard that ExomeCNV can do that; I wonder if it is also doable in Control-FREEC.

            No, so far there is no such option, although it could greatly improve the results. I think CONTRA does it the right way. You can check. Otherwise, you can simply merge the BAM files and use FREEC.



            • #7
              Originally posted by valeu
              No, so far there is no such option, although it could greatly improve the results. I think CONTRA does it the right way. You can check. Otherwise, you can simply merge the BAM files and use FREEC.

              So I merge them and treat the merged file as the normal, and then treat the sample we are interested in as the tumor, correct?



              • #8
                Originally posted by ymc
                merge them and treat it as normal and then the one we are interested in as tumor, correct?

                Yes, you are right.
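
                The merge step discussed above could be scripted roughly as follows. This is a sketch under assumptions: it relies on the `samtools merge` command being installed and on PATH, on the input BAM files being sorted the same way, and all file names are hypothetical. The merged file would then be given to FREEC as the control and the germline sample of interest as the "tumor".

```python
import subprocess


def build_merge_command(output_bam, input_bams):
    """Build the argument list for `samtools merge <out.bam> <in1.bam> <in2.bam> ...`."""
    if len(input_bams) < 2:
        raise ValueError("need at least two BAM files to merge")
    return ["samtools", "merge", output_bam, *input_bams]


def merge_bams(output_bam, input_bams):
    """Run the merge; raises CalledProcessError if samtools exits with an error."""
    subprocess.run(build_merge_command(output_bam, input_bams), check=True)


if __name__ == "__main__":
    # Hypothetical file names; print the command instead of running it.
    controls = ["germline_01.bam", "germline_02.bam", "germline_03.bam"]
    print(" ".join(build_merge_command("pooled_normal.bam", controls)))
```

                If the sample being tested is itself one of the germline controls, it may be worth leaving it out of the pool so that it is not compared against itself.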
