  • CNV + LOH detection on NGS data with Control FREEC

    Hello everybody !

    I am using Control-FREEC (FREEC v6.4 (Control-FREEC v3.4)) on paired (tumor-control) exome data from about 25 patients.
    I have several issues that I would like to share here in the hope of getting some help.

    First, for CNV of the control sample, I got a line at 2 for all my 25 patients (attached file "CNV.mpileup_normal_ratio.txt.png"). I wonder why there is no noise around 2 and why I cannot get points as I do for the tumor sample (attached file "CNV.mpileup_ratio.txt.png"). I changed the option noisyData but it doesn't seem to be the explanation.

    My second problem is for the BAF calculations. As you can see in the control ("baf.mpileup_normal_BAF.txt.png") sample (it's the same for tumor), there is something wrong on chromosome 7, for all patients. There is one point after the end of chromosome. I don't understand why. I can see indeed this point in the BAF.txt file :

    7 94855040 0.6 0.5 0.5 0.5 0.5 19.905
    7 94937446 0.47619 0.5 0.5 0.5 0.5 19.905
    7 94941038 0.5 -1 2 -1 2 -1
    7 892613944 1.07685e-08 0.5 0.5 0.324618 0.675382 0.5
    8 182807 0.333333 -1 2 -1 2 -1

    Finally, for some patients, I have anarchical results (see example2). All my patients have the same pathology (a leukemia), so I doubt that such karyotypes are true... I am wondering whether these kinds of results might be a consequence of altered DNA...

    Could you please tell me whether you have faced any of these problems and how you solved them?
    Thank you in advance,
    Jane
    Last edited by Jane M; 06-03-2013, 12:24 AM.

  • #2
    Are there no Control-FREEC users among the members?



    • #3
      Hi Jane,

      First, for CNV of the control sample, I got a line at 2 for all my 25 patients (attached file "CNV.mpileup_normal_ratio.txt.png"). I wonder why there is no noise around 2 and why I cannot get points as I do for the tumor sample (attached file "CNV.mpileup_ratio.txt.png"). I changed the option noisyData but it doesn't seem to be the explanation.

      In exome data (unlike whole-genome sequencing data), FREEC cannot call CNVs in the control sample. The capture bias is so strong that normalizing with GC-content and mappability alone does not help. FREEC therefore assumes that the whole control sample is present in two copies, which is why the control profile is a flat line at 2.
      To understand why the ratios are not plotted, please check the format of the _ratio.txt files of the control and tumor samples.

      My second problem is for the BAF calculations. As you can see in the control ("baf.mpileup_normal_BAF.txt.png") sample (it's the same for tumor), there is something wrong on chromosome 7, for all patients. There is one point after the end of chromosome. I don't understand why. I can see indeed this point in the BAF.txt file :

      Quote:
      7 94855040 0.6 0.5 0.5 0.5 0.5 19.905
      7 94937446 0.47619 0.5 0.5 0.5 0.5 19.905
      7 94941038 0.5 -1 2 -1 2 -1
      7 892613944 1.07685e-08 0.5 0.5 0.324618 0.675382 0.5
      8 182807 0.333333 -1 2 -1 2 -1


      Check the file with chromosome lengths you use to run FREEC.

      Finally, for some patients, I have anarchical results (see example2). All my patients have the same pathology (a leukemia), so I doubt that such karyotypes are true... I am wondering if these kind of results are not the consequence of altered DNA...


      Sometimes exome profiles look very noisy. Try playing with the "degree" of the polynomial (e.g., try "1" instead of "3") and use the noisyData option.
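
A minimal sketch of how the suggested comparison could be set up: it writes one configuration variant per polynomial degree so the runs can be compared side by side. It assumes that `degree` and `noisyData` sit in a `[general]` section of an INI-style config file; the base config shown is a hypothetical two-line stub, not a complete Control-FREEC configuration, and `with_degree` is a helper name invented for this example.

```python
import configparser
import io


def with_degree(config_text, degree):
    """Return a copy of an INI-style config with [general] degree replaced."""
    parser = configparser.ConfigParser()
    # configparser lowercases option names by default; keep e.g. noisyData intact.
    parser.optionxform = str
    parser.read_string(config_text)
    parser["general"]["degree"] = str(degree)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Hypothetical stub; a real config has more sections and options.
base = """[general]
degree = 3
noisyData = TRUE
"""

# One config text per degree to try (1, 2 and the original 3).
variants = {d: with_degree(base, d) for d in (1, 2, 3)}
print(variants[1])
```

Note that `configparser` drops comment lines when it writes the file back, so for a heavily commented config it may be simpler to edit the `degree` line by hand.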



      • #4
        Thank you a lot for your reply!

        Originally posted by valeu View Post

        Quote:
        7 94855040 0.6 0.5 0.5 0.5 0.5 19.905
        7 94937446 0.47619 0.5 0.5 0.5 0.5 19.905
        7 94941038 0.5 -1 2 -1 2 -1
        7 892613944 1.07685e-08 0.5 0.5 0.324618 0.675382 0.5
        8 182807 0.333333 -1 2 -1 2 -1

        Check the file with chromosome lengths you use to run FREEC.



        My chromosome length file should be fine:
        1 chr1 249250621
        2 chr2 243199373
        3 chr3 198022430
        4 chr4 191154276
        5 chr5 180915260
        6 chr6 171115067
        7 chr7 159138663
        8 chr8 146364022
        9 chr9 141213431
        10 chr10 135534747
        11 chr11 135006516
        12 chr12 133851895
        13 chr13 115169878
        14 chr14 107349540
        15 chr15 102531392
        16 chr16 90354753
        17 chr17 81195210
        18 chr18 78077248
        19 chr19 59128983
        20 chr20 63025520
        21 chr21 48129895
        22 chr22 51304566
        23 chrX 155270560
        24 chrY 59373566

        I think that an outlier is generated during the computation, as the output file shows: in hg19.len I provided a length of 159138663 for chr7, but the output contains a point at 892613944...
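
The check described here can be automated. The sketch below flags every position in a BAF file that lies beyond the end of its chromosome according to the length file. It assumes only what is visible in this thread: the length file has the chromosome name in its second column and the length in its third, and the BAF file has the chromosome in its first column and the position in its second. The function names are made up for this example, and the "chr" prefix is stripped so that "chr7" and "7" match.

```python
def strip_chr(name):
    """Normalize 'chr7' and '7' to the same key."""
    return name[3:] if name.startswith("chr") else name


def load_chrom_lengths(len_lines):
    """Parse lines like '7 chr7 159138663' into {'7': 159138663}."""
    lengths = {}
    for line in len_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        lengths[strip_chr(fields[1])] = int(fields[2])
    return lengths


def find_out_of_range(baf_lines, lengths):
    """Return (chromosome, position) pairs lying past the chromosome end."""
    flagged = []
    for line in baf_lines:
        fields = line.split()
        if len(fields) < 2 or not fields[1].isdigit():
            continue  # skip a header line or malformed rows
        chrom, pos = strip_chr(fields[0]), int(fields[1])
        if chrom in lengths and pos > lengths[chrom]:
            flagged.append((chrom, pos))
    return flagged


# Lines copied from the posts above.
len_lines = ["7 chr7 159138663", "8 chr8 146364022"]
baf_lines = [
    "7 94941038 0.5 -1 2 -1 2 -1",
    "7 892613944 1.07685e-08 0.5 0.5 0.324618 0.675382 0.5",
    "8 182807 0.333333 -1 2 -1 2 -1",
]
flagged = find_out_of_range(baf_lines, load_chrom_lengths(len_lines))
print(flagged)  # [('7', 892613944)]
```

This only locates such points; it does not explain where they come from.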


        Finally, for some patients, I have anarchical results (see example2). All my patients have the same pathology (a leukemia), so I doubt that such karyotypes are true... I am wondering if these kind of results are not the consequence of altered DNA...

        Sometimes exome profiles look very noisy. Try playing with the "degree" of the polynomial (e.g., try "1" instead of "3") and use the noisyData option.

        I tried degrees of 1 and 2 (and I always use the noisyData option). The results are very similar for these two values but very different from those obtained with degree=3. It's weird to have such different results. I have attached the new result. In addition, I will try to increase the coverage (to more than 10).


        • #5
          If I have a large number of control (i.e. germline) exome datasets, can I pool them so that I can call CNVs for the individual germline exomes? I heard that ExomeCNV can do that; I wonder whether it is also doable in Control-FREEC.



          • #6
            Originally posted by ymc View Post
            If I have a large number of control (i.e. germline) exome datasets, can I pool them so that I can call CNVs for the individual germline exomes? I heard that ExomeCNV can do that; I wonder whether it is also doable in Control-FREEC.

            No, so far there is no such option, although it could greatly improve the results. I think CONTRA does it the right way; you can check. Otherwise, you can simply merge the BAM files and use FREEC.



            • #7
              Originally posted by valeu View Post
              No, so far there is no such option, although it could greatly improve the results. I think CONTRA does it the right way; you can check. Otherwise, you can simply merge the BAM files and use FREEC.

              So I merge them and treat the merged file as the normal, and then treat the sample we are interested in as the tumor, correct?



              • #8
                Originally posted by ymc View Post
                So I merge them and treat the merged file as the normal, and then treat the sample we are interested in as the tumor, correct?
                Yes, you are right.
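
A minimal sketch of the pooling step agreed on above, assuming `samtools` is installed and on the PATH. It only builds and prints the `samtools merge` command; all file names are hypothetical placeholders, and `build_merge_command` is a helper invented for this example. The merged BAM would then be given to FREEC as the control, with the sample of interest as the tumor.

```python
import shlex


def build_merge_command(pooled_bam, control_bams):
    """Build a 'samtools merge <out.bam> <in1.bam> <in2.bam> ...' command."""
    if len(control_bams) < 2:
        raise ValueError("need at least two BAM files to pool")
    return ["samtools", "merge", pooled_bam, *control_bams]


# Hypothetical file names for the germline exomes to pool.
cmd = build_merge_command(
    "pooled_control.bam",
    ["germline_A.bam", "germline_B.bam", "germline_C.bam"],
)
print(shlex.join(cmd))

# To actually run the merge (not executed here):
# import subprocess
# subprocess.run(cmd, check=True)
```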

