ONCOCNV: a method to extract CNAs from amplicon (or targeted) sequencing data

  • ONCOCNV: a method to extract CNAs from amplicon (or targeted) sequencing data

    We are happy to present ONCOCNV, a method to detect copy number alterations in amplicon or targeted sequencing data. The method can be applied to exome-seq data as well, but it will not adjust the profiles for contamination by normal cells or evaluate genotypes (LOH).

    ONCOCNV was developed by OncoDNA in collaboration with the Bioinformatics Laboratory of Institut Curie (Paris). It automatically computes, normalizes, and segments copy number profiles, then calls copy number alterations. The user can provide any number of control samples to construct the baseline. However, we recommend using at least three control samples. The more, the better.

    Webpage: http://oncocnv.curie.fr/
    Publication: Boeva, V. et al. (2014) Multi-factor data normalization enables the detection of copy number aberrations in amplicon sequencing data. Bioinformatics, 30(24):3443-3450.

    Input for CNA detection: aligned single-end or paired-end data in the BAM format.
    Output: Annotation of genes with copy number changes + visualization of the profile (.png).

    Paper abstract:
    MOTIVATION:
    Because of its low cost, amplicon sequencing, also known as ultra-deep targeted sequencing, is now becoming widely used in oncology for detection of actionable mutations, i.e. mutations influencing cell sensitivity to targeted therapies. Amplicon sequencing is based on the polymerase chain reaction amplification of the regions of interest, a process that considerably distorts the information on copy numbers initially present in the tumor DNA. Therefore, additional experiments such as single nucleotide polymorphism (SNP) or comparative genomic hybridization (CGH) arrays often complement amplicon sequencing in clinics to identify copy number status of genes whose amplification or deletion has direct consequences on the efficacy of a particular cancer treatment. So far, there has been no proven method to extract the information on gene copy number aberrations based solely on amplicon sequencing.
    RESULTS:
    Here we present ONCOCNV, a method that includes a multifactor normalization and annotation technique enabling the detection of large copy number changes from amplicon sequencing data. We validated our approach on high and low amplicon density datasets and demonstrated that ONCOCNV can achieve a precision comparable with that of array CGH techniques in detecting copy number aberrations. Thus, ONCOCNV applied on amplicon sequencing data would make the use of additional array CGH or SNP array experiments unnecessary.
    Last edited by valeu; 02-10-2015, 09:34 AM.

  • #2
    Hi Valeu,
    I wonder if you can help me with this. I tried to run ONCOCNV v6.1 on the test data and got the following error, after which the program quits: Error in file(file, "rt") : cannot open the connection
    More details are shown below.
    Thanks!
    -Tony

    =====================================================
    $ ./RUNME.sh
    Package 'mclust' version 5.0.2
    Type 'citation("mclust")' for citing this R package in publications.
    Warning: you have both male and female samples in the control. We will try to assign sex using read coverage on chrX
    0.5 0.5 0.5 1 1 0.5 0.5 1 1 0.5 1 1 0.5 1 1
    Centering
    Whitening
    Symmetric FastICA using logcosh approx. to neg-entropy function
    Iteration 1 tol=0.354678
    Iteration 2 tol=0.401774
    Iteration 3 tol=0.485327
    Iteration 4 tol=0.644948
    Iteration 5 tol=0.960465
    Iteration 6 tol=0.518261
    Iteration 7 tol=0.071013
    Iteration 8 tol=0.006314
    Iteration 9 tol=0.004754
    Iteration 10 tol=0.004012
    Iteration 11 tol=0.003472
    Iteration 12 tol=0.004472
    Iteration 13 tol=0.005501
    Iteration 14 tol=0.005915
    Iteration 15 tol=0.005593
    Iteration 16 tol=0.004960
    Iteration 17 tol=0.004012
    Iteration 18 tol=0.002834
    Iteration 19 tol=0.001732
    Iteration 20 tol=0.001043
    Iteration 21 tol=0.000649
    Iteration 22 tol=0.000395
    Iteration 23 tol=0.000245
    Iteration 24 tol=0.000161
    Iteration 25 tol=0.000133
    Iteration 26 tol=0.000131
    Iteration 27 tol=0.000133
    Iteration 28 tol=0.000138
    Iteration 29 tol=0.000146
    Iteration 30 tol=0.000157
    Iteration 31 tol=0.000171
    Iteration 32 tol=0.000186
    Iteration 33 tol=0.000203
    Iteration 34 tol=0.000221
    Iteration 35 tol=0.000239
    Iteration 36 tol=0.000256
    Iteration 37 tol=0.000272
    Iteration 38 tol=0.000285
    Iteration 39 tol=0.000294
    Iteration 40 tol=0.000299
    Iteration 41 tol=0.000298
    Iteration 42 tol=0.000290
    Iteration 43 tol=0.000276
    Iteration 44 tol=0.000257
    Iteration 45 tol=0.000233
    Iteration 46 tol=0.000206
    Iteration 47 tol=0.000179
    Iteration 48 tol=0.000151
    Iteration 49 tol=0.000128
    Iteration 50 tol=0.000107
    Iteration 51 tol=0.000088
    Iteration 52 tol=0.000072
    Iteration 53 tol=0.000058
    Iteration 54 tol=0.000047
    Iteration 55 tol=0.000037
    Iteration 56 tol=0.000030
    Iteration 57 tol=0.000024
    Iteration 58 tol=0.000019
    Iteration 59 tol=0.000015
    Iteration 60 tol=0.000012
    Iteration 61 tol=0.000010
    Iteration 62 tol=0.000008
    Iteration 63 tol=0.000006
    Iteration 64 tol=0.000005
    Iteration 65 tol=0.000004
    Iteration 66 tol=0.000003
    Iteration 67 tol=0.000003
    Iteration 68 tol=0.000002
    Iteration 69 tol=0.000002
    Iteration 70 tol=0.000001
    Iteration 71 tol=0.000001
    Iteration 72 tol=0.000001
    Explained variance by the first pronicpal components of PCA:
    Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9 Comp.10 Comp.11 Comp.12 Comp.13 Comp.14 Comp.15
    0.7874551 0.8498553 0.9053619 0.9284231 0.9443718 0.9596778 0.9682324 0.9741502 0.9794208 0.983909 0.9881928 0.991948 0.9950968 0.9978144 1
    null device
    1
    Package 'mclust' version 5.0.2
    Type 'citation("mclust")' for citing this R package in publications.
    PSCBS v0.44.0 (2015-02-22) successfully loaded. See ?PSCBS for help.

    Attaching package: ‘PSCBS’

    The following objects are masked from ‘package:base’:

    append, load

    R.cache v0.10.0 (2014-06-10) successfully loaded. See ?R.cache for help.
    Loading required package: lattice
    Loading required package: grid
    Loading required package: parallel
    Error in file(file, "rt") : cannot open the connection
    Calls: read.table -> file
    In addition: Warning message:
    In file(file, "rt") :
    cannot open file './Test.stats.txt': No such file or directory
    Execution halted
    Last edited by xxqtony; 08-24-2015, 09:20 AM.

    • #3
      I also tried with my own data and finally managed to get the program to run, but the results are not what I expected. I have regions/amplicons with a known copy number of 4, but they are predicted as copy number 2. I wonder if there is anything I missed.
      Thanks.

      • #4
        For the test dataset, do you see that './Test.stats.txt' has been created?
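A quick way to answer that question is to list the intermediate files and their sizes. The sketch below is only an illustration: the function name is hypothetical, and the default file names are the ones that appear in the logs posted in this thread, so which of them a given ONCOCNV version actually writes is an assumption.

```python
from pathlib import Path


def check_outputs(out_dir, names=("Test.stats.txt",
                                  "Control.stats.Processed.txt",
                                  "target.bed")):
    """Report, for each expected file, whether it is missing, empty, or its size.

    The default names are taken from the logs in this thread; adjust them
    to whatever your own run is supposed to produce.
    """
    status = {}
    for name in names:
        path = Path(out_dir) / name
        if not path.is_file():
            status[name] = "missing"
        elif path.stat().st_size == 0:
            status[name] = "empty"
        else:
            status[name] = f"{path.stat().st_size} bytes"
    return status


if __name__ == "__main__":
    # Point this at the output directory configured for the run.
    for name, state in check_outputs(".").items():
        print(f"{name}: {state}")
```

A file reported as "missing" or "empty" usually means that an earlier step failed, so the first error in the log is the one to look at rather than the last.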

        For the second dataset, I don't understand what is wrong.

        Please, contact me by email.

        • #5
          Hi Valeu,

          I have 10 samples (8 test and 2 control) on which I am trying to run ONCOCNV v6.4. I have configured the ONCOCNV.sh file as per the instructions given, but it throws the following error.

          Detected 2 control sample(s)
          reading 11.bam
          sample name: 11
          read 100000 reads
          read 200000 reads
          read 300000 reads
          read 400000 reads
          read 500000 reads
          reading 12.bam
          sample name: 12
          read 100000 reads
          read 200000 reads
          read 300000 reads
          read 400000 reads
          Total target length: 272944
          processed 2 controls, 11 12
          Illegal division by zero at
          /san2/mallya/exome_cnv/anantha/unmapped/oncocnv/ONCOCNV//ONCOCNV_getCounts.v6.4.pl line 466 (#1)
          (F) You tried to divide a number by 0. Either something was wrong in
          your logic, or you need to put a conditional in to guard against
          meaningless input.

          Uncaught exception from user code:
          Illegal division by zero at /san2/mallya/exome_cnv/anantha/unmapped/oncocnv/ONCOCNV//ONCOCNV_getCounts.v6.4.pl line 466.
          at /san2/mallya/exome_cnv/anantha/unmapped/oncocnv/ONCOCNV//ONCOCNV_getCounts.v6.4.pl line 466.

          ------------------------


          --Coordinates are read--


          ------------------------

          Total target length: 0
          Detected 8 tumor sample(s)
          reading 1.bam
          reading 2.bam
          reading 3.bam
          reading 4.bam
          reading 5.bam
          reading 6.bam
          reading 7.bam
          reading 8.bam
          Error: The requested bed file (/san2/mallya/exome_cnv/anantha/unmapped/oncocnv/result//target.bed) could not be opened. Exiting!
          Any suggestions to proceed further?

          Thanks

          • #6
            I believe something is wrong with your .bed file with regions. Please check the readme.

            • #7
              span is too small

              Has anyone experienced this issue?
              Error in simpleLoess(y, x, w, span, degree = degree, parametric = parametric, :
              span is too small

              I assume this is because my bed file contains sections where there are no reads? Outputs at https://drive.google.com/file/d/0B4x...ew?usp=sharing

              *** rest of stdout ***
              Calls: loess -> simpleLoess
              Execution halted
              Package 'mclust' version 5.2
              Type 'citation("mclust")' for citing this R package in publications.
              PSCBS v0.61.0 (2016-02-03) successfully loaded. See ?PSCBS for help.

              Attaching package: 'PSCBS'

              The following objects are masked from 'package:base':

              append, load

              R.cache v0.12.0 (2015-11-12) successfully loaded. See ?R.cache for help.
              Loading required package: lattice
              Loading required package: grid
              Loading required package: parallel
              Error in file(file, "rt") : cannot open the connection
              Calls: read.table -> file
              In addition: Warning message:
              In file(file, "rt") :
              cannot open file '/apps/outputDEEPCNA//Control.stats.Processed.txt': No such file or directory
              Execution halted

              • #8
                Perhaps by switching out loess (https://stat.ethz.ch/R-manual/R-deve...tml/loess.html)
                for rlm (https://stat.ethz.ch/R-manual/R-deve.../html/rlm.html)?

                • #9
                  Originally posted by arnoldliao:
                  Anyone experience this issue
                  Error in simpleLoess(y, x, w, span, degree = degree, parametric = parametric, :
                  span is too small

                  Hi,
                  I got the same error. Resolved, problem was with the bam file.

                  Sachin A

                  • #10
                    mclust error

                    I figured it out. My bed file contained only chr, start, end, while ONCOCNV needs chr, start, end, name, score, geneName.

                    Now I'm getting a different error:
                    Error in if (minFrac < minFractionOfShortOrLongAmplicons & maxFrac < minFractionOfShortOrLongAm
                    missing value where TRUE/FALSE needed

                    Any idea where I can start to debug? Use a smaller bed file?
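One possible starting point, given that the log in this post reports an empty target.bed and mentions 'chr' prefixes, is to compare the chromosome names used in the .bed file with the sequence names in the reference FASTA. The sketch below is a hedged illustration, not part of ONCOCNV: the function names are hypothetical, and ONCOCNV states that it adjusts for the prefix on its own, so a mismatch found here only rules one explanation in or out.

```python
def fasta_names(fasta_path):
    """Collect sequence names (first word after '>') from a FASTA file."""
    names = set()
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                parts = line[1:].split()
                if parts:
                    names.add(parts[0])
    return names


def bed_chromosomes(bed_path):
    """Collect chromosome names from tab-delimited BED rows with a numeric start."""
    chroms = set()
    with open(bed_path) as handle:
        for line in handle:
            fields = line.rstrip("\r\n").split("\t")
            # Rows without a numeric start (e.g. a header line) are skipped.
            if len(fields) >= 3 and fields[1].isdigit():
                chroms.add(fields[0])
    return chroms


def missing_chromosomes(bed_path, fasta_path):
    """Return BED chromosome names that do not occur in the FASTA file."""
    return sorted(bed_chromosomes(bed_path) - fasta_names(fasta_path))
```

For a whole-genome FASTA, scanning every line is slow; reading the first column of the accompanying .fai index, if one exists, gives the same names much faster.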

                    stderr


                    /outputDEEPCNA//Test.stats.txt was created
                    -rw-rw-rw- 1 root root 29 Jun 14 06:26 /outputDEEPCNA//Test.stats.txt
                    creating target.bed
                    -rw-rw-rw- 1 root root 0 Jun 14 06:26 /outputDEEPCNA//target.bed
                    creating target.GC.txt
                    ..Oops.. File /outputDEEPCNA//target.fasta is empty!
                    ..It seems that there is not 'chr' prefixes in your reference genome fasta file..
                    ..But no worries! OncoCNV will adjust for it
                    -rw-rw-rw- 1 root root 10 Jun 14 06:27 /outputDEEPCNA//target.GC.txt
                    running processControl.R
                    running processSamples.R

                    Package 'mclust' version 5.3
                    Type 'citation("mclust")' for citing this R package in publications.
                    Error in if (minFrac < minFractionOfShortOrLongAmplicons & maxFrac < minFractionOfShortOrLongAm
                    missing value where TRUE/FALSE needed
                    Execution halted
                    ls: cannot access '/outputDEEPCNA//Control.stats.Processed.txt': No such file or directory
                    Package 'mclust' version 5.3
                    Type 'citation("mclust")' for citing this R package in publications.
                    PSCBS v0.62.0 (2016-11-10) successfully loaded. See ?PSCBS for help.
                    Last edited by arnoldliao; 06-14-2017, 01:10 PM. Reason: figured it out.

                    • #11
                      Does your new (tab-delimited) .bed file satisfy the requirements listed in the OncoCNV manual?

                      Check formats:
                      o reads should be given in .BAM format
                      o amplicon coordinates should be given in .bed format (with or without a header line), with the amplicon ID in column 4 and the gene symbol in column 6, e.g.: chr1 2488068 2488201 AMPL223847 0 TNFRSF14

                      It is mandatory to provide gene names in the 6th column.

                      VERY IMPORTANT

                      Please make sure that:
                      - There are no duplicates in the coordinates
                      - Coordinates are sorted
                      - Gene names are gene names in the sense that corresponding amplicons fall in the same genomic locus and not on different chromosomes
                      - Gene names cannot be the same as amplicon names or IDs because ONCOCNV assumes to have several amplicons per gene
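The checks listed above can be automated before launching a run. The following is a minimal sketch under stated assumptions, not code from ONCOCNV: the function name is hypothetical, "sorted" is interpreted as each chromosome forming one contiguous block with non-decreasing start positions, and a first row without a numeric start is treated as the optional header.

```python
from collections import defaultdict


def check_bed(path):
    """Return a list of problems found in an amplicon BED file (empty if none)."""
    problems = []
    seen_coords = set()
    amplicon_ids = set()
    gene_chroms = defaultdict(set)
    last_start = {}        # chromosome -> start of its most recent amplicon
    previous_chrom = None
    first_row = True

    with open(path) as handle:
        for lineno, line in enumerate(handle, start=1):
            if not line.strip():
                continue
            fields = line.rstrip("\r\n").split("\t")
            is_first, first_row = first_row, False
            # Assumption: an optional header has no numeric start coordinate.
            if is_first and (len(fields) < 2 or not fields[1].isdigit()):
                continue
            if len(fields) < 6:
                problems.append(f"line {lineno}: {len(fields)} column(s), need at least 6")
                continue
            chrom, start, end, amplicon, _score, gene = fields[:6]
            if not (start.isdigit() and end.isdigit()):
                problems.append(f"line {lineno}: non-numeric coordinates")
                continue
            start, end = int(start), int(end)
            if not gene:
                problems.append(f"line {lineno}: empty gene name in column 6")
            if (chrom, start, end) in seen_coords:
                problems.append(f"line {lineno}: duplicate coordinates")
            seen_coords.add((chrom, start, end))
            if chrom in last_start and chrom != previous_chrom:
                problems.append(f"line {lineno}: {chrom} appears in more than one block")
            elif chrom == previous_chrom and start < last_start[chrom]:
                problems.append(f"line {lineno}: not sorted by start position")
            last_start[chrom] = start
            previous_chrom = chrom
            amplicon_ids.add(amplicon)
            gene_chroms[gene].add(chrom)

    for gene, chroms in sorted(gene_chroms.items()):
        if len(chroms) > 1:
            problems.append(f"gene {gene}: amplicons on several chromosomes {sorted(chroms)}")
        if gene in amplicon_ids:
            problems.append(f"gene {gene}: name is also used as an amplicon ID")
    return problems
```

Calling `check_bed("amplicons.bed")` (the file name is a placeholder) returns an empty list when none of these problems is detected; a three-column file such as the one described earlier in this thread is reported on every line.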

                      • #12
                        Thank you

                        Merci for the reply. I got it to work with a correct bed file, although I did get many NA values. I will email you separately about those issues.

                        • #13
                          OncoCNV training samples

                          Should the control samples go through the same library prep as the samples in which I want to call CNVs?
                          If I have an amplicon library and want to call CNVs, should the control samples also be amplicon-sequenced, or can I just use samples from a database?

                          thank you!

                          • #14
                            What does the copy.number column in the summary.txt file represent? That is, what do the values 1, 2 and 2.5 in that column mean?

                            • #15
                              Hi,

                              Can I use OncoCNV with WGS data (30X)?

                              For WES, do I have to add any specific options?

                              Many thanks
