  • clarissaboschi
    replied
    Alicia, I gave up on the CNV-seq tool and am now using CNVnator, which does not need a reference sample.


  • jflores
    replied
    Hello Alicia,

    The most straightforward design here is to choose one of your test#.hits files as the reference. You can use the sample with the highest coverage, or a sample that has already served as the reference in other studies, and so on. You then compare every other chicken sample against it and obtain CNV calls relative to that sample. Let's say you choose test1.hits as the reference:

    test2.hits vs test1.hits

    test3.hits vs test1.hits

    test4.hits vs test1.hits

    ....

    Hope it helps,
    Rodrigo F.
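
A minimal sketch of the design described above, assuming one hits file per sample and using the line count of each hits file as a rough proxy for coverage. The function names are hypothetical and are not part of CNV-seq.

```python
def count_hits(path):
    """Count the lines (one per mapped read) in a hits file."""
    with open(path) as handle:
        return sum(1 for _ in handle)


def pair_against_reference(hit_counts, reference=None):
    """Return (test, reference) pairs for every non-reference sample.

    hit_counts maps a hits-file name to its number of mapped reads.
    If no reference is named, the sample with the most reads is chosen,
    which is only an approximation of "highest coverage".
    """
    if reference is None:
        reference = max(hit_counts, key=hit_counts.get)
    return [(name, reference) for name in sorted(hit_counts) if name != reference]


# Example with made-up read counts for three samples.
counts = {"test1.hits": 500, "test2.hits": 300, "test3.hits": 400}
for test, ref in pair_against_reference(counts):
    print(f"{test} vs {ref}")
```

In practice the counts would come from `count_hits` on each file, and each printed pair would correspond to one CNV-seq run.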


  • Alicia B
    replied
    Originally posted by clarissaboschi
    Dear members,

    I am using Illumina next-generation sequencing data from different chickens. I have a separate BAM file for each chicken, and I obtained test.hits with this command:
    samtools view -F 4 file1.bam | perl -lane 'print "$F[2]\t$F[3]"' > test1.hits

    I also need a ref.hits file, but I only have the chicken reference genome (FASTA file). How can I get ref.hits?

    Thanks
    Clarissa

    @clarissaboschi did you ever find a solution to your question about what to use as a ref.hits file? I am in what seems to be a similar situation: I don't have a control sample, and I am only trying to detect CNVs in various individuals (rice) by comparing them to the reference genome. What did you end up doing?

    Thanks for your help!


  • arcolombo698
    replied
    Understanding the plotted log2 ratio

    Hi, and thank you in advance.
    After re-reading the manual, I have a question about the values on the y-axis.

    The y-axis shows the log2 values from the .cnv output file.

    The paper gives the predicted copy number as r = z*(N_Y)/(N_X). Are the values of r the same as the log2 column entries plotted on the y-axis?

    I am trying to interpret the ratio correctly, given this test example.

    I use a reference/control (N_Y) and a tumor sample (N_X). Based on the original paper by Xie, do positive log2 values indicate that N_Y > N_X, meaning there was no increase in copy number? Is this the correct understanding of the log2 value of their ratio?
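
A small worked sketch of the formula quoted above, assuming that z is the per-window ratio of test to reference read counts, that N_X and N_Y are the genome-wide read totals of the test and reference samples, and that the log2 column is log2(r). Under those assumptions N_Y/N_X is a constant normalization factor, so the sign of log2(r) is set by the window counts rather than by whether N_Y > N_X. Two rows posted further down in this thread (CHROMOSOME_II) are consistent with this reading: the offset inferred from one row reproduces the log2 value of the next.

```python
import math


def log2_ratio(test_count, ref_count, total_test, total_ref):
    """log2 of r = z * N_Y / N_X, with z = test_count / ref_count (assumed)."""
    z = test_count / ref_count
    return math.log2(z * total_ref / total_test)


def normalization_offset(test_count, ref_count, reported_log2):
    """Infer the constant log2(N_Y / N_X) from a single output row."""
    return reported_log2 - math.log2(test_count / ref_count)


# Row 1 from the thread: test = 404, ref = 416, log2 = 0.881401332001257
offset = normalization_offset(404, 416, 0.881401332001257)

# Row 2 from the thread: test = 570, ref = 464, reported log2 = 1.22046668131509
predicted = math.log2(570 / 464) + offset
print(round(offset, 3), round(predicted, 3))
```

If this reading is right, a positive value means the window holds more test reads than the genome-wide totals would predict, which points to a gain in the test sample relative to the reference.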

  • arcolombo698
    replied
    Thank you for the response. What does the CNV argument mean, and what does the glim argument mean? Is glim the limit on the p.value column of the data frame passed as data?

    From the manual...

    plot.cnv.chr <- function (data, chromosome = NA,
    from = NA, to = NA, title = NA,
    ylim = c(-4, 4), glim = c(NA, NA),


  • arcolombo698
    replied
    I am not a CGH expert by any means. In my experiments, the germline samples are the tumor-free controls. The tumor samples contain tumor tissue, and we measure the variation between the two.

    My best advice is to search PubMed for experiments similar to yours. It depends on your experimental design and hypothesis. I am under the impression that the reference is the control. I am also not an expert in chicken experiments; I have only worked with hg19 and mm10, for NBL and Medullo.

    If you use the layer as the reference, the CNV output will describe changes in the broiler relative to the layer. Again, this depends on your experimental design.


  • clarissaboschi
    replied
    Originally posted by arcolombo698
    @Clarissa

    Normally, in CGH experiments you compare a tumor sample with a non-tumor sample. My data has a germline sample and a tumor sample, and the germline is used as the reference. So you should have two samples to compare.

    @arcolombo698
    Thanks for the response.
    In my experiment I have two different chicken lines (broiler x layer), with individual samples from each: 10 broilers and 10 layers. Should I combine all my broilers and all my layers? I would prefer to check each sample (chicken) individually. I don't have a control sample, so maybe I need a control from another chicken line? I am not sure, because in my experiment I detect SNPs using a reference from NCBI (the chicken reference genome).
    Any suggestions?


  • xiechao
    replied
    Originally posted by arcolombo698
    Thank you... I have one last question.

    When plotting, how do I rotate the x-axis tick labels by 90 degrees? I know ggplot has an angle option for this. Right now my images look like this:


    Here is my command

    plot.cnv(data)
    Warning messages:
    1: In plot.cnv.all(data, ...) :
    missed some data points due to small ylim range
    2: Removed 111874 rows containing missing values (geom_point).
    > ggsave("allchroms.pdf")
    Saving 6.99 x 6.99 in image
    Warning message:
    Removed 111874 rows containing missing values (geom_point).



    Thank you again very much.

    Should I use the reshape package?

    Try something like:
    g <- plot.cnv(data, CNV = 4)
    g + theme(axis.text.x = element_text(angle = 90))


  • arcolombo698
    replied
    Rotating the X tick marks with plot.cnv

    Thank you... I have one last question.

    When plotting, how do I rotate the x-axis tick labels by 90 degrees? I know ggplot has an angle option for this. Right now my images look like this:


    Here is my command

    plot.cnv(data)
    Warning messages:
    1: In plot.cnv.all(data, ...) :
    missed some data points due to small ylim range
    2: Removed 111874 rows containing missing values (geom_point).
    > ggsave("allchroms.pdf")
    Saving 6.99 x 6.99 in image
    Warning message:
    Removed 111874 rows containing missing values (geom_point).



    Thank you again very much.

    Should I use the reshape package?


  • xiechao
    replied
    Originally posted by arcolombo698
    The manual does not say that the mouse genome is unsupported. Has anyone used CNV-seq for mouse data?

    I was able to pass the genome.fa used for alignment to the --genome parameter, but I would like to confirm that this is correct.

    Thank you

    Yes, it should work with any genome. But you need to specify --genome-size, which is used to calculate the sliding window size.


  • arcolombo698
    replied
    @Clarissa

    Normally, in CGH experiments you compare a tumor sample with a non-tumor sample. My data has a germline sample and a tumor sample, and the germline is used as the reference. So you should have two samples to compare.


  • arcolombo698
    replied
    The manual does not say that the mouse genome is unsupported. Has anyone used CNV-seq for mouse data?

    I was able to pass the genome.fa used for alignment to the --genome parameter, but I would like to confirm that this is correct.

    Thank you


  • clarissaboschi
    replied
    How to obtain the ref.hits file from a reference genome

    Dear members,

    I am using Illumina next-generation sequencing data from different chickens. I have a separate BAM file for each chicken, and I obtained test.hits with this command:
    samtools view -F 4 file1.bam | perl -lane 'print "$F[2]\t$F[3]"' > test1.hits

    I also need a ref.hits file, but I only have the chicken reference genome (FASTA file). How can I get ref.hits?

    Thanks
    Clarissa
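
For readers unsure what the one-liner produces, here is a rough Python equivalent, assuming standard tab-delimited SAM text as printed by `samtools view`. It skips reads whose FLAG has the 0x4 (unmapped) bit set, which is what `-F 4` filters out, and prints the third and fourth SAM fields (reference name and position), which are `$F[2]` and `$F[3]` in the Perl command. The function name is hypothetical. Since every line of a hits file is the position of an aligned read, a hits file has to come from mapped reads rather than from a FASTA file alone.

```python
def sam_to_hits(sam_lines):
    """Yield 'chromosome<TAB>position' for each mapped read in SAM text."""
    for line in sam_lines:
        if line.startswith("@"):  # SAM header line, not a read
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:  # malformed or empty line
            continue
        flag = int(fields[1])
        if flag & 4:  # 0x4 = read unmapped
            continue
        yield f"{fields[2]}\t{fields[3]}"


# Two made-up SAM records: one mapped, one unmapped.
example = [
    "r1\t0\tchr1\t100\t60\t50M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n",
]
for hit in sam_to_hits(example):
    print(hit)
```

To use it on real data, the lines would be read from the output of `samtools view file1.bam` instead of the example list.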


  • Ayush_Saxena
    replied
    Originally posted by billthebrute
    Hi CNV-seq users. I am having a bit of trouble understanding how the CNV caller works.

    The results in one of my .cnv files are as follows:


    "22" 16275937 18988591 204 2822 17632264 -0.683143010100174 2.46090424342812e-08 0 NA NA NA
    "22" 17632265 20344919 195 3190 18988592 -0.925076474377026 2.85143491959854e-12 0 NA NA NA
    "22" 18988593 21701247 191 3107 20344920 -0.916943777560497 3.8950753868117e-12 0 NA NA NA
    "22" 20344921 23057575 188 3082 21701248 -0.928128374311553 2.53633054795169e-12 0 NA NA NA
    "22" 21701249 24413903 210 3108 23057576 -0.780591350215992 6.85565176334874e-10 0 NA NA NA
    "22" 23057577 25770231 206 3256 24413904 -0.875450536557434 1.90371570869566e-11 0 NA NA NA
    "22" 24413905 27126559 177 3139 25770232 -1.04154984297352 3.22421238629005e-14 0 NA NA NA
    "22" 25770233 28482887 180 3103 27126560 -1.00066096467345 1.55775796481263e-13 0 NA NA NA
    "22" 27126561 29839215 157 2718 28482888 -1.00677507135017 1.23086398489238e-13 0 NA NA NA
    "22" 28482889 31195543 143 2606 29839216 -1.08081611126415 7.11362307806884e-15 0 NA NA NA
    "22" 29839217 32551871 160 2852 31195544 -1.04889625103093 2.42971963892229e-14 0 NA NA NA
    "22" 31195545 33908199 155 2827 32551872 -1.08199784203134 6.7976211176428e-15 0 NA NA NA
    "22" 32551873 35264527 166 2610 33908200 -0.867860739584727 2.5422391389699e-11 0 NA NA NA
    "22" 33908201 36620855 170 2621 35264528 -0.839576781665121 7.44538364328285e-11 0 NA NA NA
    "22" 35264529 37977183 210 3210 36620856 -0.827178143817948 1.19029886619997e-10 0 NA NA NA
    "22" 36620857 39333511 218 3208 37977184 -0.772340181152399 9.32704407763649e-10 0 NA NA NA
    "22" 37977185 40689839 195 2891 39333512 -0.783088659201844 6.24484354174541e-10 0 NA NA NA
    "22" 39333513 42046167 191 2608 40689840 -0.664365405669734 4.82741849789952e-08 0 NA NA NA
    "22" 40689841 43402495 306 4365 42046168 -0.727444175298556 4.90998416767461e-09 0 NA NA NA
    "22" 42046169 44758823 409 5386 43402496 -0.612107561034328 3.04270126227705e-07 0 NA NA NA
    "22" 43402497 46115151 273 3718 44758824 -0.660619993474034 5.51777958936528e-08 0 NA NA NA
    "22" 44758825 47471479 193 3190 46115152 -0.939949750858556 1.61095440200424e-12 0 NA NA NA

    Surely this should have been annotated as a CNV, as there are more than 4 windows with log2 < -0.6 and high significance!
    Moreover, I find the coordinates strange.
    Look at the first and third lines:
    "22" 16275937 18988591
    "22" 18988593 21701247

    18988591 is not equal to 18988593, whereas sometimes, if I change the window size, the end of the first line and the beginning of the third are the same.

    These files were created with default values; window size = 2712655.

    Thanks in advance for any insight you can give me. I am tempted to call the CNVs manually... what would be the best way to do so from the .cnv file?

    I overlooked your post before I posted mine. I'm having a very similar problem and I've also posted my data. Did you get around your problem, or did you call manually?
    Last edited by Ayush_Saxena; 03-15-2013, 09:12 AM. Reason: Forgot to Quote
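
The 2 bp gap in the quoted coordinates can be reproduced with simple arithmetic, assuming the windows overlap by half and the step is half the window size rounded up. That rounding rule is inferred from the quoted rows, not taken from the CNV-seq source. With the odd window size 2712655 the step is 1356328, so two steps cover 2712656 bp, one more than the window length, and the third window starts 2 bp after the first one ends. With an even window size the third window would start directly after the first one ends.

```python
def window_coordinates(first_start, window_size, n_windows):
    """Start/end of half-overlapping windows (step = window_size / 2, rounded up)."""
    step = (window_size + 1) // 2  # assumption inferred from the quoted rows
    return [
        (first_start + i * step, first_start + i * step + window_size - 1)
        for i in range(n_windows)
    ]


windows = window_coordinates(16275937, 2712655, 3)
for start, end in windows:
    print(start, end)
# prints:
# 16275937 18988591
# 17632265 20344919
# 18988593 21701247

# Distance between the end of window 1 and the start of window 3.
print(windows[2][0] - windows[0][1])  # prints 2
```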


  • Ayush_Saxena
    replied
    Bugs

    Is there any way to report a bug in CNV-seq? I tried searching for one but couldn't find it.

    For some strange reason, the software is not calling CNVs even when all conditions are met. I used a log2 threshold of 0.8 and a window size of only 2 so that I could look at the result (cnv.print()) by eye and judge which ones to pick.

    "CHROMOSOME_II" 1450045 1451171 404 416 1450608 0.881401332001257 2.29121717076986e-13 0 NA NA NA
    "CHROMOSOME_II" 1450609 1451735 570 464 1451172 1.22046668131509 1.31845290945157e-22 0 NA NA NA
    "CHROMOSOME_II" 1451173 1452299 629 550 1451736 1.11725796585782 1.18360892978282e-19 0 NA NA NA
    "CHROMOSOME_II" 1451737 1452863 671 657 1452300 0.954048963268409 3.24101505961822e-15 0 NA NA NA
    "CHROMOSOME_II" 1452301 1453427 602 577 1452864 0.984821735504775 5.02877099034994e-16 0 NA NA NA
    "CHROMOSOME_II" 1452865 1453991 513 516 1453428 0.915217327574355 3.2389370038797e-14 0 NA NA NA
    "CHROMOSOME_II" 1453429 1454555 590 516 1453992 1.1169734562165 1.20564598087743e-19 0 NA NA NA
    "CHROMOSOME_II" 1453993 1455119 551 465 1454556 1.16845116996632 4.16186035989176e-21 0 NA NA NA
    "CHROMOSOME_II" 1454557 1455683 374 369 1455120 0.943047021217795 6.25787394726544e-15 0 NA NA NA
    "CHROMOSOME_II" 1455121 1456247 415 422 1455684 0.899497904917657 8.08876111589643e-14 0 NA NA NA
    "CHROMOSOME_II" 1455685 1456811 615 573 1456248 1.02568083886025 4.03369375071261e-17 0 NA NA NA
    "CHROMOSOME_II" 1456249 1457375 666 575 1456812 1.13558978863008 3.59196486144447e-20 0 NA NA NA
    "CHROMOSOME_II" 1456813 1457939 663 565 1457376 1.15438757020059 1.04963378485956e-20 0 NA NA NA
    "CHROMOSOME_II" 1457377 1458503 609 524 1457940 1.14050498375944 2.60576418573692e-20 0 NA NA NA
    "CHROMOSOME_II" 1457941 1459067 410 408 1458504 0.930684324924505 1.30371091397085e-14 0 NA NA NA

    This entire >8 kb region meets all the conditions set for a CNV to be called, yet it is still not called, which makes me a little skeptical about the software itself. Can anyone suggest some reliable alternatives?
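
For anyone who wants to call regions by hand from rows like the ones above, here is a minimal sketch. It is not CNV-seq's own calling algorithm. It assumes the first eight columns are chromosome, start, end, test count, reference count, window position, log2 ratio, and p-value (inferred from the posted rows), and it merges runs of consecutive windows on the same chromosome that pass both thresholds and share the same sign. The function names and the default p-value cutoff are hypothetical.

```python
def parse_cnv_row(line):
    """Parse one whitespace-separated row of the output shown above."""
    f = line.split()
    return {
        "chrom": f[0].strip('"'),
        "start": int(f[1]),
        "end": int(f[2]),
        "log2": float(f[6]),
        "p": float(f[7]),
    }


def call_regions(rows, log2_min=0.6, p_max=0.001, min_windows=4):
    """Return (chrom, start, end, n_windows) for each qualifying run.

    rows must be sorted by chromosome and position, as in the .cnv output.
    """
    regions, run = [], []

    def flush():
        # Keep the current run only if it is long enough, then reset it.
        if len(run) >= min_windows:
            regions.append((run[0]["chrom"], run[0]["start"], run[-1]["end"], len(run)))
        run.clear()

    for row in rows:
        hit = abs(row["log2"]) >= log2_min and row["p"] <= p_max
        same_run = (
            run
            and run[-1]["chrom"] == row["chrom"]
            and (run[-1]["log2"] > 0) == (row["log2"] > 0)
        )
        if not hit or not same_run:
            flush()
        if hit:
            run.append(row)
    flush()
    return regions


# The first three rows posted above, with the thresholds mentioned in the post.
lines = [
    '"CHROMOSOME_II" 1450045 1451171 404 416 1450608 0.881401332001257 2.29121717076986e-13 0 NA NA NA',
    '"CHROMOSOME_II" 1450609 1451735 570 464 1451172 1.22046668131509 1.31845290945157e-22 0 NA NA NA',
    '"CHROMOSOME_II" 1451173 1452299 629 550 1451736 1.11725796585782 1.18360892978282e-19 0 NA NA NA',
]
rows = [parse_cnv_row(line) for line in lines]
print(call_regions(rows, log2_min=0.8, min_windows=2))
# prints [('CHROMOSOME_II', 1450045, 1452299, 3)]
```

Because neighbouring windows overlap, the merged interval simply runs from the start of the first window to the end of the last one.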
