  • super0925
    replied
    Hi D,
    Just two questions.
    1) I am doing RNA-Seq analysis (reads mapped to the human genome) and want to find significantly differentially expressed human genes, but I also find some read counts mapped to miRNAs. Is that normal? Do I need to analyze them separately (because I don't know whether they could affect the read distribution), or analyze everything together?
    2) BTW, some genes have a log fold change of inf or -inf (I know what that means) but Q < 0.05. Are they useful for downstream analysis?
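
A note on the inf / -inf values in question 2: a log fold change becomes infinite when the estimated expression in one of the two conditions is exactly zero, so the ratio cannot be computed. The sketch below illustrates this on a made-up table. The file names, gene names, values, and the log2(B/A) direction are all assumptions for illustration, not output from any tool in this thread.

```shell
# Hypothetical table: gene ID, mean expression in condition A, mean in condition B.
printf 'geneX\t0\t12.5\ngeneY\t8.0\t0\ngeneZ\t4.0\t16.0\n' > means.tsv

# Compute log2(B/A), writing inf / -inf explicitly when one mean is zero
# (awk would otherwise stop with a division-by-zero error).
awk -F'\t' '{
  if ($2 == 0 && $3 == 0) lfc = "NA"
  else if ($2 == 0)       lfc = "inf"
  else if ($3 == 0)       lfc = "-inf"
  else                    lfc = log($3 / $2) / log(2)
  print $1 "\t" lfc
}' means.tsv > lfc.tsv

cat lfc.tsv
```

In this toy table the flagged genes are detected in only one condition, which is why no finite fold change exists for them.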

  • dpryan
    replied
    I don't think using cufflinks or not has anything to do with it. Every tool makes assumptions, it's just a matter of which tool's assumptions happen to match your particular dataset better.

  • super0925
    replied
    Originally posted by dpryan
    I expect that the tool that gets the most true positives will vary by dataset. This will also vary wildly by cuffdiff version.
    Thank you D. Very useful! I got it.
    So do you think it is because I didn't run Cufflinks? I omitted that step because I don't need to find new genes or assemble transcripts. And for the count-based approaches (edgeR/DESeq2), we don't run Cufflinks either.

  • dpryan
    replied
    I expect that the tool that gets the most true positives will vary by dataset. This will also vary wildly by cuffdiff version.

  • super0925
    replied
    Originally posted by dpryan
    From the MDS plot, I'd guess that edgeR and DESeq2 are correct and there aren't any DE genes, but it's usually a good idea to not read too much into these. At the very least it's likely that the experiment is underpowered.
    Hi D,
    A quick question.
    I showed you my comparison of results between these pipelines before:

    tophat-bam-cuffdiff2 (I don't use cufflinks because I don't need to find new genes)
    tophat-htseq-deseq2/edgeR

    I found that Cuffdiff2 is more liberal than edgeR.

    However, according to this paper:
    Recent advances in next-generation sequencing technology allow high-throughput cDNA sequencing (RNA-Seq) to be widely applied in transcriptomic studies, in particular for detecting differentially expressed genes between groups. Many software packages have been developed for the identification of differentially expressed genes (DEGs) between treatment groups based on RNA-Seq data. However, there is a lack of consensus on how to approach an optimal study design and choice of suitable software for the analysis. In this comparative study we evaluate the performance of three of the most frequently used software tools: Cufflinks-Cuffdiff2, DESeq and edgeR. A number of important parameters of RNA-Seq technology were taken into consideration, including the number of replicates, sequencing depth, and balanced vs. unbalanced sequencing depth within and between groups. We benchmarked results relative to sets of DEGs identified through either quantitative RT-PCR or microarray. We observed that edgeR performs slightly better than DESeq and Cuffdiff2 in terms of the ability to uncover true positives. Overall, DESeq or taking the intersection of DEGs from two or more tools is recommended if the number of false positives is a major concern in the study. In other circumstances, edgeR is slightly preferable for differential expression analysis at the expense of potentially introducing more false positives.


    So it seems that edgeR got the most true positives.

    Could this happen (i.e. Cuffdiff2 being more liberal) because of the different species? I did this on bovine samples. Or is there a problem with my scripts?
    I used the default parameters in Cuffdiff2 and edgeR/DESeq2.
    Thank you!
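
The paper excerpt quoted above suggests taking the intersection of DEGs from two or more tools when false positives are a concern. A minimal sketch of that step, assuming each tool's significant gene IDs have already been exported to a plain text file with one ID per line (the file names and gene IDs here are hypothetical):

```shell
# Hypothetical DEG lists, one gene ID per line, as exported from each tool.
printf 'geneA\ngeneC\ngeneB\n' > cuffdiff2_degs.txt
printf 'geneB\ngeneD\ngeneA\n' > edger_degs.txt

# comm needs both inputs sorted the same way, so sort in the C locale first.
LC_ALL=C sort -u cuffdiff2_degs.txt > cuffdiff2.sorted
LC_ALL=C sort -u edger_degs.txt > edger.sorted

# -12 suppresses lines unique to either file, leaving only the shared IDs.
LC_ALL=C comm -12 cuffdiff2.sorted edger.sorted > shared_degs.txt

cat shared_degs.txt
```

If one of the lists is empty, as with the edgeR/DESeq2 results described in this thread, the intersection is empty as well.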

  • dpryan
    replied
    From the MDS plot, I'd guess that edgeR and DESeq2 are correct and there aren't any DE genes, but it's usually a good idea to not read too much into these. At the very least it's likely that the experiment is underpowered.

  • super0925
    replied
    Originally posted by dpryan
    You can always change the threshold for significance a bit (0.1 is very common). I'm not familiar enough with the inner workings of cuffdiff2 to provide any insights there.
    Thank you D. So do the MDS and MA plots look OK? (I ask because the two groups are not totally separated.)
    Last edited by super0925; 08-13-2014, 11:56 PM.

  • dpryan
    replied
    You can always change the threshold for significance a bit (0.1 is very common). I'm not familiar enough with the inner workings of cuffdiff2 to provide any insights there.
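
As a rough sketch of what relaxing the threshold means in practice, the snippet below counts genes passing an adjusted p-value cutoff of 0.05 versus 0.1. The table, its column layout (gene ID, then adjusted p-value), and the `count_sig` helper are made up for illustration; real edgeR or DESeq2 output would first need to be exported in a similar form.

```shell
# Hypothetical results table: header line, then gene ID and adjusted p-value.
printf 'gene\tpadj\ngeneA\t0.03\ngeneB\t0.08\ngeneC\t0.40\ngeneD\tNA\n' > results.tsv

# count_sig THRESHOLD FILE: count rows whose adjusted p-value is below THRESHOLD.
# The header row and NA values are skipped.
count_sig() {
  awk -F'\t' -v t="$1" 'NR > 1 && $2 != "NA" && $2 < t' "$2" | wc -l | tr -d ' '
}

echo "padj < 0.05: $(count_sig 0.05 results.tsv)"
echo "padj < 0.1:  $(count_sig 0.1 results.tsv)"
```

Whichever cutoff is used, it is best chosen before looking at the results rather than tuned until genes appear.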

  • super0925
    replied
    Originally posted by dpryan
    It's probably a bug in bowtie2 then. If you can subset your fastq file to a reasonable size and can still reproduce the issue then you can either post that and the gene you're aligning against and I'll have a look or you can then just directly submit a bug report to the bowtie2 authors.

    BTW, you don't need to specify --local if you also specify --very-sensitive-local.

    Hi teacher, another question.
    I am analysing 6 samples in 2 conditions.
    From Cuffdiff2, I got ~700 DE genes (the default is Q < 0.05, as you know).
    But with edgeR and DESeq2, I didn't find any DE genes at the default threshold (i.e. Q < 0.05).
    Why? Do I need to change any settings or parameters? (Currently I use the defaults.)
    Could I stick with Tuxedo, or should I change a parameter (e.g. the P value threshold) for the count-based methods?

    I have attached the MA plot and MDS plot.

    Thank you!

  • dpryan
    replied
    It's probably a bug in bowtie2 then. If you can subset your fastq file to a reasonable size and can still reproduce the issue then you can either post that and the gene you're aligning against and I'll have a look or you can then just directly submit a bug report to the bowtie2 authors.

    BTW, you don't need to specify --local if you also specify --very-sensitive-local.
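
One simple way to subset a FASTQ file for a bug report is to keep only the first N reads. The sketch below assumes the common 4-lines-per-read FASTQ layout and uses a toy file with hypothetical names. Whether the first reads alone reproduce the problem is not guaranteed.

```shell
# Toy FASTQ with three reads (hypothetical data, 4 lines per read).
printf '@r1\nACGT\n+\nIIII\n@r2\nGGCA\n+\nIIII\n@r3\nTTAG\n+\nIIII\n' > toy.fastq

# Keep the first N reads: each read occupies exactly 4 lines.
N=2
head -n $((N * 4)) toy.fastq > toy_subset.fastq

# Report how many reads ended up in the subset.
echo "reads in subset: $(( $(wc -l < toy_subset.fastq) / 4 ))"
```

If the subset still triggers the problem, N can be halved repeatedly to find a small file that is easy to share.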

  • super0925
    replied
    Originally posted by dpryan
    You forgot the "-x" before "GQ2Bowtie2Index/GQ2". I presume that was just a typo in this post, though, so you'd have to run bowtie2 in a debugger to find the source of the error. It's likely that this sort of thing is a bug rather than you doing something wrong.
    Sorry D, even if I add the "-x", I still get the same error... T__T
    Last edited by super0925; 07-31-2014, 05:30 AM.

  • dpryan
    replied
    You forgot the "-x" before "GQ2Bowtie2Index/GQ2". I presume that was just a typo in this post, though, so you'd have to run bowtie2 in a debugger to find the source of the error. It's likely that this sort of thing is a bug rather than you doing something wrong.

  • super0925
    replied
    Originally posted by dpryan
    We should set up a consultation contract

    If you just want a quick and dirty check then just mapping reads to that gene should suffice. Just tweak your settings to only permit perfect or near perfect matches.

    That should suffice. If you need to know exact read numbers or you need the alignments for SNP calling, then this method isn't ideal. In those cases, you would really need to map to the entire genome so as to not bias alignments (this is also why I suggested only accepting near-perfect matches above).

    Hello D,
    Sorry to trouble you again about this issue.
    I want to see whether a sequence (call it "GQ2") is expressed in bovine cells. I followed the solution from our last discussion: 'first I will generate the bowtie2 index of this "GQ2" based on the FASTA sequence, and then map my fastq file to that "GQ2" genome.'
    But when I did it, I got an error.
    After I started bowtie2, it seemed to run in an endless loop (so obviously something is wrong). When I terminated bowtie2, I saw this error:
    "bowtie2-align died with signal 2 (INT)"

    I have googled the error but didn't find any solution. Some posts said it may be due to a memory problem, but I don't think that is my problem. Hence I am wondering whether my command lines are wrong.

    My steps:
    (1) Generate the bowtie2 index of GQ2.

    >GQ2
    ATGGAGCACTTTCCCCGCTGTGTGCACGAGTCCTGGGGTTCCTCAAAGGA
    GCCCCAGAAAACAGAGGTGCTGCAACTCTTGAGCTTAGCGGACCCTGAGG
    .....

    mkdir GQ2Bowtie2Index
    cd GQ2Bowtie2Index
    bowtie2-build GQ2.fa GQ2
    ls
    GQ2.1.bt2 GQ2.2.bt2 GQ2.3.bt2 GQ2.4.bt2 GQ2.fa GQ2.rev.1.bt2 GQ2.rev.2.bt2

    (2) Map the reads onto the 'GQ2' genome.

    bowtie2 --local --very-sensitive-local -p 8 ./GQ2Bowtie2Index/GQ2 -U bovine_sample1.fastq

    Then I get the error!
    I really don't know what I am doing wrong...
    Thank you!
    Last edited by super0925; 07-31-2014, 02:17 AM.

  • super0925
    replied
    Originally posted by dpryan
    We should set up a consultation contract

    If you just want a quick and dirty check then just mapping reads to that gene should suffice. Just tweak your settings to only permit perfect or near perfect matches.

    That should suffice. If you need to know exact read numbers or you need the alignments for SNP calling, then this method isn't ideal. In those cases, you would really need to map to the entire genome so as to not bias alignments (this is also why I suggested only accepting near-perfect matches above).

    Thank you soooo much! You are not only my consultant, but also my teacher
    I will try to do it.

  • dpryan
    replied
    Originally posted by super0925
    Hi D
    I have two samples (i.e. two .fastq files) from bovine. I want to see whether a sequence (call it "GQ2") is expressed in bovine cells (I have the FASTA of this sequence). The sequence is unannotated in the current bovine genome, so it might not have been tested in the analyses thus far.

    We should set up a consultation contract

    Originally posted by super0925
    Q1: How could I do it?

    If you just want a quick and dirty check then just mapping reads to that gene should suffice. Just tweak your settings to only permit perfect or near perfect matches.

    Originally posted by super0925
    Q2: My solution (I don't know whether it is correct):
    first I generate the bowtie/bowtie2 index of this "GQ2" based on the FASTA sequence, and then map my fastq file to that "GQ2" genome. Is it correct?
    Thank you!

    That should suffice. If you need to know exact read numbers or you need the alignments for SNP calling, then this method isn't ideal. In those cases, you would really need to map to the entire genome so as to not bias alignments (this is also why I suggested only accepting near-perfect matches above).
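
The quick check described above could be sketched as follows. The input file names follow those used elsewhere in this thread; the output name `GQ2_hits.sam` and the `--score-min L,0,-0.1` setting are assumptions chosen to illustrate "near perfect" matching in end-to-end mode, not values recommended by the posters. The `run` helper is a hypothetical wrapper that only prints each command unless `DRY_RUN=0` is set.

```shell
# Print commands by default; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# 1) Build a bowtie2 index from the single-sequence FASTA.
run mkdir -p GQ2Bowtie2Index
run bowtie2-build GQ2.fa GQ2Bowtie2Index/GQ2

# 2) Align the reads against that index. Note the -x before the index basename.
#    --score-min L,0,-0.1 (an assumed, illustrative value) sets the minimum
#    score to -0.1 * read length, so only a small alignment penalty is tolerated.
#    --no-unal leaves reads that failed to align out of the SAM output.
run bowtie2 --end-to-end --score-min L,0,-0.1 --no-unal -p 8 \
    -x GQ2Bowtie2Index/GQ2 -U bovine_sample1.fastq -S GQ2_hits.sam
```

With `--no-unal`, the SAM file should contain records only for reads that aligned, so counting its non-header lines gives a rough idea of whether any reads match the sequence.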
