
  • Originally posted by dpryan
    We should set up a consultation contract



    If you just want a quick and dirty check then just mapping reads to that gene should suffice. Just tweak your settings to only permit perfect or near perfect matches.



    That should suffice. If you need to know exact read numbers or you need the alignments for SNP calling, then this method isn't ideal. In those cases, you would really need to map to the entire genome so as to not bias alignments (this is also why I suggested only accepting near-perfect matches above).

    Hello D,
    Sorry to trouble you again about this issue.
    I want to see whether a sequence (call it "GQ2") is expressed in bovine cells. I followed the solution from our last discussion: 'first I generate the bowtie2 index for "GQ2" from its FASTA sequence, and then map my FASTQ file to that "GQ2" genome.'
    But when I did that, I got an error.
    After I started bowtie2, it seemed to go into an endless loop (obviously an error). When I terminated bowtie2, I saw this message:
    "bowtie2-align died with signal 2 (INT)"

    I have googled the error but did not find any solution. Some posts said it may be due to a memory problem, but I don't think that is the case for me. Hence I am wondering whether my command lines are wrong.

    My steps:
    (1) Generate the bowtie2 index for GQ2.

    >GQ2
    ATGGAGCACTTTCCCCGCTGTGTGCACGAGTCCTGGGGTTCCTCAAAGGA
    GCCCCAGAAAACAGAGGTGCTGCAACTCTTGAGCTTAGCGGACCCTGAGG
    .....

    mkdir GQ2Bowtie2Index
    cd GQ2Bowtie2Index
    bowtie2-build GQ2.fa GQ2
    ls
    GQ2.1.bt2 GQ2.2.bt2 GQ2.3.bt2 GQ2.4.bt2 GQ2.fa GQ2.rev.1.bt2 GQ2.rev.2.bt2

    (2) Map the reads onto the 'GQ2' genome.

    bowtie2 --local --very-sensitive-local -p 8 ./GQ2Bowtie2Index/GQ2 -U bovine_sample1.fastq

    Then I get the error!
    I really don't know what I am doing wrong...
    Thank you!
    Last edited by super0925; 07-31-2014, 02:17 AM.


    • You forgot the "-x" before "GQ2Bowtie2Index/GQ2". I presume that was just a typo in this post, though, so you'd have to run bowtie2 in a debugger to find the source of the error. It's likely that this sort of thing is a bug rather than you doing something wrong.
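A side note on the message quoted in the question: signal 2 is SIGINT, the signal a terminal sends on Ctrl-C, so "died with signal 2 (INT)" records the manual interruption and does not by itself say what bowtie2 was doing beforehand. The sketch below checks the signal name and, as a comment only, shows the mapping command with the index passed through -x. The -S option and the output name GQ2_sample1.sam are assumptions added here, not part of the original post; without -S, bowtie2 writes its SAM output to standard output.

```shell
# Print the name of signal number 2 (the number reported in the bowtie2 message).
kill -l 2

# Mapping command with the index given through -x and SAM output sent to a file.
# Left as a comment because it needs bowtie2 and the poster's own files;
# GQ2_sample1.sam is a hypothetical output name.
#   bowtie2 --very-sensitive-local -p 8 \
#       -x ./GQ2Bowtie2Index/GQ2 \
#       -U bovine_sample1.fastq \
#       -S GQ2_sample1.sam
```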



      • Originally posted by dpryan
        You forgot the "-x" before "GQ2Bowtie2Index/GQ2". I presume that was just a typo in this post, though, so you'd have to run bowtie2 in a debugger to find the source of the error. It's likely that this sort of thing is a bug rather than you doing something wrong.
        Sorry D, if I add the "-x", I still get the same error... T__T
        Last edited by super0925; 07-31-2014, 05:30 AM.


        • It's probably a bug in bowtie2 then. If you can subset your fastq file to a reasonable size and can still reproduce the issue then you can either post that and the gene you're aligning against and I'll have a look or you can then just directly submit a bug report to the bowtie2 authors.

          BTW, you don't need to specify --local if you also specify --very-sensitive-local.
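Subsetting a FASTQ file for a bug report can be sketched as follows. This is a minimal example that assumes unwrapped FASTQ with exactly four lines per read; the file names and the three made-up reads exist only so the snippet runs on its own.

```shell
# Build a tiny FASTQ file so the sketch is self-contained (3 made-up reads).
printf '@r1\nACGT\n+\nIIII\n@r2\nGGCA\n+\nIIII\n@r3\nTTAG\n+\nIIII\n' > full.fastq

# Each FASTQ record is 4 lines, so the first N reads are the first 4*N lines.
n_reads=2
head -n "$((n_reads * 4))" full.fastq > subset.fastq

# Report how many reads were kept.
echo "$(( $(wc -l < subset.fastq) / 4 )) reads written to subset.fastq"
```

On a real file, n_reads would be something much larger, and the subset is only useful if the problem still reproduces with it.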



          • Originally posted by dpryan
            It's probably a bug in bowtie2 then. If you can subset your fastq file to a reasonable size and can still reproduce the issue then you can either post that and the gene you're aligning against and I'll have a look or you can then just directly submit a bug report to the bowtie2 authors.

            BTW, you don't need to specify --local if you also specify --very-sensitive-local.

            Hi teacher, another question.
            I am analysing 6 samples in 2 conditions.
            From Cuffdiff2, I got ~700 DE genes (the default is Q < 0.05, as you know).
            But with edgeR and DESeq2, I didn't find any DE genes at the default threshold (i.e. Q < 0.05).
            Why? Do I need to change any settings or parameters? (Currently I use the defaults.)
            Could I stick with Tuxedo, or should I change a parameter (e.g. the P value cutoff) for the count-based methods?

            I have attached the MA plot and MDS plot.

            Thank you!


            • You can always change the threshold for significance a bit (0.1 is very common). I'm not familiar enough with the inner workings of cuffdiff2 to provide any insights there.
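Relaxing the cutoff amounts to re-filtering the results table on the adjusted p-value column. The sketch below assumes the results were exported to a tab-separated file; the file name, the column order (gene, log2FoldChange, padj), and the values are invented for illustration.

```shell
# Made-up results table: gene, log2 fold change, adjusted p-value.
printf 'gene\tlog2FoldChange\tpadj\n'  > results.tsv
printf 'geneA\t1.8\t0.03\n'           >> results.tsv
printf 'geneB\t-1.1\t0.08\n'          >> results.tsv
printf 'geneC\t0.2\t0.60\n'           >> results.tsv
printf 'geneD\t2.5\tNA\n'             >> results.tsv

# Keep the header plus every row whose adjusted p-value is below the cutoff.
# Rows with NA in the padj column are skipped explicitly.
cutoff=0.1
awk -F '\t' -v cutoff="$cutoff" \
    'NR == 1 || ($3 != "NA" && $3 + 0 < cutoff + 0)' \
    results.tsv > significant.tsv

cat significant.tsv
```

With cutoff=0.05 only geneA would pass in this toy table; at 0.1 geneB passes as well.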



              • Originally posted by dpryan
                You can always change the threshold for significance a bit (0.1 is very common). I'm not familiar enough with the inner workings of cuffdiff2 to provide any insights there.
                Thank you D. So do the MDS and MA plots look OK, given that the two groups are not totally separated?
                Last edited by super0925; 08-13-2014, 11:56 PM.


                • From the MDS plot, I'd guess that edgeR and DESeq2 are correct and there aren't any DE genes, but it's usually a good idea to not read too much into these. At the very least it's likely that the experiment is underpowered.



                  • Originally posted by dpryan
                    From the MDS plot, I'd guess that edgeR and DESeq2 are correct and there aren't any DE genes, but it's usually a good idea to not read too much into these. At the very least it's likely that the experiment is underpowered.
                    Hi D,
                    A quick question.
                    I showed you a comparison of my results between the pipelines before:

                    tophat-bam-cuffdiff2 (I don't use cufflinks because I don't need to find new genes)
                    tophat-htseq-deseq2/edgeR

                    I found that Cuffdiff2 is more liberal than edgeR.

                    However, according to this paper:
                    Recent advances in next-generation sequencing technology allow high-throughput cDNA sequencing (RNA-Seq) to be widely applied in transcriptomic studies, in particular for detecting differentially expressed genes between groups. Many software packages have been developed for the identification of differentially expressed genes (DEGs) between treatment groups based on RNA-Seq data. However, there is a lack of consensus on how to approach an optimal study design and choice of suitable software for the analysis. In this comparative study we evaluate the performance of three of the most frequently used software tools: Cufflinks-Cuffdiff2, DESeq and edgeR. A number of important parameters of RNA-Seq technology were taken into consideration, including the number of replicates, sequencing depth, and balanced vs. unbalanced sequencing depth within and between groups. We benchmarked results relative to sets of DEGs identified through either quantitative RT-PCR or microarray. We observed that edgeR performs slightly better than DESeq and Cuffdiff2 in terms of the ability to uncover true positives. Overall, DESeq or taking the intersection of DEGs from two or more tools is recommended if the number of false positives is a major concern in the study. In other circumstances, edgeR is slightly preferable for differential expression analysis at the expense of potentially introducing more false positives.


                    it seems that edgeR finds the most true positives.

                    Could this (i.e. Cuffdiff2 being more liberal) be due to the different species? I worked with bovine samples. Or is it a problem with my scripts?
                    I used the default parameters in Cuffdiff2 and edgeR/DESeq2.
                    Thank you!


                    • I expect that the tool that gets the most true positives will vary by dataset. This will also vary wildly by cuffdiff version.



                      • Originally posted by dpryan
                        I expect that the tool that gets the most true positives will vary by dataset. This will also vary wildly by cuffdiff version.
                        Thank you D. Very useful! I got it.
                        So do you think it is because I didn't run Cufflinks? I skipped that step because I don't need to find new genes or assemble transcripts. And for the count-based approach (edgeR/DESeq2), we don't run Cufflinks either.


                        • I don't think using cufflinks or not has anything to do with it. Every tool makes assumptions, it's just a matter of which tool's assumptions happen to match your particular dataset better.



                          • Hi D,
                            Just two questions.
                            1) I am doing RNA-Seq analysis (reads mapped to the human genome) and want to find significantly differentially expressed human genes, but I also find some read counts mapped to miRNAs. Is that normal? Do I need to analyse them separately (because I don't know whether they could affect the read distribution), or analyse everything together?
                            2) BTW, when the log fold change is inf or -inf (I know what that means) but Q < 0.05, is the gene useful for downstream analysis?


                            • 1) A few reads here and there being mapped to miRNAs is pretty common. You'll generally not get enough reads mapping to miRNAs when you look at mRNA to have decent statistical power. However, if you do, then I suppose you can go ahead and use those reads. Have a look at them and see if they're of the mature form or the pri-miRNA, which would seem more likely.

                              2) It could be useful, though that's most common among poorly expressed genes (rather than those with some level of detectable baseline expression). In any case, the real fold-change is unlikely to be infinite, that's just the estimate given the input data.
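The arithmetic behind point 2 can be sketched as follows: the log2 fold change is the log of a ratio of two expression estimates, so it becomes inf or -inf as soon as one of the two estimates is zero. The table, the column names value_1 and value_2, and the numbers are hypothetical and only illustrate the calculation.

```shell
# Made-up expression estimates for two conditions.
printf 'gene\tvalue_1\tvalue_2\n'  > expr.tsv
printf 'geneA\t10\t40\n'          >> expr.tsv
printf 'geneB\t0\t3.2\n'          >> expr.tsv
printf 'geneC\t1.5\t0\n'          >> expr.tsv

# log2(value_2 / value_1), with the zero cases handled explicitly
# because the ratio is undefined or unbounded there.
awk -F '\t' 'BEGIN { OFS = "\t" }
NR == 1 { print $0, "log2fc"; next }
{
    if ($2 == 0 && $3 == 0) lfc = "NA"
    else if ($2 == 0)       lfc = "inf"
    else if ($3 == 0)       lfc = "-inf"
    else                    lfc = log($3 / $2) / log(2)
    print $0, lfc
}' expr.tsv > expr_lfc.tsv

cat expr_lfc.tsv
```

Here geneB and geneC get infinite estimates even though their non-zero values are small, which matches the remark that this mostly happens for poorly expressed genes.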



                              • Originally posted by dpryan
                                1) A few reads here and there being mapped to miRNAs is pretty common. You'll generally not get enough reads mapping to miRNAs when you look at mRNA to have decent statistical power. However, if you do, then I suppose you can go ahead and use those reads. Have a look at them and see if they're of the mature form or the pri-miRNA, which would seem more likely.

                                2) It could be useful, though that's most common among poorly expressed genes (rather than those with some level of detectable baseline expression). In any case, the real fold-change is unlikely to be infinite, that's just the estimate given the input data.

                                Thank you D, very useful.
                                We extracted the RNA from the cells using the TRIzol method followed by a clean-up with the RNeasy Mini Kit, and then removed most of the RNA shorter than 200 nt. Thus, most of the miRNA reads probably come from contamination and are not relevant to our analysis. Do you think they will affect our analysis? Do we need to run a separate analysis without the miRNAs? If the result is still OK, as you said, and the DE human genes are not affected, we are glad. If not, I don't yet know how to remove those miRNAs from the analysis.
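If the miRNA rows do need to be excluded, one possible approach is to drop them from the count table before the count-based analysis. The sketch below is only an illustration and not advice from the thread: the file names, the gene IDs, and the two-column tab-separated layout (gene ID, count) are made up, and the list of miRNA IDs would have to come from your own annotation.

```shell
# Made-up count table (gene ID, read count) and list of miRNA gene IDs.
printf 'ENSG_A\t120\nENSG_MIR1\t4\nENSG_B\t57\nENSG_MIR2\t0\n' > counts.txt
printf 'ENSG_MIR1\nENSG_MIR2\n' > mirna_ids.txt

# First file (NR == FNR): remember every ID to drop.
# Second file: print only rows whose first column is not in that set.
awk -F '\t' 'NR == FNR { drop[$1] = 1; next } !($1 in drop)' \
    mirna_ids.txt counts.txt > counts.no_mirna.txt

cat counts.no_mirna.txt
```

Running the differential expression analysis on both the full and the filtered table would show whether the miRNA rows change the result at all.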
