
  • Originally posted by dpryan View Post
    We should setup a consultation contract



    If you just want a quick and dirty check then just mapping reads to that gene should suffice. Just tweak your settings to only permit perfect or near perfect matches.



    That should suffice. If you need to know exact read numbers or you need the alignments for SNP calling, then this method isn't ideal. In those cases, you would really need to map to the entire genome so as to not bias alignments (this is also why I suggested only accepting near-perfect matches above).

    Hello D,
    Sorry to trouble you again about this issue.
    I want to check whether a sequence (call it "GQ2") is expressed in bovine cells. I followed the solution from our last discussion: first generate a bowtie2 index for "GQ2" from its FASTA sequence, then map my FASTQ file to that "GQ2" genome.
    But when I did this, I got an error.
    After I started bowtie2, it seemed to go into an endless loop. When I terminated bowtie2, I saw this message:
    "bowtie2-align died with signal 2 (INT)"

    I have googled the error but didn't find any solution. Some posts said it may be due to a memory problem, but I don't think that is my problem. So I am wondering: are my command lines wrong?

    My steps:
    (1) Generate the bowtie2 index of GQ2.

    >GQ2
    ATGGAGCACTTTCCCCGCTGTGTGCACGAGTCCTGGGGTTCCTCAAAGGA
    GCCCCAGAAAACAGAGGTGCTGCAACTCTTGAGCTTAGCGGACCCTGAGG
    .....

    mkdir GQ2Bowtie2Index
    cd GQ2Bowtie2Index
    bowtie2-build GQ2.fa GQ2
    ls
    GQ2.1.bt2 GQ2.2.bt2 GQ2.3.bt2 GQ2.4.bt2 GQ2.fa GQ2.rev.1.bt2 GQ2.rev.2.bt2

    (2) Map the reads onto the 'GQ2' genome.

    bowtie2 --local --very-sensitive-local -p 8 ./GQ2Bowtie2Index/GQ2 -U bovine_sample1.fastq

    Then I get the error!
    I really don't know what I am doing wrong...
    Thank you!
    Attached Files
    Last edited by super0925; 07-31-2014, 02:17 AM.

    Comment


    • You forgot the "-x" before "GQ2Bowtie2Index/GQ2". I presume that was just a typo in this post, though, so you'd have to run bowtie2 in a debugger to find the source of the error. It's likely that this sort of thing is a bug rather than you doing something wrong.
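For reference, a minimal sketch of the two steps with the "-x" in place. The file and index names are the ones from the post above; the output name GQ2_sample1.sam and the guard around the commands are assumptions added for illustration. Two facts may be relevant to the "endless loop": without -S, bowtie2 writes its SAM output to standard output, and signal 2 (INT) is the interrupt signal sent by Ctrl-C, so that message is what a manually terminated run reports.

```shell
#!/bin/sh
# Names taken from the post; "out" is a hypothetical output file name.
ref=GQ2.fa
index=GQ2Bowtie2Index/GQ2
reads=bovine_sample1.fastq
out=GQ2_sample1.sam

# Only run when bowtie2 and both input files are actually present.
if command -v bowtie2-build >/dev/null 2>&1 && [ -f "$ref" ] && [ -f "$reads" ]; then
    mkdir -p "$(dirname "$index")"
    # Build the index from the FASTA file.
    bowtie2-build "$ref" "$index"
    # -x names the index, -U the unpaired reads, -S the SAM output file.
    bowtie2 --local --very-sensitive-local -p 8 -x "$index" -U "$reads" -S "$out"
    status=ran
else
    echo "bowtie2 or input files not found; nothing was run" >&2
    status=skipped
fi
```

With -S given, the alignments go to a file and only the summary is printed when the run finishes.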

      Comment


      • Originally posted by dpryan View Post
        You forgot the "-x" before "GQ2Bowtie2Index/GQ2". I presume that was just a typo in this post, though, so you'd have to run bowtie2 in a debugger to find the source of the error. It's likely that this sort of thing is a bug rather than you doing something wrong.
        Sorry D, even if I add the "-x", I still get the same error... T__T
        Last edited by super0925; 07-31-2014, 05:30 AM.

        Comment


        • It's probably a bug in bowtie2 then. If you can subset your fastq file to a reasonable size and can still reproduce the issue then you can either post that and the gene you're aligning against and I'll have a look or you can then just directly submit a bug report to the bowtie2 authors.

          BTW, you don't need to specify --local if you also specify --very-sensitive-local.
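One simple way to subset a FASTQ file for a bug report is to keep the first N reads. The sketch below assumes the usual four-lines-per-read layout (no wrapped sequence lines) and uses a tiny made-up file, demo.fastq, as a stand-in for the real one.

```shell
#!/bin/sh
# Tiny 3-read FASTQ so the sketch is self-contained (stand-in for the real file).
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n@r3\nGGCC\n+\nIIII\n' > demo.fastq

# Each read occupies 4 lines, so the first N reads are the first 4*N lines.
n_reads=2
head -n "$((n_reads * 4))" demo.fastq > demo_subset.fastq

# Report how many lines were kept.
wc -l < demo_subset.fastq
```

Taking the first reads is not a random sample, but for reproducing a crash or hang that usually does not matter.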

          Comment


          • Originally posted by dpryan View Post
            It's probably a bug in bowtie2 then. If you can subset your fastq file to a reasonable size and can still reproduce the issue then you can either post that and the gene you're aligning against and I'll have a look or you can then just directly submit a bug report to the bowtie2 authors.

            BTW, you don't need to specify --local if you also specify --very-sensitive-local.

            Hi teacher, another question.
            I am analysing 6 samples in 2 conditions.
            From Cuffdiff2, I got ~700 DE genes (at the default threshold, Q < 0.05).
            But with edgeR and DESeq2, I didn't find any DE genes at the default threshold (i.e. Q < 0.05).
            Why? Do I need to change any settings or parameters? (Currently I use the defaults.)
            Could I stick with Tuxedo? Or should I change a parameter (e.g. the P-value threshold) for the count-based methods?

            I have attached the MA plot and MDS plot.

            Thank you!
            Attached Files

            Comment


            • You can always change the threshold for significance a bit (0.1 is very common). I'm not familiar enough with the inner workings of cuffdiff2 to provide any insights there.
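To see what relaxing the threshold does, the adjusted p-values can be counted at both cutoffs once the results are exported to a text table. This is only a sketch: the file demo_results.tsv, its two-column layout, and the values in it are made up, and rows with NA in the adjusted p-value column are skipped.

```shell
#!/bin/sh
# Made-up results table: gene name and adjusted p-value, tab-separated.
printf 'gene\tpadj\ngeneA\t0.03\ngeneB\t0.07\ngeneC\t0.20\ngeneD\tNA\n' > demo_results.tsv

# Count genes whose adjusted p-value is below the given threshold.
count_sig() {
    awk -F '\t' -v thr="$1" \
        'NR > 1 && $2 != "NA" && ($2 + 0) < (thr + 0) { n++ } END { print n + 0 }' \
        demo_results.tsv
}

count_sig 0.05
count_sig 0.1
```

Whichever cutoff is used, it is best chosen before looking at the gene list rather than tuned until something appears.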

              Comment


              • Originally posted by dpryan View Post
                You can always change the threshold for significance a bit (0.1 is very common). I'm not familiar enough with the inner workings of cuffdiff2 to provide any insights there.
                Thank you D. So do the MDS and MA plots look OK? I ask because the two groups are not completely separated.
                Last edited by super0925; 08-13-2014, 11:56 PM.

                Comment


                • From the MDS plot, I'd guess that edgeR and DESeq2 are correct and there aren't any DE genes, but it's usually a good idea to not read too much into these. At the very least it's likely that the experiment is underpowered.

                  Comment


                  • Originally posted by dpryan View Post
                    From the MDS plot, I'd guess that edgeR and DESeq2 are correct and there aren't any DE genes, but it's usually a good idea to not read too much into these. At the very least it's likely that the experiment is underpowered.
                    Hi D,
                    A quick question.
                    I showed you my comparison of the results from the two pipelines before:

                    tophat-bam-cuffdiff2 (I don't use cufflinks because I don't need to find new genes)
                    tophat-htseq-deseq2/edgeR

                    I found that Cuffdiff2 is more liberal than edgeR.

                    However, according to this paper:
                    Recent advances in next-generation sequencing technology allow high-throughput cDNA sequencing (RNA-Seq) to be widely applied in transcriptomic studies, in particular for detecting differentially expressed genes between groups. Many software packages have been developed for the identification of differentially expressed genes (DEGs) between treatment groups based on RNA-Seq data. However, there is a lack of consensus on how to approach an optimal study design and choice of suitable software for the analysis. In this comparative study we evaluate the performance of three of the most frequently used software tools: Cufflinks-Cuffdiff2, DESeq and edgeR. A number of important parameters of RNA-Seq technology were taken into consideration, including the number of replicates, sequencing depth, and balanced vs. unbalanced sequencing depth within and between groups. We benchmarked results relative to sets of DEGs identified through either quantitative RT-PCR or microarray. We observed that edgeR performs slightly better than DESeq and Cuffdiff2 in terms of the ability to uncover true positives. Overall, DESeq or taking the intersection of DEGs from two or more tools is recommended if the number of false positives is a major concern in the study. In other circumstances, edgeR is slightly preferable for differential expression analysis at the expense of potentially introducing more false positives.


                    it seems that edgeR finds the most true positives.

                    Could this (i.e. Cuffdiff2 being more liberal) happen because of the different species? I ran it on bovine samples. Or is it a problem with my scripts?
                    I used the default parameters in Cuffdiff2 and edgeR/DESeq2.
                    Thank you!
                    Attached Files

                    Comment


                    • I expect that the tool that gets the most true positives will vary by dataset. This will also vary wildly by cuffdiff version.

                      Comment


                      • Originally posted by dpryan View Post
                        I expect that the tool that gets the most true positives will vary by dataset. This will also vary wildly by cuffdiff version.
                        Thank you D. Very useful! I got it.
                        So do you think it is because I didn't run Cufflinks? I omitted that step because I don't need to find new genes or assemble transcripts. And for the count-based approach (edgeR/DESeq2), we don't run Cufflinks either.

                        Comment


                        • I don't think using cufflinks or not has anything to do with it. Every tool makes assumptions, it's just a matter of which tool's assumptions happen to match your particular dataset better.

                          Comment


                            • Hi D,
                            Just two questions.
                            1) I am doing RNA-Seq analysis (reads mapped to the human genome) and want to find significantly differentially expressed human genes, but I also find some reads mapped to miRNAs. Is that normal? Do I need to analyse them separately (because I don't know whether they could affect the read distribution), or analyse everything together?
                            2) BTW, if the log fold change is inf or -inf (I know what that means) but Q < 0.05, is it useful for downstream analysis?

                            Comment


                            • 1) A few reads here and there being mapped to miRNAs is pretty common. You'll generally not get enough reads mapping to miRNAs when you look at mRNA to have decent statistical power. However, if you do, then I suppose you can go ahead and use those reads. Have a look at them and see if they're of the mature form or the pri-miRNA, which would seem more likely.

                              2) It could be useful, though that's most common among poorly expressed genes (rather than those with some level of detectable baseline expression). In any case, the real fold-change is unlikely to be infinite, that's just the estimate given the input data.
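The infinite estimate arises when one condition has a value of zero, so the ratio has a zero in the numerator or denominator. A common way to get a finite number for ranking or plotting is to add a small pseudocount to both values first. The sketch below assumes a pseudocount of 1 and made-up expression values; it is an illustration only, not what any of the tools discussed here do internally.

```shell
#!/bin/sh
# log2 fold change of b over a, with a pseudocount (default 1) added to both.
log2fc() {
    awk -v a="$1" -v b="$2" -v pc="${3:-1}" \
        'BEGIN { printf "%.2f\n", log((b + pc) / (a + pc)) / log(2) }'
}

log2fc 0 7    # condition A is zero: raw ratio is infinite, shrunken value is finite
log2fc 3 15   # both conditions expressed
```

The size of the pseudocount matters most for exactly these low-expression genes, so the resulting values are better treated as rough rankings than as measured fold changes.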

                              Comment


                              • Originally posted by dpryan View Post
                                1) A few reads here and there being mapped to miRNAs is pretty common. You'll generally not get enough reads mapping to miRNAs when you look at mRNA to have decent statistical power. However, if you do, then I suppose you can go ahead and use those reads. Have a look at them and see if they're of the mature form or the pri-miRNA, which would seem more likely.

                                2) It could be useful, though that's most common among poorly expressed genes (rather than those with some level of detectable baseline expression). In any case, the real fold-change is unlikely to be infinite, that's just the estimate given the input data.

                                Thank you D, very useful.
                                We extracted the RNA from the cells using the TRIzol method, followed by a clean-up with the RNeasy Mini kit. We then removed most of the RNA shorter than 200 nt in length. Thus, most of the miRNA reads probably come from contamination and are not relevant to our analysis. Do you think they will affect our analysis, and do we need to run a separate analysis without the miRNAs? If the result is still OK, as you said, and the DE human genes are not affected, we are glad. If not, I don't yet know how to remove those miRNAs from the analysis.
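If the miRNA rows do need to be dropped before the count-based analysis, one crude option is to filter the count table by gene name. The sketch below rests on several assumptions: a tab-separated table with the gene symbol in the first column, made-up counts, and miRNA genes recognisable by a symbol that starts with "MIR" followed by a digit. That pattern misses some miRNA names and can catch a few non-miRNA genes, so filtering on the biotype in the annotation would be more reliable.

```shell
#!/bin/sh
# Made-up count table: gene symbol plus counts for two samples.
printf 'gene\ts1\ts2\nGAPDH\t900\t870\nMIR21\t4\t6\nTP53\t120\t150\nMIR155\t0\t2\n' > demo_counts.tsv

# Keep the header line and every row whose symbol does not look like MIR<digit>...
awk -F '\t' 'NR == 1 || $1 !~ /^MIR[0-9]/' demo_counts.tsv > demo_counts_no_mirna.tsv

cat demo_counts_no_mirna.tsv
```

Running the analysis with and without these rows is a quick way to check whether they change the list of DE genes at all.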

                                Comment
