DEGseq


  • I would like to know what the letters "NA" mean in the DEGseq results.
    Another question: does log2 mean a two-fold change or a four-fold change?
    Thanks

    • In reply to Sol:
      Thanks for your question.

      NA: when the read counts for a gene are zero in both samples, or zero in one sample and a small number (say, <5) in the other, the program does not calculate values such as fold-change and p-value for this gene. "NA" appears in their place.

      log2 means base-2 logarithm. So
      if fold-change = 1, log2(fold-change) = 0;
      if fold-change = 2, log2(fold-change) = 1;
      if fold-change = 4, log2(fold-change) = 2;
      if fold-change = 0.5, log2(fold-change) = -1.
      Xi Wang
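
For anyone who wants to check the arithmetic above, here is a minimal sketch in base R. It is not DEGseq's own code: the helper `log2_ratio_or_na`, its `min_count` threshold of 5, and the counts are made up for illustration, and it uses a raw count ratio without any normalization.

```r
# Base-2 logarithm of a few fold-changes (illustrative values).
fold_change <- c(1, 2, 4, 0.5)
log2_fc <- log2(fold_change)
print(log2_fc)  # 0 1 2 -1

# Hypothetical helper, NOT part of DEGseq: raw log2 ratio of two read
# counts, set to NA when one count is zero and the other is below min_count
# (this also covers the case where both counts are zero).
log2_ratio_or_na <- function(count1, count2, min_count = 5) {
  ratio <- log2(count1 / count2)
  too_low <- (count1 == 0 & count2 < min_count) |
             (count2 == 0 & count1 < min_count)
  ratio[too_low] <- NA
  ratio
}

# Made-up counts for four genes in two samples.
print(log2_ratio_or_na(c(0, 0, 40, 10), c(0, 3, 10, 40)))  # NA NA 2 -2
```

So a log2 value of 1 corresponds to a two-fold change and a log2 value of 2 to a four-fold change.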

      • How is the cutoff value calculated in DEGseq, for the p-value? For example, cutoff = 2.
        Thanks

        • In reply to Sol:
          The cutoffs are specified by the user. If you are asking how the p-values are calculated, please refer to our paper: http://bioinformatics.oxfordjournals.../full/26/1/136

          BTW, a p-value is always a real number between 0 and 1, so a cutoff of 2 cannot apply to the p-value.
          Xi Wang
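
As a rough illustration of a user-chosen p-value cutoff, the sketch below filters a small results table in base R. The data frame, its column names (`gene`, `log2_fold_change`, `p_value`) and the cutoff of 0.001 are all invented for the example and are not DEGseq's actual output format.

```r
# Made-up results table; the column names are assumptions for illustration.
results <- data.frame(
  gene = c("geneA", "geneB", "geneC", "geneD"),
  log2_fold_change = c(1.8, -0.2, NA, -2.5),
  p_value = c(0.0004, 0.62, NA, 0.03),
  stringsAsFactors = FALSE
)

# The cutoff is chosen by the user and must lie between 0 and 1.
p_cutoff <- 0.001
stopifnot(p_cutoff >= 0, p_cutoff <= 1)

# Genes reported as NA have no p-value and are dropped explicitly.
keep <- !is.na(results$p_value) & results$p_value < p_cutoff
significant <- results[keep, ]
print(significant$gene)  # "geneA"
```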

          • Hi,
            Using the SAM-to-BED Perl script, I got a file like this:

            chr1 435837 435913 U0 0 +
            chr1 435837 435913 U0 0 -
            chr1 435837 435913 U1 0 -
            chr1 435838 435914 U1 0 +
            chr1 435838 435914 U1 0 -
            chr1 435838 435914 U1 0 -
            chr1 435840 435916 U2 0 -
            chr1 435840 435916 U2 0 -
            chr1 435840 435916 U3 0 -
            chr1 435840 435916 U2 0 -
            chr1 435842 435918 U4 0 -
            chr1 435842 435918 U4 0 -
            chr1 435844 435920 U2 0 -
            chr1 435844 435920 U2 0 -
            chr1 437189 437265 U2 0 +

            Could someone explain how U0, U1, U2 are assigned and
            what they are?

            Thanks,

            • In reply to wdt:
              U (unique) marks uniquely mapped reads. The script may be treating all reads as unique.

              The integer after it is the number of mismatches.
              Xi Wang
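
Going by the reading above ("U" for unique, followed by the mismatch count), the name column can be split up in base R as sketched below. The column names are the usual BED field names and are assigned here by hand, and the input is just a few of the lines posted earlier.

```r
# A few of the BED lines from the post above.
bed_lines <- c(
  "chr1 435837 435913 U0 0 +",
  "chr1 435837 435913 U0 0 -",
  "chr1 435837 435913 U1 0 -",
  "chr1 435838 435914 U1 0 +",
  "chr1 435840 435916 U2 0 -",
  "chr1 435840 435916 U3 0 -",
  "chr1 435842 435918 U4 0 -"
)

# Parse the whitespace-separated fields; column names are assigned manually.
bed <- read.table(
  text = bed_lines,
  col.names = c("chrom", "start", "end", "name", "score", "strand"),
  colClasses = c("character", "integer", "integer",
                 "character", "integer", "character")
)

# Assumed convention: leading "U" = uniquely mapped, digits = mismatches.
bed$unique <- grepl("^U", bed$name)
bed$mismatches <- as.integer(sub("^U", "", bed$name))

# Number of reads per mismatch count.
print(table(bed$mismatches))
```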

              • I have RNA-seq data aligned with TopHat, which generated a BAM file for each sample.
                Each group (cases/controls) has 5 samples.
                Would the following be the correct way to use DEGseq?
                1. Convert BAMs to SAM to BED using samtools + sam2bed.pl
                2. Use the DEGseq samWrapper to test the 5 samples in one group against the 5 samples in the other
                to identify differentially expressed genes?

                Thanks a lot!
                Last edited by wdt; 11-23-2010, 09:20 PM.

                • Originally posted by wdt
                  I have RNA-seq data analyzed using tophat that generated bam files for each sample.
                  Each group (cases/controls) has 5 samples each.
                  Would the following be correct way to use DEGseq
                  1. Convert BAMs to BED using sam2bed.pl
                  2. Use DEGseq samWrapper to test 5 samples in one group with 5 samples in the other
                  to identify diff expressed genes?

                  Thanks a lot!
                  Agreed. But please note that you first need to convert BAM to SAM using samtools.
                  Xi Wang
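
The BAM-to-SAM step could be scripted from R roughly as follows. This is only a sketch: the file names are placeholders, `bam_to_sam_args` is a hypothetical helper, and it assumes `samtools` is on the PATH. `samtools view` writes SAM without the header by default; add `-h` to the arguments if your downstream tool needs the header. The SAM-to-BED step is left out because the Perl script's command-line options are not given in this thread.

```r
# Hypothetical helper: arguments for `samtools view -o <sam> <bam>`.
bam_to_sam_args <- function(bam_file) {
  sam_file <- sub("\\.bam$", ".sam", bam_file)
  c("view", "-o", sam_file, bam_file)
}

# Placeholder file names; replace with your own BAM files.
bam_files <- c("case1.bam", "control1.bam")

for (bam in bam_files) {
  args <- bam_to_sam_args(bam)
  if (nzchar(Sys.which("samtools")) && file.exists(bam)) {
    # Run the conversion only when samtools and the input file exist.
    system2("samtools", args)
  } else {
    # Otherwise just show the command that would be run.
    message("Would run: samtools ", paste(args, collapse = " "))
  }
}
```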

                  • Many thanks for your quick replies about DEGseq.

                    Once BED files are provided, does DEGseq internally compute "raw counts" that are used for the differential expression analysis?

                    Is there a way to output those raw counts (or equivalent numbers) per sample?

                    Thanks a lot!

                    • In reply to wdt:
                      You can use the script below.

                      Code:
                      library(DEGseq)   # provides getGeneExp()
                      refFlat <- "refFlat.txt"   # gene annotation file in refFlat format
                      mapResultBatch <- c("sample1", "sample2", "sample3", "...")   # replace the file names accordingly
                      geneExpr <- "geneExpr.txt"   # file in which to save the gene expression values
                      getGeneExp(mapResultBatch, refFlat=refFlat, output=geneExpr)
                      Xi Wang

                      • Help with DEGseq

                        Hello all,

                        I have a 1.0 GB data file and was wondering how long it should take the program to load it. All I get after giving the path to sample A is a spinning ball (Mac) that keeps going for half an hour. I don't get the R prompt back, so I just kill the process, thinking something is wrong. Do I have to be patient? The computer has 8 GB of RAM, if that helps.

                        Sample of the BED-format file produced by the samtobed script:

                        chr1 15562 15637 ILLUMINA-927B2F_0001:1:110:7901:1208#0/1 10 +
                        chr1 15564 15636 ILLUMINA-927B2F_0001:1:92:5422:11873#0/1 10 +
                        chr1 15564 15636 ILLUMINA-927B2F_0001:1:117:10103:16792#0/1 10 +
                        chr1 16084 16159 ILLUMINA-927B2F_0001:1:3:3987:6468#0/1 10 -

                        So please let me know whether it's the format or whether I just need patience.
                        Last edited by newbietonextgen; 12-06-2010, 08:07 AM.

                        • Originally posted by newbietonextgen
                          Hello all,

                          I have a 1.0 GB data file and was wondering how long it would take for the program to load this data? All i get after showing the path to sample A, is a spinning ball (mac) that keeps going on for half hour. I just kill the process thinking some thing is wrong. Do i have to be patient ? The computer has 8 gb ram if that help. So please let me know. Thanks
                          What kind of data file did you feed to DEGseq: BED or BAM? Loading 1 GB of data should not usually take that long.
                          Xi Wang

                          • Thanks, Xi, for the quick reply. It was a BED-format file. I converted it using the samTobed tool.

                            • In reply to newbietonextgen:
                              I just saw that you updated the message.
                              Was there any output on the screen?
                              Xi Wang

                              • No. I have tried both ways: giving the full path to the file, and setting the working directory and then giving just the file name. I am using 64-bit R and I am not sure whether that is the problem.

                                This is how the console looks:
                                >library(DEGseq)
                                Loading required package: qvalue
                                Loading Tcl/Tk interface
                                > sample A <- "path to the file (bed.txt)"
                                |

                                So there was no screen message after I hit return...
