  • What tool can I use for Illumina/Solexa RNA-seq data?

    Hi everybody, I'm new to this. I'm from Mexico, and I have total mRNA data from one condition of the bacterium R. etli. The first things I noticed are that some intergenic regions got reads and that the ORFs have different numbers of reads.

    The question is: how can I normalize the RNA-seq data?
    Do I have to use RPKM?
    Is there any tool to normalize the Illumina results?

    Remember that I have results for only one condition, and I think I have to normalize this data before comparing it with another condition.

    Thank you, everybody, for your time, and I'm sorry for the spelling.

  • #2
    Originally posted by Gamaliel
    Hi everybody, I'm new to this. I'm from Mexico, and I have total mRNA data from one condition of the bacterium R. etli. The first things I noticed are that some intergenic regions got reads and that the ORFs have different numbers of reads.

    The question is: how can I normalize the RNA-seq data?
    Do I have to use RPKM?
    Is there any tool to normalize the Illumina results?

    Remember that I have results for only one condition, and I think I have to normalize this data before comparing it with another condition.

    Thank you, everybody, for your time, and I'm sorry for the spelling.

    I can guess what you meant to type. RPKM is recommended if you want to compare samples. A tool named Cufflinks can do this job.
    Xi Wang



    • #3
      Originally posted by Xi Wang
      I can guess what you meant to type. RPKM is recommended if you want to compare samples. A tool named Cufflinks can do this job.

      Hi Xi Wang, the problem is that I have to normalize one condition first, because I have stochastic reads. Within just one condition, I need to know which genes are genuinely transcribed (deterministic transcription) and which reads are the product of stochastic, nonspecific transcription. After that, I will compare with the other condition.

      Thanks for your time and help.



      • #4
        Originally posted by Gamaliel
        Hi Xi Wang, the problem is that I have to normalize one condition first, because I have stochastic reads. Within just one condition, I need to know which genes are genuinely transcribed (deterministic transcription) and which reads are the product of stochastic, nonspecific transcription. After that, I will compare with the other condition.

        Thanks for your time and help.

        Can I understand your question this way: because RNA-seq data contain some background noise, some regions may have reads even though the DNA there is not transcribed? I think many researchers are working on this or related problems, but I have not found any publication yet. A naive approach would be to filter out regions whose RPKM value is below a given cutoff (say, 1 RPKM). You can then compare the remaining regions between conditions.
        Xi Wang
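
A minimal sketch of the naive cutoff filter described above, assuming per-region read counts and lengths are already available. The region names, the counts, and the function name `filter_by_rpkm` are hypothetical; the default of 1 RPKM is only the example cutoff mentioned in the post, not a validated threshold.

```python
def filter_by_rpkm(regions, total_mapped_reads, cutoff=1.0):
    """Keep regions whose RPKM is at least `cutoff`.

    `regions` maps a region name to (read_count, length_in_bp).
    Returns a dict of region name -> RPKM for the regions that pass.
    """
    kept = {}
    for name, (read_count, length_bp) in regions.items():
        # RPKM = reads / (length in kb) / (mapped reads in millions)
        value = read_count * 1e9 / (length_bp * total_mapped_reads)
        if value >= cutoff:
            kept[name] = value
    return kept


# Hypothetical counts: one ORF and one sparsely covered intergenic region.
regions = {"geneA": (50, 1200), "intergenic_1": (2, 1000)}
kept = filter_by_rpkm(regions, total_mapped_reads=10_000_000)
print(sorted(kept))  # only regions at or above the cutoff remain
```

A fixed cutoff is a blunt instrument: it removes weakly expressed genes together with the noise, so the value is worth checking against regions expected to be silent.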



        • #5
          Originally posted by Xi Wang
          Can I understand your question this way: because RNA-seq data contain some background noise, some regions may have reads even though the DNA there is not transcribed? I think many researchers are working on this or related problems, but I have not found any publication yet. A naive approach would be to filter out regions whose RPKM value is below a given cutoff (say, 1 RPKM). You can then compare the remaining regions between conditions.

          Hi Xi Wang, thanks again. Yes, I mean the background of my RNA-seq data. My other question is this: the genome has genes of different lengths. For example:

          Suppose I have two genes: gene "a", which is 1.2 kb long, and gene "b", which is 800 bp long.

          gene "a" __________________________ 1.2 kb
          _________
          _______ ________

          Imagine that I have 50 reads for gene "a".

          gene "b" ________________ 800 bp
          _____ ______ ____
          _____ _____ ____

          And for gene "b" I have 48 reads.

          My question is whether the size of the ORF matters, because gene "a" may have more reads than gene "b" just because of its size, and not because it is transcribed more than gene "b".

          Well, thanks for all your support and help. Have a good day.



          • #6
            If the reads along the transcripts follow the uniform-distribution assumption, you can use the RPKM concept to calculate the proportion of transcribed copies for different genes. RPKM means reads per kilobase of transcript per million mapped reads. Taking your example, suppose the total number of reads in the experiment is 10 million. The RPKM for gene "a" is 50/(1.2 kb)/(10 M) = 4.17, while the RPKM for gene "b" is 48/(0.8 kb)/(10 M) = 6. So gene "b" has a higher expression level than gene "a".
            Xi Wang
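
The arithmetic above can be reproduced with a short sketch. The function name `rpkm` and its parameter names are illustrative choices, and the 10 million total is the assumed library size from the example, not a measured value.

```python
def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    length_kb = length_bp / 1_000
    reads_millions = total_mapped_reads / 1_000_000
    return read_count / length_kb / reads_millions


total = 10_000_000  # assumed library size from the example

gene_a = rpkm(50, 1_200, total)  # about 4.17
gene_b = rpkm(48, 800, total)    # 6.0

print(f"gene a: {gene_a:.2f} RPKM, gene b: {gene_b:.2f} RPKM")
```

Dividing by length is what lets the shorter gene "b" come out ahead even though it has fewer raw reads.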



            • #7
              Originally posted by Xi Wang
              If the reads along the transcripts follow the uniform-distribution assumption, you can use the RPKM concept to calculate the proportion of transcribed copies for different genes. RPKM means reads per kilobase of transcript per million mapped reads. Taking your example, suppose the total number of reads in the experiment is 10 million. The RPKM for gene "a" is 50/(1.2 kb)/(10 M) = 4.17, while the RPKM for gene "b" is 48/(0.8 kb)/(10 M) = 6. So gene "b" has a higher expression level than gene "a".

              Hi Xi Wang, I understand. But the reads along the transcripts are not uniform; that is, the coverage differs along the ORFs (transcripts). Thanks for your help. This is my email, [email protected], if you need some help too.



              • #8
                Thanks Gamaliel.

                So your difficulty is estimating gene expression levels from non-uniformly distributed reads, right? First, RNA-seq experiments that follow the random-priming protocol are supposed to generate reads distributed uniformly from the transcripts' 5' ends to their 3' ends. I still saw some non-uniformity in our data, but RPKM still worked. I think that if the reads in your data are not extremely non-uniformly distributed, RPKM still works. Second, if the non-uniformity still concerns you, you can set aside the concept of ORFs and instead scan the genome with sliding windows to look for read enrichment. The sliding-window size may be a key parameter.
                Xi Wang
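
A rough sketch of the sliding-window idea, assuming each read is represented only by its 0-based start position on a single replicon. The function name `window_counts`, the window and step sizes, and the read positions are all hypothetical; a real analysis would take positions from an alignment file and handle strands and multiple replicons.

```python
from bisect import bisect_left


def window_counts(read_starts, genome_length, window=100, step=50):
    """Count read starts in half-open windows [start, start + window)."""
    starts = sorted(read_starts)
    counts = []
    last_start = max(genome_length - window, 0)
    for w_start in range(0, last_start + 1, step):
        w_end = w_start + window
        # Number of sorted positions p with w_start <= p < w_end.
        n = bisect_left(starts, w_end) - bisect_left(starts, w_start)
        counts.append((w_start, w_end, n))
    return counts


# Hypothetical read start positions on a 200 bp toy genome.
reads = [5, 10, 60, 120, 130, 135]
for w_start, w_end, n in window_counts(reads, genome_length=200):
    print(w_start, w_end, n)
```

Smaller windows resolve sharper boundaries but give noisier counts, which is why the window size matters so much here.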



                • #9
                  Originally posted by Xi Wang
                  Thanks Gamaliel.

                  So your difficulty is estimating gene expression levels from non-uniformly distributed reads, right? First, RNA-seq experiments that follow the random-priming protocol are supposed to generate reads distributed uniformly from the transcripts' 5' ends to their 3' ends. I still saw some non-uniformity in our data, but RPKM still worked. I think that if the reads in your data are not extremely non-uniformly distributed, RPKM still works. Second, if the non-uniformity still concerns you, you can set aside the concept of ORFs and instead scan the genome with sliding windows to look for read enrichment. The sliding-window size may be a key parameter.

                  Hi Xi Wang, sorry for the late reply; I'm at home now.
                  Thanks for your help. I'm interested in using RPKM, but I can't run my data. I think it is because my reference genome needs to be in a different file format. Is that right?

                  Thanks for your help.
                  Have a good day.



                  • #10
                    I think it is the UCSC refFlat format. How can I create that file?



                    • #11
                      Originally posted by Gamaliel
                      Hi Xi Wang, sorry for the late reply; I'm at home now.
                      Thanks for your help. I'm interested in using RPKM, but I can't run my data. I think it is because my reference genome needs to be in a different file format. Is that right?

                      Thanks for your help.
                      Have a good day.

                      I am also at home now. :-)

                      To answer your question, I need to know which tool you used to calculate the RPKM values.

                      Thanks!
                      Xi Wang



                      • #12
                        Originally posted by Gamaliel
                        I think it is the UCSC refFlat format. How can I create that file?

                        You can download the file from the UCSC Genome Browser. E.g., for hg18 RefSeq genes, follow the link below:

                        Xi Wang
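
For a genome with no ready-made download, a refFlat-style file can also be written by hand. The sketch below assumes the 11 tab-separated columns of the UCSC refFlat table (geneName, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds) with 0-based, half-open coordinates, and treats each bacterial ORF as a single exon. The function name `refflat_line`, the gene name, the replicon name, and the coordinates are hypothetical, and the thread does not confirm which columns a particular RPKM tool actually reads.

```python
def refflat_line(gene_name, transcript_name, chrom, strand, start, end):
    """Build one refFlat-style line for a single-exon gene.

    `start` is 0-based and `end` is exclusive (UCSC convention).
    The coding region is assumed to span the whole transcript.
    """
    fields = [
        gene_name,        # geneName
        transcript_name,  # name
        chrom,            # chrom (replicon name in the reference FASTA)
        strand,           # "+" or "-"
        str(start),       # txStart
        str(end),         # txEnd
        str(start),       # cdsStart
        str(end),         # cdsEnd
        "1",              # exonCount
        f"{start},",      # exonStarts (comma-terminated list)
        f"{end},",        # exonEnds (comma-terminated list)
    ]
    return "\t".join(fields)


# Hypothetical 1.2 kb ORF on the forward strand.
line = refflat_line("geneA", "geneA", "chr1", "+", 999, 2199)
print(line)
```

The chromosome column has to match the sequence names in the reference used for mapping, otherwise no reads will be assigned to the genes.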



                        • #13
                          Hi Xi Wang
                          I found your replies in this thread very informative. I have a further question. For differential gene expression analysis (between two samples), do we need to transform and normalize the RPKM values obtained for the genes of each individual sample? I assume RPKM values are already normalized.



                          • #14
                            Originally posted by bansal_raman
                            Hi Xi Wang
                            I found your replies in this thread very informative. I have a further question. For differential gene expression analysis (between two samples), do we need to transform and normalize the RPKM values obtained for the genes of each individual sample? I assume RPKM values are already normalized.

                            Hi,

                            It has been recognized that further normalization is needed if the total numbers of expressed RNA molecules differ between the two samples. See the reference:
                            Robinson MD, Oshlack A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 2010;11(3):R25. Epub 2010 Mar.
                            Xi Wang
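
To illustrate why library-size scaling alone can mislead, here is a deliberately simplified sketch in the spirit of a trimmed mean of log ratios. It is not the published method: it uses no precision weights and does not trim on absolute expression, and the function name `trimmed_mean_ratio`, the 30% trim, and the counts are assumptions chosen for the example.

```python
import math


def trimmed_mean_ratio(counts_1, counts_2, trim=0.3):
    """Typical sample-1/sample-2 ratio of library-scaled counts.

    Genes with a zero count in either sample are skipped; the most
    extreme log ratios at both ends are trimmed before averaging.
    """
    n1, n2 = sum(counts_1), sum(counts_2)
    log_ratios = sorted(
        math.log2((c1 / n1) / (c2 / n2))
        for c1, c2 in zip(counts_1, counts_2)
        if c1 > 0 and c2 > 0
    )
    if not log_ratios:
        raise ValueError("no gene has reads in both samples")
    k = int(len(log_ratios) * trim)
    kept = log_ratios[k:len(log_ratios) - k]
    return 2 ** (sum(kept) / len(kept))


# Hypothetical counts: ten unchanged genes plus one gene that is
# hugely induced in sample 2 and soaks up half of its reads.
sample_1 = [100] * 10 + [10]
sample_2 = [100] * 10 + [1000]

factor = trimmed_mean_ratio(sample_1, sample_2)
print(f"{factor:.3f}")
```

Here the ten unchanged genes have identical raw counts, yet their library-scaled values differ by roughly twofold between samples; a factor far from 1 is the sign that an extra scaling step is needed before comparing RPKM values.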



                            • #15
                              Thanks Xi,
                              Do you think that quality-control plots such as box plots can help determine whether further normalization is required?
