
  • What tool can I use for Illumina/Solexa RNA-seq data?

    Hi everybody, I'm new to this. I'm from Mexico, and I have total mRNA data from one condition of the bacterium R. etli. The first things I see are that some intergenic regions have reads, and that the ORFs have different numbers of reads.

    My question is: how can I normalize the RNA-seq data?
    Should I use RPKM?
    Is there a tool to normalize the Illumina results?

    Remember that I only have results for one condition, and I think I have to normalize this data before comparing it with another condition.

    Thank you, everybody, for your time, and sorry for my spelling.

  • #2
    I can guess what you meant to type. RPKM is recommended if you want to compare between samples. A tool named Cufflinks can do this for you.
    Xi Wang

    • #3
      Hi Xi Wang, the problem is that I have to normalize one condition first, because I have stochastic reads. Within just one condition, I need to know which genes are genuinely transcribed (deterministic transcription) and which reads are the product of stochastic, background-level transcription (nonspecific transcription). After that, I will compare with the other condition.

      Thanks for your time and help.

      • #4
        Can I understand your question this way: RNA-seq data contain some background noise, so some regions may have reads even though the DNA there is not actually transcribed? I think many researchers are working on this or related problems, but I have not found any publication on it yet. A naive approach would be to filter out regions with an RPKM value below a given cutoff (say, 1 RPKM). Then you can compare the remaining regions between conditions.
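A minimal sketch of that naive cutoff filter, assuming RPKM values have already been computed per region. The function name `filter_by_rpkm`, the region names, and the values are hypothetical, and the 1 RPKM threshold is only the example cutoff suggested above, not a validated one.

```python
def filter_by_rpkm(rpkm_by_region, cutoff=1.0):
    """Keep only regions whose RPKM is at least `cutoff`.

    `rpkm_by_region` maps a region name to its RPKM value.
    """
    return {
        region: value
        for region, value in rpkm_by_region.items()
        if value >= cutoff
    }


# Hypothetical values, for illustration only.
rpkm_by_region = {
    "gene_a": 4.17,
    "gene_b": 6.0,
    "intergenic_1": 0.22,
}

expressed = filter_by_rpkm(rpkm_by_region, cutoff=1.0)
print(sorted(expressed))
```

Whatever survives the filter in each condition is what would then be compared between conditions; the right cutoff depends on the data and has to be chosen by inspection.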
        Xi Wang

        • #5
          Hi Xi Wang, thanks again. Yes, I mean the background in my RNA-seq data. My other question is about genes of different lengths in the genome. For example:

          Suppose I have two genes: gene "a", 1.2 kb long, and gene "b", 800 bp long.

          gene "a" __________________________ 1.2 kb
          _________
          _______ ________

          Imagine that I have 50 reads for gene "a".


          gene "b" ________________ 800 bp
          _____ ______ ____
          _____ _____ ____

          And for gene "b" I have 48 reads.

          My question is whether the size of the ORF matters, because gene "a" may have more reads than gene "b" just because of its size, and not because it is transcribed more than gene "b".

          Well, thanks for all your support and help. Have a good day.



          • #6
            If the reads along the transcripts follow the uniform distribution assumption, you can use the RPKM concept to compare the relative abundance of transcripts from different genes. RPKM means reads per kilobase of transcript per million reads. Taking your example, suppose the total number of reads in the experiment is 10 million: the RPKM for gene "a" is 50 / 1.2 (kb) / 10 (million reads) = 4.17, while the RPKM for gene "b" is 48 / 0.8 (kb) / 10 (million reads) = 6. So gene "b" has a higher expression level than gene "a".
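The same arithmetic as a short Python sketch. The function name `rpkm` and its parameter names are just illustrative; the 10 million total reads is the assumed figure from the example, not a property of any real dataset.

```python
def rpkm(read_count, length_bp, total_reads):
    """Reads per kilobase of transcript per million reads."""
    length_kb = length_bp / 1_000
    total_millions = total_reads / 1_000_000
    return read_count / length_kb / total_millions


total_reads = 10_000_000  # assumed total for the example

rpkm_a = rpkm(50, 1200, total_reads)  # gene "a": 50 reads, 1.2 kb
rpkm_b = rpkm(48, 800, total_reads)   # gene "b": 48 reads, 800 bp

print(f"gene a: {rpkm_a:.2f} RPKM")
print(f"gene b: {rpkm_b:.2f} RPKM")
```

Gene "a" has more raw reads, but after dividing by length gene "b" comes out higher, which is the point of the length normalization.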
            Xi Wang

            • #7
              Hi Xi Wang, I understand. But the reads along the transcripts are not uniform; that is, the coverage varies along the ORF (transcript). Thanks for your help. This is my email, [email protected], in case you need some help too.



              • #8
                Thanks Gamaliel.

                So your difficulty is estimating gene expression levels from non-uniformly distributed reads, right? First, RNA-seq experiments that follow the random priming protocol are supposed to generate reads distributed uniformly from the 5' ends to the 3' ends of transcripts. However, I still saw some non-uniformity in our data, and RPKM still worked. I think that if the reads in your data are not extremely non-uniformly distributed, RPKM will still work. Second, if you are still concerned about the non-uniformity, you can set aside the concept of ORFs and instead use sliding windows to scan the genome for read enrichment. The sliding window size is probably a key parameter.
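A rough sketch of the sliding-window idea, assuming you already have the mapped start position of each read on one strand of one replicon. The function name `window_counts`, the window and step sizes, and the read positions are all made up for illustration; choosing the window size and an enrichment threshold is left open.

```python
from bisect import bisect_left


def window_counts(read_starts, genome_length, window=100, step=50):
    """Count reads whose start position falls in each sliding window.

    Returns (window_start, window_end, count) tuples using half-open
    intervals [window_start, window_end).
    """
    starts = sorted(read_starts)
    results = []
    last_start = max(genome_length - window, 0)
    for w_start in range(0, last_start + 1, step):
        w_end = w_start + window
        # Number of sorted start positions in [w_start, w_end).
        count = bisect_left(starts, w_end) - bisect_left(starts, w_start)
        results.append((w_start, w_end, count))
    return results


# Hypothetical read start positions on a 500 bp toy genome.
read_starts = [5, 10, 60, 120, 130, 135, 140, 400]
windows = window_counts(read_starts, genome_length=500, window=100, step=50)

for w_start, w_end, count in windows:
    print(w_start, w_end, count)
```

Windows with many reads relative to the rest of the genome are the candidates for transcribed regions, independent of the ORF annotation.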
                Xi Wang

                • #9
                  Hi Xi Wang, sorry for the late reply; I'm at home already.
                  Thanks for your help. I'm interested in using RPKM, but I can't run my data. I think it's because my reference genome needs to be in a different format. Is that right?


                  Thanks for your help.
                  Have a good day.



                  • #10
                    I think it is the UCSC refFlat format. How can I create that file?

                    • #11
                      I am also at home now. :-)

                      To answer your question, I need to know which tool you used to calculate the RPKM values.

                      Thanks!
                      Xi Wang

                      • #12
                        You can download the refFlat file from the UCSC Genome Browser. E.g., for hg18 RefSeq genes, follow the link below:

                        Xi Wang



                        • #13
                          Hi Xi Wang
                          I found your replies in this thread very informative. I have a further question. For differential gene expression analysis (between two samples), do we need to transform and normalize the RPKM values (obtained for the genes of the individual samples)? I would guess that RPKM values are already normalized.

                          • #14
                            Hi,

                            It has been shown that further normalization is needed if the total numbers of expressed RNA molecules differ between the two samples. See the reference:
                            Robinson MD, Oshlack A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 2010;11(3):R25. Epub 2010 Mar.
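A toy illustration of why this matters. This is not the method from the paper, only a sketch of the underlying problem. It assumes reads are sampled in proportion to molecule abundance times transcript length at a fixed sequencing depth; the function name `expected_rpkm` and all numbers are hypothetical.

```python
def expected_rpkm(abundance, length_bp, depth=10_000_000):
    """Expected RPKM per gene when reads are sampled in proportion to
    (molecule abundance x transcript length) at a fixed depth."""
    weights = {g: abundance[g] * length_bp[g] for g in abundance}
    total_weight = sum(weights.values())
    result = {}
    for gene, weight in weights.items():
        reads = depth * weight / total_weight
        result[gene] = reads / (length_bp[gene] / 1_000) / (depth / 1_000_000)
    return result


# Hypothetical numbers: g1-g3 have identical abundance in both samples,
# but sample B also expresses g4 at a very high level.
length_bp = {"g1": 1000, "g2": 1000, "g3": 1000, "g4": 1000}
sample_a = {"g1": 100, "g2": 100, "g3": 100, "g4": 0}
sample_b = {"g1": 100, "g2": 100, "g3": 100, "g4": 300}

rpkm_a = expected_rpkm(sample_a, length_bp)
rpkm_b = expected_rpkm(sample_b, length_bp)

ratio = rpkm_b["g1"] / rpkm_a["g1"]
print(round(ratio, 3))
```

In this toy case g1 looks half as expressed in sample B even though its true abundance is unchanged, because g4 takes up half of the reads. A scaling normalization between samples is meant to correct for that kind of shift.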
                            Xi Wang



                            • #15
                              Thanks Xi,
                              Do you think that quality-control plots such as box plots can help determine whether further normalization is required?
