  • ytmnd85
    Member
    • Apr 2009
    • 10

    Illumina - RNA Seq. - Gene Expression Analysis

    Hello all,
    I have just delved into the world of RNA-Sequencing via Illumina, and as such, do not have much knowledge on the software available. Our lab has just performed RNA-seq. analysis on soybean RNA. The current goal of our lab is to compare our soybean samples and attempt to detect genes that are upregulated or downregulated between the different samples. I've noticed that Illumina offers a software package called "GenomeStudio" that can do analysis like this, but it currently seems to only be compatible with human, mouse, or rat genomes. Does anyone have suggestions on available software that could be used to compare transcription levels between samples? We just received our data from Illumina today, and the reads have been aligned to the soybean genome using Eland.

    Thank you for your advice!
  • alim
    Member
    • Jun 2008
    • 14

    #2
    Is there a soybean reference genome available ?

    If there is, then consider using TopHat or ERANGE (use the latter only if you are ambitious enough to build your own Cistematic genome) to get RPKM numbers. Once you have those, you can treat them as microarray intensities and put them through the same normalization procedures as microarrays, etc.
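For anyone unsure what an RPKM number is, here is a minimal sketch of the usual definition (reads per kilobase of transcript per million mapped reads). It is not how TopHat or ERANGE compute it internally. The function names, the pseudocount, and the example numbers are all assumptions made up for illustration.

```python
import math

def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (float(gene_length_bp) * total_mapped_reads)

def log2_intensity(value, pseudocount=1.0):
    """Log2-transform an RPKM value so it can be handled like a microarray
    intensity. The pseudocount is an assumption; it only avoids log(0)."""
    return math.log(value + pseudocount, 2)

# Hypothetical numbers: 500 reads on a 2,000 bp gene, 10 million mapped reads.
example = rpkm(500, 2000, 10000000)
print(example)  # 25.0
print(log2_intensity(example))
```

The log2 step is only one common way to put the values on a microarray-like scale before normalization.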

    Ali

    Comment

    • ytmnd85
      Member
      • Apr 2009
      • 10

      #3
      Yes, a soy reference genome is available at phytozome. I'm not exactly sure how complete it is. It's currently in its first chromosome-scale assembly.

      Comment

      • alim
        Member
        • Jun 2008
        • 14

        #4
        Good - then take a look at TopHat & see if it does what you need it to do!

        Comment

        • liux
          Member
          • Mar 2009
          • 30

          #5
          DNAstar has a new add-on RNAseq module, Qseq, for its ArrayStar. You should be able to get a 30 day free demo.

          Comment

          • Malabady
            Member
            • Apr 2009
            • 12

            #6
            Yes, there is a reference genome for soybean, made by JGI and DOE. Here is a link to the Soybean:

            Comment

            • elaney_k
              Member
              • Mar 2008
              • 55

              #7
              If you have a reference genome, then you can use GenomeStudio; the human, rat, and mouse genomes are just supplied for convenience.

              Comment

              • kkamerath
                Junior Member
                • Jun 2008
                • 3

                #8
                Originally posted by alim
                Is there a soybean reference genome available ?

                If there is, then consider using TopHat or ERANGE (use the latter only if you are ambitious enough to build your own Cistematic genome) to get RPKM numbers. Once you have those, you can treat them as microarray intensities and put them through the same normalization procedures as microarrays, etc.

                Ali
                Hi Ali and all,
                If I were to be so ambitious as to want to assemble my own Cistematic genome, can you suggest where to start? I am browsing around the Cistematic code looking for some documentation on this, but haven't found any. Sorry if I missed it.

                Comment

                • schmima
                  Member
                  • Apr 2010
                  • 56

                  #9
                  Hi all,

                  Regarding @kkamerath's question (it is already quite old, but the answer may help someone else, like me ^^):
                  Important: the following will NOT work if you renamed any folders or scripts in the cistematic directory. If you did so, download it again and unpack it to get a clean installation.

                  If you want to build your own cistematic genome:
                  1. Check whether your genome is supported: open .../cistematic/genomes/__init__.py and search for the line:

                  supportedGenomes = [...]

                  2. If your organism is listed there, you are lucky: proceed with the next steps. Otherwise I am not able to help (most probably you will have to write your own organism.py and update the __init__.py in the cistematic/genomes folder).

                  3. Now you need to download the files required to build the genome. I will give the example I used for making the TAIR9 genome (therefore: arabidopsis.py; if you want to build a human genome, look at human.py instead and change it later on):

                  go to .../cistematic/genomes/arabidopsis.py and have a look at the function buildArabidopsisDB - namely search for lines where the script points to directories that potentially contain information about your genome. You will find that for arabidopsis the script requires:
                  ---> the fasta files: chr1.fas, chr2.fas, chr3.fas, chr4.fas, chr5.fas, chrM.fas, chrC.fas
                  ---> GFF3 file with genes/transposons/whatever: TAIRX_GFF3_XXX.gff
                  ---> functional descriptions: TAIRX_functional_descriptions
                  ---> GO terms: ATH_GO_GOSLIM.txt
                  Luckily, the files in the original arabidopsis.py are named exactly the way they are named on the TAIR FTP server, so the required files are easy to find. Once you have found them, download them. I guess that you will find similar things for the other supported organisms. Just get the required files and continue with step 4.

                  4. Update the paths in arabidopsis.py. Here is an example (excerpt):
                  (...............)
                  geneDB = cisRoot + '/A_thalianaTAIR9/arabidopsis.genedb'
                  (................)
                  def buildArabidopsisDB(db=geneDB, downloadDir=cisRoot + '/A_thalianaTAIR9/FASfiles'):
                      genePath = downloadDir + '/TAIR9_GFF3_genes_transposons.gff'
                      annotPath = downloadDir + '/TAIR9_functional_descriptions'
                      goPath = downloadDir + '/ATH_GO_GOSLIM.txt'

                      chromos = {'1': downloadDir + '/chr1.fas', '2': downloadDir + '/chr2.fas',
                                 '3': downloadDir + '/chr3.fas', '4': downloadDir + '/chr4.fas', '5': downloadDir + '/chr5.fas',
                                 'C': downloadDir + '/chrC.fas', 'M': downloadDir + '/chrM.fas'}
                  (...............)
                  You may need several tries to get all the paths correct. In particular, the "CISTEMATIC_ROOT" variable needs to be set properly before running the scripts (see step 5).

                  5. Now everything should be ready for building the genome. Here are the commands to enter in the shell (with explanations):
                  - > very important: set PYTHONPATH and CISTEMATIC_ROOT. I set both to the directory where the folder "cistematic" is located:

                  export PYTHONPATH=/home/marc/ERANGE/ERANGE31
                  export CISTEMATIC_ROOT=/home/marc/ERANGE/ERANGE31

                  - > open python:

                  python

                  - > import the whole cistematic package:

                  from cistematic import *

                  - > run the command that builds your genome. Note that its name will most probably be something like genomes.GOI.buildGOIDB(). [Hint: "genomes" points to the directory genomes, "GOI" points to your GOI.py script, and buildGOIDB calls the function that builds the genome (this function is defined in GOI.py; it is where you updated the paths).]

                  genomes.arabidopsis.buildArabidopsisDB()

                  - > exit the python environment:

                  exit()
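Since getting the paths right may take several tries, the checks implied by steps 3 to 5 can be sketched as a small pre-flight script to run before the build. This is not part of Cistematic. `missing_build_inputs` and `preflight` are hypothetical helpers, and the directory layout and file names are simply the ones from the TAIR9 example in step 4.

```python
import os

# File names taken from the TAIR9 example in step 4.
TAIR9_FILES = [
    'TAIR9_GFF3_genes_transposons.gff',
    'TAIR9_functional_descriptions',
    'ATH_GO_GOSLIM.txt',
    'chr1.fas', 'chr2.fas', 'chr3.fas', 'chr4.fas', 'chr5.fas',
    'chrC.fas', 'chrM.fas',
]

def missing_build_inputs(download_dir, filenames):
    """Return the expected input files that are absent from download_dir."""
    return [name for name in filenames
            if not os.path.isfile(os.path.join(download_dir, name))]

def preflight(environ=os.environ):
    """Collect problems that would make the build fail.
    An empty list means the inputs look ready."""
    problems = []
    cis_root = environ.get('CISTEMATIC_ROOT')
    if not cis_root:
        problems.append('CISTEMATIC_ROOT is not set')
        return problems
    # Assumed layout: <CISTEMATIC_ROOT>/A_thalianaTAIR9/FASfiles, as in step 4.
    download_dir = os.path.join(cis_root, 'A_thalianaTAIR9', 'FASfiles')
    for name in missing_build_inputs(download_dir, TAIR9_FILES):
        problems.append('missing file: ' + os.path.join(download_dir, name))
    return problems

for problem in preflight():
    print(problem)
```

If nothing is printed, the environment variable is set and every file is where the edited arabidopsis.py expects it.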

                  That's it. In case you also want to update the genome sizes, type the following in a shell:

                  python
                  fasDir = '/home/marc/ERANGE/ERANGE31/A_thalianaTAIR9/FASfiles'
                  chromos = {'1': fasDir + '/chr1.fas', '2': fasDir + '/chr2.fas', '3': fasDir + '/chr3.fas', '4': fasDir + '/chr4.fas', '5': fasDir + '/chr5.fas', 'C': fasDir + '/chrC.fas', 'M': fasDir + '/chrM.fas'}

                  def chromSize(chromID, chromPath):
                      # Count the bases in a FASTA file that holds a single sequence.
                      # chromID is unused; it is kept so the call signature stays the same.
                      seqLen = 0
                      inFile = open(chromPath, 'r')
                      inFile.readline()  # skip the FASTA header line
                      for line in inFile:
                          seqLen += len(line.strip())
                      inFile.close()
                      return seqLen

                  def genomeSize():
                      # Map each chromosome ID to its length.
                      chroLen = {}
                      for chromID in ['1', '2', '3', '4', '5', 'C', 'M']:
                          chroLen[chromID] = chromSize(chromID, chromos[chromID])
                      return chroLen

                  resultingSizes = genomeSize()
                  print(resultingSizes)
                  print(sum(resultingSizes.values()))  # total genome size
                  exit()
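The size calculation above assumes one sequence per file, as with the TAIR chr*.fas files. For a genome that comes as a single FASTA file with several records, a variant along these lines might be closer to what is needed. This is only a sketch. `fastaLengths` is a hypothetical helper, and the record names and sequences in the demo are made up.

```python
def fastaLengths(lines):
    """Map each record name (first word after '>') to its sequence length.
    `lines` can be an open file or any iterable of strings."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith('>'):
            parts = line[1:].split()
            name = parts[0] if parts else ''
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths

# Made-up records, only to show the call.
demo = ['>Gm01 hypothetical record', 'ACGTAC', 'GT', '>Gm02', 'AAAA']
sizes = fastaLengths(demo)
print(sizes)
print(sum(sizes.values()))  # 12
```

With a real file, pass the open file object instead of the demo list.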

                  Hope this helps. In case you're not familiar with Python: search for a course called "Introduction to Programming using Python - Programming Course for Biologists at the Pasteur Institute". Thanks to chapter 14 I understood what to do ^^

                  By the way, I have a question of my own:

                  In arabidopsis.py there is the entry background = {...}. Can anyone tell me what this is, what it is required for, and how it is calculated?

                  Comment

                  • Karl_JV
                    Junior Member
                    • Dec 2009
                    • 4

                    #10
                    I'm very sorry that I cannot answer your question. It would be interesting to know.

                    I would just like to thank you for the superb guide. It worked very well for me!

                    The only thing I would like to add is that the folder, in the above case "cisRoot + '/A_thalianaTAIR9'", needs to be created manually beforehand.
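Creating that folder can also be scripted. A minimal sketch follows. `ensureGenomeDir` is a hypothetical helper, not something Cistematic provides, and the folder name is the one from the guide above.

```python
import os

def ensureGenomeDir(cisRoot, genomeName):
    """Create cisRoot/genomeName if it does not exist yet and return the path."""
    path = os.path.join(cisRoot, genomeName)
    if not os.path.isdir(path):
        os.makedirs(path)
    return path

# Only acts when CISTEMATIC_ROOT is set in the environment.
cisRoot = os.environ.get('CISTEMATIC_ROOT')
if cisRoot:
    print(ensureGenomeDir(cisRoot, 'A_thalianaTAIR9'))
```

Calling it a second time is harmless, because the folder is only created when it is missing.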

                    Comment
