**alim** · 04-14-2009, 08:35 AM

Is there a soybean reference genome available?

If there is, consider using TopHat or ERANGE (use the latter only if you are ambitious enough to build your own Cistematic genome) to get RPKM numbers. Once you have those, you can treat them like microarray intensities and put them through the same normalization procedures as microarray data.
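
For readers who have not met RPKM before, here is a minimal sketch of the arithmetic behind the number (reads per kilobase of transcript per million mapped reads). The function name `rpkm` and its parameter names are made up for illustration; this is not how TopHat or ERANGE are invoked, and those tools apply their own read-counting rules before this step.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = 1e9 * C / (N * L), where C is the number of reads mapped to the
    transcript, N the total number of mapped reads in the sample, and L the
    transcript length in base pairs.
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total read count must be positive")
    return 1e9 * read_count / (float(transcript_length_bp) * total_mapped_reads)


if __name__ == "__main__":
    # 500 reads on a 2 kb transcript in a sample with 10 million mapped reads.
    print(rpkm(500, 2000, 10000000))  # 25.0
```

Because the value is scaled by both transcript length and sequencing depth, it can be compared across genes and samples in roughly the way array intensities are.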

Ali

**ytmnd85** · 04-14-2009, 11:11 AM

Yes, a soy reference genome is available at Phytozome. I'm not exactly sure how complete it is; it is currently in its first chromosome-scale assembly.

**alim** · 04-16-2009, 02:38 PM

Good - then take a look at TopHat & see if it does what you need it to do!

**liux** · 04-24-2009, 07:51 AM

DNAstar has a new add-on RNAseq module, Qseq, for its ArrayStar. You should be able to get a 30-day free demo.

**Malabady** · 04-24-2009, 12:21 PM

Yes, there is a soybean reference genome, made by JGI and DOE. Here is the link:

http://www.phytozome.net/soybean

**elaney_k** · 04-27-2009, 09:24 AM

If you have a reference genome, you can use Genome Studio; the human, rat, and mouse genomes are only supplied for convenience.

**kkamerath** · 07-02-2009, 06:30 AM

> Originally posted by alim:
>
> Is there a soybean reference genome available?
>
> If there is, consider using TopHat or ERANGE (use the latter only if you are ambitious enough to build your own Cistematic genome) to get RPKM numbers. Once you have those, you can treat them like microarray intensities and put them through the same normalization procedures as microarray data.
>
> Ali

Hi Ali and all,
If I were so ambitious as to want to assemble my own Cistematic genome, can you suggest where to start? I have been browsing the Cistematic code looking for documentation on this, but haven't found any.

Sorry if I missed it.

**schmima** · 04-25-2010, 11:50 PM

Hi all,

Regarding @kkamerath's question (it is quite old already, but an answer may help someone else, as it would have helped me ^^):
Important: the following will NOT work if you renamed any folders or scripts in the cistematic directory. If you did, download it again and unpack it to get a clean installation.

If you want to build your own cistematic genome:
1. Check whether your genome is supported: open .../cistematic/genomes/__init__.py and search for the line

    supportedGenomes = [...]

2. If your organism is listed there, you are in luck; proceed with the next steps. Otherwise I can't help (most probably you would have to write your own organism.py and update the __init__.py in the cistematic/genomes folder).

3. Download the files required to build the genome. I will use the example of building the TAIR9 genome (hence arabidopsis.py; for a human genome you would look at human.py and change that file instead):

Open .../cistematic/genomes/arabidopsis.py and look at the function buildArabidopsisDB. Search for the lines where the script points to files that hold information about your genome. For arabidopsis the script requires:
---> the FASTA files: chr1.fas, chr2.fas, chr3.fas, chr4.fas, chr5.fas, chrM.fas, chrC.fas
---> a GFF3 file with genes/transposons/etc.: TAIRX_GFF3_XXX.gff
---> functional descriptions: TAIRX_functional_descriptions
---> GO terms: ATH_GO_GOSLIM.txt
Luckily, the files in the original arabidopsis.py are named exactly as they are on the TAIR FTP server, so the required files are easy to find. Once you have found them, download them. I guess you will find something similar for the other supported organisms. Just get the required files and continue with step 4.

4. Update the paths in arabidopsis.py. Here is an example (excerpt):

    # (...)
    geneDB = cisRoot + '/A_thalianaTAIR9/arabidopsis.genedb'
    # (...)
    def buildArabidopsisDB(db=geneDB, downloadDir=cisRoot + '/A_thalianaTAIR9/FASfiles'):
        genePath = downloadDir + '/TAIR9_GFF3_genes_transposons.gff'
        annotPath = downloadDir + '/TAIR9_functional_descriptions'
        goPath = downloadDir + '/ATH_GO_GOSLIM.txt'

        chromos = {'1': downloadDir + '/chr1.fas', '2': downloadDir + '/chr2.fas',
                   '3': downloadDir + '/chr3.fas', '4': downloadDir + '/chr4.fas',
                   '5': downloadDir + '/chr5.fas',
                   'C': downloadDir + '/chrC.fas', 'M': downloadDir + '/chrM.fas'}
        # (...)

You may need several tries to get all the paths correct. In particular, the CISTEMATIC_ROOT variable must be set properly before running the scripts (see step 5).

5. Now everything should be ready for building the genome. Here are the commands to enter, with explanations:
-> Very important: set PYTHONPATH and CISTEMATIC_ROOT in the shell. I set both to the directory that contains the folder "cistematic":

    export PYTHONPATH=/home/marc/ERANGE/ERANGE31
    export CISTEMATIC_ROOT=/home/marc/ERANGE/ERANGE31

-> Open Python:

    python

-> Import the whole cistematic package:

    from cistematic import *

-> Run the command that builds your genome. Its name will most probably be something like genomes.GOI.buildGOIDB(). [Hint: "genomes" points to the directory genomes, "GOI" points to your GOI.py script, and buildGOIDB is the function that builds the genome (it is defined in GOI.py, where you updated the paths).]

    genomes.arabidopsis.buildArabidopsisDB()

-> Exit the Python environment:

    exit()
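
Since getting the paths right can take several tries, a small pre-flight check run before the build may save some time. The following is only a sketch: the helper names (`missing_env_vars`, `parse_supported_genomes`, `missing_files`) and the `TAIR9_FILES` list are made up for this example and are not part of Cistematic or ERANGE. It assumes that `supportedGenomes` in `genomes/__init__.py` is a plain list of string literals and that the files sit in the `A_thalianaTAIR9/FASfiles` folder used in the TAIR9 example.

```python
import ast
import os
import re

# File names from the TAIR9 example above; other genomes need other names.
TAIR9_FILES = [
    "chr1.fas", "chr2.fas", "chr3.fas", "chr4.fas", "chr5.fas",
    "chrC.fas", "chrM.fas",
    "TAIR9_GFF3_genes_transposons.gff",
    "TAIR9_functional_descriptions",
    "ATH_GO_GOSLIM.txt",
]


def missing_env_vars(env, names=("PYTHONPATH", "CISTEMATIC_ROOT")):
    """Return the variable names that are unset or empty in the mapping env."""
    return [name for name in names if not env.get(name)]


def parse_supported_genomes(init_source):
    """Extract the supportedGenomes list from the text of genomes/__init__.py.

    Assumption: the assignment is a plain list of string literals.
    Returns an empty list if no such assignment is found.
    """
    match = re.search(r"supportedGenomes\s*=\s*(\[.*?\])", init_source, re.S)
    return ast.literal_eval(match.group(1)) if match else []


def missing_files(download_dir, file_names):
    """Return the expected file names that are not present in download_dir."""
    return [name for name in file_names
            if not os.path.isfile(os.path.join(download_dir, name))]


if __name__ == "__main__":
    print("Unset variables: %s" % missing_env_vars(os.environ))
    root = os.environ.get("CISTEMATIC_ROOT")
    if root:
        init_path = os.path.join(root, "cistematic", "genomes", "__init__.py")
        if os.path.isfile(init_path):
            with open(init_path) as handle:
                print("Supported genomes: %s"
                      % parse_supported_genomes(handle.read()))
        fas_dir = os.path.join(root, "A_thalianaTAIR9", "FASfiles")
        print("Missing files: %s" % missing_files(fas_dir, TAIR9_FILES))
```

The script only reads the environment and the file system and never imports cistematic, so it can be run from any directory before the build is started.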

That's it. If you also want to update the genome sizes, start Python from a shell:

    python

and enter:

    # Directory holding the chromosome FASTA files (adjust to your installation).
    fasDir = '/home/marc/ERANGE/ERANGE31/A_thalianaTAIR9/FASfiles'
    chromos = {'1': fasDir + '/chr1.fas', '2': fasDir + '/chr2.fas', '3': fasDir + '/chr3.fas', '4': fasDir + '/chr4.fas', '5': fasDir + '/chr5.fas', 'C': fasDir + '/chrC.fas', 'M': fasDir + '/chrM.fas'}

    def chromSize(chromPath):
        # Return the sequence length of one chromosome FASTA file.
        seqLen = 0
        with open(chromPath, 'r') as inFile:
            for line in inFile:
                # Skip the FASTA header; count only sequence characters.
                if not line.startswith('>'):
                    seqLen += len(line.strip())
        return seqLen

    def genomeSize():
        # Map each chromosome ID to its length.
        chroLen = {}
        for chromID in ['1', '2', '3', '4', '5', 'C', 'M']:
            chroLen[chromID] = chromSize(chromos[chromID])
        return chroLen

    resultingSizes = genomeSize()
    print(resultingSizes)
    # Total genome size: the sum of all chromosome lengths.
    print(sum(resultingSizes.values()))
    exit()

Hope this helps. In case you're not familiar with Python: search for a course called "Introduction to Programming using Python - Programming Course for Biologists at the Pasteur Institute". Thanks to chapter 14 I understood what to do ^^

By the way, I have a question of my own:

In arabidopsis.py there is the entry background = {...}. Can anyone tell me what it is, what it is needed for, and how it is calculated?

**Karl_JV** · 08-22-2010, 11:35 PM

I'm very sorry that I cannot answer your question. It would be interesting to know.

I would just like to say thanks for the superb guide. It worked very well for me!

The only thing I would add is that the folder, in the above case cisRoot + '/A_thalianaTAIR9', needs to be created manually beforehand.
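
Creating that folder can also be scripted. A minimal sketch, assuming the folder name from the TAIR9 example; `ensure_genome_dir` is a made-up helper name, not a Cistematic function:

```python
import os


def ensure_genome_dir(cis_root, genome_dir_name="A_thalianaTAIR9"):
    """Create cis_root/genome_dir_name if it does not exist; return its path."""
    path = os.path.join(cis_root, genome_dir_name)
    if not os.path.isdir(path):
        os.makedirs(path)
    return path


# Example call (path taken from the guide above; adjust to your installation):
# ensure_genome_dir('/home/marc/ERANGE/ERANGE31')
```

Calling it a second time is harmless, because the folder is only created when it is missing.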
