Hi there.
I am completely new to the world of (de novo) genome assembly and I don't know where to begin. When I asked for help at the department they said "go to seqanswers", so here I am looking for some help...
I have been given some sequencing data about an insect (colza pollen beetle) and have to make a genome assembly. This is Illumina data in paired-end format.
There are 3 fastq files :
- lane 5/1 : 11 423 167 reads of length 76
- lane 5/2 : 11 423 167 reads of length 76
- lane 7 : 9 294 857 reads of length 152
The average beetle genome size is said to be about 650 Mbp.
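To put these numbers in perspective, here is a back-of-the-envelope estimate of the sequencing depth, as a minimal Python sketch. It assumes the genome really is around 650 Mbp (that is only the average beetle figure, not a measurement for this species), and the function and variable names are just ones I made up for illustration.

```python
# Rough sequencing-depth estimate from the read counts listed above.
# Assumption: genome size of ~650 Mbp (average beetle figure, not measured).
GENOME_SIZE_BP = 650_000_000

# (file label, number of reads, read length in bases)
READ_SETS = [
    ("lane 5/1", 11_423_167, 76),
    ("lane 5/2", 11_423_167, 76),
    ("lane 7", 9_294_857, 152),
]


def total_bases(read_sets):
    """Sum of reads x read length over all files."""
    return sum(n_reads * length for _, n_reads, length in read_sets)


def expected_coverage(read_sets, genome_size_bp):
    """Average depth = total sequenced bases / genome size."""
    return total_bases(read_sets) / genome_size_bp


if __name__ == "__main__":
    bases = total_bases(READ_SETS)
    coverage = expected_coverage(READ_SETS, GENOME_SIZE_BP)
    print(f"total bases: {bases:,}")
    print(f"expected coverage: {coverage:.1f}x")
```

Under these assumptions the three files add up to about 3.1 Gbp, so roughly 4.8x average coverage before any quality trimming.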
Apparently "we" have a server with 192GB RAM where SOAPdenovo is/will be installed.
I have been told to first check the sequence quality, so after a bit of searching I found "FastQC" (with a good YouTube tutorial). I don't know what I have to do after that... at all.
I am not here to ask you to do the job for me, and I know I will have a lot of reading and research to do, but I would like to know the main guidelines to follow, the things to watch out for, the traps to avoid, etc.
Thank you in advance for any kind of help,
M.
(PS: according to the FastQC tutorial, the data quality is quite poor; I can post the output on request)