Greetings to you all,
First of all, thank you for creating this forum; it seems like a great way to share knowledge.
I am an undergraduate student starting a job at my university that involves assembling sequencing data. I am an experienced Linux user, so I will try to use Linux programs.
I have familiarized myself with Velvet a bit, and I would like to test its capabilities on some data that has previously been assembled well. Once I am sure I know how to use Velvet, I will also try Ray and SOAP.
So I have 2 files:
100611_s_4_1_seq_GDR-7.txt (1.6 GB)
100611_s_4_2_seq_GDR-7.txt (1.6 GB)
I have used this as a reference for my work:
From what I understand, I need to merge the two files into one file with shuffleSequences_fastq.pl. Is this correct?
I have trouble understanding what "subsetting" means on that page, specifically in step 17 (quoted below).
It seems they are trying to compare a single-end assembly with a paired-end one. I am only doing paired-end, so do I need to do the subsetting?
One more thing: after running velveth_de and velvetg_de, the last line of the velvetg_de output tells me how many nodes have been created. Is that the number of contigs? How do I interpret that last line?
The merge command:
Code:
shuffleSequences_fastq.pl 100611_s_4_1_seq_GDR-7.txt 100611_s_4_2_seq_GDR-7.txt 100611_s_4_both_seq_GDR-7.txt
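To check my understanding of what the merge does, here is a minimal Python sketch of interleaving two FASTQ files. It assumes every record is exactly four lines and that both files list the mates in the same order. The names `read_records` and `interleave_fastq` are my own; this is not the code of shuffleSequences_fastq.pl itself, only how I picture the result.

```python
from itertools import islice


def read_records(handle):
    """Yield FASTQ records as lists of 4 lines (assumes 4-line records)."""
    while True:
        # Normalise line endings so the last record is safe even if the
        # file does not end with a newline.
        record = [line.rstrip("\n") + "\n" for line in islice(handle, 4)]
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def interleave_fastq(path_1, path_2, path_out):
    """Write mate 1 then mate 2 for each pair; return the number of pairs.

    Assumption: record i in path_1 and record i in path_2 are mates.
    zip() stops at the shorter file, so unpaired trailing reads are dropped.
    """
    pairs = 0
    with open(path_1) as f1, open(path_2) as f2, open(path_out, "w") as out:
        for rec_1, rec_2 in zip(read_records(f1), read_records(f2)):
            out.writelines(rec_1)
            out.writelines(rec_2)
            pairs += 1
    return pairs
```

With my files this would be called as `interleave_fastq("100611_s_4_1_seq_GDR-7.txt", "100611_s_4_2_seq_GDR-7.txt", "100611_s_4_both_seq_GDR-7.txt")`, and the output should hold the first read of file 1, then the first read of file 2, then the second read of file 1, and so on.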
The step from the reference that mentions subsetting:
Code:
17. Do the subsetting. Soon we will compare the single ended assembly to the paired-end assembly. In order for the comparison to be fair, we must use the same total number of reads. Therefore each paired end file will contain 1/4 of the reads:
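As far as I can tell, subsetting here just means keeping only a fraction of the reads. A minimal Python sketch of that idea follows, assuming 4-line FASTQ records and that taking the first quarter of each file is acceptable (the reference may well select the reads differently). The function `subset_fastq` and its `fraction` parameter are hypothetical names of mine.

```python
def subset_fastq(path_in, path_out, fraction=0.25):
    """Copy the first `fraction` of the records in path_in to path_out.

    Assumes 4-line FASTQ records. Returns the number of records kept.
    """
    # First pass: count records so the fraction can be turned into a count.
    with open(path_in) as handle:
        total_records = sum(1 for _ in handle) // 4
    keep = int(total_records * fraction)

    # Second pass: copy only the lines belonging to the first `keep` records.
    with open(path_in) as handle, open(path_out, "w") as out:
        for line_number, line in enumerate(handle):
            if line_number >= keep * 4:
                break
            out.write(line)
    return keep
```

If both paired-end files contain the same number of reads in the same order, running this with the same `fraction` on each file keeps the same leading pairs, so the mates stay in sync.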
The last line of the velvetg output:
Code:
Final graph has 302039 nodes and n50 of 175, max 1779, total 5984104, using 13024530/17609332 reads
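To make the question concrete, here is a small Python sketch that splits that summary line into its numbers. The field names (`nodes`, `n50`, `max`, `total`, `used`, `reads`) are my own labels taken from the wording of the line; I am not sure what units velvetg uses for n50, max and total, or how nodes relate to contigs, which is exactly what I am asking.

```python
import re

# Pattern written against the single summary line shown above; other
# velvetg versions may word it differently (assumption).
FINAL_LINE = re.compile(
    r"Final graph has (?P<nodes>\d+) nodes and n50 of (?P<n50>\d+), "
    r"max (?P<max>\d+), total (?P<total>\d+), "
    r"using (?P<used>\d+)/(?P<reads>\d+) reads"
)


def parse_final_graph(line):
    """Return the numbers in a 'Final graph has ...' line as a dict."""
    match = FINAL_LINE.search(line)
    if match is None:
        raise ValueError("not a 'Final graph' summary line")
    stats = {name: int(value) for name, value in match.groupdict().items()}
    # Share of the input reads that the line reports as used.
    stats["fraction_used"] = stats["used"] / stats["reads"]
    return stats
```

For my run this gives 302039 nodes and a used-read fraction of about 74% (13024530 of 17609332).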
I am using the -shortPaired option for velveth.
My last question: has anyone here used consed? I am asking because I have had some problems setting up that program.
Thank you.