**natstreet** · 11-17-2010, 10:04 PM

For a reference mapping re-sequencing style assembly you could take a look at Mosaik.

**allcreation** · 11-18-2010, 11:48 AM

Hi, thanks for the advice. I am trying it, but MosaikAligner estimates 181 days to process my 35M Illumina reads. I already aligned my reads to the reference genome with BWA and it was nowhere near this slow, so I can't understand the problem. Ideally I would only be interested in MosaikAssembler, but my reads are in SAM format and Mosaik wants its own format. Do you know of a converter?

**natstreet** · 11-18-2010, 12:00 PM

I'm not sure if there is a converter from SAM/BAM to the Mosaik .dat format.

What command are you using for MosaikAligner? Did you create a jump database using MosaikJump?

The last time I mapped some 76 bp Illumina reads I used these options:

Code:

MosaikAligner -in in.dat -out out.dat -ia ref.dat -j ref_15 -hs 15 -mm 3 -act 35 -bw 29 -mhp 100 -p 12

where -j ref_15 is the output of MosaikJump with -hs 15

For MosaikAligner, set -p as high as you can on your machine. In general I've found Mosaik to be fairly fast, but not as fast as Bowtie.
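
The point about the jump database can be sketched as a small dry-run helper that prints the full command sequence with one hash size shared by MosaikJump and MosaikAligner. This is only a sketch under assumptions: the `mosaik_plan` function and the file names `ref.fasta` and `reads.fastq` are hypothetical, the MosaikBuild and MosaikJump options are recalled from the Mosaik 1.x documentation rather than taken from this thread, and nothing is actually executed.

```shell
#!/bin/sh
# Hypothetical helper: print the Mosaik commands for one run, using a single
# hash size for both MosaikJump (-hs) and MosaikAligner (-hs, -j) so the jump
# database matches the aligner settings. Commands are printed, not run.
mosaik_plan() {
    hs=$1        # hash size, e.g. 15
    threads=$2   # value passed to MosaikAligner -p
    jump="ref_${hs}"

    # Assumed MosaikBuild usage (not shown in the thread): convert the
    # reference and the reads into Mosaik's own .dat format.
    echo "MosaikBuild -fr ref.fasta -oa ref.dat"
    echo "MosaikBuild -q reads.fastq -out in.dat -st illumina"

    # Assumed MosaikJump usage: build the jump database for this hash size.
    echo "MosaikJump -ia ref.dat -out ${jump} -hs ${hs}"

    # Aligner options copied from the command quoted in the post above.
    echo "MosaikAligner -in in.dat -out out.dat -ia ref.dat -j ${jump} -hs ${hs} -mm 3 -act 35 -bw 29 -mhp 100 -p ${threads}"
}

mosaik_plan 15 12
```

Printing the plan first makes it easy to confirm that the `-j` prefix and both `-hs` values agree before starting a long alignment.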

**colindaven** · 11-19-2010, 01:47 AM

There is another option called nesoni that I am checking out at the moment. It allows you to map Illumina reads against 454 contigs with SHRiMP2 and then attempts to integrate the output.
I'm not sure I completely understand what it's doing or how well it's working for my dataset, but it might be helpful for you.

See Torst's posts on this forum.

**allcreation** · 11-23-2010, 06:49 AM

hi,
thanks for the help.
I found out that Velvet, since this summer, has a new module called Columbus for mapping assemblies:

http://www.ebi.ac.uk/~zerbino/velvet/Columbus_manual.pdf

It worked well, or at least it seems to; I still have to check the contigs more carefully. But it didn't crash like MIRA, it was not limited to 2M reads like MAQ, and it didn't ask me to reduce the memory usage with something like MosaikJump. Actually, do you know if there is any benchmark for mapping assemblers?

**s052866** · 12-08-2010, 02:45 PM

Velvet rejects sam file to contain reference sequences

Hi allcreation
Could you give some directions on how you created the SAM file and reference sequence for your Velvet-Columbus run?
I have tried to do exactly as it is described in the manual but I get the following error message:

Code:

[0.256032] SAM file r5p12t6_07testmap_novo.sorted.sam cannot contain reference sequences.

I have my reference sequence in the format described:

Code:

>contig00001:1-35524
gACGCCGCGCGCCGCGGCCAGGGCTGGCCCACGGCCcTCTTCCGGCGCGCTGCGCAGGCG
TTCGGCCAGGCCGCGCGGCGTCGGCTGGCTGAGCGCCCAGCGTAGCAGGCGATCGAACGG
ATGCCGACGGGCGCTTTCCAGTCGTTCGCGCAAACGGGCGATCAACTGGGCGATCAACAG
CGAGTCGCCGCCAGCCCCGAAGAAGTCTTGCTCGACGCCCAGCGACGGGTTGTCCAGCAC
CTCCCGCCAGAGTGCCAGCAGCGCATTCTCCAGTTCGTCGGCCGGTGCCTGCGCGACGCC

And my SAM file, which I created with Novoalign, was sorted like this:

Code:

sort SAMfile.sam > SAMfile.sorted.sam

For the header in the FASTA file I used as the reference for the alignment, I have tried both this:
>contig00001:1-35524
and this:
>contig00001
But neither avoids the error message.

So maybe you could share the headers from some of your input data?

Best, s052866

**allcreation** · 12-09-2010, 06:20 AM

Hi,

To create my SAM file I started from the FASTQ files I had and gave them to BWA.

The headers of my reference sequences were like:

>ref1:1-100000

One thing I can think of is: do the sequences in your reference file and the reference names in your SAM file have exactly the same names? If they differ, Velvet may not be able to match the information between the reference and the SAM file.

Edit: actually the error message is telling you that the SAM file can't contain reference sequences. Are you using a command line like this?
velveth ./ 21 -reference ref.fasta -short -sam sorted.sam > Log.txt
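
The name check suggested above can be sketched in shell. This is a minimal sketch, not something from Velvet or the Columbus manual: the `check_ref_names` function is hypothetical, and it assumes a plain-text SAM file whose third tab-separated column is the reference name, and FASTA headers whose name is the text between ">" and the first whitespace.

```shell
#!/bin/sh
# Hypothetical helper (not part of Velvet): print reference names that are
# used by alignments in the SAM file but have no matching FASTA header.
# Usage: check_ref_names ref.fasta sorted.sam
check_ref_names() {
    fasta=$1
    sam=$2
    names_fa=$(mktemp)
    names_sam=$(mktemp)

    # FASTA side: keep header lines, drop ">" and any description after
    # the first whitespace.
    grep '^>' "$fasta" | sed 's/^>//' | awk '{print $1}' | LC_ALL=C sort -u > "$names_fa"

    # SAM side: skip "@" header lines, take column 3 (RNAME), and ignore
    # unmapped reads, which carry "*" in that column.
    awk -F '\t' '!/^@/ && $3 != "*" {print $3}' "$sam" | LC_ALL=C sort -u > "$names_sam"

    # comm -13 keeps only lines unique to the second input: names seen in
    # the SAM file that are absent from the FASTA headers.
    LC_ALL=C comm -13 "$names_fa" "$names_sam"

    rm -f "$names_fa" "$names_sam"
}
```

Calling `check_ref_names ref.fasta sorted.sam` prints nothing when every reference name in the SAM file has a FASTA header; any name it does print is a candidate mismatch, such as `contig00001` versus `contig00001:1-35524`.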

**s052866** · 12-09-2010, 12:53 PM

Hi allcreation

I found out that my problem was that I did not have "-short" in my command line. Thanks.

Question about Illumina reads, SNPs and mapping assemblies