error while mapping paired end reads in Maq

der_eiskern replied

08-03-2009, 12:21 PM
Originally posted by nilshomer

I am a bit confused. So they aligned the reads and made variant calls using corona-lite, and then gave you the raw color data (fastq)? Why don't they just give you the variant calls and alignments?

I would definitely ask for the *csfasta and *qual files in this case and do your own alignment and SNP calling...

yeah, that's what i've been trying to do with MAQ and have been rather successful with the "homozygous" calls (using the flawed data they gave us) but not so much for the hets. i'm going to have to redo all of it though. thanks again for your help, nils.
nilshomer replied

08-03-2009, 11:27 AM
Originally posted by der_eiskern

Our SOLiD data came from offsite and they did their own SNP calling using the Corona Lite pipeline and gave us converted qual files in the fastq format for us to run MAQ on. Email communication has been slow...it's looking like i'll have to pay them a visit to get all this straightened out.

I am a bit confused. So they aligned the reads and made variant calls using corona-lite, and then gave you the raw color data (fastq)? Why don't they just give you the variant calls and alignments?

I would definitely ask for the *csfasta and *qual files in this case and do your own alignment and SNP calling...
der_eiskern replied

08-03-2009, 11:07 AM
Originally posted by nilshomer

So some fraction of the time -" will occur not from one -1 quality but two independent qualities. Therefore it is fairly tricky, unless you try to match up the -1 qualities with the missing color (which is usually the case). This is starting to sound like a lot of work!

Did you delete the original qual files? How did you get the fastq file in the first place?

Our SOLiD data came from offsite and they did their own SNP calling using the Corona Lite pipeline and gave us converted qual files in the fastq format for us to run MAQ on. Email communication has been slow...it's looking like i'll have to pay them a visit to get all this straightened out.
nilshomer replied

08-03-2009, 10:59 AM
Originally posted by der_eiskern

Thanks! we don't have the original qual files unfortunately, can i apply this command to the fastq files i have? or are they beyond help?

You can try to modify the fastq files. The only problem is that a -1 quality ends up in the fastq as the two characters -", and both - and " are also valid Sanger ASCII quality characters on their own (I believe). So some fraction of the time -" will come not from one -1 quality but from two independent qualities. That makes it fairly tricky, unless you match up the -1 qualities with the missing colors (which is usually where they occur). This is starting to sound like a lot of work!

Did you delete the original qual files? How did you get the fastq file in the first place?
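
A rough sketch of the matching idea above, in Python. It assumes the damaged fastq stores each -1 as the two characters -" and that a missing color shows up in the read as N or "." (both are assumptions about how the files were converted). The function name repair_quality and the choice of "!" (quality 0) as the replacement are hypothetical, picked only for illustration.

```python
ARTIFACT = '-"'  # how a -1 quality appears after the faulty conversion (assumed)

def repair_quality(sequence, quality, replacement="!"):
    """Return a quality string of len(sequence), or None if it cannot be repaired safely."""
    excess = len(quality) - len(sequence)
    if excess == 0:
        return quality  # nothing to fix
    if excess < 0:
        return None  # quality is too short; a different problem
    if quality.count(ARTIFACT) == excess:
        # Each artifact adds exactly one extra character, so if the number of
        # '-"' pairs equals the excess, every pair must be an artifact.
        return quality.replace(ARTIFACT, replacement)
    # Ambiguous case: collapse '-"' only where the read has a missing call.
    repaired, j = [], 0
    for base in sequence:
        if base in "N." and quality.startswith(ARTIFACT, j):
            repaired.append(replacement)
            j += 2
        else:
            repaired.append(quality[j:j + 1])
            j += 1
    # Accept the result only if the whole quality string was consumed exactly.
    return "".join(repaired) if j == len(quality) else None
```

Records for which this returns None are the ambiguous ones, where a -" pair may be two real qualities. Those probably have to be regenerated from the original *qual files.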
der_eiskern replied

08-03-2009, 10:51 AM
Originally posted by nilshomer

If it is in the .bfq format, you will have to convert it back to fastq (since the .bfq is gzip compressed).

You can always modify the input "qual" files using "sed":

Code:

sed -i 's_-1_1_g' <QV file>

Thanks! we don't have the original qual files unfortunately, can i apply this command to the fastq files i have? or are they beyond help?
nilshomer replied

08-03-2009, 10:47 AM
Originally posted by der_eiskern

thanks. this is a part of an output where i printed every quality string

(*)&-"-"-",&&'&''*)',&&,-"1&+&))))1-"&&)/-")&',&(&-)-"-"'
<;>=-"-"-"7=><1=>?:<>;>A-"=;<88=:=6-"=0:<-">8?(;9;8,-"-",
5&,;-"-"-":&/,/(($-8,)5/-"1)((&+&,'-")$($-"/&0''(/28-"-"$
$##&-"-"-"%&'##'###%%###-"#1####&##-"##$#-"$#####$#$-"-"#
/,&&-"-"-",5<,*,<&1)',+/-",5,/&'/),-"//,4-"&&)7+&),)-"-"&
$/&&-"-"-"81/@*,>),)3)(,-"<(>/'/)-",&-"&,<1).&&8-"-"2
:?>8-"-"-">?8;0>6;:.>9=6-";98>/6$%9-")%47-"+#1;/)'.7-"-".
=<A?-"-"-"==><<0>1;@89>=-"579;A==>3-"1<79-":<=37)55;-"-"<
8;<<-"-"-":;8<86526<8,;<-"891:76,,9-"7037-"5.+:;1;65-"-"9
<::A-"-"-"9A@:9<<><8==@5-"8;;5:89;6-"<05:-"9<2=)8>68-"-"=
#&$#-"-"-"$''%##$'$'$-(&-"$&&#'#-#'-"#%&.-"$''#%%&(%-"-"*

yep. there's all those "-1" improperly translated. is there a way that i can correct these files without retranslating everything to fastq?

nisha, what was your way around this if you didn't change the script?

thanks.

If it is in the .bfq format, you will have to convert it back to fastq (since the .bfq is gzip compressed).

You can always modify the input "qual" files using "sed":

Code:

sed -i 's_-1_1_g' <QV file>
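
The same substitution can be sketched in Python, working on whole whitespace-separated tokens so that only complete negative values are touched. This assumes the qual file holds space-separated integers with ">" header lines and "#" comment lines; clean_qual_line is a hypothetical helper name, and replacing with 1 simply mirrors the sed command.

```python
import re

# A negative integer that stands alone as a whitespace-delimited token.
NEGATIVE_TOKEN = re.compile(r"(?<!\S)-\d+(?!\S)")

def clean_qual_line(line, replacement="1"):
    """Replace each whole negative quality value with `replacement`."""
    if line.startswith(">") or line.startswith("#"):
        return line  # leave read headers and comments untouched
    return NEGATIVE_TOKEN.sub(replacement, line)
```

Applied line by line to a copy of the qual file, this leaves the original intact, unlike sed -i, which edits the file in place.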
der_eiskern replied

08-03-2009, 10:33 AM
thanks. this is a part of an output where i printed every quality string

(*)&-"-"-",&&'&''*)',&&,-"1&+&))))1-"&&)/-")&',&(&-)-"-"'
<;>=-"-"-"7=><1=>?:<>;>A-"=;<88=:=6-"=0:<-">8?(;9;8,-"-",
5&,;-"-"-":&/,/(($-8,)5/-"1)((&+&,'-")$($-"/&0''(/28-"-"$
$##&-"-"-"%&'##'###%%###-"#1####&##-"##$#-"$#####$#$-"-"#
/,&&-"-"-",5<,*,<&1)',+/-",5,/&'/),-"//,4-"&&)7+&),)-"-"&
$/&&-"-"-"81/@*,>),)3)(,-"<(>/'/)-",&-"&,<1).&&8-"-"2
:?>8-"-"-">?8;0>6;:.>9=6-";98>/6$%9-")%47-"+#1;/)'.7-"-".
=<A?-"-"-"==><<0>1;@89>=-"579;A==>3-"1<79-":<=37)55;-"-"<
8;<<-"-"-":;8<86526<8,;<-"891:76,,9-"7037-"5.+:;1;65-"-"9
<::A-"-"-"9A@:9<<><8==@5-"8;;5:89;6-"<05:-"9<2=)8>68-"-"=
#&$#-"-"-"$''%##$'$'$-(&-"$&&#'#-#'-"#%&.-"$''#%%&(%-"-"*

yep. there's all those "-1" improperly translated. is there a way that i can correct these files without retranslating everything to fastq?

nisha, what was your way around this if you didn't change the script?

thanks.
nilshomer replied

07-31-2009, 07:50 PM
Originally posted by nisha

hi der_eiskern,

Yeah repeating what nilshomer mentioned ...

Yes i figured out what the problem is. I'm assuming it would be the same problem for you.

The *.qual files containing the qualities for both the F3 and R3 reads have negative values, mainly -1. So when the solid2fastq.pl creates the fastq files it does not handle these negative values correctly (treating the "-" and "1" as separate entities) and the length of the quality string is not equal to the length of the read string.

You would have to change the script a bit to handle this problem.

hope this helps.

N

This is what I mentioned above. I have emailed Heng Li (MAQ's author) about the problem, but it should be a one liner in his code.
nisha replied

07-31-2009, 10:32 AM
hi der_eiskern,

Yeah repeating what nilshomer mentioned ...

Yes i figured out what the problem is. I'm assuming it would be the same problem for you.

The *.qual files containing the qualities for both the F3 and R3 reads have negative values, mainly -1. So when the solid2fastq.pl creates the fastq files it does not handle these negative values correctly (treating the "-" and "1" as separate entities) and the length of the quality string is not equal to the length of the read string.

You would have to change the script a bit to handle this problem.

hope this helps.

N
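
To illustrate the point about "-" and "1" being treated separately: if the minus sign is copied through and the 1 is encoded on its own, the result is the two characters -" (since chr(1 + 33) is "), which is one character too many. A minimal sketch of a conversion that parses each value as a whole integer instead is below. It is not the code from solid2fastq.pl; encode_qualities and the decision to replace negative values with 0 are assumptions made for illustration.

```python
def encode_qualities(qual_line, offset=33, negative_as=0):
    """Encode a whitespace-separated line of integer qualities as a Sanger string.

    Each value is parsed as a whole integer, so "-1" yields one character.
    Negative values are replaced by `negative_as` (an assumed policy).
    """
    values = [int(token) for token in qual_line.split()]
    return "".join(
        chr((value if value >= 0 else negative_as) + offset) for value in values
    )
```

With this approach the encoded string always has one character per quality value, so it stays the same length as the read.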
der_eiskern replied

07-31-2009, 10:31 AM
the quality strings in my fastq files are 50 bp and my read length is 50 bp. I didn't generate the files myself. just my task to run them. i haven't written a bash script to check the length of every single read though...so would just a single aberrant length stop maq completely before it begins?

any other ideas, nils? I'm scratching my head because maq gives me the error suggesting the two reads are of different lengths still. I'm hoping that my 70% Mappability will increase when i can get paired end assignments working.

thanks.
nilshomer replied

07-31-2009, 09:48 AM
Originally posted by der_eiskern

yeah, i'm getting the same error. i've not been able to figure it out yet but in the meantime i ran single ends and got 70% of the reads mappable. did you figure it out? i'd be interested in hearing about your problems.

btw, to introduce myself, i'm der_eiskern and at the moment i'm doing whole genome sequencing with both SOLiD and Illumina platforms.

cheers.

Did you check the length of the quality strings? MAQ's convert script can output too long of a quality string if there are "-1" qualities. This fixed the problem for me.
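
A minimal sketch of such a length check over a whole fastq file, in Python. It assumes every record spans exactly four lines (header, sequence, "+", quality); find_length_mismatches is a hypothetical name.

```python
def find_length_mismatches(path):
    """Yield (name, sequence_length, quality_length) for records where they differ."""
    with open(path) as handle:
        while True:
            header = handle.readline()
            if not header:
                break  # end of file
            sequence = handle.readline().rstrip("\n")
            handle.readline()  # the '+' separator line
            quality = handle.readline().rstrip("\n")
            if len(sequence) != len(quality):
                yield header.rstrip("\n")[1:], len(sequence), len(quality)
```

Looping over the result and printing each tuple shows which reads have an over-long quality string and by how many characters.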
der_eiskern replied

07-31-2009, 08:12 AM
maq + solid

yeah, i'm getting the same error. i've not been able to figure it out yet but in the meantime i ran single ends and got 70% of the reads mappable. did you figure it out? i'd be interested in hearing about your problems.

btw, to introduce myself, i'm der_eiskern and at the moment i'm doing whole genome sequencing with both SOLiD and Illumina platforms.

cheers.
nisha started a topic error while mapping paired end reads in Maq

06-29-2009, 02:59 PM
error while mapping paired end reads in Maq

Hi,

This is my first post.
I am running maq map for paired end SOLiD reads and i've gone through all the initial procedures of building .bfq files and ref.csbfa sequence

when i run the command:
maq map -c aln.cs.map ref.csbfa in.read1.bfq in.read2.bfq 2>aln.log

i always get the following error:
maq: read.cc:61: longreads_t* ma_load_reads(void*, int, void*, int): Assertion `strncmp(name, lr->name[j], tl-1) == 0' failed.

Could someone please tell me what this means? What are the strings that are being compared? i checked the names of the reads in both the read1 and read2 fastq files and they match, with /1 and /2 respectively for the read pairs.

Also the lengths of the reads are the same in both files.

Any help will be appreciated.

Thanks,
N
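
Read literally, the assertion strncmp(name, lr->name[j], tl-1) == 0 compares the first tl-1 characters of two read names. If tl is the name length (an assumption, since the maq source is not quoted here), that is everything except the trailing 1 or 2. A sketch of an equivalent check on the two fastq files, in Python, assuming four lines per record; fastq_names and first_name_mismatch are hypothetical helper names.

```python
from itertools import zip_longest

def fastq_names(path):
    """Yield read names (header without the leading '@') from a 4-line-per-record fastq."""
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 == 0:
                yield line.rstrip("\n")[1:]

def first_name_mismatch(read1_path, read2_path):
    """Return (record_index, name1, name2) for the first pair whose names differ
    anywhere except the final character, or None if every pair agrees."""
    pairs = zip_longest(fastq_names(read1_path), fastq_names(read2_path))
    for index, (name1, name2) in enumerate(pairs):
        # A None name means one file ran out of records before the other.
        if name1 is None or name2 is None or name1[:-1] != name2[:-1]:
            return index, name1, name2
    return None
```

If a record contains an extra or missing line, the four-line assumption breaks from that point on, so a reported mismatch can also indicate a malformed record earlier in either file.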