**dpryan** · 05-21-2013, 12:59 AM

That's odd. You might run the following on your reference FASTA file to see whether this is expected:

Code:

grep ">" reference_genome.fa

If ">15" pops up, then this is normal, though it'd be odd to have that and chr9 in the same FASTA file. Bismark does play around a bit with contig names, but a bug in the code that handles them should produce different behaviour.

**fkrueger** · 05-21-2013, 01:42 AM

Bismark takes whatever the FASTA files had in the header up to the first whitespace. If you get '15' and 'chr9' in the output, I would assume that these entries looked like '>15' and '>chr9' in the FASTA files you used for the genome indexing process. I think it does replace '|' characters with underscores, but it would certainly not add or remove 'chr'.
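To check this against a reference, a minimal sketch along the following lines could list the sequence names as they would be expected to appear in the third SAM column, assuming only the rule described above (header text up to the first whitespace, without the leading ">"). The function names are made up for illustration, and the possible '|' replacement is deliberately not reproduced because it is not confirmed here.

```shell
# Hypothetical helpers, not part of Bismark.

# Print each FASTA header without the leading ">" and truncated at the
# first whitespace, i.e. the name expected in SAM column 3.
list_contig_names() {
    awk '/^>/ { print substr($1, 2) }' "$1"
}

# Print only the names that do not start with "chr".
list_unprefixed_names() {
    list_contig_names "$1" | grep -v '^chr'
}
```

Running `list_unprefixed_names reference_genome.fa` would then show which entries lack the prefix; empty output means every name already starts with "chr".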

**serenaliao** · 05-21-2013, 09:18 AM

Thanks fkrueger,

You are right. This is what happened with my FASTA file (some entries are chr<number> and some are <number>). Is there a convenient way to add "chr" before the chromosome number in the SAM file (third column) when it is missing? Thanks!

**serenaliao** · 05-21-2013, 09:42 AM

Just to follow up, I used awk '{if($3!~/^chr/){$3="chr"$3} print($0)}' filename. Does this sound reasonable?

**fkrueger** · 05-21-2013, 11:51 AM

I am no expert with awk, but it looks OK and should be easy enough to test (maybe on a few lines first). Any clues why your FASTA files have mixed chromosome names?
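A few things are worth checking when testing the one-liner above. With awk's default separators, assigning to $3 rebuilds the line with single spaces instead of tabs, header lines starting with "@" also get their third field changed, and an unmapped read's "*" would become "chr*". A hedged sketch of a tab-preserving variant follows. The function name add_chr_prefix is made up for illustration, and it assumes that renaming the SN: tags in @SQ header lines and the mate reference in column 7 is also wanted.

```shell
# Hypothetical helper: add "chr" to reference names that lack it,
# keeping the SAM file tab-delimited.
add_chr_prefix() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^@SQ/ {
             # Header line: rename SN:<name> so it matches the alignments.
             for (i = 2; i <= NF; i++)
                 if ($i ~ /^SN:/ && $i !~ /^SN:chr/)
                     $i = "SN:chr" substr($i, 4)
             print; next
         }
         /^@/ { print; next }    # other header lines pass through unchanged
         # Column 3 (RNAME): skip "*" (no reference) and existing prefixes.
         $3 != "*" && $3 !~ /^chr/ { $3 = "chr" $3 }
         # Column 7 (RNEXT): skip "*", "=" and existing prefixes.
         $7 != "*" && $7 != "=" && $7 !~ /^chr/ { $7 = "chr" $7 }
         { print }' "$1"
}
```

Usage would be something like `add_chr_prefix input.sam > output.sam`, ideally tried on the first few lines before running it on the whole file.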

Weird SAM format chromosome column