I downloaded some pairwise alignment files from UCSC in the axtNet format and converted them to MAF format. A typical block looks like this:
a score=21045.000000
s chr10 1454 357 + 94855758 aataaaaattattggtccccattcctagtgattccaa
s chr14 106421274 333 - 108792865 aataaaaattctttggccccattcttagtgagtcc
I ran multiz on these files, and then, while running phastCons, I found that the organisms had not been specified in the source names. I therefore converted the s lines to the appropriate format:
a score=21045.000000
s organism1.chr10 1454 357 + 94855758 aataaaaattattggtccccattcctag
s organism2.chr14 106421274 333 - 108792865 aataaaaattctttggccccattctta
When I run these files, I get the error: line 11 of organism1.organism2.maf : inconsistent row size
The same problem occurs in every file where I made this change.
It would be really helpful if someone could point out the problem here.
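One way to narrow this down is to scan the converted file for two things a MAF reader is likely to trip over: a block that is not terminated by a blank line, and s rows whose alignment text differs in length within a block. The sketch below assumes that this is what "inconsistent row size" refers to, which the error message alone does not confirm. `check_maf_blocks` is a hypothetical helper name, not part of multiz.

```shell
# Hypothetical helper: reads a MAF file on stdin and reports suspicious lines.
# Assumes the standard s-line layout: s src start size strand srcSize text
check_maf_blocks() {
    awk '
        /^a/ {
            # An "a" line should only appear after a blank line (or at the top).
            if (inblock)
                printf "line %d: new block starts without a preceding blank line\n", NR
            inblock = 1
            len = -1                     # text length not yet known for this block
        }
        /^s[ \t]/ {
            # Field 7 is the alignment text; all rows of a block should match.
            if (len < 0)
                len = length($7)
            else if (length($7) != len)
                printf "line %d: text length %d, expected %d\n", NR, length($7), len
        }
        /^[ \t]*$/ { inblock = 0 }       # a blank line closes the current block
    '
}
```

It can be run as `check_maf_blocks < chr1.organism1.organism2.maf2`; no output means neither condition was found.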
The multiz command I use is:
multiz chr1.organism1.organism2.maf chr1.organism1.organism3.maf chr1.unused > chr1.organism1.organism2.organism3.maf
The command I used to change the source names was:
awk '/a score/{print;getline;gsub(/chr/,"organism1.chr",$0);print;getline;gsub(/chr/,"organism2.chr",$0);print} /#/{print;}' chr1.organism1.organism2.maf > chr1.organism1.organism2.maf2
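Note that the awk command above prints only the a line, the two s lines that follow it, and comment lines, so the blank lines that separate MAF blocks never reach the output. That may be what triggers the error, though this has not been verified against multiz. A minimal sketch of a rename that passes every line through and only touches the source name of the first and second s row in each block follows; `rename_maf_species` is a hypothetical helper name, and it assumes pairwise blocks with the reference row first.

```shell
# Hypothetical helper: prefix the source name of the 1st and 2nd "s" row of
# each block with the given organism names; all other lines are kept as-is.
rename_maf_species() {
    awk -v first="$1" -v second="$2" '
        /^a/ { n = 0 }                   # new alignment block: reset row counter
        /^s[ \t]/ {
            n++
            # "&" re-emits the matched "s" plus whitespace, then the prefix follows.
            if (n == 1)      sub(/^s[ \t]+/, "&" first ".")
            else if (n == 2) sub(/^s[ \t]+/, "&" second ".")
        }
        { print }                        # print every line, blank separators included
    '
}
```

Usage would look like `rename_maf_species organism1 organism2 < chr1.organism1.organism2.maf > chr1.organism1.organism2.maf2`. Editing only the text right after the leading `s` also avoids touching any other occurrence of "chr" on the line.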