Hello All,
I am new to bioinformatics and am trying to use Cufflinks for the first time. My input to Cufflinks is an accepted_hits.sam file with ~19 million reads, generated (apparently without error) by TopHat. When I run Cufflinks (cufflinks -o Results accepted_hits.sam), I first get a "Counting hits in map" message, and then "Error:nonsense gene merge. Exiting". By iteratively truncating my input file, I found that Cufflinks apparently does not like one line (around the 4 millionth) in the input file.
My accepted_hits.sam file at the error point looks like this (below). Shown are 9 reads; Cufflinks seems to generate the error message at the 6th read shown. I have tried eliminating just this one line from the accepted_hits.sam file, but I still get the same error (perhaps from some later line). I have also tried Cufflinks with other output files from TopHat, and I consistently get this same error. (Cufflinks does run fine with the test file supplied.)
Thanks for any help with this
HWI-EAS288_8_2_20_941_1818_0 0 gi|51511724|ref|NC_000008.9|NC_000008 111450969 255 64M * 0 0 ATAGCATCTTCCCAGCTTCCATCTCCCTACAGTCCATCNTATTCAAGTCTTTAGCTATTTTGGA B@BBBB@BBBBABABA@@@A@@AA@??B;=7:=@?>A;%;>AB@?6?:?==?=@?@>?=@?>>8 NM:i:2
HWI-EAS288_8_2_117_405_131_0 0 gi|51511724|ref|NC_000008.9|NC_000008 111450982 255 42M * 0 0 AGCTTCCATCTCCCTACAGTCCATCATATTCAAGTCTTTAGC <:A>9=,/8;297=;=;1208778=:2-2650'462-3586? NM:i:0
HWI-EAS288_8_1_9_672_1871_0 0 gi|51511724|ref|NC_000008.9|NC_000008 111530035 255 62M * 0 0 GGGAAACATGGTGAAACCCTGTTTCTACTAAAAATACAAAAATTAGCCAGCTGTGGTGGCAA 6CCBBCCCCCB>BBBCCBAC@BCCCBBBCB@A>BCBBB@BACBAB<ABBB>B<>@??;%8@; NM:i:1
HWI-EAS288_8_2_79_444_2024_0 0 gi|51511724|ref|NC_000008.9|NC_000008 111637724 1 22M17834N536870911M * 0 0 GCAGCAACAGCGGCAGCGGCA ABAAAB@@@>AAABAB?ABAA NM:i:2 XS:A:+ NS:i:2
HWI-EAS288_8_2_79_444_2024_0 0 gi|51511724|ref|NC_000008.9|NC_000008 111637724 1 22M17840N536870911M * 0 0 GCAGCAACAGCGGCAGCGGCA ABAAAB@@@>AAABAB?ABAA NM:i:2 XS:A:+ NS:i:2
HWI-EAS288_8_2_79_444_2024_0 0 gi|51511724|ref|NC_000008.9|NC_000008 111637724 1 22M5744N536870911M * 0 0 GCAGCAACAGCGGCAGCGGCA ABAAAB@@@>AAABAB?ABAA NM:i:2 XS:A:+ NS:i:2
HWI-EAS288_8_2_56_390_1555_0 0 gi|51511724|ref|NC_000008.9|NC_000008 111643298 255 42M * 0 0 CAGCAAACCACCATGGCCCACATTTACCTATGTAACAAATCA BCBCCCCCCCBBCA;>C73-?CCBC@CBBBBCACBCCACBCC NM:i:1
HWI-EAS288_8_2_50_1601_1261_0 0 gi|51511724|ref|NC_000008.9|NC_000008 111809773 255 53M * 0 0 CTATTCTATACCATTCCATTCCATTCCATTCCATTCCATTCCATGCCATTCCA BCBBCCBAC:CCCCBBACCBCBBBBBB@BBBBBB?BAAB@BA6>ABBAAABB1 NM:i:2
HWI-EAS288_8_1_100_1688_1517_0 16 gi|51511724|ref|NC_000008.9|NC_000008 111829914 255 76M * 0 0 AGCCTTCAGTCTGTGGCCAAAGGCCCAAGGGTCCCCAGCGAACCACTGGTGTAAGTCCAAGAGTCCGAAGGCTGAG =+,:9===?9;AAA>?B=??A=>9>?A>A?A@BAAABBAAA?ABABBBAABABBBABBBBBBBBBABBBBBBBBBB NM:i:0
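One detail stands out in the records above: the three alignments for read HWI-EAS288_8_2_79_444_2024_0 carry a CIGAR ending in 536870911M (that number is 2^29 - 1), while the sequence is only 21 bases long. In SAM, the lengths of the M/I/S/=/X operations should add up to the sequence length, so these lines look malformed. Whether this is what triggers the Cufflinks error is not established here; it is only a candidate worth checking. A minimal Python sketch for finding every such line in one pass, instead of truncating the file repeatedly, might look like the following. The function names are hypothetical, and it assumes the file is tab-separated as SAM normally is (the paste above shows spaces).

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume bases of the read (query) sequence.
QUERY_OPS = set("MIS=X")


def cigar_query_length(cigar):
    """Return the number of read bases the CIGAR consumes, or None if unparsable."""
    if cigar == "*":
        return None
    ops = CIGAR_OP.findall(cigar)
    # Reject strings that contain anything other than valid operations.
    if "".join(n + op for n, op in ops) != cigar:
        return None
    return sum(int(n) for n, op in ops if op in QUERY_OPS)


def find_suspect_lines(lines):
    """Yield (line_number, read_name, cigar, seq_length) for inconsistent records."""
    for number, line in enumerate(lines, start=1):
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        name, cigar, seq = fields[0], fields[5], fields[9]
        if cigar == "*" or seq == "*":
            continue  # nothing to compare
        # An unparsable CIGAR gives None, which never equals len(seq),
        # so it is reported as well.
        if cigar_query_length(cigar) != len(seq):
            yield number, name, cigar, len(seq)
```

To use it, open accepted_hits.sam and pass the file handle to find_suspect_lines, printing each tuple it yields. If the flagged lines all share the 536870911M pattern, filtering them out and rerunning Cufflinks would be one way to test whether they are responsible.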