I am using TopHat to map SOLiD data (dataset: DRR013130) with the following command:
tophat2 -m 2 -p 8 --color --quals --bowtie1 -G /home/mm10/mm10_GTF/genes.gtf -o /media/DRR013130_Oo_3_F3 /home/mm10/BowtieCIndex/genome /media/DRR013130_Oo_3_F3.csfasta /media/DRR013130_Oo_3_F3_QV.qual
I found that for every read in unmapped.fastq (converted from unmapped.bam using bam2fastx), the quality string is one character shorter than the sequence (50 vs. 51 in the examples below). However, this problem does not happen in accepted_hits.fastq (converted from accepted_hits.bam), where both are 50 characters long. Why does this happen, and how can I solve it?
Thank you very much for your help!
-----------------------------------------------------------------------------------
>head unmapped.fastq
@DRR013130_Oo_3.3872899
TTGACCTTACGCTCGTGTCAATTGAACTCTTATGCTACCCCTACCGTCAGA 51 length
+
@==;@==?>@?@@@@?@@@@@?;@@2@:/66226?6/8/6?8;/=;;>8/ 50 length
@DRR013130_Oo_3.12296928
TGACCGTCTTAGACATATCTCCGTCGTAGGGATCCCCGGCTAACGGATCCG 51 length
+
@@@@@@@@@@@@@@@@@@@@@@@?@@@;=@@@?6?@@6//2//66//26/ 50 length
-----------------------------------------------------------------------------------
>head accepted_hits.fastq
@DRR013130_Oo_3.9803742
ACTTTTACAAGGCCTAATGGTGACTCCTACAGTGGTTGACACCGACTACC 50 length
+
@__]L@Q__UU]]___[[____]]^^___WW____UU__22___[QS]W8 50 length
@DRR013130_Oo_3.15175672
CAACCTAAAATAAAAACAACTAAAAAAGCTGACTCGTGAGGCAAAAAGAC 50 length
+
@________^^___________]]__________________]]__[[_@ 50 length
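To check how widespread the mismatch is across the whole file rather than just the first few reads, something like the following could be used. This is only a minimal sketch: the function names `fastq_length_mismatches` and `count_mismatches` are made up for illustration, and it assumes plain four-line FASTQ records (the "51 length" / "50 length" notes above are my annotations, not part of the files).

```python
from itertools import islice


def fastq_length_mismatches(lines):
    """Yield (read_id, seq_len, qual_len) for records whose lengths differ.

    Assumes strict four-line FASTQ records (header, sequence, '+', quality).
    A trailing incomplete record is ignored.
    """
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if len(record) < 4:
            break
        header, seq, _plus, qual = (s.rstrip("\r\n") for s in record)
        if len(seq) != len(qual):
            # Drop the leading '@' from the header to get the read ID.
            yield header[1:], len(seq), len(qual)


def count_mismatches(path):
    """Return the number of records in the FASTQ file at `path`
    whose sequence and quality lengths differ."""
    with open(path) as handle:
        return sum(1 for _ in fastq_length_mismatches(handle))
```

Calling `count_mismatches("unmapped.fastq")` and `count_mismatches("accepted_hits.fastq")` would then show whether every unmapped read is affected and whether the mapped reads are really all consistent.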