Hi,
when I align the following five identical Fastq entries (identical except for identifiers):
@C003-U0418-F000014195/1
GTTCCCATGCCTGGAGAAGCTAATGCCAACTCATCATGTGATAATTCAATTTGTACAATAAATTATGAACCTCAAAAAAAAAAAAAAAAAAAAAAAAAAAA
+
-@1ACGDF=88D9@ECGGCGCGGFFG<FGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
@U0418-F000014195:1
GTTCCCATGCCTGGAGAAGCTAATGCCAACTCATCATGTGATAATTCAATTTGTACAATAAATTATGAACCTCAAAAAAAAAAAAAAAAAAAAAAAAAAAA
+
-@1ACGDF=88D9@ECGGCGCGGFFG<FGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
@U0418-F000014195/1
GTTCCCATGCCTGGAGAAGCTAATGCCAACTCATCATGTGATAATTCAATTTGTACAATAAATTATGAACCTCAAAAAAAAAAAAAAAAAAAAAAAAAAAA
+
-@1ACGDF=88D9@ECGGCGCGGFFG<FGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
@U0418-F000014195:2
GTTCCCATGCCTGGAGAAGCTAATGCCAACTCATCATGTGATAATTCAATTTGTACAATAAATTATGAACCTCAAAAAAAAAAAAAAAAAAAAAAAAAAAA
+
-@1ACGDF=88D9@ECGGCGCGGFFG<FGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
@C003-U0418-F000014195-2
GTTCCCATGCCTGGAGAAGCTAATGCCAACTCATCATGTGATAATTCAATTTGTACAATAAATTATGAACCTCAAAAAAAAAAAAAAAAAAAAAAAAAAAA
+
-@1ACGDF=88D9@ECGGCGCGGFFG<FGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
against Ensembl transcripts I get the following result:
C003-U0418-F000014195/1 4 * 0 0 * * 0 0 GTTCCCATGCCTGGAGAAGCTAATGCCAACTCATCATGTGATAATTCAATTTGTACAATAAATTATGAACCTCAAAAAAAAAAAAAAAAAAAAAAAAAAAA -@1ACGDF=88D9@ECGGCGCGGFFG<FGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG YT:Z:UU
U0418-F000014195:1 0 ENST00000369815 549 255 101M * 0 0 GTTCCCATGCCTGGAGAAGCTAATGCCAACTCATCATGTGATAATTCAATTTGTACAATAAATTATGAACCTCAAAAAAAAAAAAAAAAAAAAAAAAAAAA -@1ACGDF=88D9@ECGGCGCGGFFG<FGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG AS:i:-10 XN:i:0 XM:i:2 XO:i:0 XG:i:0 NM:i:2 MD:Z:72G0G27 YT:Z:UU
U0418-F000014195/1 0 ENST00000337003 483 255 101M * 0 0 GTTCCCATGCCTGGAGAAGCTAATGCCAACTCATCATGTGATAATTCAATTTGTACAATAAATTATGAACCTCAAAAAAAAAAAAAAAAAAAAAAAAAAAA -@1ACGDF=88D9@ECGGCGCGGFFG<FGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG AS:i:-10 XN:i:0 XM:i:2 XO:i:0 XG:i:0 NM:i:2 MD:Z:72G0G27 YT:Z:UU
U0418-F000014195:2 0 ENST00000369811 558 255 101M * 0 0 GTTCCCATGCCTGGAGAAGCTAATGCCAACTCATCATGTGATAATTCAATTTGTACAATAAATTATGAACCTCAAAAAAAAAAAAAAAAAAAAAAAAAAAA -@1ACGDF=88D9@ECGGCGCGGFFG<FGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG AS:i:-10 XN:i:0 XM:i:2 XO:i:0 XG:i:0 NM:i:2 MD:Z:72G0G27 YT:Z:UU
C003-U0418-F000014195-2 0 ENST00000337003 483 255 101M * 0 0 GTTCCCATGCCTGGAGAAGCTAATGCCAACTCATCATGTGATAATTCAATTTGTACAATAAATTATGAACCTCAAAAAAAAAAAAAAAAAAAAAAAAAAAA -@1ACGDF=88D9@ECGGCGCGGFFG<FGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG AS:i:-10 XN:i:0 XM:i:2 XO:i:0 XG:i:0 NM:i:2 MD:Z:72G0G27 YT:Z:UU
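One way to make the discrepancy explicit is to group the SAM records by read sequence and list the targets reported for each sequence. The following is only a minimal sketch: it assumes tab-delimited SAM records as Bowtie2 writes them, and the function name `targets_by_sequence` is made up for illustration, not part of any tool.

```python
from collections import defaultdict


def targets_by_sequence(sam_lines):
    """Map each read sequence to the set of (reference, position) pairs reported for it.

    Unaligned records (SAM flag bit 0x4 set) are recorded as ("*", 0).
    Header lines starting with "@" and blank lines are skipped.
    """
    targets = defaultdict(set)
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        rname = fields[2]
        pos = int(fields[3])
        seq = fields[9]
        if flag & 4:
            targets[seq].add(("*", 0))
        else:
            targets[seq].add((rname, pos))
    return dict(targets)
```

Applied to the five records above, a single sequence ends up with four distinct targets (unaligned plus three transcripts), which is what I would not expect for identical reads.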
So one Fastq entry does not align at all, and the other four entries are each reported with a single alignment, spread over three different transcripts, although I explicitly allow multiple alignments (-k 2000). The Bowtie2 call is as follows:
@PG ID:bowtie2 PN:bowtie2 VN:2.2.6 CL:"/home/schuisv1/ngs/scRNA-seq/K562-Fluidigm-test/exon-pipeline-scripts/tools/bowtie2-align-s --wrapper basic-0 -p 4 --reorder -k 2000 --phred33 --score-min L,0,-0.1 -x /home/schuisv1/ngs/scRNA-seq/K562-Fluidigm-test/exon-pipeline-files/fasta-files/ensembl_rna_hs-bowtie2Index/bowtie2Index -U C003-U0418-F000014195.fastq"
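For reference, my understanding of the Bowtie2 manual is that `--score-min L,0,-0.1` defines the minimum alignment score as a linear function of read length, f(x) = 0 + (-0.1) * x. The sketch below just does that arithmetic for my 101 bp reads; the function name is my own, and whether Bowtie2 rounds the threshold internally is something I have not checked.

```python
def min_score_linear(constant, coefficient, read_length):
    """Minimum alignment score for a linear --score-min function (L,constant,coefficient)."""
    return constant + coefficient * read_length


# For --score-min L,0,-0.1 and a 101 bp read:
threshold = min_score_linear(0.0, -0.1, 101)  # about -10.1
reported_as = -10                             # AS:i:-10 from the SAM records above
passes = reported_as >= threshold             # the reported alignments satisfy the threshold
```

So the reported score AS:i:-10 is just above the threshold of about -10.1, which would mean the alignments only barely pass the cutoff.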
Has anybody seen this behavior from Bowtie2 before?
Best regards,
Sven