Hi everyone,
I'm having the problem mentioned in the title, and it doesn't make sense to me. I run STAR on my RNA-Seq dataset, then look at the leftover unaligned reads, usually by BLASTing some of them. Often they are still mostly human (and they do align to hg20 with Bowtie2). I can't understand this at all: STAR, being a spliced aligner, should align far more than Bowtie2 does. I thought it might indicate human DNA contamination, but even then, shouldn't STAR still align contiguous sequences? Here are two such reads that STAR did not align but Bowtie2 did (the data are not paired-end, so these are two different reads). I'd hate to stop using STAR; I love the speed.
TATCGTAGTTATCTACACGACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTAACTGT
ACCTTCTAGTGGTGTTTACTTGAGACCTTTTGTCATTTAATGTGTGCTGAATAAATGCCAGCACCCCTGAGTAGAAAGCAATCATGTACCTGCAGATGGTC
Hopefully someone can point me in the right direction!
Thanks!