Hi,
as mentioned here, we are mapping Drosophila samples using the STAR aligner. To get an impression of the mapping quality, we also mapped them with tophat2 for comparison, using the commands shown below.
The mapping percentage ranges from better to a lot better with STAR, but when comparing the splice junction files, tophat2 identifies 56058 junctions while STAR identifies only 46855.
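To see which junctions account for the difference, the two junction files could be compared directly. The following is only a minimal sketch. It assumes STAR's SJ.out.tab (tab-separated; columns 2 and 3 are the first and last intron base, 1-based) and tophat2's junctions.bed (BED12, where the intron lies between the two blocks listed in column 11). The function name tophat_only_junctions is made up for illustration, and the file names in the usage comment are guesses derived from the output prefixes in my commands.

```shell
# List introns present in the tophat2 junction file but absent from the STAR one.
# Output columns: chrom, first intron base, last intron base, intron length
# (1-based, inclusive coordinates, matching SJ.out.tab).
# $1: STAR SJ.out.tab   $2: tophat2 junctions.bed
# Note: the NR == FNR idiom assumes the STAR file is not empty.
tophat_only_junctions() {
    awk 'BEGIN { FS = OFS = "\t" }
         # First file (STAR): remember every intron as chrom/first/last.
         NR == FNR { star[$1 FS $2 FS $3] = 1; next }
         # Second file (tophat2): skip the BED track header line.
         /^track/ { next }
         {
             # Column 11 holds the sizes of the two flanking blocks.
             split($11, size, ",")
             first = $2 + size[1] + 1   # BED start is 0-based
             last = $3 - size[2]
             key = $1 FS first FS last
             if (!(key in star))
                 print $1, first, last, last - first + 1
         }' "$1" "$2"
}

# Hypothetical usage: count tophat2-only junctions spanning more than 10 kb
# (the 10 kb cutoff is an arbitrary choice):
# tophat_only_junctions "$NEW_FILE.STAR.SJ.out.tab" \
#     "$NEW_FILE.tophat.out/junctions.bed" | awk '$4 > 10000' | wc -l
```

If most of the tophat2-only junctions turn out to be long ones, that would match what the IGV views suggest.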
Code:
~/software/STAR-STAR_2.4.1c/STAR --runThreadN 15 --genomeDir genomes/Drosophila_melanogaster/STARindex/Dmel/ --readFilesIn $file --readFilesCommand zcat --sjdbGTFfile genes.gtf --outFilterType BySJout --outFilterMultimapNmax 1 --alignSJoverhangMin 8 --outFileNamePrefix $NEW_FILE.STAR. --outSAMtype BAM Unsorted --outReadsUnmapped Fastx --outFilterMismatchNoverLmax 0.05 --outFilterScoreMinOverLread 0 --outFilterMatchNminOverLread 0 --alignIntronMax 1
tophat2 -p 15 -g 1 -G genes.gtf -o $NEW_FILE.tophat.out genomes/Drosophila_melanogaster/Ensembl/BDGP6.80/bowtie2index/genome $file
I have looked at the BAM files in IGV (images attached below), and it is clear that tophat2 identifies many very long splice junctions that STAR does not report.
As you can see (light blue lines in IGV), both aligners identify the short splice junctions, but tophat2 reports many more of the longer ones.
Is there a way to adjust the STAR parameters so that I can also find these junctions?
thanks
Assa