I ran TopHat on the same RNA-Seq data with different -r/--mate-inner-dist and --mate-std-dev settings.
The sample had 31387112 input reads in total, so result 3 confused me at first: accepted_hits.bam contained many more reads than the input.
Here are the parameters:
1. -r 160, --mate-std-dev (default) 20
2. -r (default) 50, --mate-std-dev (default) 20
3. -r 0, --mate-std-dev 60
After the TopHat runs finished, I used samtools flagstat to evaluate the results.
The results are listed below in order:
1. -r 160, --mate-std-dev (default) 20
Code:
27139030 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
27139030 + 0 mapped (100.00%:-nan%)
27139030 + 0 paired in sequencing
14171642 + 0 read1
12967388 + 0 read2
22063409 + 0 properly paired (81.30%:-nan%)
24154960 + 0 with itself and mate mapped
2984070 + 0 singletons (11.00%:-nan%)
516422 + 0 with mate mapped to a different chr
217580 + 0 with mate mapped to a different chr (mapQ>=5)

4141901 + 2091 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
0 + 0 mapped (0.00%:0.00%)
4141901 + 2091 paired in sequencing
1533088 + 997 read1
2608813 + 1094 read2
0 + 0 properly paired (0.00%:0.00%)
0 + 0 with itself and mate mapped
0 + 0 singletons (0.00%:0.00%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
2. -r (default) 50, --mate-std-dev (default) 20
Code:
27639199 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
27639199 + 0 mapped (100.00%:-nan%)
27639199 + 0 paired in sequencing
14422450 + 0 read1
13216749 + 0 read2
21085751 + 0 properly paired (76.29%:-nan%)
24654856 + 0 with itself and mate mapped
2984343 + 0 singletons (10.80%:-nan%)
706460 + 0 with mate mapped to a different chr
215918 + 0 with mate mapped to a different chr (mapQ>=5)

4142842 + 2091 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
0 + 0 mapped (0.00%:0.00%)
4142842 + 2091 paired in sequencing
1533869 + 997 read1
2608973 + 1094 read2
0 + 0 properly paired (0.00%:0.00%)
0 + 0 with itself and mate mapped
0 + 0 singletons (0.00%:0.00%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
3. -r 0, --mate-std-dev 60
Code:
41145664 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
41145664 + 0 mapped (100.00%:-nan%)
41145664 + 0 paired in sequencing
21422982 + 0 read1
19722682 + 0 read2
22975306 + 0 properly paired (55.84%:-nan%)
37774543 + 0 with itself and mate mapped
3371121 + 0 singletons (8.19%:-nan%)
10967682 + 0 with mate mapped to a different chr
207758 + 0 with mate mapped to a different chr (mapQ>=5)

2934463 + 2091 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
0 + 0 mapped (0.00%:0.00%)
2934463 + 2091 paired in sequencing
906826 + 997 read1
2027637 + 1094 read2
0 + 0 properly paired (0.00%:0.00%)
0 + 0 with itself and mate mapped
0 + 0 singletons (0.00%:0.00%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
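As a quick sanity check, the "in total" figure from the first flagstat block of each run can be divided by the 31387112 input reads. A ratio above 1 means the BAM file holds more alignment records than there were reads to align. This is only a rough sketch in Python using the numbers quoted in this post; the name `record_ratio` is made up for illustration.

```python
# Input read count and flagstat "in total" figures, copied from this post.
total_input = 31387112
records_in_bam = {
    "1: -r 160, --mate-std-dev 20": 27139030,
    "2: -r 50,  --mate-std-dev 20": 27639199,
    "3: -r 0,   --mate-std-dev 60": 41145664,
}


def record_ratio(records, reads):
    """Alignment records in the BAM file per input read."""
    return records / reads


for run, records in records_in_bam.items():
    ratio = record_ratio(records, total_input)
    print(f"{run}: {ratio:.3f} records per input read")
```

Only run 3 comes out above 1 (about 1.31 records per input read), so its "mapped" count cannot be a count of distinct reads.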
After checking the BAM file, I found that many reads appear more than once because of multihits.
So the numbers I got from samtools flagstat are not accurate as mapping rates.
Is there any way to estimate the mapping rate and the unique mapping rate, or any other useful statistics?
Hoping for your help!
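One possible way to get at these numbers, sketched here under some assumptions rather than offered as a definitive method, is to count distinct reads instead of alignment records. The sketch assumes the alignments are available as SAM text (for example from `samtools view accepted_hits.bam`) and that each record carries an `NH:i:` tag giving the number of reported hits for that read. The function name `count_reads` is made up for illustration.

```python
def count_reads(sam_lines):
    """Return (alignment records, distinct mapped reads, uniquely mapped reads).

    Assumption: a read counts as uniquely mapped when its record carries NH:i:1.
    """
    records = 0
    mapped = set()  # (read name, mate number) seen at least once
    unique = set()  # same key, only for reads tagged NH:i:1
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag = int(fields[1])
        if flag & 0x4:  # read is unmapped
            continue
        records += 1
        # 0x40 = first mate, 0x80 = second mate, neither = single-end read
        mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
        key = (fields[0], mate)
        mapped.add(key)
        if "NH:i:1" in fields[11:]:  # optional tags start at column 12
            unique.add(key)
    return records, len(mapped), len(unique)


if __name__ == "__main__":
    # Made-up records for illustration only: read r1 maps once per mate,
    # read r2 (first mate) is reported at two locations.
    demo = [
        "r1\t99\tchr1\t100\t50\t4M\t=\t300\t204\tACGT\tIIII\tNH:i:1",
        "r1\t147\tchr1\t300\t50\t4M\t=\t100\t-204\tACGT\tIIII\tNH:i:1",
        "r2\t65\tchr1\t500\t1\t4M\tchr2\t700\t0\tACGT\tIIII\tNH:i:2",
        "r2\t65\tchr3\t900\t1\t4M\tchr2\t700\t0\tACGT\tIIII\tNH:i:2",
    ]
    print(count_reads(demo))
```

Dividing the second and third numbers by the 31387112 input reads would then give a mapping rate and a unique mapping rate. The sets hold one entry per mapped read, so memory use may be large for a file of this size.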