Hi,
I am using StringTie for transcriptome reconstruction and identification of new isoforms.
While exploring the "[file]-transcripts.gtf" output file from StringTie, I found something that intrigued me. In the example below I show three isoforms assigned to the same gene ("STRG.14686"); the last one is present in the reference annotation. However, the start and end coordinates do not match, and the third isoform does not overlap the other two: the first two isoforms end at 394180 bp and 389422 bp, respectively, while the third starts at 396257 bp.
Why is StringTie "clustering" these isoforms in the same gene?
scaffold_96 StringTie transcript 383404 394180 1000 + . gene_id "STRG.14686"; transcript_id "STRG.14686.1"; cov "72.829010"; FPKM "6.506798"; TPM "8.058224";
scaffold_96 StringTie transcript 383404 389422 1000 + . gene_id "STRG.14686"; transcript_id "STRG.14686.2"; cov "61.675678"; FPKM "5.510321"; TPM "6.824155";
scaffold_96 StringTie transcript 396257 398001 1000 + . gene_id "STRG.14686"; transcript_id "STRG.14686.3"; reference_id "scaffold_96.g39603.t1"; ref_gene_id "scaffold_96.g39603"; cov "2963.938721"; FPKM "264.808624"; TPM "327.947357";
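To double-check the coordinates, here is a minimal Python sketch that hard-codes the three transcript intervals shown above and tests each pair for overlap. The helper names `overlaps` and `gap` are my own and are not part of StringTie; the sketch only compares transcript-level start and end positions and says nothing about how StringTie actually decides on gene assignment.

```python
from itertools import combinations

# (transcript_id, start, end) copied from the three GTF transcript lines above.
# GTF coordinates are 1-based and inclusive on both ends.
records = [
    ("STRG.14686.1", 383404, 394180),
    ("STRG.14686.2", 383404, 389422),
    ("STRG.14686.3", 396257, 398001),
]


def overlaps(a, b):
    """Return True if two closed intervals share at least one base."""
    return a[1] <= b[2] and b[1] <= a[2]


def gap(a, b):
    """Return the number of bases strictly between two intervals (0 if they overlap or touch)."""
    return max(0, max(a[1], b[1]) - min(a[2], b[2]) - 1)


if __name__ == "__main__":
    for a, b in combinations(records, 2):
        print(a[0], b[0], "overlap:", overlaps(a, b), "gap:", gap(a, b))
```

With these numbers, only STRG.14686.1 and STRG.14686.2 overlap; STRG.14686.3 lies 2076 bases downstream of the end of STRG.14686.1.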