Hello there,
I ran a de novo RNA-seq assembly with Cufflinks, then merged the transcripts.gtf files from different replicates, without a reference annotation, to obtain merged.gtf.
In this merged.gtf, transcripts with the same tss_id have different leftmost exons.
Here is an example:
chr2 Cufflinks exon 25289899 25290661 . . . gene_id "XLOC_005458"; transcript_id "TCONS_00010739"; exon_number "1"; oId "CUFF.5451.1"; tss_id "TSS7438";
chr2 Cufflinks exon 25290738 25290883 . . . gene_id "XLOC_005458"; transcript_id "TCONS_00010739"; exon_number "2"; oId "CUFF.5451.1"; tss_id "TSS7438";
chr2 Cufflinks exon 25290976 25291190 . . . gene_id "XLOC_005458"; transcript_id "TCONS_00010739"; exon_number "3"; oId "CUFF.5451.1"; tss_id "TSS7438";
chr2 Cufflinks exon 25289938 25290082 . . . gene_id "XLOC_005458"; transcript_id "TCONS_00010740"; exon_number "1"; oId "CUFF.5451.2"; tss_id "TSS7438";
chr2 Cufflinks exon 25290388 25291177 . . . gene_id "XLOC_005458"; transcript_id "TCONS_00010740"; exon_number "2"; oId "CUFF.5451.2"; tss_id "TSS7438";
This gene, XLOC_005458, has two transcript isoforms, TCONS_00010739 and TCONS_00010740. Both carry the same tss_id (TSS7438), so they should share one transcription start site.
However, TCONS_00010739 apparently starts at chr2:25289899 (left coordinate of exon 1), while TCONS_00010740 starts at chr2:25289938 (left coordinate of exon 1), 39 bp further downstream.
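To check how many tss_ids show this pattern, the comparison above can be sketched in Python. This is only a minimal sketch: the function names `transcript_starts_by_tss` and `inconsistent_tss` are made up for illustration and are not part of Cufflinks. Because the strand column is "." in these records, it assumes the leftmost exon start is the transcript start. Fields are split on any whitespace, so it works on both a tab-delimited file and the space-delimited lines pasted here.

```python
import re
from collections import defaultdict

# Matches GTF attributes of the form: key "value";
ATTR_RE = re.compile(r'(\w+) "([^"]*)";')

def transcript_starts_by_tss(gtf_lines):
    """Return {tss_id: {transcript_id: leftmost exon start}}."""
    starts = defaultdict(dict)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        # First 8 columns are fixed; the 9th holds the attributes.
        fields = line.rstrip("\n").split(None, 8)
        if len(fields) < 9 or fields[2] != "exon":
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        tss, tx = attrs.get("tss_id"), attrs.get("transcript_id")
        if tss is None or tx is None:
            continue
        start = int(fields[3])
        prev = starts[tss].get(tx)
        # Keep the smallest start seen for this transcript
        # (assumption: leftmost coordinate = transcript start).
        starts[tss][tx] = start if prev is None else min(prev, start)
    return starts

def inconsistent_tss(starts):
    """Keep only tss_ids whose transcripts start at different coordinates."""
    return {tss: txs for tss, txs in starts.items()
            if len(set(txs.values())) > 1}

# First exon of each isoform from the example above.
EXAMPLE = [
    'chr2 Cufflinks exon 25289899 25290661 . . . gene_id "XLOC_005458"; '
    'transcript_id "TCONS_00010739"; exon_number "1"; oId "CUFF.5451.1"; tss_id "TSS7438";',
    'chr2 Cufflinks exon 25289938 25290082 . . . gene_id "XLOC_005458"; '
    'transcript_id "TCONS_00010740"; exon_number "1"; oId "CUFF.5451.2"; tss_id "TSS7438";',
]

flagged = inconsistent_tss(transcript_starts_by_tss(EXAMPLE))
for tss, txs in flagged.items():
    print(tss, txs)
```

For a whole file, the same functions can be fed an open file handle, e.g. `with open("merged.gtf") as fh: inconsistent_tss(transcript_starts_by_tss(fh))`.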
How should I explain this? I would expect transcripts with the same tss_id to have leftmost exons that start at the same coordinate.
Thank you very much.
C.