**GenoMax** · 03-12-2015, 03:53 AM

What format is your data in (GTF)?

**illinu** · 03-12-2015, 05:12 AM

The blast report is a tab-separated file and the transcripts are in fasta, but I am also producing a GTF3 file with another annotator.

**GenoMax** · 03-12-2015, 05:51 AM

Depending on what your files look like, you may be able to use some combination of unix utilities (grep/sort/uniq/awk etc.), but otherwise this may need custom code.

Have you tried doing some basic sorting in Excel on the GTF file?

**illinu** · 03-12-2015, 06:05 AM

Yes. When I sort by the subject hit I can clearly see whether two or more isoforms are duplicated (the subject start-end positions of the two isoforms overlap) or multiassembled (each isoform hits a different part of the subject), but there are 40k isoforms and I want to automate the analysis.

Using grep/sort/uniq/awk would be great. I've done all the previous filtering with those tools, but I don't see how I could use them for what I want to do now.

Il.
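
For illustration, the rule described above (overlapping subject coordinates means duplicated, different parts of the subject means multiassembled) could be automated roughly as follows. This is only a sketch in Python rather than awk, and it rests on assumptions the thread does not confirm: that the report is standard 12-column BLAST tabular output (`-outfmt 6`), with the query ID in column 1, the subject ID in column 2, and subject start/end in columns 9 and 10. The function name `classify_pairs` is hypothetical. It compares every pair of queries hitting the same subject; limiting the comparison to isoforms of the same gene would need a check on the transcript ID convention, which is not shown in the thread.

```python
import csv
from collections import defaultdict
from itertools import combinations


def classify_pairs(blast_tab):
    """Label pairs of queries that hit the same subject.

    blast_tab is any iterable of lines (an open file, a list of strings).
    Assumed layout (BLAST -outfmt 6): column 1 = query id, column 2 = subject id,
    columns 9 and 10 = subject start and end.
    """
    hits = defaultdict(list)  # subject id -> [(query id, start, end), ...]
    for row in csv.reader(blast_tab, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        # Subject start > end on minus-strand hits, so put them in order.
        start, end = sorted((int(row[8]), int(row[9])))
        hits[subject].append((query, start, end))

    results = []
    for subject, entries in hits.items():
        for (q1, s1, e1), (q2, s2, e2) in combinations(entries, 2):
            if q1 == q2:
                continue  # several HSPs from the same query, not a pair
            # Number of subject positions shared by the two hits.
            overlap = min(e1, e2) - max(s1, s2) + 1
            label = "duplicated" if overlap > 0 else "multiassembled"
            results.append((subject, q1, q2, label))
    return results
```

Called as `classify_pairs(open("blast_report.tsv"))` (the file name is a placeholder), it returns one `(subject, query1, query2, label)` tuple per pair. Any overlap of at least one base counts as duplicated here; a minimum overlap length or fraction may be a more sensible threshold for real data.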

identify duplicated transcripts from blast report
