  • Edoardo
    Junior Member
    • Mar 2016
    • 2

    Problem using SnpEff after Transdecoder: large number of warnings

    Hi,
    I am a PhD student with very little experience in bioinformatics (or very little experience at all; I started two months ago).
    I’m having some problems getting SnpEff to work with GFF coordinates obtained from TransDecoder.
    A group I am collaborating with gave me a genome assembly and a GTF file with transcript information derived from RNA-seq. I ran TransDecoder following the instructions, with the --single_best_orf option, and obtained a CDS file and a GFF3 file.
    I used the GFF3 to build a SnpEff database, because I have to evaluate the effect of some SNPs on the genome. However, when I launched SnpEff eff, I received a large number of warnings:

    INFO_REALIGN_3_PRIME 1
    WARNING_TRANSCRIPT_NO_START_CODON 202855
    WARNING_TRANSCRIPT_NO_START_CODON&INFO_REALIGN_3_PRIME 2
    WARNING_TRANSCRIPT_NO_STOP_CODON 17281

    Protein coding transcripts : 2426
    # Length errors : 0 ( 0,00% )
    # STOP codons in CDS errors : 0 ( 0,00% )
    # START codon errors : 686 ( 28,28% )
    # STOP codon warnings : 183 ( 7,54% )
    # UTR sequences : 2409 ( 99,30% )
    # Total Errors : 686 ( 28,28% )

    Given the low number of transcripts, this number of warnings seems extremely high. Is this normal?
    Also, I checked the CDSs obtained by TransDecoder and, even though not all of them start with ATG, all of them have a start codon near the beginning of the sequence, so I really cannot explain this number of warnings.
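A check like the one described above can be sketched in Python as follows. This is only a minimal sketch under stated assumptions, not the exact procedure used: the function names, the 30-base window used for "near the beginning", and the small in-line example records are all illustrative. It counts how many CDS sequences start with ATG, how many only contain an ATG (in any frame) within the first bases, how many end with a stop codon, and how many have a length that is not a multiple of three.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}


def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def codon_summary(records, window=30):
    """Tally start/stop codon properties of CDS sequences.

    `window` is an assumed cutoff (in bases) for "near the beginning";
    the ATG search inside the window ignores reading frame.
    """
    counts = Counter()
    for _, seq in records:
        counts["total"] += 1
        if seq[:3] == "ATG":
            counts["starts_with_atg"] += 1
        elif "ATG" in seq[:window]:
            counts["atg_near_start"] += 1
        else:
            counts["no_atg_near_start"] += 1
        if seq[-3:] in STOP_CODONS:
            counts["ends_with_stop"] += 1
        if len(seq) % 3 != 0:
            counts["length_not_multiple_of_3"] += 1
    return counts


# Made-up records for illustration only (not real TransDecoder output).
example = """>t1.p1
ATGAAATAA
>t2.p1
CCATGAAATGA
>t3.p1
GGGAAACCC
""".splitlines()

summary = codon_summary(parse_fasta(example))
for key, value in sorted(summary.items()):
    print(key, value)

# For a real file, pass an open file handle instead, e.g. (hypothetical path):
# with open("transcripts.fasta.transdecoder.cds") as handle:
#     print(codon_summary(parse_fasta(handle)))
```

A CDS that lacks an ATG at position 1, or whose length is not a multiple of three, would be consistent with a partial ORF prediction, so comparing these tallies with the START and STOP codon counts reported above may help narrow down where the warnings come from.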
    Do you have any suggestions?
    May the life of he/she who comes to my aid be filled with cakes and pizzas.
    Best Regards
    Edoardo
