  • orionzhou
    Member
    • Sep 2009
    • 14

    Augustus output - how are the incompatible hint groups determined?

    Hi, I have a question about the augustus (v3.0.2, v3.0.1) output, specifically the numbers it reports for "hint groups fully obeyed" and "incompatible hint groups".

    I used a test genomic sequence (~3 kb) that contains a typical two-exon gene; the sequence is available here.

    I know exactly where the exons begin and end, so I put this information in a hint file (pasted below, since it is short and easy to read).

    Code:
    scaffold_0      .       start   101     103     10      .       .       pri=2;grp=start;src=M
    scaffold_0      .       CDSpart 101     379     10      .       .       pri=2;grp=cds1;src=M
    scaffold_0      .       CDSpart 3040    3210    10      .       .       pri=2;grp=cds2;src=M
    scaffold_0      .       intronpart      380     3039    10      .       .       pri=2;grp=intron;src=M
    scaffold_0      .       stop    3208    3210    10      .       .       pri=2;grp=stop;src=M
    Then I ran augustus:

    Code:
    augustus --species=arabidopsis --hintsfile=hints.gff --gff3=on test.fas
    which gives the following output:

    Code:
    ##gff-version 3
    # This output was generated with AUGUSTUS (version 3.0.2).
    # ----- prediction on sequence number 1 (length = 3232, name = scaffold_0) -----
    #
    # Predicted genes for sequence number 1 on both strands
    # start gene g1
    scaffold_0    AUGUSTUS    gene    76    3232    0.08    +    .    ID=g1
    scaffold_0    AUGUSTUS    transcript    76    3232    0.08    +    .    ID=g1.t1;Parent=g1
    scaffold_0    AUGUSTUS    transcription_start_site    76    76    .    +    .    Parent=g1.t1
    scaffold_0    AUGUSTUS    exon    76    379    .    +    .    Parent=g1.t1
    scaffold_0    AUGUSTUS    start_codon    101    103    .    +    0    Parent=g1.t1
    scaffold_0    AUGUSTUS    intron    380    3039    1    +    .    Parent=g1.t1
    scaffold_0    AUGUSTUS    CDS    101    379    1    +    0    ID=g1.t1.cds;Parent=g1.t1
    scaffold_0    AUGUSTUS    CDS    3040    3210    1    +    0    ID=g1.t1.cds;Parent=g1.t1
    scaffold_0    AUGUSTUS    exon    3040    3232    .    +    .    Parent=g1.t1
    scaffold_0    AUGUSTUS    stop_codon    3208    3210    .    +    0    Parent=g1.t1
    scaffold_0    AUGUSTUS    transcription_end_site    3232    3232    .    +    .    Parent=g1.t1
    ......
    
    # Evidence for and against this transcript:
    # % of transcript supported by hints (any source): 60
    # CDS exons: 2/2
    #      M:   2
    # CDS introns: 1/1
    #      M:   1
    # 5'UTR exons and introns: 0/1
    # 3'UTR exons and introns: 0/1
    # hint groups fully obeyed: 1
    #      M:   1 (intron)
    # incompatible hint groups: 4
    #      M:   4 (start,cds1,cds2,stop)
    # end gene g1
    ###
    As you can see, augustus predicts exactly the gene structure described in the hint file. However, it reports that only 1 hint group (intron) was fully obeyed and that the other 4 (start codon, stop codon, and the two CDS parts) are "incompatible". How is this possible?
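
The claim that the prediction agrees with every hint can be checked independently of the summary report. Below is a minimal sketch, assuming a simple rule set of my own: a `start` or `stop` hint must coincide exactly with the predicted start or stop codon, and a `CDSpart` or `intronpart` hint must lie entirely inside a predicted CDS or intron. This is not how AUGUSTUS itself decides compatibility; the names `RULES`, `is_consistent` and `check_all` are hypothetical, and the coordinates are copied by hand from the hint file and the output above.

```python
# Hints copied from the hint file above: (feature, start, end, group)
HINTS = [
    ("start", 101, 103, "start"),
    ("CDSpart", 101, 379, "cds1"),
    ("CDSpart", 3040, 3210, "cds2"),
    ("intronpart", 380, 3039, "intron"),
    ("stop", 3208, 3210, "stop"),
]

# Predicted features copied from the AUGUSTUS output above: (feature, start, end)
PREDICTED = [
    ("start_codon", 101, 103),
    ("CDS", 101, 379),
    ("intron", 380, 3039),
    ("CDS", 3040, 3210),
    ("stop_codon", 3208, 3210),
]

# Assumed rules (not taken from AUGUSTUS):
# hint type -> (predicted feature type, exact coordinates required?)
RULES = {
    "start": ("start_codon", True),
    "stop": ("stop_codon", True),
    "CDSpart": ("CDS", False),
    "intronpart": ("intron", False),
}


def is_consistent(hint, predicted=PREDICTED):
    """Return True if some predicted feature satisfies the hint under RULES."""
    feature, h_start, h_end, _group = hint
    target, exact = RULES[feature]
    for p_feature, p_start, p_end in predicted:
        if p_feature != target:
            continue
        if exact and (p_start, p_end) == (h_start, h_end):
            return True
        if not exact and p_start <= h_start and h_end <= p_end:
            return True
    return False


def check_all(hints=HINTS, predicted=PREDICTED):
    """Map each hint group name to whether the prediction satisfies it."""
    return {hint[3]: is_consistent(hint, predicted) for hint in hints}


if __name__ == "__main__":
    for group, ok in check_all().items():
        print(group, "consistent" if ok else "INCONSISTENT")
```

Under these assumed rules all five hint groups are consistent with the predicted transcript, which is why the "incompatible hint groups: 4" line in the report is surprising.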

    PS: When I changed "CDSpart" to "CDS", augustus did seem to obey the hints. But what if I don't know the exact boundaries of a CDS and have to use "CDSpart"?
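
The "CDSpart" to "CDS" change described in the PS can be scripted for larger hint files. This is a minimal sketch, assuming nine whitespace-separated columns with no spaces inside the attribute column, as in the hint file above; the function name `cdspart_to_cds` is hypothetical. It only makes sense where the exact CDS boundaries are known, so it does not answer the question of what to do with genuinely partial hints.

```python
def cdspart_to_cds(line):
    """Return the hint line with the feature 'CDSpart' renamed to 'CDS'.

    Blank lines, comment lines, and lines with any other feature type
    are returned unchanged (minus the trailing newline).
    """
    stripped = line.rstrip("\n")
    if not stripped or stripped.startswith("#"):
        return stripped
    # Split into at most 9 fields so the attribute column stays intact.
    fields = stripped.split(None, 8)
    if len(fields) == 9 and fields[2] == "CDSpart":
        fields[2] = "CDS"
        return "\t".join(fields)
    return stripped


if __name__ == "__main__":
    example = "scaffold_0\t.\tCDSpart\t101\t379\t10\t.\t.\tpri=2;grp=cds1;src=M\n"
    print(cdspart_to_cds(example))
```

Only the third column is touched; coordinates, priority, group and source are passed through as they are.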

    PPS: I have tried augustus v3.0.2, v3.0.1 and v2.7.1, and the problem occurs in all of them; v2.5.5, however, gave the correct summary report.

    Any comments would be greatly appreciated!
