  • RTG Core 3.4 Release / non-commercial availability, including source code

    Just in time for the holidays, we are pleased to announce our new
    release! (with updated URLs)

    We are especially excited to be making RTG Core available
    for non-commercial academic research under improved terms
    (see LICENCE) in response to the feedback we have been receiving.
    The main highlights are:

    * Free for non-commercial academic research

    * Unlimited duration (no license key file required)

    * Source code available on github at:
    https://github.com/RealTimeGenomics/rtg-core

    * The non-commercial release is available now via the following links:

    rtg-core-3.4-non-commercial-linux-x64.zip (60.0 MB)

    rtg-core-3.4-non-commercial-nojre.zip (13.4 MB)

    rtg-core-3.4-non-commercial-windows-x64.zip (54.1 MB)

    If you have any problems or questions, you can contact us at
    [email protected] and we'll do our best to help you out.
    If you require a license for commercial use, or wish to purchase
    commercial support, contact us via [email protected].

    Below are the release notes for RTG Core 3.4. We aim
    to produce an updated release of RTG Tools but couldn't fit it in just
    yet -- look for that in the new year.

    === Release Notes for RTG Core 3.4 ===

    Below are the release notes for RTG Core, upon which RTG Core 3.4
    is built. Not all features described below may be included in this
    product.

    RTG Core 3.4 (2014-12-20)
    -------------------------

    Major features of this release:

    * Added the ability to run variant calling only on a list of regions
    provided via BED file. This results in a large speed improvement
    when performing exome variant calling, by avoiding computation
    associated with off-target locations, as well as permitting fast
    variant calling of target sites from whole genome data, or running
    variant calling in haploid mode in areas of loss-of-heterozygosity.

    * Added the ability to perform variant calling for sites where the
    reference is unknown but where reads have been mapped. This can be
    used to fill in gaps in draft reference assemblies. This includes
    both sites where an N is observed in the reference, larger N-blocks
    where reads have been mapped spanning the N block, and large
    N-blocks where reads are anchored on one side by known reference.

    * Workflow improvements to human pipeline processing to identify
    mislabelled samples or incorrect pedigree. At the end of read
    mapping, average coverage levels across chromosomes are examined and
    a warning is issued if there appear to be gross chromosomal
    abnormalities or if the coverage levels do not match expected levels
    for the sex of the individual specified. A standalone tool for this
    is also provided. Similarly, the mendelian analysis tool now
    computes concordance with pedigree and issues a warning if low
    concordance indicates a parent or child is inconsistent with the
    supplied pedigree. In addition we have added two commands for
    manipulating, extracting information from, and summarizing pedigree
    files.

    * New commands for metagenomics taxonomy and reference database
    management. Previously using metagenomics databases other than those
    pre-built by RTG was difficult and error-prone. Three commands have
    been added to allow taxonomy construction starting from a NCBI
    taxonomy dump, filtering the taxonomy based on user criteria, and
    validating the structure of a metagenomics species reference
    database.
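
    As an illustration only (this is not RTG's implementation), the idea
    behind the first highlight, restricting work to BED regions, can be
    sketched in Python. The sketch assumes standard BED coordinates
    (0-based, half-open) and 1-based VCF-style positions. The function
    names and sample data are hypothetical.

```python
import bisect
from collections import defaultdict


def load_bed(lines):
    """Parse BED lines into sorted, merged (start, end) intervals per chromosome."""
    raw = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split("\t")[:3]
        raw[chrom].append((int(start), int(end)))
    merged = {}
    for chrom, intervals in raw.items():
        intervals.sort()
        out = [intervals[0]]
        for start, end in intervals[1:]:
            last_start, last_end = out[-1]
            if start <= last_end:  # overlapping or adjacent: extend the last interval
                out[-1] = (last_start, max(last_end, end))
            else:
                out.append((start, end))
        merged[chrom] = out
    return merged


def in_regions(regions, chrom, pos_1based):
    """True if a 1-based position lies inside any 0-based half-open interval."""
    intervals = regions.get(chrom)
    if not intervals:
        return False
    pos0 = pos_1based - 1
    # Index of the last interval whose start is <= pos0.
    i = bisect.bisect_right(intervals, (pos0, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos0 < intervals[i][1]


# Hypothetical sample data.
bed = ["chr1\t100\t200", "chr1\t150\t300", "chr2\t0\t50"]
regions = load_bed(bed)
sites = [("chr1", 100), ("chr1", 101), ("chr1", 300),
         ("chr1", 301), ("chr2", 50), ("chr3", 10)]
kept = [site for site in sites if in_regions(regions, *site)]
print(kept)
```

    Skipping positions before any per-site work is done is what makes a
    region restriction cheaper than computing everything and filtering
    afterwards.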


    Detailed changes are listed below by area. Please read these through
    fully, as some command-line flags have changed, so updates to your
    pipeline scripts may be required. For more information on new
    features, see the RTG Operations Manual.


    === Basic Formatting and Mapping

    * map/cgmap/mapf: As an alternative to supplying --sex to specify the
    sex of the individual being mapped, you may specify a pedigree file
    containing the sex information for the sample. This requires you to
    have either formatted the read set with read-group information or to
    supply read group information at mapping time (the advantage of this
    feature is that it lets you minimize the number of command-line
    differences for each sample being mapped).

    * map/cgmap: When mapping using a reference containing sex chromosome
    information, average per-chromosome coverage information is used to
    issue warnings when it is likely that the incorrect mapping sex has
    been specified or if any autosomes have abnormal coverage levels
    (perhaps indicating a chromosomal abnormality). This feature
    requires you to be using a reference genome SDF containing chromosome
    information, as described in the RTG Operations Manual.

    * chrstats: New command to perform standalone average coverage
    reporting and checking against expected coverage levels from
    calibrated mapping files. This is essentially the same check that is
    performed during mapping, but allows multiple mapping files to be
    provided (either if multiple mapping runs were performed for a
    single sample, or for batch reporting for multiple samples).

    * calibrate: New option --merge to allow merging multiple alignment
    files into a single output file while performing calibration. For
    example, this can reduce the number of I/O operations needed to go
    from multiple, uncalibrated, unindexed third party input files to a
    single calibrated indexed BAM file.

    * calibrate: New option --threads to allow calibration of multiple files to
    use multiple cores. (Currently this option only takes effect when
    used with the --merge option, not regular multi-file calibration.)
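
    As an illustration only, the kind of per-chromosome coverage check
    described above for map/cgmap and chrstats can be sketched in Python.
    This is not RTG's algorithm: the expected ratios, the tolerance, the
    chromosome names, and the function name are all assumptions made for
    the example.

```python
from statistics import median

# Hypothetical expected coverage relative to the autosomal median.
# These ratios are illustrative assumptions, not values used by RTG.
EXPECTED = {
    "male": {"chrX": 0.5, "chrY": 0.5},
    "female": {"chrX": 1.0, "chrY": 0.0},
}


def coverage_warnings(coverage, sex, tolerance=0.25):
    """Return warnings for chromosomes whose relative coverage looks wrong."""
    autosomes = [v for c, v in coverage.items() if c not in ("chrX", "chrY")]
    baseline = median(autosomes)
    warnings = []
    for chrom, cov in coverage.items():
        expected = EXPECTED[sex].get(chrom, 1.0)  # autosomes are expected at 1.0
        ratio = cov / baseline
        if abs(ratio - expected) > tolerance:
            warnings.append(
                f"{chrom}: relative coverage {ratio:.2f}, "
                f"expected about {expected:.2f} for {sex}"
            )
    return warnings


# Hypothetical average coverage per chromosome for one sample.
sample = {"chr1": 30.0, "chr2": 31.0, "chr3": 29.0,
          "chr21": 45.0, "chrX": 15.5, "chrY": 14.0}
as_male = coverage_warnings(sample, "male")
as_female = coverage_warnings(sample, "female")
for warning in as_female:
    print(warning)
```

    With this sample, checking against "male" flags only the elevated
    chr21, while checking against "female" also flags chrX and chrY,
    which is the pattern that would suggest the wrong sex was specified.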


    === Variant Calling

    * snp/family/population/somatic: New flag --bed-regions, adds the
    ability to only perform calling on the regions specified via a BED
    file. This is more efficient than applying BED filtering via
    --filter-bed. However note that the results can sometimes differ,
    due to edge effects of complex calling regions that cross region
    boundaries.

    * snp/family/population/somatic: Implemented variant calling across
    N's in the reference. (This was previously occurring in some cases
    where mappings across the N contain indels, but has now been fully
    implemented). Calls where the reference is not a valid allele due to
    containing an N are annotated with an NREF INFO tag for easy
    filtering, and contain neither QUAL nor GL values.

    * snp: As an alternative to supplying --sex to specify the sex of the
    individual for variant calling, you may specify a pedigree file
    containing the sex information for the sample. This can reduce the
    number of command-line differences when processing multiple samples.

    * family/population/somatic: Better error handling when input mappings
    contain a record that does not correspond to one of the samples
    being called.

    * snp/family/population/somatic: Fixed a hang that could occur when
    trying to clean up after an out-of-memory error.

    * snp/family/population/somatic: Fixed a rare crash that could occur
    at the end of chromosomes.

    * somatic: Previously stored a somatic score indicating the likelihood
    of the variant being a somatic variant in the QUAL field. This is
    not strictly according to the VCF spec, so this score has been moved
    to the new NCS INFO field.

    * vcfannotate: The --fill-ac-an flag now does not add an AC annotation
    when no ALTs are present in a record.

    * vcffilter: New flag --region to extract and filter only the variants
    contained within a single specified region.

    * vcffilter: New flag --bed-regions to extract and filter only
    variants contained within the regions contained in a BED file.

    * vcffilter: Better error handling when applying criteria that require
    GT be present to files that are missing the GT field.

    * vcfmerge: The default behaviour has changed when merging variants at
    the same position where the ALTs are different and the variants
    contain FORMAT fields that cannot be automatically merged
    (Number=A,G,R, or the special case of the AD FORMAT field). Now
    these FORMAT fields are removed to allow the merge to proceed. There
    is a new flag --preserve-formats to instead output separate variants
    that keep those FORMAT fields.

    * vcfeval: New flag --baseline-tp that allows additionally outputting
    the baseline version of true positive variants (the regular tp.vcf
    contains the called representation of true positive variants).

    * vcfeval: --squash-ploidy treats heterozygous calls in baseline and
    calls as homozygous ALT to allow a lenient comparison. Note that
    genotypes at multi-allelic sites where neither allele is REF simply
    choose the ALT with the highest index.

    * vcfeval: Fixed an exception that could occur when processing variants
    missing GT information for some samples.

    * vcfeval: Fixed an exception that could occur when provided variants
    that were outside the bounds of the supplied reference genome.

    * vcfeval: Fixed an inconsistency when handling ROC files in locales
    where ',' is the decimal separator.

    * mendelian: The default is now to perform checks only on non-failing
    variants. The --pass flag has been removed, and a new flag added
    --all-records in order to obtain the behaviour of checking all
    variant records regardless of filters.

    * mendelian: Now performs concordance checking to detect sample
    mislabelling and incorrect pedigree.

    * mendelian: Removed --male and --female flags, which were only needed
    for VCFs produced by versions of RTG prior to 2.7. If required,
    alternative pedigree information can be supplied via the --pedigree
    flag.
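
    As an illustration only, the --squash-ploidy rule described above for
    vcfeval (heterozygous calls treated as homozygous ALT, with the
    highest-index ALT chosen when neither allele is REF) can be sketched
    in Python. This is not vcfeval's code. The function name is
    hypothetical, and the handling of phased, haploid, and partially
    missing genotypes is an assumption of the sketch.

```python
import re


def squash_genotype(gt):
    """Collapse a VCF GT string to a homozygous call.

    Any genotype carrying a non-REF allele becomes homozygous for its
    highest-index ALT. All-REF and fully missing genotypes are unchanged.
    """
    alleles = re.split(r"[/|]", gt)
    called = [int(a) for a in alleles if a != "."]
    if not called or max(called) == 0:
        return gt
    top = str(max(called))
    return "/".join([top] * len(alleles))


for gt in ["0/1", "1|0", "1/1", "1/2", "0/0", "./."]:
    print(gt, "->", squash_genotype(gt))
```

    After squashing, a baseline genotype of 0/1 and a called genotype of
    1/1 at the same site compare as equal, which is what makes the
    comparison lenient about zygosity.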


    === Metagenomics

    * ncbi2tax: New tool to generate an RTG taxonomy file from NCBI
    taxonomy dump.

    * taxfilter: New tool for the custom filtering of taxonomy files and
    metagenomic reference SDFs containing taxonomy information.

    * taxstats: New tool for verifying the contents of a metagenomic
    reference SDF.
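
    As an illustration only, the sort of structural check a taxonomy needs
    can be sketched in Python. This is not the implementation of ncbi2tax
    or taxstats. It assumes input lines in the pipe-delimited layout of
    the NCBI nodes.dmp file (tax id, parent tax id, rank, ...), with the
    root listed as its own parent. The function names and the sample IDs
    are hypothetical.

```python
def parse_nodes(lines):
    """Map tax_id -> (parent_tax_id, rank) from nodes.dmp-style lines."""
    nodes = {}
    for line in lines:
        fields = [f.strip() for f in line.rstrip("\n").split("|")]
        if len(fields) < 3 or not fields[0]:
            continue
        nodes[int(fields[0])] = (int(fields[1]), fields[2])
    return nodes


def structure_problems(nodes):
    """Report a wrong root count, missing parents, and cycles."""
    problems = []
    roots = [t for t, (parent, _) in nodes.items() if parent == t]
    if len(roots) != 1:
        problems.append(f"expected exactly one root, found {len(roots)}")
    for tax_id, (parent, _) in nodes.items():
        if parent not in nodes:
            problems.append(f"{tax_id}: parent {parent} is missing")
    for tax_id in nodes:
        seen = set()
        cur = tax_id
        # Walk towards the root; revisiting a node means there is a cycle.
        while cur in nodes and nodes[cur][0] != cur:
            if cur in seen:
                problems.append(f"{tax_id}: cycle detected")
                break
            seen.add(cur)
            cur = nodes[cur][0]
    return problems


# Hypothetical taxonomy fragments.
good = ["1\t|\t1\t|\tno rank\t|", "10\t|\t1\t|\tgenus\t|", "11\t|\t10\t|\tspecies\t|"]
bad = good + ["12\t|\t99\t|\tspecies\t|"]
print(structure_problems(parse_nodes(good)))
print(structure_problems(parse_nodes(bad)))
```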


    === Other

    * sdfsubseq: The output sequence name is the same as the input
    sequence if the coordinates are unchanged.

    * many: Added the ability to read BED from stdin by specifying '-' as
    the BED file name (this is not supported in cases where a region
    restriction is also being applied to the file, as this would require
    the BED to be tabix indexed).

    * many: Added the ability to read VCF from stdin by specifying '-' as
    the VCF file name (not supported in cases where a region restriction
    is also being applied to the file, as this would require the VCF to
    be tabix indexed).

    * many: Users of linux bash can enable command and flag
    completion. See the file rtg-bash-completion in the scripts
    directory for more information.

    * bgzip: New flag --no-terminate allows omission of the block gzip
    termination block. This permits advanced users to compress multiple
    files for later fast concatenation (the termination block should be
    present on the final file only).

    * bgzip: New flag --compression-level allows altering the degree of
    compression (thus speed) from 1 (least but fast) to 9 (best but
    slow).

    * rocplot: GUI mode has better error handling when there is no
    graphical environment.

    * rocplot: PNG output mode will attempt to use headless mode to
    prevent an error when the graphical environment is unavailable.

    * popsim: Speed improvements.

    * readsim/cgsim: Added the --sam-rg flag to set the read group
    information to be stored in the output SDF. Removed --diploid-input
    as the recommended way to simulate diploid genomes is to use
    samplereplay or the --output-sdf option of
    samplesim/childsim/denovosim.

    * readsimeval: New command for evaluating the accuracy of mapping reads
    generated by readsim.

    * pedfilter: New command for pedigree file filtering and simple
    manipulation and conversion between pedigree PED files and
    pedigree-augmented VCF headers.

    * pedstats: New command for extracting and summarizing information
    contained in a pedigree file.

    * aview: The flag --dont-display-dots has been renamed to
    --no-dots for consistency.
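
    As an illustration only, the reasoning behind bgzip --no-terminate
    (compress pieces separately, concatenate them, and terminate only the
    final piece) can be sketched in Python. The sketch uses ordinary gzip
    members as a stand-in for real BGZF blocks, so it shows the
    concatenation idea and not the rtg bgzip output format. The 28-byte
    marker is the empty end-of-file block given in the SAM/BAM format
    specification. The function name and sample data are hypothetical.

```python
import gzip

# The 28-byte empty BGZF block that marks end-of-file.
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43"
    " 02 00 1b 00 03 00 00 00 00 00 00 00 00 00"
)


def has_bgzf_eof(data):
    """True if the byte string ends with the BGZF end-of-file marker."""
    return data.endswith(BGZF_EOF)


# Stand-in for compressing each file without a termination block.
# These are plain gzip members, not real BGZF blocks.
parts = [
    gzip.compress(b"chr1\t100\t200\n"),
    gzip.compress(b"chr2\t0\t50\n"),
]

# Concatenate the pieces and terminate once, at the very end.
combined = b"".join(parts) + BGZF_EOF

print(has_bgzf_eof(parts[0]), has_bgzf_eof(combined))
print(gzip.decompress(combined))
```

    Because concatenated gzip members decompress to the concatenation of
    their contents, joining separately compressed pieces needs no
    recompression. Only the end-of-file marker has to appear exactly
    once, on the final piece.
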
    Last edited by Stuart Inglis; 01-06-2015, 05:42 PM. Reason: Fixed the URLs
    Stuart Inglis, Ph.D.
    Real Time Genomics
    www.realtimegenomics.com

  • #2
    I have updated the URLs in the previous post.
    Stuart Inglis, Ph.D.
    Real Time Genomics
    www.realtimegenomics.com
