
**scordes** · 11-30-2011, 08:15 AM

The Complete Genomics variation (var) file is one of the key results in the data package delivered as part of our sequencing service. The entire data package is generated by our proprietary Complete Genomics Analysis Pipeline software, which we run internally and which is currently not available for customers to run themselves. The master variation (masterVar) file is derived from the var file and provides a reformatted, highly annotated list of variant calls. Complete Genomics provides both files as standard with each sequenced genome. Both files also contain a haplinkID (the HapLink field) that indicates phasing of nearby (within ~700 kb) variants. The algorithm that generates the phase information in the HapLink field is a component of the Analysis Pipeline and is therefore not customer accessible.
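Although the phasing algorithm itself is not available, the phase information it writes can still be read back from a delivered file. The following is a minimal sketch, not an official parser: it assumes a tab-delimited layout in which metadata lines start with `#`, the column header line starts with `>`, and the phase identifier sits in a column named `hapLink`. The function name `group_by_haplink` and the sample rows are hypothetical; check the column names against the header of your own var or masterVar file.

```python
import io
from collections import defaultdict


def group_by_haplink(lines, haplink_column="hapLink"):
    """Group variant rows by their haplink ID.

    Assumed layout (verify against your own file): metadata lines start
    with '#', the column header line starts with '>', fields are
    tab-separated, and rows without a haplink ID are unphased.
    """
    header = None
    groups = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and metadata
        if line.startswith(">"):
            header = line[1:].split("\t")  # column names, leading '>' dropped
            continue
        if header is None:
            continue  # data before a header cannot be interpreted
        row = dict(zip(header, line.split("\t")))
        link = row.get(haplink_column, "")
        if link:
            groups[link].append(row)
    return dict(groups)


# Made-up rows for illustration only; not real Complete Genomics data.
sample = io.StringIO(
    "#SAMPLE\tdemo\n"
    "\n"
    ">locus\tallele\tchromosome\tbegin\tend\tvarType\thapLink\n"
    "1\t1\tchr1\t100\t101\tsnp\t42\n"
    "1\t2\tchr1\t100\t101\tref\t43\n"
    "2\t1\tchr1\t250\t251\tsnp\t42\n"
    "3\tall\tchr1\t900\t901\tref\t\n"
)
phased = group_by_haplink(sample)
for link, rows in phased.items():
    print(link, [(r["chromosome"], r["begin"]) for r in rows])
```

Under these assumptions, rows that share a haplink ID are the ones reported as phased together, so each group can be treated as one haplotype block for downstream analysis. For a real file, pass an open file handle instead of the `io.StringIO` sample.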

As you have discovered, we do provide CGA Tools, a suite of open-source tools that support downstream analysis, manipulation, and format conversion of the files that we deliver to customers. Complete Genomics reads and mapping files can be converted into SAM/BAM format using the map2sam and evidence2sam tools in CGA Tools (http://cgatools.sourceforge.net).

A paper describing the computational techniques underlying the mapping, assembly, and variant calling process used in the Analysis Pipeline software was recently accepted by the Journal of Computational Biology (Carnevali et al., in press). If you have specific questions that we can address ahead of its publication, please feel free to contact us at [email protected].

**agel** · 12-02-2011, 09:23 AM

Originally posted by Marie_Noir

There is also samtools phase; however, it produces two files that have to be merged again, which is kind of cumbersome (and I don't know yet whether it handles indels).

How does samtools phase work, exactly?

Another option is Haplotype Improver. It uses a combination of paired-end data and older algorithms like PHASE and fastPHASE. I'm not sure whether it handles indels either, though.


cga tools and phasing
