I have a question about the Syndip dataset (https://github.com/lh3/CHM-eval): I'm struggling to find the Syndip VCF.
In the releases (https://github.com/lh3/CHM-eval/releases) there is a file named rep2.37.broad.hc.raw.vcf.gz, and I don't know what it is. There is also an archive named CHM-evalkit-20180222.tar, which contains full.37m.vcf and other files (BED, eval, ...). I searched around, and according to this document, full.37m.vcf is the truth dataset (https://www.biorxiv.org/content/bior...1/456103-1.pdf, page 16).
The problem is that rep2.37.broad.hc.raw.vcf.gz contains variants with the annotations I need to extract (MQ, DP, GQ, ...), but full.37m.vcf doesn't contain this information (just CHROM, POS, REF, ALT and QUAL).
So I tried intersecting rep2.37.broad.hc.raw.vcf.gz with full.37m.vcf, keeping the variants that are present in both files, with DP, MQ and GQ taken from rep2.37.broad.hc.raw.vcf.gz. Is that okay, given that I don't know what rep2.37.broad.hc.raw.vcf.gz is?
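To make the intersection concrete, here is a minimal sketch of what I mean, using only the Python standard library. It assumes both files are well-formed VCFs with at least eight tab-separated columns, that a variant counts as shared only when CHROM, POS, REF and ALT match exactly (no left-alignment or normalization, so differently represented indels would be missed), that DP and MQ live in the INFO column, and that GQ lives in the first sample column. The function names `open_vcf`, `variant_keys` and `shared_variants` are my own, not part of CHM-eval, and the sketch says nothing about what rep2.37.broad.hc.raw.vcf.gz actually is.

```python
import gzip


def open_vcf(path):
    """Open a plain or gzip-compressed VCF as text (hypothetical helper)."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def variant_keys(lines):
    """Collect (CHROM, POS, REF, ALT) keys, one per ALT allele."""
    keys = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom, pos, _id, ref, alts = line.rstrip("\n").split("\t")[:5]
        for alt in alts.split(","):
            keys.add((chrom, pos, ref, alt))
    return keys


def shared_variants(truth_lines, call_lines):
    """Yield call-set variants that also occur in the truth set,
    together with DP and MQ (from INFO) and GQ (from the first sample)."""
    truth = variant_keys(truth_lines)
    for line in call_lines:
        if line.startswith("#") or not line.strip():
            continue
        f = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = f[0], f[1], f[3], f[4]
        # INFO is a ';'-separated list of KEY=VALUE pairs (flags are skipped).
        info = dict(kv.split("=", 1) for kv in f[7].split(";") if "=" in kv)
        # FORMAT names the ':'-separated fields of each sample column.
        sample = dict(zip(f[8].split(":"), f[9].split(":"))) if len(f) > 9 else {}
        for alt in alts.split(","):
            if (chrom, pos, ref, alt) in truth:
                yield {
                    "chrom": chrom,
                    "pos": int(pos),
                    "ref": ref,
                    "alt": alt,
                    "DP": info.get("DP"),
                    "MQ": info.get("MQ"),
                    "GQ": sample.get("GQ"),
                }
```

With the real files this would be called as `shared_variants(open_vcf("full.37m.vcf"), open_vcf("rep2.37.broad.hc.raw.vcf.gz"))`. Annotation values are returned as strings, and a missing field comes back as `None`.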
I also noticed that QUAL in full.37m.vcf is always 30. Is that normal? Thanks.
syndip dataset for benchmark variant