  • gsgs
    Senior Member
    • Oct 2009
    • 139

    databases

    I'm looking for downloadable data to make lists/tables/graphs
    of mutations, to better understand evolution.

    Usually I download the big files from the GenBank FTP site and then filter
    for what I need.
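
    A minimal sketch of that filtering step, assuming the download is a
    gzip-compressed multi-FASTA file and that the wanted records can be
    recognised by a keyword in the header line. The file name and keyword in
    the usage comment are placeholders, not real GenBank paths.

```python
import gzip
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA text stream, one record at a time."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def filter_fasta(handle: TextIO, keyword: str) -> Iterator[Tuple[str, str]]:
    """Keep only records whose header contains the keyword (case-insensitive)."""
    keyword = keyword.lower()
    for header, seq in read_fasta(handle):
        if keyword in header.lower():
            yield header, seq


if __name__ == "__main__":
    # Tiny in-memory demo; for a real download one would use something like
    # gzip.open("big_file.fasta.gz", "rt") instead (hypothetical file name).
    demo = io.StringIO(">seq1 Homo sapiens\nACGT\n>seq2 Pan troglodytes\nGGCC\n")
    for header, seq in filter_fasta(demo, "homo sapiens"):
        print(header, len(seq))
```

    Because the records are streamed one by one, the whole file never has to
    fit in memory, only the largest single record.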

    Currently I'm looking for human Y chromosomes,
    but these are large files, so I want to download a database
    of compressed whole human Y chromosomes and a utility that recreates
    the chromosomes from that list in FASTA format.

    (Alternatively a new compressed format, but it should be
    well documented, simple, and clearly defined.)
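
    One way such a format could work, sketched under loud assumptions: each
    sample is stored only as its differences from a shared reference, and a
    small utility rebuilds the full sequence as FASTA. The format here
    (1-based position plus replacement base, substitutions only, no indels)
    and all function names are made up for illustration and are not an
    existing standard or tool.

```python
from typing import List, Tuple

# Hypothetical record type: (1-based position, replacement base)
Diff = Tuple[int, str]


def encode_diffs(reference: str, sequence: str) -> List[Diff]:
    """List the positions where the sample differs from the reference."""
    if len(reference) != len(sequence):
        raise ValueError("this sketch only handles substitutions (equal lengths)")
    return [(i + 1, b) for i, (a, b) in enumerate(zip(reference, sequence)) if a != b]


def apply_diffs(reference: str, diffs: List[Diff]) -> str:
    """Rebuild the full sample sequence from the reference and its diff list."""
    bases = list(reference)
    for pos, base in diffs:
        if not 1 <= pos <= len(bases):
            raise ValueError(f"position {pos} is outside the reference")
        bases[pos - 1] = base
    return "".join(bases)


def to_fasta(name: str, sequence: str, width: int = 60) -> str:
    """Format one record as FASTA text with fixed-width sequence lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    reference = "ACGTACGTAC"           # toy reference, not real data
    sample = "ACGTTCGTAA"              # toy sample, not real data
    diffs = encode_diffs(reference, sample)
    print(diffs)
    print(to_fasta("sample1", apply_diffs(reference, diffs), width=4), end="")
```

    Since two human Y chromosomes differ at only a small fraction of
    positions, a diff list like this is far smaller than the full sequence.
    A real format would also need to cover insertions, deletions and
    unknown bases.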


    I already did this for human (and partially primate) mtDNA,
    where GenBank has ~16000 full mtDNAs,
    which I can send if someone is interested.

    But I can't find good Y-chromosome data.



    I found this paper:



    Online databases for mtDNA and Y chromosome polymorphisms in human populations
    Alessandra Congiu, Paolo Anagnostou, Nicola Milia, Marco Capocasa,
    Francesco Montinaro & Giovanni Destro Bisol

    GenBank, European Nucleotide Archive and DNA Data Bank of Japan,
    usually referred to as “primary databases”

    PopSet makes downloading of population data easier

    14 mtDNA online databases, three of which also contain Y chromosome data
    (DNA-Fingerprint, Family Tree DNA and SMGF)

    Of the 12 databases for which this information was obtainable, only six have been
    updated in the course of 2012. A reference paper and online help is available only
    for 10 databases. We were able to list 7 Y chromosome databases,

    Only three databases were found to have been updated in 2012

    Family Tree DNA is the largest archive for mtDNA sequences (mainly unpublished) both at
    low (HVR-1 and II) and high resolution (complete mtDNA or coding region) (see Appendix
    1A). Phylotree and mtDNA Community provide the largest wealth of published whole genome
    sequences, with figures (14508 and 13492, respectively) not far from GenBank (16414).

    The largest number of Y chromosome STR haplotypes is available in
    Family Tree DNA (236302), Ysearch (112513) and YHRD databases (101055)

    The former is also the greatest source of SNP/STR combined haplotypes
    (62795). Data from scientific literature are used in YHRD. By contrast, US Y-STR database
    seems to contain mostly, if not only, haplotypes submitted from forensic laboratories and
    institutions. It is noteworthy that, unlike with mtDNA, GenBank does not give access to Y chromosome
    population data in the haplotypic form. What databases make it possible to retrieve/sha

    Unrestricted downloading is possible from 9 mtDNA databases, whereas three of them (Family
    Tree DNA, DNA-Fingerprint and mtDNA manager) make it possible to retrieve only a part of the data

    Data can be downloaded from only one Y chromosome database (Ysearch), whereas
    another two allow a partial retrieval (Family Tree DNA and DNA-Fingerprint)

    Phylotree contains the largest number of complete mtDNA genomes,

    A slightly lower number of mtDNA genomes is available in the recently published mtDNA
    Community (679 not available in GenBank),

    The number of sequences available in GenBank outnumbers these databases.

    Unfortunately, retrieving data from the relevant papers or (for unpublished data) obtaining them
    from corresponding authors is not always an easy task.

    YHRD contains a large number of high quality data for both STR and SNP loci. However, it
    cannot be directly accessed

    GenBank was found to include a total of 16,414 complete DNA sequences (Database accessed
    on 20/09/2012).
