Dear seqanswers,
I am new to genomics and bioinformatics. In my current study, we have
sequenced the genomes of tens of accessions of a plant using an Illumina
next-generation sequencer. The short reads of each accession have been
aligned to the reference genome, and SNPs and short indels have been
called for each accession relative to the reference. We now have a
separate SNP text file for each accession, with the columns below (the
accession name is the same on every line of a given file):
<accession name><chromosome><position><reference base><cons base><quality><support><concordance><avg_hits>
For classical population genetic analysis, however, we usually need all
the accessions combined into a single table in the following format:
<accessions><SNP_1><SNP_2><SNP_3><SNP_...>
accession_1, a,t,g,,,
accession_2, a,t,c,,,
accession_3, t,a,c,,,
accession_,,,,,,,,,,,,,
I would appreciate help or suggestions on how to do this format conversion,
or on any alternative approaches, ideally using R and Bioconductor.
If it requires database operations, how would I do that?
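As a rough illustration of the conversion being asked about, here is a minimal sketch in Python (not R, purely for illustration; the same pivot should be possible in R). It rests on several assumptions that are not confirmed above: the per-accession files are whitespace-separated, the first five columns are accession, chromosome, position, reference base and consensus base, and an accession with no SNP line at a site carries the reference base there. That last assumption would be wrong for sites with no read coverage. The function names and the example rows are made up.

```python
import csv
import io


def parse_snp_lines(lines):
    """Yield (accession, chromosome, position, ref_base, cons_base) per SNP line.

    Assumes whitespace-separated columns in the order described in the post;
    short lines and lines starting with '#' are skipped.
    """
    for line in lines:
        fields = line.split()
        if len(fields) < 5 or fields[0].startswith("#"):
            continue
        yield fields[:5]


def build_snp_matrix(records):
    """Pivot per-accession SNP records into one row per accession."""
    ref_at = {}  # (chromosome, position) -> reference base
    calls = {}   # accession -> {(chromosome, position): consensus base}
    for acc, chrom, pos, ref, cons in records:
        site = (chrom, int(pos))
        ref_at[site] = ref
        calls.setdefault(acc, {})[site] = cons

    sites = sorted(ref_at)  # union of all SNP sites across accessions
    header = ["accession"] + [f"{chrom}:{pos}" for chrom, pos in sites]
    # ASSUMPTION: no SNP line at a site means the accession has the reference base.
    rows = [
        [acc] + [calls[acc].get(site, ref_at[site]) for site in sites]
        for acc in sorted(calls)
    ]
    return header, rows


# Made-up example rows in the nine-column layout described above.
example = """\
accession_1 chr1 100 A T 30 10 0.9 1.0
accession_1 chr1 250 G C 28 8 0.8 1.0
accession_2 chr1 250 G C 31 12 0.95 1.0
accession_3 chr2 40 T A 25 7 0.85 1.1
"""

header, rows = build_snp_matrix(parse_snp_lines(example.splitlines()))

out = io.StringIO()
writer = csv.writer(out)
writer.writerow(header)
writer.writerows(rows)
print(out.getvalue())
```

With real data, the lines of all per-accession files would be fed into `parse_snp_lines` together. An accession with no SNP lines at all would not appear in the output and would have to be added separately.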
Thanks in advance.