Find SNPs in related strains




    Hello everyone,

    I am extremely new to bioinformatics, genome sequencing and working with the output data, so please excuse any naive questions (I have also only just learned to work in Linux for samtools/bcftools).
    Our lab has recently sequenced the genome of a laboratory strain for which the type strain genome is known. The genome was sequenced using Illumina, and the output was already processed for us using the DRAGEN pipeline.
    I have received all output from the sequencing, including .bam and .vcf files. I am starting to figure out what these files are, what kind of information they contain and how to work with them (yes, I am still at this level, sorry).

    Our first goal is to obtain a complete consensus sequence of the genome of our lab strain. Second, we would like to identify SNPs and their positions relative to the annotated genome of our reference strain.

    I have already been able to use IGV: I loaded the genome of our reference strain and imported the vcf file to find the SNPs. I know there are 60 SNPs/indels. Is there some "easy" automated way to get a list of all variants without having to scroll through IGV and go over them one by one?
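Since a .vcf file is tab-separated text with one variant per line, listing the variants can be sketched in a few lines of Python. This is only a minimal illustration, not a replacement for a dedicated tool: the function names (`open_vcf`, `classify`, `list_variants`) and the two example records are made up for the demonstration, and it assumes a standard VCF with CHROM, POS, ID, REF and ALT as the first five columns.

```python
import gzip
import io


def open_vcf(path):
    """Open a plain-text or gzip-compressed VCF for reading as text."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def classify(ref, alt):
    """Label a REF/ALT pair by comparing allele lengths."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(ref) == len(alt):
        return "MNP"  # several adjacent bases substituted
    return "indel"


def list_variants(lines):
    """Return (chrom, pos, ref, alt, type) tuples from VCF text lines."""
    variants = []
    for line in lines:
        # Header and metadata lines start with '#'; skip them and blanks.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alts = fields[:5]
        # A record may list several alternative alleles separated by commas.
        for alt in alts.split(","):
            # Skip missing, overlapping-deletion and symbolic alleles.
            if alt in (".", "*") or alt.startswith("<"):
                continue
            variants.append((chrom, int(pos), ref, alt, classify(ref, alt)))
    return variants


# Hypothetical two-record VCF, used only to show the output format.
example = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t101\t.\tA\tG\t50\tPASS\t.\n"
    "chr1\t250\t.\tCT\tC\t60\tPASS\t.\n"
)

for variant in list_variants(io.StringIO(example)):
    print(*variant, sep="\t")
```

For a real file, `list_variants(open_vcf("sample.vcf.gz"))` would be the equivalent call, where the file name is a placeholder. On the command line, `bcftools query -f '%CHROM\t%POS\t%REF\t%ALT\n'` followed by the VCF file name produces a similar table.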
    I also tried using bcftools to get a consensus sequence from a reference .fasta and the .bam file from the sequencing, but I get a sequence that is much shorter than my genome. I followed this guide: http://samtools.github.io/bcftools/h...-sequence.html
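To make the idea of a consensus sequence concrete, here is a toy Python sketch of what "applying variants to a reference" means. It is not how bcftools is implemented: the function name `apply_variants`, the short reference string and the two variants are all invented for illustration, and it assumes non-overlapping variants with 1-based positions, as in a VCF.

```python
def apply_variants(reference, variants):
    """Return the reference with each (pos, ref, alt) variant substituted.

    Positions are 1-based, as in a VCF. Variants are applied from the
    highest position to the lowest so that indels do not shift the
    coordinates of variants that still have to be applied.
    """
    seq = reference
    for pos, ref, alt in sorted(variants, reverse=True):
        start = pos - 1
        # Sanity check: the REF allele must match the reference sequence.
        if seq[start:start + len(ref)] != ref:
            raise ValueError(f"REF allele {ref!r} does not match at position {pos}")
        seq = seq[:start] + alt + seq[start + len(ref):]
    return seq


# Hypothetical 10-base reference with one SNP and one 1-base deletion.
reference = "ACGTACGTAC"
variants = [(2, "C", "T"), (5, "AC", "A")]

consensus = apply_variants(reference, variants)
print(consensus)
print(len(reference), len(consensus))
```

In this model the consensus differs in length from the reference only by the net size of the indels, which gives a quick sanity check for a real result: compare the length of the consensus with the length of the reference .fasta.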

    Is there an easy, basic guide that explains the file formats, where they come from and how they are connected to each other? I think understanding this would help me get started with samtools/bcftools, since their tutorials assume knowledge of these things. Other good sources of information relevant to my problems and goals are always welcome.
