  • vcf to consensus call reference instead of N

    Hi all

    I have a few genomes sequenced using Illumina. I have used samtools and vcfutils to make a consensus for each. All pretty standard stuff. However, using vcfutils to make the consensus gives me far too many N nucleotides to be happy with.

    Is there a way to create this consensus so that, wherever it currently outputs an N nucleotide, it inserts the reference nucleotide instead?

    Your help would be much appreciated. Any ideas?
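One way to picture the substitution being asked for is a post-processing pass over the finished consensus. The sketch below is only an illustration, not a samtools or vcfutils feature: the function name `fill_n_with_reference` is hypothetical, and it assumes the consensus and the reference are already loaded as plain strings of equal length in the same coordinate system (no indels applied to the consensus).

```python
def fill_n_with_reference(consensus: str, reference: str) -> str:
    """Return the consensus with every N (or n) replaced by the reference base.

    Assumption: both sequences share the same coordinates, so position i in
    the consensus corresponds to position i in the reference.
    """
    if len(consensus) != len(reference):
        raise ValueError("consensus and reference must have the same length")
    # Keep the called base unless it is N, in which case take the reference base.
    return "".join(
        ref_base if cons_base in ("N", "n") else cons_base
        for cons_base, ref_base in zip(consensus, reference)
    )


if __name__ == "__main__":
    print(fill_n_with_reference("ACNNTNA", "ACGGTCA"))
```

Note that filling Ns this way hides the difference between "confirmed to match the reference" and "no usable data at this site", so it may be worth keeping the original consensus alongside the filled one.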

  • #2
    Have you tried samtools mpileup?


    • #3
      Sorry, that's what I've used in the process. I used pretty much this exact command:

      samtools mpileup -uf ref.fa aln1.bam aln2.bam | bcftools view -bvcg - > var.raw.bcf
      bcftools view var.raw.bcf | vcfutils.pl varFilter -D100 > var.flt.vcf

      Is there anything I can change there to get it to call the reference sequence rather than Ns?


      • #4
        Normally you get those Ns when you don't specify the reference or specify a different version of the reference file in the command line.

        One thing I would check is whether mpileup on its own, without piping its output to bcftools, reports the reference base at each position rather than N.
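A rough way to run that check is to count how often the reference column of the pileup is N. This is a sketch under stated assumptions: it expects the plain-text pileup (mpileup run without `-u` or `-g`), where the tab-separated columns start with chromosome, position, reference base, and depth. The function name `count_ref_n` is made up for illustration.

```python
from typing import Iterable, Tuple


def count_ref_n(pileup_lines: Iterable[str]) -> Tuple[int, int]:
    """Return (positions seen, positions whose reference base is N).

    Assumption: each line is tab-separated text pileup with the reference
    base in the third column.
    """
    total = 0
    n_ref = 0
    for line in pileup_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        total += 1
        if fields[2].upper() == "N":
            n_ref += 1
    return total, n_ref
```

It could be fed from standard input, for example `count_ref_n(sys.stdin)` in a small script that receives the output of `samtools mpileup -f ref.fa aln1.bam`. If nearly every position comes back as N, the reference was probably not picked up; if only a few do, those Ns may already be present in the reference FASTA itself.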


        • #5
          Thanks for the response. I'll test the mpileup tonight.

          I'm pretty sure I have the reference sorted. I've used snpEff to analyse the SNPs in the genomes, and for each SNP it has had the correct reference sequence.

          Could it be a problem of coverage, as in the sequencing reads might not cover this area?


          • #6
            Could be coverage, but if that's the case then it shouldn't be making a variant call at that site with any degree of confidence, right?
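To test the coverage idea directly, the N positions in the consensus can be split by read depth. The following is a minimal sketch, not part of any tool mentioned in this thread: `split_n_by_depth` and the `min_depth` threshold of 3 are illustrative choices, and it assumes depths have already been collected into a dict keyed by 1-based position, with positions missing from the dict treated as zero coverage.

```python
from typing import Dict, List, Tuple


def split_n_by_depth(
    consensus: str, depths: Dict[int, int], min_depth: int = 3
) -> Tuple[List[int], List[int]]:
    """Split the 1-based positions of N in the consensus by read depth.

    Returns (low_coverage_positions, adequately_covered_positions).
    Assumption: a position absent from `depths` has no reads covering it.
    """
    low: List[int] = []
    covered: List[int] = []
    for pos, base in enumerate(consensus, start=1):
        if base.upper() != "N":
            continue
        if depths.get(pos, 0) < min_depth:
            low.append(pos)
        else:
            covered.append(pos)
    return low, covered
```

If most Ns land in the low-coverage list, missing reads are the likely explanation. Ns at well-covered positions would point toward something else, such as filtering or the way the consensus step handles those sites.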
