  • Consensus Calling BCFTOOLS PacBio gaps

    Dear community,
    I mapped reads from targeted sequencing (PacBio CCS) to a reference with BWA-MEM (local alignment, because the reads come from inverse PCR). In most cases the reference is not covered completely. I'd like to get a single consensus sequence, since the second variant will never show up (diploid organism). My aim is to get something like this:
    ATTTGATTTAGG-ATTGT-----------------ATGCTTCGTAT-T
    At the moment:
    1. Gaps seem to be filled in with the reference sequence, which I would like to avoid.
    2. Looking at one locus in IGV, I found a position where the reference has an A and my reads show 172 gaps, 2 C, 2 A and 1 T, yet mpileup calls a T?!

    Here are the commands I use:
    Code:
    system "bcftools mpileup -x -P Pacbio -e 5 -f $ARGV[0].fasta $ARGV[0].sort.bam|bcftools call -m -Oz -o $ARGV[0].TESTvcf.gz";
    #-M output site where REF allele is N
    system "tabix $ARGV[0].TESTvcf.gz";
    system "cat $ARGV[0].fasta | bcftools consensus $ARGV[0].vcf.gz > $ARGV[0].cns.TESTfa"
    Could anybody tell me which parameters I have to change (I also tried -M)? I tried a few, but none gave me the output I expected.

    Thanks in advance!
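
To make the expected result for point 2 concrete, here is a minimal sketch of a plain per-column majority vote in which a gap counts as an ordinary symbol and uncovered positions are written as `-`. This is not how bcftools calls variants; the function names `call_column` and `consensus` and the `min_depth` parameter are hypothetical and exist only for this illustration.

```python
from collections import Counter

GAP = "-"  # symbol used for deletions and for uncovered positions


def call_column(bases, min_depth=1):
    """Return the most frequent symbol observed at one position.

    `bases` holds one character per read covering the position, with "-"
    meaning the read has a deletion there. Positions with fewer than
    `min_depth` observations are reported as a gap instead of the reference.
    """
    counts = Counter(bases)
    depth = sum(counts.values())
    if depth < min_depth:
        return GAP
    symbol, _count = counts.most_common(1)[0]
    return symbol


def consensus(columns, min_depth=1):
    """Join the per-position calls into one consensus string."""
    return "".join(call_column(column, min_depth) for column in columns)


# The position described in the question: 172 gaps, 2 C, 2 A, 1 T.
column = [GAP] * 172 + ["C"] * 2 + ["A"] * 2 + ["T"]
print(call_column(column))  # -
```

With a rule like this, the position from point 2 comes out as a gap, and a position without any reads comes out as `-` rather than as the reference base.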

  • #2
    Please help me, people. Is the question too simple, or do the 120 readers have no idea either?



    • #3
      Okay, I might now know why bcftools fails (too many real gaps in the long reads, which might then be sorted out). Any other suggestions? I cannot find an up-to-date best practice for GATK3/4. Has any of you ever used freebayes for issues like this?
