Hi, following the Novoalign NGS Quick Start Tutorial, I used the following script:
######################################
#! /bin/sh
# Require three arguments: reads file, reference FASTA, output prefix
if [ $# -lt 3 ]; then
echo "Usage: $0 query.fa reference.fa head"
exit 1
fi
/var/data/novocraft/novoindex $2.index $2
/var/data/novocraft/novoalign -o SAM -f $1 -d $2.index > $3.aln.sam
/var/data/samtools-0.1.7_x86_64-linux/samtools view -bS $3.aln.sam > $3.aln.bam
/var/data/samtools-0.1.7_x86_64-linux/samtools sort $3.aln.bam $3.aln.sort
/var/data/samtools-0.1.7_x86_64-linux/samtools rmdup -s $3.aln.sort.bam read1.rmpup.bam
/var/data/samtools-0.1.7_x86_64-linux/samtools view -u -q 20 read1.rmpup.bam > read1.rmpup.q20.bam
/var/data/samtools-0.1.7_x86_64-linux/samtools pileup -vcf $2 read1.rmpup.q20.bam > $3.raw.txt
perl /var/data/samtools-0.1.7_x86_64-linux/samtools.pl varFilter -D 1000 $3.raw.txt > $3.flt.txt
awk '($3=="*"&&$6>=50)||($3!="*"&&$6>=30)' $3.flt.txt > $3.final.txt
##################################################
What puzzles me is that read1.rmpup.q20.bam is bigger than read1.rmpup.bam, even though -q 20 is a filter and should only remove reads.
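To check whether -q 20 really removed reads, one option is to compare read counts rather than file sizes, since file size also depends on compression and the -u option asks samtools view for uncompressed BAM output. Below is a minimal sketch of that check; the function name count_reads and the SAMTOOLS override variable are my own placeholders, not part of the tutorial.

```shell
#!/bin/sh
# Sketch: count alignment records in a BAM file instead of comparing file sizes.
# SAMTOOLS is a hypothetical override variable; it defaults to "samtools" on PATH.
count_reads() {
    # "samtools view" without -h prints one alignment per line, without the header
    "${SAMTOOLS:-samtools}" view "$1" | wc -l | tr -d ' '
}

# Example usage (file names from the script above):
#   count_reads read1.rmpup.bam
#   count_reads read1.rmpup.q20.bam
```

If the second count is not larger than the first, the size difference comes from how the file was written and not from extra reads.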
Another interesting result is that -D in varFilter has a large effect on flt.txt: with -D 1000, all of the results in flt.txt are indels, while with -D 100, most of the results are SNPs.
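To put numbers on that difference, a small tally of the filtered file could be run once per -D setting. This is only a sketch: it assumes, as the awk filter in the script does, that indel rows carry "*" in the third column, and count_variant_types is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: tally indel rows (third column is "*") versus all other rows
# in a pileup-format file such as the flt.txt output above.
count_variant_types() {
    awk '$3 == "*" { indel++ }
         $3 != "*" { snp++ }
         END { printf "snp=%d indel=%d\n", snp, indel }' "$1"
}

# Example usage, once for each -D setting:
#   count_variant_types head.flt.txt
```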
Any advice would be greatly appreciated.