Attached are the original output from bedtools (zipped file) and a custom file (average.txt), generated with the awk code below, in which the average reads per bait are calculated.
Is it possible to output the length of the bait, the average # of reads, and the # calculated for 40x coverage?
The awk below handles the first 2 things (length of bait and average # of reads), but I am not sure how to accomplish the last part (calculating the 40x coverage). Thank you.
Code:
awk '{if(len==0){last=$4;total=$6;len=1;getline}if($4!=last){printf("%s\t%f\n", last, total/len);last=$4;total=$6;len=1}else{total+=$6;len+=1}}END{printf("%s\t%f\n", last, total/len)}' output.bam.hist.txt > average.txt
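One possible way to get all three values in a single pass is sketched below. It rests on some assumptions that may not match the real data: that the input is per-base output (one row per position, with the bait name in $4 and the depth at that position in $6, as in the awk above), that all rows for a bait are contiguous, and that "the # for 40x coverage" means the number of bases in the bait covered at 40x or more. The function name bait_summary and the extra percentage column are only illustrative.

```shell
# Hypothetical helper: per-bait length, mean depth, and bases at >= threshold.
# Usage: bait_summary FILE [MIN_DEPTH]   (MIN_DEPTH defaults to 40)
# Assumed input columns: $4 = bait name, $6 = depth at one position.
bait_summary() {
    awk -v min="${2:-40}" '
        # Print one summary row for the bait that just ended.
        function emit() {
            if (len > 0)
                printf("%s\t%d\t%.2f\t%d\t%.1f\n",
                       last, len, total / len, ok, 100 * ok / len)
        }
        # A new bait name starts a new group: report the old one, reset counters.
        $4 != last { emit(); last = $4; len = 0; total = 0; ok = 0 }
        # Every row adds one position to the current bait.
        { len++; total += $6; if ($6 >= min) ok++ }
        # Report the final bait.
        END { emit() }
    ' "$1"
}
```

Called as `bait_summary output.bam.hist.txt 40 > summary.txt` (summary.txt is just a placeholder name), this would write tab-separated columns: bait, length, average depth, bases at >= 40x, and percent of the bait at >= 40x. If the 40x figure is meant to be something else, such as the number of reads needed to reach 40x, only the `ok` counter and its output columns would need to change.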