I want to search for a specific sequence in both my RNA-seq and WGS data. The sequence is considerably longer than the read length of either experiment. I have access to all the BAM files, some VCF files for the WGS data, the raw FASTQ files, and everything else you would expect from the sequencing. I want to know whether the sequence is present in the data and, if it is, whether it shows up in the aligned or the unaligned BAM files.
The background to my question is that I believe this sequence would not map successfully to the reference, but might still exist in the reads. I am unsure how to go about this, or whether it can be done at all.
My initial idea was to create some kind of consensus sequence from the RNA-seq BAM files (both unaligned and aligned) and simply search the resulting sequence for my sequence of interest. This has proven hard, however: according to Google there are numerous ways of doing it, and none is clearly the best. The "best" candidate involves vcftools, which for the life of me I cannot get to install on my Mac (there are no makefiles, although the documentation says there should be!).
In essence, I just want to find my sequence in my data. How do I do this?
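A minimal sketch of one reference-free way to frame the problem, assuming the reads are available as plain-text (not gzipped) FASTQ: because the sequence is longer than any read, split it into k-mers and count how many read k-mers, on either strand, hit each base. The function names, the file name and the default `k = 25` are all hypothetical choices for illustration, not part of any existing tool, and only exact k-mer matches are counted.

```python
from typing import Dict, Iterable, Iterator, List

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def index_target(target: str, k: int) -> Dict[str, List[int]]:
    """Map every k-mer of the target to the positions where it starts."""
    index: Dict[str, List[int]] = {}
    for i in range(len(target) - k + 1):
        index.setdefault(target[i:i + k], []).append(i)
    return index


def target_coverage(target: str, reads: Iterable[str], k: int = 25) -> List[int]:
    """Count, per target base, how many exact read k-mer hits cover it.

    Both strands of every read are checked. The default k is an
    illustrative assumption; it should be shorter than the reads.
    """
    target = target.upper()
    index = index_target(target, k)
    coverage = [0] * len(target)
    for read in reads:
        read = read.upper()
        for strand in (read, reverse_complement(read)):
            for j in range(len(strand) - k + 1):
                for pos in index.get(strand[j:j + k], ()):
                    for p in range(pos, pos + k):
                        coverage[p] += 1
    return coverage


def read_fastq_sequences(path: str) -> Iterator[str]:
    """Yield the sequence line of each 4-line FASTQ record (plain text)."""
    with open(path) as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:
                yield line.strip()


if __name__ == "__main__":
    # Toy data; with real data one would pass
    # read_fastq_sequences("reads.fastq"), where the path is hypothetical.
    toy_target = "ACGTTGCATGCCGATAGCTA"
    toy_reads = ["ACGTTGCATG", reverse_complement("GCCGATAGCTA")]
    cov = target_coverage(toy_target, toy_reads, k=5)
    covered = sum(1 for c in cov if c > 0)
    print(f"{covered}/{len(cov)} target bases covered by at least one k-mer")
```

If most of the sequence is covered without gaps, that suggests it is present in the reads. Running the same check on the unmapped reads only (for example, reads exported with `samtools fastq -f 4`) would show whether the support comes from the unaligned fraction. Sequencing errors and variants will break exact k-mer matches, so zero coverage at a few positions is not proof of absence.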