Greetings,
I have MiSeq data containing 5M paired-end reads of my sample. The sample originally contained some 30,000 individual fragments of up to 3 kb, each containing a barcode on both ends. The fragments were sheared to 150 bp and sequenced on an Illumina MiSeq. I am trying the following.
1. grep the original FASTQ file for the reads containing the barcode and record the IDs of ~300,000 reads (only about 10,000 unique reads). --Complete
2. Generate a .bam file of the entire FASTQ file aligned to the genome. --Complete
3. Pull out the genomic start location of each aligned read that matches an ID from step 1. --Incomplete
4. Plot locations of barcoded reads across the reference. --Incomplete
I have limited programming experience. So far, I'm using Matlab to parse out the barcoded regions, but I am stuck trying to match the barcode IDs to the aligned reads from the .bam file. How can I make a file containing the positions of reads (or even the entire read entries) from my .bam file based on another file containing read IDs? Would it be easiest to do a new alignment with my much smaller set of barcoded FASTQ files?
Thank you,
Ben
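One possible way to approach step 3 without realigning is sketched below in Python. It is only a sketch under a few assumptions: the read IDs from step 1 are in a plain text file with one ID per line (the name barcode_ids.txt is hypothetical), the IDs match the read names in the alignment once a leading "@" and anything after the first space are dropped, and the .bam file has been converted to SAM text with `samtools view`. The function names are made up for illustration.

```python
import sys


def load_read_ids(lines):
    """Collect read IDs, one per line.

    Assumption: a leading '@' (FASTQ header style) and anything after the
    first whitespace are not part of the read name stored in the alignment.
    """
    ids = set()
    for line in lines:
        tokens = line.split()
        if tokens:
            ids.add(tokens[0].lstrip("@"))
    return ids


def barcoded_positions(sam_lines, read_ids):
    """Yield (read_id, reference, position, strand) for mapped alignments
    whose read name is in read_ids.

    The position is the SAM POS field: the 1-based leftmost aligned base.
    For reverse-strand reads this is not the 5' end of the read.
    """
    for line in sam_lines:
        if line.startswith("@"):  # SAM header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        qname, flag, rname, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
        if flag & 0x4:  # read is unmapped
            continue
        if qname in read_ids:
            strand = "-" if flag & 0x10 else "+"
            yield qname, rname, pos, strand


# Runs only when a read-ID file is given on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1]) as handle:
        wanted = load_read_ids(handle)
    for record in barcoded_positions(sys.stdin, wanted):
        print(*record, sep="\t")
```

If the script were saved as barcoded_positions.py (again a hypothetical name), it could be run as `samtools view aligned.bam | python barcoded_positions.py barcode_ids.txt > positions.tsv`. The resulting tab-separated file has one row per matching alignment and can be loaded into Matlab or any plotting tool for step 4. Because the reads are paired, both mates share a read name and will each produce a row.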