I am hunting for transcripts that are missing in a mutant whose phenotype is known to be due to a ~1 Mb genomic deletion. However, the genome assembly in this region is very poor. I have 4 lanes of RNA-seq reads for both the mutant and WT siblings, as well as one lane of 1x50 bp whole-genome sequencing (WGS) for each.
I have already run the WT sibling RNA-seq data through Trinity to build a de novo transcriptome, and I have mapped the WGS reads for both mutant and WT to that transcriptome with bowtie2 in --local mode. bowtie2 reported 16% of reads aligning exactly 1 time, which seems reasonable to me for mapping genomic reads to a transcriptome.
My question is twofold. First, is this the best approach? Would a different aligner, or different settings, be better? I am working somewhat under the assumption that most of the Trinity contigs will be 3' UTR regions a few hundred base pairs long, and so are not likely to be spliced.
My second question is how best to mine these alignments for potentially deleted transcripts. My first thought is to generate coverage wiggle plots across each transcript for mutant and WT, then try different arbitrary cutoffs on the difference between the two and see what I come up with. My second thought is to run a differential expression program such as edgeR on the per-transcript WGS read counts and see what it comes up with. Does anyone have any better ideas?
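To make the coverage-cutoff idea concrete, here is a minimal Python sketch of the kind of comparison I have in mind. It assumes per-contig WGS read counts are already available as plain dictionaries (for example, parsed from something like `samtools idxstats` output on each BAM). The function name, the parameter names, and the default thresholds are all hypothetical and would need tuning against real data; the sketch also assumes the mutant is homozygous for the deletion, so deleted contigs should have close to zero mutant reads.

```python
import math


def candidate_deleted_transcripts(wt_counts, mut_counts, min_wt_reads=10,
                                  max_log2_ratio=-3.0, pseudocount=0.5):
    """Flag contigs with WGS coverage in WT but little or none in the mutant.

    wt_counts / mut_counts: dict mapping contig name -> number of WGS reads.
    All thresholds are illustrative defaults, not recommended values.
    """
    wt_total = sum(wt_counts.values())
    mut_total = sum(mut_counts.values())
    if wt_total == 0 or mut_total == 0:
        raise ValueError("both samples need at least one mapped read")

    # Scale mutant counts to the WT library size so the samples are comparable.
    scale = wt_total / mut_total

    hits = []
    for name, wt in wt_counts.items():
        # Skip contigs with too few WT reads to say anything reliable.
        if wt < min_wt_reads:
            continue
        mut_raw = mut_counts.get(name, 0)
        mut_scaled = mut_raw * scale
        # The pseudocount avoids log2(0) when the mutant has no reads at all.
        log2_ratio = math.log2((mut_scaled + pseudocount) / (wt + pseudocount))
        if log2_ratio <= max_log2_ratio:
            hits.append((name, wt, mut_raw, log2_ratio))

    # Most strongly depleted contigs first.
    hits.sort(key=lambda h: h[3])
    return hits
```

Each returned tuple is (contig name, WT reads, raw mutant reads, log2 ratio). With a single lane of 1x50 bp WGS and contigs of a few hundred bp, per-contig counts may be small, so the `min_wt_reads` filter probably matters more than the exact ratio cutoff.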