Dear all,
I have 12 sequenced libraries of Illumina HiSeq data: 6 biological replicates for each of two conditions. The paired-end reads contained a large amount of primer sequence, so I trimmed the data with Trimmomatic, and it turned out that about half of my reads lost their mate.
That means that for each of my original paired-end libraries, I now have ~15 million intact read pairs and ~15 million single-end reads that have lost their mate. I mapped the two sets separately to the genome using TopHat and now want to do differential expression analysis.
For DESeq/edgeR, can I just add the per-gene read counts for the PE and SE reads from the same source library back together?
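To make the DESeq/edgeR question concrete, this is roughly what I mean by adding the counts back together. It is only a sketch: the gene IDs and numbers are made up, `combine_counts` is a hypothetical helper of mine, and it assumes each intact pair was counted once as a fragment rather than once per mate.

```python
from collections import Counter


def combine_counts(pe_counts, se_counts):
    """Sum per-gene counts from the PE and SE mappings of one library."""
    total = Counter(pe_counts)   # start from the paired-end counts
    total.update(se_counts)      # Counter.update adds counts, it does not overwrite
    return dict(total)


# Made-up counts for a single library
pe = {"geneA": 120, "geneB": 30}
se = {"geneA": 95, "geneC": 12}

combined = combine_counts(pe, se)
print(combined)
```

The combined table for each of the 12 libraries would then be one column of the count matrix.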
For Cuffdiff, should I just merge the PE and SE BAM files for each library, sort by coordinate, and then carry out the analysis?