We recently ran our FIRST RNA-seq run on our new Illumina NextSeq. It was a paired-end run with 12 indexed samples. The data comes as 8 fastq files per sample (4 lanes x Read 1 and Read 2), for a total of 96 fastq files. We are planning on using the following pipeline to analyze our data for differential gene expression.
Trimmomatic ---> Rockhopper
We are running that analysis on a Windows 7 machine (I don't have any other options), and I have Geneious installed as well.
So with all that background here is my question:
At what point in the pipeline do I combine the 8 fastq files for each sample into 2 fastq files (R1, R2)?
I was thinking that I would combine the files before I run Trimmomatic, so as to save myself MANY repetitions of the same analysis steps. The only way I have figured out to combine fastq files on my Windows 7 machine is to use Geneious (which gives me a warning that some of the metadata may be lost).
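For reference, combining per-lane fastq files does not need a dedicated tool: the lane files can simply be appended to one another, as long as R1 and R2 are appended in the same lane order so the read pairs stay in sync. Below is a minimal Python sketch of that idea. It assumes the usual Illumina naming scheme `<sample>_S<n>_L00<lane>_R<1|2>_001.fastq` (optionally `.gz`); the function names `group_by_sample_and_read` and `merge_lanes` and the output naming are made up for illustration, so adjust the pattern if your files are named differently.

```python
import re
import shutil
from collections import defaultdict
from pathlib import Path

# Assumed Illumina-style file name: <sample>_S<n>_L<lane>_R<read>_001.fastq[.gz]
# (this pattern is an assumption; change it if your files are named differently)
LANE_PATTERN = re.compile(
    r"^(?P<sample>.+)_L(?P<lane>\d{3})_(?P<read>R[12])_001\.fastq(?P<gz>\.gz)?$"
)


def group_by_sample_and_read(filenames):
    """Map (sample, read, extension) to its lane files, sorted by lane number."""
    groups = defaultdict(list)
    for name in filenames:
        match = LANE_PATTERN.match(name)
        if match is None:
            continue  # skip anything that is not a per-lane fastq file
        key = (match["sample"], match["read"], match["gz"] or "")
        groups[key].append((int(match["lane"]), name))
    # Sorting by lane keeps R1 and R2 in the same order, so pairs stay in sync.
    return {key: [name for _, name in sorted(files)] for key, files in groups.items()}


def merge_lanes(in_dir, out_dir):
    """Append the lane files of each sample/read into one file in out_dir."""
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    groups = group_by_sample_and_read(p.name for p in in_dir.iterdir() if p.is_file())
    written = []
    for (sample, read, gz), names in sorted(groups.items()):
        target = out_dir / f"{sample}_{read}.fastq{gz}"
        with open(target, "wb") as out:
            for name in names:
                # Byte-wise append: valid for plain fastq and for gzip files,
                # since concatenated gzip members form a valid gzip stream.
                with open(in_dir / name, "rb") as src:
                    shutil.copyfileobj(src, out)
        written.append(target)
    return written
```

Calling `merge_lanes("raw_fastq", "merged_fastq")` (both folder names are placeholders) would write one R1 and one R2 file per sample. The same byte-wise append can also be done from the Windows command prompt with `copy /b file1 + file2 combined`, again listing the lanes in the same order for R1 and R2.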
I ran a side-by-side comparison of the 2 combined fastq files against the 8 separate fastq files using my workflow, and the output in Rockhopper said that ~35% of genes were differentially expressed (which doesn't make ANY sense, since these are EXACTLY the same samples; one set has just been combined and the other has not).
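One way to check whether the combining step itself changed the data is to compare read counts: the combined file should contain exactly as many records as the per-lane files added together. The sketch below assumes standard 4-line fastq records (no wrapped sequence lines), and `count_reads` and `merge_preserved_reads` are hypothetical helper names. A matching count does not prove the files are identical, but a mismatch would point at the merge rather than at Trimmomatic or Rockhopper.

```python
import gzip
from pathlib import Path


def count_reads(path):
    """Count records in a fastq file, assuming 4 lines per record."""
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rb") as handle:
        lines = sum(1 for _ in handle)
    if lines % 4 != 0:
        raise ValueError(f"{path} has {lines} lines, not a multiple of 4")
    return lines // 4


def merge_preserved_reads(lane_files, combined_file):
    """True if the combined file holds as many reads as the lane files together."""
    return sum(count_reads(p) for p in lane_files) == count_reads(combined_file)
```

Running this once for the R1 files and once for the R2 files of a sample also confirms that R1 and R2 still contain the same number of reads, which paired-end tools expect.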
Any guidance would be greatly appreciated!!!!!