Hello All,
I am very new to bioinformatics. I am a wet-lab biologist trying to teach myself about RNA-seq.
I found an RNA-seq study on the GEO database and downloaded its data as FASTQ files with the SRA Toolkit, using "fastq-dump --split-files thedata".
I ended up with thedata_1.fastq and thedata_2.fastq. When I ran these through FastQC, the _1 file had a sequence length of 75, while the _2 file had a sequence length of 25.
Is this a mistake I made? I couldn't find any previous topics that covered this. I assumed that --split-files would show me whether the data were paired-end reads, and that if so, both files should have the same read length.
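In case it helps anyone reproduce what I am seeing, here is a minimal Python sketch of how the read lengths in each file could be tallied independently of FastQC. The function name read_length_counts is just something I made up for illustration, and the sketch assumes plain, uncompressed FASTQ with exactly four lines per record (no wrapped sequence lines).

```python
from collections import Counter
from pathlib import Path


def read_length_counts(fastq_path):
    """Count how many reads of each length a FASTQ file contains.

    Assumes four lines per record (header, sequence, '+', qualities).
    """
    counts = Counter()
    with open(fastq_path) as handle:
        for line_number, line in enumerate(handle):
            # The sequence is the second line of every four-line record.
            if line_number % 4 == 1:
                counts[len(line.rstrip("\r\n"))] += 1
    return counts


if __name__ == "__main__":
    # File names taken from the fastq-dump output described above.
    for name in ("thedata_1.fastq", "thedata_2.fastq"):
        if Path(name).exists():
            print(name, dict(read_length_counts(name)))
```

If every read in a file has the same length, the result for that file should contain a single key, which would confirm that the 75 and 25 reported by FastQC apply to all reads rather than being a summary of mixed lengths.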
If not, is the data still usable? Since I am just trying to teach myself how to work with this data, it would not be a big deal to abandon it, but that would also not exactly fulfill the goal.
Thanks for any help/advice