Hi,
This is my first time working with RNA-seq data from a HiSeq 2000. After looking at the FastQC reports from the sequencing facility and reading some of the older posts here, I still have some questions and need help.
There are some quality checks that all of my samples consistently failed (or got a warning on):
1. GC/sequence content. All samples failed this category (see figures 1 & 2). Does this suggest that I need to trim off the first ~13 bases? GC content is supposed to be stable, so what does this bias suggest?
2. Duplication (figure 3). From what I have read, ideally every read should be represented only once. Should I remove all redundant reads?
3. Overrepresented sequences (figure 4). Does this mean I have adaptor sequences that haven't been removed? How should I address this problem?
4. Kmer content. I am still not quite sure what this means, but I received a warning on it.
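To make questions 1 and 2 concrete, here is a minimal sketch of how I understand these two checks. It is only my assumption of the idea, not FastQC's actual implementation: the function names `per_position_base_fractions` and `duplicate_fraction` are made up for illustration, and the reads are passed in as plain strings rather than parsed from a FASTQ file.

```python
from collections import Counter


def per_position_base_fractions(reads):
    """Return one dict per read position mapping base -> fraction of reads.

    If base composition were unbiased, the fractions would look roughly
    the same at every position, including the first ~13 bases.
    """
    counts = []  # counts[i] tallies the bases seen at position i
    for read in reads:
        for i, base in enumerate(read):
            if i == len(counts):
                counts.append(Counter())
            counts[i][base] += 1

    fractions = []
    for position_counts in counts:
        total = sum(position_counts.values())
        fractions.append({b: position_counts[b] / total for b in "ACGTN"})
    return fractions


def duplicate_fraction(reads):
    """Return the fraction of reads that are exact copies of an earlier read."""
    reads = list(reads)
    if not reads:
        return 0.0
    return 1.0 - len(set(reads)) / len(reads)


if __name__ == "__main__":
    # Toy reads, invented purely for illustration.
    toy_reads = ["ACGT", "ACGA", "TCGT", "ACGT"]
    for pos, frac in enumerate(per_position_base_fractions(toy_reads), start=1):
        print(pos, frac)
    print("duplicate fraction:", duplicate_fraction(toy_reads))
```

If this picture is right, trimming would only change what the first positions look like in the plot, and removing duplicates would collapse identical reads to one copy, which is why I am unsure whether either is appropriate for RNA-seq.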
Many thanks! Happy holiday!