BBDuk quality filtering not producing expected result

  • BBDuk quality filtering not producing expected result

    I'm trying to trim and filter low-quality reads from paired-end exome-seq data using BBDuk.

    I used the command:

    ```
    # $files holds the R1 FASTQ paths; the R2 and output names are derived from each one.
    for ea in $files
    do
        R1="$ea"
        R2=$(echo "$R1" | sed "s/R1/R2/")
        /home/shared/programs/bbmap/bbduk.sh -Xmx1g in1="$R1" in2="$R2" \
            out1="$(echo "$R1" | sed 's/\.fastq\.gz/_trimmed_filtered.fastq.gz/')" \
            out2="$(echo "$R2" | sed 's/\.fastq\.gz/_trimmed_filtered.fastq.gz/')" \
            ref=/home/shared/programs/bbmap/resources/adapters.fa \
            t=10 ktrim=r k=23 kmin=11 hdist=1 maq=10 minlen=60 tpe tbo
    done
    ```

    After running FastQC on the output, I see that the R2 files still contain some reads with low quality scores (see the per-sequence quality scores) and the overrepresented sequence "NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN".

    Looking at these reads in the fastq:
    ```
    @HISEQ:525:HMFYNBCXX:1:1101:1380:2167 2:N:0:CAGATC
    NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
    +
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    @HISEQ:525:HMFYNBCXX:1:1101:1276:2219 2:N:0:CAGATC
    NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
    +
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    @HISEQ:525:HMFYNBCXX:1:1101:1238:2328 2:N:0:CAGATC
    NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
    ```
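
    To put a number on how many such reads remain, one option is to count the records whose sequence line is entirely N. This is only a minimal sketch: it assumes uncompressed FASTQ with strict four-line records on standard input, and `count_all_n_reads` is a hypothetical helper name, not part of BBTools.

    ```shell
    # Hypothetical helper: count FASTQ records whose sequence consists only of N.
    # Assumes uncompressed FASTQ on stdin with exactly four lines per record.
    count_all_n_reads() {
        awk 'NR % 4 == 2 { total++; if ($0 ~ /^N+$/) all_n++ }
             END { printf "%d all-N reads out of %d\n", all_n, total }'
    }
    ```

    For a gzipped file it could be run as `gzip -cd some_R2_trimmed_filtered.fastq.gz | count_all_n_reads` (the file name is a placeholder). Assuming Phred+33 encoding, the `!` characters in the quality lines above correspond to a quality score of 0.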

    Shouldn't these reads have been filtered out, given maq=10?


    Any help here would be much appreciated.
    Last edited by reliscu; 10-13-2021, 09:15 AM.
