Unknown sequences in R2 that couldn't be mapped

    I ran Bismark to align a set of BS-Seq data. Some (not all) of the samples had low mapping efficiency (~20%). I then mapped R1 and R2 separately and found that R1 mapped at >70% while R2 mapped at only ~30% (both in non-directional mode). Next I tried bsseeker, which reported a 72.2% mapping rate. Checking the CIGAR strings, I saw that most of the R2 reads carried a fairly long soft-clipped stretch at the end (e.g. 91M60S). An example of these reads is:

    A00437:548:HN5NMDSX3:1:1101:24939:1344 (aligned by bsseeker but not bismark. CIGAR: 60M91S; POS: chr1:159204290)
    AAGTTTTTTATATATAGATATGTGTATAATGATATATAGTAAATGTATATAGAGTTTAGTGTGAGAGTGGGAGGGTTGGGGTGGTTGTTGAGGTTGTATAATGAAGTTATTTTAGGGAGTTATTGGGTGTTTGTTTAGTTATTTATGGGTT

    The last 91 nt of the read were soft-clipped (per the CIGAR), while the first 60 nt mapped to chr1:159204290-159204349 once all Cs in the reference were converted to Ts.
    I checked the FastQC reports for these reads but didn't see adaptor contamination or over-represented sequences in R2, so it's a mystery what these clipped sequences are and why they occur only in R2. Does anyone have any ideas? Thanks.
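
In case it helps anyone look at the same thing, here is a minimal Python sketch of how the clipped part can be separated from the aligned part using only the CIGAR string, so the clipped sequences can be collected and searched on their own. The helper names `split_soft_clips` and `c_to_t` are my own and not part of Bismark or bsseeker, and the sketch assumes `seq` is the SEQ field exactly as stored in the SAM record.

```python
import re

# One CIGAR operation: a length followed by an operation letter (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def split_soft_clips(seq, cigar):
    """Return (left_clip, aligned, right_clip) for a read and its CIGAR."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard-clipped bases (H) are not present in SEQ, so ignore them.
    ops = [(n, op) for n, op in ops if op != "H"]
    left = ops[0][0] if ops and ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    end = len(seq) - right
    return seq[:left], seq[left:end], seq[end:]


def c_to_t(ref):
    """In-silico bisulfite conversion of a reference stretch (every C -> T)."""
    return ref.upper().replace("C", "T")


if __name__ == "__main__":
    # The example read from this post and the CIGAR reported for it.
    read = (
        "AAGTTTTTTATATATAGATATGTGTATAATGATATATAGTAAATGTATATAGAGTTTAGT"
        "GTGAGAGTGGGAGGGTTGGGGTGGTTGTTGAGGTTGTATAATGAAGTTATTTTAGGGAGTT"
        "ATTGGGTGTTTGTTTAGTTATTTATGGGTT"
    )
    left, aligned, right = split_soft_clips(read, "60M91S")
    print("aligned part:     ", aligned)
    print("soft-clipped part:", right)
```

The aligned part can then be compared against `c_to_t(...)` of the matching reference stretch, and the soft-clipped part can be searched separately (for example against the same converted genome or adaptor lists) to see where it comes from.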
