Seqanswers Leaderboard Ad

**fkrueger** · 11-17-2016, 05:22 AM

Hi Naveed,

~58% doesn't sound too bad to me, I just checked a few sample we had run over here (100bp single-end) and we also got mapping efficiencies around 53%. I did notice however that the number of ambiguous alignments was fairly large (around 37%) compared to human or mouse samples, so that is certainly one aspect that can bring mapping efficiencies down. Other factors may include read-length, software version and parameters used, single-end or paired-end libraries, or contamination with other organisms or spike-ins (you could test this out using FastQ screen). If you've got any specific questions just drop me a line. Cheers, Felix

**naveed.jhamat** · 11-17-2016, 07:10 AM

Thanks Felix

Please find here the more detail about my output and guide.
I had about 40 million PE reads in the sample (WGBS) with 126 bp length.
I executed trimgalore with following script:-
trim_galore -o testing --paired sample_L001_R1_001.fastq.gz \
sample_L001_R2_001.fastq.gz
Every thing was clear in FastQC screen except Kmer content and per base sequence content.
I downloaded the latest version of Bovine genome (Bos_taurus8.UMD.3.1.fa) from Ensemble for bismark_genome_preparation.

Then I executed following bismark script and got 57.6% mapping efficiecy:-
bismark --path_to_bowtie /path/ --samtools_path /path/ -p 4 -N 1 --bowtie2 \
--unmapped --basename output /path_to_ref_genome/ \
-1 sample_L001_R1_001_val_1.fq.gz \
-2 sample_L001_R2_001_val_2.fq.gz

Here is the text report generated by bismark that you can see and guide me to proceed further. Thanking you in anticipation. / Naveed Jhamat.

Final Alignment report
======================
Sequence pairs analysed in total: 40502929
Number of paired-end alignments with a unique best hit: 23323660
Mapping efficiency: 57.6%
Sequence pairs with no alignments under any condition: 9511716
Sequence pairs did not map uniquely: 7667553
Sequence pairs which were discarded because genomic sequence could not be extracted: 0

Number of sequence pairs with unique best (first) alignment came from the bowtie output:
CT/GA/CT: 12000211 ((converted) top strand)
GA/CT/CT: 0 (complementary to (converted) top strand)
GA/CT/GA: 0 (complementary to (converted) bottom strand)
CT/GA/GA: 11323449 ((converted) bottom strand)

Number of alignments to (merely theoretical) complementary strands being rejected in total: 0

Final Cytosine Methylation Report
=================================
Total number of C's analysed: 1091274815

Total methylated C's in CpG context: 64749692
Total methylated C's in CHG context: 5352112
Total methylated C's in CHH context: 8195938
Total methylated C's in Unknown context: 2884

Total unmethylated C's in CpG context: 23570501
Total unmethylated C's in CHG context: 277948464
Total unmethylated C's in CHH context: 711458108
Total unmethylated C's in Unknown context: 7633

C methylated in CpG context: 73.3%
C methylated in CHG context: 1.9%
C methylated in CHH context: 1.1%
C methylated in unknown context (CN or CHN): 27.4%

**fkrueger** · 11-17-2016, 08:30 AM

Hi Naveed,

That looks all very sensible to me. If I were you I would a) make sure you were using the latest version (0.16.3), b) maybe lose the parameter -N 1 so that the default of 0 is used (I am still investigating reports that -N 1 sometimes does weird things...). And finally c) seeing that you have specified --unmapped you could try to align just the unmapped Read 1 in single-end mode (with default parameters) to see if a substantial fraction of unmapped pairs would align as single-end file. And yea maybe run FastQ_Screen on the data to see if there are any contaminations. If you wanted you could also email me say 100K reads and I run FastQ_Screen for you over here where we have set it up already. Cheers, Felix

Topics	Statistics	Last Post
Cancer Metastasis: A Deep Dive into Cellular Plasticity by seqadmin Started by seqadmin, 04-11-2024, 12:08 PM	0 responses 18 views 0 likes	Last Post by seqadmin 04-11-2024, 12:08 PM
Proteogenomic Profiles Offer New Clues in Prostate Cancer by seqadmin Started by seqadmin, 04-10-2024, 10:19 PM	0 responses 22 views 0 likes	Last Post by seqadmin 04-10-2024, 10:19 PM
Novel Diagnostic Assay Enhances Ovarian Cancer Detection by seqadmin Started by seqadmin, 04-10-2024, 09:21 AM	0 responses 17 views 0 likes	Last Post by seqadmin 04-10-2024, 09:21 AM
Evolutionary Dynamics of Centromeres: A Comparative Genomic Analysis by seqadmin Started by seqadmin, 04-04-2024, 09:00 AM	0 responses 49 views 0 likes	Last Post by seqadmin 04-04-2024, 09:00 AM

Seqanswers Leaderboard Ad

Announcement

Low mapping efficiency of WGBS data for DNA methylation

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News