  • fkrueger
    replied
    Originally posted by Julien Roux
    Hi Felix
    Actually I was running bismark on a cluster and I requested 32g of memory. I have now run it with 48g and I'm still encountering the same issue...
    Do you have any idea of what else could go wrong? There is no apparent error message printed in the report files
    Thanks
    Julien
    This is somewhat weird... You should not need more than about 12 GB of memory. Maybe your cluster farms one alignment thread out and can't read from it anymore? Could you try to log in to a compute node and run it locally there (or on some other machine, for that matter)?

    If you could send me a few reads (e.g. the first 100K reads or so) via email, I could take a quick look at what is going on.


  • fkrueger
    replied
    @pengchy,
    It will probably work fine for the singleton reads, but this won't guarantee the strandedness of paired-end reads, i.e. read 1 and read 2 would most likely be regarded as coming from OT and CTOT (instead of OT only) or similarly from OB and CTOB (instead of OB only). In terms of the methylation information it should still be fine though.

    To be on the safe side I would probably still treat them separately and then merge the results at the level of the final methylation extractor output.
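
As a rough sketch of what merging at the level of the extractor output could look like, the Python below tallies methylated and unmethylated calls per position from two sets of lines. It assumes each output line carries five tab-separated fields (read ID, methylation state `+` or `-`, chromosome, position, call letter); that layout and the example lines are assumptions for illustration, so check them against your own extractor files first.

```python
from collections import defaultdict


def tally_calls(lines):
    """Count methylated and unmethylated calls per (chromosome, position).

    Assumed line layout (tab-separated):
    read ID, state ('+' or '-'), chromosome, position, call letter.
    """
    counts = defaultdict(lambda: [0, 0])  # [methylated, unmethylated]
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 5:
            continue  # skip header lines or anything malformed
        _read_id, state, chrom, pos, _call = fields
        if state == "+":
            counts[(chrom, int(pos))][0] += 1
        elif state == "-":
            counts[(chrom, int(pos))][1] += 1
    return dict(counts)


# Hypothetical lines standing in for the single-end and paired-end outputs.
single_end = ["readA\t+\tchr1\t100\tZ", "readB\t-\tchr1\t100\tz"]
paired_end = ["readC/1\t+\tchr1\t100\tZ", "readC/2\t+\tchr1\t250\tZ"]

merged = tally_calls(single_end + paired_end)
print(merged)
```

Because the two alignment runs stay separate and only the per-position calls are combined, the strand assignment of the paired-end reads is not affected by the merge.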


  • pengchy
    replied
    Hi fkrueger,

    When I use bismark_methylation_extractor, I need to specify single-end or paired-end mode. I have filtered the original reads using Trimmomatic, which produced paired-end and single-end reads simultaneously. Is it possible to merge the two BAM files using samtools and then run bismark_methylation_extractor with the -p parameter?

    Thank you.
    P


  • Julien Roux
    replied
    Hi Felix
    Actually I was running bismark on a cluster and I requested 32g of memory. I have now run it with 48g and I'm still encountering the same issue...
    Do you have any idea of what else could go wrong? There is no apparent error message printed in the report files
    Thanks
    Julien


  • fkrueger
    replied
    Hi Julien,
    This does indeed look like the first instance of Bowtie (OT) is running out of memory... Does your machine have fairly low RAM or are you running many instances of Bismark concurrently? Could you run the analysis on a more powerful machine and see what happens there?
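
To find out which samples are affected, the strand counts in each report file can be checked automatically. The following is a minimal sketch that assumes the report lines look like the `CT/CT:` and `CT/GA:` lines quoted in this thread; the function names are made up for illustration and are not part of Bismark.

```python
import re

# Matches report lines such as "CT/GA:  19828727  ((converted) bottom strand)".
STRAND_LINE = re.compile(r"^\s*(CT/CT|CT/GA|GA/CT|GA/GA):\s+(\d+)")


def strand_counts(report_text):
    """Return a dict mapping each strand label to its alignment count."""
    counts = {}
    for line in report_text.splitlines():
        match = STRAND_LINE.match(line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def looks_one_sided(counts):
    """True if exactly one of the two original strands has zero alignments."""
    top = counts.get("CT/CT", 0)
    bottom = counts.get("CT/GA", 0)
    return (top == 0) != (bottom == 0)


# Excerpt copied from the report posted in this thread.
report_excerpt = """\
CT/CT:  0       ((converted) top strand)
CT/GA:  19828727        ((converted) bottom strand)
GA/CT:  0       (complementary to (converted) top strand)
GA/GA:  0       (complementary to (converted) bottom strand)
"""

counts = strand_counts(report_excerpt)
print(counts, looks_one_sided(counts))
```

A sample flagged this way is only a candidate for re-running; the check says nothing about why one Bowtie instance produced no alignments.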


  • Julien Roux
    replied
    Thanks Felix for your answer.

    I am now facing another problem: when I run Bismark on some of my samples, reads end up mapped to only the top or the bottom strand... This is not happening for all samples, but it is happening repeatedly on given samples. Is this a memory issue?

    Here is an example of a report file:
    Code:
    Bismark report for: ./C3K1_trimmed.fq.gz (version: v0.7.12)
    Option '--directional' specified: alignments to complementary strands will be ignored (i.e. not performed!)
    Bowtie was run against the bisulfite genome of panTro3_nonrandom+Lambda_prepared_bismark/ with the specified options: -q --phred64-quals -n 1 -k 2 --best --chunkmbs 512
    
    Final Alignment report
    ======================
    Sequences analysed in total:    50619505
    Number of alignments with a unique best hit from the different alignments:      19828727
    Mapping efficiency:     39.2%
    Sequences with no alignments under any condition:       23787863
    Sequences did not map uniquely: 7002915
    Sequences which were discarded because genomic sequence could not be extracted: 0
    
    Number of sequences with unique best (first) alignment came from the bowtie output:
    CT/CT:  0       ((converted) top strand)
    CT/GA:  19828727        ((converted) bottom strand)
    GA/CT:  0       (complementary to (converted) top strand)
    GA/GA:  0       (complementary to (converted) bottom strand)
    
    Number of alignments to (merely theoretical) complementary strands being rejected in total:     0
    
    Final Cytosine Methylation Report
    =================================
    Total number of C's analysed:   185057365
    
    Total methylated C's in CpG context:     6829013
    Total methylated C's in CHG context:    216805
    Total methylated C's in CHH context:    741531
    
    Total C to T conversions in CpG context:        2672385
    Total C to T conversions in CHG context:        41885198
    Total C to T conversions in CHH context:        132712433
    
    C methylated in CpG context:    71.9%
    C methylated in CHG context:    0.5%
    C methylated in CHH context:    0.6%
    Thanks for your help
    Julien
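
The percentages in a report like this can be recomputed from the raw counts, which is a quick way to confirm that the report itself is internally consistent even when the strand distribution looks wrong. The sketch below assumes the methylation level is methylated calls divided by methylated plus unmethylated calls in each context; the numbers are copied from the report above.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


total_reads = 50619505
unique_hits = 19828727
no_alignment = 23787863
not_unique = 7002915

mapping_efficiency = percent(unique_hits, total_reads)  # 39.2

methylated = {"CpG": 6829013, "CHG": 216805, "CHH": 741531}
unmethylated = {"CpG": 2672385, "CHG": 41885198, "CHH": 132712433}

# Methylation level per context: methylated / (methylated + unmethylated).
levels = {
    ctx: percent(methylated[ctx], methylated[ctx] + unmethylated[ctx])
    for ctx in methylated
}

print(mapping_efficiency, levels)
```

The three read categories add up to the total number of sequences analysed, and the six cytosine counts add up to the reported total of 185057365, so the arithmetic in the report is sound.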


  • fkrueger
    replied
    Hi Julien,
    I am afraid you would have to go via the temporary trimmed file because Bismark uses the input file to determine output file names etc. You might want to take a quick look at Trim Galore which also uses Cutadapt for trimming with a set of stringent and useful parameters that are ideally suited for bisulfite applications.
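
A small sketch of the two-step approach with an intermediate file, written as a Python helper that only builds the command lines. The adapter sequence and Bismark options are the ones from the question; the `_trimmed.fastq` naming and the helper name are assumptions for illustration, and cutadapt's `-o` option is used in place of shell redirection.

```python
import shlex
from pathlib import Path


def build_commands(fastq, genome_dir, adapter="ACTGCTCG"):
    """Build the trimming and alignment commands for one FASTQ file.

    The trimmed file is named after the input so that the Bismark output
    files, which are derived from the input file name, stay recognisable.
    """
    fastq = Path(fastq)
    trimmed = fastq.with_name(fastq.stem + "_trimmed.fastq")
    trim_cmd = ["cutadapt", "-b", adapter, "-o", str(trimmed), str(fastq)]
    align_cmd = ["bismark", "-n", "1", "--solexa1.3-quals", "--bam",
                 genome_dir, str(trimmed)]
    return trim_cmd, align_cmd


trim_cmd, align_cmd = build_commands("input_file.fastq", "indexed_genome/")
for cmd in (trim_cmd, align_cmd):
    print(shlex.join(cmd))
```

To actually run the steps, each list could be passed to `subprocess.run(cmd, check=True)` in order, so that alignment only starts once trimming has finished successfully.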


  • Julien Roux
    replied
    Dear Felix,
    I am wondering if Bismark can be fed with an input stream in a pipeline.
    For example would that work?
    Code:
    cutadapt -b ACTGCTCG input_file.fastq | bismark -n 1 --solexa1.3-quals --bam indexed_genome/ -
    Or this?
    Code:
    cutadapt -b ACTGCTCG input_file.fastq | bismark -n 1 --solexa1.3-quals --bam indexed_genome/ stdin
    Or do I have to create an intermediate file?
    Code:
    cutadapt -b ACTGCTCG input_file.fastq > temp_file.fastq
    bismark -n 1 --solexa1.3-quals --bam indexed_genome/ temp_file.fastq
    Thanks for your help
    Julien


  • fkrueger
    replied
    Originally posted by shadow19c
    Hello,
    I have a question about a Bismark option concerning the Bowtie 2 reporting options, --most_valid_alignments.
    If I'm not wrong, the option -M is no longer available in Bowtie 2, so how can you ask the program to keep only valid, unique alignments?
    Bismark determines if there are any other alignments with the same alignment score. If there are the read is not unique and discarded, otherwise it is kept.
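
The rule described here can be illustrated with a few lines of Python. This is a simplified sketch, not Bismark's actual code: it assumes each candidate alignment is a (label, score) pair where a higher score is better, and the function name is hypothetical.

```python
def unique_best(alignments):
    """Return the single best alignment, or None if the best score is tied.

    alignments: list of (label, score) tuples; a higher score is better.
    """
    if not alignments:
        return None
    best_score = max(score for _, score in alignments)
    best = [aln for aln in alignments if aln[1] == best_score]
    # A tie for the best score means the read is not unique and is discarded.
    return best[0] if len(best) == 1 else None


print(unique_best([("OT", -6), ("OB", -12)]))  # kept: one clear best hit
print(unique_best([("OT", -6), ("OB", -6)]))   # discarded: tied scores
```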


  • shadow19c
    replied
    Hello,
    I have a question about a Bismark option concerning the Bowtie 2 reporting options, --most_valid_alignments.
    If I'm not wrong, the option -M is no longer available in Bowtie 2, so how can you ask the program to keep only valid, unique alignments?


  • fkrueger
    replied
    Hi pengchy,

    To 1) It is true that Bismark appends segment numbers to the end of the read ID. This is because Bowtie and Bowtie 2 tend to delete these tags internally while aligning, and to make it more difficult they don't do it in the same way. To properly keep track of which read is doing what I had to make this change (btw, white spaces and tab characters are also replaced by _ in the read ID).
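
The renaming can be pictured with a short Python sketch. It assumes every whitespace character is replaced individually by an underscore and that the mate number is appended as `/1` or `/2`; the function name is hypothetical and the exact rules inside Bismark may differ.

```python
import re


def tag_read_id(read_id, mate):
    """Replace whitespace with '_' and append the mate number to a read ID."""
    if mate not in (1, 2):
        raise ValueError("mate must be 1 or 2")
    cleaned = re.sub(r"\s", "_", read_id)
    return f"{cleaned}/{mate}"


print(tag_read_id("HISEQ700708:127:C1LUKACXX:3:1101:1153:42732/1", 1))
print(tag_read_id("read 7\tlane3", 2))
```

Appending the tag instead of relying on an existing `/1` or `/2` suffix means the mate information survives even if the aligner strips the original suffix.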

    To 2) Bismark does not report singleton alignments for paired-end data but only reports paired alignments. In the Bismark help you can find:
    Code:
    --no-mixed               This option disables Bowtie 2's behavior to try to find alignments for the individual mates if
                             it cannot find a concordant or discordant alignment for a pair. This option is invariable and
                             on by default.
    
    --no-discordant          Normally, Bowtie 2 looks for discordant alignments if it cannot find any concordant alignments.
                             A discordant alignment is an alignment where both mates align uniquely, but that does not
                             satisfy the paired-end constraints (--fr/--rf/--ff, -I, -X). This option disables that behavior
                             and it is on by default.
    If you wanted to look for singleton alignments for reads that do not produce valid paired-end alignments you could always write out unaligned reads and re-align them in single-end mode, but I would probably not advise doing this since comparing SE and PE alignments can have its own pitfalls.

    To 3): In order to determine the sequence context of a read Bismark is extracting 2 extra basepairs at the start or the end of a read (where appropriate). If a read happens to align to the very end of a chromosome, Bismark can't extract 2 additional bp from the chromosomal sequence (because there is no more sequence), throws this warning message and moves on. This happens mostly for the MT, and it is normally fine to just ignore these warnings.
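
Why a read at the very end of a chromosome cannot be processed can be shown with a toy example. The sketch below only models the forward-strand case with 0-based coordinates, where the 2 extra bases are taken downstream of the read; the function name and the `None` return value are illustrative assumptions, not Bismark's implementation.

```python
def extract_with_context(chrom_seq, start, length, extra=2):
    """Return the genomic sequence under a read plus `extra` downstream bases.

    Returns None when the chromosome ends before the extra bases can be
    read, which corresponds to the 'could not be extracted' case.
    """
    end = start + length + extra
    if start < 0 or end > len(chrom_seq):
        return None
    return chrom_seq[start:end]


chrom = "ACGTACGTAC"  # toy chromosome, 10 bp long
print(extract_with_context(chrom, 2, 4))  # enough sequence downstream
print(extract_with_context(chrom, 6, 4))  # read touches the chromosome end
```

Without the 2 extra bases, a cytosine at the last position of the read could not be assigned to a CpG, CHG or CHH context, which is why such reads are skipped and counted.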


  • pengchy
    replied
    Originally posted by fkrueger
    We have just released a new version of Bismark (v0.6.4) to address a few minor issues.

    The changes include:

    - Adjusted the options -u and -s so that only the non-skipped part of the input file will be transcribed and analysed. This allows splitting up very large files into smaller chunks to allow parallel processing, e.g. -s 10000000 -u 20000000 would analyse sequences 10000001 to 20000000. The alignment report will be based on this reduced number of reads analysed.
    - In paired-end mode, the options --unmapped and --ambiguous do now output unaligned or multiply aligned reads, respectively, to their correct output files as intended
    - Sequences in FastA format do now receive Phred score qualities of 40 throughout (ASCII 'I') to prevent the SAM to BAM conversion in SAMtools from failing
    - If a genomic sequence could not be extracted it will now also be counted and reported for use with Bowtie 1
    - Suppressed debugging warning messages that were printed in error for Bowtie2 alignments (single-end mode only)
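
The chunking described for -s and -u in the first item can be sketched as follows. The helper computes the value pairs for consecutive chunks, assuming that `-s N -u M` analyses sequences N+1 to M as in the example in the release note; the function name is hypothetical.

```python
def chunk_options(total_reads, chunk_size):
    """Return (skip, upto) pairs that together cover reads 1..total_reads."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = []
    skip = 0
    while skip < total_reads:
        upto = min(skip + chunk_size, total_reads)
        chunks.append((skip, upto))
        skip = upto
    return chunks


for skip, upto in chunk_options(25_000_000, 10_000_000):
    print(f"-s {skip} -u {upto}")
```

Each pair can then be passed to a separate Bismark run, and the per-chunk reports combined afterwards.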

    Bismark is available here.
    Hi fkrueger,
    In the report file of bismark, one line is:
    Code:
    Sequence pairs which were discarded because genomic sequence could not be extracted:    592
    I can't understand this term. What do you mean by the genomic sequence could not be extracted?
    thank you.


  • pengchy
    replied
    Hi all,

    I have two questions for bismark.
    1. The read IDs in the BAM file are not the same as in the original FASTQ file.
    The original read ids were like:
    Code:
    HISEQ700708:127:C1LUKACXX:3:1101:1153:42732/1
    HISEQ700708:127:C1LUKACXX:3:1101:1153:42732/2
    After bismark alignment, the read ids in the bam file were like:
    Code:
    HISEQ700708:127:C1LUKACXX:3:1101:1153:42732/1/1
    HISEQ700708:127:C1LUKACXX:3:1101:1153:42732/1/2
    2. In the report file, there is no information about how many reads were mapped with only one end of the paired-end data.


  • luuloi
    replied
    Originally posted by luuloi
    Hi Felix,
    Can I run Bismark with Bowtie 1 in multi-threaded mode (-p option) to improve performance? I did it with Bowtie 2, but as you mentioned, Bowtie 2 seems to be slower than Bowtie 1 in your experience. I have been waiting for 4 days and the .bam file is only 21M; it is so slow. BTW, when will you release a multi-threaded Bismark? I am really looking forward to it. I have 14 WGBS samples to run.
    It has been resolved, thanks a lot Felix! If anyone encounters it, please just download the new version of Bismark, v0.7.12.


  • fkrueger
    replied
    If read 1 always aligns to the original strands you can just run it in default mode and do not need to specify --non_directional.
