Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • fkrueger
    replied
    Hi Magdalena,

    Code:
    bismark -1 Non-overlappling_reads_1.fastq -2 Non–overlappling_reads_2.fastq Merged_overlappling_reads.fastq
    does not work in the way you think it will as it would really only do PE alignments of the overlapping reads. So if you wanted to split this up manually (but why?) then you can run the PE on non-overlapping reads first and the SE on the overlapping reads, and then run a PE and SE methylation extraction separately. If you wanted to you could then merge the data again for the bismark2bedGraph step, just feed in all the CpG* files from both PE and SE mapping.

    I am not quite sure however if merging and making things complicated isn’t exactly doing exactly what the methylation extractor is doing anyway: mapping non-overlapping reads and getting the information from both reads, and only getting the information once from overlapping reads because of the --no_verlap option (isn’t this the same as your ‘merghing’ step?)

    Leave a comment:


  • MagdalenaZ
    replied
    Mapping SE and PE reads

    Hi,

    I have PE reads, some 20% of which overlap. I usually overlap these reads before mapping, so that I have the following files:

    Non-overlappling_reads_1.fastq
    Non-overlappling_reads_2.fastq
    Merged_overlappling_reads.fastq

    I would submit those reads to bismark as:

    bismark -1 Non-overlappling_reads_1.fastq -2 Non–overlappling_reads_2.fastq Merged_overlappling_reads.fastq

    ….but then my resulting BAM-file contains both SE and PE reads, so do I use the –p or –s flag on bismark_methylation_extractor ?

    Also, from the results it doesn't really look like the Merged_overlappling_reads.fastq actually gets read.

    Or I could run bismark AND bismark_methylation_extractor twice,; once for SE and one for PE – but then at what point do I merge the results?
    Last option – just not do the overlap, but feed in all PE as is, and use: bismark_methylation_extractor —include_overlap and hope that all will be well. But then I loose the SE reads.

    So many options, but hopefully only one optimal solution!
    Very grateful for your advice!

    Cheers,
    Magdalena

    Leave a comment:


  • fkrueger
    replied
    Thanks for reporting this, I have filed an issue on the Bismark GitHub page and will address it as soon as I find some time. Cheers, Felix

    Leave a comment:


  • biocomputer
    replied
    When I try to run bismark2bedGraph from a directory that doesn't directly contain the input file it fails to find the input file, even though I've specified the path to the input file. For example, if I'm in the directory that contains the input file and use this command it works properly (note that I've taken the output from the methylation extractor and split it by chromosome for batch processing but the same thing happens if I use the original unsplit file):

    bismark2bedGraph -o ./output/chr10.bg ./CpG_chr10.txt
    If I move up a directory and run this command it doesn't work:

    bismark2bedGraph -o ./directory/output/chr10.bg ./directory/CpG_chr10.txt
    The programs ends with:

    Using the following files as Input:
    CpG_chr10.txt

    Writing bedGraph to file: ./directory/output/chr10.bg.gz
    Also writing out a coverage file including counts methylated and unmethylated residues to file: ./directory/output/chr10.bg.gz.bismark.cov.gz

    Couldn't find file 'CpG_chr10.txt': No such file or directory

    Leave a comment:


  • fkrueger
    replied
    Originally posted by Tlexander View Post
    I think there is some bugs in the option of --remove_spaces in bismark_methylation_extractor.


    The error is as the following:



    Changed directory to /media/LTS_33T/SG_LTS33T/monod/haib/methyfreq/
    Now replacing whitespaces in the sequence ID field of the Bismark methylation extractor output /media/LTS_33T/SG_LTS33T/monod/haib/methyfreq/CpG_context_ENCFF000LWP_trimmed.fq_bismark_bt2.txt prior to bedGraph conversion

    Couldn't write to file /media/LTS_33T/SG_LTS33T/monod/haib/methyfreq/CpG_context_ENCFF000LWP_trimmed.fq_bismark_bt2.txt.spaces_removed.txt: No such file or directory
    Finished BedGraph conversion ...
    Thanks for reporting this, could you please also post the exact command you used when you called the methylation extractor so I can reproduce it more easily?

    Leave a comment:


  • Tlexander
    replied
    Bismark Bug v0.14.4 --remove_spaces in bismark_methylation_extractor.

    I think there is some bugs in the option of --remove_spaces in bismark_methylation_extractor.


    The error is as the following:



    Changed directory to /media/LTS_33T/SG_LTS33T/monod/haib/methyfreq/
    Now replacing whitespaces in the sequence ID field of the Bismark methylation extractor output /media/LTS_33T/SG_LTS33T/monod/haib/methyfreq/CpG_context_ENCFF000LWP_trimmed.fq_bismark_bt2.txt prior to bedGraph conversion

    Couldn't write to file /media/LTS_33T/SG_LTS33T/monod/haib/methyfreq/CpG_context_ENCFF000LWP_trimmed.fq_bismark_bt2.txt.spaces_removed.txt: No such file or directory
    Finished BedGraph conversion ...

    Leave a comment:


  • bmartinez
    replied
    problems with bismark2bedGraph and coverage2cytosine to get methylations extracted

    Hi Felix and everyone,

    Thanks a lot for your help, Felix, the problem is solved.
    Otherwise, I have and additional issue related to the sorting of the CpG_context and Non_CpG_context files in my operating system. I am reporting it in case someone else can be affected and to ask for your opinion.

    I get an error when sorting these files (this is done within the script bismark2bedgraph). The line that does the sorting in the script is the following one:

    open $ifh, "sort -S $sort_size -T $sort_dir -k3,3V -k4,4n $in |" or die "Input file could not be sorted. $!\n";

    I have solved the issue sorting just with "3,3". Otherwise, I am still trying to confirm that this is not affecting the final results. If anyone can give a hint on this, I would greatly appreciate.

    Begoña

    Leave a comment:


  • fkrueger
    replied
    It appears that GZIP-compressed input files were streamed directly into the Unix sort command (e.g. when using the option --scaffolds/--gazillion), but sort cannot read compressed files and thus it would produce an empty output. I have opened an issue of Github for that (https://github.com/FelixKrueger/Bismark/issues/9) and fixed the way GZIP compressed files are streamed to sort and it seems to work fine on my end. The latest version can be cloned straight from Github.

    Leave a comment:


  • fkrueger
    replied
    Hi Begona,

    It looks like the bismark2bedGraph step is somehow failing, can you check the error logs (or what appears on screen) to see what is going wrong exactly?

    The CpG report simply puts the coverage file into genomic context, so if the coverage file is empty then the CpG report will show 0 0 only as well.

    I'm happy to look at this in more detail, you can also send me email with the error logs. Best, Felix

    Leave a comment:


  • bmartinez
    replied
    problems with bismark2bedGraph and coverage2cytosine to get methylations extracted

    Hi everyone,

    I am analysing WGBS data with Bismark v0.14.5. I have trimmed, aligned the data with Bowtie and deduplicated with no issues. However, I am having problems to extract the methylations. My genome is in 47,100 scaffolds. The command I am using is:

    bismark_methylation_extractor -p --comprehensive --merge_non_CpG --samtools_path /opt/samtools-0.1.19 --genome_folder /path_to_genome/Bisulfite_Genome_BowtieOne --buffer_size 10G --report --bedGraph --cytosine_report --scaffolds --gzip --multicore 3 -o /path/file.fastq.gz_bismark_pe.deduplicated.sam

    I get proper CpG_context_file.fastq.gz_bismark_pe.deduplicated.txt.gz and Non_CpG_context_file.fastq.gz_bismark_pe.deduplicated.txt.gz files, see the first lines for an example:

    Bismark methylation extractor version v0.14.4
    HWI-ST539:249:C7BDRACXX:6:1101:1460:1903_1:N:0:GCCAAT + scaffold.s31344 999331 Z
    HWI-ST539:249:C7BDRACXX:6:1101:1460:1903_1:N:0:GCCAAT - scaffold.s31344 999181 z
    HWI-ST539:249:C7BDRACXX:6:1101:1586:1973_1:N:0:GCCAAT - scaffold.s10570 115561 z
    HWI-ST539:249:C7BDRACXX:6:1101:1586:1973_1:N:0:GCCAAT + scaffold.s10570 115578 Z
    HWI-ST539:249:C7BDRACXX:6:1101:1586:1973_1:N:0:GCCAAT + scaffold.s10570 115590 Z
    HWI-ST539:249:C7BDRACXX:6:1101:1586:1973_1:N:0:GCCAAT + scaffold.s10570 115616 Z
    HWI-ST539:249:C7BDRACXX:6:1101:1586:1973_1:N:0:GCCAAT + scaffold.s10570 115624 Z
    HWI-ST539:249:C7BDRACXX:6:1101:1586:1973_1:N:0:GCCAAT + scaffold.s10570 115639 Z
    HWI-ST539:249:C7BDRACXX:6:1101:1586:1973_1:N:0:GCCAAT + scaffold.s10570 115632 Z

    However, it does not convert properly into bed files (I get and empty file) and the cytosine reports I get is like this, with no methylation at all (columns 3 and 4 are all 0’s):

    scaffold.s00001 45 + 0 0 CG CGC
    scaffold.s00001 46 - 0 0 CG CGT
    scaffold.s00001 49 + 0 0 CG CGG
    scaffold.s00001 50 - 0 0 CG CGT
    scaffold.s00001 1095 + 0 0 CG CGT
    scaffold.s00001 1096 - 0 0 CG CGG
    scaffold.s00001 1481 + 0 0 CG CGC
    scaffold.s00001 1482 - 0 0 CG CGG
    scaffold.s00001 1560 + 0 0 CG CGC
    scaffold.s00001 1561 - 0 0 CG CGG



    I have tried to do things step by step, but I get the same result. I have been working on this for more than a week now and I do not find were is the error. Could someone help me with this?

    Thanks a lot in advance!

    Begoña

    Leave a comment:


  • fkrueger
    replied
    Originally posted by daanum View Post
    Hi,

    I am unable to run the bismark_genome_preparation step yet.
    I get an error "Command not found'.
    Any idea? I am trying since yesterday, not sure what am i doing wrong?

    I admire your perseverance but you might want to consider doing a basic Linux operation tutorial, I think you might benefit.

    Here you've got a couple of options:
    1) either you move to the folder containing the Bismark installation and then run ./bismark_genome_preparation (./ prepends the path to the current genome)
    2) you can type /path/to/Bismark/bismark_genome_preparation which should work from anywhere.

    Leave a comment:


  • daanum
    replied
    Hi,

    I am unable to run the bismark_genome_preparation step yet.
    I get an error "Command not found'.
    Any idea? I am trying since yesterday, not sure what am i doing wrong?

    Leave a comment:


  • fkrueger
    replied
    Bismark just needs to be extracted as is outlined step by step in the manual (http://www.bioinformatics.babraham.a...User_Guide.pdf). I believe Bowtie 2 also only needs to be unzipped, then either you place the bowtie2 executable in the PATH (just google how to do this), or you specify the path with --path_to_bowtie in Bismark.

    All other steps including the genome preparation (
    Code:
    bismark_genome_preparation [options] <path_to_genome_folder>
    ) are explained in detail in the manual, this protocol, or this methylation analysis course. Good luck, Felix

    Leave a comment:


  • daanum
    replied
    genome preparation

    hi,
    I am trying to run bismark genome preparation but unable to do so.
    I have bismark v 14.5 unzipped folder on server and have bowtie-2.2.2.6 version unzipped folder and genome files for human grch38- all these three folders in one folder. Do i need to run any installation step for bismark/bowtie before i run genome preparation ?

    I am new to methylation analysis so will be great if you could please help.

    thanks in advance.

    Leave a comment:


  • fkrueger
    replied
    Oh it seems I need to update the manual because we very recently changed the default aligner to Bowtie 2, and the command in the manual still refers to bowtie1 (if you use --bowtie1 you can use the command as in the manual). I'll have this changed soon, thanks for spotting this.

    If you want to run the test dataset just leave out all options and try using the defaults. Best, Felix

    Leave a comment:

Latest Articles

Collapse

  • seqadmin
    Essential Discoveries and Tools in Epitranscriptomics
    by seqadmin




    The field of epigenetics has traditionally concentrated more on DNA and how changes like methylation and phosphorylation of histones impact gene expression and regulation. However, our increased understanding of RNA modifications and their importance in cellular processes has led to a rise in epitranscriptomics research. “Epitranscriptomics brings together the concepts of epigenetics and gene expression,” explained Adrien Leger, PhD, Principal Research Scientist...
    04-22-2024, 07:01 AM
  • seqadmin
    Current Approaches to Protein Sequencing
    by seqadmin


    Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
    04-04-2024, 04:25 PM

ad_right_rmr

Collapse

News

Collapse

Topics Statistics Last Post
Started by seqadmin, Today, 08:06 AM
0 responses
11 views
0 likes
Last Post seqadmin  
Started by seqadmin, 04-30-2024, 12:17 PM
0 responses
14 views
0 likes
Last Post seqadmin  
Started by seqadmin, 04-29-2024, 10:49 AM
0 responses
19 views
0 likes
Last Post seqadmin  
Started by seqadmin, 04-25-2024, 11:49 AM
0 responses
26 views
0 likes
Last Post seqadmin  
Working...
X