  • seq_GA
    Senior Member
    • Feb 2009
    • 124

    PeakSeq

    Has anyone used the PeakSeq tool for ChIP-seq experiments?
    How do I run the parallel version of compile.py? I get the following error when I run it, without PBS mode, inside the directory where all the *.fa files for the hg18 build are saved:

    python ../ParallelCompile.py 36
    Traceback (most recent call last):
      File "../ParallelCompile.py", line 2, in ?
        import sys, glob, re, subprocess, os, nws.sleigh, time
    ImportError: No module named nws.sleigh

    The Python version is 2.4.3.
    Last edited by seq_GA; 10-04-2009, 11:38 PM.
  • dawe
    Senior Member
    • Apr 2009
    • 258

    #2
    Originally posted by seq_GA
    Has anyone used the PeakSeq tool for ChIP-seq experiments?
    How do I run the parallel version of compile.py? I get the following error when I run it, without PBS mode, inside the directory where all the *.fa files for the hg18 build are saved:

    The Python version is 2.4.3.

    I have never used PeakSeq, but it seems you are missing a standard Python distribution. Also, check the minimum Python version, as the subprocess module should be part of Python 2.5+.

    d


    • seq_GA
      Senior Member
      • Feb 2009
      • 124

      #3
      Hi D, Thx for your response.


      Is there a HOWTO document for this tool? I have downloaded the Mappability Map code and the ChIP-seq scoring code (Perl) from http://www.gersteinlab.org/proj/PeakSeq/.


      Regards


      • maubp
        Peter (Biopython etc)
        • Jul 2009
        • 1544

        #4
        Originally posted by dawe
        Also, check the minimum Python version, as the subprocess module should be part of Python 2.5+.
        The subprocess module was included in Python 2.4 onwards, but I agree, check the version of Python they expect you to have.

        The nws.sleigh import error means you are lacking a 3rd party Python library, in this case NetWorkSpaces for Python:


        Double check you have installed all the documented dependencies...
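
A quick way to see which of the imports from that traceback are unavailable is to try each one in turn. The sketch below is only an illustration: the function name `find_missing` is made up here, and it assumes a current Python with `importlib.import_module` (not the 2.4.3 interpreter from the original post). The module list is copied from the import line in the traceback.

```python
import importlib

# Module names copied from the import line shown in the traceback.
REQUIRED = ["sys", "glob", "re", "subprocess", "os", "nws.sleigh", "time"]


def find_missing(module_names):
    """Return the names in module_names that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Covers both a missing top-level package and a missing submodule.
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in find_missing(REQUIRED):
        print("missing module:", name)
```

Anything reported as missing that is not part of the standard library (here, presumably `nws.sleigh`) has to be installed separately before ParallelCompile.py can run.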


        • Bioinfo
          Member
          • Jul 2010
          • 15

          #5
          Hi all,
          Has anyone used PeakSeq for ChIP-seq data analysis? Can anyone tell me which ELAND alignment file (export or sorted) PeakSeq accepts?
          Does anyone know how to convert ELAND alignment files (export/sorted) to the s*_eland_extended.txt or s*_eland_results.txt format?
          Any help would be appreciated.


          • ldong
            Member
            • May 2010
            • 15

            #6
            Hi, Bioinfo,
            I use bowtie for alignment. You can use samtools to convert different formats of alignments. Here is what I did. Best, ldong


            • dnusol
              Senior Member
              • Jul 2009
              • 136

              #7
              Hi ldong,

              I am also having problems running PeakSeq. I checked your website but still have some questions.
              I downloaded the mappability.txt file for mouse. I also tried to create the mappability file myself, but the "compile.py" script, run in the directory containing the .fa files, just prints the following usage message:

              usage compile.py <merlen>
              What is this merlen parameter?

              On the website you mention, you use a config.txt file in step_3. Did you create it yourself?
              Also, is the create_signal_map_new1.pl script included in the PeakSeq installation, or is it one of the modifications you mention at the beginning of the webpage?

              Thanks for your help


              • ldong
                Member
                • May 2010
                • 15

                #8
                Hi, dnusol,
                merlen is the length of your reads. compile.py uses this number to extract pieces of sequence of that length and map them back to the genome, to see whether the same sequence occurs elsewhere. So if your read length is 36 bp, you should run:
                compile.py 36

                Yes, I made up the config.txt file to make it easier to run the software. All the Perl scripts with new.pl were modified from the original ones. Hope this helps. Ldong
                Last edited by ldong; 11-02-2010, 07:55 AM.
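
To make the role of merlen concrete, here is a toy sketch of the idea described above: collect every subsequence of length merlen and mark the start positions whose sequence occurs exactly once. It is not the code of compile.py. The function name `unique_kmer_starts` and the toy sequences are invented for illustration, and the sketch only looks at exact matches on the forward strand, which may differ from what the real script does.

```python
from collections import Counter


def unique_kmer_starts(chromosomes, merlen):
    """Map each chromosome name to the 0-based start positions whose
    merlen-long subsequence occurs exactly once across all chromosomes."""
    counts = Counter()
    for seq in chromosomes.values():
        for i in range(len(seq) - merlen + 1):
            counts[seq[i:i + merlen]] += 1

    unique = {}
    for name, seq in chromosomes.items():
        unique[name] = [
            i for i in range(len(seq) - merlen + 1)
            if counts[seq[i:i + merlen]] == 1
        ]
    return unique


# Toy "genome" (hypothetical data, two tiny chromosomes).
toy = {"chrA": "ACGTACGTTT", "chrB": "GGACGTAC"}
print(unique_kmer_starts(toy, 4))
print(unique_kmer_starts(toy, 6))
```

On this toy input, the set of unique positions changes when merlen goes from 4 to 6, which is why a mappability file built for one read length cannot simply be reused for another.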


                • dnusol
                  Senior Member
                  • Jul 2009
                  • 136

                  #9
                  Thanks Ldong for your quick answer. Just two more questions (I hope!)
                  Do you know what the SGR files needed by score_hits_PolII.pl are?
                  And is it possible to use Bowtie-aligned reads without modifying the script?


                  • ldong
                    Member
                    • May 2010
                    • 15

                    #10
                    OK. The sgr files created by creat_signal.pl are used by score_hits.pl to look for potential peaks. The original creat_signal.pl reads ELAND output, so you need to modify it to read Bowtie output. It is very easy; just change:
                    while (<IN>) {
                        chomp;

                        # split one tab-separated ELAND line into its fields
                        my ($t1, $seq, $map, $t3, $t4, $t5, $chrt, $pos, $str, @rest) = split /\t/, $_;
                        my $read_length = length $seq;

                        if ($str eq "F") {
                            # forward read: +1 at $pos, -1 at $pos + $L
                            $data{$pos} += 1;
                            $data{$pos+$L} += -1;
                        }
                        elsif ($str eq "R") {
                            # reverse read: the interval ends at the right end of the read
                            my $start = $pos + $read_length - $L;
                            $start = 1 if ($start < 1);
                            my $stop = $pos + $read_length;
                            $data{$start} += 1;
                            $data{$stop} += -1;
                        }
                        else {
                            print "PROBLEM\n";
                        }
                    }

                    close IN;

                    to:

                    while (<IN>) {
                        chomp;
                        # Bowtie default output: read name, strand, reference, offset, sequence, ...
                        my ($t1, $str, $t3, $pos, $seq, @rest) = split /\t/, $_;
                        my $read_length = length $seq;

                        if ($str eq "+") {
                            # forward read: +1 at $pos, -1 at $pos + $L
                            $data{$pos} += 1;
                            $data{$pos+$L} += -1;
                        }
                        elsif ($str eq "-") {
                            # reverse read: the interval ends at the right end of the read
                            my $start = $pos + $read_length - $L;
                            $start = 1 if ($start < 1);
                            my $stop = $pos + $read_length;
                            $data{$start} += 1;
                            $data{$stop} += -1;
                        }
                        else {
                            print "PROBLEM\n";
                        }
                    }

                    close IN;
                    Let me know if it is not clear.
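
For readers wondering what the +1/-1 bookkeeping in %data is for, the sketch below shows the same idea in Python: each read adds +1 where its interval starts and -1 where it stops, and a running sum over the sorted positions gives the signal height. This is an illustration only. The function name `signal_from_reads` and the toy reads are made up, `frag_len` stands in for `$L` (presumably the length each read is extended to), and how the real script writes the sgr file is not shown here.

```python
def signal_from_reads(reads, frag_len):
    """reads: iterable of (strand, pos, read_length) tuples.
    Returns a list of (position, height) pairs at the positions
    where the signal changes, in increasing position order."""
    delta = {}
    for strand, pos, read_length in reads:
        if strand == "+":
            start, stop = pos, pos + frag_len
        elif strand == "-":
            # Same clamping as the Perl snippet: never start before position 1.
            start = max(1, pos + read_length - frag_len)
            stop = pos + read_length
        else:
            raise ValueError("unexpected strand: %r" % (strand,))
        delta[start] = delta.get(start, 0) + 1
        delta[stop] = delta.get(stop, 0) - 1

    signal = []
    height = 0
    for position in sorted(delta):
        height += delta[position]  # running sum of the +1/-1 marks
        signal.append((position, height))
    return signal


# Hypothetical reads: two on the forward strand, one on the reverse strand.
toy_reads = [("+", 10, 4), ("+", 12, 4), ("-", 14, 4)]
print(signal_from_reads(toy_reads, 6))
```

Storing only the change points keeps memory use proportional to the number of reads rather than to the chromosome length.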


                    • AL_B
                      Junior Member
                      • Aug 2013
                      • 5

                      #11
                      PeakSeq - need some help

                      Hi,
                      I am not from the field of bioinformatics, yet I need to learn how to run PeakSeq,
                      so I have some basic questions.
                      I went to this website

                      and I am trying to use PeakSeq version 1.1.
                      I don't understand the relation between the .fa files and the mappability map text file.
                      How do I get the .fa files, and what do I do with them? According to the README file, the program
                      takes a mappability map text file as an argument, and there is no mention of .fa files.
                      So what should I do?

                      Thanks (I hope)


                      • ldong
                        Member
                        • May 2010
                        • 15

                        #12
                        Hi, AL_B,
                        My understanding is that the script compile.py reads all the .fa files, extracts many short sequences of length 'merlen', and then maps these short sequences back to the .fa files, so that the software knows whether a given sequence shows up in other places. The results are saved in the file mappability.txt. So you need to generate a new mappability file for each read length. Hope this helps.


                        • AL_B
                          Junior Member
                          • Aug 2013
                          • 5

                          #13
                          PeakSeq - need some help

                          Thank you so much for your fast reply.
                          I have another question.
                          Do you know what the message "for reading.n chr_id_list.txt" means?
                          It is part of the output I get when I run the second part of PeakSeq:
                          ./PeakSeq -peak_select config.dat


                          • priya
                            Member
                            • Apr 2013
                            • 57

                            #14
                            Originally posted by ldong
                            Hi, AL_B,
                            My understanding is that the script compile.py reads all the .fa files, extracts many short sequences of length 'merlen', and then maps these short sequences back to the .fa files, so that the software knows whether a given sequence shows up in other places. The results are saved in the file mappability.txt. So you need to generate a new mappability file for each read length. Hope this helps.

                            Hi,
                            I have short-read data with various read lengths, e.g. 43 bp, 51 bp and 52 bp. Do I need to generate a new mappability file for each read length, or can I use 50 bp as the read length for everything?
