Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • captainentropy
    replied
    @rlh60 (and for anyone else interested), FWIW, I'm trying out your scripts (starting with #2) and when I run it I get this error (YMMV):

    Using a hash as a reference is deprecated at ./bedToBedIntermediate.pl line 48.
    Line 48 is:
    for $k2 (sort {$a<=>$b} keys %{%steps->{$k1}}){
    Correcting that to:
    for $k2 (sort {$a<=>$b} keys %{$steps{$k1}}){
    ...and it now works fine. That hash was causing the problem for me.

    Leave a comment:


  • frewise
    replied
    Originally posted by rlh60 View Post
    I use a couple of perl scripts to first convert sam files (produced in bowtie) to bed, then to convert the bed files to wig (in 2 steps):

    1) Sam to bed:
    ******************

    #!/usr/bin/perl -w
    use strict;

    my $input = $ARGV[0];

    # open input file
    open IN, "$input" || die "can't open .sam file";
    open OUT, ">$ARGV[1]";

    print "Do you want to keep strand information? (y/n):";
    my $keep_strand = <STDIN>;
    chomp $keep_strand;

    my $lines = 0;

    if ($keep_strand eq "n")
    {

    while(<IN>){
    chomp $_;
    $lines++;

    my @bits = split /\t/;
    my $chr = $bits[2];
    #print "chr = $chr\n";
    my $start = $bits[3];
    #print "start = $start\n";
    my $strand = $bits[1];
    #print "strand = $strand\n";
    my ($start_new, $end_new);

    # correct for alignments off the chromosome ends
    if( $start =~ m/^\-/ )
    {
    $start = 0;
    }

    if($strand eq "-") #reverse
    {
    $start_new = $start - 200;
    if ($start_new =~ m/^\-/)
    {
    $start_new = 0;
    }
    $end_new = $start;
    }
    else #forward
    {
    $start_new = $start;
    $end_new = $start + 200;
    }

    print OUT "$chr\t$start_new\t$end_new\n";
    # give some feedback
    print STDERR "$lines processed\n" if(!($lines % 100000));
    }



    }
    elsif ($keep_strand eq "y")
    {
    while(<IN>){
    chomp $_;
    $lines++;

    my @bits = split /\t/;
    my $chr = $bits[2];
    #print "chr = $chr\n";
    my $start = $bits[3];
    #print "start = $start\n";
    my $strand = $bits[1];
    #print "strand = $strand\n";
    my ($start_new, $end_new);

    # correct for alignments off the chromosome ends
    if( $start =~ m/^\-/ )
    {
    $start = 0;
    }

    if($strand eq "-") #reverse
    {
    $start_new = $start - 200;
    if ($start_new =~ m/^\-/)
    {
    $start_new = 0;
    }
    $end_new = $start;
    }
    else #forward
    {
    $start_new = $start;
    $end_new = $start + 200;
    }

    print OUT "$chr\t$start_new\t$end_new\t$strand\n";
    # give some feedback
    print STDERR "$lines processed\n" if(!($lines % 100000));
    }
    }


    close IN;
    close OUT;


    2) bed to intermediate file (coverage) for wig:
    *******************************************************

    #!/usr/bin/perl
    # Program to calculate steps across genome prior to creating a density plot.


    use strict;
    my ( $j, $d, $e, $f ) = ( 0, 0, 0, 0 ); my $chr; my %steps = (); my $output = $ARGV[1]; my $user_input = $ARGV[2];
    open OUT, ">$output";

    if ( $#ARGV != 2 ) { print "\nPlease provide: <Input> <Output> <Species(m/h)>\n"; exit; }
    #print "Human (h) or Mouse (m) ?\n\n";
    #my $user_input = <STDIN>;
    chomp $user_input;
    if ( $user_input eq "h" ) { $j = 24; $d = 23; $e = 24; $f = 22; }
    elsif ( $user_input eq "m" ) { $j = 21; $d = 20; $e = 21; $f = 19; }
    else { print "Error.\n\n"; exit; }


    for ( my $i = 1; $i <= $j; $i++ ) {
    %steps = ();
    if ( $i <= $f ) {$chr = "chr$i";}
    elsif ( $i == $d ) {$chr = "chrX";}
    elsif ( $i == $e ) {$chr = "chrY";}

    open FILE, "<$ARGV[0]"; my $j = 0;
    while ( <FILE> ) {
    my $line = $_;

    chomp $line;

    if ($line =~ /^([^\t]+)\t([^\t]+)\t([^\t]+)/){
    #if ($line =~ /^(chr\S+)\t(\d+)\t(\d+)/){

    my ( $chromo, $start, $end ) = ( $1, $2, $3 );
    #print "\$chromo, \$start, \$end = $chromo, $start, $end\n";
    if ( $start < 0 ) { $start = 0; }
    if ( $chromo eq $chr ) {
    $j++;
    if ( $j == 1 ) { print "Checking $chromo\n"; }
    $steps{$chr}{$start}{'sum'} = $steps{$chr}{$start}{'sum'} + 1;

    $steps{$chr}{$end}{'sum'} = $steps{$chr}{$end}{'sum'} - 1;
    }}}
    close ( FILE );
    my $l = 0; my $k1 = 0; my $k2 = 0;

    for $k1 (keys %steps){

    for $k2 (sort {$a<=>$b} keys %{%steps->{$k1}}){
    $l++;
    if ( $l == 1 ) { print "Printing lines for $chr.\n"; }

    print OUT "$k1\t$k2\t$steps{$k1}{$k2}{'sum'}\n";

    }}
    print "Chr$i Processed.\n";
    }
    close OUT;


    3) Intermediate 'steps' file to wig:
    *****************************************

    #!/usr/bin/perl -w
    # Program to take the steps information and create a density wig file.


    use strict;

    my $sum = 0;

    my $oldsum = 0;

    my $oldpos = 0;


    if ( $#ARGV != 3 ) { print "Please provide: a) Input file\nb) Output file\nc)Track Name\nd) Colour pref.\n"; exit; }



    open FILE, "$ARGV[0]";#removed <
    open OUT, ">$ARGV[1]";#changed from >>

    my $header ="track type=bedGraph name=\"".$ARGV[2]."\" visibility=full color=".$ARGV[3]." priority=20\n";



    print OUT $header;

    while (defined (my $line = <FILE>)) {

    chomp $line;

    if ( $line =~ /^([^\t]+)\t([^\t]+)\t([^\t]+)/ ){

    if ( $oldpos ne 0 && $oldsum ne 0){

    print OUT "$1\t$oldpos\t$2\t$oldsum\n";

    }

    $oldpos = $2;

    $oldsum = $oldsum + $3;

    }

    }

    close FILE;
    close OUT;


    ****************************************

    Then use the UCSC wigToBigWig binary to convert to bigwig!

    Beccy
    Hi rlh60, could you please explain why use minus to the stop point :$steps{$chr}{$end}{'sum'} = $steps{$chr}{$end}{'sum'} - 1;
    Thanks!

    Leave a comment:


  • biowiz
    replied
    Originally posted by rlh60 View Post
    I use a couple of perl scripts to first convert sam files (produced in bowtie) to bed, then to convert the bed files to wig (in 2 steps):

    1) Sam to bed:
    ******************

    #!/usr/bin/perl -w
    use strict;

    my $input = $ARGV[0];

    # open input file
    open IN, "$input" || die "can't open .sam file";
    open OUT, ">$ARGV[1]";

    print "Do you want to keep strand information? (y/n):";
    my $keep_strand = <STDIN>;
    chomp $keep_strand;

    my $lines = 0;

    if ($keep_strand eq "n")
    {

    while(<IN>){
    chomp $_;
    $lines++;

    my @bits = split /\t/;
    my $chr = $bits[2];
    #print "chr = $chr\n";
    my $start = $bits[3];
    #print "start = $start\n";
    my $strand = $bits[1];
    #print "strand = $strand\n";
    my ($start_new, $end_new);

    # correct for alignments off the chromosome ends
    if( $start =~ m/^\-/ )
    {
    $start = 0;
    }

    if($strand eq "-") #reverse
    {
    $start_new = $start - 200;
    if ($start_new =~ m/^\-/)
    {
    $start_new = 0;
    }
    $end_new = $start;
    }
    else #forward
    {
    $start_new = $start;
    $end_new = $start + 200;
    }

    print OUT "$chr\t$start_new\t$end_new\n";
    # give some feedback
    print STDERR "$lines processed\n" if(!($lines % 100000));
    }



    }
    elsif ($keep_strand eq "y")
    {
    while(<IN>){
    chomp $_;
    $lines++;

    my @bits = split /\t/;
    my $chr = $bits[2];
    #print "chr = $chr\n";
    my $start = $bits[3];
    #print "start = $start\n";
    my $strand = $bits[1];
    #print "strand = $strand\n";
    my ($start_new, $end_new);

    # correct for alignments off the chromosome ends
    if( $start =~ m/^\-/ )
    {
    $start = 0;
    }

    if($strand eq "-") #reverse
    {
    $start_new = $start - 200;
    if ($start_new =~ m/^\-/)
    {
    $start_new = 0;
    }
    $end_new = $start;
    }
    else #forward
    {
    $start_new = $start;
    $end_new = $start + 200;
    }

    print OUT "$chr\t$start_new\t$end_new\t$strand\n";
    # give some feedback
    print STDERR "$lines processed\n" if(!($lines % 100000));
    }
    }


    close IN;
    close OUT;


    2) bed to intermediate file (coverage) for wig:
    *******************************************************

    #!/usr/bin/perl
    # Program to calculate steps across genome prior to creating a density plot.


    use strict;
    my ( $j, $d, $e, $f ) = ( 0, 0, 0, 0 ); my $chr; my %steps = (); my $output = $ARGV[1]; my $user_input = $ARGV[2];
    open OUT, ">$output";

    if ( $#ARGV != 2 ) { print "\nPlease provide: <Input> <Output> <Species(m/h)>\n"; exit; }
    #print "Human (h) or Mouse (m) ?\n\n";
    #my $user_input = <STDIN>;
    chomp $user_input;
    if ( $user_input eq "h" ) { $j = 24; $d = 23; $e = 24; $f = 22; }
    elsif ( $user_input eq "m" ) { $j = 21; $d = 20; $e = 21; $f = 19; }
    else { print "Error.\n\n"; exit; }


    for ( my $i = 1; $i <= $j; $i++ ) {
    %steps = ();
    if ( $i <= $f ) {$chr = "chr$i";}
    elsif ( $i == $d ) {$chr = "chrX";}
    elsif ( $i == $e ) {$chr = "chrY";}

    open FILE, "<$ARGV[0]"; my $j = 0;
    while ( <FILE> ) {
    my $line = $_;

    chomp $line;

    if ($line =~ /^([^\t]+)\t([^\t]+)\t([^\t]+)/){
    #if ($line =~ /^(chr\S+)\t(\d+)\t(\d+)/){

    my ( $chromo, $start, $end ) = ( $1, $2, $3 );
    #print "\$chromo, \$start, \$end = $chromo, $start, $end\n";
    if ( $start < 0 ) { $start = 0; }
    if ( $chromo eq $chr ) {
    $j++;
    if ( $j == 1 ) { print "Checking $chromo\n"; }
    $steps{$chr}{$start}{'sum'} = $steps{$chr}{$start}{'sum'} + 1;

    $steps{$chr}{$end}{'sum'} = $steps{$chr}{$end}{'sum'} - 1;
    }}}
    close ( FILE );
    my $l = 0; my $k1 = 0; my $k2 = 0;

    for $k1 (keys %steps){

    for $k2 (sort {$a<=>$b} keys %{%steps->{$k1}}){
    $l++;
    if ( $l == 1 ) { print "Printing lines for $chr.\n"; }

    print OUT "$k1\t$k2\t$steps{$k1}{$k2}{'sum'}\n";

    }}
    print "Chr$i Processed.\n";
    }
    close OUT;


    3) Intermediate 'steps' file to wig:
    *****************************************

    #!/usr/bin/perl -w
    # Program to take the steps information and create a density wig file.


    use strict;

    my $sum = 0;

    my $oldsum = 0;

    my $oldpos = 0;


    if ( $#ARGV != 3 ) { print "Please provide: a) Input file\nb) Output file\nc)Track Name\nd) Colour pref.\n"; exit; }



    open FILE, "$ARGV[0]";#removed <
    open OUT, ">$ARGV[1]";#changed from >>

    my $header ="track type=bedGraph name=\"".$ARGV[2]."\" visibility=full color=".$ARGV[3]." priority=20\n";



    print OUT $header;

    while (defined (my $line = <FILE>)) {

    chomp $line;

    if ( $line =~ /^([^\t]+)\t([^\t]+)\t([^\t]+)/ ){

    if ( $oldpos ne 0 && $oldsum ne 0){

    print OUT "$1\t$oldpos\t$2\t$oldsum\n";

    }

    $oldpos = $2;

    $oldsum = $oldsum + $3;

    }

    }

    close FILE;
    close OUT;


    ****************************************

    Then use the UCSC wigToBigWig binary to convert to bigwig!

    Beccy
    hey rlh60,,, could you help me out with implementing the perl script.. im afraid im very new to programming... i saved the code in a .pl file and am implementing it though command prompt in windows,,, how do i get the script to take my input file,,,

    Leave a comment:


  • biznatch
    replied
    I'm also not using BEDTools, and thinking I should look in to it...

    But in the meantime, I've been using Perl to convert bowtie output (.map) to either fixedStep wig or bedgraph. Anyways, I don't think bedtools can do everything I want.
    Last edited by biznatch; 02-01-2011, 11:06 AM.

    Leave a comment:


  • Michael.James.Clark
    replied
    I feel like I'm the only person in genomics who isn't using BEDTools yet. I guess I should really start.

    I did what Beccy did--wrote my own scripts for this. However, rather than going from bed to wig, I go directly from pileup to wig, cutting out that middleman.

    Leave a comment:


  • rlh60
    replied
    'Sam to bed' and 'bed to wig'

    I use a couple of perl scripts to first convert sam files (produced in bowtie) to bed, then to convert the bed files to wig (in 2 steps):

    1) Sam to bed:
    ******************

    #!/usr/bin/perl -w
    use strict;

    my $input = $ARGV[0];

    # open input file
    open IN, "$input" || die "can't open .sam file";
    open OUT, ">$ARGV[1]";

    print "Do you want to keep strand information? (y/n):";
    my $keep_strand = <STDIN>;
    chomp $keep_strand;

    my $lines = 0;

    if ($keep_strand eq "n")
    {

    while(<IN>){
    chomp $_;
    $lines++;

    my @bits = split /\t/;
    my $chr = $bits[2];
    #print "chr = $chr\n";
    my $start = $bits[3];
    #print "start = $start\n";
    my $strand = $bits[1];
    #print "strand = $strand\n";
    my ($start_new, $end_new);

    # correct for alignments off the chromosome ends
    if( $start =~ m/^\-/ )
    {
    $start = 0;
    }

    if($strand eq "-") #reverse
    {
    $start_new = $start - 200;
    if ($start_new =~ m/^\-/)
    {
    $start_new = 0;
    }
    $end_new = $start;
    }
    else #forward
    {
    $start_new = $start;
    $end_new = $start + 200;
    }

    print OUT "$chr\t$start_new\t$end_new\n";
    # give some feedback
    print STDERR "$lines processed\n" if(!($lines % 100000));
    }



    }
    elsif ($keep_strand eq "y")
    {
    while(<IN>){
    chomp $_;
    $lines++;

    my @bits = split /\t/;
    my $chr = $bits[2];
    #print "chr = $chr\n";
    my $start = $bits[3];
    #print "start = $start\n";
    my $strand = $bits[1];
    #print "strand = $strand\n";
    my ($start_new, $end_new);

    # correct for alignments off the chromosome ends
    if( $start =~ m/^\-/ )
    {
    $start = 0;
    }

    if($strand eq "-") #reverse
    {
    $start_new = $start - 200;
    if ($start_new =~ m/^\-/)
    {
    $start_new = 0;
    }
    $end_new = $start;
    }
    else #forward
    {
    $start_new = $start;
    $end_new = $start + 200;
    }

    print OUT "$chr\t$start_new\t$end_new\t$strand\n";
    # give some feedback
    print STDERR "$lines processed\n" if(!($lines % 100000));
    }
    }


    close IN;
    close OUT;


    2) bed to intermediate file (coverage) for wig:
    *******************************************************

    #!/usr/bin/perl
    # Program to calculate steps across genome prior to creating a density plot.


    use strict;
    my ( $j, $d, $e, $f ) = ( 0, 0, 0, 0 ); my $chr; my %steps = (); my $output = $ARGV[1]; my $user_input = $ARGV[2];
    open OUT, ">$output";

    if ( $#ARGV != 2 ) { print "\nPlease provide: <Input> <Output> <Species(m/h)>\n"; exit; }
    #print "Human (h) or Mouse (m) ?\n\n";
    #my $user_input = <STDIN>;
    chomp $user_input;
    if ( $user_input eq "h" ) { $j = 24; $d = 23; $e = 24; $f = 22; }
    elsif ( $user_input eq "m" ) { $j = 21; $d = 20; $e = 21; $f = 19; }
    else { print "Error.\n\n"; exit; }


    for ( my $i = 1; $i <= $j; $i++ ) {
    %steps = ();
    if ( $i <= $f ) {$chr = "chr$i";}
    elsif ( $i == $d ) {$chr = "chrX";}
    elsif ( $i == $e ) {$chr = "chrY";}

    open FILE, "<$ARGV[0]"; my $j = 0;
    while ( <FILE> ) {
    my $line = $_;

    chomp $line;

    if ($line =~ /^([^\t]+)\t([^\t]+)\t([^\t]+)/){
    #if ($line =~ /^(chr\S+)\t(\d+)\t(\d+)/){

    my ( $chromo, $start, $end ) = ( $1, $2, $3 );
    #print "\$chromo, \$start, \$end = $chromo, $start, $end\n";
    if ( $start < 0 ) { $start = 0; }
    if ( $chromo eq $chr ) {
    $j++;
    if ( $j == 1 ) { print "Checking $chromo\n"; }
    $steps{$chr}{$start}{'sum'} = $steps{$chr}{$start}{'sum'} + 1;

    $steps{$chr}{$end}{'sum'} = $steps{$chr}{$end}{'sum'} - 1;
    }}}
    close ( FILE );
    my $l = 0; my $k1 = 0; my $k2 = 0;

    for $k1 (keys %steps){

    for $k2 (sort {$a<=>$b} keys %{%steps->{$k1}}){
    $l++;
    if ( $l == 1 ) { print "Printing lines for $chr.\n"; }

    print OUT "$k1\t$k2\t$steps{$k1}{$k2}{'sum'}\n";

    }}
    print "Chr$i Processed.\n";
    }
    close OUT;


    3) Intermediate 'steps' file to wig:
    *****************************************

    #!/usr/bin/perl -w
    # Program to take the steps information and create a density wig file.


    use strict;

    my $sum = 0;

    my $oldsum = 0;

    my $oldpos = 0;


    if ( $#ARGV != 3 ) { print "Please provide: a) Input file\nb) Output file\nc)Track Name\nd) Colour pref.\n"; exit; }



    open FILE, "$ARGV[0]";#removed <
    open OUT, ">$ARGV[1]";#changed from >>

    my $header ="track type=bedGraph name=\"".$ARGV[2]."\" visibility=full color=".$ARGV[3]." priority=20\n";



    print OUT $header;

    while (defined (my $line = <FILE>)) {

    chomp $line;

    if ( $line =~ /^([^\t]+)\t([^\t]+)\t([^\t]+)/ ){

    if ( $oldpos ne 0 && $oldsum ne 0){

    print OUT "$1\t$oldpos\t$2\t$oldsum\n";

    }

    $oldpos = $2;

    $oldsum = $oldsum + $3;

    }

    }

    close FILE;
    close OUT;


    ****************************************

    Then use the UCSC wigToBigWig binary to convert to bigwig!

    Beccy

    Leave a comment:


  • JohnK
    replied
    Originally posted by dawe View Post
    Nice! I'm one of the CARPET developers...
    BTW, if you haven't figured this out, you may install BEDTools and do something like this:

    Code:
    $ sortBed -i filein.bed | slopBed -i stdin -s -r XXX -l YYY |genomeCoverageBed | genomeCoverageBed -bg -i stdin -g chromsizes.tab > fileout.bdg
    and, since I've just answered to a wig to bigWig post by you, you can add wigToBigWig to your pipeline:

    Code:
    $ sortBed -i filein.bed | slopBed -i stdin -s -r XXX -l YYY |genomeCoverageBed -bg -i stdin -g chromsizes.tab | wigToBigWig -clip stdin chromsizes.tab fileout.bw
    where
    • filein.bed is your bed file (this can be piped from awk + gff to extract intervals
    • chromsizes.tab is a tab file with chromosome sizes
    • XXX and YYY are sizes you should extend each interval to cover the fragment you sequenced (i.e. if you use Illumina and fragmented 200bp and read 36bp you should set XXX=180 YYY=0)


    HTH
    That's awesome. I had no idea you could pipe to bed tools programs and use them as filters with the -i option. still trying to crack wigToBigWig. I really need this so i can use the bloody genome browser without my hair turning grey.

    Leave a comment:


  • dawe
    replied
    Originally posted by JohnK View Post
    forget it. I'm not installing galaxy. any ideas, anyone?
    Nice! I'm one of the CARPET developers...
    BTW, if you haven't figured this out, you may install BEDTools and do something like this:

    Code:
    $ sortBed -i filein.bed | slopBed -i stdin -s -r XXX -l YYY |genomeCoverageBed | genomeCoverageBed -bg -i stdin -g chromsizes.tab > fileout.bdg
    and, since I've just answered to a wig to bigWig post by you, you can add wigToBigWig to your pipeline:

    Code:
    $ sortBed -i filein.bed | slopBed -i stdin -s -r XXX -l YYY |genomeCoverageBed -bg -i stdin -g chromsizes.tab | wigToBigWig -clip stdin chromsizes.tab fileout.bw
    where
    • filein.bed is your bed file (this can be piped from awk + gff to extract intervals
    • chromsizes.tab is a tab file with chromosome sizes
    • XXX and YYY are sizes you should extend each interval to cover the fragment you sequenced (i.e. if you use Illumina and fragmented 200bp and read 36bp you should set XXX=180 YYY=0)


    HTH

    Leave a comment:


  • JohnK
    replied
    Originally posted by JohnK View Post
    ah- nevermind- found a possible resource called 'CARPET'. I will post a follow-up:

    http://host13.bioinfo3.ifom-ieo-campus.it:8080/

    forget it. I'm not installing galaxy. any ideas, anyone?

    Leave a comment:


  • JohnK
    replied
    ah- nevermind- found a possible resource called 'CARPET'. I will post a follow-up:

    Leave a comment:


  • JohnK
    started a topic gff/bed/anything to .wig conversion?

    gff/bed/anything to .wig conversion?

    Anyone,

    To save me some time, does anyone have a conversion to .wig script or know of a resource? I have gff and bed formats. Thanks!

    John

Latest Articles

Collapse

  • seqadmin
    Understanding Genetic Influence on Infectious Disease
    by seqadmin




    During the COVID-19 pandemic, scientists observed that while some individuals experienced severe illness when infected with SARS-CoV-2, others were barely affected. These disparities left researchers and clinicians wondering what causes the wide variations in response to viral infections and what role genetics plays.

    Jean-Laurent Casanova, M.D., Ph.D., Professor at Rockefeller University, is a leading expert in this crossover between genetics and infectious...
    09-09-2024, 10:59 AM
  • seqadmin
    Addressing Off-Target Effects in CRISPR Technologies
    by seqadmin






    The first FDA-approved CRISPR-based therapy marked the transition of therapeutic gene editing from a dream to reality1. CRISPR technologies have streamlined gene editing, and CRISPR screens have become an important approach for identifying genes involved in disease processes2. This technique introduces targeted mutations across numerous genes, enabling large-scale identification of gene functions, interactions, and pathways3. Identifying the full range...
    08-27-2024, 04:44 AM

ad_right_rmr

Collapse

News

Collapse

Topics Statistics Last Post
Started by seqadmin, Today, 06:39 AM
0 responses
5 views
0 likes
Last Post seqadmin  
Started by seqadmin, 09-11-2024, 02:44 PM
0 responses
13 views
0 likes
Last Post seqadmin  
Started by seqadmin, 09-06-2024, 08:02 AM
0 responses
147 views
0 likes
Last Post seqadmin  
Started by seqadmin, 09-03-2024, 08:30 AM
0 responses
153 views
0 likes
Last Post seqadmin  
Working...
X