Hi Everyone,
I am in the midst of renaming some FASTA headers to include the chromosomes they have been mapped to (taken from a tab-delimited text file). I don't have any problems setting up the hash of scaffold and chromosome locations. The problem is that some of my scaffold names are substrings of longer ones (scaffold1, scaffold10 and scaffold100). Usually I would add "$" to the end of my search term (/scaffold1$/) to anchor the match at the end of the string, but I'm not sure how to do this when the search term is a variable ($hash{$scaff/$/}). Advice?
Below is my script so far...and TIA!
Code:
#!/usr/bin/perl
#mod-header2include-chrom.pl
#This script is intended to read in a fasta file and a tab-delimited file and use the information from the tab-delimited file to modify the header.
#In this case, we are appending to the header (e.g. "scaffold671") the chromosome to which it has been mapped and the number of genes on this scaffold.
use strict;
use warnings;
open (DATA, '<', 'genome-assoc-chromosomes.txt') or die "Could not open genome chromosome mapping data: $!\n";
open (FASTA, '<', 'scafSeq.FG.fill') or die "Could not open Fasta file: $!\n";
my %hash;
while (<DATA>)
{
chomp;
if (/scafold/) { #skip header - scafold is spelled incorrectly on purpose
next;
}
else {
my ($key, $chrom, $len, $gene) = split /\t/;
$hash{$key} = $chrom;
}
}
while (<FASTA>){
my $line = $_;
chomp ($line);
if ( defined $hash{$line} ) {
print "$line-$hash{$line}\n"; #chomp removed the newline, so put it back
}
else { print "$line\n"; }
}
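For what it's worth, here is a minimal sketch of two ways the scaffold1/scaffold10 overlap might be avoided. It is only an illustration under some assumptions: the %chrom_for data and the sub names header_matches and rename_header are made up for the example, and it assumes the keys in the mapping file do not include the leading ">" of the FASTA header and that the scaffold ID is the first word after it. The first sub shows how a variable can be anchored inside a regex (\Q...\E quotes any metacharacters in the variable); the second sidesteps regex matching by extracting the ID and doing an exact hash lookup.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical mapping data standing in for genome-assoc-chromosomes.txt.
my %chrom_for = (
    scaffold1   => 'chr3',
    scaffold10  => 'chr7',
    scaffold100 => 'chr1',
);

# Option 1: anchor an interpolated variable inside a regex.
# \Q...\E quotes regex metacharacters in the ID; ^ and $ pin both ends,
# so "scaffold1" cannot match inside "scaffold10".
sub header_matches {
    my ($header, $scaff) = @_;
    return $header =~ /^>?\Q$scaff\E$/ ? 1 : 0;
}

# Option 2: extract the ID from the header and do an exact hash lookup.
# Assumes the ID is the first run of non-space characters after ">".
sub rename_header {
    my ($line, $map) = @_;
    if ($line =~ /^>(\S+)/ && exists $map->{$1}) {
        return $line . '-' . $map->{$1};
    }
    return $line;    # sequence lines and unmapped headers pass through unchanged
}

# Small demonstration on a few made-up lines.
print rename_header($_, \%chrom_for), "\n"
    for '>scaffold1', '>scaffold10', '>scaffold999', 'ACGTACGT';
```

Since hash keys are compared as whole strings, the lookup in the second sub never confuses scaffold1 with scaffold10, which may make the regex anchoring unnecessary for this particular task.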