I would like some help from the experts here with editing my GFF file.
Note that from the third row onward, the input has 12 columns ($1 to $12), tab separated. The last line of the file has 16 columns ($1 to $16), also tab separated. Lines starting with ## should be ignored.
Any help in Perl is appreciated. The input file looks like this:
Code:
##dsfsd2
##sdf-sdf sasg 5.6.3
gi34_ex Gen CDS 161 317 . + . Name=Xm ZAK;created by=User
gi56_ex Gen CDS 2194 2280 . + . Name=Xm ZAK;created by=User
gi37_ex Gen CDS 2848 2951 . + . Name=Xm ZAK;created by=User
gi37_ex Gen CDS 4554 4619 . + . Name=Xm ZAK;created by=User
gi37_ex Gen CDS 4729 4756 . + . Name=Xm ZAK;created by=User
gi37_ex Gen extracted region 1 11677 . + . Name=Extracted region from gi|371443185|gb|JH556675.1|;Extracted interval="1960862 -> 1972538"
I want to take $14 of the last line (interval="1960862), keep only the number (1960862), and add it to column $4 (161, 2194, 2848 ... 4729), i.e. 161 + 1960862 = 1961023, 2194 + 1960862 = 1963056, and so on, and to column $5 (317, 2280, 2951 ... 4756), i.e. 317 + 1960862 = 1961179, 2280 + 1960862 = 1963142, and so on. The last line itself should be left unchanged.
The output should look like this:
Code:
##dsfsd2
##sdf-sdf sasg 5.6.3
gi34_ex Gen CDS 1961023 1961179 . + . Name=Xm ZAK;created by=User
gi56_ex Gen CDS 1963056 1963142 . + . Name=Xm ZAK;created by=User
gi37_ex Gen CDS 1963710 1963813 . + . Name=Xm ZAK;created by=User
gi37_ex Gen CDS 1965416 1965481 . + . Name=Xm ZAK;created by=User
gi37_ex Gen CDS 1965591 1965618 . + . Name=Xm ZAK;created by=User
gi37_ex Gen extracted region 1 11677 . + . Name=Extracted region from gi|371443185|gb|JH556675.1|;Extracted interval="1960862 -> 1972538"
I have used the Perl script below, but something is wrong with it: the output is the same as the input.
Code:
#!/usr/bin/perl
use warnings;
use strict;

open my $IN, '<', '1.in' or die $!;

my $line;
$line = $_ while <$IN>;                  # Remember the last line.
my $last = $.;                           # Remember the number of the last line.
my $interval = (split /\t/, $line)[13];  # Extract the 14th column.
$interval =~ s/[^0-9]+//;                # Keep only the number.

seek $IN, 0, 0;                          # Rewind to the beginning of the input.
$. = 0;                                  # Restart the line counter.

my $start = 1;                           # Flag to skip first lines.
while (<$IN>) {
    my @columns = split /\t/;
    /^##/ or undef $start;               # Unset start if the header is over.
    if (not ($start or $. == $last)) {   # Not header or last line?
        $_ += $interval for @columns[3, 4];
    }
    print join "\t", @columns;
}
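One possible reason the numbers come out unchanged: the script counts columns by tabs, while $14 only exists when the last line is counted by whitespace. If the attributes are really one tab-separated field, `(split /\t/, $line)[13]` is undefined and the amount added is 0. A trailing empty line at the end of the file would have the same effect. Below is a minimal sketch that avoids depending on column positions for the interval. It assumes the last non-empty line carries `interval="N -> M"` and that columns 4 and 5 of the other lines are plain integers separated by tabs or spaces. `shift_gff` is a hypothetical helper name, not part of any module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: add the interval start found in the last non-empty
# line to columns 4 and 5 of every other non-header line.
sub shift_gff {
    my @lines = @_;

    # Index of the last non-empty line, which is assumed to hold the interval.
    my ($last) = grep { $lines[$_] =~ /\S/ } reverse 0 .. $#lines;
    return @lines unless defined $last;

    # Match the number directly instead of relying on a column position.
    my ($offset) = $lines[$last] =~ /interval="(\d+)/
        or die "No interval=\"N ...\" found in the last line\n";

    for my $i (0 .. $#lines) {
        next if $i == $last;            # leave the region line unchanged
        next if $lines[$i] =~ /^##/;    # leave header lines unchanged
        # Shift fields 4 and 5, keeping the original separators as they are.
        $lines[$i] =~ s/^((?:\S+\s+){3})(\d+)(\s+)(\d+)(?!\S)/$1 . ($2 + $offset) . $3 . ($4 + $offset)/e;
    }
    return @lines;
}

# Runs only when a file name is given, e.g.: perl shift_gff.pl 1.in
print shift_gff(<>) if @ARGV;
```

Because the substitution keeps the existing separators, the rest of each line, including the attribute field with its embedded spaces, is printed exactly as it was read.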