@crazyhottommy
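Try this script. It gives you a FASTQ file instead of a SAM/BAM file.
#!/usr/bin/perl
# Convert a tab-delimited Illumina export file to FASTQ.
# Usage: perl <this script> <export file> <output fastq>
use warnings;
use strict;

my ($datafile, $outfile) = @ARGV;
die "usage: $0 <export file> <output fastq>\n" unless defined $outfile;
open(my $in,  '<', $datafile) or die "can't open the datafile: $datafile: $!\n";
open(my $out, '>', $outfile)  or die "can't open the outputfile: $outfile: $!\n";
while (my $line = <$in>) {
    chomp $line;
    # Columns used: 0 machine, 1 run, 2 lane, 3 tile, 4 x, 5 y, 6 index, 7 read number, 8 sequence, 9 quality
    my @i = split(/\t/, $line);
    # Read ID: run and lane are joined with no separator; the quality string is copied unchanged.
    print $out "@".$i[0].":".$i[1].$i[2].":".$i[3].":".$i[4].":".$i[5]."#".$i[6]."/".$i[7]."\n".$i[8]."\n"."+"."\n".$i[9]."\n";
}
close $in;
close $out;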
-
Originally posted by GenoMax:
@crazyhottommy: You should clarify on your blog post that your modifications are specifically targeted at human (?) data. If someone else has a different genome, it would be incorrect to follow your procedure as is.
@Nino/@crazyhottommy: I am not sure what the downstream application is/was in your case, but you have to account for the Q-scores probably being in non-Sanger format (this is old data). Most new tools will expect them to be in Sanger format.
@Nino: Check your PM. I sent you a script to recreate a FASTQ sequence file yesterday. That may be a safer place to start. I can post it here if it works for you.
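The Q-score point above can be illustrated with a minimal Python sketch. It assumes the old data uses the Illumina 1.3+ encoding (Phred scores with an ASCII offset of 64) and that the target is Sanger encoding (offset 33); nothing in this thread confirms which encoding the files actually use, and the older Solexa scale would need a different conversion. The function names are hypothetical, and the file helper assumes plain four-line FASTQ records.

```python
def phred64_to_phred33(quality: str) -> str:
    """Shift a quality string from ASCII offset 64 to ASCII offset 33."""
    converted = []
    for ch in quality:
        score = ord(ch) - 64  # Phred score under the assumed offset of 64
        if score < 0:
            # A character below '@' cannot be a Phred+64 score, so the
            # input is probably already in Sanger (offset 33) encoding.
            raise ValueError(f"character {ch!r} is below the Phred+64 range")
        converted.append(chr(score + 33))
    return "".join(converted)


def convert_fastq(in_path: str, out_path: str) -> None:
    """Rewrite a four-line-per-record FASTQ file with converted qualities."""
    with open(in_path) as fin, open(out_path, "w") as fout:
        for n, line in enumerate(fin):
            line = line.rstrip("\n")
            if n % 4 == 3:  # every fourth line holds the quality string
                line = phred64_to_phred33(line)
            fout.write(line + "\n")
```

Raising an error on out-of-range characters is a deliberate choice here: it stops the conversion from being applied a second time to data that is already in Sanger format.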
-
@GenoMax: I received your PM. Could you please look at my response to see whether the file I am working on is an alignment file? Also, the data I am working with was downloaded from the NCBI website; apparently the people who uploaded it used the Illumina CASAVA pipeline (this is all the information that was given to me).
-
Originally posted by Nino:
Hey, were you ever able to find a solution to your problem? I am currently running into the same issue as well.
[jhpce01 /amber3/feinbergLab/personal/sramazan/chip-seq]$ /amber3/feinbergLab/personal/sramazan/perl/scripts/export2sam.pl --read1=GSM1053091_mm9.nac.inp1.sorted.txt
@PG ID:export2sam.pl VN:2.3.1 CL:/amber3/feinbergLab/personal/sramazan/perl/scripts/export2sam.pl --read1=GSM1053091_mm9.nac.inp1.sorted.txt
ERROR: Unexpected number of fields in export record on line 1 of read1 export file. Found 16 fields but expected 22.
...erroneous export record:
HWI-EASXXX 1 2 35 1301 1347 0 1 ATGTAGCTAGAGACTTGAGCTCTGGGGGGTACTGGT aaa^]`aa`a_a_[_^`^`__`^^^][_XLQR[[]S chr10.fa 3003189 F 36 12
-
Hi all,
I found http://genomewiki.ucsc.edu/index.php/ABRF2010_Tutorial and downloaded the latest version of samtools. I ran:
export2sam.pl --read1=chr21_export.txt \
| perl -wpe 's/(chr.*)\.fa/$1/' \
> chr21.sam
It says:
ERROR: Unexpected number of fields in export record on line 1 of read1 export file. Found 16 fields but expected 22.
My sorted.txt file contains only 16 fields, and the program is complaining about that. But the example file given in the link above also contains only 16 fields.
I thought maybe they had changed the format, so I then downloaded earlier versions of samtools (0.1.8, 0.1.9, etc.). They still gave me errors like:
Use of uninitialized value $t[21] in string eq at export2sam.pl line 279,
and an earlier version just says die at line 17.
Can anyone help me? Thanks!
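Before choosing a converter, it can help to confirm how many tab-separated fields each record really has, since the error above hinges on 16 versus 22. The following is a minimal Python sketch; the function name is hypothetical, and it assumes the file is tab-delimited with one record per line.

```python
from collections import Counter


def field_count_histogram(lines, delimiter="\t"):
    """Return a Counter mapping field count -> number of lines with that count."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line:  # skip blank lines
            # Empty fields between consecutive delimiters are still counted.
            counts[len(line.split(delimiter))] += 1
    return counts
```

Passing an open file handle for the sorted.txt file to `field_count_histogram` shows whether every record has 16 fields, or whether the file mixes record types.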