Hi all,
I just downloaded a public SOLiD whole-genome dataset from the Short Read Archive (SRA) and converted the .sra file to a FASTQ file. The FASTQ file is formatted like this:
@ERR123473.934 1_25_1048 length=120
T100312003001020023222130331202112031020303121031301231232221G233113003011330303111102100131333312213333220011200121003033
+ERR123473.934 1_25_1048 length=120
!>@[email protected]@@[email protected]@[email protected]<@>[email protected][email protected]>@@:@>[email protected]@@[email protected]@@[email protected]=;@@@;@[email protected]@/@/@>[email protected]/?<[email protected]@[email protected]@@@@@@@@@@@@@@@@@@@@@@[email protected]@@@>@@@@<@@@[email protected]@@@@[email protected]@@@@@@@2>@@</@@
These data are from a 1000 Genomes sample that was sequenced paired-end, I believe as 2 x 60 reads. Why does the header say the length is 120? Is the 'G' in the middle of the sequence separating the two reads, and shouldn't the two reads be separate entries in the file?
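To make the guess concrete, here is a minimal Python sketch of how I am reading the sequence line. It assumes (I have not confirmed this) that each letter is the primer base that starts a colorspace read, followed by that read's color calls (0-3, or '.' for a missing call). The function name `split_colorspace_read` is just something I made up for illustration, and it only looks at the sequence line, not the quality line.

```python
import re


def split_colorspace_read(seq):
    """Split a concatenated colorspace sequence into its component reads.

    Assumption: every read starts with one primer base (A, C, G or T)
    followed by a run of color calls (0-3, or '.' for a missing call).
    """
    return re.findall(r"[ACGT][0-3.]+", seq)


# Sequence line from the record above, wrapped only for readability.
seq = (
    "T100312003001020023222130331202112031020303121031301231232221"
    "G233113003011330303111102100131333312213333220011200121003033"
)

reads = split_colorspace_read(seq)
for read in reads:
    # Print the primer base and the number of color calls that follow it.
    print(read[0], len(read) - 1)
```

If that assumption holds, the record contains two reads of 60 colors each, which would account for the length=120 in the header.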
I am just a little confused about the format here.
Thanks for any help.