
**Yang Ding** · 02-08-2014, 08:00 PM

I'm a little confused: I received the following reply from GenoMax by e-mail, but it does not appear on the forum. Anyway, here's the reply:

Originally posted by GenoMax

SRA toolkit error messages can be benign, data set specific etc. Perhaps there is no problem here.

It may not hurt to send a message to SRA support. Use the "Write to helpdesk" link at the bottom of the page for the toolkit download tab. Include the dataset you are using. It is the weekend, so you may not hear back until Monday. In the past they have sometimes confirmed whether there was a problem with a specific dataset.

Thanks for the information. As for the NCBI help desk, we did write to them more than two weeks ago, but there was no reply. We suspect something is wrong with the mail servers. Since we could not find any related topics or threads on the internet, yesterday we sent another message and also decided to ask the question here. However, as you mentioned, maybe we should have included our dataset IDs to tell NCBI which ones we'd like checked.

**Yang Ding** · 02-17-2014, 06:13 AM

The NCBI Help Desk replied to me a few days ago and helped resolve these issues. I think it would be good to share the solution with everyone, so here it is:

The data will always be valid and complete as long as fastq-dump does not produce any error messages. fastq-dump can produce many warnings when operating on valid data, especially when the log level is set to 5 (the default is 4).
The data will also always be valid and complete as long as it passes the vdb-validate program (i.e. all the outputs are "OK").
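
For anyone who wants to automate the two checks above, here is a minimal Python sketch. It assumes the toolkit's log lines look like `<timestamp> <tool>.<version> <level>: <message>` with level names such as `warn` and `err`. That shape is an assumption based on typical output and may differ between toolkit versions. The helper names (`count_levels`, `looks_clean`, `check_run`) are hypothetical, not part of the SRA Toolkit.

```python
import re
import subprocess
from collections import Counter

# Assumed log-line shape: "<timestamp> <tool>.<version> <level>: <message>".
# The level names below are an assumption about the toolkit's output.
LEVEL_RE = re.compile(r"\s(fatal|sys|int|err|warn|info):\s")


def count_levels(log_text):
    """Count log lines per severity level in captured tool output."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LEVEL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def looks_clean(log_text, returncode=0):
    """True when the exit status is 0 and nothing was logged at 'err' or worse.

    Warnings and info lines are deliberately ignored, following the
    help desk's advice that warnings alone do not indicate data loss.
    """
    counts = count_levels(log_text)
    serious = counts["fatal"] + counts["sys"] + counts["int"] + counts["err"]
    return returncode == 0 and serious == 0


def check_run(path):
    """Run vdb-validate on one SRA file (needs the SRA Toolkit on PATH)."""
    result = subprocess.run(
        ["vdb-validate", path], capture_output=True, text=True
    )
    return looks_clean(result.stdout + result.stderr, result.returncode)
```

The same `looks_clean` check can be applied to the captured output of a fastq-dump run. Only `check_run` needs the toolkit installed. The other two functions work on any saved log text.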

**albireo** · 02-17-2014, 06:33 AM

What happens if you try sam-dump on the same SRA files instead?

**Yang Ding** · 02-22-2014, 04:29 AM

Originally posted by albireo

What happens if you try sam-dump on the same SRA files instead?

Hi albireo,

Sorry for the late reply. These SRA files contain pure FastQ data, not SAM data, and even after reading the sam-dump help page I'm not sure which parameters I should set to decrypt these SRA files correctly with sam-dump. Could you tell me why you're interested in the output of sam-dump?

**shuoguo** · 02-22-2014, 07:38 AM

Thanks for sharing the information!
May I ask why NCBI favors SRA instead of just keeping FASTQ?

**Yang Ding** · 02-22-2014, 03:30 PM

Originally posted by shuoguo

Thanks for sharing the information!
May I ask why NCBI favors SRA instead of just keeping FASTQ?

As far as I know, FASTQ is a text-based format, so it is better to compress the files before distributing them to save time. I don't know why NCBI chose SRA instead of another popular compression format, but my guess is that by developing its own compression format, NCBI keeps full control over files compressed this way, with security probably being the most important concern.

**shuoguo** · 02-22-2014, 08:09 PM

Originally posted by Yang Ding

As far as I know, FASTQ is a text-based format, so it is better to compress the files before distributing them to save time. I don't know why NCBI chose SRA instead of another popular compression format, but my guess is that by developing its own compression format, NCBI keeps full control over files compressed this way, with security probably being the most important concern.

Thank you!


FastQ decrypted from SRA toolkit with warnings: any loss of information?
