**TonyBrooks** · 07-02-2013, 06:00 AM

Contact your core facility to find out what they've done.

Both of those files are denoted as read 1. See http://en.wikipedia.org/wiki/FASTQ_format for a description of the header.

@HWI-ST1018:135:H0A9YADXX:1:1101:1124:1996 1:N:0:GGCTAT
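For anyone who wants to inspect these headers programmatically, here is a minimal sketch that splits a Casava 1.8 style identifier into its fields, following the layout described on the Wikipedia page linked above. The names `IlluminaHeader` and `parse_header` are made up for this illustration and are not part of any library.

```python
from typing import NamedTuple


class IlluminaHeader(NamedTuple):
    """Fields of a Casava 1.8+ style FASTQ identifier (hypothetical helper type)."""
    instrument: str
    run_number: int
    flowcell: str
    lane: int
    tile: int
    x: int
    y: int
    read: int            # 1 or 2: the member of the pair
    is_filtered: str     # "Y" or "N"
    control_number: int
    index: str           # index (barcode) sequence


def parse_header(line: str) -> IlluminaHeader:
    """Split one header line into its named fields."""
    if not line.startswith("@"):
        raise ValueError("FASTQ header lines start with '@'")
    # The part before the space locates the cluster; the part after describes the read.
    left, right = line[1:].strip().split(" ", 1)
    instrument, run, flowcell, lane, tile, x, y = left.split(":")
    read, is_filtered, control, index = right.split(":")
    return IlluminaHeader(instrument, int(run), flowcell, int(lane), int(tile),
                          int(x), int(y), int(read), is_filtered, int(control), index)


header = parse_header("@HWI-ST1018:135:H0A9YADXX:1:1101:1124:1996 1:N:0:GGCTAT")
print(header.lane, header.read, header.index)  # prints: 1 1 GGCTAT
```

The first field after the space is the read number, which is why a header ending in `1:N:0:GGCTAT` marks read 1.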

I would also recommend that they allow 0 errors when demultiplexing. The error rate should be low enough that indexes with errors need not be included, unless there was a problem during the run (BMS during index read).
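To illustrate what "0 errors" means here, the sketch below assigns a read to a sample only if its index is within a given number of mismatches (Hamming distance) of the expected index. It is not how any particular demultiplexer is implemented. The sample name and the choice of `GGCTAC` as the expected index are assumptions made for the example.

```python
from typing import Dict, Optional


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length index sequences differ."""
    if len(a) != len(b):
        raise ValueError("index sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(observed: str, sample_indexes: Dict[str, str],
                  max_mismatches: int = 0) -> Optional[str]:
    """Return the single sample whose index is close enough, else None."""
    hits = [name for name, idx in sample_indexes.items()
            if hamming(observed, idx) <= max_mismatches]
    # Ambiguous or unmatched reads are left unassigned.
    return hits[0] if len(hits) == 1 else None


# Hypothetical sample sheet: one sample, expected index assumed to be GGCTAC.
sample_indexes = {"sample_A": "GGCTAC"}

print(hamming("GGCTAT", "GGCTAC"))                                   # 1
print(assign_sample("GGCTAT", sample_indexes, max_mismatches=0))     # None
print(assign_sample("GGCTAT", sample_indexes, max_mismatches=1))     # sample_A
```

Under that assumption, a read tagged `GGCTAT` ends up in the sample's file only when one mismatch is allowed, which would explain seeing more than one index sequence within a single file.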

**rozitaa** · 07-02-2013, 06:07 AM

Thanks Tony. Yes, both files are marked as read 1, and they don't have the same identifiers. I just wanted to be sure that there isn't any other FASTQ format apart from the one you mentioned.

**TonyBrooks** · 07-02-2013, 06:34 AM

Something is up with that data. Your sequencing facility should be able to help you out.

**kmcarr** · 07-02-2013, 07:03 PM

Originally posted by rozitaa:

Hi,

I have received my sequencing data from a sequencing core facility. It was generated with Illumina paired-end sequencing, but the read identifiers for the forward and reverse reads of one sequence do not match at all. In addition, the second part of the identifier (the read number within the pair) is always 1.
The other problem is with the indexes: they are sometimes not the same within an individual file.

e.g.
Read 1:
@HWI-ST1018:135:H0A9YADXX:1:1101:1124:1996 1:N:0:GGCTAT

@HWI-ST1018:135:H0A9YADXX:1:1101:2172:1979 1:N:0:GGCTAC

@HWI-ST1018:135:H0A9YADXX:1:1101:2146:1994 1:N:0:GGCTAC

Read 2:
@HWI-ST1018:135:H0A9YADXX:2:1101:1400:1999 1:N:0:GGCTAC

@HWI-ST1018:135:H0A9YADXX:2:1101:1657:1985 1:N:0:GGCTAC

@HWI-ST1018:135:H0A9YADXX:2:1101:1612:1996 1:N:0:GGCTAC

Could you please help me with identifying the format of my files?

Thanks,
Rozita

Those sets of reads come from two different lanes, lane 1 and lane 2, as indicated by the fourth colon-separated field of each identifier (the number shown in red in the original post).
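A quick way to confirm this for a whole file is to tally the lane and read number found in each header. The sketch below does that for the six headers quoted above. The function name `lane_and_read` and the list names are made up for the example, and it assumes Casava 1.8 style headers.

```python
from collections import Counter
from typing import Tuple


def lane_and_read(header: str) -> Tuple[int, int]:
    """Extract (lane, read number) from a Casava 1.8+ style header line."""
    left, right = header[1:].split(" ", 1)
    lane = int(left.split(":")[3])    # fourth colon-separated field
    read = int(right.split(":")[0])   # first field after the space
    return lane, read


# Headers copied from the two excerpts in the quoted post.
file_1 = [
    "@HWI-ST1018:135:H0A9YADXX:1:1101:1124:1996 1:N:0:GGCTAT",
    "@HWI-ST1018:135:H0A9YADXX:1:1101:2172:1979 1:N:0:GGCTAC",
    "@HWI-ST1018:135:H0A9YADXX:1:1101:2146:1994 1:N:0:GGCTAC",
]
file_2 = [
    "@HWI-ST1018:135:H0A9YADXX:2:1101:1400:1999 1:N:0:GGCTAC",
    "@HWI-ST1018:135:H0A9YADXX:2:1101:1657:1985 1:N:0:GGCTAC",
    "@HWI-ST1018:135:H0A9YADXX:2:1101:1612:1996 1:N:0:GGCTAC",
]

for name, headers in (("file_1", file_1), ("file_2", file_2)):
    counts = Counter(lane_and_read(h) for h in headers)
    print(name, dict(counts))
```

Every header in the first excerpt is (lane 1, read 1) and every header in the second is (lane 2, read 1), so neither excerpt contains read 2 data.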

**rozitaa** · 07-03-2013, 02:08 AM

Yes, I see. Thanks. But they are in the same file, representing the 2 reads of one sequence. I should contact them and figure it out.

**mastal** · 07-03-2013, 02:15 AM

Fastq file format for paired end sequences

The R1 and R2 reads of a pair are usually in different files.

How many files did you get from the sequence provider, and what were the files called?

**rozitaa** · 07-03-2013, 02:58 AM

Actually, I got one file for each sample (e.g. "P424_101_index11"). Inside that there are two different files ("130419_AH02WFADXX", "130423_AH0A9YADXX") and, according to the facility, only one of them is a valid experiment (the red one). In the inner directory I can find two fastq files ("1_130423_AH0A9YADXX_P424_101_index11_1.fastq" and "2_130423_AH0A9YADXX_P424_101_index11_1.fastq"). Some lines from each file were shown previously as examples.

**mastal** · 07-03-2013, 03:23 AM

Fastq file format for paired end sequences

You need to contact the sequence provider and find out what they did.

If they ran a paired-end experiment, then you should have files with the R2 reads matching the R1 reads that you already have.

You appear to have two files for the same sample, run on lane 1 and lane 2, and from what you showed previously, both files are R1.

Running the samples in more than one lane would be expected if you wouldn't get enough reads from one lane of sequencing, or if you have several multiplexed samples and want to run each sample in the same lanes so as to avoid lane effects.

You need to find out whether the sequencing center performed a single-end or paired-end run with your samples and, if they did a paired-end run, what they have done with the R2 files.
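If it helps when talking to the provider, here is a rough sketch of a pairing check: in a properly paired R1/R2 set, the part of each header before the space should be identical record by record, and the read numbers should be 1 and 2. The function names are made up for this example, and `fastq_headers` assumes plain four-line FASTQ records with no wrapped sequence lines.

```python
from itertools import zip_longest
from typing import Iterable, Iterator, List


def fastq_headers(lines: Iterable[str]) -> Iterator[str]:
    """Yield the header (first line) of each four-line FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 0:
            yield line.rstrip("\n")


def read_id(header: str) -> str:
    """Cluster identifier: everything between '@' and the first space."""
    return header[1:].split(" ", 1)[0]


def read_number(header: str) -> str:
    """Read number: first colon-separated field after the space."""
    return header.split(" ", 1)[1].split(":")[0]


def check_pairing(r1_headers: Iterable[str], r2_headers: Iterable[str]) -> List[str]:
    """Return a list of problems found; an empty list means the files look paired."""
    problems = []
    for n, (h1, h2) in enumerate(zip_longest(r1_headers, r2_headers), start=1):
        if h1 is None or h2 is None:
            problems.append(f"record {n}: files differ in length")
            break
        if read_id(h1) != read_id(h2):
            problems.append(f"record {n}: identifiers differ")
        if (read_number(h1), read_number(h2)) != ("1", "2"):
            problems.append(f"record {n}: read numbers are "
                            f"{read_number(h1)} and {read_number(h2)}, expected 1 and 2")
    return problems


# First header from each of the two files quoted earlier in the thread.
from_thread = check_pairing(
    ["@HWI-ST1018:135:H0A9YADXX:1:1101:1124:1996 1:N:0:GGCTAT"],
    ["@HWI-ST1018:135:H0A9YADXX:2:1101:1400:1999 1:N:0:GGCTAC"],
)
print(from_thread)

# Hypothetical well-formed pair, invented for comparison.
well_formed = check_pairing(
    ["@HWI-ST1018:135:H0A9YADXX:1:1101:1124:1996 1:N:0:GGCTAC"],
    ["@HWI-ST1018:135:H0A9YADXX:1:1101:1124:1996 2:N:0:GGCTAC"],
)
print(well_formed)
```

With real files you could pass `fastq_headers(open(path))` for each file instead of the hard-coded lists. The headers from the thread fail both checks, while the invented pair passes.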

**rozitaa** · 07-03-2013, 03:25 AM

Yeah, Thanks all.
