
**Awesome** · 04-29-2011, 08:28 PM

Assuming you have the images saved...

You probably should have just removed the first cycle from image analysis and basecalling by using the command-line arguments. There isn't really a need to change file locations and folder names.

**rmdavies** · 05-03-2011, 06:07 AM

Originally posted by sramshey:

Hello-

I have a question regarding the use of the script illumina2srf. We recently had a HiSeq run in which the first cycle did not contain any data (clogged fluidics?). Illumina technical support advised us that we could improve the overall quality of our data for the lane in question by removing the first cycle. This involved removing the data folder in <run folder>/Data/Intensities/<lane>/C1.1, renaming the folders for all of the subsequent cycles, editing the config.xml in the Intensities folder to reflect the changes, and then repeating the entire procedure for the control lane as well. Following these steps we were able to generate fastq files, but when we attempt to run illumina2srf to generate our srf files we encounter an error indicating that cycle 1 is missing from our renumbered tiles:

/house/sdm/prod/illumina/staging/hiseq05/110224_HISEQ05_0066_B816YKABXX_1606/Data/Intensities/Bustard1.8.0_25-04-2011_sdm/../../../Config/FlowCellId.xml:
No such file or directory
Processing sequence files
/house/sdm/prod/illumina/staging/hiseq05/110224_HISEQ05_0066_B816YKABXX_1606/Data/Intensities/Bustard1.8.0_25-04-2011_sdm/s_3_1_0001_qseq.txt
/house/sdm/prod/illumina/staging/hiseq05/110224_HISEQ05_0066_B816YKABXX_1606/Data/Intensities/Bustard1.8.0_25-04-2011_sdm/s_3_2_0001_qseq.txt
Error: Missing cycle 1 for lane 3 tile 1 from CIF files.

I don't know how illumina2srf knows about cycles - perhaps they are encoded in the cif files? Is there a way that we can (easily) fool illumina2srf and force it to process the lane in a similar way to how we generated our fastqs?

Thanks in advance!

Yes, illumina2srf reads the cycle number from the header of each .cif file, so it can't be fooled simply by changing the directory structure. You could try using the following Perl script to fix them. Give it the list of .cif files to modify on the command line; it subtracts 1 from the cycle number stored in each file's header.

Code:

#!/usr/bin/perl

use strict;
use warnings;

foreach my $file (@ARGV) {
    # Open .cif file read-write
    open(my $f, '+<', $file) || die "Couldn't open $file for update: $!\n";
    my $data;
    # Read header
    my $got = read($f, $data, 13);
    (defined($got) && $got == 13) || die "Couldn't read 13-byte header from $file\n";
    # Subtract 1 from cycle number
    substr($data, 5, 2) = pack('v', unpack('v', substr($data, 5, 2)) - 1);
    # Write header back out
    seek($f, 0, 0) || die "Couldn't rewind $file: $!\n";
    # Use low-precedence 'or' so the check applies to print, not to $data
    print {$f} $data or die "Couldn't write to $file: $!\n";
    close($f) || die "Error writing to $file: $!\n";
}

An example of what it does:

Code:

$ hexdump -C -n 16 s_1_43.cif
00000000  43 49 46 01 02 19 00 01  00 81 3d 05 00 2a 00 dc  |CIF.......=..*..|
$ ./cif_fix.pl s_1_43.cif
$ hexdump -C -n 16 s_1_43.cif
00000000  43 49 46 01 02 18 00 01  00 81 3d 05 00 2a 00 dc  |CIF.......=..*..|

Note that this updates the .cif files in place, so I would strongly recommend backing them up before attempting to run it. Also, there's no guarantee that illumina2srf will work even after doing this. It would depend on whether it finds any other inconsistencies in the data.
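To check the headers before and after running the fix without touching the files, a read-only dump can help. Below is a minimal Python sketch, assuming the 13-byte header layout suggested by the script and hexdump above: the "CIF" magic, a version byte, a data-size byte, then little-endian 16-bit cycle number, 16-bit cycle count and 32-bit cluster count. Only the position of the cycle number is confirmed by the script; the other field names, and the function names `parse_cif_header` and `read_cif_header`, are my own assumptions for illustration.

```python
import struct

# Assumed 13-byte little-endian header layout (inferred from the hexdump above):
#   3s magic "CIF", B version, B data size,
#   H first cycle, H number of cycles, I cluster count
HEADER = struct.Struct("<3sBBHHI")


def parse_cif_header(raw):
    """Decode the first 13 bytes of a .cif file into a dict (read-only)."""
    if len(raw) < HEADER.size:
        raise ValueError("header is shorter than 13 bytes")
    magic, version, data_size, first_cycle, n_cycles, n_clusters = HEADER.unpack(
        raw[:HEADER.size]
    )
    if magic != b"CIF":
        raise ValueError("not a CIF file (bad magic)")
    return {
        "version": version,
        "data_size": data_size,
        "first_cycle": first_cycle,
        "n_cycles": n_cycles,
        "n_clusters": n_clusters,
    }


def read_cif_header(path):
    """Open a .cif file in binary read mode and decode its header."""
    with open(path, "rb") as f:
        return parse_cif_header(f.read(HEADER.size))
```

Calling `read_cif_header("s_1_43.cif")` on the example file should report `first_cycle` as 25 before the fix and 24 after it, matching the 0x19 and 0x18 bytes in the hexdump.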

If you can live without the intensity data, an easier solution is to leave out the -b and -r options. illumina2srf will then ignore the .cif files and generate a considerably smaller .srf file.
