  • Need a small Illumina run dataset for practice

    Could someone please give me some Illumina data (preferably MiSeq/HiSeq) to play with? I'm trying to process the raw data with CASAVA, but the runs I have seem to be faulty, because they are missing all kinds of information files.
    Of course, it should only be small runs, since large ones would be impossible to download over the network.
    I would be infinitely grateful. I really need to learn to work with Illumina data.

  • #2
    You should check with your local Illumina field applications scientist for help. They should be able to get you a copy from another local institution.

    That said, why do you think the copies you have are faulty? Are you getting errors when trying to run CASAVA?


    • #3
      Yep, plenty of errors while trying to convert .bcl to FASTQ. First, it says that samplesheet.csv doesn't exist. When I create one and try to run it, it says that it cannot find bclconverter.cpp (although CASAVA has been configured, built and installed properly), and then that .clocs files are missing (where do I get those? I only have .locs, filter, .bcl, .control and .stats files).
      I think that CASAVA is up to date (1.8.2) and the runs are more than 2 years old (dated 2011).
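
      For anyone recreating the sample sheet by hand, here is a minimal Python sketch that writes one. The column order is what I recall from the CASAVA 1.8 user guide, so verify it against your copy. The flowcell ID, sample names, barcodes and project name are made up for illustration.

```python
import csv
import io

# Column order as recalled from the CASAVA 1.8 user guide (an assumption;
# check the guide shipped with your install before relying on it).
COLUMNS = ["FCID", "Lane", "SampleID", "SampleRef", "Index",
           "Description", "Control", "Recipe", "Operator", "SampleProject"]


def sample_sheet_text(rows):
    """Render a list of dicts as sample-sheet CSV text; missing fields stay empty."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical flowcell ID, samples and barcodes, for illustration only.
example_rows = [
    {"FCID": "FC0001", "Lane": 1, "SampleID": "sampleA",
     "Index": "ATCACG", "Control": "N", "SampleProject": "practice"},
    {"FCID": "FC0001", "Lane": 1, "SampleID": "sampleB",
     "Index": "CGATGT", "Control": "N", "SampleProject": "practice"},
]

if __name__ == "__main__":
    print(sample_sheet_text(example_rows), end="")
```

      Redirect the output into the sample sheet file that the converter is told to read. The FCID value has to match the flowcell of the run being converted.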


      • #4
        Is the data folder that you have access to a complete copy as made by the instrument?

        Depending on the RTA version used (make sure that your folder has the RunInfo.xml and config.xml files) the BCL to FASTQ converter is supposed to use the right position files (.clocs or .locs).

        You can also pass the --positions-format .locs option to the command explicitly and see if that works.
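
        A quick way to run the checks above is a short Python sketch like the one below. It only looks for the two XML files named in this reply and reports which position-file formats are present. The function name and the returned keys are my own, and the search is recursive because I am not assuming where each file sits in the tree.

```python
from pathlib import Path

# File names taken from the reply above; their location inside the run
# folder is not assumed, so the whole tree is searched.
REQUIRED_FILES = ("RunInfo.xml", "config.xml")
POSITION_SUFFIXES = (".clocs", ".locs")


def inspect_run_folder(run_dir):
    """Report missing XML files and the position-file formats that are present."""
    run_dir = Path(run_dir)
    missing = [name for name in REQUIRED_FILES
               if not any(run_dir.rglob(name))]
    found = [suffix for suffix in POSITION_SUFFIXES
             if any(run_dir.rglob("*" + suffix))]
    return {"missing": missing, "position_formats": found}


if __name__ == "__main__":
    import sys

    report = inspect_run_folder(sys.argv[1] if len(sys.argv) > 1 else ".")
    print("Missing XML files:", report["missing"] or "none")
    print("Position formats found:", report["position_formats"] or "none")
```

        If the report lists only .locs, that is the case where passing --positions-format .locs is worth trying. A missing RunInfo.xml or config.xml suggests the folder is not a complete copy from the instrument.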


        • #5
          Yeah, I made sure the file tree matches the one shown in the user guide. But since the config file was missing, I had to take one from a different run. When that didn't work, I made one myself. It worked to some extent, but I can't be sure that the absence of the original config file won't break the downstream steps.
          That's why I'd like to have a known-good run, preferably multiplexed so I can practice demultiplexing as well, though that isn't supposed to be difficult. Well, none of it is supposed to be difficult, but somehow I can't figure it out.


          • #6
            If you don't have the full flowcell folder, you are likely to run into issues. The XML files store run-related information that is needed for downstream analysis (as you discovered).

            There are a couple of data sets (not sure if they are complete) included in the CASAVA install; they should be under the /casava-1.8.2/src/CASAVA_v1.8.2/data/share/examples/Validation/ directory. Look into those while you locate a full data set.


