  • Help with using CASAVA to extract sequences

    We have base call files (.bcl) and would like to extract reads from them, but we are running into trouble:
    Warning messages:
    Message 1:
    [2011-11-08 16:37:30] [configureBclToFastq.pl] WARNING: 'LocationFileType' element not found in /media/psf/Lane1Testing/Data/Intensities/BaseCalls/../RTAConfiguration.xml

    Message 2:
    [2011-11-08 16:37:30] [configureBclToFastq.pl] WARNING: Couldn't find run info in /media/psf/Lane1Testing/Data/Intensities/BaseCalls/../../../RunInfo.xml
    [2011-11-08 16:37:30] [configureBclToFastq.pl] WARNING: Couldn't find RunInfo.xml for /media/psf/Lane1Testing/Data/Intensities/BaseCalls

    Message 3

    [2011-11-08 16:37:30] [configureBclToFastq.pl] ERROR: barcode CGATGT for lane 1 has length 6: expected barcode lenth (including delimiters) is 1

    For message 1, we have RTAConfiguration.xml, but we don't know where to put it.
    For message 2, we cannot find RunInfo.xml. Do we need to go to the machine to look for it?
    For message 3, if you have any experience with barcode masks to share, we would really appreciate it...

    Thanks!

    P.S. CASAVA is so painful to install and get working....

  • #2
    2) Yes. RunInfo.xml should be at the very top level of the run directory.

    3) I suspect that once you get the RTAConfiguration.xml and RunInfo.xml files into their proper places, the barcoding will work.
    Last edited by westerman; 11-10-2011, 11:02 AM. Reason: Took away point #1. It was not helpful and, worse, wrong! :-(


    • #3
      The "'LocationFileType' element not found" warning is one I get frequently with CASAVA 1.8.x. It does not seem to be a problem.
      If RunInfo.xml is not found, it is hard to guess which demultiplexing parameters to use. That is why you get "Message 3".

      For message 1: "RTAConfiguration.xml" should reside in "Data/Intensities/", but it may lack the requested entry 'LocationFileType'.
      For message 2: yes, but it is better to get the whole run folder and then demultiplex.
      For message 3: How should I know what type of run you are trying to work on? The mask depends on it; the help text for the option reads:

      Code:
          --use-bases-mask *mask*[[*,mask*]...]
                  Conversion mask characters:
      
                    - Y or y: use
                    - N or n: discard
                    - I or i: use for indexing
      
                  If not given, the mask will be guessed from the RunInfo.xml file
                  in the run folder.
      
                  For instance, in a 2x76 indexed paired end run, the mask
                  *Y76,I6n,y75n* means: "use all 76 bases from the first end,
                  discard the last base of the indexing read, and use only the
                  first 75 bases of the second end".
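To make the mask notation above concrete, here is a minimal Python sketch that expands a mask into per-read cycle counts. It is not part of CASAVA; the function names are made up for illustration, and it handles only the letter-plus-count form shown in the help text. For the example mask, the index read contributes six "I" cycles, which is the length a 6-base barcode such as CGATGT would need.

```python
import re

# One mask token: a letter (Y/N/I, either case) followed by an optional repeat count.
_TOKEN = re.compile(r"([YyNnIi])(\d*)")


def parse_read_mask(read_mask):
    """Expand one per-read mask such as 'I6n' into one letter per cycle ('IIIIIIN')."""
    expanded = []
    pos = 0
    while pos < len(read_mask):
        match = _TOKEN.match(read_mask, pos)
        if match is None:
            raise ValueError(f"unexpected character in mask: {read_mask[pos:]!r}")
        letter, count = match.group(1).upper(), match.group(2)
        # A letter without a count stands for a single cycle.
        expanded.append(letter * (int(count) if count else 1))
        pos = match.end()
    return "".join(expanded)


def summarize_mask(mask):
    """Return one (used, discarded, index) tuple of cycle counts per read."""
    summary = []
    for read_mask in mask.split(","):
        cycles = parse_read_mask(read_mask)
        summary.append((cycles.count("Y"), cycles.count("N"), cycles.count("I")))
    return summary


if __name__ == "__main__":
    for number, (used, discarded, index) in enumerate(summarize_mask("Y76,I6n,y75n"), 1):
        print(f"read {number}: use={used} discard={discarded} index={index}")
```

The tuples line up with the wording of the help text: 76 bases used from the first end, 6 index bases with the last cycle of the indexing read discarded, and 75 bases used from the second end with its last cycle discarded.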
      "CASAVA is so painful to install and get working.."
      Installing CASAVA and getting it running is by far the smallest problem with the Illumina data production and analysis pipeline :-)


      • #4
        As pointed out before you do not appear to be using the entire folder structure to do the de-multiplexing.

        Perhaps you did not own all the samples on this flowcell so your facility gave you only a part of the total data (I have never tried to analyze data this way ... has anyone else done this?)
        ---------------------------------
        Message 1, about 'LocationFileType', is a leftover from some code that had to do with the GAIIx (as I remember it).

        Messages 2 and 3 should go away once you do the de-multiplexing using the entire raw flowcell data folder.

        ---------------------------------
        I wonder if this is another instance of a core facility expecting users to do work the facility should have done for them.

        Can you post the command line you are using? That may help with the diagnosis of this problem.

        Last edited by GenoMax; 11-09-2011, 03:55 PM.


        • #5
          Thanks for the help, but it is still not working...

          At first we tried to extract a small piece of the data to estimate the running time, which gave us the messages shown before. Now I have put everything back (going back to the original folder we got from the instrument operator), and the screen output is below. Any help? Thanks!


          [2011-11-09 18:06:02] [configureBclToFastq.pl] INFO: Basecalling software: RTA
          [2011-11-09 18:06:02] [configureBclToFastq.pl] INFO: version: 1.12 (build 4)
          [2011-11-09 18:06:02] [configureBclToFastq.pl] WARNING: 'LocationFileType' element not found in /media/psf/Desktop/X111018_SN968_BC09PAACXX/Data/Intensities/BaseCalls/../RTAConfiguration.xml
          [2011-11-09 18:06:02] [configureBclToFastq.pl] WARNING: Couldn't find run info in /media/psf/Desktop/X111018_SN968_BC09PAACXX/Data/Intensities/BaseCalls/../../../RunInfo.xml
          [2011-11-09 18:06:02] [configureBclToFastq.pl] WARNING: Couldn't find RunInfo.xml for /media/psf/Desktop/X111018_SN968_BC09PAACXX/Data/Intensities/BaseCalls
          [2011-11-09 18:06:02] [configureBclToFastq.pl] INFO: Original use-bases mask: undefined
          [2011-11-09 18:06:02] [configureBclToFastq.pl] INFO: Guessed use-bases mask: undefined
          [2011-11-09 18:06:02] [configureBclToFastq.pl] ERROR: undefined mask
          [2011-11-09 18:06:02] [configureBclToFastq.pl] BACKTRACE: at /home/PRINCESS/CASAVA-1.8.2-build/illumina/software/CASAVA-1.8.2/lib/CASAVA-1.8.2/perl/Casava/Demultiplex.pm line 285
          Casava::Demultiplex::mask('Casava::Demultiplex=HASH(0x254e2a0)', undef) called at /home/PRINCESS/CASAVA-1.8.2-build/illumina/software/CASAVA-1.8.2/bin/configureBclToFastq.pl line 379
          Died at /home/PRINCESS/CASAVA-1.8.2-build/illumina/software/CASAVA-1.8.2/lib/CASAVA-1.8.2/perl/Casava/Common/Log.pm line 310.


          • #6
            What is LocationFileType? Also, I searched all the folders but could not find RunInfo.xml. Did our operator forget to transfer that file to us?


            • #7
              See my post #4 above.

              1. Did you get the entire folder from your provider?
              2. Provide the command line you are using for this analysis (and the version of CASAVA).



              • #8
                Thanks. Here is the command line, with all the messages from the computer:


                [PRINCESS@localhost ~]$ /home/PRINCESS/CASAVA-1.8.2-build/illumina/software/CASAVA-1.8.2/bin/configureBclToFastq.pl --input-dir /home/PRINCESS/Desktop/Parallels_Shared_Folders/Desktop/X111018_SN968_BC09PAACXX/Data/Intensities/BaseCalls --output-dir /home/PRINCESS/Desktop/Lane1Testing_OUT --sample-sheet /home/PRINCESS/Desktop/Parallels_Shared_Folders/Desktop/X111018_SN968_BC09PAACXX/SampleSheet.csv --force
                [2011-11-09 18:06:02] [configureBclToFastq.pl] INFO: Basecalling software: RTA
                [2011-11-09 18:06:02] [configureBclToFastq.pl] INFO: version: 1.12 (build 4)
                [2011-11-09 18:06:02] [configureBclToFastq.pl] WARNING: 'LocationFileType' element not found in /media/psf/Desktop/X111018_SN968_BC09PAACXX/Data/Intensities/BaseCalls/../RTAConfiguration.xml
                [2011-11-09 18:06:02] [configureBclToFastq.pl] WARNING: Couldn't find run info in /media/psf/Desktop/X111018_SN968_BC09PAACXX/Data/Intensities/BaseCalls/../../../RunInfo.xml
                [2011-11-09 18:06:02] [configureBclToFastq.pl] WARNING: Couldn't find RunInfo.xml for /media/psf/Desktop/X111018_SN968_BC09PAACXX/Data/Intensities/BaseCalls
                [2011-11-09 18:06:02] [configureBclToFastq.pl] INFO: Original use-bases mask: undefined
                [2011-11-09 18:06:02] [configureBclToFastq.pl] INFO: Guessed use-bases mask: undefined
                [2011-11-09 18:06:02] [configureBclToFastq.pl] ERROR: undefined mask
                [2011-11-09 18:06:02] [configureBclToFastq.pl] BACKTRACE: at /home/PRINCESS/CASAVA-1.8.2-build/illumina/software/CASAVA-1.8.2/lib/CASAVA-1.8.2/perl/Casava/Demultiplex.pm line 285
                Casava::Demultiplex::mask('Casava::Demultiplex=HASH(0x254e2a0)', undef) called at /home/PRINCESS/CASAVA-1.8.2-build/illumina/software/CASAVA-1.8.2/bin/configureBclToFastq.pl line 379
                Died at /home/PRINCESS/CASAVA-1.8.2-build/illumina/software/CASAVA-1.8.2/lib/CASAVA-1.8.2/perl/Casava/Common/Log.pm line 310.


                • #9
                  The operator said everything is on the hard drive, but I will check with him tomorrow... Thanks for your help!


                  • #10
                    Command line looks fine. You are missing critical xml files with run parameters. Let us know what you find out.
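When a log is as long as the one pasted above, it can help to sort the lines by severity so that the single ERROR and the warnings that precede it stand out. The sketch below is only an illustration: the line shape is inferred from the output pasted in this thread, and the function name is made up.

```python
import re
from collections import defaultdict

# Assumed line shape, inferred from the output pasted in this thread:
#   [YYYY-MM-DD HH:MM:SS] [script] LEVEL: message
_LINE = re.compile(
    r"^\[(?P<time>[^\]]+)\]\s+\[(?P<script>[^\]]+)\]\s+(?P<level>[A-Z]+):\s*(?P<message>.*)$"
)

SAMPLE = """\
[2011-11-09 18:06:02] [configureBclToFastq.pl] INFO: Basecalling software: RTA
[2011-11-09 18:06:02] [configureBclToFastq.pl] WARNING: Couldn't find RunInfo.xml for /media/psf/Desktop/X111018_SN968_BC09PAACXX/Data/Intensities/BaseCalls
[2011-11-09 18:06:02] [configureBclToFastq.pl] INFO: Guessed use-bases mask: undefined
[2011-11-09 18:06:02] [configureBclToFastq.pl] ERROR: undefined mask
"""


def group_by_level(log_text):
    """Map each level (INFO, WARNING, ERROR, ...) to its messages, in log order."""
    by_level = defaultdict(list)
    for line in log_text.splitlines():
        match = _LINE.match(line.strip())
        if match:  # lines that do not fit the assumed shape are skipped
            by_level[match.group("level")].append(match.group("message"))
    return dict(by_level)


if __name__ == "__main__":
    grouped = group_by_level(SAMPLE)
    for level in ("ERROR", "WARNING"):
        for message in grouped.get(level, []):
            print(f"{level}: {message}")
```

Read this way, the "undefined mask" error sits right behind the RunInfo.xml warning, which fits the diagnosis that the run parameter files are missing.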



                    • #11
                      I need to talk with the operator tomorrow to see what happened to it... Thanks!


                      • #12
                        Your top level dir should look like this (and some more txt files probably):

                        Code:
                        Config/
                        Data/
                        First_Base_Report.htm
                        InterOp/
                        Logs/
                        PeriodicSaveRates/
                        RTAComplete.txt
                        Recipe/
                        RunInfo.xml
                        Thumbnail_Images/
                        runParameters.xml
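A quick way to compare a copied run folder against the listing above is a small check like the following Python sketch. The list of entries is an assumption pieced together from this thread (the top-level XML files, plus the Data/Intensities locations named in the log messages); it is not an official or complete manifest, and the function name is made up.

```python
from pathlib import Path

# Entries mentioned in this thread; illustrative only, not a complete manifest.
EXPECTED = (
    "RunInfo.xml",
    "runParameters.xml",
    "Data/Intensities/RTAConfiguration.xml",
    "Data/Intensities/BaseCalls",
)


def missing_entries(run_dir, expected=EXPECTED):
    """Return the expected entries (files or directories) that are absent under run_dir."""
    run_dir = Path(run_dir)
    return [rel for rel in expected if not (run_dir / rel).exists()]
```

Calling `missing_entries("/path/to/run_folder")` (a placeholder path) returns an empty list when everything on the list is present, and otherwise names what still has to be copied from the instrument.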


                        • #13
                          Thanks, everyone!

                          We got all the files listed in the reply above and got through the first step of the BCL converter. However, after we typed:
                          nohup make -j 8
                          the computer responded with:
                          nohup: ignoring input and appending output to 'nohup.out'

                          The computer is running, so this message is good, right?
                          We noticed that hard drive usage is increasing, even if slowly... Maybe this will be done by tomorrow, but please let us know if this message means something bad. Thanks!

                          We still get the warning about LocationFileType; however, as GenoMax mentioned, that may have to do with the GAIIx, so we are ignoring it. We hope the computer is doing well now...


                          • #14
                            The nohup message is just a message, not a warning or an error. It is fine this way.

                            Basecalling, BCL conversion, etc. produce (as implemented by Illumina) a very high I/O load; this is sometimes the speed-limiting factor. When you use a number of CPUs (e.g. 48), you will see that the CPU load is far below 48 because of the heavy I/O.

                            We always get the LocationFileType warning, so I ignore it.
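To show what the nohup message is describing, here is a rough Python sketch that starts a command with its input ignored and its output appended to a log file. It is not a replacement for nohup: nohup makes the command ignore the hangup signal, whereas this sketch starts the command in a new session (POSIX only). The function name and the default log name are choices made for this example.

```python
import subprocess


def run_detached(command, log_path="nohup.out", cwd=None):
    """Start `command` in its own session, appending stdout and stderr to `log_path`."""
    log = open(log_path, "ab")
    try:
        return subprocess.Popen(
            command,
            cwd=cwd,
            stdin=subprocess.DEVNULL,   # corresponds to "ignoring input"
            stdout=log,                 # corresponds to "appending output to 'nohup.out'"
            stderr=subprocess.STDOUT,   # errors go to the same file
            start_new_session=True,     # POSIX only: detach from the terminal's session
        )
    finally:
        log.close()  # the child process keeps its own copy of the file descriptor
```

For example, `run_detached(["make", "-j", "8"], cwd="/path/to/output-dir")` (a placeholder path) would start the same make command used above, and progress could then be followed by reading the log file.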


                            • #15
                              Depending on what kind of a run this is (2 x 100 PE or bigger) it could easily take a day (or longer) to complete this process, especially if you are not doing this on a cluster with fast storage.
