  • Help using CASAVA to extract sequences

    We have base call files (.bcl) and would like to extract reads from them, but we are running into trouble:
    Warning messages:
    Message 1:
    [2011-11-08 16:37:30] [configureBclToFastq.pl] WARNING: 'LocationFileType' element not found in /media/psf/Lane1Testing/Data/Intensities/BaseCalls/../RTAConfiguration.xml

    Message 2
    [2011-11-08 16:37:30] [configureBclToFastq.pl] WARNING: Couldn't find run info in /media/psf/Lane1Testing/Data/Intensities/BaseCalls/../../../RunInfo.xml
    [2011-11-08 16:37:30] [configureBclToFastq.pl] WARNING: Couldn't find RunInfo.xml for /media/psf/Lane1Testing/Data/Intensities/BaseCalls

    Message 3

    [2011-11-08 16:37:30] [configureBclToFastq.pl] ERROR: barcode CGATGT for lane 1 has length 6: expected barcode lenth (including delimiters) is 1

    For message 1, we have RTAConfiguration.xml, but we don't know where to put it.
    For message 2, we cannot find RunInfo.xml. Do we need to go to the machine to look for it?
    For message 3, if you have experience with barcode masks to share, we would really appreciate it...

    Thanks!

    P.S. CASAVA is so painful to install and get working....

  • #2
    2) Yes. The RunInfo should be in the very top level of the run directory.

    3) I suspect that if you get the RTAConfiguration and RunInfo files into their proper places, then the barcoding will work.
    Last edited by westerman; 11-10-2011, 11:02 AM. Reason: Took away point #1. It was not helpful and, worse, wrong! :-(
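
The warnings in the first post already show where the script looks: the paths are written relative to the BaseCalls directory. As a small sketch (assuming no symlinks are involved, and using the `Lane1Testing` path from the original messages), the `..` segments can be collapsed to see the locations that are actually being checked:

```python
import posixpath

# Input directory taken from the warning messages in the first post.
base_calls = "/media/psf/Lane1Testing/Data/Intensities/BaseCalls"

def resolve(relative):
    """Collapse the '..' segments of a path given relative to BaseCalls."""
    return posixpath.normpath(posixpath.join(base_calls, relative))

run_info = resolve("../../../RunInfo.xml")
rta_config = resolve("../RTAConfiguration.xml")

print(run_info)    # /media/psf/Lane1Testing/RunInfo.xml
print(rta_config)  # /media/psf/Lane1Testing/Data/Intensities/RTAConfiguration.xml
```

So RunInfo.xml is expected at the top level of the run folder, three levels above BaseCalls, and RTAConfiguration.xml one level above BaseCalls.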



    • #3
      The "'LocationFileType' element not found" warning is one I get frequently with CASAVA 1.8.x. It does not seem to be a problem.
      If RunInfo.xml is not found, it is hard to guess what demultiplexing parameters to use. That's why you get "Message 3".

      For message 1: "RTAConfiguration.xml" should reside in "Data/Intensities/", but may lack the requested entry 'LocationFileType'.
      For message 2: yes, but better get the whole run folder! Then demultiplex.
      For message 3: How should I know what type of run you try to work on?

      Code:
          --use-bases-mask *mask*[[*,mask*]...]
                  Conversion mask characters:
      
                    - Y or y: use
                    - N or n: discard
                    - I or i: use for indexing
      
                  If not given, the mask will be guessed from the RunInfo.xml file
                  in the run folder.
      
                  For instance, in a 2x76 indexed paired end run, the mask
                  *Y76,I6n,y75n* means: "use all 76 bases from the first end,
                  discard the last base of the indexing read, and use only the
                  first 75 bases of the second end".

      "CASAVA is so painful to install and get working..."
      Installing it and getting things running is by far the smallest problem with the Illumina data production and analysis pipeline :-)
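
To make the mask notation from the help text concrete, here is a minimal sketch that expands a mask such as `Y76,I6n,y75n` into per-cycle actions. The function name `expand_mask` is made up for illustration, and the parser only understands the letter-plus-optional-count form shown in the help text above; any other syntax the real tool may accept is not handled.

```python
import re

# Meaning of each mask character, as listed in the help text.
ACTIONS = {"y": "use", "n": "discard", "i": "index"}

def expand_mask(mask):
    """Return one list of per-cycle actions for each comma-separated read."""
    reads = []
    for part in mask.split(","):
        cycles = []
        # Each token is a letter followed by an optional repeat count.
        for letter, count in re.findall(r"([YyNnIi])(\d*)", part):
            repeat = int(count) if count else 1
            cycles.extend([ACTIONS[letter.lower()]] * repeat)
        reads.append(cycles)
    return reads

reads = expand_mask("Y76,I6n,y75n")
for number, cycles in enumerate(reads, start=1):
    summary = {action: cycles.count(action) for action in ("use", "index", "discard")}
    print(number, len(cycles), summary)
```

For the example mask this gives 76 used cycles for the first end, 6 index cycles plus 1 discarded cycle for the indexing read, and 75 used plus 1 discarded cycle for the second end, which matches the description in the help text.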



      • #4
        As pointed out before you do not appear to be using the entire folder structure to do the de-multiplexing.

        Perhaps you did not own all the samples on this flowcell so your facility gave you only a part of the total data (I have never tried to analyze data this way ... has anyone else done this?)
        ---------------------------------
        Message 1 about LocationFileType is a leftover from some code that had to do with the GAIIx (as I remember it).

        Messages 2 and 3 should go away once you do the de-multiplexing using the entire raw flowcell data folder.

        ---------------------------------
        I wonder if this is another instance where a core facility is expecting users to do the work they should have done for them.

        Can you post the command line you are using? That may help with the diagnosis of this problem.

        Originally posted by lewewoo:

        For message 1, we have RTAConfiguration.xml, but we don't know where we should put;
        For message 2, we can not find RunInfo.xml, need we go to machine to check it?
        For message 3, if you have experience to share about barcode mask we really appreciate it...

        thanks!

        P.S. CASAVA is so much painful to install and get work....
        Last edited by GenoMax; 11-09-2011, 03:55 PM.



        • #5
          Thanks for the help, but it is still not working...

          At first we tried to extract a small piece of the data to estimate the running time, which gave us the messages shown before. Now I have put everything back (returning to the original folder we got from the instrument operator), and the screen output follows. Any help? Thanks!


          [2011-11-09 18:06:02] [configureBclToFastq.pl] INFO: Basecalling software: RTA
          [2011-11-09 18:06:02] [configureBclToFastq.pl] INFO: version: 1.12 (build 4)
          [2011-11-09 18:06:02] [configureBclToFastq.pl] WARNING: 'LocationFileType' element not found in /media/psf/Desktop/X111018_SN968_BC09PAACXX/Data/Intensities/BaseCalls/../RTAConfiguration.xml
          [2011-11-09 18:06:02] [configureBclToFastq.pl] WARNING: Couldn't find run info in /media/psf/Desktop/X111018_SN968_BC09PAACXX/Data/Intensities/BaseCalls/../../../RunInfo.xml
          [2011-11-09 18:06:02] [configureBclToFastq.pl] WARNING: Couldn't find RunInfo.xml for /media/psf/Desktop/X111018_SN968_BC09PAACXX/Data/Intensities/BaseCalls
          [2011-11-09 18:06:02] [configureBclToFastq.pl] INFO: Original use-bases mask: undefined
          [2011-11-09 18:06:02] [configureBclToFastq.pl] INFO: Guessed use-bases mask: undefined
          [2011-11-09 18:06:02] [configureBclToFastq.pl] ERROR: undefined mask
          [2011-11-09 18:06:02] [configureBclToFastq.pl] BACKTRACE: at /home/PRINCESS/CASAVA-1.8.2-build/illumina/software/CASAVA-1.8.2/lib/CASAVA-1.8.2/perl/Casava/Demultiplex.pm line 285
          Casava::Demultiplex::mask('Casava::Demultiplex=HASH(0x254e2a0)', undef) called at /home/PRINCESS/CASAVA-1.8.2-build/illumina/software/CASAVA-1.8.2/bin/configureBclToFastq.pl line 379
          Died at /home/PRINCESS/CASAVA-1.8.2-build/illumina/software/CASAVA-1.8.2/lib/CASAVA-1.8.2/perl/Casava/Common/Log.pm line 310.
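
The "undefined mask" error above follows from the missing RunInfo.xml: without it there is no read structure to derive a mask from. As a rough sketch only, the snippet below reads the read layout from a RunInfo.xml-style document and builds a naive mask that uses every cycle. The sample XML is hypothetical, the element and attribute names (`Read`, `Number`, `NumCycles`, `IsIndexedRead`) are assumptions about the file layout, and the naive mask is not claimed to be what CASAVA itself would guess.

```python
import xml.etree.ElementTree as ET

# Hypothetical RunInfo.xml content for illustration; names are assumed.
SAMPLE = """<RunInfo>
  <Run Id="example_run">
    <Reads>
      <Read Number="1" NumCycles="101" IsIndexedRead="N" />
      <Read Number="2" NumCycles="7" IsIndexedRead="Y" />
      <Read Number="3" NumCycles="101" IsIndexedRead="N" />
    </Reads>
  </Run>
</RunInfo>"""

def naive_mask(xml_text):
    """Build a mask that uses all cycles: I<n> for index reads, Y<n> otherwise."""
    root = ET.fromstring(xml_text)
    reads = sorted(root.iter("Read"), key=lambda read: int(read.get("Number")))
    parts = []
    for read in reads:
        letter = "I" if read.get("IsIndexedRead") == "Y" else "Y"
        parts.append(f"{letter}{read.get('NumCycles')}")
    return ",".join(parts)

print(naive_mask(SAMPLE))  # Y101,I7,Y101
```

If the file is missing, the read structure has to come from the operator, or the mask has to be given explicitly with `--use-bases-mask`.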



          • #6
            What is LocationFileType? Also, I searched all the folders but failed to find RunInfo.xml. Did our operator forget to transfer that file to us?



            • #7
              See my post #4 above.

              1. Did you get the entire folder from your provider?
              2. Provide the command line you are using for this analysis (and the version of CASAVA).

              Originally posted by lewewoo:
              what is LocationFileType? and I searched all the folders but failed to find RunInfo.xml, our operator forgot to transfer that file for us?



              • #8
                Thanks. The following is the command line with all the messages from the computer:


                [PRINCESS@localhost ~]$ /home/PRINCESS/CASAVA-1.8.2-build/illumina/software/CASAVA-1.8.2/bin/configureBclToFastq.pl --input-dir /home/PRINCESS/Desktop/Parallels_Shared_Folders/Desktop/X111018_SN968_BC09PAACXX/Data/Intensities/BaseCalls --output-dir /home/PRINCESS/Desktop/Lane1Testing_OUT --sample-sheet /home/PRINCESS/Desktop/Parallels_Shared_Folders/Desktop/X111018_SN968_BC09PAACXX/SampleSheet.csv --force
                [2011-11-09 18:06:02] [configureBclToFastq.pl] INFO: Basecalling software: RTA
                [2011-11-09 18:06:02] [configureBclToFastq.pl] INFO: version: 1.12 (build 4)
                [2011-11-09 18:06:02] [configureBclToFastq.pl] WARNING: 'LocationFileType' element not found in /media/psf/Desktop/X111018_SN968_BC09PAACXX/Data/Intensities/BaseCalls/../RTAConfiguration.xml
                [2011-11-09 18:06:02] [configureBclToFastq.pl] WARNING: Couldn't find run info in /media/psf/Desktop/X111018_SN968_BC09PAACXX/Data/Intensities/BaseCalls/../../../RunInfo.xml
                [2011-11-09 18:06:02] [configureBclToFastq.pl] WARNING: Couldn't find RunInfo.xml for /media/psf/Desktop/X111018_SN968_BC09PAACXX/Data/Intensities/BaseCalls
                [2011-11-09 18:06:02] [configureBclToFastq.pl] INFO: Original use-bases mask: undefined
                [2011-11-09 18:06:02] [configureBclToFastq.pl] INFO: Guessed use-bases mask: undefined
                [2011-11-09 18:06:02] [configureBclToFastq.pl] ERROR: undefined mask
                [2011-11-09 18:06:02] [configureBclToFastq.pl] BACKTRACE: at /home/PRINCESS/CASAVA-1.8.2-build/illumina/software/CASAVA-1.8.2/lib/CASAVA-1.8.2/perl/Casava/Demultiplex.pm line 285
                Casava::Demultiplex::mask('Casava::Demultiplex=HASH(0x254e2a0)', undef) called at /home/PRINCESS/CASAVA-1.8.2-build/illumina/software/CASAVA-1.8.2/bin/configureBclToFastq.pl line 379
                Died at /home/PRINCESS/CASAVA-1.8.2-build/illumina/software/CASAVA-1.8.2/lib/CASAVA-1.8.2/perl/Casava/Common/Log.pm line 310.
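
When the output gets long, it can help to pull out only the warnings and errors. A minimal sketch, assuming every log line follows the `[timestamp] [script] LEVEL: message` pattern seen in the output above (the helper name `by_level` is made up for illustration):

```python
import re

# Pattern inferred from the log lines in this thread.
LOG_LINE = re.compile(
    r"^\[(?P<time>[^\]]+)\] \[(?P<script>[^\]]+)\] (?P<level>[A-Z]+): (?P<message>.*)$"
)

def by_level(lines, levels=("WARNING", "ERROR")):
    """Return (level, message) pairs for lines whose level is in `levels`."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") in levels:
            hits.append((match.group("level"), match.group("message")))
    return hits

log = [
    "[2011-11-09 18:06:02] [configureBclToFastq.pl] INFO: Basecalling software: RTA",
    "[2011-11-09 18:06:02] [configureBclToFastq.pl] INFO: Guessed use-bases mask: undefined",
    "[2011-11-09 18:06:02] [configureBclToFastq.pl] ERROR: undefined mask",
]

for level, message in by_level(log):
    print(level, message)
```

Lines that do not match the pattern, such as the continuation lines of the backtrace, are simply skipped.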



                • #9
                  The operator said everything is on the hard drive, but I may check with him tomorrow... Thanks for your help!



                  • #10
                    Command line looks fine. You are missing critical xml files with run parameters. Let us know what you find out.

                    Originally posted by lewewoo:
                    the operator said everything is in the hard drive but may check him tomorrow... thanks for your help!



                    • #11
                      I need to talk with the operator tomorrow to see what happened... Thanks!



                      • #12
                        Originally posted by lewewoo:
                        the operator said everything is in the hard drive but may check him tomorrow... thanks for your help!
                        Your top level dir should look like this (and some more txt files probably):

                        Code:
                        Config/
                        Data/
                        First_Base_Report.htm
                        InterOp/
                        Logs/
                        PeriodicSaveRates/
                        RTAComplete.txt
                        Recipe/
                        RunInfo.xml
                        Thumbnail_Images/
                        runParameters.xml
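
A quick way to check a transferred run folder against the points raised in this thread is sketched below. The list of expected entries is an assumption assembled from the replies here (RunInfo.xml and runParameters.xml at the top level, RTAConfiguration.xml in Data/Intensities/, and the BaseCalls directory); it is not an official or complete manifest.

```python
from pathlib import Path

# Entries mentioned in this thread; not an exhaustive list.
EXPECTED = [
    "RunInfo.xml",
    "runParameters.xml",
    "Data/Intensities/RTAConfiguration.xml",
    "Data/Intensities/BaseCalls",
]

def missing_entries(run_dir):
    """Return the expected entries that do not exist under run_dir."""
    root = Path(run_dir)
    return [relative for relative in EXPECTED if not (root / relative).exists()]

# Check the current directory; replace "." with the path to the run folder.
for relative in missing_entries("."):
    print("missing:", relative)
```

An empty result only means these particular entries are present; it does not guarantee the run folder is complete.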



                        • #13
                          Thanks, everyone!

                          We got all the files listed in the reply above and got through the first step of the BCL converter. However, after we typed:
                          nohup make -j 8
                          the computer responded with:
                          nohup: ignoring input and appending output to 'nohup.out'

                          The computer is now running. This message is fine, right?
                          We noticed increased hard drive usage, even though it is slow... Maybe it will be done tomorrow, but please let us know if this message means something bad. Thanks!

                          We still get the warning about LocationFileType; however, since GenoMax mentioned that it may have to do with the GAIIx, we are ignoring it. We hope the computer is doing well now...



                          • #14
                            The nohup message is just a message, not a warning or an error. It's fine this way.

                            Basecalling, BCL conversion, etc. produce (as implemented by Illumina) a very high I/O load; this is sometimes the speed-limiting factor. When you use many CPUs (e.g. 48), you will see that the CPU load stays far below 48 because of the heavy I/O.

                            We always get the LocationFileType warning, so I ignore it.



                            • #15
                              Depending on what kind of a run this is (2 x 100 PE or bigger) it could easily take a day (or longer) to complete this process, especially if you are not doing this on a cluster with fast storage.
