  • jeferson
    Junior Member
    • Jan 2010
    • 8

    #16
    This message appeared:

    You need to install the perl-doc package to use this program.

    I then ran the command:

    sudo aptitude install perl-doc

    That worked, thanks.

    Comment

    • ale
      Junior Member
      • Mar 2011
      • 4

      #17
      How to bring .csfasta and .qual in same order?

      I have the same problem as jeferson (quoted below),
      so how can the reads in .csfasta files be put in the same order as in the .qual files?

      Originally posted by jeferson View Post
      When I run the solid2fastq script with the following command:

      $ bfast-0.6.1c/scripts/solid2fastq n-10000000 reads barcode2/barcode2_F3.csfasta barcode2/barcode2_F3.qual

      This error message came up:

      Outputting, currently on:
      0csfasta_name=[>9_42_916_F3]
      qual_name=[>9_42_20_F3]
      ************************************************************
      In function "fastq_read": Fatal Error[OutOfRange]. Variable/Value: read->name != qual_name.
      Message: Read names did not match.
      ***** Exiting due to errors *****
      ************************************************************

      What could be the cause?
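
A minimal sketch of one way to bring both files into the same order, assuming each record is a `>name` line followed by exactly one data line, that header comments start with `#`, and that names look like `>9_42_916_F3`. The helper names (`read_records`, `sort_key`, `sorted_records`) are made up for illustration and are not part of BFAST. Everything is held in memory, so this is only practical for small files or subsets.

```python
import re

# Assumed name layout: >panel_x_y_tag, e.g. >9_42_916_F3 or >1171_196_1958_F5-P2
NAME_RE = re.compile(r"^>(\d+)_(\d+)_(\d+)_(\S+)$")


def read_records(lines):
    """Yield (name, data) pairs from csfasta- or qual-style lines."""
    name = None
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        if line.startswith(">"):
            name = line
        elif name is not None:
            yield name, line
            name = None


def sort_key(name):
    """Turn a read name into a numeric key so 20 sorts before 916."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected read name: {name!r}")
    panel, x, y, tag = match.groups()
    return int(panel), int(x), int(y), tag


def sorted_records(lines):
    """Return all records sorted by their numeric name key."""
    return sorted(read_records(lines), key=lambda record: sort_key(record[0]))
```

Sorting both the .csfasta and the .qual file with the same key gives matching order only if both files contain exactly the same set of names; a name that fails to parse raises an error instead of being silently reordered.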

      Comment

      • kenietz
        Member
        • Nov 2011
        • 86

        #18
        Hi guys,
        I know it's an old thread, but it's still the appropriate one.

        I get this error while processing SOLID data:

        Outputting, currently on:
        108300000read->name=[>1272^300_899^F3]
        qual_name=[>1272_300_899_F3]
        ************************************************************
        In function "fastq_read": Fatal Error[OutOfRange]. Variable/Value: read->name != qual_name.
        Message: Read names did not match.
        ***** Exiting due to errors *****
        ************************************************************

        I did not do anything to the files before that, though. So I suppose I have to write a script that just replaces the '^' with '_'. That's OK, but do these different symbols mean different things? Or is it just an error? I can't imagine how that could happen, as the reads come directly from the SOLID machine.

        Comment

        • kenietz
          Member
          • Nov 2011
          • 86

          #19
          Hi again,
          I found some more of those weird spelling errors:

          Outputting, currently on:
          83900000read->name=[>1171_196?_1958_F5-P2]
          qual_name=[>1171_1967_1958_F5-P2]

          What is going on at all?

          Comment

          • nilshomer
            Nils Homer
            • Nov 2008
            • 1283

            #20
            Originally posted by kenietz View Post
            Hi again,
            I found some more of those weird spelling errors:

            Outputting, currently on:
            83900000read->name=[>1171_196?_1958_F5-P2]
            qual_name=[>1171_1967_1958_F5-P2]

            What is going on at all?
            Make sure that the CSFASTA/QUAL files have the same number of lines and that the read names are in the same order.

            Nils
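
The check suggested above can be sketched as a lockstep walk over both files that stops at the first differing name. This is only an illustration: `names` and `first_mismatch` are hypothetical helpers, and the only format assumption is that read names are the lines starting with `>`.

```python
from itertools import zip_longest


def names(lines):
    """Yield the read names (lines starting with '>') in file order."""
    for line in lines:
        if line.startswith(">"):
            yield line.strip()


def first_mismatch(csfasta_lines, qual_lines):
    """Return (entry_number, csfasta_name, qual_name) for the first
    differing pair, or None if both files list the same names in the
    same order. A missing name (one file is shorter) shows up as None."""
    pairs = zip_longest(names(csfasta_lines), names(qual_lines))
    for entry, (cs_name, q_name) in enumerate(pairs, start=1):
        if cs_name != q_name:
            return entry, cs_name, q_name
    return None
```

Both arguments can be open file handles, so the files are streamed rather than loaded, for example `first_mismatch(open("reads.csfasta"), open("reads.qual"))` with placeholder file names.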

            Comment

            • westerman
              Rick Westerman
              • Jun 2008
              • 1104

              #21
              @Nils: I think kenietz is doing exactly that, i.e., trying to make sure that the files have the same read names. However, the names are popping up with weird characters in them. This could be a symptom of a deeper problem: data corruption of the files, i.e., bit-flipping due to bad disks, bad memory, bad data transfer, etc.

              One item to check is whether the sequence data itself (and not just the names) has corruption problems, e.g., something besides the expected 0, 1, 2, 3 and T (if that is your initial base). I hope I am wrong about data corruption being the problem, because this would be nasty to fix, but it is something to check. Comparing md5sums of the original data and the working data could also be in order.
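
The sequence check described above could look roughly like this. It is a sketch under assumptions: sequence lines are every non-blank line in the .csfasta that starts with neither `#` nor `>`, the initial base is `T` unless told otherwise, and `.` is accepted as a missing color call by default (switch that off with `allow_dot=False` if your data should not contain it). The function name `bad_sequence_lines` is made up.

```python
import re


def bad_sequence_lines(lines, first_base="T", allow_dot=True):
    """Return (line_number, text) for every sequence line that is not
    the initial base followed only by the expected color characters."""
    body = "[0-3.]" if allow_dot else "[0-3]"
    valid = re.compile(f"^{re.escape(first_base)}{body}+$")
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        if not text or text.startswith("#") or text.startswith(">"):
            continue  # only sequence lines are checked here
        if not valid.match(text):
            bad.append((number, text))
    return bad
```

An empty result does not prove the file is intact, since a bit flip can turn one valid color into another; it only rules out corruption that produces unexpected characters.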

              Comment

              • nilshomer
                Nils Homer
                • Nov 2008
                • 1283

                #22
                Originally posted by westerman View Post
                @Nils: I think kenietz is doing exactly that, i.e., trying to make sure that the files have the same read names. However, the names are popping up with weird characters in them. This could be a symptom of a deeper problem: data corruption of the files, i.e., bit-flipping due to bad disks, bad memory, bad data transfer, etc.

                One item to check is whether the sequence data itself (and not just the names) has corruption problems, e.g., something besides the expected 0, 1, 2, 3 and T (if that is your initial base). I hope I am wrong about data corruption being the problem, because this would be nasty to fix, but it is something to check. Comparing md5sums of the original data and the working data could also be in order.
                As a temporary fix, you could replace offending characters with a "Z" symbol.

                Comment

                • kenietz
                  Member
                  • Nov 2011
                  • 86

                  #23
                  Hi,
                  of course I checked the number of lines etc.
                  I was having the same thoughts as Westerman: bad transfer, bad disk or bad memory. But I have no idea how to check for these problems. Is it also possible that the C variant of 'solid2fastq' has some weird bug? I am not sure.
                  But, for example, just now I got this error:

                  85500000read->name=[>308_1887_1544_F5-P2T21112303032103:132120201210102:2002]
                  qual_name=[>308_1887_1544_F5-P2]
                  ************************************************************
                  In function "fastq_read": Fatal Error[OutOfRange]. Variable/Value: read->name != qual_name.

                  Because I'm not sure what's going on, I ran the following command. It gave no result, which means that the CSFASTA should be correct:

                  grep -m 1 -P -n '>308_1887_1544_F5-P2T' sl0453_20120208_PE_DUKE_NUS_SURESELECT_4gDNA_Kato_lll_F5-P2.csfasta

                  Then I ran solid2fastq again on the same file and got an error at the same read number, but a totally different error. Amazing:

                  85500000read->name=[>308_1887_9551_F5-X2]
                  qual_name=[>308_1887_1551_F5-P2]
                  ************************************************************
                  In function "fastq_read": Fatal Error[OutOfRange]. Variable/Value: read->name != qual_name.

                  So I'm baffled. Is it my PC or solid2fastq that is playing games with me?

                  Any kind of help is appreciated.
                  Thank you in advance
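
One rough way to separate "my PC" from "the program" is to checksum the same unmodified file several times: if the digests differ between runs, the data is changing somewhere between disk and memory, independent of solid2fastq. The sketch below uses only Python's standard `hashlib`; the helper names are hypothetical. A caveat: repeated reads may be served from the operating system's cache, so identical digests do not prove the disk is healthy.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def repeated_digests(path, rounds=3):
    """Checksum the same file several times in a row."""
    return [md5_of_file(path) for _ in range(rounds)]


def is_stable(path, rounds=3):
    """True if every round produced the same digest."""
    return len(set(repeated_digests(path, rounds))) == 1
```

The same `md5_of_file` value can also be compared against a checksum taken on the machine the files were copied from, which would point at the transfer instead.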

                  Comment

                  • kenietz
                    Member
                    • Nov 2011
                    • 86

                    #24
                    Hi again,
                    another problem could be the software on the SOLID machine. There was a power failure recently and they had to re-run the experiment. After that I fetched the resulting csfasta and qual files twice, and each time I had errors in different sets. Then I had them redo the primary analysis and fetched the result a third time, and there were still errors.

                    The person from the support team suggested that solid2fastq could possibly meddle with the input files, but I highly doubt that.

                    My feeling is that the PC on the machine is having trouble of some kind.

                    Update on the case shown in my previous post: it turned out that the qual file has 6 more entries than the csfasta. But that is in the files I fetched after the primary re-analysis. In the files I fetched the second time, the number of entries is the same.

                    Confusing and frustrating.
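
To see which entries are extra when the counts differ, something like the following could be used. It is a sketch with made-up helper names, assuming read names are the lines starting with `>`; it keeps all names in memory, which may be too much for a full run and is better tried on one file pair at a time.

```python
from collections import Counter


def name_counts(lines):
    """Count how often each read name (line starting with '>') occurs."""
    return Counter(line.strip() for line in lines if line.startswith(">"))


def compare_entries(csfasta_lines, qual_lines):
    """Summarize entry totals and the names found in only one file."""
    cs = name_counts(csfasta_lines)
    qv = name_counts(qual_lines)
    return {
        "csfasta_total": sum(cs.values()),
        "qual_total": sum(qv.values()),
        # Counter subtraction keeps only names with a positive surplus
        "only_in_csfasta": sorted((cs - qv).elements()),
        "only_in_qual": sorted((qv - cs).elements()),
    }
```

Because counts are compared and not just sets, a name that is duplicated in one file is reported as a surplus in that file.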

                    Comment

                    • kenietz
                      Member
                      • Nov 2011
                      • 86

                      #25
                      Hi again,
                      sorry for the spam

                      Good read names are like this:
                      >1272_300_1473_F3
                      >1171_196_1958_F5-P2

                      So I made a Perl script that checks every line starting with '>' against one of the following patterns, depending on the type of file I am checking for errors:

                      my $patF3='\d+_\d+_\d+_F3';
                      my $patF5='\d+_\d+_\d+_F5-P2';

                      It's SOLID PE data with F3 and F5-P2 reads. The F3 file passed with no errors, while the F5-P2 file had 37 errors.

                      So I suppose it's some sort of error on the SOLID machine.
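
For comparison, here is a rough Python version of the same name check. It is not the Perl script from the post, which is not shown in full; the patterns are the ones quoted above, with two assumptions added: a leading `>` and anchors at both ends, so that a name with trailing garbage such as `>308_1887_1544_F5-P2T2111...` is also flagged. The function name `malformed_names` is hypothetical.

```python
import re

# Patterns from the post, anchored; re.ASCII limits \d to 0-9
NAME_PATTERNS = {
    "F3": re.compile(r"^>\d+_\d+_\d+_F3$", re.ASCII),
    "F5-P2": re.compile(r"^>\d+_\d+_\d+_F5-P2$", re.ASCII),
}


def malformed_names(lines, tag):
    """Return (line_number, name) for each '>' line that does not
    match the expected pattern for the given read tag."""
    pattern = NAME_PATTERNS[tag]
    bad = []
    for number, line in enumerate(lines, start=1):
        if line.startswith(">"):
            name = line.rstrip("\r\n")
            if not pattern.match(name):
                bad.append((number, name))
    return bad
```

A header whose leading `>` was itself corrupted is not seen by this check, because only lines starting with `>` are inspected.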
                      Last edited by kenietz; 03-01-2012, 12:45 AM.

                      Comment
