
  • Perl Loops

    Hello

    Anyone have any experience on perl while loops (I'm a beginner)?
    I have a script that needs to use the declared variables outside the loop, but only one loop is working even though I have declared the variables outside of the loop. The code is:

    my $sample;
    my $fastq1;

    open(IN, 'ls /*_R1_*.gz |');
    while(my $sample = <IN>){
    chomp $sample;
    print "sample = $sample\n";
    my $fastq1="${sample}";

    my $sample2;
    my $fastq2;

    open(IN, 'ls /*_R2_*.gz |');
    while(my $sample2 = <IN>){
    chomp $sample2;
    print "sample2 = $sample2\n";
    my $fastq2="${sample2}";

    }
    }


    Sample2 works but sample1 does not, only the first sample is output and then the loop goes onto sample2, the output is:

    sample =/sample1_R1_001.fastq.gz
    sample2 =/sample1_R2_001.fastq.gz
    sample2 =/sample2_R2_001.fastq.gz
    sample2 =/sample3_R2_001.fastq.gz

    etc..



    Can anyone figure this out?

    Thanks

  • #2
    Can you specify exactly what you want to do here? I know it should be obvious but just to have an idea of the specific aim. There may be a command line or other utility, so no need to reinvent the wheel. And list the contents, or example contents, of the dir you are working in. Presumably it holds gzipped fastq files only?

    As an aside, have a look at the special variable $_

    Use it instead of initialising your $sample, $sample2

    Code:
    open(IN, "file.foo");
    while(<IN>){
    chomp $_;
    print "sample = $_\n";
    }

    • #3
      Misplaced/missing "}" for first while loop.

      These are two separate while loops correct? Start using a programmer's editor. The choice would be dependent on the OS where you develop code.

      • #4
        Yes the dir holds multiple fastq files, read 1 and read 2. I want read 1 and read 2 (sample 1 and fastq 1, sample 2 and fastq 2) variables declared separately for use outside the loop for alignment

        • #5
          yes two separate loops

          • #6
            OK this now works:

            my $sample;
            my $fastq1;

            open(IN, 'ls /*_R1_*.gz |');
            while(my $sample = <IN>){
            chomp $sample;
            print "sample = $sample\n";
            my $fastq1="${sample}";
            }

            my $sample2;
            my $fastq2;

            open(IN, 'ls /*_R2_*.gz |');
            while(my $sample2 = <IN>){
            chomp $sample2;
            print "sample2 = $sample2\n";
            my $fastq2="${sample2}";
            }



            But is there a quick way of checking that both loops will work when variables are declared outside of them?

            • #7
              You declare the variable outside the while loop with 'my $sample;' but then declare it again with 'while(my $sample = <IN>)'. The second 'my' creates a new variable that exists only inside the loop and hides the outer one, so the outer $sample is never assigned. I don't think you are using 'use strict' and 'use warnings'; they catch many mistakes of this kind ('use warnings' reports a second 'my' for the same name in the same scope, though not one in an inner block like this), so they are a great way to troubleshoot and also learn good practice.

              Anyway, a better format for your loop might be:

              Code:
              my ($sample,$fastq1);
              
              open(IN, 'ls /*_R1_*.gz |');
              while(<IN>){
              chomp $_;
              $sample=$_;
              print "sample = $sample\n";
              $fastq1="${sample}";
              }
              Once the variable is declared (created but not yet storing any value) you can then assign to it with '$sample=$_;' or such.
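To make the scoping point concrete, here is a minimal sketch of the difference between re-declaring with 'my' inside the loop and assigning to the variable declared outside it. The file names and the variable names $outer and $kept are made up for illustration; a plain list stands in for the 'ls' output.

```perl
use strict;
use warnings;

# Made-up stand-ins for the lines that 'ls' would return.
my @lines = ("sample1_R1_001.fastq.gz\n", "sample2_R1_001.fastq.gz\n");

my $outer;                  # declared once, outside the loop
for my $line (@lines) {
    my $outer = $line;      # a NEW variable that hides the outer one
    chomp $outer;
}
print defined $outer ? "outer = $outer\n" : "outer is still undef\n";

my $kept;
for my $line (@lines) {
    $kept = $line;          # no 'my': assigns to the variable declared above
    chomp $kept;
}
print "kept = $kept\n";
```

The first loop leaves the outer variable undefined; the second leaves it holding the last file name, which is also why only the last sample survives the loop.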

              • #8
                thanks, 'use strict' and 'use warnings' were specified at start of script

                Now I have this:


                my ($sample,$fastq1);

                open(IN, 'ls *_R1_*.gz |');
                while(<IN>){
                chomp $_;
                $sample=$_;
                print "sample = $sample\n";
                $fastq1="${sample}";
                }

                my ($sample2, $fastq2);

                open(IN, 'ls *_R2_*.gz |');
                while(<IN>){
                chomp $_;
                $sample2=$_;
                print "sample2 = $sample2\n";
                $fastq2="${sample2}";
                }


                But the alignment is only working for the samples right at the end of the list, so the loop isn't working. Should I put the closing brace at the end of the entire script?
                Last edited by Vanisha; 02-19-2013, 05:42 AM.

                • #9
                  Ok so your endpoint is to get each set of fastq files submitted to an aligner? So you have a system call or something using your declared variables $fastq1, $fastq2 etc.? What does that look like?

                  Your variables are being overwritten each time the loop iterates, so you only get the last set of variables submitted to the aligner.

                  I think what you might want to do is push the variable you declare into an array or hash, then use a separate loop to iterate over the array/hash and submit your files to the aligner. The basic structure is something like:

                  Code:
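                  my (@samples, @fastqs);
                  #open your R1 fastqs as before
                  while (<IN>){
                  chomp $_;
                  my $sample=$_;
                  push (@samples,$sample);
                  push (@fastqs, $sample); #plain $sample; ${$sample} would be a symbolic reference and dies under 'use strict'
                  }

                  close IN; #btw you don't seem to close your filehandle anywhere; an explicit close is clearer than relying on the next open to do it

                  #build @samples2 and @fastqs2 the same way from the R2 files

                  my $length=@samples;
                  for (my $i=0;$i<$length;$i++){ #use $i to iterate over the array
                  #system call
                  system("tophat $samples[$i] $samples2[$i] $fastqs[$i] $fastqs2[$i]");
                  }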
                  The $samples[$i] construct is the $i'th element of the array @samples. Use $i so you can get the corresponding array position for your @fastqs.

                  Might seem like a lot to take in but arrays and hashes are there for a purpose!
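As a rough, self-contained sketch of the collect-then-iterate idea: the file names below are made up, and 'aligner' is a placeholder rather than a real command line, so the commands are only printed, not run.

```perl
use strict;
use warnings;

# Hypothetical file names standing in for the two directory listings.
# Sorting both lists keeps the R1 and R2 entries in matching order.
my @fastqs1 = sort qw(sample2_R1_001.fastq.gz sample1_R1_001.fastq.gz);
my @fastqs2 = sort qw(sample2_R2_001.fastq.gz sample1_R2_001.fastq.gz);

die "R1/R2 counts differ\n" unless @fastqs1 == @fastqs2;

my @commands;
for my $i (0 .. $#fastqs1) {
    # 'aligner' is a placeholder; substitute the real call and its options.
    push @commands, "aligner $fastqs1[$i] $fastqs2[$i]";
}

print "$_\n" for @commands;
```

Because every file name is stored in an array, nothing is overwritten between iterations, and each pair gets its own command.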

                  • #10
                    Originally posted by bruce01
                    Can you specify exactly what you want to do here? I know it should be obvious but just to have an idea of the specific aim. There may be a command line or other utility, so no need to reinvent the wheel. And list the contents, or example contents, of the dir you are working in. Presumably it holds gzipped fastq files only?

                    As an aside, have a look at the special variable $_

                    Use it instead of initialising your $sample, $sample2

                    Code:
                    open(IN, "file.foo");
                    while(<IN>){
                    chomp $_;
                    print "sample = $_\n";
                    }
                    Just for reference, $_ is Perl's default variable, and the point is that you don't have to write it out. Saying "chomp $_;" is unnecessary; just say "chomp;". Also, if you are going to use a lexical variable in place of that, the normal convention is to write
                    Code:
                    while(my $line = <$in>) {
                        # do something with $line
                    }
                    If you are modifying $_, then use lexical variables, or just use them all the time to be clear (especially with filehandles as above). People use $_ to be concise and technically your code will work, but it's best to not mix both approaches because it isn't concise or clear.
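For comparison, a small sketch of the two styles side by side. An in-memory string with made-up names stands in for the real input so the example runs anywhere; the array names @implicit and @named are only for illustration.

```perl
use strict;
use warnings;

my $listing = "a.fastq.gz\nb.fastq.gz\n";   # made-up input

# Style 1: implicit $_ ; chomp with no argument works on $_
open(my $in, '<', \$listing) or die "Cannot open listing: $!";
my @implicit;
while (<$in>) {
    chomp;
    push @implicit, $_;
}
close $in;

# Style 2: a named lexical variable
open($in, '<', \$listing) or die "Cannot open listing: $!";
my @named;
while (my $line = <$in>) {
    chomp $line;
    push @named, $line;
}
close $in;

print "same result\n" if "@implicit" eq "@named";
```

Both loops collect the same names; the choice is about readability, not behaviour.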

                    • #11
                      Yeah I was just doing 'chomp $_;' for clarity for the OP, more than 3 ways to skin a cat in Perl

                      • #12
                        Originally posted by bruce01
                        Yeah I was just doing 'chomp $_;' for clarity for the OP, more than 3 ways to skin a cat in Perl
                        Gotcha. I was just trying to save you some typing. It is true that trying to be concise can lead to obfuscation so I guess another good thing to say about that is to use comments in the code where things might not be easy to understand.

                        • #13
                          If you just want to pick two fastQ files, loop through the fastQ files of the first read and simply substitute _R1_ with _R2_, like this:

                          Code:
                          open(IN, 'ls *_R1_*.gz |');
                          while(<IN>){
                            chomp $_;
                            my $sample1=$_;
                            print "sample = $sample1\n";
                            (my $sample2 = $sample1) =~ s/_R1_/_R2_/;
                            print "sample2 = $sample2\n";  
                            #Do something with sample1 and sample2
                          }
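A slightly more defensive variant of the same idea, as a sketch only: it uses Perl's built-in glob() in place of the piped 'ls' (a swap, not what the post above does) and checks that the derived read-2 file actually exists. The helper name mate_of is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: derive the read-2 name from a read-1 name.
sub mate_of {
    my ($r1) = @_;
    (my $r2 = $r1) =~ s/_R1_/_R2_/;   # copy first, then substitute on the copy
    return $r2;
}

# glob() returns the matching names without trailing newlines,
# so no chomp is needed.
for my $r1 (glob('*_R1_*.gz')) {
    my $r2 = mate_of($r1);
    if (-e $r2) {
        print "pair: $r1 $r2\n";      # do something with the pair here
    } else {
        warn "no read-2 file for $r1 (expected $r2)\n";
    }
}
```

Run in a directory with no matching files, the loop simply does nothing.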

                          • #14
                            that's worked! thanks for your help, learning perl is a long process!

                            • #15
                              Closing IN and even having two different file handles would make this a lot more readable. IN1 and IN2 for instance.
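A minimal sketch of what two separate, explicitly closed handles could look like. It uses lexical handles ($in1, $in2) rather than the bareword IN1 and IN2, and in-memory strings with made-up names stand in for the two 'ls' listings so the example is runnable on its own.

```perl
use strict;
use warnings;

# In-memory strings stand in for the two 'ls' listings (made-up names).
my $r1_listing = "sample1_R1_001.fastq.gz\nsample2_R1_001.fastq.gz\n";
my $r2_listing = "sample1_R2_001.fastq.gz\nsample2_R2_001.fastq.gz\n";

open(my $in1, '<', \$r1_listing) or die "Cannot open R1 listing: $!";
open(my $in2, '<', \$r2_listing) or die "Cannot open R2 listing: $!";

my @pairs;
while (defined(my $fastq1 = <$in1>)) {
    my $fastq2 = <$in2>;              # read the matching line from the other handle
    die "fewer R2 than R1 entries\n" unless defined $fastq2;
    chomp($fastq1, $fastq2);
    push @pairs, [$fastq1, $fastq2];
}

close $in1;
close $in2;

print "$_->[0] $_->[1]\n" for @pairs;
```

With one handle per listing, the two reads can advance together, and each handle is closed exactly once.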
