Crossbow: Genotyping from short reads using cloud computing


  • #16
    Originally posted by VIX_Z View Post
    Hi Ben,
    I want to try Crossbow on my local Hadoop-enabled cluster. Can you share the data you used for the "small version of the experiment on a local Hadoop cluster"? I keep running into various errors when using other read data.

    With thanks,
    Vix
    Hi Vix,

    Yes, several people have contacted me with similar questions. We have since released a version of Crossbow (v1.0.4) that we think has a much better Hadoop mode. It also has a single-computer mode, which does not require Hadoop (or Java). If you have time, please give that a shot.

    Sorry for the trouble,
    Ben

    • #17
      Comments on Crossbow 1.0.4

      Originally posted by Ben Langmead View Post
      Hi Vix,

      Yes, several people have contacted me with similar questions. We have since released a version of Crossbow (v1.0.4) that we think has a much better Hadoop mode. It also has a single-computer mode, which does not require Hadoop (or Java). If you have time, please give that a shot.

      Sorry for the trouble,
      Ben
      Hi Ben,

      Thank you for writing that wonderful software. I tried the latest release (v1.0.4), and got some minor hiccups.

      I ran Crossbow on a local server running Ubuntu. Some of the scripts included with the tool use #!/bin/sh, which on my system points to /bin/dash. Dash does not have the pushd and popd commands that the scripts call. By changing the scripts to explicitly use /bin/bash, I was able to resolve the minor hiccups.

      Is there a particular reason why /bin/sh was declared rather than /bin/bash?

      The second comment is more of a wish-list item. I have huge data files from several whole human genomes (each genome dataset contains >2.4 billion reads). I used the Crossbow preprocess routine, and it appears to be single-threaded. Is that correct? It takes several hours just to preprocess the data on a server with 16 cores. Is there a way to speed up the process, or is it limited by disk access speed?

      The third is a question related to SOAPsnp. After the alignment finishes, the aligned data is split into 48 tasks. I am getting the following error: processing input task-00023 [24 of 48]... Aborting master loop because child failed.

      Jeremy

      • #18
        Hi Jeremy,

        Originally posted by jchien View Post
        I ran Crossbow on a local server running Ubuntu. Some of the scripts included with the tool use #!/bin/sh, which on my system points to /bin/dash. Dash does not have the pushd and popd commands that the scripts call. By changing the scripts to explicitly use /bin/bash, I was able to resolve the minor hiccups.

        Is there a particular reason why /bin/sh was declared rather than /bin/bash?
        Our mistake! We'll fix this in the next release.
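
        In the meantime, a workaround along these lines should do the trick (the file name below is illustrative, not an actual Crossbow script):

```shell
# Stand-in for a Crossbow helper that uses pushd/popd but declares sh
# ("myscript.sh" is illustrative, not a real Crossbow file name).
printf '#!/bin/sh\npushd /tmp >/dev/null; popd >/dev/null\n' > myscript.sh
chmod +x myscript.sh
# Rewrite the shebang so bash (which provides pushd/popd) runs the script:
sed -i '1s|^#!/bin/sh|#!/bin/bash|' myscript.sh
head -n 1 myscript.sh    # now reads: #!/bin/bash
```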

        The second comment is more of a wish-list item. I have huge data files from several whole human genomes (each genome dataset contains >2.4 billion reads). I used the Crossbow preprocess routine, and it appears to be single-threaded. Is that correct? It takes several hours just to preprocess the data on a server with 16 cores. Is there a way to speed up the process, or is it limited by disk access speed?
        The preprocess step is parallel like the other steps. Can you tell me exactly what parameters you used, and what makes you conclude it's running single-threaded?

        The third is a question related to SOAPsnp. After the alignment finishes, the aligned data is split into 48 tasks. I am getting the following error: processing input task-00023 [24 of 48]... Aborting master loop because child failed.
        There should be another message that gives you more information about the error; could you send me the entire output?

        Thanks,
        Ben

        • #19
          Crossbow-1.0.4 preprocess and SOAPsnp

          The preprocess step is parallel like the other steps. Can you tell me exactly what parameters you used, and what makes you conclude it's running single-threaded?
          I specified cpus=12, but during the preprocess there is only one process running, as opposed to the bowtie and soapsnp steps, where I see 12 processes running in parallel. That's why I thought the preprocess step was single-threaded.

          There should be another message that gives you more information about the error; could you send me the entire output?
          Here is the output of the entire error:

          Pid 14294 processing input task-00023 [24 of 48]...
          Aborting master loop because child failed
          Pid 16156 processing input task-00024 [25 of 48]...
          -- Reduce counters --
          SOAPsnp Alignments read 9664618
          SOAPsnp Paired alignments read 19492260
          SOAPsnp Positions called 1358172823
          SOAPsnp Positions called uncovered by any alignments 780183126
          SOAPsnp Positions called uncovered by unique alignments 789447449
          SOAPsnp Positions called with known SNP info 0
          SOAPsnp Positions with non-reference allele called 194467
          SOAPsnp Unique alignments read 9292965
          SOAPsnp Unpaired alignments read 0
          SOAPsnp wrapper Alignments processed 9746226
          SOAPsnp wrapper Out-of-range SNPs trimmed 9
          SOAPsnp wrapper Ranges processed 1477
          SOAPsnp wrapper SNP files missing 1477
          SOAPsnp wrapper SNPs reported 194458
          ==========================
          Stage 4 of 4. Postprocess
          ==========================
          Mon Aug 2 11:42:48 CDT 2010
          === Reduce ===
          # parallel reducers: 12
          # reduce tasks: 1
          Input: /tmp/crossbow/intermediate/10029/snps
          Output: /data/101b_full/crossbow_results
          Intermediate: /data/101b_full/crossbow_results.reduce.pre
          # bin, sort fields: 1, 2
          Total allowed sort memory footprint: 0
          Options: [ -keep-all -force ]
          Could not create new directory /data/101b_full/crossbow_results at /home/jchien/crossbow-1.0.4/ReduceWrap.pl line 81.
          Non-zero exitlevel from Postprocess stage
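
          A quick sanity check for that last mkdir failure (check_outdir is a hypothetical helper, not part of Crossbow): mkdir usually fails either because the target already exists or because the parent directory isn't writable.

```shell
# Hypothetical diagnostic for "Could not create new directory": report
# whether the target already exists or its parent is unwritable.
check_outdir() {
    out=$1
    if [ -e "$out" ]; then
        echo "exists"             # move it aside or choose a fresh path
    elif [ ! -w "$(dirname "$out")" ]; then
        echo "parent-unwritable"  # fix permissions/ownership on the parent
    else
        echo "ok"                 # mkdir should succeed here
    fi
}
check_outdir /data/101b_full/crossbow_results   # path taken from the log above
```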

          • #20
            Whole-human resequencing with Crossbow

            Hi Ben,

            Originally posted by Ben Langmead View Post
            Hi all,

            The Crossbow paper, Searching for SNPs with cloud computing came out in provisional form today. Take a look if you're interested.

            Thanks,
            Ben
            I've read your paper and tried to reproduce part of the experiment you ran on the YH genome, using your Web-based GUI with some of the paired-end read data on 10 instances and the Job type option "Just preprocess reads".

            However, the AWS EC2 job failed on the "2. Preprocess short reads" step.
            The stderr log read as follows:

            Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
            at java.util.Arrays.copyOf(Arrays.java:2734)
            at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
            at java.util.ArrayList.add(ArrayList.java:351)
            at org.apache.hadoop.mapred.lib.NLineInputFormat.getSplits(NLineInputFormat.java:100)
            at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:833)
            at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:804)
            at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:753)
            at org.apache.hadoop.streaming.StreamJob.submitAndMonitorJob(StreamJob.java:1012)
            at org.apache.hadoop.streaming.StreamJob.run(StreamJob.java:127)
            at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
            at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
            at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:50)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
            at java.lang.reflect.Method.invoke(Method.java:597)
            at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

            It seems like there is a memory leak or something...
            Can you give me some advice?

            Moreover, do I need to unzip the read files before uploading them to S3?
            I'd really rather not, because the transfer time and cost would become monstrous...

            Thanks,
            Serena Rhie

            • #21
              Originally posted by Serena Rhie View Post
              I've read your paper and tried to reproduce part of the experiment you ran on the YH genome, using your Web-based GUI with some of the paired-end read data on 10 instances and the Job type option "Just preprocess reads".

              However, the AWS EC2 job failed on the "2. Preprocess short reads" step.
              The stderr log read as follows:

              Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
              at java.util.Arrays.copyOf(Arrays.java:2734)
              at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
              at java.util.ArrayList.add(ArrayList.java:351)
              at org.apache.hadoop.mapred.lib.NLineInputFormat.getSplits(NLineInputFormat.java:100)
              at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:833)
              at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:804)
              at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:753)
              at org.apache.hadoop.streaming.StreamJob.submitAndMonitorJob(StreamJob.java:1012)
              at org.apache.hadoop.streaming.StreamJob.run(StreamJob.java:127)
              at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
              at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
              at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:50)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
              at java.lang.reflect.Method.invoke(Method.java:597)
              at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

              It seems like there is a memory leak or something...
              Can you give me some advice?
              Yikes, that's odd. Can you send me the manifest file you were using? I'll try to recreate this and let you know what I find. (I did encounter some memory-exhaustion issues when I moved over to Hadoop 0.20, but I thought I had fixed that by turning off JVM reuse; we'll see.)
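
              For anyone else hitting this: the trace shows the submitting JVM, not a worker task, exhausting its heap while NLineInputFormat computes splits, so one generic workaround (a sketch, not a Crossbow-specific fix) is to raise the client-side heap before launching:

```shell
# Generic Hadoop client-heap knobs (a workaround sketch, not an official
# Crossbow fix): the job-submission JVM reads these at launch time.
export HADOOP_HEAPSIZE=2000           # heap for bin/hadoop, in MB
export HADOOP_CLIENT_OPTS="-Xmx2g"    # extra JVM opts for client commands
echo "$HADOOP_HEAPSIZE $HADOOP_CLIENT_OPTS"
```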

              Moreover, do I need to unzip the read files before uploading on S3?
              I really don't want to do that - because the transferring time and cost will be getting a monster...
              No, you don't. The input to the preprocessing step can be compressed with gzip or bzip2.
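
              In other words, the reads can be compressed locally before upload; a minimal sketch (the FASTQ content and bucket name are hypothetical):

```shell
# Toy FASTQ file standing in for real read data:
printf '@r1\nACGT\n+\nIIII\n' > reads_1.fq
gzip -f reads_1.fq              # produces reads_1.fq.gz
# Upload the compressed file as-is, e.g. with s3cmd (bucket hypothetical):
# s3cmd put reads_1.fq.gz s3://my-bucket/reads/
```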

              Thanks,
              Ben

              • #22
                The manifest file is too long, so I tried the sample provided in the Crossbow (via web interface) manual.
                I used the manifest file included in
                $CROSSBOW_HOME/example/e_coli/small.manifest
                It fails on Step 3 with the stderr message
                Streaming Command Failed!
                Here is the stdout:
                packageJobJar: [/mnt/var/lib/hadoop/tmp/hadoop-unjar3365379216250869521/] [] /mnt/var/lib/hadoop/steps/5/tmp/streamjob208563616485322347.jar tmpDir=null
                Ben, can you help me? I really want to see the output results!

                • #23
                  Originally posted by Serena Rhie View Post
                  The manifest file is too long, so I tried the sample provided in the Crossbow (via web interface) manual.
                  I used the manifest file included in
                  $CROSSBOW_HOME/example/e_coli/small.manifest
                  It fails on Step 3 with the stderr message
                  Streaming Command Failed!
                  Here is the stdout:
                  packageJobJar: [/mnt/var/lib/hadoop/tmp/hadoop-unjar3365379216250869521/] [] /mnt/var/lib/hadoop/steps/5/tmp/streamjob208563616485322347.jar tmpDir=null
                  Ben, can you help me? I really want to see the output results!
                  Hi Serena,

                  Please send me the exact command used.

                  Thanks,
                  Ben

                  • #24
                    Here are my commands on web interface:

                    Job type: Crossbow
                    s3n://pings-ewha/e-coli/read-data/small.manifest
                    s3n://pings-ewha/e-coli/crossbow-mediated
                    Input type: Manifest
                    Genome/Annotation: E. coli O157:H7
                    ...
                    Chromosome ploidy: All are haploid
                    EC2 instances: 1

                    Other options: set as default

                    • #25
                      Originally posted by Serena Rhie View Post
                      Here are my commands on web interface:

                      Job type: Crossbow
                      s3n://pings-ewha/e-coli/read-data/small.manifest
                      s3n://pings-ewha/e-coli/crossbow-mediated
                      Input type: Manifest
                      Genome/Annotation: E. coli O157:H7
                      ...
                      Chromosome ploidy: All are haploid
                      EC2 instances: 1

                      Other options: set as default
                      That set of options works for me. Does your job fail on step 2 or step 3? Your earlier post said step 2 but your newer one said step 3.

                      Could you send me the "stderr" and "syslog" logs from one of the "task attempts" that fail? This document explains how to do this using the AWS Console interface:

                      http://docs.amazonwebservices.com/El...gJobFlows.html

                      Thanks,
                      Ben

                      • #26
                        My job fails on Step 3. Yeah, it's quite confusing; I started a new job for simplicity.

                        I've got 4 attempts under "task attempts", and each stderr and syslog gives almost the same message. It seems like Hadoop is throwing a null pointer exception.

                        stderr:
                        Code:
                        log4j:WARN No appenders could be found for logger (org.apache.hadoop.conf.Configuration).
                        log4j:WARN Please initialize the log4j system properly.
                        s3cmd: found: /usr/bin/s3cmd, given: 
                        jar: found: /usr/lib/jvm/java-6-sun/bin/jar, given: 
                        hadoop: found: /home/hadoop/bin/../bin/hadoop, given: 
                        wget: found: /usr/bin/wget, given: 
                        s3cfg: 
                        cmap_file: 
                        cmap_jar: S3N://crossbow-refs/e_coli.jar
                        local destination dir: /mnt/14270
                        Output dir: S3N://pings-ewha/e-coli/crossbow-mediated
                        Ensuring cmap jar is installed
                        Get.pm:ensureFetched: called on "S3N://crossbow-refs/e_coli.jar"
                        Get.pm:ensureFetched: base name "e_coli.jar"
                        mkdir -p /mnt/14270 >&2 2>/dev/null
                        ls -al /mnt/14270/*e_coli.jar* /mnt/14270/.*e_coli.jar*
                        -rw-r--r-- 1 hadoop hadoop 9990385 2010-08-17 01:16 /mnt/14270/e_coli.jar
                        -rw-r--r-- 1 hadoop hadoop       0 2010-08-17 01:16 /mnt/14270/.e_coli.jar.done
                        -rw-r--r-- 1 hadoop hadoop       0 2010-08-17 01:16 /mnt/14270/.e_coli.jar.lock
                        Pid 22358: Checking for done file /mnt/14270/.e_coli.jar.done
                        Pid 22358: done file /mnt/14270/.e_coli.jar.done was there already; continuing
                        Examining extracted files
                        find /mnt/14270
                        /mnt/14270
                        /mnt/14270/META-INF
                        /mnt/14270/META-INF/MANIFEST.MF
                        /mnt/14270/snps
                        /mnt/14270/sequences
                        /mnt/14270/sequences/chr0.fa
                        /mnt/14270/index
                        /mnt/14270/index/index.rev.1.ebwt
                        /mnt/14270/index/index.1.ebwt
                        /mnt/14270/index/index.rev.2.ebwt
                        /mnt/14270/index/index.2.ebwt
                        /mnt/14270/index/index.4.ebwt
                        /mnt/14270/index/index.3.ebwt
                        /mnt/14270/.e_coli.jar.lock
                        /mnt/14270/cmap.txt
                        /mnt/14270/.e_coli.jar.done
                        /mnt/14270/e_coli.jar
                        java.lang.RuntimeException: java.lang.NullPointerException
                        	at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:386)
                        	at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:582)
                        	at org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:137)
                        	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:477)
                        	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:415)
                        	at org.apache.hadoop.mapred.Child.main(Child.java:170)
                        Caused by: java.lang.NullPointerException
                        	at org.apache.hadoop.streaming.PipeMapRed$MRErrorThread.incrCounter(PipeMapRed.java:549)
                        	at org.apache.hadoop.streaming.PipeMapRed$MRErrorThread.run(PipeMapRed.java:490)
                        And here is the syslog:

                        Code:
                        2010-08-17 01:29:12,507 INFO org.apache.hadoop.metrics.jvm.JvmMetrics (main): Initializing JVM Metrics with processName=SHUFFLE, sessionId=
                        2010-08-17 01:29:12,701 INFO org.apache.hadoop.mapred.ReduceTask (main): Host name: domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:14,770 INFO com.hadoop.compression.lzo.GPLNativeCodeLoader (main): Loaded native gpl library
                        2010-08-17 01:29:14,771 INFO com.hadoop.compression.lzo.LzoCodec (main): Successfully loaded & initialized native-lzo library
                        2010-08-17 01:29:14,781 INFO org.apache.hadoop.mapred.ReduceTask (main): ShuffleRamManager: MemoryLimit=488066240, MaxSingleShuffleLimit=122016560
                        2010-08-17 01:29:14,786 INFO org.apache.hadoop.io.compress.CodecPool (main): Got brand-new decompressor
                        2010-08-17 01:29:14,786 INFO org.apache.hadoop.io.compress.CodecPool (main): Got brand-new decompressor
                        2010-08-17 01:29:14,787 INFO org.apache.hadoop.io.compress.CodecPool (main): Got brand-new decompressor
                        2010-08-17 01:29:14,788 INFO org.apache.hadoop.io.compress.CodecPool (main): Got brand-new decompressor
                        2010-08-17 01:29:14,789 INFO org.apache.hadoop.io.compress.CodecPool (main): Got brand-new decompressor
                        2010-08-17 01:29:14,790 INFO org.apache.hadoop.io.compress.CodecPool (main): Got brand-new decompressor
                        2010-08-17 01:29:14,791 INFO org.apache.hadoop.io.compress.CodecPool (main): Got brand-new decompressor
                        2010-08-17 01:29:14,792 INFO org.apache.hadoop.io.compress.CodecPool (main): Got brand-new decompressor
                        2010-08-17 01:29:14,792 INFO org.apache.hadoop.io.compress.CodecPool (main): Got brand-new decompressor
                        2010-08-17 01:29:14,793 INFO org.apache.hadoop.io.compress.CodecPool (main): Got brand-new decompressor
                        2010-08-17 01:29:14,794 INFO org.apache.hadoop.io.compress.CodecPool (main): Got brand-new decompressor
                        2010-08-17 01:29:14,795 INFO org.apache.hadoop.io.compress.CodecPool (main): Got brand-new decompressor
                        2010-08-17 01:29:14,796 INFO org.apache.hadoop.io.compress.CodecPool (main): Got brand-new decompressor
                        2010-08-17 01:29:14,796 INFO org.apache.hadoop.io.compress.CodecPool (main): Got brand-new decompressor
                        2010-08-17 01:29:14,798 INFO org.apache.hadoop.io.compress.CodecPool (main): Got brand-new decompressor
                        2010-08-17 01:29:14,798 INFO org.apache.hadoop.io.compress.CodecPool (main): Got brand-new decompressor
                        2010-08-17 01:29:14,799 INFO org.apache.hadoop.io.compress.CodecPool (main): Got brand-new decompressor
                        2010-08-17 01:29:14,800 INFO org.apache.hadoop.io.compress.CodecPool (main): Got brand-new decompressor
                        2010-08-17 01:29:14,801 INFO org.apache.hadoop.io.compress.CodecPool (main): Got brand-new decompressor
                        2010-08-17 01:29:14,802 INFO org.apache.hadoop.io.compress.CodecPool (main): Got brand-new decompressor
                        2010-08-17 01:29:14,805 INFO org.apache.hadoop.mapred.ReduceTask (Thread for merging on-disk files): attempt_201008170108_0004_r_000001_0 Thread started: Thread for merging on-disk files
                        2010-08-17 01:29:14,805 INFO org.apache.hadoop.mapred.ReduceTask (Thread for merging on-disk files): attempt_201008170108_0004_r_000001_0 Thread waiting: Thread for merging on-disk files
                        2010-08-17 01:29:14,806 INFO org.apache.hadoop.mapred.ReduceTask (Thread for merging in memory files): attempt_201008170108_0004_r_000001_0 Thread started: Thread for merging in memory files
                        2010-08-17 01:29:14,807 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Need another 32 map output(s) where 0 is already in progress
                        2010-08-17 01:29:14,807 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 0 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:14,807 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0 Thread started: Thread for polling Map Completion Events
                        2010-08-17 01:29:14,815 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 9 new map-outputs
                        2010-08-17 01:29:17,828 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:29:19,822 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:19,829 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): header: attempt_201008170108_0004_m_000000_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:19,829 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000000_0
                        2010-08-17 01:29:19,833 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): Read 2 bytes from map-output for attempt_201008170108_0004_m_000000_0
                        2010-08-17 01:29:19,833 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): Rec #1 from attempt_201008170108_0004_m_000000_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:19,835 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:19,839 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): header: attempt_201008170108_0004_m_000001_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:19,840 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000001_0
                        2010-08-17 01:29:19,840 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): Read 2 bytes from map-output for attempt_201008170108_0004_m_000001_0
                        2010-08-17 01:29:19,840 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): Rec #1 from attempt_201008170108_0004_m_000001_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:19,840 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:19,847 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.5): header: attempt_201008170108_0004_m_000002_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:19,847 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.5): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000002_0
                        2010-08-17 01:29:19,848 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.5): Read 2 bytes from map-output for attempt_201008170108_0004_m_000002_0
                        2010-08-17 01:29:19,848 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.5): Rec #1 from attempt_201008170108_0004_m_000002_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:19,848 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:19,886 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.10): header: attempt_201008170108_0004_m_000003_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:19,886 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.10): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000003_0
                        2010-08-17 01:29:19,887 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.10): Read 2 bytes from map-output for attempt_201008170108_0004_m_000003_0
                        2010-08-17 01:29:19,887 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.10): Rec #1 from attempt_201008170108_0004_m_000003_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:19,887 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:19,889 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.15): header: attempt_201008170108_0004_m_000004_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:19,889 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.15): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000004_0
                        2010-08-17 01:29:19,899 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.15): Read 2 bytes from map-output for attempt_201008170108_0004_m_000004_0
                        2010-08-17 01:29:19,899 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.15): Rec #1 from attempt_201008170108_0004_m_000004_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:19,899 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:19,901 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.18): header: attempt_201008170108_0004_m_000005_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:19,901 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.18): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000005_0
                        2010-08-17 01:29:19,902 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.18): Read 2 bytes from map-output for attempt_201008170108_0004_m_000005_0
                        2010-08-17 01:29:19,902 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.18): Rec #1 from attempt_201008170108_0004_m_000005_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:19,902 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:19,904 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.5): header: attempt_201008170108_0004_m_000006_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:19,904 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.5): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000006_0
                        2010-08-17 01:29:19,904 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.5): Read 2 bytes from map-output for attempt_201008170108_0004_m_000006_0
                        2010-08-17 01:29:19,904 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.5): Rec #1 from attempt_201008170108_0004_m_000006_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:19,905 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:19,907 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.14): header: attempt_201008170108_0004_m_000007_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:19,907 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.14): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000007_0
                        2010-08-17 01:29:19,907 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.14): Read 2 bytes from map-output for attempt_201008170108_0004_m_000007_0
                        2010-08-17 01:29:19,907 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.14): Rec #1 from attempt_201008170108_0004_m_000007_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:19,907 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:19,909 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.12): header: attempt_201008170108_0004_m_000008_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:19,909 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.12): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000008_0
                        2010-08-17 01:29:19,910 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.12): Read 2 bytes from map-output for attempt_201008170108_0004_m_000008_0
                        2010-08-17 01:29:19,910 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.12): Rec #1 from attempt_201008170108_0004_m_000008_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:19,910 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:19,912 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.13): header: attempt_201008170108_0004_m_000009_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:19,912 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.13): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000009_0
                        2010-08-17 01:29:19,913 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.13): Read 2 bytes from map-output for attempt_201008170108_0004_m_000009_0
                        2010-08-17 01:29:19,913 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.13): Rec #1 from attempt_201008170108_0004_m_000009_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:20,851 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:29:23,864 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:29:24,914 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:24,920 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.13): header: attempt_201008170108_0004_m_000010_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:24,920 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.13): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000010_0
                        2010-08-17 01:29:24,920 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.13): Read 2 bytes from map-output for attempt_201008170108_0004_m_000010_0
                        2010-08-17 01:29:24,921 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.13): Rec #1 from attempt_201008170108_0004_m_000010_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:24,921 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:24,925 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.10): header: attempt_201008170108_0004_m_000011_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:24,925 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.10): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000011_0
                        2010-08-17 01:29:24,925 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.10): Read 2 bytes from map-output for attempt_201008170108_0004_m_000011_0
                        2010-08-17 01:29:24,925 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.10): Rec #1 from attempt_201008170108_0004_m_000011_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:26,880 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:29:29,895 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:29:29,928 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:29,936 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): header: attempt_201008170108_0004_m_000012_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:29,937 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000012_0
                        2010-08-17 01:29:29,937 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): Read 2 bytes from map-output for attempt_201008170108_0004_m_000012_0
                        2010-08-17 01:29:29,937 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): Rec #1 from attempt_201008170108_0004_m_000012_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:29,937 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:29,941 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.1): header: attempt_201008170108_0004_m_000013_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:29,941 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.1): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000013_0
                        2010-08-17 01:29:29,941 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.1): Read 2 bytes from map-output for attempt_201008170108_0004_m_000013_0
                        2010-08-17 01:29:29,942 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.1): Rec #1 from attempt_201008170108_0004_m_000013_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:32,904 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:29:34,943 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:34,948 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.1): header: attempt_201008170108_0004_m_000014_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:34,948 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.1): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000014_0
                        2010-08-17 01:29:34,949 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.1): Read 2 bytes from map-output for attempt_201008170108_0004_m_000014_0
                        2010-08-17 01:29:34,949 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.1): Rec #1 from attempt_201008170108_0004_m_000014_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:35,912 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:29:38,935 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:29:39,950 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:39,971 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.12): header: attempt_201008170108_0004_m_000015_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:39,971 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.12): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000015_0
                        2010-08-17 01:29:39,971 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.12): Read 2 bytes from map-output for attempt_201008170108_0004_m_000015_0
                        2010-08-17 01:29:39,972 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.12): Rec #1 from attempt_201008170108_0004_m_000015_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:39,972 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:39,978 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.1): header: attempt_201008170108_0004_m_000016_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:39,978 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.1): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000016_0
                        2010-08-17 01:29:39,979 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.1): Read 2 bytes from map-output for attempt_201008170108_0004_m_000016_0
                        2010-08-17 01:29:39,979 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.1): Rec #1 from attempt_201008170108_0004_m_000016_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:41,942 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:29:44,952 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:29:44,984 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:44,990 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.16): header: attempt_201008170108_0004_m_000017_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:44,990 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.16): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000017_0
                        2010-08-17 01:29:44,990 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.16): Read 2 bytes from map-output for attempt_201008170108_0004_m_000017_0
                        2010-08-17 01:29:44,990 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.16): Rec #1 from attempt_201008170108_0004_m_000017_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:44,991 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:44,993 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.16): header: attempt_201008170108_0004_m_000018_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:44,993 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.16): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000018_0
                        2010-08-17 01:29:44,994 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.16): Read 2 bytes from map-output for attempt_201008170108_0004_m_000018_0
                        2010-08-17 01:29:44,994 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.16): Rec #1 from attempt_201008170108_0004_m_000018_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:47,962 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:29:50,000 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:50,004 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.16): header: attempt_201008170108_0004_m_000019_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:50,004 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.16): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000019_0
                        2010-08-17 01:29:50,005 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.16): Read 2 bytes from map-output for attempt_201008170108_0004_m_000019_0
                        2010-08-17 01:29:50,005 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.16): Rec #1 from attempt_201008170108_0004_m_000019_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:50,970 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:29:53,979 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:29:55,006 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:55,011 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.10): header: attempt_201008170108_0004_m_000020_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:55,011 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.10): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000020_0
                        2010-08-17 01:29:55,011 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.10): Read 2 bytes from map-output for attempt_201008170108_0004_m_000020_0
                        2010-08-17 01:29:55,012 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.10): Rec #1 from attempt_201008170108_0004_m_000020_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:55,012 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:29:55,016 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.5): header: attempt_201008170108_0004_m_000021_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:29:55,016 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.5): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000021_0
                        2010-08-17 01:29:55,016 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.5): Read 2 bytes from map-output for attempt_201008170108_0004_m_000021_0
                        2010-08-17 01:29:55,016 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.5): Rec #1 from attempt_201008170108_0004_m_000021_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:29:56,986 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:29:59,997 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:30:00,017 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:30:00,023 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.5): header: attempt_201008170108_0004_m_000022_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:30:00,024 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.5): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000022_0
                        2010-08-17 01:30:00,024 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.5): Read 2 bytes from map-output for attempt_201008170108_0004_m_000022_0
                        2010-08-17 01:30:00,024 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.5): Rec #1 from attempt_201008170108_0004_m_000022_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:30:00,024 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:30:00,031 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.16): header: attempt_201008170108_0004_m_000023_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:30:00,031 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.16): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000023_0
                        2010-08-17 01:30:00,031 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.16): Read 2 bytes from map-output for attempt_201008170108_0004_m_000023_0
                        2010-08-17 01:30:00,031 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.16): Rec #1 from attempt_201008170108_0004_m_000023_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:30:03,005 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:30:05,033 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:30:05,038 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.18): header: attempt_201008170108_0004_m_000024_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:30:05,038 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.18): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000024_0
                        2010-08-17 01:30:05,039 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.18): Read 2 bytes from map-output for attempt_201008170108_0004_m_000024_0
                        2010-08-17 01:30:05,039 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.18): Rec #1 from attempt_201008170108_0004_m_000024_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:30:06,011 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:30:09,018 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:30:10,039 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:30:10,044 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.19): header: attempt_201008170108_0004_m_000025_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:30:10,045 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.19): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000025_0
                        2010-08-17 01:30:10,045 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.19): Read 2 bytes from map-output for attempt_201008170108_0004_m_000025_0
                        2010-08-17 01:30:10,045 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.19): Rec #1 from attempt_201008170108_0004_m_000025_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:30:10,046 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:30:10,050 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.17): header: attempt_201008170108_0004_m_000026_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:30:10,050 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.17): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000026_0
                        2010-08-17 01:30:10,050 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.17): Read 2 bytes from map-output for attempt_201008170108_0004_m_000026_0
                        2010-08-17 01:30:10,050 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.17): Rec #1 from attempt_201008170108_0004_m_000026_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:30:12,026 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:30:15,049 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:30:15,052 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Need another 5 map output(s) where 0 is already in progress
                        2010-08-17 01:30:15,052 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:30:15,060 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): header: attempt_201008170108_0004_m_000027_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:30:15,060 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000027_0
                        2010-08-17 01:30:15,060 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): Read 2 bytes from map-output for attempt_201008170108_0004_m_000027_0
                        2010-08-17 01:30:15,060 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): Rec #1 from attempt_201008170108_0004_m_000027_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:30:15,061 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:30:15,070 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): header: attempt_201008170108_0004_m_000028_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:30:15,070 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000028_0
                        2010-08-17 01:30:15,076 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): Read 2 bytes from map-output for attempt_201008170108_0004_m_000028_0
                        2010-08-17 01:30:15,076 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): Rec #1 from attempt_201008170108_0004_m_000028_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:30:18,063 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:30:20,077 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:30:20,082 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): header: attempt_201008170108_0004_m_000029_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:30:20,082 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000029_0
                        2010-08-17 01:30:20,082 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): Read 2 bytes from map-output for attempt_201008170108_0004_m_000029_0
                        2010-08-17 01:30:20,082 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): Rec #1 from attempt_201008170108_0004_m_000029_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:30:21,080 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:30:24,086 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): attempt_201008170108_0004_r_000001_0: Got 1 new map-outputs
                        2010-08-17 01:30:25,084 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:30:25,091 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): header: attempt_201008170108_0004_m_000030_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:30:25,091 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000030_0
                        2010-08-17 01:30:25,092 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): Read 2 bytes from map-output for attempt_201008170108_0004_m_000030_0
                        2010-08-17 01:30:25,092 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.0): Rec #1 from attempt_201008170108_0004_m_000030_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:30:25,093 INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201008170108_0004_r_000001_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
                        2010-08-17 01:30:25,104 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.19): header: attempt_201008170108_0004_m_000031_0, compressed len: 18, decompressed len: 2
                        2010-08-17 01:30:25,104 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.19): Shuffling 2 bytes (18 raw bytes) into RAM from attempt_201008170108_0004_m_000031_0
                        2010-08-17 01:30:25,104 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.19): Read 2 bytes from map-output for attempt_201008170108_0004_m_000031_0
                        2010-08-17 01:30:25,105 INFO org.apache.hadoop.mapred.ReduceTask (MapOutputCopier attempt_201008170108_0004_r_000001_0.19): Rec #1 from attempt_201008170108_0004_m_000031_0 -> (-1, -1) from domU-12-31-39-0C-68-41.compute-1.internal
                        2010-08-17 01:30:26,097 INFO org.apache.hadoop.mapred.ReduceTask (Thread for polling Map Completion Events): GetMapEventsThread exiting
                        2010-08-17 01:30:26,097 INFO org.apache.hadoop.mapred.ReduceTask (main): getMapsEventsThread joined.
                        2010-08-17 01:30:26,097 INFO org.apache.hadoop.mapred.ReduceTask (main): Closed ram manager
                        2010-08-17 01:30:26,097 INFO org.apache.hadoop.mapred.ReduceTask (main): Interleaved on-disk merge complete: 0 files left.
                        2010-08-17 01:30:26,098 INFO org.apache.hadoop.mapred.ReduceTask (main): In-memory merge complete: 32 files left.
                        2010-08-17 01:30:26,160 INFO org.apache.hadoop.mapred.Merger (main): Merging 32 sorted segments
                        2010-08-17 01:30:26,161 INFO org.apache.hadoop.mapred.Merger (main): Down to the last merge-pass, with 0 segments left of total size: 0 bytes
                        2010-08-17 01:30:26,173 INFO org.apache.hadoop.io.compress.CodecPool (main): Got brand-new compressor
                        2010-08-17 01:30:26,184 INFO org.apache.hadoop.mapred.ReduceTask (main): Merged 32 segments, 64 bytes to disk to satisfy reduce memory limit
                        2010-08-17 01:30:26,185 INFO org.apache.hadoop.mapred.ReduceTask (main): Merging 1 files, 22 bytes from disk
                        2010-08-17 01:30:26,185 INFO org.apache.hadoop.mapred.ReduceTask (main): Merging 0 segments, 0 bytes from memory into reduce
                        2010-08-17 01:30:26,186 INFO org.apache.hadoop.mapred.Merger (main): Merging 1 sorted segments
                        2010-08-17 01:30:26,196 INFO org.apache.hadoop.mapred.Merger (main): Down to the last merge-pass, with 0 segments left of total size: 0 bytes
                        2010-08-17 01:30:26,293 INFO org.apache.hadoop.streaming.PipeMapRed (main): PipeMapRed exec [/mnt3/var/lib/hadoop/mapred/taskTracker/jobcache/job_201008170108_0004/attempt_201008170108_0004_r_000001_0/work/./CBFinish.pl, --cmapjar=S3N://crossbow-refs/e_coli.jar, --destdir=/mnt/14270, --output=S3N://pings-ewha/e-coli/crossbow-mediated]
                        2010-08-17 01:30:27,372 INFO org.apache.hadoop.fs.s3native.NativeS3FileSystem (main): Creating new file 's3n://pings-ewha/e-coli/crossbow-mediated/ignoreme2/part-00001' in S3
                        2010-08-17 01:30:27,374 INFO org.apache.hadoop.fs.s3native.NativeS3FileSystem (main): Outputstream for key 'e-coli/crossbow-mediated/ignoreme2/part-00001' writing to tempfile '/mnt/var/lib/hadoop/s3,/mnt1/var/lib/hadoop/s3,/mnt2/var/lib/hadoop/s3,/mnt3/var/lib/hadoop/s3/output-4764919983786119736.tmp'
                        2010-08-17 01:30:27,380 WARN org.apache.hadoop.streaming.PipeMapRed (Thread-34): java.lang.NullPointerException
                        	at org.apache.hadoop.streaming.PipeMapRed$MRErrorThread.incrCounter(PipeMapRed.java:549)
                        	at org.apache.hadoop.streaming.PipeMapRed$MRErrorThread.run(PipeMapRed.java:490)
                        
                        2010-08-17 01:30:27,390 INFO org.apache.hadoop.streaming.PipeMapRed (main): PipeMapRed failed!
                        2010-08-17 01:30:27,392 WARN org.apache.hadoop.mapred.TaskTracker (main): Error running child
                        java.lang.RuntimeException: java.lang.NullPointerException
                        	at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:386)
                        	at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:582)
                        	at org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:137)
                        	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:477)
                        	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:415)
                        	at org.apache.hadoop.mapred.Child.main(Child.java:170)
                        Caused by: java.lang.NullPointerException
                        	at org.apache.hadoop.streaming.PipeMapRed$MRErrorThread.incrCounter(PipeMapRed.java:549)
                        	at org.apache.hadoop.streaming.PipeMapRed$MRErrorThread.run(PipeMapRed.java:490)
                        2010-08-17 01:30:27,395 INFO org.apache.hadoop.mapred.TaskRunner (main): Runnning cleanup for the task
                        2010-08-17 01:30:27,395 INFO org.apache.hadoop.mapred.DirectFileOutputCommitter (main): Nothing to clean up on abort since there are no temporary files written
                        Last edited by Serena Rhie; 08-16-2010, 06:30 PM. Reason: Sorry for bad view!

                        Comment


                        • #27
                          How to work with Local files in Crossbow

                          Dear All

                          I am using version 1.2.0 of Crossbow. I have managed to get it working for a file specified in a .manifest input file that points to an FTP server.

                          I am wondering how you would get it to work with input data that is already on your local computer, such as a .fastq, .fastq.gz, or .sra file? This would avoid the internet dependency.
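
                          For reference, in Crossbow's single-computer mode the manifest can reference paths on the local filesystem rather than FTP/S3 URLs. A minimal sketch follows; the read paths and the 0 placeholders in the MD5 fields are hypothetical, so check the Crossbow manual for the exact manifest syntax your version accepts:

                          Code:
                          ```shell
                          # Write a two-column-per-mate manifest pointing at local paired-end files.
                          # Fields are tab-separated: URL-or-path, MD5 (0 skips the checksum test),
                          # repeated for the second mate. Paths below are examples, not real data.
                          cat > local.manifest <<'EOF'
                          /data/reads/sample_1.fq.gz	0	/data/reads/sample_2.fq.gz	0
                          EOF
                          cat local.manifest
                          ```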

                          Also, I tried the bowtie option by specifying the path to bowtie2, and Crossbow did not work. Does Crossbow work only with Bowtie 1 and no other aligner?

                          Narain

                          Comment


                          • #28
                            Originally posted by Ben Langmead View Post
                            Hi all,

                            The Crossbow paper, Searching for SNPs with cloud computing, came out in provisional form today. Take a look if you're interested.

                            Thanks,
                            Ben
                            Hello Ben,

                            I am trying to use Crossbow via the web interface with 5 instances. My EMR job keeps failing at Crossbow Step 1: Align with Bowtie. Can you please suggest something so that I can make it work?

                            A total of 7 tasks were run, of which 2 tasks failed with 4 attempts each.
                            Here are the syslog and stderr of a failed task:

                            stderr
                            Code:
                            Warning: No TOOLNAME file in tool directory: Bin
                            Align.pl: s3cmd: found: /usr/bin/s3cmd, given: 
                            Align.pl: jar: found: /usr/lib/jvm/java-6-sun/bin/jar, given: 
                            Align.pl: hadoop: found: /home/hadoop/.versions/0.20.205/libexec/../bin/hadoop, given: 
                            Align.pl: wget: found: /usr/bin/wget, given: 
                            Align.pl: s3cfg: 
                            Align.pl: bowtie: found: ./bowtie, given: 
                            Align.pl: partition len: 1000000
                            Align.pl: ref: S3N://crossbow-refs/hg18.jar
                            Align.pl: quality: phred33
                            Align.pl: truncate at: 0
                            Align.pl: discard mate: 0
                            Align.pl: discard reads < truncate len: 0
                            Align.pl: SAM passthrough: 0
                            Align.pl: Straight through: 0
                            Align.pl: local index path: 
                            Align.pl: counters: 
                            Align.pl: dest dir: /mnt/15049
                            Align.pl: bowtie args: --partition 1000000 --mm -t --hadoopout --startverbose -m 1
                            Align.pl: ls -al
                            Align.pl: total 4
                            drwxr-xr-x 3 hadoop hadoop 4096 Oct  8 00:12 .
                            drwxr-xr-x 3 hadoop hadoop   17 Oct  8 00:12 ..
                            lrwxrwxrwx 1 hadoop hadoop   94 Oct  8 00:12 .job.jar.crc -> /mnt/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201310072347_0002/jars/.job.jar.crc
                            lrwxrwxrwx 1 hadoop hadoop  118 Oct  8 00:12 AWS.pm -> /mnt2/var/lib/hadoop/mapred/taskTracker/distcache/-4647442751582941863_-1803196130_294487920/crossbow-emr/1.2.1/AWS.pm
                            lrwxrwxrwx 1 hadoop hadoop  117 Oct  8 00:12 Align.pl -> /mnt3/var/lib/hadoop/mapred/taskTracker/distcache/-949217784113535430_-79486178_294487920/crossbow-emr/1.2.1/Align.pl
                            lrwxrwxrwx 1 hadoop hadoop  122 Oct  8 00:12 Counters.pm -> /mnt3/var/lib/hadoop/mapred/taskTracker/distcache/3768974508281659116_-1504224494_294492920/crossbow-emr/1.2.1/Counters.pm
                            lrwxrwxrwx 1 hadoop hadoop  116 Oct  8 00:12 Get.pm -> /mnt2/var/lib/hadoop/mapred/taskTracker/distcache/1742673893144693291_-703943522_294499920/crossbow-emr/1.2.1/Get.pm
                            lrwxrwxrwx 1 hadoop hadoop   90 Oct  8 00:12 META-INF -> /mnt/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201310072347_0002/jars/META-INF
                            lrwxrwxrwx 1 hadoop hadoop  119 Oct  8 00:12 Tools.pm -> /mnt1/var/lib/hadoop/mapred/taskTracker/distcache/2561649607146715239_-2000054114_294501920/crossbow-emr/1.2.1/Tools.pm
                            lrwxrwxrwx 1 hadoop hadoop  118 Oct  8 00:12 Util.pm -> /mnt/var/lib/hadoop/mapred/taskTracker/distcache/-6230906815329997085_-1944427886_294502920/crossbow-emr/1.2.1/Util.pm
                            lrwxrwxrwx 1 hadoop hadoop  119 Oct  8 00:12 bowtie -> /mnt2/var/lib/hadoop/mapred/taskTracker/distcache/8717311255330235240_-1327012900_303598920/crossbow-emr/1.2.1/bowtie64
                            lrwxrwxrwx 1 hadoop hadoop   89 Oct  8 00:12 job.jar -> /mnt/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201310072347_0002/jars/job.jar
                            lrwxrwxrwx 1 hadoop hadoop   85 Oct  8 00:12 org -> /mnt/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201310072347_0002/jars/org
                            drwxr-xr-x 2 hadoop hadoop    6 Oct  8 00:12 tmp
                            Align.pl: Read first line of stdin:
                            FN:human.125.1M.1.fq;RN:@chr10_77883106_77883509_0:0:0_4:0:0_0/2	GTTTCTGAGATGCTGCAGAATGCTGCCTCACATCCACCTCTGAGTGAAAGAATTCCTTCACAGATTATATATATTCAGAGAAGGACTATCCTAACCTACAGTTTCGAAGCTTTTATGTCTAAAGA	22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222	AAAAAAAAAAAATTAGCCAGGTATGGCGGCTCACACCTGCGGTCCCAGCTACTTGGGAGACTAAGGTGGGAGGATCACCTGAGCCTGGGAGGTCGAGGCTGCAGTGAGCTGTGATTGTGCCACTG	22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222
                            Align.pl: Read last line of stdin:
                            FN:human.125.1M.1.fq;RN:@chr2_196190269_196190796_1:0:0_4:0:0_9041/2	TATTAAAGCCAGGTGGAGAATAAAACCTGCCTACATTAATTCTATCACCTTCCCTAATTCCTAATTGCCATTTAACCATGGGAAGCCATAACTACCAAAAAGCGGGGCAGAGAAAGCAGAAGATA	22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222	GATAGGAATTGTTAAATTATTTATATAAGTCAAATGAAGCTTTGCAGTCCTGTACTAAAACACTATTTAGTGGGAATAGAATGTAAGAAGCTCTAGAAAATCAATTTGCCACAGTACTCTTATTT	22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222
                            Align.pl: 500000 reads downloaded
                            Align.pl: 500000 reads downloaded
                            Align.pl: head -4 .tmp.11427:
                            Align.pl: r	GTTTCTGAGATGCTGCAGAATGCTGCCTCACATCCACCTCTGAGTGAAAGAATTCCTTCACAGATTATATATATTCAGAGAAGGACTATCCTAACCTACAGTTTCGAAGCTTTTATGTCTAAAGA	22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222	AAAAAAAAAAAATTAGCCAGGTATGGCGGCTCACACCTGCGGTCCCAGCTACTTGGGAGACTAAGGTGGGAGGATCACCTGAGCCTGGGAGGTCGAGGCTGCAGTGAGCTGTGATTGTGCCACTG	22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222
                            r	TGCCTGCTCTAGGGTTGAGTAAGCGCAGAAAACTCCTAGCTCACCCTCCATCCTCTGCTGCATTTATTGGGGTGGAGTGGGGAACAGGGAGTTGGACCTTGATAAACTGGGACAGCTGGGCTGAG	22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222	AGTAACCACCTCCTCTCCTTCAACACATCCTTACCTCCCTCCCACCCCAGGTGCCATGGAGAGGTGGGAGGGAGGCAGTGGGCCAGGCAGGGAGATCGATGGCATTCGTGGCCTCTGGCCCAGGG	22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222
                            r	GATAGAATGAAATCGAAAAAAATCAGTGAAAGGAATCTAATGGAATCATCATCGAATGGAATCGAATGGAATCATCATCGAATAGAATCGAATGGAATCCTCAAAAGTAATTGAATGGAAAAAAC	22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222	ATTCGAGGATTCCATTCGTTTCCACTCGATGTTTATTCCATTCGATTAACTTTGACGATTCCATTCAATTCATTCGTTGATGATTCCTTTCGATTCCATTTGATGATGATTGCATTAGGTATCAT	22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222
                            r	GACATCTATCTTCTAGATGACCCCCTGTCTTCAGTGGATGCTCATGTACGAAAACGTATTTTTAATAAGGTCTTGGGCCCCAATGGCCTGTTCAAAGGCAAGGTGAGAAATCATTGACCATGATG	22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222	GTCTTAGAAATGTCTTCAGATTCTTAGCAAGCTCTCCTTTTTTGGCCAGGAGAGCACTGTAGGATCCTTTCTCTACAATTGTTCCATTCCCCAGAACTACAATCTCATCCACTTGAGGAAGAAAG	22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222
                            Align.pl: tail -4 .tmp.11427:
                            Align.pl: r	AGAGTGGAGAGTTCTCCTCTCCGTGCTTAAAAACCCCTGAGACTTCAAGAATACTCAAAATAGTACAGATCAAAAGCCCTAAAAATGCATGTACTCCCAGAACAACATAAGCAACTTTAAGAAAT	22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222	GGTTGTCTCTAATTTTTCTTCTGTTGCAAACAGGGGGGCAAAGAATAAACTCATACATGTCATTGCCTACAAGTACAGATGTATTGCTAAGATAAATTCCTAGAAGTGGAATTGTTGAGTCAAAG	22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222
                            r	GAGAAAAGGTCAAATGGAAATGTTAGAAATAAAAAAAACATGATATCAAGCATAAAGGATTCTATTAATGAGTTCATCCATAACTTTGGCACAGTTGAAGAAGAATCAGTAAAACTGAAGATAGG	22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222	TTATGATGAGATATCTGCCATGATTCTTATTTTTTTCCTTTGTATATTATGTATCTTTTTCTGTCTGCTGTCCAGATTTTTTCTTTGGGTTTTAGAAGTTTGCTGTGATGTGTCTAGGGGTGTGT	22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222
                            r	TGGCAAGCTGGAGACCCAGGTGAACTGATGGTGTAGTTCCATACGGAATGCCGGCAGGCTTGAGACCCAGAAAGAGCTGATGTTTAGTCTGAGGCTGAAGGCAGGAAGGAACTGATGTCCCAGCT	22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222	GGCAGCAATTAACTCAAGTAAATGCTCTCCTCTGAAGCCTGAAGACCAGAGTTCTGTAGGGTTGATTGCAGAAAGAATCATCAGCTTTTGTTGCAAAACTACTTGGCCGAAAAGTTGGCCTTCTG	22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222
                            r	TATTAAAGCCAGGTGGAGAATAAAACCTGCCTACATTAATTCTATCACCTTCCCTAATTCCTAATTGCCATTTAACCATGGGAAGCCATAACTACCAAAAAGCGGGGCAGAGAAAGCAGAAGATA	22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222	GATAGGAATTGTTAAATTATTTATATAAGTCAAATGAAGCTTTGCAGTCCTGTACTAAAACACTATTTAGTGGGAATAGAATGTAAGAAGCTCTAGAAAATCAATTTGCCACAGTACTCTTATTT	22222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222
                            Align.pl:   ...ensuring reference jar is installed first
                            Get.pm:ensureFetched: called on "S3N://crossbow-refs/hg18.jar"
                            Get.pm:ensureFetched: base name "hg18.jar"
                            ls -al /mnt/15049/*hg18.jar* /mnt/15049/.*hg18.jar*
                            -rw-r--r-- 1 hadoop hadoop          0 Oct  8 00:02 /mnt/15049/.hg18.jar.done
                            -rw-r--r-- 1 hadoop hadoop          0 Oct  7 23:55 /mnt/15049/.hg18.jar.lock
                            -rw-r--r-- 1 hadoop hadoop 3896493171 Oct  7 23:57 /mnt/15049/hg18.jar
                            Pid 11427: Checking for done file /mnt/15049/.hg18.jar.done
                            Pid 11427: done file /mnt/15049/.hg18.jar.done was there already; continuing
                            Align.pl: Running: ./bowtie --partition 1000000 --mm -t --hadoopout --startverbose -m 1 --12 .tmp.11427 /mnt/15049/index/index 2>.tmp.Align.pl.11427.err
                            ./bowtie --partition 1000000 --mm -t --hadoopout --startverbose -m 1 --12 .tmp.11427 /mnt/15049/index/index 2>.tmp.Align.pl.11427.err
                            java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 143
                            	at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:372)
                            	at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:582)
                            	at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:135)
                            	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
                            	at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
                            	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:441)
                            	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:377)
                            	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
                            	at java.security.AccessController.doPrivileged(Native Method)
                            	at javax.security.auth.Subject.doAs(Subject.java:396)
                            	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
                            	at org.apache.hadoop.mapred.Child.main(Child.java:249)
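                            For context on the trace above: exit code 143 from a streaming subprocess is 128 + 15, meaning the bowtie process was killed with SIGTERM (commonly by Hadoop's task timeout or the instance running low on memory) rather than bowtie itself crashing. The cause named here is a guess from the exit code alone; a quick shell demonstration of the 128 + signal convention:

                            Code:
                            ```shell
                            # Background a long-running process, terminate it with SIGTERM,
                            # and observe that its reported exit status is 128 + 15 = 143.
                            sleep 100 &
                            kill -TERM $!
                            wait $!
                            echo "exit code: $?"   # prints: exit code: 143
                            ```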
                            syslog
                            Code:
                            2013-10-08 00:12:31,100 WARN org.apache.hadoop.conf.Configuration (main): DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
                            2013-10-08 00:12:31,779 INFO org.apache.hadoop.util.NativeCodeLoader (main): Loaded the native-hadoop library
                            2013-10-08 00:12:31,842 INFO org.apache.hadoop.mapred.TaskRunner (main): Creating symlink: /mnt2/var/lib/hadoop/mapred/taskTracker/distcache/8717311255330235240_-1327012900_303598920/crossbow-emr/1.2.1/bowtie64 <- /mnt3/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201310072347_0002/attempt_201310072347_0002_m_000001_2/work/bowtie
                            2013-10-08 00:12:31,848 INFO org.apache.hadoop.mapred.TaskRunner (main): Creating symlink: /mnt2/var/lib/hadoop/mapred/taskTracker/distcache/1742673893144693291_-703943522_294499920/crossbow-emr/1.2.1/Get.pm <- /mnt3/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201310072347_0002/attempt_201310072347_0002_m_000001_2/work/Get.pm
                            2013-10-08 00:12:31,852 INFO org.apache.hadoop.mapred.TaskRunner (main): Creating symlink: /mnt3/var/lib/hadoop/mapred/taskTracker/distcache/3768974508281659116_-1504224494_294492920/crossbow-emr/1.2.1/Counters.pm <- /mnt3/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201310072347_0002/attempt_201310072347_0002_m_000001_2/work/Counters.pm
                            2013-10-08 00:12:31,856 INFO org.apache.hadoop.mapred.TaskRunner (main): Creating symlink: /mnt/var/lib/hadoop/mapred/taskTracker/distcache/-6230906815329997085_-1944427886_294502920/crossbow-emr/1.2.1/Util.pm <- /mnt3/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201310072347_0002/attempt_201310072347_0002_m_000001_2/work/Util.pm
                            2013-10-08 00:12:31,859 INFO org.apache.hadoop.mapred.TaskRunner (main): Creating symlink: /mnt1/var/lib/hadoop/mapred/taskTracker/distcache/2561649607146715239_-2000054114_294501920/crossbow-emr/1.2.1/Tools.pm <- /mnt3/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201310072347_0002/attempt_201310072347_0002_m_000001_2/work/Tools.pm
                            2013-10-08 00:12:31,863 INFO org.apache.hadoop.mapred.TaskRunner (main): Creating symlink: /mnt2/var/lib/hadoop/mapred/taskTracker/distcache/-4647442751582941863_-1803196130_294487920/crossbow-emr/1.2.1/AWS.pm <- /mnt3/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201310072347_0002/attempt_201310072347_0002_m_000001_2/work/AWS.pm
                            2013-10-08 00:12:31,867 INFO org.apache.hadoop.mapred.TaskRunner (main): Creating symlink: /mnt3/var/lib/hadoop/mapred/taskTracker/distcache/-949217784113535430_-79486178_294487920/crossbow-emr/1.2.1/Align.pl <- /mnt3/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201310072347_0002/attempt_201310072347_0002_m_000001_2/work/Align.pl
                            2013-10-08 00:12:31,874 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager (main): Creating symlink: /mnt/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201310072347_0002/jars/job.jar <- /mnt3/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201310072347_0002/attempt_201310072347_0002_m_000001_2/work/job.jar
                            2013-10-08 00:12:31,878 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager (main): Creating symlink: /mnt/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201310072347_0002/jars/.job.jar.crc <- /mnt3/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201310072347_0002/attempt_201310072347_0002_m_000001_2/work/.job.jar.crc
                            2013-10-08 00:12:31,881 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager (main): Creating symlink: /mnt/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201310072347_0002/jars/META-INF <- /mnt3/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201310072347_0002/attempt_201310072347_0002_m_000001_2/work/META-INF
                            2013-10-08 00:12:31,885 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager (main): Creating symlink: /mnt/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201310072347_0002/jars/org <- /mnt3/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201310072347_0002/attempt_201310072347_0002_m_000001_2/work/org
                            2013-10-08 00:12:32,166 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl (main): Source name ugi already exists!
                            2013-10-08 00:12:32,264 INFO org.apache.hadoop.mapred.MapTask (main): Host name: ip-10-181-4-225.ec2.internal
                            2013-10-08 00:12:32,282 INFO org.apache.hadoop.util.ProcessTree (main): setsid exited with exit code 0
                            2013-10-08 00:12:32,290 INFO org.apache.hadoop.mapred.Task (main):  Using ResourceCalculatorPlugin : [email protected]
                            2013-10-08 00:12:32,418 INFO com.hadoop.compression.lzo.GPLNativeCodeLoader (main): Loaded native gpl library
                            2013-10-08 00:12:32,431 WARN com.hadoop.compression.lzo.LzoCodec (main): Could not find build properties file with revision hash
                            2013-10-08 00:12:32,431 INFO com.hadoop.compression.lzo.LzoCodec (main): Successfully loaded & initialized native-lzo library [hadoop-lzo rev UNKNOWN]
                            2013-10-08 00:12:32,440 WARN org.apache.hadoop.io.compress.snappy.LoadSnappy (main): Snappy native library is available
                            2013-10-08 00:12:32,441 INFO org.apache.hadoop.io.compress.snappy.LoadSnappy (main): Snappy native library loaded
                            2013-10-08 00:12:32,450 INFO org.apache.hadoop.io.compress.zlib.ZlibFactory (main): Successfully loaded & initialized native-zlib library
                            2013-10-08 00:12:32,451 INFO org.apache.hadoop.mapred.MapTask (main): numReduceTasks: 0
                            2013-10-08 00:12:32,537 INFO org.apache.hadoop.streaming.PipeMapRed (main): PipeMapRed exec [/mnt3/var/lib/hadoop/mapred/taskTracker/hadoop/jobcache/job_201310072347_0002/attempt_201310072347_0002_m_000001_2/work/./Align.pl, --discard-reads=0, --ref=S3N://crossbow-refs/hg18.jar, --destdir=/mnt/15049, --partlen=1000000, --qual=phred33, --truncate=0, --, --partition, 1000000, --mm, -t, --hadoopout, --startverbose, -m, 1]
                            2013-10-08 00:12:32,631 INFO org.apache.hadoop.streaming.PipeMapRed (main): R/W/S=1/0/0 in:NA [rec/s] out:NA [rec/s]
                            2013-10-08 00:12:32,632 INFO org.apache.hadoop.streaming.PipeMapRed (main): R/W/S=10/0/0 in:NA [rec/s] out:NA [rec/s]
                            2013-10-08 00:12:32,637 INFO org.apache.hadoop.streaming.PipeMapRed (main): R/W/S=100/0/0 in:NA [rec/s] out:NA [rec/s]
                            2013-10-08 00:12:33,169 INFO org.apache.hadoop.streaming.PipeMapRed (main): R/W/S=1000/0/0 in:NA [rec/s] out:NA [rec/s]
                            2013-10-08 00:12:36,782 INFO org.apache.hadoop.streaming.PipeMapRed (main): R/W/S=10000/0/0 in:2500=10000/4 [rec/s] out:0=0/4 [rec/s]
                            2013-10-08 00:13:13,024 INFO org.apache.hadoop.streaming.PipeMapRed (main): R/W/S=100000/0/0 in:2500=100000/40 [rec/s] out:0=0/40 [rec/s]
                            2013-10-08 00:13:52,968 INFO org.apache.hadoop.streaming.PipeMapRed (main): R/W/S=200000/0/0 in:2500=200000/80 [rec/s] out:0=0/80 [rec/s]
                            2013-10-08 00:14:32,916 INFO org.apache.hadoop.streaming.PipeMapRed (main): R/W/S=300000/0/0 in:2500=300000/120 [rec/s] out:0=0/120 [rec/s]
                            2013-10-08 00:15:12,966 INFO org.apache.hadoop.streaming.PipeMapRed (main): R/W/S=400000/0/0 in:2500=400000/160 [rec/s] out:0=0/160 [rec/s]
                            2013-10-08 00:15:52,985 INFO org.apache.hadoop.streaming.PipeMapRed (main): R/W/S=500000/0/0 in:2500=500000/200 [rec/s] out:0=0/200 [rec/s]
                            2013-10-08 00:25:58,776 INFO org.apache.hadoop.streaming.PipeMapRed (Thread-14): MRErrorThread done
                            2013-10-08 00:25:58,777 INFO org.apache.hadoop.streaming.PipeMapRed (main): PipeMapRed failed!
                            Looking forward to a reply.

                            Thanks

                            Comment


                            • #29
                              Quick update to previous post

                              Hello,

                              I would like to update the previous post. I managed to complete 3 steps of Crossbow via the EMR command line, but I am getting an error in the final step, 'Get Counters'.

                              I hope someone can help me out with this.

                              controller
                              Code:
                              2013-10-08T03:39:40.661Z INFO Fetching jar file.
                              2013-10-08T03:39:42.169Z INFO Working dir /mnt/var/lib/hadoop/steps/5
                              2013-10-08T03:39:42.169Z INFO Executing /usr/lib/jvm/java-6-sun/bin/java -cp /home/hadoop/conf:/usr/lib/jvm/java-6-sun/lib/tools.jar:/home/hadoop:/home/hadoop/hadoop-tools.jar:/home/hadoop/hadoop-core.jar:/home/hadoop/hadoop-core-0.20.205.jar:/home/hadoop/hadoop-tools-0.20.205.jar:/home/hadoop/lib/*:/home/hadoop/lib/jetty-ext/* -Xmx1000m -Dhadoop.log.dir=/mnt/var/log/hadoop/steps/5 -Dhadoop.log.file=syslog -Dhadoop.home.dir=/home/hadoop -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,DRFA -Djava.io.tmpdir=/mnt/var/lib/hadoop/steps/5/tmp -Djava.library.path=/home/hadoop/native/Linux-amd64-64 org.apache.hadoop.util.RunJar /home/hadoop/contrib/streaming/hadoop-streaming-0.20.205.jar -D mapred.reduce.tasks=1 -input s3n://crossbow-emr/dummy-input -output s3n://ashwin-test/crossbow-emr-cli_crossbow_counters/ignoreme1 -mapper cat -reducer s3n://crossbow-emr/1.2.1/Counters.pl  --output=S3N://ashwin-test/crossbow-emr-cli_crossbow_counters -cacheFile s3n://crossbow-emr/1.2.1/Get.pm#Get.pm -cacheFile s3n://crossbow-emr/1.2.1/Counters.pm#Counters.pm -cacheFile s3n://crossbow-emr/1.2.1/Util.pm#Util.pm -cacheFile s3n://crossbow-emr/1.2.1/Tools.pm#Tools.pm -cacheFile s3n://crossbow-emr/1.2.1/AWS.pm#AWS.pm
                              2013-10-08T03:39:45.175Z INFO Execution ended with ret val 1
                              2013-10-08T03:39:45.176Z WARN Step failed with bad retval
                              2013-10-08T03:39:46.681Z INFO Step created jobs:
                              syslog

                              Code:
                              2013-10-08 03:39:42,458 WARN org.apache.hadoop.conf.Configuration (main): DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
                              2013-10-08 03:39:43,393 INFO org.apache.hadoop.mapred.JobClient (main): Default number of map tasks: null
                              2013-10-08 03:39:43,393 INFO org.apache.hadoop.mapred.JobClient (main): Setting default number of map tasks based on cluster size to : 56
                              2013-10-08 03:39:43,393 INFO org.apache.hadoop.mapred.JobClient (main): Default number of reduce tasks: 1
                              2013-10-08 03:39:44,940 INFO com.hadoop.compression.lzo.GPLNativeCodeLoader (main): Loaded native gpl library
                              2013-10-08 03:39:44,943 WARN com.hadoop.compression.lzo.LzoCodec (main): Could not find build properties file with revision hash
                              2013-10-08 03:39:44,943 INFO com.hadoop.compression.lzo.LzoCodec (main): Successfully loaded & initialized native-lzo library [hadoop-lzo rev UNKNOWN]
                              2013-10-08 03:39:44,950 WARN org.apache.hadoop.io.compress.snappy.LoadSnappy (main): Snappy native library is available
                              2013-10-08 03:39:44,951 INFO org.apache.hadoop.io.compress.snappy.LoadSnappy (main): Snappy native library loaded
                              2013-10-08 03:39:45,047 INFO org.apache.hadoop.mapred.JobClient (main): Cleaning up the staging area hdfs://10.159.25.174:9000/mnt/var/lib/hadoop/tmp/mapred/staging/hadoop/.staging/job_201310080306_0004
                              stderr
                              Code:
                              Exception in thread "main" Status Code: 403, AWS Request ID: 2977B25629DD5007, AWS Error Code: null, AWS Error Message: Forbidden, S3 Extended Request ID: OcPQrMLKUHBKHfdh4ICR5BgEWNzDtUEzc8H2km55h0nCL92RKph4rFXSCEY9y6vq
                              	at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:544)
                              	at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:284)
                              	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:169)
                              	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:2619)
                              	at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:708)
                              	at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:688)
                              	at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:100)
                              	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
                              	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
                              	at java.lang.reflect.Method.invoke(Method.java:597)
                              	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
                              	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
                              	at org.apache.hadoop.fs.s3native.$Proxy3.retrieveMetadata(Unknown Source)
                              	at org.apache.hadoop.fs.s3native.NativeS3FileSystem.listStatus(NativeS3FileSystem.java:730)
                              	at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:783)
                              	at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:808)
                              	at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:185)
                              	at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:208)
                              	at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1026)
                              	at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1018)
                              	at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:172)
                              	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:934)
                              	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:887)
                              	at java.security.AccessController.doPrivileged(Native Method)
                              	at javax.security.auth.Subject.doAs(Subject.java:396)
                              	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
                              	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:887)
                              	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:861)
                              	at org.apache.hadoop.streaming.StreamJob.submitAndMonitorJob(StreamJob.java:1010)
                              	at org.apache.hadoop.streaming.StreamJob.run(StreamJob.java:127)
                              	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
                              	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
                              	at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:50)
                              	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                              	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
                              	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
                              	at java.lang.reflect.Method.invoke(Method.java:597)
                              	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
                              stdout

                              Code:
                              packageJobJar: [/mnt/var/lib/hadoop/tmp/hadoop-unjar9002137556695792672/] [] /mnt/var/lib/hadoop/steps/5/tmp/streamjob4081705531014015666.jar tmpDir=null
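For context, the 403 Forbidden above is thrown while Hadoop's s3n filesystem tries to list the input path (`NativeS3FileSystem.listStatus`), which usually points to missing or mismatched AWS credentials rather than a Crossbow bug. A minimal sketch of the usual fix, assuming the job reads from an s3n:// URL, is to declare the keys in Hadoop's core-site.xml (the property names are the standard Hadoop s3n ones; the values shown are placeholders):

Code:
```xml
<!-- core-site.xml: AWS credentials for the s3n:// filesystem -->
<configuration>
  <property>
    <name>fs.s3n.awsAccessKeyId</name>
    <value>YOUR_ACCESS_KEY_ID</value>
  </property>
  <property>
    <name>fs.s3n.awsSecretAccessKey</name>
    <value>YOUR_SECRET_ACCESS_KEY</value>
  </property>
</configuration>
```

If the credentials are correct, the same 403 can also come from an IAM policy or bucket policy that denies s3:ListBucket / s3:GetObject on the bucket in question, so that is worth checking as well.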
                              Thanks
