从Python java.io.IOException访问Hadoop：管道已结束

2020-10-21 01:52:36,861 INFO impl.MetricsConfig: Loaded properties from hadoop-metrics2.properties 2020-10-21 01:52:37,002 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s). 2020-10-21 01:52:37,002 INFO impl.MetricsSystemImpl: JobTracker metrics system started 2020-10-21 01:52:37,028 WARN impl.MetricsSystemImpl: JobTracker metrics system already initialized! 2020-10-21 01:52:37,964 INFO mapred.FileInputFormat: Total input files to process : 1 2020-10-21 01:52:38,209 INFO mapreduce.JobSubmitter: number of splits:1 2020-10-21 01:52:38,386 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local105213688_0001 2020-10-21 01:52:38,387 INFO mapreduce.JobSubmitter: Executing with tokens: [] 2020-10-21 01:52:38,557 INFO mapreduce.Job: The url to track the job: http://localhost:8080/ 2020-10-21 01:52:38,561 INFO mapreduce.Job: Running job: job_local105213688_0001 2020-10-21 01:52:38,561 INFO mapred.LocalJobRunner: OutputCommitter set in config null 2020-10-21 01:52:38,566 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapred.FileOutputCommitter 2020-10-21 01:52:38,579 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 2 2020-10-21 01:52:38,579 INFO output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false 2020-10-21 01:52:38,689 INFO mapred.LocalJobRunner: Waiting for map tasks 2020-10-21 01:52:38,694 INFO mapred.LocalJobRunner: Starting task: attempt_local105213688_0001_m_000000_0 2020-10-21 01:52:38,732 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 2 2020-10-21 01:52:38,733 INFO output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false 2020-10-21 01:52:38,745 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux. 2020-10-21 01:52:38,801 INFO mapred.Task: Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@6b965617 2020-10-21 01:52:38,820 INFO mapred.MapTask: Processing split: hdfs://localhost:9000/inp/inputfile.txt:0+6488666 2020-10-21 01:52:38,883 INFO mapred.MapTask: numReduceTasks: 1 2020-10-21 01:52:39,001 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584) 2020-10-21 01:52:39,001 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100 2020-10-21 01:52:39,003 INFO mapred.MapTask: soft limit at 83886080 2020-10-21 01:52:39,004 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600 2020-10-21 01:52:39,005 INFO mapred.MapTask: kvstart = 26214396; length = 6553600 2020-10-21 01:52:39,012 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 2020-10-21 01:52:39,115 INFO streaming.PipeMapRed: PipeMapRed exec [D:\Wireshark\uninstall.exe, D:\ll\sem5\Cloud_Computing\assignment\lab6-7\lab7\mapper.py] 2020-10-21 01:52:39,127 INFO Configuration.deprecation: mapred.work.output.dir is deprecated. Instead, use mapreduce.task.output.dir 2020-10-21 01:52:39,129 INFO Configuration.deprecation: map.input.start is deprecated. Instead, use mapreduce.map.input.start 2020-10-21 01:52:39,130 INFO Configuration.deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap 2020-10-21 01:52:39,131 INFO Configuration.deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id 2020-10-21 01:52:39,133 INFO Configuration.deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id 2020-10-21 01:52:39,137 INFO Configuration.deprecation: mapred.local.dir is deprecated. Instead, use mapreduce.cluster.local.dir 2020-10-21 01:52:39,138 INFO Configuration.deprecation: map.input.file is deprecated. Instead, use mapreduce.map.input.file 2020-10-21 01:52:39,140 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords 2020-10-21 01:52:39,141 INFO Configuration.deprecation: map.input.length is deprecated. Instead, use mapreduce.map.input.length 2020-10-21 01:52:39,143 INFO Configuration.deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id 2020-10-21 01:52:39,144 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name 2020-10-21 01:52:39,147 INFO Configuration.deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition 2020-10-21 01:52:39,470 INFO streaming.PipeMapRed: R/W/S=1/0/0 in:NA [rec/s] out:NA [rec/s] 2020-10-21 01:52:39,471 INFO streaming.PipeMapRed: R/W/S=10/0/0 in:NA [rec/s] out:NA [rec/s] 2020-10-21 01:52:39,474 INFO streaming.PipeMapRed: R/W/S=100/0/0 in:NA [rec/s] out:NA [rec/s] 2020-10-21 01:52:39,481 INFO streaming.PipeMapRed: R/W/S=1000/0/0 in:NA [rec/s] out:NA [rec/s] 2020-10-21 01:52:39,584 INFO mapreduce.Job: Job job_local105213688_0001 running in uber mode : false 2020-10-21 01:52:39,586 INFO mapreduce.Job: map 0% reduce 0% 2020-10-21 01:52:40,423 INFO streaming.PipeMapRed: MRErrorThread done 2020-10-21 01:52:40,425 INFO streaming.PipeMapRed: R/W/S=1299/0/0 in:1299=1299/1 [rec/s] out:0=0/1 [rec/s] minRecWrittenToEnableSkip_=9223372036854775807 HOST=null USER=null HADOOP_USER=null last tool output: |null| java.io.IOException: The pipe has been ended at java.base/java.io.FileOutputStream.writeBytes(Native Method) at java.base/java.io.FileOutputStream.write(FileOutputStream.java:348) at java.base/java.io.BufferedOutputStream.write(BufferedOutputStream.java:123) at java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81) at java.base/java.io.BufferedOutputStream.write(BufferedOutputStream.java:127) at java.base/java.io.DataOutputStream.write(DataOutputStream.java:107) at org.apache.hadoop.streaming.io.TextInputWriter.writeUTF8(TextInputWriter.java:72) at org.apache.hadoop.streaming.io.TextInputWriter.writeValue(TextInputWriter.java:51) at org.apache.hadoop.streaming.PipeMapper.map(PipeMapper.java:106) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349) at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:835) 2020-10-21 01:52:40,431 WARN streaming.PipeMapRed: {} java.io.IOException: The pipe is being closed at java.base/java.io.FileOutputStream.writeBytes(Native Method) at java.base/java.io.FileOutputStream.write(FileOutputStream.java:348) at java.base/java.io.BufferedOutputStream.write(BufferedOutputStream.java:123) at java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81) at java.base/java.io.BufferedOutputStream.flush(BufferedOutputStream.java:142) at java.base/java.io.DataOutputStream.flush(DataOutputStream.java:123) at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:532) at org.apache.hadoop.streaming.PipeMapper.map(PipeMapper.java:120) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349) at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:835) 2020-10-21 01:52:40,433 INFO streaming.PipeMapRed: mapRedFinished 2020-10-21 01:52:40,435 WARN streaming.PipeMapRed: {} java.io.IOException: The pipe is being closed at java.base/java.io.FileOutputStream.writeBytes(Native Method) at java.base/java.io.FileOutputStream.write(FileOutputStream.java:348) at java.base/java.io.BufferedOutputStream.write(BufferedOutputStream.java:123) at java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81) at java.base/java.io.BufferedOutputStream.flush(BufferedOutputStream.java:142) at java.base/java.io.DataOutputStream.flush(DataOutputStream.java:123) at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:532) at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349) at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:835) 2020-10-21 01:52:40,438 INFO streaming.PipeMapRed: mapRedFinished 2020-10-21 01:52:40,445 INFO mapred.LocalJobRunner: map task executor complete. 2020-10-21 01:52:40,506 WARN mapred.LocalJobRunner: job_local105213688_0001 java.lang.Exception: java.io.IOException: The pipe has been ended at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552) Caused by: java.io.IOException: The pipe has been ended at java.base/java.io.FileOutputStream.writeBytes(Native Method) at java.base/java.io.FileOutputStream.write(FileOutputStream.java:348) at java.base/java.io.BufferedOutputStream.write(BufferedOutputStream.java:123) at java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81) at java.base/java.io.BufferedOutputStream.write(BufferedOutputStream.java:127) at java.base/java.io.DataOutputStream.write(DataOutputStream.java:107) at org.apache.hadoop.streaming.io.TextInputWriter.writeUTF8(TextInputWriter.java:72) at org.apache.hadoop.streaming.io.TextInputWriter.writeValue(TextInputWriter.java:51) at org.apache.hadoop.streaming.PipeMapper.map(PipeMapper.java:106) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349) at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:835) 2020-10-21 01:52:40,596 INFO mapreduce.Job: Job job_local105213688_0001 failed with state FAILED due to: NA 2020-10-21 01:52:40,625 INFO mapreduce.Job: Counters: 0 2020-10-21 01:52:40,625 ERROR streaming.StreamJob: Job not successful! Streaming Command Failed!

Over the years, there have been many releases of PowerShell. Initially, Windows PowerShell was built on the .NET Framework and only worked on Windows systems. With the current release, PowerShell uses .NET Core 3.1 as its runtime. PowerShell runs on Windows, macOS, and Linux platforms.

import sys cur_word= None cur_cnt=0 for line in sys.stdin: line=line.strip().split(',') # print(line) word, count= line[0], (int)(line[1]) if cur_word==None: cur_word=word cur_cnt=count elif cur_word==word: cur_cnt+=count else: print(cur_word,',', cur_cnt) cur_word=word cur_cnt=count print(cur_word,",",count)

1条回答

网友

1楼 · 发布于 2024-04-24 05:40:19

您的映射程序似乎不完整：缺少print word

我做了什么？

输入文件

我的Python代码

相关问题更多 >

编程相关推荐

热门问题

热门文章