<p>Okay, I was stuck on this same problem for two days. The solution Joe gave in his <a href="https://stackoverflow.com/questions/4460522/hadoop-streaming-job-failed-error-in-python">other post</a> worked for me.</p>
<p>To solve your problem, I suggest:</p>
<p>1) Blindly, and only blindly, follow the instructions for setting up a single-node cluster <a href="http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/" rel="nofollow noreferrer">here</a> (I assume you have already done this).</p>
<p>2) If you run into a java.io.IOException: Incompatible namespaceIDs error anywhere (you will find it if you check the logs), have a look <a href="http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/#java-io-ioexception-incompatible-namespaceids" rel="nofollow noreferrer">here</a>.</p>
<p>3) Remove all the double quotes from your command. For your example, run:</p>
<pre><code>./bin/hadoop jar contrib/streaming/hadoop-0.20.2-streaming.jar \
-input "p1input/*" \
-output p1output \
-mapper p1mapper.py \
-reducer p1reducer.py \
-file /Users/Tish/Desktop/HW1/p1mapper.py \
-file /Users/Tish/Desktop/HW1/p1reducer.py
</code></pre>
<p>It's ridiculous, but I was stuck on this for two whole days.</p>