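MRJob: Displaying intermediate values in MapReduce
When running a MapReduce program with Python's MRJob library, how can I display intermediate values in the terminal, for example by printing a variable or a list?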
1 Answer
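You can write the values to standard error with sys.stderr.write(). Standard output carries the job's key/value results, so debug messages should go to stderr instead. Here is an example: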
from mrjob.job import MRJob
import sys


class MRWordCounter(MRJob):
    def mapper(self, key, line):
        # Debug output goes to stderr; stdout is reserved for the job's output.
        sys.stderr.write("MAPPER INPUT: ({0},{1})\n".format(key, line))
        for word in line.split():
            yield word, 1

    def reducer(self, word, occurrences):
        # occurrences is a one-shot iterator, so copy it into a list
        # before both printing it and summing it.
        occurrences_list = list(occurrences)
        sys.stderr.write("REDUCER INPUT: ({0},{1})\n".format(word, occurrences_list))
        yield word, sum(occurrences_list)


if __name__ == '__main__':
    MRWordCounter.run()
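
If you print many values, it may be tidier to wrap the stderr call in a small helper. The following is only a sketch and does not use mrjob at all: `debug` and `map_line` are hypothetical names introduced for illustration, and `map_line` is a plain generator standing in for the body of a mapper so the idea can be tried without installing anything.

```python
import sys


def debug(label, *values):
    """Write one labelled debug line to stderr, leaving stdout untouched."""
    # sys.stderr is looked up at call time, so redirecting stderr still works.
    print("{0}: {1}".format(label, values), file=sys.stderr)


def map_line(key, line):
    """Hypothetical stand-in for a mapper body: log the input, then emit (word, 1)."""
    debug("MAPPER INPUT", key, line)
    for word in line.split():
        yield word, 1
```

Because the debug lines and the results travel on different streams, you can separate them from the shell when running a job script, for example by appending `2> debug.log` to the command to send the debug lines to a file.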