MRJob: Displaying intermediate values in MapReduce

Asked 2025-04-17 13:31

When running a MapReduce program with the Python MRJob library, how can I display intermediate values in the terminal, for example print a variable or a list?

1 Answer

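You can use sys.stderr.write() to send debugging output to standard error, which keeps it separate from the key/value pairs the job emits. Here is an example: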

import sys

from mrjob.job import MRJob


class MRWordCounter(MRJob):
    def mapper(self, key, line):
        # Debug output goes to stderr so it does not mix with the job's results.
        sys.stderr.write("MAPPER INPUT: ({0},{1})\n".format(key, line))
        for word in line.split():
            yield word, 1

    def reducer(self, word, occurrences):
        # occurrences can only be iterated once, so copy it into a list
        # before both printing and summing it.
        occurrences_list = list(occurrences)
        sys.stderr.write("REDUCER INPUT: ({0},{1})\n".format(word, occurrences_list))
        yield word, sum(occurrences_list)


if __name__ == '__main__':
    MRWordCounter.run()
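
If you print intermediate values in many places, it may help to wrap the stderr write in a small helper. The following is only a sketch: debug and count_words are hypothetical names made up for illustration, not part of mrjob, and the example deliberately does not import mrjob so that it runs on its own. count_words stands in for the body of a mapper.

```python
import sys


def debug(label, *values):
    """Write one labelled debug line to stderr, leaving stdout untouched."""
    text = ", ".join(repr(v) for v in values)
    print("{0}: ({1})".format(label, text), file=sys.stderr)


def count_words(line):
    """Stand-in for a mapper body: log the input, then yield (word, 1) pairs."""
    debug("MAPPER INPUT", None, line)
    for word in line.split():
        yield word, 1


if __name__ == "__main__":
    # Collect the pairs so they can be logged as a single list.
    pairs = list(count_words("a b a"))
    debug("MAPPER OUTPUT", pairs)
```

Inside a real job you would call debug(...) from mapper or reducer in the same way as sys.stderr.write() above. When running locally, the messages should appear in the terminal; on a Hadoop cluster, stderr output generally ends up in the task logs instead.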
