Java: Hadoop Mapper parameters explained


I'm new to Hadoop and confused about the Mapper parameters.

Take the well-known word count example:

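import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
  // Reused across map() calls to avoid allocating new objects per token.
  private Text outputKey;
  private IntWritable outputVal;

  @Override
  public void setup(Context context) {
    outputKey = new Text();
    outputVal = new IntWritable(1);
  }

  @Override
  public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
    // value holds one input record; split it into whitespace-separated tokens.
    StringTokenizer stk = new StringTokenizer(value.toString());
    while(stk.hasMoreTokens()) {
      outputKey.set(stk.nextToken());
      // Emit (token, 1); key is not used here.
      context.write(outputKey, outputVal);
    }
  }
}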

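Look at the map function: its parameters are Object key, Text value, and Context context. I'm confused about what Object key actually looks like (as you can see, key is never used inside the map function).

The input file looks like this: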

Deer
Beer
Bear
Beer
Deer
Deer
Bear
...

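I understand that each value is one line of the file (Deer, Beer, and so on), since the input is processed line by line.

But what does the key look like? And how do I decide which data type the key should be?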
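
For reference, here is a rough sketch of what those keys would look like, assuming the job uses Hadoop's default TextInputFormat (the question does not say which input format is configured). In that case each key is a LongWritable holding the byte offset at which the line starts, and each value is the line itself. The code below is plain Java with no Hadoop dependency; LineOffsetSketch, Record and readRecords are hypothetical names made up for this illustration, and it only imitates the pairs that map() would receive.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Plain-Java simulation (not Hadoop code) of the (key, value) pairs a
// line-oriented input format hands to map(): key = byte offset, value = line.
class LineOffsetSketch {

    // Hypothetical holder for one input record; not a Hadoop class.
    static final class Record {
        final long offset;   // plays the role of the LongWritable key
        final String line;   // plays the role of the Text value

        Record(long offset, String line) {
            this.offset = offset;
            this.line = line;
        }
    }

    // Splits the contents on '\n' and notes the byte offset where each line starts.
    static List<Record> readRecords(String contents) {
        List<Record> records = new ArrayList<>();
        long offset = 0;
        for (String line : contents.split("\n")) {
            records.add(new Record(offset, line));
            // +1 accounts for the newline byte that terminates the line.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        // Prints one "offset<TAB>line" pair per input line.
        for (Record r : readRecords("Deer\nBeer\nBear\nBeer\n")) {
            System.out.println(r.offset + "\t" + r.line);
        }
    }
}
```

Under that assumption the mapper could equally be declared as Mapper<LongWritable, Text, Text, IntWritable>; Object compiles only because the key is never read. The key type is fixed by the job's InputFormat rather than chosen freely in the mapper, so a different input format (KeyValueTextInputFormat, for instance, produces Text keys) would call for a different key type.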


1 answer

# Answer 1