What is `target` for in `ClassificationDataSet`?

Question

I have been trying to figure out what the `target` parameter of `ClassificationDataSet` is actually for, but I still don't understand it.

What I tried

>>> from pybrain.datasets import ClassificationDataSet
>>> help(ClassificationDataSet)
Help on class ClassificationDataSet in module pybrain.datasets.classification:

class ClassificationDataSet(pybrain.datasets.supervised.SupervisedDataSet)
 |  Specialized data set for classification data. Classes are to be numbered from 0 to nb_classes-1.
 |  
 |  Method resolution order:
 |      ClassificationDataSet
 |      pybrain.datasets.supervised.SupervisedDataSet
 |      pybrain.datasets.dataset.DataSet
 |      pybrain.utilities.Serializable
 |      __builtin__.object
 |  
 |  Methods defined here:
 |  
 |  __add__(self, other)
 |      Adds the patterns of two datasets, if dimensions and type match.
 |  
 |  __init__(self, inp, target=1, nb_classes=0, class_labels=None)
 |      Initialize an empty dataset. 
 |      
 |      `inp` is used to specify the dimensionality of the input. While the 
 |      number of targets is given by implicitly by the training samples, it can
 |      also be set explicity by `nb_classes`. To give the classes names, supply
 |      an iterable of strings as `class_labels`.
 |  
 |  __reduce__(self)

The help text says nothing about `target` (other than that its default value is 1), so I looked at the source code of `ClassificationDataSet`:

class ClassificationDataSet(SupervisedDataSet):
    """ Specialized data set for classification data. Classes are to be numbered from 0 to nb_classes-1. """

    def __init__(self, inp, target=1, nb_classes=0, class_labels=None):
        """Initialize an empty dataset.

        `inp` is used to specify the dimensionality of the input. While the
        number of targets is given by implicitly by the training samples, it can
        also be set explicity by `nb_classes`. To give the classes names, supply
        an iterable of strings as `class_labels`."""
        # FIXME: hard to keep nClasses synchronized if appendLinked() etc. is used.
        SupervisedDataSet.__init__(self, inp, target)
        self.addField('class', 1)
        self.nClasses = nb_classes
        if len(self) > 0:
            # calculate class histogram, if we already have data
            self.calculateStatistics()
        self.convertField('target', int)
        if class_labels is None:
            self.class_labels = list(set(self.getField('target').flatten()))
        else:
            self.class_labels = class_labels
        # copy classes (may be changed into other representation)
        self.setField('class', self.getField('target'))

That still didn't make it clear, so I also looked at the source code of `SupervisedDataSet`:

class SupervisedDataSet(DataSet):
    """SupervisedDataSets have two fields, one for input and one for the target.
    """

    def __init__(self, inp, target):
        """Initialize an empty supervised dataset.

        Pass `inp` and `target` to specify the dimensions of the input and
        target vectors."""
        DataSet.__init__(self)
        if isscalar(inp):
            # add input and target fields and link them
            self.addField('input', inp)
            self.addField('target', target)
        else:
            self.setField('input', inp)
            self.setField('target', target)

        self.linkFields(['input', 'target'])

        # reset the index marker
        self.index = 0

        # the input and target dimensions
        self.indim = self.getDimension('input')
        self.outdim = self.getDimension('target')

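It looks like this parameter sets the dimension of the output (the `target` field). But in that case, shouldn't `target` be equal to `nb_classes`?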
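To make the two quantities I am comparing concrete, here is a minimal sketch in plain Python. It is a stand-in I wrote for illustration, not PyBrain's implementation, and the helper names (`make_samples`, `target_dimension`, `count_classes`, `one_hot`) are hypothetical. It assumes that each sample stores its class as a single integer index, which is how I read the default `target=1` together with the "Classes are to be numbered from 0 to nb_classes-1" docstring.

```python
# Stand-in illustration only; none of these helpers exist in PyBrain.

def make_samples():
    """Three samples: 2 input features each, and a 1-element target holding the class index."""
    return [
        ([0.1, 0.2], [0]),
        ([0.4, 0.9], [2]),
        ([0.7, 0.3], [1]),
    ]

def target_dimension(samples):
    """Width of one target vector (my reading of what `target` describes)."""
    return len(samples[0][1])

def count_classes(samples):
    """Number of distinct class indices (my reading of what `nb_classes` describes)."""
    return len({target[0] for _, target in samples})

def one_hot(samples, nb_classes):
    """Re-encode each class index as a vector of length nb_classes."""
    encoded = []
    for features, target in samples:
        row = [0] * nb_classes
        row[target[0]] = 1
        encoded.append((features, row))
    return encoded

samples = make_samples()
print(target_dimension(samples))   # width of the stored target vector
print(count_classes(samples))      # number of distinct classes
print(one_hot(samples, 3)[1][1])   # class index 2 as a length-3 vector
```

Under that assumption the target width and the number of classes only coincide after a one-hot style re-encoding, which may be why the two parameters are separate. I would still like confirmation that this is the intended meaning.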
