Cannot start training on Amazon SageMaker

Posted 2024-04-24 20:55:25


import tensorflow as tf
from multiprocessing import cpu_count

def input_csv_fn():
    # csv_file, parser_csv, input_parser_plain, epochs, and batch_size
    # are assumed to be defined in the enclosing scope
    dataset = (tf.data.TextLineDataset(csv_file)
               .skip(1)  # skip the CSV header row
               .shuffle(buffer_size=2000000)
               .map(parser_csv, num_parallel_calls=cpu_count()))
    dataset = dataset.map(input_parser_plain, num_parallel_calls=cpu_count())
    dataset = dataset.apply(tf.contrib.data.ignore_errors())
    dataset = dataset.repeat(epochs)
    dataset = dataset.batch(batch_size)
    dataset = dataset.prefetch(batch_size)
    iterator = dataset.make_one_shot_iterator()
    feats, labs = iterator.get_next()
    return feats, labs

def aggregate_csv_batches():
    features = []
    labels = []
    # build one input pipeline per device (GPUs if available, otherwise CPU)
    if num_gpus > 0:
        num_devices = num_gpus
    else:
        num_devices = 1
    for i in range(num_devices):
        _features, _labels = input_csv_fn()
        features.append(_features)
        labels.append(_labels)
    return features, labels

return aggregate_csv_batches  # presumably inside an enclosing factory function
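For context, the trailing `return aggregate_csv_batches` suggests the snippet sits inside a factory function that hands back the input function itself; with the TF 1.x Estimator API, the framework (not the user) later calls that function. A plain-Python sketch of this pattern (no TensorFlow required; the names `make_input_fn` and `aggregate_batches` are illustrative, not from the question):

```python
# Sketch of the input_fn factory pattern: the outer function builds and
# returns a callable, and the training framework invokes it later.

def make_input_fn(num_devices):
    def aggregate_batches():
        # In the real code, each call builds one pipeline per device.
        feats = ["features_%d" % i for i in range(num_devices)]
        labs = ["labels_%d" % i for i in range(num_devices)]
        return feats, labs
    return aggregate_batches  # return the callable itself, not its result

input_fn = make_input_fn(2)    # factory hands back a function
features, labels = input_fn()  # the framework is what normally calls it
```

The key point is that the factory must return the function object uncalled; the caller decides when to invoke it.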

The code above reads the dataset as CSV from an S3 bucket, but when I create a training job on AWS SageMaker with it, I keep hitting the following error:

TypeError: Failed to convert object of type <type 'function'> to Tensor. Contents: <function aggregate_csv_batches at 0x7f1559eeaaa0>. Consider casting elements to a supported type.


Tags: csv, to, input, data, size, labels, return, batch
1 Answer

User

#1 · Posted 2024-04-24 20:55:25

It's hard to pinpoint the problem without more details. Are you running this code in your training job with the SageMaker TensorFlow image? If so, have you looked at the documentation here? https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/README.rst

Judging from the error message alone, it looks like you are passing the function itself (`aggregate_csv_batches`) somewhere that expects a tensor.

If you can provide the complete code you are running and a description of how you run it, or better yet a minimal repro case, I can try to help further.
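To illustrate the diagnosis above in plain Python (hedged, since the calling code isn't shown in the question): handing the function object to something that expects its return value produces exactly this shape of error, and calling the function first resolves it. The helper `to_tensor` below is a stand-in for TensorFlow's tensor conversion, not a real API:

```python
# Minimal repro of the TypeError: a function object is passed where the
# function's *return value* is expected.

def aggregate_csv_batches():
    # Stand-in for the real pipeline: returns (features, labels).
    return [[1.0, 2.0]], [[0]]

def to_tensor(value):
    # Stand-in for TF's tensor conversion, which rejects function objects.
    if callable(value):
        raise TypeError("Failed to convert object of type %r to Tensor."
                        % type(value))
    return value

# to_tensor(aggregate_csv_batches)  # raises TypeError, as in the question
features, labels = to_tensor(aggregate_csv_batches())  # calling it works
```

Note the flip side: `tf.estimator`'s `train(input_fn=...)` expects the uncalled function, so whether the fix is to call the function or to pass it differently depends on exactly where it is being handed off.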
