Why is the numpy shape empty?

Asked 2025-04-18 12:04

I have the following:

(Pdb) training
array(<418326x223957 sparse matrix of type '<type 'numpy.float64'>'
    with 165657096 stored elements in Compressed Sparse Row format>, dtype=object)
(Pdb) training.shape
()

Why is there no shape information?

Edit: here is what I did:

training, target, test, projectids = generate_features(outcomes, projects, resources)
target = np.array([1. if i == 't' else 0. for i in target])
projectids = np.array([i for i in projectids])

print 'vectorizing training features'
d = DictVectorizer(sparse=True)
training = d.fit_transform(training[:10].T.to_dict().values())
#test_data = d.fit_transform(training.T.to_dict().values())
test_data = d.transform(test[:10].T.to_dict().values())

print 'training shape: %s, %s' % (training.shape[0], training.shape[1])
print 'test shape: %s, %s' % (test_data.shape[0], test_data.shape[1])

print 'saving vectorized instances'
with open(filename, "wb") as f:
    np.save(f, training)
    np.save(f, test_data)
    np.save(f, target)
    np.save(f, projectids)

At this point, my training data still has shape (10, 121).

Later, I simply re-initialize the 4 variables by loading them back:

with open("../data/f1/training.dat", "rb") as f:
    training = np.load(f)
    test_data = np.load(f)
    target = np.load(f)
    projectids = np.load(f)

But now the shape information is gone.


1 Answer

There is some shape information in this display:

array(<418326x223957 sparse matrix of type '<type 'numpy.float64'>'
    with 165657096 stored elements in Compressed Sparse Row format>, dtype=object)

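This is an array with a single element and zero dimensions, which is why its shape is (). The array has dtype=object, and its one element is a sparse matrix whose own dimensions appear in the display as <418...x22....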
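To make the distinction concrete, here is a minimal sketch, assuming scipy is installed. The 3x3 identity matrix is only a small stand-in for the matrix in the question, and the names `sparse` and `wrapped` are made up for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Small stand-in for the 418326x223957 matrix in the question.
sparse = csr_matrix(np.eye(3))

# Wrapping a sparse matrix in np.array does not densify it.
# NumPy stores the whole matrix as a single Python object.
wrapped = np.array(sparse)

print(wrapped.shape)         # shape of the wrapper array
print(wrapped.ndim)          # number of dimensions of the wrapper
print(wrapped.dtype)         # dtype of the wrapper
print(wrapped.item().shape)  # shape of the sparse matrix held inside
```

The wrapper and the matrix inside it each have their own shape; `training.shape` in the question reports the wrapper's.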

I was going to ask about DictVectorizer and fit_transform, but that does not matter. The key point is that the save and load round trip changes the value.

My guess is that you did not load the file you had just written.

Your np.save(f, training) wraps the sparse matrix in an np.array of dtype object. That wrapper is what you see when you load it back.

training = training.item()

This pulls the sparse matrix back out of that array wrapper.
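The whole round trip can be sketched as follows. This is an illustration under assumptions, not your exact pipeline: it uses an in-memory `io.BytesIO` buffer in place of your file, a tiny 2x3 matrix in place of your data, and it passes `allow_pickle=True`, which recent NumPy releases require before they will load object arrays.

```python
import io

import numpy as np
from scipy.sparse import csr_matrix

# Tiny stand-in for the vectorized training data.
training = csr_matrix(np.arange(6.0).reshape(2, 3))

# In-memory buffer standing in for the file on disk.
buf = io.BytesIO()
np.save(buf, training)  # the sparse matrix is pickled inside a 0-d object array
buf.seek(0)

# Recent NumPy releases refuse to unpickle object arrays unless asked to.
loaded = np.load(buf, allow_pickle=True)
print(loaded.shape)     # shape of the 0-d wrapper

# .item() returns the single object held by the wrapper.
restored = loaded.item()
print(restored.shape)   # shape of the recovered sparse matrix
```

Only load pickled data from files you trust, since unpickling can run arbitrary code.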

Is 418326x223957 the shape of training for the full dataset, and (10, 121) the shape of the reduced debugging set?

Answered 2025-04-18 by Python大师
