Assigning a value to a single row of a Pandas DataFrame column

1 vote

3 answers

2676 views

Asked 2025-04-18 14:32

I am trying to reassign the value of one column for a single row of a Pandas DataFrame.

import pandas as pd
import numpy as np

Here is my DataFrame:

test_df = pd.DataFrame({'range_total' : [3000,3000,3000,3000,3000,3000,0,2000,2000,1000,1000,1000,1000,1000,1000],
    'high_boundary' : [6,6,6,6,6,6,7,9,9,15,15,15,15,15,15],
    'dist_num' : [1197, 142, 142, 1197, 159, 159, 0, 1000, 1000, 398, 50, 50, 398, 50, 50],
    'round_num_sum' : [2996, 2996, 2996, 2996, 2996, 2996, 0, 2000, 2000, 996, 996, 996, 996, 996, 996]})

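In my code, I filter the DataFrame for each value of high_boundary and then find the index in test_df that corresponds to the maximum dist_num (the first one if there are several maxima). For this example, I set the index to: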

sub_idx = 0

I can access the value with this (and other similar code):

test_df.ix[(test_df.high_boundary == 6), "dist_num"][sub_idx]

This returns 1197.

However, when I try to assign a new value, it fails:

test_df.ix[(test_df.high_boundary == 6), "dist_num"][sub_idx] = 42
test_df.ix[(test_df.high_boundary == 6), "dist_num"][sub_idx]

This still returns 1197.

However:

test_df.ix[(test_df.high_boundary == 6), "dist_num"] = 42
test_df.ix[(test_df.high_boundary == 6), "dist_num"]

returns:

0    42
1    42 
2    42
3    42
4    42
5    42
Name: dist_num, dtype: int64

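I would really appreciate any help. This is my first post, because until now I have always been able to find what I needed on Stack Overflow. I am using pandas version 0.14.0.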

3 Answers

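Revisiting this code a few years later, I found that the solution suggested in another answer now raises an error (with Pandas 0.20.1 and Python 2.7.13): TypeError: 'Series' objects are mutable, thus they cannot be hashed. In case anyone else runs into this, I have added a solution below.

To update a single element in a subset of a pd.DataFrame, first find the index values of the subset, then use the corresponding row label to select the element to update.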

sub_idx = 0

indices = test_df.loc[test_df.high_boundary == 6,"dist_num"].index
print(test_df.loc[indices[sub_idx],"dist_num"])
# 1197
test_df.loc[indices[sub_idx],"dist_num"] = 0 

print(test_df.loc[indices[sub_idx],"dist_num"])
# 0
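The question describes choosing the row with the largest dist_num within each high_boundary group, taking the first one on ties. A minimal sketch of that step, assuming a trimmed two-column copy of test_df and using idxmax, which returns the label of the first maximum; the names mask, max_label and first_max_labels are illustrative only:

```python
import pandas as pd

# Trimmed copy of the question's data (only the two columns used here).
test_df = pd.DataFrame({
    'high_boundary': [6, 6, 6, 6, 6, 6, 7, 9, 9, 15, 15, 15, 15, 15, 15],
    'dist_num': [1197, 142, 142, 1197, 159, 159, 0, 1000, 1000, 398, 50, 50, 398, 50, 50],
})

# Label of the first maximum of dist_num within every high_boundary group.
first_max_labels = test_df.groupby('high_boundary')['dist_num'].idxmax()

# Same idea for a single group: filter, then take the label of the first maximum.
mask = test_df['high_boundary'] == 6
max_label = test_df.loc[mask, 'dist_num'].idxmax()

# One .loc call with a row label and a column label writes into test_df itself.
test_df.loc[max_label, 'dist_num'] = 0
```

Because idxmax returns an index label rather than a position within the subset, the label can be passed straight to .loc without an intermediate sub_idx.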

Answered 2025-04-18 by Python大师

I have run into a similar problem before. I suggest looking at this page:

http://pandas.pydata.org/pandas-docs/stable/indexing.html

In particular, this section:

http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy, which should help you.

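In short, when you slice with the df[][] form, you usually slice down to a Series first and then pick a value from it. It is not practical for pandas to keep track of your original filter so that the result can be written back.

So the simple advice is to make the selection you want to assign to with a single indexer, such as ".loc".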
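As a rough illustration of that advice, the sketch below uses a small made-up frame with string labels (not the question's data) to show that sub_idx is a position within the filtered subset, which has to be turned into a row label before a single .loc assignment:

```python
import pandas as pd

# Small illustrative frame; the string index is an assumption made for clarity.
df = pd.DataFrame(
    {'high_boundary': [6, 6, 7], 'dist_num': [1197, 142, 0]},
    index=['a', 'b', 'c'],
)
mask = df['high_boundary'] == 6
sub_idx = 0

# Chained form (avoid): df.loc[mask, 'dist_num'] builds a new Series first,
# so a following [sub_idx] = 42 may write to that temporary object only.
# df.loc[mask, 'dist_num'][sub_idx] = 42

# Single-step form: resolve the row label first, then assign with one .loc call.
row_label = df.index[mask.to_numpy()][sub_idx]
df.loc[row_label, 'dist_num'] = 42
```

Here row_label is 'a', so only that one cell of df changes.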

Answered 2025-04-18 by Python大师

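Sometimes what you get back is a copy of part of the original DataFrame test_df.

This happens in particular when you select elements with [...][...].

In that case you are changing a value in the copy, not in the original test_df.

You can try this instead: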

test_df["dist_num"].ix(test_df.high_boundary == 6)[sub_idx] = 0

This should give you the result you want.
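Note that .ix was deprecated in pandas 0.20 and removed in pandas 1.0, so the line above does not run on current versions. A position-based sketch of the same update, assuming a recent pandas and a small made-up frame; row_positions and col_position are illustrative names:

```python
import pandas as pd

# Small illustrative frame standing in for test_df.
test_df = pd.DataFrame({'high_boundary': [6, 6, 7], 'dist_num': [1197, 142, 0]})
sub_idx = 0

# Integer positions of the rows that match the filter.
matches = (test_df['high_boundary'] == 6).to_numpy()
row_positions = matches.nonzero()[0]

# Integer position of the target column.
col_position = test_df.columns.get_loc('dist_num')

# A single .iloc call writes into test_df itself rather than into a copy.
test_df.iloc[row_positions[sub_idx], col_position] = 0
```

Working with positions avoids any ambiguity when the index labels are not the default 0..n-1 range.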

Answered 2025-04-18 by Python大师
