Pandas: selecting rows based on a condition applied to a string

User

Reply 1 · edited 2024-04-20 13:28:26

Try this:

import pandas as pd

keep = []  # holds the matching rows from every DataFrame
for frame in frame_dict.values():
    # regex: start of string (^), digit (\d), digit (\d), then the literal "02"
    keep.append(
        frame[frame['A'].astype(str).str.contains(r'^\d\d02', regex=True)].copy()
    )
final = pd.concat(keep)  # concatenate the matching rows

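User

Reply 2 · edited 2024-04-20 13:28:26

The first line creates an indexer that checks the third and fourth characters of column A and returns a boolean mask (True/False) marking every value that has "02" in that position.

The second line applies that indexer to the original DataFrame to create a new DataFrame.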

indexer = df['A'].apply(lambda x: x[2:4] == '02')
results = df.loc[indexer]

Edit: the solution above can also be applied to a dictionary of DataFrames.

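One way the indexer approach might carry over to a dictionary of DataFrames is sketched below. The name frame_dict is borrowed from reply 1, the helper select_02 and the sample values are made up for illustration, and the column is cast to str first on the assumption that it may hold integers.

```python
import pandas as pd

# Hypothetical sample data; only the shape (a column 'A' of 8-character codes) matters.
frame_dict = {
    'first': pd.DataFrame({'A': ['10010001', '10020001', '10020002'], 'B': [1, 5, 11]}),
    'second': pd.DataFrame({'A': [10020003, 10030001], 'B': [2, 7]}),  # integer column
}

def select_02(df):
    """Return the rows whose 3rd and 4th characters in column 'A' are '02'."""
    indexer = df['A'].astype(str).str[2:4] == '02'
    return df.loc[indexer]

# One filtered DataFrame per key, keeping the original keys.
results = {key: select_02(df) for key, df in frame_dict.items()}
```

Using the vectorized .str accessor instead of apply with a lambda keeps the same logic; pd.concat(results.values()) would then merge the pieces into a single frame if that is what you need.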

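User

Reply 3 · edited 2024-04-20 13:28:26

How about something like the following, where d is your dict (on Python 3, use d.values() rather than d.itervalues()):

pd.concat((v[v.A.str[2:4] == '02'] for v in d.values()))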

Using a dict made of the example DataFrame repeated three times under the keys 0-2, this produces:

          A   B
2  10020001   5
3  10020002  11
4  10020003   2
2  10020001   5
3  10020002  11
4  10020003   2
2  10020001   5
3  10020002  11
4  10020003   2

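This should be more memory-efficient than building a list of rows or using a list comprehension, because it uses a generator expression. Thanks to direct indexing (assuming the data values are normalized, so that the characters of interest always sit at positions 2 and 3), it should also be faster than using a regex.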
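If you want to check this on your own data, a sketch like the following rebuilds the example dict and confirms that slicing and the regex pick the same rows. Rows 2-4 are taken from the output above; rows 0-1 are invented placeholders, and A is assumed to hold strings. Timing is left out; timeit on real data would be the way to measure it.

```python
import pandas as pd

# Rows 2-4 match the output shown above; rows 0-1 are invented placeholders
# that simply fail the '02' test.
df = pd.DataFrame({
    'A': ['10010001', '10010002', '10020001', '10020002', '10020003'],
    'B': [3, 8, 5, 11, 2],
})
d = {i: df.copy() for i in range(3)}  # keys 0-2, the same frame three times

# Direct slicing of characters 2 and 3 versus an anchored regex.
by_slice = pd.concat(v[v['A'].str[2:4] == '02'] for v in d.values())
by_regex = pd.concat(v[v['A'].str.contains(r'^\d\d02')] for v in d.values())

same = by_slice.equals(by_regex)  # both approaches should select identical rows
```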

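If you don't like the index of the combined frame, you can always call reset_index(). For example:

y = pd.concat((v[v.A.str[2:4] == '02'] for v in d.values()))
y = y.reset_index(drop=True)  # discard the old index instead of keeping it as a column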

          A   B
0  10020001   5
1  10020002  11
2  10020003   2
3  10020001   5
4  10020002  11
5  10020003   2
6  10020001   5
7  10020002  11
8  10020003   2
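As an alternative sketch, pd.concat can renumber the rows itself through its ignore_index parameter, which avoids the separate reset_index() call. The two-frame dict below is hypothetical and only stands in for d.

```python
import pandas as pd

# Hypothetical two-frame dict standing in for d.
d = {
    0: pd.DataFrame({'A': ['10010001', '10020001'], 'B': [3, 5]}),
    1: pd.DataFrame({'A': ['10020002', '10030001'], 'B': [11, 7]}),
}

# ignore_index=True discards the per-frame indexes and numbers the result 0..n-1.
y = pd.concat((v[v['A'].str[2:4] == '02'] for v in d.values()), ignore_index=True)
```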
