How do I extract multiple ints from a string and append them to a DataFrame in Pandas?

Posted 2024-06-06 14:22:18


I have a DataFrame df that looks like

Pairing        Result
1001_1234_1235 1
1001_1233_1236 0
...

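I want to extract the last two ints from each row of the Pairing column and put them into new columns. That is, the first row should get 1234 and 1235 in its new columns, and the second row 1233 and 1236.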

Does anyone know how to do this?


3 Answers
import pandas as pd
import numpy as np

# assuming you have defined other columns in df here

# Create empty columns for the new int columns
# (np.nan, not np.NaN: the NaN alias was removed in NumPy 2.0)
df['First'] = np.nan
df['Second'] = np.nan

# For each index label and element in Pairing
for i, pairing in zip(df.index, df['Pairing']):
    # split pairing into list based on underscores, get last two ints only
    ints = [int(x) for x in pairing.split('_')[-2:]]
    # write with .loc; chained df['First'][i] = ... may not update df
    df.loc[i, 'First'] = ints[0]
    df.loc[i, 'Second'] = ints[1]

print(df)

df should then have two new columns, First and Second, holding the last two numbers from each Pairing value.
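
As a possible variation on the loop above, the values can be collected in a plain Python list first and attached in one step, which avoids writing cell by cell and keeps the new columns as integers instead of floats. This is only a sketch: the sample df is made up from the rows in the question, and the column names First and Second are borrowed from this answer.

```python
import pandas as pd

# Sample data built from the two rows shown in the question (an assumption)
df = pd.DataFrame({
    'Pairing': ['1001_1234_1235', '1001_1233_1236'],
    'Result': [1, 0],
})

# For every Pairing value, split on '_' and keep the last two parts as ints
last_two = [[int(x) for x in p.split('_')[-2:]] for p in df['Pairing']]

# Build a two-column frame on the same index and join it onto df
new_cols = pd.DataFrame(last_two, index=df.index, columns=['First', 'Second'])
df = df.join(new_cols)

print(df)
```

Because the new frame reuses df.index, the join lines up row by row even if df does not have a default 0, 1, 2, ... index.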

This is easy to do with pandas str operations:

import pandas as pd

df = pd.DataFrame({
    'Pairing': ['1001_1234_1235', '1001_1233_1236'],
    'Result': [1, 0],
})

# split at '_', each result will become a new column
df2 = df['Pairing'].str.split('_', expand=True)

# convert to numbers
df2 = df2.astype(int)

# rename columns back to something useful
df2.columns = ['Pairing{}'.format(col) for col in df2.columns]

# add the columns back to the old DataFrame
df = df.join(df2)

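This adds three integer columns to df, named Pairing0, Pairing1 and Pairing2, one for each underscore-separated part of Pairing.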

For more examples, see Pandas: Working with Text Data:

http://pandas.pydata.org/pandas-docs/stable/text.html
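
Since the question only asks for the last two numbers, the same str.split approach can be narrowed to the last two columns of the split result. A minimal sketch, assuming the sample data from the question; the column names First and Second are illustrative and not something the question specifies.

```python
import pandas as pd

# Sample data built from the two rows shown in the question (an assumption)
df = pd.DataFrame({
    'Pairing': ['1001_1234_1235', '1001_1233_1236'],
    'Result': [1, 0],
})

# Split at '_'; expand=True gives one column per part
parts = df['Pairing'].str.split('_', expand=True)

# Keep only the last two columns and convert them to integers
last_two = parts.iloc[:, -2:].astype(int)

# Illustrative column names (not taken from the question)
last_two.columns = ['First', 'Second']

# Attach the two new columns to the original DataFrame
df = df.join(last_two)

print(df)
```

This relies on every Pairing value having the same number of parts. If the counts differ, expand=True pads the shorter rows with missing values, the last two columns no longer line up with the last two numbers of each row, and astype(int) fails on the missing entries.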

If you have pairing = '1001_1234_1235', then

first = pairing.split("_")[-2]
second = pairing.split("_")[-1]
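
Note that first and second above are still strings. If ints are needed, as the question asks, the two lookups could be wrapped in a small helper that also does the conversion. The function name last_two_ints is hypothetical, chosen only for this sketch.

```python
def last_two_ints(pairing):
    """Return the last two underscore-separated parts of pairing as ints."""
    # rsplit with maxsplit=2 splits from the right, so at most three pieces
    # come back; *_ absorbs whatever precedes the last two parts
    *_, first, second = pairing.rsplit('_', 2)
    return int(first), int(second)


first, second = last_two_ints('1001_1234_1235')
print(first, second)  # 1234 1235
```

The helper raises ValueError if the string has fewer than two parts or if either of the last two parts is not a valid integer.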
