Python: Create a new column based on a conditional function of another column and R

import pandas as pd

df = pd.Series(['Fruit[edit]','Apple','Orange','Banana','Vegetable[edit]','Celery','Beans','Kale'])

0        Fruit[edit]
1              Apple
2             Orange
3             Banana
4    Vegetable[edit]
5             Celery
6              Beans
7               Kale

3 answers

User

#1 · Edited 2024-04-25 02:23:09

This may be ugly, but it works:

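import numpy as np

df = pd.DataFrame(df)  # df is a Series, so convert it; its single column is named 0
# number the sections with a running count of header rows, then pull each next row's value up
# (groupby(...).shift(-1) avoids the extra index level that apply() adds in pandas 2.x)
df['Name'] = df.groupby(df[0].str.contains('edit').cumsum())[0].shift(-1)
# drop the last row of each section (nothing follows it) and rename the original column
df = df.dropna().rename(columns={0: 'Category'})
# keep only header text in Category, fill it downward, then strip the "[edit]" suffix
df.loc[~df.Category.str.contains('edit'), 'Category'] = np.nan
df.Category = df.Category.ffill()
df.Category = df.Category.str.split("[").str[0]
print(df)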

    Category    Name
0      Fruit   Apple
1      Fruit  Orange
2      Fruit  Banana
4  Vegetable  Celery
5  Vegetable   Beans
6  Vegetable    Kale
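
The grouping key in this answer is the least obvious part, so here is a minimal sketch of what it evaluates to on the question's data. The names `s`, `is_header` and `group_id` are only illustrative and do not appear in the answer; `regex=False` is an assumption added here so that `[edit]` is matched literally.

```python
import pandas as pd

s = pd.Series(['Fruit[edit]', 'Apple', 'Orange', 'Banana',
               'Vegetable[edit]', 'Celery', 'Beans', 'Kale'])

# True on header rows; regex=False treats '[edit]' as a literal string
is_header = s.str.contains('[edit]', regex=False)

# A running count of headers gives every row the id of its section
group_id = is_header.cumsum()

print(group_id.tolist())  # [1, 1, 1, 1, 2, 2, 2, 2]
```

Each header row starts a new id, so shifting within a group never pulls a value across a section boundary.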

User

#2 · Edited 2024-04-25 02:23:09

You can use str.extract to pull out the groups depending on whether the keyword is present:

import numpy as np

# df is the original Series here; raw strings keep the regex escapes intact
new_df = (df.str.extract(r'(?P<Category>.*\[edit\])?(?P<Name>.*)')
            .replace(r'\[edit\]', '', regex=True).ffill()
            .replace('', np.nan).dropna())

    Category    Name
1      Fruit   Apple
2      Fruit  Orange
3      Fruit  Banana
5  Vegetable  Celery
6  Vegetable   Beans
7  Vegetable    Kale
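
To see why the `ffill` and the final `replace('', np.nan)` are needed, it may help to look at what the extraction alone returns. This is a small sketch on a two-row Series; `s` and `parts` are illustrative names, not part of the answer.

```python
import pandas as pd

s = pd.Series(['Fruit[edit]', 'Apple'])

# The optional Category group matches only when the value ends in "[edit]";
# whatever is left over lands in Name
parts = s.str.extract(r'(?P<Category>.*\[edit\])?(?P<Name>.*)')

print(parts)
```

On the header row, Category holds `Fruit[edit]` and Name is an empty string. On the item row, Category is NaN and Name is `Apple`. Forward-filling then copies the category down, and turning the empty strings into NaN lets `dropna` discard the header rows.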

User

#3 · Edited 2024-04-25 02:23:09

Use:

# if necessary, convert the Series to a DataFrame
df = df.to_frame('Name')
# flag the header rows, i.e. those ending in "[edit]"
mask = df['Name'].str.endswith('[edit]')
# strip the 6-character "[edit]" suffix from the header rows
df.loc[mask, 'Name'] = df['Name'].str[:-6]
# create the Category column: header text, forward-filled over the rows below it
df.insert(0, 'Category', df['Name'].where(mask).ffill())
# drop the header rows, where Category and Name are identical
df = df[~mask].copy()
print(df)

    Category    Name
1      Fruit   Apple
2      Fruit  Orange
3      Fruit  Banana
5  Vegetable  Celery
6  Vegetable   Beans
7  Vegetable    Kale
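
If this is needed in more than one place, the same mask-and-forward-fill idea could be wrapped in a function. The sketch below is one possible way to do it, not part of the answer: `split_sections` and its `marker` parameter are hypothetical names, and it assumes every value is a string.

```python
import pandas as pd

def split_sections(s, marker='[edit]'):
    """Hypothetical helper: turn a Series of headers and items into Category/Name rows."""
    is_header = s.str.endswith(marker)
    # Header text without the marker, kept only on header rows, then carried downward
    category = s.str[:-len(marker)].where(is_header).ffill()
    out = pd.DataFrame({'Category': category, 'Name': s})
    # Drop the header rows themselves; the original index labels are preserved
    return out[~is_header]

s = pd.Series(['Fruit[edit]', 'Apple', 'Orange', 'Banana',
               'Vegetable[edit]', 'Celery', 'Beans', 'Kale'])
result = split_sections(s)
print(result)
```

Items that appear before the first header would get NaN as their Category, since there is nothing to fill forward from.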
