Select rows based on the value in the current row

Restaurant contains: id, zone

   id  Zone  ...
0  11  H5X   ...
1  12  H2A
2  13  H5X
3  14  H53
4  15  H21
...

Category contains: id, category

   id  category   ...
0  11  Sushi      ...
1  12  Fast Food
2  13  Sandwich
3  13  Sushi
4  14  Noodle
5  14  Fast Food
6  15  Bakeries
...

Expected output:

   id  Zone  intersection
0  11  H5X   1
1  12  H2A   0
3  13  H5X   1
5  14  H53   0
6  15  H21   0

For id 11, intersection is 1 because one other restaurant (id 13) is in the same zone (H5X) and has at least one category in common (Sushi). For id 13, intersection is 1 because one other restaurant (id 11) is in the same zone (H5X) and has at least one category in common (Sushi).

1 Answer

User

#1 · Posted 2024-06-16 09:46:23

import pandas as pd 

# create both datasets
df1 = pd.DataFrame({
    'id': [11, 12, 13, 14, 15],
    'zone': ['H5X', 'H2A', 'H5X', 'H53', 'H21']
})
df1.head()

df2 = pd.DataFrame({
    'id': [11, 12, 13, 13, 14, 14, 15],
    'category': ['Sushi', 'Fast Food', 'Sandwich', 'Sushi', 'Noodle', 'Fast Food', 'Bakeries']
})
df2.head()

# merge datasets based on restaurant id
df3 = pd.merge(df1, df2, how='left', on=['id'])
df3.reset_index(drop=True, inplace=True)
df3.head()


# count repeating zone / category
cnt = df3.groupby(['zone', 'category']).size().to_frame('count')
cnt.head(10)


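# merge counts back onto each (restaurant, category) row, matching on both zone and category
# (merging on zone alone yields one row per category in the zone, so restaurants get duplicated)
df4 = pd.merge(df3, cnt.reset_index(), how='left', on=['zone', 'category'])
# flag 1 when the (zone, category) pair occurs more than once, then keep one row per restaurant
df4['intersection'] = (df4['count'] > 1).astype(int)
df4 = df4.groupby(['id', 'zone'], as_index=False)['intersection'].max()
df4.head()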

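The question describes intersection as the number of other restaurants in the same zone that share at least one category, which is a count rather than a 0/1 flag. One way to get that count is a self-join on (zone, category). The following is a minimal sketch, not the answerer's method; the names `restaurants`, `categories`, `tagged`, `pairs`, `shared` and `result` are illustrative, and it assumes the two tables look like the sample data in the question.

```python
import pandas as pd

# sample data from the question
restaurants = pd.DataFrame({
    'id': [11, 12, 13, 14, 15],
    'zone': ['H5X', 'H2A', 'H5X', 'H53', 'H21'],
})
categories = pd.DataFrame({
    'id': [11, 12, 13, 13, 14, 14, 15],
    'category': ['Sushi', 'Fast Food', 'Sandwich', 'Sushi',
                 'Noodle', 'Fast Food', 'Bakeries'],
})

# one row per (restaurant, category) with the zone attached;
# an inner join avoids NaN categories, which would match each other in a merge
tagged = restaurants.merge(categories, on='id', how='inner')

# self-join: pair each restaurant with every restaurant (itself included)
# that has the same zone and the same category
pairs = tagged.merge(tagged, on=['zone', 'category'], suffixes=('', '_other'))

# drop self-pairs, then count distinct other restaurants per id
pairs = pairs[pairs['id'] != pairs['id_other']]
shared = pairs.groupby('id')['id_other'].nunique()

# restaurants with no match are absent from `shared`, so fill them with 0
result = restaurants.copy()
result['intersection'] = result['id'].map(shared).fillna(0).astype(int)
print(result)
```

On the sample data this gives the same values as the expected output. The two approaches would differ once a zone has three or more restaurants sharing a category, where `nunique` reports how many other restaurants match instead of just 1.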
