Collapsing categories in a CSV with Python

Posted 2024-05-31 23:49:27


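I have a DataFrame, "locations", that contains the types of some stores. It is very messy, with many different categories, so I would like to merge some of them and end up with fewer, simpler categories. How can I do this?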

For example:

  store        type
 mcdonalds     fast-food
 nandos        sit-down-food
 wetherspoons  tech-pub
 southsider    pub-and-dine

I would like to merge fast-food and sit-down-food into "food", and tech-pub and pub-and-dine into "pub". How can I do this?


2 Answers

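You can use a dict whose keys are the types you want to replace and whose values are the desired types. Then rebuild the column with a list comprehension, replacing every type that is a dict key and keeping every type that is not.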

# Dict specifying the types to replace
type_dict = {'fast-food':'food','sit-down-food':'food',
             'tech-pub':'pub','pub-and-dine':'pub'}
# Replace types that are dict keys but keep the values that aren't dict keys
df['type'] = [type_dict.get(i,i) for i in df['type']]
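
The same dict can also be passed to pandas' own Series.replace, which swaps exact matches and leaves every other value unchanged. A minimal sketch, assuming the frame is named df as in the snippet above; the rows come from the question, plus one hypothetical "greggs" / "bakery" row added only to show an unmapped value passing through:

```python
import pandas as pd

# Sample data from the question; the last row is a hypothetical addition
# used to show that unmapped types are kept as they are.
df = pd.DataFrame({
    "store": ["mcdonalds", "nandos", "wetherspoons", "southsider", "greggs"],
    "type": ["fast-food", "sit-down-food", "tech-pub", "pub-and-dine", "bakery"],
})

# Keys are the old types, values are the collapsed types
type_dict = {"fast-food": "food", "sit-down-food": "food",
             "tech-pub": "pub", "pub-and-dine": "pub"}

# Exact-match replacement; values that are not dict keys are left untouched
df["type"] = df["type"].replace(type_dict)
print(df)
```

Whether this or the list comprehension reads better is largely a matter of taste; both keep types that are missing from the dict.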

My first instinct would be to use the pandas apply function to map each value to the one you want. Something along these lines:

import pandas as pd

def nameMapper(name):
    if "food" in name:
        return "food"
    elif "pub" in name:
        return "pub"
    else:
        return "something else"


data = [
     ["mcdonalds", "fast-food"], 
     ["nandos","sit-down-food"],
     ["wetherspoons","tech-pub"],
     ["southsider","pub-and-dine"]
     ]

df = pd.DataFrame(data, columns=["store", "type"])  # a list keeps the column order; a set does not
print(df)

print("             -")

df["type"] = df["type"].apply(nameMapper)
print(df)

When I ran this program, it produced the following output:

$ python3 answer.py 
          store           type
0     mcdonalds      fast-food
1        nandos  sit-down-food
2  wetherspoons       tech-pub
3    southsider   pub-and-dine
             -
          store  type
0     mcdonalds  food
1        nandos  food
2  wetherspoons   pub
3    southsider   pub
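
For larger frames, the same substring rules can be expressed without a Python-level function call per row. This is only a sketch of one possible alternative, not part of the answer above: it assumes numpy is available and uses Series.str.contains together with numpy.select, where the first matching condition wins, mirroring the if/elif order in nameMapper.

```python
import numpy as np
import pandas as pd

# Same sample data as in the answer above
df = pd.DataFrame(
    [["mcdonalds", "fast-food"],
     ["nandos", "sit-down-food"],
     ["wetherspoons", "tech-pub"],
     ["southsider", "pub-and-dine"]],
    columns=["store", "type"],
)

# One boolean mask per rule, checked in order like if/elif
conditions = [
    df["type"].str.contains("food", regex=False, na=False),
    df["type"].str.contains("pub", regex=False, na=False),
]
choices = ["food", "pub"]

# Rows matching no rule fall back to the default, like the else branch
df["type"] = np.select(conditions, choices, default="something else")
print(df)
```

As with nameMapper, any type that contains neither "food" nor "pub" ends up as "something else", so the original label is lost for those rows.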
