How to replace duplicate values in DataFrame rows using a dictionary in pandas

Posted 2024-06-17 12:44:51


Could you help me with a problem? I have a DataFrame:

import pandas as pd
df = pd.DataFrame(
    data=[
        ['one',12],
        ['two two',2],
        ['three one',4],
        ['four two',1],
        ['number "five"',9],
        ['red',1],
        ['extra sample',1],
        ['yellow red',1],
        ['hard',4],
        ['soft hard',2],
        ['simple',3],
        ['sample' ,4],
        ['diff sample',1]
    ],
    columns=['object_name', 'amount']
)
print(df)
   object_name     amount
0   one            12
1   two two        2
2   three one      4
3   four two       1
4   number "five"  9
5   red            1
6   extra sample   1
7   yellow red     1
8   hard           4
9   soft hard      2
10  simple         3
11  sample         4
12  diff sample    1

I need to replace the duplicated names in rows such as 1 and 3 with a single value. I can do it like this:

def simple_func(name):
    if 'two' in name:
        return 'two'
    else:
        return name
df['object_name'] = df['object_name'].apply(simple_func)
print(df)
    object_name     amount
0   one             12
1   two             2
2   three one       4
3   two             1
4   number "five"   9
5   red             1
6   extra sample    1
7   yellow red      1
8   hard            4
9   soft hard       2
10  simple          3
11  sample          4
12  diff sample     1

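The problem is that I have many such duplicates, and some keys have several values. I would like to replace them using a dictionary. I built a dictionary like this: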

some_dict = {'numbers':['one','two','five'], 'colors':'red', 'sample':'sample'}

and wrote this function:

def some_func(name):
    for key in some_dict:
        if type(some_dict[key]) is list:
            for value in some_dict[key]:
                if value in name:
                    return key
                else:
                    return name
        else:
            if some_dict[key] in name:
                    return key
            else:
                    return name

When I try to use it:

df['object_name'] = df['object_name'].apply(some_func)

only the first value of the first key gets replaced:

print(df)
    object_name     amount
0   numbers         12
1   two             2
2   numbers         4
3   two             1
4   number "five"   9
5   red             1
6   extra sample    1
7   yellow red      1
8   hard            4
9   soft hard       2
10  simple          3
11  sample          4
12  diff sample     1

What I would like to get is something like this:

object_name amount
0   number  12
1   number  2
2   number  4
3   number  1
4   number  9
5   colors  1
6   sample  1
7   colors  1
8   hard    4
9   soft hard   2
10  simple  3
11  sample  4
12  sample  1

Can you point out my mistake? Thanks in advance for your help!


2 answers

I think you could also use str.contains together with df.loc:

# For each category, overwrite every name that contains one of its values.
for y, x in some_dict.items():
    if isinstance(x, list):
        for val in x:
            df.loc[df['object_name'].str.contains(val), 'object_name'] = y
    else:
        df.loc[df['object_name'].str.contains(x), 'object_name'] = y

print(df)

   object_name  amount
0      numbers      12
1      numbers       2
2      numbers       4
3      numbers       1
4      numbers       9
5       colors       1
6       sample       1
7       colors       1
8         hard       4
9    soft hard       2
10      simple       3
11      sample       4
12      sample       1
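
One thing to keep in mind with the loop above: str.contains treats its argument as a regular expression by default, and each pass matches against names that earlier passes may already have overwritten. The following is only a sketch of one way around both points, assuming the values should be matched literally. The names original, assigned and result are made up for this example, and the DataFrame is a shortened copy of the one in the question.

```python
import re
import pandas as pd

some_dict = {'numbers': ['one', 'two', 'five'], 'colors': 'red', 'sample': 'sample'}

# Shortened copy of the question's data, for illustration only.
df = pd.DataFrame({
    'object_name': ['one', 'two two', 'number "five"', 'yellow red', 'hard', 'diff sample'],
    'amount': [12, 2, 9, 1, 4, 1],
})

original = df['object_name'].copy()          # always match against the untouched names
assigned = pd.Series(False, index=df.index)  # rows that already received a category

for key, values in some_dict.items():
    if not isinstance(values, list):
        values = [values]
    # One alternation per key; re.escape keeps each value literal.
    pattern = '|'.join(re.escape(v) for v in values)
    mask = original.str.contains(pattern) & ~assigned
    df.loc[mask, 'object_name'] = key
    assigned |= mask

result = df['object_name'].tolist()
```

Because each row is assigned at most once, the first matching key in the dictionary wins, which is the same priority the apply-based function gives.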

The idea is to remove the else statements and add return name at the end, so the original value is returned only when nothing in the dict matches:

def some_func(name):
    for k, v in some_dict.items():
        if isinstance(v, list):
            for value in v:
                if value in name:
                    return k
        else:
            if v in name:
                return k
    return name

df['object_name'] = df['object_name'].apply(some_func)
print (df)
   object_name  amount
0      numbers      12
1      numbers       2
2      numbers       4
3      numbers       1
4      numbers       9
5       colors       1
6       sample       1
7       colors       1
8         hard       4
9    soft hard       2
10      simple       3
11      sample       4
12      sample       1
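
To make the difference concrete, here is a small side-by-side sketch without pandas. early_return is a simplified stand-in for the function in the question (not a copy of it), and check_all follows the corrected logic. Both names are invented for this illustration.

```python
some_dict = {'numbers': ['one', 'two', 'five'], 'colors': 'red', 'sample': 'sample'}

def early_return(name):
    # Simplified stand-in for the question's logic: the else branch
    # returns on the very first comparison, so only 'one' is ever tried.
    for key in some_dict:
        values = some_dict[key] if isinstance(some_dict[key], list) else [some_dict[key]]
        for value in values:
            if value in name:
                return key
            else:
                return name  # leaves before 'two', 'five', 'red', ... are checked

def check_all(name):
    # Corrected logic: keep looking until a value matches.
    for key, values in some_dict.items():
        if not isinstance(values, list):
            values = [values]
        for value in values:
            if value in name:
                return key
    return name  # reached only after every key and value has been tried

for sample_name in ['three one', 'yellow red', 'number "five"', 'soft hard']:
    print(sample_name, '|', early_return(sample_name), '|', check_all(sample_name))
```

Only names containing 'one' are replaced by early_return, which matches the output shown in the question.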

Your function should be changed to:

def some_func(name):
    for key in some_dict:
        if type(some_dict[key]) is list:
            for value in some_dict[key]:
                if value in name:
                    return key
        else:
            if some_dict[key] in name:
                return key
    return name
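
If the mix of lists and bare strings in some_dict is not essential, the type check can be dropped by wrapping every value in a list once. This is just a possible variation on the function above, not something from the original answers. normalized and categorize are hypothetical names.

```python
some_dict = {'numbers': ['one', 'two', 'five'], 'colors': 'red', 'sample': 'sample'}

# Wrap bare strings in a list once, so the lookup needs no type check.
normalized = {
    key: value if isinstance(value, list) else [value]
    for key, value in some_dict.items()
}

def categorize(name):
    # Return the first key with a value that occurs as a substring of name.
    for key, values in normalized.items():
        if any(value in name for value in values):
            return key
    return name  # no match: keep the original name
```

With the DataFrame from the question, it would be applied the same way: df['object_name'] = df['object_name'].apply(categorize). Note that this is plain substring matching, so a value such as 'one' would also match inside a longer word.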
