Why can Python's apply function sometimes change a DataFrame's values, and sometimes not?

Posted 2024-05-13 01:18:17


def replace_name(row):
    if row['Country Name'] == 'Korea, Rep.':
        row['Country Name'] = 'South Korea'
    if row['Country Name'] == 'Iran, Islamic Rep.':
        row['Country Name'] = 'Iran'
    if row['Country Name'] == 'Hong Kong SAR, China':
        row['Country Name'] = 'Hong Kong'
    return row

GDP.apply(replace_name, axis = 1)

GDP is a pd.DataFrame.

At this point, when I look for 'South Korea', it is not there; the name is still 'Korea, Rep.'.

But if I change the last line of the code to

GDP = GDP.apply(replace_name, axis = 1)

it works.

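At first I thought the reason was that apply cannot change GDP itself, but when I processed another DataFrame the same way, the change did take effect. The code is as follows: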

def change_name(row):
    if row['Country'] == "Republic of Korea":
        row['Country'] = 'South Korea'
    if row['Country'] == 'United States of America':
        row['Country'] = 'United States'
    if row['Country'] == 'United Kingdom of Great Britain and Northern Ireland':
        row['Country'] = 'United Kingdom'
    if row['Country'] == 'China, Hong Kong Special Administrative Region':
        row['Country'] = 'Hong Kong'
    return row

energy.apply(change_name, axis = 1)

energy is also a pd.DataFrame.

This time, when I search for 'United States', it works. The original name was 'United States of America', so it was renamed successfully.

The only difference between energy and GDP is that energy was read from an Excel file and GDP was read from a CSV file. So what causes the different results?


1 answer
User
#1 · Posted 2024-05-13 01:18:17

I think it is better to use replace:

d = {'Korea, Rep.':'South Korea', 'Iran, Islamic Rep.':'Iran', 
     'Hong Kong SAR, China':'Hong Kong'}
GDP['Country Name'] = GDP['Country Name'].replace(d, regex=True)

As for the possible difference, stray whitespace in the data may be the cause, so stripping it may help:

GDP['Country'] = GDP['Country'].str.strip()
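
To illustrate how stray whitespace could make a lookup fail, here is a minimal sketch. The names Series is made up for illustration and is not the asker's data. It assumes plain replace without regex=True, which matches whole values only.

```python
import pandas as pd

# Hypothetical values: the first one carries stray spaces, as raw file data might.
names = pd.Series([' Korea, Rep. ', 'Iran, Islamic Rep.', 'Japan'])
d = {'Korea, Rep.': 'South Korea', 'Iran, Islamic Rep.': 'Iran'}

# Without regex=True, replace matches whole values only,
# so the padded name is left untouched.
unstripped = names.replace(d)

# After stripping the whitespace, the same mapping matches.
stripped = names.str.strip().replace(d)

print(unstripped.tolist())
print(stripped.tolist())
```

If the padded value survives the first call but not the second, whitespace is a likely culprit in the real data as well.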

Sample:

import pandas as pd

GDP = pd.DataFrame({'Country Name':[' Korea, Rep. ','a','Iran, Islamic Rep.','United States of America','s','United Kingdom of Great Britain and Northern Ireland'],
                    'Country':     ['s','Hong Kong SAR, China','United States of America','Hong Kong SAR, China','s','f']})

#print (GDP)

d = {'Korea, Rep.':'South Korea', 'Iran, Islamic Rep.':'Iran', 
     'United Kingdom of Great Britain and Northern Ireland':'United Kingdom',
     'Hong Kong SAR, China':'Hong Kong', 'United States of America':'United States'}

#replace by columns
#GDP['Country Name'] = GDP['Country Name'].replace(d, regex=True)
#GDP['Country'] = GDP['Country'].replace(d, regex=True)

#replace multiple columns
GDP[['Country Name','Country']] = GDP[['Country Name','Country']].replace(d, regex=True)
print (GDP)
         Country    Country Name
0              s     South Korea
1      Hong Kong               a
2  United States            Iran
3      Hong Kong   United States
4              s               s
5              f  United Kingdom
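
If you prefer to keep the apply version from the question, one way that does not depend on whether the original frame gets modified is to treat the return value of apply as the result and assign it back, as the question's working variant does. A minimal sketch follows. The small GDP frame, its Value column and the RENAMES dict are hypothetical stand-ins, and copying the row inside the function is my own precaution, not something the question's code does.

```python
import pandas as pd

# Hypothetical stand-in for the asker's GDP table.
GDP = pd.DataFrame({
    'Country Name': ['Korea, Rep.', 'Iran, Islamic Rep.', 'Japan'],
    'Value': [1.0, 2.0, 3.0],
})

# Hypothetical mapping, equivalent to the chain of if statements in the question.
RENAMES = {'Korea, Rep.': 'South Korea', 'Iran, Islamic Rep.': 'Iran'}

def replace_name(row):
    # Work on a copy so the function never mutates the object apply passes in.
    row = row.copy()
    row['Country Name'] = RENAMES.get(row['Country Name'], row['Country Name'])
    return row

# apply returns a new DataFrame; rebinding the name keeps that result.
GDP = GDP.apply(replace_name, axis=1)
print(GDP['Country Name'].tolist())
```

Written this way, the outcome does not hinge on whether the rows handed to the function are views or copies of the original data.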
