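I posted a "part 1" question that led me to the answer for the function I needed here, but I think this follow-up deserves its own question. If not, I'll move it.
I want to apply a function to a DataFrame that replaces full state names with their abbreviations (New York -> NY). However, I noticed that in my dataset, if a state name is written in capitals, it does not match the dictionary. I tried to work around this but can't seem to crack it: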
import pandas as pd
import numpy as np
import us

dfp = pd.DataFrame({'A' : [np.nan,np.nan,3,4,5,5,3,1,5,np.nan],
                    'B' : [1,0,3,5,0,0,np.nan,9,0,0],
                    'C' : ['Pharmacy of IDAHO','NY Pharma','NJ Pharmacy','Idaho Rx','CA Herbals','Florida Pharma','AK RX','Ohio Drugs','PA Rx','USA Pharma'],
                    'D' : [123456,123456,1234567,12345678,12345,12345,12345678,123456789,1234567,np.nan],
                    'E' : ['Assign','Unassign','Assign','Ugly','Appreciate','Undo','Assign','Unicycle','Assign','Unicorn',]})

# abbreviation -> full name, and the inverse (full name -> abbreviation)
statez = us.states.mapping('abbr', 'name')
inv_map = {v: k for k, v in statez.items()}

def replace_states(company):
    # find all states that exist in the string (this check ignores case)
    state_found = filter(lambda state: state.lower() in company.lower(), statez.values())
    # replace each state with its abbreviation
    # (str.replace is case-sensitive, so 'IDAHO' is found above but not replaced here)
    for state in state_found:
        print(state, inv_map[state])
        company = company.replace(state, inv_map[state])
        print("---" , company)
    # return the modified string (or original if no states were found)
    return company

dfp['C'] = dfp['C'].map(replace_states)
Output (note that 'Pharmacy of IDAHO' is unchanged):
Idaho ID
--- Pharmacy of IDAHO
Idaho ID
--- ID Rx
Florida FL
--- FL Pharma
Ohio OH
--- OH Drugs
Is there a way to make this function case-insensitive?
Replacing state names with abbreviations (case-insensitive, vectorized solution):
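One way this could look is sketched below, assuming a single compiled regular expression with `re.IGNORECASE` passed to `Series.str.replace` together with a callable replacement. The small `name_to_abbr` dictionary is a hypothetical stand-in for the full mapping from the `us` package, and `lookup`, `pattern` and `companies` are illustrative names that do not come from the question.

```python
import re
import pandas as pd

# Hypothetical stand-in for the full name -> abbreviation mapping
# (in the question this would come from the `us` package).
name_to_abbr = {"Idaho": "ID", "Florida": "FL", "Ohio": "OH", "New York": "NY"}

# Lower-cased keys so any capitalisation of a matched name can be looked up.
lookup = {name.lower(): abbr for name, abbr in name_to_abbr.items()}

# One alternation of all state names, matched as whole words, ignoring case.
pattern = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in lookup) + r")\b",
    flags=re.IGNORECASE,
)

companies = pd.Series(
    ["Pharmacy of IDAHO", "NY Pharma", "Idaho Rx",
     "Florida Pharma", "Ohio Drugs", "USA Pharma"]
)

# The callable receives each regex match and returns its abbreviation.
result = companies.str.replace(
    pattern, lambda m: lookup[m.group(0).lower()], regex=True
)
print(result.tolist())
```

Only the matched state name is rewritten, so the rest of each string keeps its original case. The `\b` word boundaries are a design choice that avoids replacing a name embedded in a longer word.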
Replacing state abbreviations with names (case-insensitive, vectorized solution):
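The reverse direction can be sketched the same way, again under the assumption of a compiled case-insensitive regex and a callable replacement. The `abbr_to_name` dictionary is a hypothetical subset of the real mapping. Note that case-insensitive matching of two-letter abbreviations can also hit ordinary words such as "in", "or" or "me", so this is probably only safe on data where such words do not occur.

```python
import re
import pandas as pd

# Hypothetical subset of the abbreviation -> full name mapping.
abbr_to_name = {
    "NY": "New York",
    "NJ": "New Jersey",
    "CA": "California",
    "AK": "Alaska",
    "PA": "Pennsylvania",
}

# Whole-word match on any abbreviation, ignoring case.
pattern = re.compile(
    r"\b(" + "|".join(re.escape(abbr) for abbr in abbr_to_name) + r")\b",
    flags=re.IGNORECASE,
)

companies = pd.Series(
    ["ny Pharma", "NJ Pharmacy", "ca Herbals", "AK RX", "PA Rx", "USA Pharma"]
)

# Upper-case the matched text before the lookup so 'ny' and 'NY' both resolve.
result = companies.str.replace(
    pattern, lambda m: abbr_to_name[m.group(0).upper()], regex=True
)
print(result.tolist())
```

Because of the word boundaries, "USA" is left alone even though it contains no standalone abbreviation from the dictionary.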
I would find the index of the state name in the string and then use that index to replace it, regardless of case:
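A minimal sketch of that idea follows, assuming plain ASCII text so that lower-casing does not change string length. The small `statez` dictionary is a hypothetical stand-in for the question's `us.states.mapping('abbr', 'name')` result, and the sample `companies` list is illustrative.

```python
# Hypothetical stand-in for the question's abbreviation -> name mapping.
statez = {"ID": "Idaho", "FL": "Florida", "OH": "Ohio"}

def replace_states(company):
    for abbr, name in statez.items():
        # Search in lower-cased copies so the comparison ignores case.
        idx = company.lower().find(name.lower())
        while idx != -1:
            # Splice the abbreviation in by position; everything else is untouched.
            company = company[:idx] + abbr + company[idx + len(name):]
            # Continue searching after the abbreviation just inserted.
            idx = company.lower().find(name.lower(), idx + len(abbr))
    return company

companies = ["Pharmacy of IDAHO", "Idaho Rx", "Florida Pharma", "USA Pharma"]
replaced = [replace_states(c) for c in companies]
print(replaced)
```

On the question's DataFrame the same function could be applied with `dfp['C'].map(replace_states)`. Like the original function, this matches substrings rather than whole words.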
This preserves the case of every other part of the string.