Changing a column's value based on finding characters in another column using Python

1 answer

User

#1 · Posted on 2024-06-08 22:36:04

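Here is a high-level approach you can try. Depending on how dirty your dataset is, it may be useful, but at the end of the day this is an NLP/AI problem.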

# using regular expressions _may_ make your life easier
import re

# these regex are for entertainment purposes only
# no warranty of efficacy or fitness for any specific purpose is implied
# ignoring the case (re.IGNORECASE) will simplify creating the regular expressions
country_patterns = {
    "Canada": re.compile(r'\bcanada\b', re.IGNORECASE),
    "USA": re.compile(r'\b(usa|us|united states( of america)?|america)\b', re.IGNORECASE),
    "Japan": re.compile(r'\b(japan|nihon)\b', re.IGNORECASE),
    # etc...
}

# df is assumed to be an existing DataFrame with 'city' and 'country' columns
for index, row in df.iterrows():
    # a double loop now, so we can check each country pattern against the city
    for country, pattern in country_patterns.items():
        # search() looks anywhere in the string; match() would only check the start
        if pattern.search(str(row['city'])):
            df.loc[index, 'country'] = country
            # move on to the next row, since we found a match
            break

You will have to write your own regular expressions, which can turn into a problem of its own.
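
If the row-by-row loop turns out to be slow, the same idea can probably be expressed with pandas string methods. The following is only a sketch under assumptions: the sample DataFrame is made up, the column names 'city' and 'country' are taken from the loop above, and the patterns are just as unproven as the ones in the answer. It uses Series.str.contains to build a boolean mask per country and only fills rows that have no country yet, so the first matching pattern wins.

```python
import pandas as pd

# Hypothetical sample data, only here to make the sketch runnable
df = pd.DataFrame({
    "city": ["Toronto, Canada", "Tokyo (Japan)", "Austin, USA", "Paris", None],
    "country": [None, None, None, None, None],
})

# Plain pattern strings with non-capturing groups (?:...), so that
# str.contains does not warn about match groups
country_patterns = {
    "Canada": r"\bcanada\b",
    "USA": r"\b(?:usa|us|united states(?: of america)?|america)\b",
    "Japan": r"\b(?:japan|nihon)\b",
}

for country, pattern in country_patterns.items():
    # True where the pattern occurs anywhere in 'city'; missing cities count as no match
    found = df["city"].str.contains(pattern, case=False, regex=True, na=False)
    # Only fill rows that do not have a country yet, so earlier patterns take priority
    df.loc[found & df["country"].isna(), "country"] = country

print(df)
```

The word boundaries (\b) keep a short pattern like "us" from matching inside "Austin"; rows that match none of the patterns, such as "Paris", are left empty.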
