How do I add multiple strings to a DataFrame column based on conditions in each row?

nonmatch["Problem"] = np.where(nonmatch['rent'] != nonmatch['rent_doc'], "rent doesn't match", nonmatch["Problem"] + "")
nonmatch["Problem"] = np.where(nonmatch['1xdisc'] != nonmatch['1xdisc_doc'], " 1xdisc doesn't match.", "")
print(nonmatch[['Resident', 'Problem']])

2 Answers

User

Answer 1 · edited 2024-05-19 18:49:34

My take:

def get_match(c):
    def match(x):
        return f'{c} doesn\'t match.' if x else ''
    return match

onex = (df['1xdisc'] != df['1xdisc_doc']).map(get_match('1xdisc'))
rent = (df['rent']   != df['rent_doc']  ).map(get_match('rent'))

df.assign(Problem=(['  '.join(filter(bool, tup)) for tup in zip(rent, onex)]))

   Resident     Tcode     MoveIn  1xdisc  1xdisc_doc  conpark  rent  rent_doc                                     Problem
0    Marcus  t0011009  3/16/2021     0.0      -500.0      0.0     0      1632  rent doesn't match.  1xdisc doesn't match.
1    Joshua  t0011124  3/20/2021     0.0         0.0      0.0  1642      1642                                            
2    Yvonne  t0010940  3/17/2021  -500.0      -500.0      0.0  1655      1655                                            
3  Mirabeau  t0011005  3/19/2021  -500.0      -500.0      0.0  1931      1990                         rent doesn't match.
4   Keyonna  t0011084  3/18/2021     0.0         0.0      0.0  1600      1600                                            
5     Ariel  t0010954  3/22/2021  -300.0         0.0      0.0  1300      1320  rent doesn't match.  1xdisc doesn't match.
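For comparison, here is a minimal sketch that stays closer to the `np.where` attempt in the question. It is an alternative to the closure approach above, not the same technique. The sample values are copied from the printed table, only the compared columns are kept, and the names `rent_msg` and `disc_msg` are illustrative. The idea is to build each message separately and concatenate, so the second condition never overwrites the first.

```python
import numpy as np
import pandas as pd

# Sample values copied from the printed output above (compared columns only).
nonmatch = pd.DataFrame({
    "Resident": ["Marcus", "Joshua", "Yvonne", "Mirabeau", "Keyonna", "Ariel"],
    "1xdisc": [0.0, 0.0, -500.0, -500.0, 0.0, -300.0],
    "1xdisc_doc": [-500.0, 0.0, -500.0, -500.0, 0.0, 0.0],
    "rent": [0, 1642, 1655, 1931, 1600, 1300],
    "rent_doc": [1632, 1642, 1655, 1990, 1600, 1320],
})

# One message Series per condition; an empty string marks "no problem".
rent_msg = pd.Series(
    np.where(nonmatch["rent"] != nonmatch["rent_doc"], "rent doesn't match.", ""),
    index=nonmatch.index,
)
disc_msg = pd.Series(
    np.where(nonmatch["1xdisc"] != nonmatch["1xdisc_doc"], "1xdisc doesn't match.", ""),
    index=nonmatch.index,
)

# Concatenate instead of reassigning, then trim the separator left over
# when only one (or neither) condition fired.
nonmatch["Problem"] = (rent_msg + " " + disc_msg).str.strip()

print(nonmatch[["Resident", "Problem"]])
```

This reads easily for two or three conditions. With many column pairs, the `filter(bool, ...)` join used above probably scales better, since it avoids the strip step.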

Generalized

docs = [s for s in [*df] if s.endswith('_doc')]
refs = [s.rsplit('_', 1)[0] for s in docs]

def col_match(c):
    return [f"{c.name} doesn't match" if x else "" for x in c]

problem_df = (df[refs] != df[docs].to_numpy()).apply(col_match)
problem = ['  '.join(filter(bool, tup)) for tup in zip(*map(problem_df.get, refs))]
df.assign(Problem=problem)
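To try the generalized idea end to end, the sketch below wraps it in a helper and runs it on sample values copied from the table above. The function name `add_problem_column` and its `suffix` and `sep` parameters are hypothetical, and it assumes every `*_doc` column has a counterpart without the suffix (otherwise `df[refs]` raises a `KeyError`).

```python
import pandas as pd

def add_problem_column(df, suffix="_doc", sep="  "):
    """Hypothetical helper: flag each <name>/<name><suffix> pair whose values differ."""
    docs = [c for c in df.columns if c.endswith(suffix)]
    refs = [c[: -len(suffix)] for c in docs]
    # Compare by position; to_numpy() sidesteps column-label alignment.
    mismatch = df[refs] != df[docs].to_numpy()
    # For each row, keep only the messages whose comparison came out True.
    problem = [
        sep.join(f"{name} doesn't match" for name, bad in zip(refs, row) if bad)
        for row in mismatch.to_numpy()
    ]
    return df.assign(Problem=problem)

# Sample values copied from the printed output above (compared columns only).
df = pd.DataFrame({
    "Resident": ["Marcus", "Joshua", "Yvonne", "Mirabeau", "Keyonna", "Ariel"],
    "1xdisc": [0.0, 0.0, -500.0, -500.0, 0.0, -300.0],
    "1xdisc_doc": [-500.0, 0.0, -500.0, -500.0, 0.0, 0.0],
    "rent": [0, 1642, 1655, 1931, 1600, 1300],
    "rent_doc": [1632, 1642, 1655, 1990, 1600, 1320],
})

result = add_problem_column(df)
print(result[["Resident", "Problem"]])
```

Note that the messages follow the order in which the `_doc` columns appear in the frame, so here `1xdisc` is reported before `rent`.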

User

Answer 2 · edited 2024-05-19 18:49:34

You can also try `concat` with `groupby` + `agg`. As piR said, this may be over-engineered:

c1 = df['rent'].ne(df['rent_doc'])
c2 = df['1xdisc'].ne(df['1xdisc_doc'])
choices= ["rent doesn't match"," 1xdisc doesn't match."]

s = pd.concat((c1,c2),keys=choices).swaplevel()
out = (df.assign(Problem=
      pd.DataFrame.from_records(s[s].index).groupby(0)[1].agg(" ".join)))

print(out)

   Resident     Tcode     MoveIn  1xdisc  1xdisc_doc  conpark  rent  rent_doc  \
0    Marcus  t0011009  3/16/2021     0.0      -500.0      0.0     0      1632   
1    Joshua  t0011124  3/20/2021     0.0         0.0      0.0  1642      1642   
2    Yvonne  t0010940  3/17/2021  -500.0      -500.0      0.0  1655      1655   
3  Mirabeau  t0011005  3/19/2021  -500.0      -500.0      0.0  1931      1990   
4   Keyonna  t0011084  3/18/2021     0.0         0.0      0.0  1600      1600   
5     Ariel  t0010954  3/22/2021  -300.0         0.0      0.0  1300      1320   

                                     Problem  
0  rent doesn't match  1xdisc doesn't match.  
1                                        NaN  
2                                        NaN  
3                         rent doesn't match  
4                                        NaN  
5  rent doesn't match  1xdisc doesn't match.
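Rows without any mismatch come out as `NaN` here, because `assign` aligns on the index and those rows never appear in the grouped result. If empty strings are preferred, one option is to fill them afterwards. The sketch below is a variation under stated assumptions: sample values are copied from the table, the messages are lightly tidied, the name `flagged` is illustrative, and `MultiIndex.to_frame` stands in for `from_records` (my substitution, not the answer's code).

```python
import pandas as pd

# Sample values copied from the printed output above (compared columns only).
df = pd.DataFrame({
    "Resident": ["Marcus", "Joshua", "Yvonne", "Mirabeau", "Keyonna", "Ariel"],
    "1xdisc": [0.0, 0.0, -500.0, -500.0, 0.0, -300.0],
    "1xdisc_doc": [-500.0, 0.0, -500.0, -500.0, 0.0, 0.0],
    "rent": [0, 1642, 1655, 1931, 1600, 1300],
    "rent_doc": [1632, 1642, 1655, 1990, 1600, 1320],
})

c1 = df["rent"].ne(df["rent_doc"])
c2 = df["1xdisc"].ne(df["1xdisc_doc"])
choices = ["rent doesn't match.", "1xdisc doesn't match."]

# Stack both boolean Series under their message, then index by (row, message).
s = pd.concat((c1, c2), keys=choices).swaplevel()

# Keep the True entries and join the messages per row label.
flagged = s[s].index.to_frame(index=False).groupby(0)[1].agg(" ".join)

out = df.assign(Problem=flagged)
# Rows that never matched any condition are NaN after alignment.
out["Problem"] = out["Problem"].fillna("")

print(out[["Resident", "Problem"]])
```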
