将列中的pandas值替换为来自不同datafram的查找

2024-05-13 23:02:49 发布

您现在位置:Python中文网/ 问答频道 /正文

我有一个参考数据帧,如下所示:

    Variables   Key Values  
0   GRTYPE      40  Total exclusions 4-year schools
1   GRTYPE      2   4-year institutions, Adjusted cohort
2   GRTYPE      3   4-year institutions, Completers 
41  CHRTSTAT    2   Revised cohort
42  CHRTSTAT    3   Exclusions
43  CHRTSTAT    4   Adjusted cohort 
57  SECTION     12  Bachelors/ equiv .
58  SECTION     23  Bachelors or equiv 2009 .

我想用参考数据框替换下面主数据框中的值:

^{pr2}$

最终结果将是:

    GRTYPE                                CHRTSTAT          SECTION
0   Total exclusions 4-year schools         Revised cohort       Bachelors/ equiv . 
1   4-year institutions, Adjusted cohort    Exclusions           Bachelors/ equiv .         
2   4-year institutions, Adjusted cohort    Adjusted cohort      Bachelors or equiv 2009 .      
3   4-year institutions, Completers         Revised cohort       Bachelors/ equiv . 
4   4-year institutions, Completers         Exclusions           Bachelors or equiv 2009 .  

在熊猫或Python身上最好的方法是什么?我尝试从第一个数据帧连接和提取变量,然后在第二个数据帧上循环,但是没有得到任何结果。在


Tags: or数据sectionyeartotalcohortrevisedadjusted
2条回答

使用map

您需要将VariablesKey设置为映射数据帧的索引,然后简单地对列使用map。在

mapping_df = mapping_df.set_index(['Variables', 'Key'])
df = df.apply(lambda x: x.map(mapping_df.loc[x.name]['Values']))

同:

^{pr2}$

输出:

                                 GRTYPE         CHRTSTAT                    SECTION
0       Total exclusions 4-year schools   Revised cohort         Bachelors/ equiv .
1  4-year institutions, Adjusted cohort       Exclusions         Bachelors/ equiv .
2  4-year institutions, Adjusted cohort  Adjusted cohort  Bachelors or equiv 2009 .
3       4-year institutions, Completers   Revised cohort         Bachelors/ equiv .
4       4-year institutions, Completers       Exclusions  Bachelors or equiv 2009 .

使用defualtdict

from collections import defaultdict

d = defaultdict(dict)
for i, k, v in df1.itertuples(index=False):
    d[i][k] = v

pd.DataFrame(dict(zip(df2, [[d[i][k] for k in df2[i]] for i in df2])), df2.index)

                                 GRTYPE         CHRTSTAT                    SECTION
0       Total exclusions 4-year schools   Revised cohort         Bachelors/ equiv .
1  4-year institutions, Adjusted cohort       Exclusions         Bachelors/ equiv .
2  4-year institutions, Adjusted cohort  Adjusted cohort  Bachelors or equiv 2009 .
3       4-year institutions, Completers   Revised cohort         Bachelors/ equiv .
4       4-year institutions, Completers       Exclusions  Bachelors or equiv 2009 .

apply

^{pr2}$

相关问题 更多 >