How does df.groupby('A').agg('min') translate to Featuretools?

Using pandas:

Data

df = pd.DataFrame({'A': [1, 1, 2, 2], 'B': [1, 2, 3, 4], 'C': [0.3, 0.2, 1.2, -0.5]})

Group and aggregate

   groupby_A(min_B)  groupby_A(min_C)
A
1                 1               0.2
2                 3              -0.5
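The statement that produced df_result is not shown in the post. A minimal sketch that would yield the table above, assuming the column names were built by hand from the original column names (the renaming step is an assumption, not the asker's code):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 2, 2],
                   'B': [1, 2, 3, 4],
                   'C': [0.3, 0.2, 1.2, -0.5]})

# Group by A and take the minimum of every remaining column.
df_result = df.groupby('A').agg('min')

# Assumed naming scheme, chosen to match the headers shown in the post.
df_result.columns = [f'groupby_A(min_{col})' for col in df_result.columns]

print(df_result)
```

Here A becomes the index of df_result, which is why it appears on its own line in the printed table.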

Merge

df_new = pd.merge(df, df_result, on='A')
df_new

   A  B    C  groupby_A(min_B)  groupby_A(min_C)
0  1  1  0.3                 1               0.2
1  1  2  0.2                 1               0.2
2  2  3  1.2                 3              -0.5
3  2  4 -0.5                 3              -0.5
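For comparison, the group, aggregate, and merge steps can be collapsed into one pandas call with groupby(...).transform, which broadcasts each group's minimum back onto the original rows. This is only a sketch of the target output the question is after; the new column names are assumed to follow the naming used in the post:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 2, 2],
                   'B': [1, 2, 3, 4],
                   'C': [0.3, 0.2, 1.2, -0.5]})

# transform('min') returns one row per input row, holding the group minimum.
mins = df.groupby('A')[['B', 'C']].transform('min')
mins.columns = [f'groupby_A(min_{col})' for col in mins.columns]

# Attach the per-group minimums next to the original columns.
df_new = pd.concat([df, mins], axis=1)

print(df_new)
```

This is the row-aligned result that any Featuretools solution would need to reproduce.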

Attempt using Featuretools:

# ---- Import the Module ----
import featuretools as ft

# ---- Make the Entity Set (the set of all tables) ----
es = ft.EntitySet()

# ---- Make the Entity (the table) ----
es.entity_from_dataframe(entity_id = 'df', dataframe = df)

# ---- Do the Deep Feature Synthesis (group, aggregate, and merge the features) ----
feature_matrix, feature_names = ft.dfs(entityset = es, target_entity = 'df', trans_primitives = ['cum_min'])
feature_matrix

       A  B    C  CUM_MIN(A)  CUM_MIN(B)  CUM_MIN(C)
index
0      1  1  0.3           1           1         0.3
1      1  2  0.2           1           1         0.2
2      2  3  1.2           1           1         0.2
3      2  4 -0.5           1           1        -0.5

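How does the operation done with pandas translate to Featuretools (preferably without adding another table)?

My attempt with Featuretools does not give the correct output, but I believe the process I used is roughly correct.

1 Answer

User

#1 · Posted 2024-06-10 16:11:56

Here is the recommended way to do it in Featuretools. You do need to create another table to make it work exactly as you want.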

import featuretools as ft
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 2, 2],
                   'B': [1, 2, 3, 4],
                   'C': [0.3, 0.2, 1.2, -0.5]})

es = ft.EntitySet()

es.entity_from_dataframe(entity_id="example",
                          index="id",
                          make_index=True,
                          dataframe=df)

es.normalize_entity(new_entity_id="a_entity",
                    base_entity_id="example",
                    index="A")

fm, fl = ft.dfs(target_entity="example",
                entityset=es,
                agg_primitives=["min"])

fm

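This returns a feature matrix in which each row of example carries the minimum of B and the minimum of C for its A group, taken from a_entity. (The output table is missing from the source.)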

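If you don't want to create an extra table, you can try the cum_min primitive, which computes the cumulative minimum after grouping by A. Note that this is a running minimum within each group, so a row matches the group minimum only once the smallest value in its group has been seen.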

df = pd.DataFrame({'A': [1, 1, 2, 2],
                   'B': [1, 2, 3, 4],
                   'C': [0.3, 0.2, 1.2, -0.5]})

es = ft.EntitySet()

es.entity_from_dataframe(entity_id="example",
                          index="id",
                          make_index=True,
                          variable_types={
                              "A": ft.variable_types.Id
                          },
                          dataframe=df,)

fm, fl = ft.dfs(target_entity="example",
                entityset=es,
                groupby_trans_primitives=["cum_min"])

fm

This returns

    B    C  A  CUM_MIN(C) by A  CUM_MIN(B) by A
id                                             
0   1  0.3  1              0.3              1.0
1   2  0.2  1              0.2              1.0
2   3  1.2  2              1.2              3.0
3   4 -0.5  2             -0.5              3.0
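To see how this output relates to the groupby('A').agg('min') result in the question, the same quantities can be computed in plain pandas. This is a rough sketch for comparison only; it assumes that groupby(...).cummin() mirrors what the grouped cum_min primitive computes, which is consistent with the table above:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 2, 2],
                   'B': [1, 2, 3, 4],
                   'C': [0.3, 0.2, 1.2, -0.5]})

# Running minimum within each A group (what the table above shows).
running = df.groupby('A')[['B', 'C']].cummin()

# Full group minimum broadcast to every row (what the question asks for).
group_min = df.groupby('A')[['B', 'C']].transform('min')

# Rows where the running minimum has already reached the group minimum.
reached = running.eq(group_min).all(axis=1)

print(running)
print(group_min)
print(reached)
```

For this data the two agree on column B in every row, but on column C only in the last row of each group, so the cum_min route is an approximation of the requested output rather than an exact replacement.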
