panda使用groupby和其他列值添加列条件

2024-04-27 03:12:49 发布

您现在位置:Python中文网/ 问答频道 /正文

我将pandas.DataFrame称为companysubset,但实际数据要长得多。在

        conm                       fyear    dvpayout    industry    firmycount  ipodate
46078   CAESARS ENTERTAINMENT CORP  2003    0.226813    Services    22  19891213.0
46079   CAESARS ENTERTAINMENT CORP  2004    0.226813    Services    22  19891213.0
46080   CAESARS ENTERTAINMENT CORP  2005    0.226813    Services    22  19891213.0
46091   CAESARS ENTERTAINMENT CORP  2016    0.226813    Services    22  19891213.0
114620  CAESARSTONE LTD 2010    0.487543    Manufacturing   10  20120322.0
114621  CAESARSTONE LTD 2011    0.487543    Manufacturing   10  20120322.0
114622  CAESARSTONE LTD 2012    0.487543    Manufacturing   10  20120322.0
114623  CAESARSTONE LTD 2013    0.487543    Manufacturing   10  20120322.0
114624  CAESARSTONE LTD 2014    0.487543    Manufacturing   10  20120322.0
114625  CAESARSTONE LTD 2015    0.487543    Manufacturing   10  20120322.0
114626  CAESARSTONE LTD 2016    0.487543    Manufacturing   10  20120322.0
132524  CAFEPRESS INC   2010    0.000000    Retail Trade    7   20120329.0
132525  CAFEPRESS INC   2011    0.000000    Retail Trade    7   20120329.0
132526  CAFEPRESS INC   2012    -0.000000   Retail Trade    7   20120329.0
132527  CAFEPRESS INC   2013    -0.000000   Retail Trade    7   20120329.0
132528  CAFEPRESS INC   2014    -0.000000   Retail Trade    7   20120329.0
132529  CAFEPRESS INC   2015    -0.000000   Retail Trade    7   20120329.0
132530  CAFEPRESS INC   2016    -0.000000   Retail Trade    7   20120329.0
120049  CAI INTERNATIONAL INC   2005    0.000000    Services    12  20070516.0
120050  CAI INTERNATIONAL INC   2006    0.000000    Services    12  20070516.0
3896    CALAMP CORP 1999    -0.000000   Manufacturing   23  NaN
3897    CALAMP CORP 2000    0.000000    Manufacturing   23  NaN
3898    CALAMP CORP 2001    0.000000    Manufacturing   23  NaN
3899    CALAMP CORP 2002    0.000000    Manufacturing   23  NaN
21120   CALATLANTIC GROUP INC   1995    -0.133648   Construction    22  NaN
21121   CALATLANTIC GROUP INC   1996    -0.133648   Construction    22  NaN
21122   CALATLANTIC GROUP INC   1997    -0.133648   Construction    22  NaN
21123   CALATLANTIC GROUP INC   1998    -0.133648   Construction    22  NaN
21124   CALATLANTIC GROUP INC   1999    -0.133648   Construction    22  NaN
21125   CALATLANTIC GROUP INC   2000    -0.133648   Construction    22  NaN
21126   CALATLANTIC GROUP INC   2001    -0.133648   Construction    22  NaN
21127   CALATLANTIC GROUP INC   2002    -0.133648   Construction    22  NaN
21128   CALATLANTIC GROUP INC   2003    -0.133648   Construction    22  NaN

1)我想按行业计算公司dvpayout的四分位数,并添加名为dv的列,并指出它在Q1Q2Q3或{}。在

我想出了这个代码,但它不起作用。在

^{pr2}$

2)如果有ipodate,我想添加名为age的列。该值将是一年中最大的fyear - ipodate。(例如2016 - 1989表示CAESARS ENTERTAINMENT COR

我想看到的结果数据框如下。在

        conm            fyear    dvpayout   industry    firmycount  ipodate  dv   age
46078   CAESARS ...     2003    0.226813    Services    22  19891213.0   Q2  27
46079   CAESARS ...     2004    0.226813    Services    22  19891213.0   Q2  27
46080   CAESARS ...     2005    0.226813    Services    22  19891213.0   Q2  27
46091   CAESARS ...     2016    0.226813    Services    22  19891213.0   Q2  27
114620  CAESARSTONE LTD 2010    0.487543    Manufacturing   10  20120322.0  Q3  4
114621  CAESARSTONE LTD 2011    0.487543    Manufacturing   10  20120322.0  Q3  4
114622  CAESARSTONE LTD 2012    0.487543    Manufacturing   10  20120322.0  Q3  4
114623  CAESARSTONE LTD 2013    0.487543    Manufacturing   10  20120322.0  Q3  4
114624  CAESARSTONE LTD 2014    0.487543    Manufacturing   10  20120322.0  Q3  4
114625  CAESARSTONE LTD 2015    0.487543    Manufacturing   10  20120322.0  Q3  4
114626  CAESARSTONE LTD 2016    0.487543    Manufacturing   10  20120322.0  Q3  4
132524  CAFEPRESS INC   2010    0.000000    Retail Trade    7   20120329.0  Q1  4
132525  CAFEPRESS INC   2011    0.000000    Retail Trade    7   20120329.0  Q1  4
132526  CAFEPRESS INC   2012    -0.000000   Retail Trade    7   20120329.0  Q1  4
132527  CAFEPRESS INC   2013    -0.000000   Retail Trade    7   20120329.0  Q1  4
132528  CAFEPRESS INC   2014    -0.000000   Retail Trade    7   20120329.0  Q1  4
132529  CAFEPRESS INC   2015    -0.000000   Retail Trade    7   20120329.0  Q1  4
132530  CAFEPRESS INC   2016    -0.000000   Retail Trade    7   20120329.0  Q1  4
120049  CAI INTERNATIONAL INC   2006    0.000000    Services    12  20070516.0 Q1  0
120050  CAI INTERNATIONAL INC   2007    0.000000    Services    12  20070516.0 Q1  0
3896    CALAMP CORP 1999    -0.000000   Manufacturing   23  NaN   Q1  Nan  
3897    CALAMP CORP 2000    0.000000    Manufacturing   23  NaN   Q1  Nan
3898    CALAMP CORP 2001    0.000000    Manufacturing   23  NaN   Q1  Nan
3899    CALAMP CORP 2002    0.000000    Manufacturing   23  NaN   Q1  Nan
21120   CALATLANTIC GROUP INC   1995    -0.133648   Construction    22  NaN   Q1  Nan
21121   CALATLANTIC GROUP INC   1996    -0.133648   Construction    22  NaN   Q1  Nan
21122   CALATLANTIC GROUP INC   1997    -0.133648   Construction    22  NaN   Q1  Nan
21123   CALATLANTIC GROUP INC   1998    -0.133648   Construction    22  NaN   Q1  Nan
21124   CALATLANTIC GROUP INC   1999    -0.133648   Construction    22  NaN   Q1  Nan
21125   CALATLANTIC GROUP INC   2000    -0.133648   Construction    22  NaN   Q1  Nan
21126   CALATLANTIC GROUP INC   2001    -0.133648   Construction    22  NaN   Q1  Nan
21127   CALATLANTIC GROUP INC   2002    -0.133648   Construction    22  NaN  Q1  Nan
21128   CALATLANTIC GROUP INC   2003    -0.133648   Construction    22  NaN  Q1  Nan

提前谢谢!!!!在


Tags: groupnaninctradecorpconstructionltdmanufacturing
1条回答
网友
1楼 · 发布于 2024-04-27 03:12:49

“年龄”列可以通过以下方式生成:

代码

df.set_index(['conm'], inplace=True)
df['age'] = df.groupby(level=0).apply(
    lambda x: max(x.fyear) - round(x.ipodate.iloc[0]/10000-0.5))

测试代码:

^{pr2}$

结果:

^{3}$

相关问题 更多 >