How to create new columns in pandas based on row values

Posted 2024-04-16 02:44:41

I want to iterate over the rows of a pandas DataFrame and create new columns based on their values. My dataset looks like this:

  Political Entity  Recipient ID           Recipient Recipient last name  \
0       Candidates          4350       Whelan, Susan              Whelan   
1       Candidates          4350       Whelan, Susan              Whelan   
2       Candidates          4350       Whelan, Susan              Whelan   
3       Candidates          4350       Whelan, Susan              Whelan   
4       Candidates         15453  Mastroianni, Steve         Mastroianni   

  Recipient first name Recipient middle initial Political Party of Recipient  \
0                Susan                      NaN      Liberal Party of Canada   
1                Susan                      NaN      Liberal Party of Canada   
2                Susan                      NaN      Liberal Party of Canada   
3                Susan                      NaN      Liberal Party of Canada   
4                Steve                      NaN      Liberal Party of Canada   

  Electoral District        Electoral event Fiscal/Election date  \
0              Essex  38th general election           2004-06-28   
1              Essex  38th general election           2004-06-28   
2              Essex  38th general election           2004-06-28   
3              Essex  38th general election           2004-06-28   
4  Windsor--Tecumseh  40th general election           2008-10-14   

        ...       Monetary amount Non-Monetary amount  \
0       ...                 800.0                 0.0   
1       ...                1280.0                 0.0   
2       ...                 250.0                 0.0   
3       ...                1000.0                 0.0   
4       ...                 800.0                 0.0   

I want to create new columns named after the political party and year, each holding the monetary amount. For example:

^{2}$

I wrote a couple of functions to get started:

def year_political_column(row):
    return row['Fiscal/Election date'][:4] + ' ' + row['Political Party of Recipient']


def monetary(row):
    return row['Monetary amount']

Every solution I find seems to assume the column already exists. Can anyone point me in the right direction?

The sample output should be:

  Political Entity  Recipient ID           Recipient Recipient last name  \
0       Candidates          4350       Whelan, Susan              Whelan   
1       Candidates          4350       Whelan, Susan              Whelan   
2       Candidates          4350       Whelan, Susan              Whelan   
3       Candidates          4350       Whelan, Susan              Whelan   
4       Candidates         15453  Mastroianni, Steve         Mastroianni   

  Recipient first name Recipient middle initial Political Party of Recipient  \
0                Susan                      NaN      Liberal Party of Canada   
1                Susan                      NaN      Liberal Party of Canada   
2                Susan                      NaN      Liberal Party of Canada   
3                Susan                      NaN      Liberal Party of Canada   
4                Steve                      NaN      Liberal Party of Canada   

  Electoral District        Electoral event Fiscal/Election date  \
0              Essex  38th general election           2004-06-28   
1              Essex  38th general election           2004-06-28   
2              Essex  38th general election           2004-06-28   
3              Essex  38th general election           2004-06-28   
4  Windsor--Tecumseh  40th general election           2008-10-14   

        ...       Monetary amount Non-Monetary amount  \
0       ...                 800.0                 0.0   
1       ...                1280.0                 0.0   
2       ...                 250.0                 0.0   
3       ...                1000.0                 0.0   
4       ...                 800.0                 0.0   

  Contribution given through Ontario first name Ontario last name  \
0                        NaN                J M            
1                        NaN                  J             
2                        NaN                  B            
3                        NaN                  H            
4                        NaN                  H            

   Ontario Address Ontario city Ontario Province Ontario Postal Code  \
0                

  Ontario Phone #  
0      
1      
2      
3      
4      

All of the political data I'm looking for would be appended on the right.


2 answers

This can be done in several ways:

  • pivot
  • pivot_table
  • groupby

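However, most of them need a little massaging to produce the format you need. If you are not looking to aggregate and want to keep the individual entries, only option 2 (pivot) will work.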

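import pandas as pd

def column_name(row):
    # Build a label such as "2004 Liberal Party of Canada" from one row.
    return '{} {}'.format(row['Fiscal/Election date'].year, row['Political Party of Recipient'])

# Parse the dates first so that each value has a .year attribute.
df['Fiscal/Election date'] = pd.to_datetime(df['Fiscal/Election date'])

df['Column Name'] = df.apply(column_name, axis=1)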
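The same label column can also be built without a row-wise apply. This is a minimal sketch, assuming a small frame that holds only the three relevant columns with values copied from the question's sample rows; it is an alternative to the answer's method, not a replacement for it.

```python
import pandas as pd

# Sample rows copied from the question (only the columns used here).
df = pd.DataFrame({
    "Political Party of Recipient": ["Liberal Party of Canada"] * 5,
    "Fiscal/Election date": ["2004-06-28"] * 4 + ["2008-10-14"],
    "Monetary amount": [800.0, 1280.0, 250.0, 1000.0, 800.0],
})

# Parse the dates so the .dt accessor is available.
df["Fiscal/Election date"] = pd.to_datetime(df["Fiscal/Election date"])

# Vectorised equivalent of the row-wise apply: "<year> <party>".
df["Column Name"] = (
    df["Fiscal/Election date"].dt.year.astype(str)
    + " "
    + df["Political Party of Recipient"]
)

print(df["Column Name"].tolist())
```

On large frames the vectorised string concatenation is usually faster than calling a Python function once per row.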

1) pivot_table

^{pr2}$
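A minimal sketch of what a pivot_table call for this step might look like, assuming a 'Column Name' column like the one built earlier in this answer. The choice of `aggfunc="sum"` and the variable name `totals` are assumptions for illustration, not taken from the answer.

```python
import pandas as pd

# Sample values copied from the question; "Column Name" is the label column.
df = pd.DataFrame({
    "Column Name": ["2004 Liberal Party of Canada"] * 4
                   + ["2008 Liberal Party of Canada"],
    "Monetary amount": [800.0, 1280.0, 250.0, 1000.0, 800.0],
})

# One output column per "Column Name" value; amounts in each group are summed.
totals = df.pivot_table(
    values="Monetary amount",
    columns="Column Name",
    aggfunc="sum",
)

print(totals)
```

Because pivot_table always aggregates, the individual contributions are collapsed into one total per year and party.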

2) pivot

In [5]: (df[['Column Name', 'Monetary amount']]
   ...: .pivot(columns='Column Name', values='Monetary amount'))
Out[5]: 
Column Name  2004 Liberal Party of Canada  2008 Liberal Party of Canada
0                                   800.0                           NaN
1                                  1280.0                           NaN
2                                   250.0                           NaN
3                                  1000.0                           NaN
4                                     NaN                         800.0
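To get the layout the question asks for, with the new columns appended to the right of the original data, the pivoted frame can be joined back on the row index. This is a sketch under the assumption that the frame still has its default unique index; the names `wide` and `result` are illustrative.

```python
import pandas as pd

# A few columns from the question's sample, plus the label column.
df = pd.DataFrame({
    "Recipient": ["Whelan, Susan"] * 4 + ["Mastroianni, Steve"],
    "Column Name": ["2004 Liberal Party of Canada"] * 4
                   + ["2008 Liberal Party of Canada"],
    "Monetary amount": [800.0, 1280.0, 250.0, 1000.0, 800.0],
})

# Spread the amounts into one column per label, keeping one row per entry.
wide = df.pivot(columns="Column Name", values="Monetary amount")

# Append the new columns to the right of the original data (aligned on index).
result = df.join(wide)

print(result)
```

Rows that do not belong to a given year and party get NaN in that column, matching the pivot output shown above.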

3) groupby

In [6]: pd.DataFrame(df.groupby('Column Name')['Monetary amount'].sum()).transpose()
Out[6]: 
Column Name      2004 Liberal Party of Canada  2008 Liberal Party of Canada
Monetary amount                          3330                           800

Create a column from the election year and the party name, then group by it and transpose:

df['year_political'] = df['Fiscal/Election date'].astype(str).str.slice(0,4) + ' '+ df['Political Party of Recipient']
df.groupby('year_political')['Monetary amount'].sum().reset_index().transpose()
