Replicating rows in Pandas and adding a new (month) column

Posted 2024-05-15 21:47:19


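I'm sure this is a simple question. I have a DataFrame with about 1000 rows, and every row is unique.

It shows expenses for the year by category and location. Every location has the same set of categories.

I want to create a monthly budget for each expense at each site.

I also want to create a "year-to-date budget" column: divide the year's total budget by 12 to get a monthly figure, then multiply that figure by the month number (April = month 1) to get a year-to-date value. For example, May would be the monthly figure * 2, and so on.

I'm trying to do this with pandas. I tried: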

    pd.DataFrame(np.repeat(budget.values,12,axis=0)) #replicate each row by 12

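My plan was then to iterate over each row of each group to add the month, but I'm struggling to get anywhere.

Any help would be greatly appreciated.

(Sorry, I couldn't paste the tables properly - please see the image)

Current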

+------------+-------------+--------+
|  Location  |  Expense    | Amount |
+------------+-------------+--------+
| Sheffield  | Electricity |  10000 |
| Sheffield  | Gas         |  12000 |
| Manchester | Electricity |  15000 |
| Manchester | Electricity |  13000 |
+------------+-------------+--------+

Desired

+------------+-------------+--------+--------+---------+-------+
|  Location  |  Expense    | Amount | Budget |  Month  |  YTD  |
+------------+-------------+--------+--------+---------+-------+
| Sheffield  | Electricity |  10000 |  10000 | April   |  1000 |
| Sheffield  | Electricity |  10000 |  10000 | May     |  2000 |
| Sheffield  | Electricity |  10000 |  10000 | June    |  3000 |
| Sheffield  | Electricity |  10000 |  10000 | July    |  4000 |
| Sheffield  | Electricity |  10000 |  10000 | August  |  5000 |
| Sheffield  | Electricity |  10000 |  10000 | Sep     |  6000 |
| Sheffield  | Electricity |  10000 |  10000 | Oct     |  7000 |
| Sheffield  | Electricity |  10000 |  10000 | Dec     |  8000 |
| Sheffield  | Electricity |  10000 |  10000 | Jan     |  9000 |
| Sheffield  | Electricity |  10000 |  10000 | Feb     | 10000 |
| Sheffield  | Electricity |  10000 |  10000 | March   | 11000 |
| Sheffield  | Gas         |  12000 |  20000 | April   |  2000 |
| Sheffield  | Gas         |  12000 |  20000 | May     |  4000 |
| Sheffield  | Gas         |  12000 |  20000 | June... |  6000 |
| Sheffield  | Gas         |  12000 |  20000 | ..March |  8000 |
| Manchester | Electricity |  15000 |  36000 | April   |  4000 |
| Manchester | Electricity |  15000 |  36000 | May     |  8000 |
+------------+-------------+--------+--------+---------+-------+

   

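1 answer
User
Answer #1 · Posted 2024-05-15 21:47:19

You can create a month lookup table in which April, the first month of the fiscal year, is assigned the number 1: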

import pandas as pd

# Initialise the data from lists.
data = {'Month': ['April', 'May', 'June', 'July', 'August', 'September',
                  'October', 'November', 'December', 'January', 'February', 'March'],
        'Number': range(1, 13),
        'key': [1] * 12}

# Create the DataFrame.
df_months = pd.DataFrame(data)

This produces the following table:

+------------+--------+-----+
| Month      | Number | key |
+------------+--------+-----+
| April      |  1     | 1   |
| May        |  2     | 1   |
| June       |  3     | 1   |
| July       |  4     | 1   |
| August     |  5     | 1   |
| September  |  6     | 1   |
| October    |  7     | 1   |
| November   |  8     | 1   |
| December   |  9     | 1   |
| January    |  10    | 1   |
| February   |  11    | 1   |
| March      |  12    | 1   |
+------------+--------+-----+
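If the fiscal year might start in a month other than April, the same lookup table can be generated rather than typed out. The following is only a sketch: the helper name `fiscal_months` and its `start_month` parameter are made up for illustration, and `calendar.month_name` returns names in the current locale, so the English names are an assumption about the environment.

```python
import calendar

import pandas as pd


def fiscal_months(start_month: int = 4) -> pd.DataFrame:
    """Hypothetical helper: 12-row month table numbered from the fiscal start."""
    names = list(calendar.month_name)[1:]  # 'January' .. 'December' (locale-dependent)
    # Rotate the list so the fiscal start month comes first.
    rotated = names[start_month - 1:] + names[:start_month - 1]
    return pd.DataFrame({"Month": rotated, "Number": range(1, 13), "key": 1})


df_months = fiscal_months(4)  # April is month 1, March is month 12
print(df_months)
```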

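Now modify your own table, which we'll call df_amounts, so that it has a fictional key column (key). This ensures that every month is joined to every location/expense combination: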

df_amounts['key'] = 1

df_amounts:

+------------+-------------+--------+-----+
|  Location  |  Expense    | Amount | key |
+------------+-------------+--------+-----+
| Sheffield  | Electricity |  10000 | 1   |
| Sheffield  | Gas         |  12000 | 1   |
| Manchester | Electricity |  15000 | 1   |
| Manchester | Electricity |  13000 | 1   |
+------------+-------------+--------+-----+

Then join the two tables on key:

df = pd.merge(df_amounts, df_months, on="key", how="left")

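This gives a table with the following columns:

+------------+-------------+--------+-----+-------+--------+
|  Location  |  Expense    | Amount | key | Month | Number |
+------------+-------------+--------+-----+-------+--------+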
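On pandas 1.2 or later, the helper key column can probably be skipped, because `merge` accepts `how="cross"` for exactly this kind of every-row-with-every-row join. A minimal sketch, assuming that pandas version and using the four sample rows from the question:

```python
import pandas as pd

# Sample rows copied from the question's "current" table.
df_amounts = pd.DataFrame({
    "Location": ["Sheffield", "Sheffield", "Manchester", "Manchester"],
    "Expense": ["Electricity", "Gas", "Electricity", "Electricity"],
    "Amount": [10000, 12000, 15000, 13000],
})

months = ["April", "May", "June", "July", "August", "September",
          "October", "November", "December", "January", "February", "March"]
df_months = pd.DataFrame({"Month": months, "Number": range(1, 13)})

# how="cross" pairs every amounts row with every month row; no key column needed.
df = df_amounts.merge(df_months, how="cross")
print(df.shape)  # (48, 5): 4 expense lines x 12 months
```

The result has the same rows as the key-based merge, just without the extra key column to clean up afterwards.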

Now divide the Number column by 12 and multiply the result by Amount to get the new YTD column:

df['YTD']= df['Amount'] * (df['Number'] / 12)
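As a quick sanity check of this formula, here is a small sketch for a single expense line from the question (Sheffield / Electricity, Amount 10000), assuming it has already been expanded to one row per month. Under this formula the April value is one twelfth of Amount and the March value equals the full Amount.

```python
import pandas as pd

months = ["April", "May", "June", "July", "August", "September",
          "October", "November", "December", "January", "February", "March"]

# One expense line, already expanded to 12 monthly rows.
df = pd.DataFrame({
    "Location": "Sheffield",
    "Expense": "Electricity",
    "Amount": 10000,
    "Month": months,
    "Number": range(1, 13),
})

# Year-to-date share of the annual amount: month number / 12.
df["YTD"] = df["Amount"] * (df["Number"] / 12)
print(df[["Month", "YTD"]].round(2))
```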

Your monthly Budget column works in a similar way:

df['Budget']= df['Amount'] / 12
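For comparison, the `np.repeat` idea from the question can also be made to work without a merge. The sketch below is an alternative, not the method used in this answer: it repeats the index rather than `budget.values` (so column names and dtypes survive) and uses `np.tile` to lay the month list over each block of 12 rows. The name `budget` is taken from the question, and the sample rows stand in for the real data.

```python
import numpy as np
import pandas as pd

months = ["April", "May", "June", "July", "August", "September",
          "October", "November", "December", "January", "February", "March"]

# Sample rows copied from the question's "current" table.
budget = pd.DataFrame({
    "Location": ["Sheffield", "Sheffield", "Manchester", "Manchester"],
    "Expense": ["Electricity", "Gas", "Electricity", "Electricity"],
    "Amount": [10000, 12000, 15000, 13000],
})

n = len(months)
# Repeat each original row 12 times, keeping column names and dtypes.
df = budget.loc[budget.index.repeat(n)].reset_index(drop=True)
# Tile the months so every block of 12 rows runs April..March.
df["Month"] = np.tile(months, len(budget))
df["Number"] = np.tile(np.arange(1, n + 1), len(budget))

df["Budget"] = df["Amount"] / 12          # monthly figure
df["YTD"] = df["Budget"] * df["Number"]   # cumulative figure to date
print(df.head(12))
```

This relies on the rows staying in blocks of 12 per original row, so the tiling must be done before any sorting or filtering.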
