Trying to look up a value from a DataFrame that falls within the range between two rows of an index DataFrame

Posted 2024-06-09 21:49:55


I have two DataFrames in a Python notebook, "grower_moo" and "pricing", which I use to analyze harvested crops and the prices paid to growers.

pricing is an index DataFrame, and grower_moo contains various unique load tickets with information about each load.

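I need to pull the price per ton from the pricing index into a new column in the load data, provided the Fat of that load is not greater than the next Wet_Fat.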

Below are a .head() sample of each DataFrame and the code I tried. I get the error ValueError: Can only compare identically-labeled Series objects.


pricing
    Price_Per_Ton   Wet_Fat
0       306            10
1       339            11
2       382            12
3       430            13
4       481            14
5       532            15
6       580            16
7       625            17
8       665            18
9       700            19
10      728            20
11      750            21
12      766            22
13      778            23
14      788            24
15      797            25

grower_moo
    Load Ticket     Net Fruit Weight  Net MOO  Percent_MOO  Fat
0   L2019000011817     56660           833     1.448872    21.92
1   L2019000011816     53680           1409    2.557679    21.12
2   L2019000011815     53560           1001    1.834644    21.36
3   L2019000011161     62320           2737    4.207080    21.41
4   L2019000011160     57940           1129    1.911324    20.06

grower_moo['price_per_ton'] = max(pricing[pricing['Wet_Fat'] < grower_moo['Fat']]['Price_Per_Ton'])

For example, a Fat of 13.60 is less than the next Wet_Fat of 14, so the price per ton is $430.

grower_moo_with_price
    Load Ticket     Net Fruit Weight  Net MOO  Percent_MOO  Fat    price_per_ton
0   L2019000011817     56660           833     1.448872    21.92      750
1   L2019000011816     53680           1409    2.557679    21.12      750
2   L2019000011815     53560           1001    1.834644    21.36      750
3   L2019000011161     62320           2737    4.207080    21.41      750
4   L2019000011160     57940           1129    1.911324    20.06      728

2 answers

This looks like a job for an "as-of" merge, pd.merge_asof (documentation):

This is similar to a left-join except that we match on nearest key rather than equal keys. Both DataFrames must be sorted by the key.

For each row in the left DataFrame:

A "backward" search [the default] selects the last row in the right DataFrame whose ‘on’ key is less than or equal to the left’s key.
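As a quick sanity check of the backward search, the sketch below applies it to the 13.60 example from the question. The one-row `loads` frame is made up for illustration, and `pricing` here holds only three rows of the question's table.

```python
import pandas as pd

# Three rows of the question's pricing table. Wet_Fat is stored as float
# so that its dtype matches the Fat key on the left side.
pricing = pd.DataFrame({
    "Price_Per_Ton": [382, 430, 481],
    "Wet_Fat": [12.0, 13.0, 14.0],
})

# Hypothetical single load with Fat = 13.60, the value used in the question.
loads = pd.DataFrame({"Fat": [13.60]})

# direction="backward" is the default; it is spelled out here for clarity.
matched = pd.merge_asof(loads, pricing, left_on="Fat", right_on="Wet_Fat",
                        direction="backward")
print(matched)
```

The last Wet_Fat at or below 13.60 is 13, so the load picks up a price of 430, which agrees with the value expected in the question.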

In the code below I use your sample input, but with underscores (_) instead of spaces in the column names.

import pandas as pd

# Required by merge_asof: sort keys in left DataFrame
grower_moo = grower_moo.sort_values('Fat')

# Required by merge_asof: key column data types must match
pricing['Wet_Fat'] = pricing['Wet_Fat'].astype('float') 

# Perform the asof merge
res = pd.merge_asof(grower_moo, pricing, left_on='Fat', right_on='Wet_Fat')

# Print result
res
      Load_Ticket  Net_Fruit_Weight  Net_MOO  Percent_MOO    Fat  Price_Per_Ton  Wet_Fat
0  L2019000011160             57940     1129     1.911324  20.06            728     20.0
1  L2019000011816             53680     1409     2.557679  21.12            750     21.0
2  L2019000011815             53560     1001     1.834644  21.36            750     21.0
3  L2019000011161             62320     2737     4.207080  21.41            750     21.0
4  L2019000011817             56660      833     1.448872  21.92            750     21.0

# Optional: drop the key column from the right DataFrame
res.drop(columns='Wet_Fat')
      Load_Ticket  Net_Fruit_Weight  Net_MOO  Percent_MOO    Fat  Price_Per_Ton
0  L2019000011160             57940     1129     1.911324  20.06            728
1  L2019000011816             53680     1409     2.557679  21.12            750
2  L2019000011815             53560     1001     1.834644  21.36            750
3  L2019000011161             62320     2737     4.207080  21.41            750
4  L2019000011817             56660      833     1.448872  21.92            750
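Because merge_asof needs the left frame sorted by `Fat`, the result comes back in `Fat` order rather than the original ticket order. One possible way to restore that order, sketched here on a cut-down copy of the sample data, is to carry the original row position through the merge in a helper column and sort on it afterwards. The column name `orig_order` is chosen for this example only.

```python
import pandas as pd

# Cut-down copies of the sample frames (Wet_Fat already stored as float).
pricing = pd.DataFrame({
    "Price_Per_Ton": [700, 728, 750, 766],
    "Wet_Fat": [19.0, 20.0, 21.0, 22.0],
})
grower_moo = pd.DataFrame({
    "Load_Ticket": ["L2019000011817", "L2019000011816", "L2019000011160"],
    "Fat": [21.92, 21.12, 20.06],
})

# Keep the original row position in a helper column (hypothetical name).
left = grower_moo.reset_index().rename(columns={"index": "orig_order"})

res = (
    pd.merge_asof(left.sort_values("Fat"), pricing,
                  left_on="Fat", right_on="Wet_Fat")
    .sort_values("orig_order")      # back to the original row order
    .set_index("orig_order")
    .rename_axis(None)
    .drop(columns="Wet_Fat")
)
print(res)
```

The rows of `res` then line up one to one with `grower_moo`, so the price column can be assigned back if a separate result frame is not wanted.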

# axis=1 places the frames side by side, aligning rows by index label
concat_df = pd.concat([grower_moo, pricing], axis=1)
concat_df = concat_df[concat_df['Wet_Fat'] < concat_df['Fat']]
concat_df = concat_df.drop(columns='Wet_Fat')
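Note that concatenating the two frames column-wise lines rows up by index label, so each load is compared only with the pricing row that shares its index; the pricing table is not searched. If a merge is not wanted, a different technique (not the one used in either answer) is a positional lookup with `numpy.searchsorted`. This is a minimal sketch on cut-down sample data, assuming `pricing` is sorted by `Wet_Fat` and that no `Fat` value is below the smallest `Wet_Fat`.

```python
import numpy as np
import pandas as pd

# Cut-down copies of the sample frames.
pricing = pd.DataFrame({
    "Price_Per_Ton": [700, 728, 750, 766],
    "Wet_Fat": [19, 20, 21, 22],
})
grower_moo = pd.DataFrame({"Fat": [21.92, 21.12, 20.06]})

# side="right" gives, for each Fat, the number of Wet_Fat values <= Fat;
# subtracting 1 yields the position of the last Wet_Fat at or below it.
pos = np.searchsorted(pricing["Wet_Fat"].to_numpy(),
                      grower_moo["Fat"].to_numpy(), side="right") - 1

# Assumption: no Fat is below the smallest Wet_Fat (pos would be -1 there
# and would silently pick the last pricing row).
grower_moo["price_per_ton"] = pricing["Price_Per_Ton"].to_numpy()[pos]
print(grower_moo)
```

This variant leaves `grower_moo` in its original row order and does not require the key dtypes to match.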
