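<p>Here is my solution.</p>
<p>I use <code>pd.Series()</code> and <code>.repeat()</code> to build a Series that holds the ticket fare once for every row of the group.</p>
<p>By the way, you forgot to exclude the babies from <code>child_count</code>. Restrict the child mask to ages from 1 up to (but not including) 12, for example <code>(x['Age'] >= 1) & (x['Age'] < 12)</code>, so that nobody is counted in both <code>babies_count</code> and <code>child_count</code>.</p>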
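<pre><code>import pandas as pd

def fare_calc(x):
    group_size = x.shape[0]
    # Repeat the group's mean fare once per row; index=x.index keeps the group's row labels
    ticket_fare = pd.Series(x['Fare'].mean().repeat(group_size), index=x.index)
    babies_count = x[x['Age'] < 1]['Age'].count()
    # Build the mask from the group x, not the full df; children are ages 1 up to (not including) 12
    child_count = x[(x['Age'] >= 1) & (x['Age'] < 12)]['Age'].count()
    adult_count = group_size - babies_count - child_count
    adult_fare = (ticket_fare - babies_count * 2) / (adult_count + child_count * 0.5)
    return adult_fare
</code></pre>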
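<p>As a quick sanity check of the formula, here is the arithmetic for a single ticket worked by hand. This is only a sketch: the three-row <code>group</code> frame is made up to mirror the first ticket in the sample data, and reading the formula as "babies pay a flat 2, children pay half the adult fare" is my interpretation of it.</p>
<pre><code>import pandas as pd

# Hypothetical single ticket: one baby, one child, one adult, total fare 17
group = pd.DataFrame({'Age': [0.5, 5, 20], 'Fare': [17, 17, 17]})

babies_count = (group['Age'] < 1).sum()
child_count = ((group['Age'] >= 1) & (group['Age'] < 12)).sum()
adult_count = len(group) - babies_count - child_count

# (17 - 1 * 2) / (1 + 1 * 0.5) = 15 / 1.5
adult_fare = (group['Fare'].mean() - babies_count * 2) / (adult_count + child_count * 0.5)
print(adult_fare)  # 10.0
</code></pre>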
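<p>Finally, use <code>.values</code> to pull the raw values out of the stacked Series that <code>apply</code> returns. Assigning the Series itself fails with an "incompatible index" type error, because its index does not match the index of <code>df</code>. Since the values are then assigned by position, this assumes the rows of <code>df</code> are already ordered by <code>TicketNum</code>, as they are in the sample data.</p>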
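<pre><code># Assumes df rows are ordered by TicketNum, because .values assigns by position
shared = df.TicketFreq > 1
# Write only into the rows that share a ticket; all other rows get NaN
df.loc[shared, 'Fare2'] = df[shared].groupby(['TicketNum']).apply(fare_calc).values
print(df)
    Age  Fare  TicketNum  TicketFreq  Fare2
0   0.5    17          1           3   10.0
1   5.0    17          1           3   10.0
2  20.0    17          1           3   10.0
3  21.0    40          2           4   10.0
4  22.0    40          2           4   10.0
5  23.0    40          2           4   10.0
6  24.0    40          2           4   10.0
</code></pre>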
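<p>If the rows are not sorted by ticket, one way to avoid relying on position is to compute a single fare per ticket and then map it back onto the rows by <code>TicketNum</code>. The following is a rough sketch rather than a drop-in replacement: the helper name <code>adult_fare_per_ticket</code>, the interleaved sample rows, and the single-passenger ticket 3 are all made up for illustration.</p>
<pre><code>import pandas as pd

# Hypothetical data: rows deliberately interleaved, plus one single-passenger ticket (3)
df = pd.DataFrame({'Age': [21, 0.5, 22, 5, 20, 23, 24, 30],
                   'Fare': [40, 17, 40, 17, 17, 40, 40, 8],
                   'TicketNum': [2, 1, 2, 1, 1, 2, 2, 3]})
df['TicketFreq'] = df.groupby('TicketNum')['TicketNum'].transform('count')

def adult_fare_per_ticket(x):
    # Returns one scalar per group, so apply yields a Series indexed by TicketNum
    babies_count = (x['Age'] < 1).sum()
    child_count = ((x['Age'] >= 1) & (x['Age'] < 12)).sum()
    adult_count = len(x) - babies_count - child_count
    return (x['Fare'].mean() - babies_count * 2) / (adult_count + child_count * 0.5)

shared = df[df['TicketFreq'] > 1]
per_ticket = shared.groupby('TicketNum')[['Age', 'Fare']].apply(adult_fare_per_ticket)

# Look up each row's ticket number; tickets missing from per_ticket become NaN
df['Fare2'] = df['TicketNum'].map(per_ticket)
print(df)
</code></pre>
<p>Because the lookup goes through the ticket number instead of the row position, the row order no longer matters, and single-passenger tickets simply end up with <code>NaN</code> in <code>Fare2</code>.</p>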
<p>Edit 1: a more intuitive version of the previous function:</p>
<pre><code>import pandas as pd

df = pd.DataFrame({'Age': [0.5, 5, 20, 21, 22, 23, 24],
                   'Fare': [17, 17, 17, 40, 40, 40, 40],
                   'TicketNum': [1, 1, 1, 2, 2, 2, 2]})
df['TicketFreq'] = df.groupby('TicketNum')['TicketNum'].transform('count')

def fare_calc(x):
    x = x.copy()  # work on a copy so the group passed in by apply is not modified
    group_size = x.shape[0]
    x['ticket_fare'] = x['Fare'].mean()
    babies_count = x[x['Age'] < 1]['Age'].count()
    # Build the mask from the group x, not the full df; children are ages 1 up to (not including) 12
    child_count = x[(x['Age'] >= 1) & (x['Age'] < 12)]['Age'].count()
    adult_count = group_size - babies_count - child_count
    x['adult_fare'] = (x['ticket_fare'] - babies_count * 2) / (adult_count + child_count * 0.5)
    return x['adult_fare']

# Assumes df rows are ordered by TicketNum, because .values assigns by position
shared = df.TicketFreq > 1
df.loc[shared, 'Fare2'] = df[shared].groupby(['TicketNum']).apply(fare_calc).values
print(df)
    Age  Fare  TicketNum  TicketFreq  Fare2
0   0.5    17          1           3   10.0
1   5.0    17          1           3   10.0
2  20.0    17          1           3   10.0
3  21.0    40          2           4   10.0
4  22.0    40          2           4   10.0
5  23.0    40          2           4   10.0
6  24.0    40          2           4   10.0
</code></pre>
<p>Edit 2: it is even simpler to create <code>Fare2</code> directly inside the function:</p>
<pre><code>import pandas as pd

df = pd.DataFrame({'Age': [0.5, 5, 20, 21, 22, 23, 24],
                   'Fare': [17, 17, 17, 40, 40, 40, 40],
                   'TicketNum': [1, 1, 1, 2, 2, 2, 2]})
df['TicketFreq'] = df.groupby('TicketNum')['TicketNum'].transform('count')

def fare_calc(x):
    group_size = x.shape[0]
    ticket_fare = x['Fare'].mean()
    babies_count = x[x['Age'] < 1]['Age'].count()
    # Build the mask from the group x, not the full df; children are ages 1 up to (not including) 12
    child_count = x[(x['Age'] >= 1) & (x['Age'] < 12)]['Age'].count()
    adult_count = group_size - babies_count - child_count
    # assign returns a copy of the group with the new Fare2 column added
    return x.assign(Fare2=(ticket_fare - babies_count * 2) / (adult_count + child_count * 0.5))

# group_keys=False keeps the original row index; without it, pandas 2.0 and later
# add TicketNum as an extra index level.
# Note: this replaces df with only the rows that share a ticket.
df = df[df.TicketFreq > 1].groupby(['TicketNum'], group_keys=False).apply(fare_calc)
print(df)
    Age  Fare  TicketNum  TicketFreq  Fare2
0   0.5    17          1           3   10.0
1   5.0    17          1           3   10.0
2  20.0    17          1           3   10.0
3  21.0    40          2           4   10.0
4  22.0    40          2           4   10.0
5  23.0    40          2           4   10.0
6  24.0    40          2           4   10.0
</code></pre>
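<p>For larger frames, the same calculation can probably be written without <code>apply</code> at all, by broadcasting per-ticket counts back to the rows with <code>transform</code>. Treat this as a sketch of that idea and not as part of the answer above: the intermediate names (<code>is_baby</code>, <code>is_child</code>, <code>ticket</code>) are my own, and it assumes <code>Age</code> has no missing values.</p>
<pre><code>import pandas as pd

df = pd.DataFrame({'Age': [0.5, 5, 20, 21, 22, 23, 24],
                   'Fare': [17, 17, 17, 40, 40, 40, 40],
                   'TicketNum': [1, 1, 1, 2, 2, 2, 2]})

ticket = df['TicketNum']
# 0/1 flags per passenger, so that a per-ticket sum gives a head count
is_baby = (df['Age'] < 1).astype(int)
is_child = ((df['Age'] >= 1) & (df['Age'] < 12)).astype(int)

# transform broadcasts each per-ticket result back onto that ticket's rows
group_size = ticket.groupby(ticket).transform('count')
babies_count = is_baby.groupby(ticket).transform('sum')
child_count = is_child.groupby(ticket).transform('sum')
adult_count = group_size - babies_count - child_count
ticket_fare = df.groupby('TicketNum')['Fare'].transform('mean')

fare2 = (ticket_fare - babies_count * 2) / (adult_count + child_count * 0.5)
# Keep the result only for tickets shared by more than one passenger
df['Fare2'] = fare2.where(group_size > 1)
print(df)
</code></pre>
<p>Every intermediate Series shares the index of <code>df</code>, so the final assignment aligns by label and no <code>.values</code> workaround is needed.</p>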