<p>Try it with <code>groupby()</code>, <code>map()</code>, <code>agg()</code> and <code>join()</code>:</p>
<pre><code>df['itinerary']= (df['ID'].map(df.groupby('ID')[['dep','arr']]
.agg(list)
.sum(1)
.agg(lambda x:list(set(x))[::-1]).str.join('-')))
</code></pre>
<p>Output of <code>df</code>:</p>
<pre><code>   dep  arr    ID  step    itinerary
0  NYC  PAR  idx1     1  NYC-PAR-SYD
1  PAR  SYD  idx1     2  NYC-PAR-SYD
2  MAD  BCN  idx2     1      MAD-BCN
</code></pre>
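<p>One caveat: a Python <code>set</code> does not guarantee any ordering, so the one-liner above can return the stops in a different order from run to run. A minimal sketch of an order-preserving variant, assuming the rows of each <code>ID</code> are already in travel order and using <code>dict.fromkeys</code> to drop duplicates (the helper name <code>unique_stops</code> is made up for illustration):</p>

```python
import pandas as pd

# Same sample data as in the output shown above.
df = pd.DataFrame({
    "dep": ["NYC", "PAR", "MAD"],
    "arr": ["PAR", "SYD", "BCN"],
    "ID": ["idx1", "idx1", "idx2"],
    "step": [1, 2, 1],
})

def unique_stops(legs):
    # Hypothetical helper: all departures followed by all arrivals,
    # de-duplicated while keeping first-seen order (dicts preserve insertion order).
    stops = legs["dep"].tolist() + legs["arr"].tolist()
    return "-".join(dict.fromkeys(stops))

# Build one itinerary string per ID, then map it back onto every row.
routes = {key: unique_stops(legs) for key, legs in df.groupby("ID")}
df["itinerary"] = df["ID"].map(routes)
print(df)
```

<p>This trades the chained one-liner for a small helper, which is easier to read and does not depend on how a <code>set</code> happens to iterate.</p>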
<h2>Update:</h2>
<p>I don't think this is efficient, but it gets the job done:</p>
<pre><code># run the code above first
check=(df['ID'].map(df.groupby('ID')[['dep','arr']]
.agg(list)
.sum(1)
.agg(lambda x:list(x)[::-1]).str.join('-')))
splited=check.str.split('-')
# set() is not applied here, so duplicates are kept; this series is only used for the comparison
cond=(splited.str.len().eq(4)) & (splited.str[0]==splited.str[-1])
</code></pre>
<p>Finally:</p>
<pre><code>df.loc[cond,'itinerary']=splited.str[-1]+'-'+df.loc[cond,'itinerary']
</code></pre>
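<p>If the goal of the update is an itinerary that repeats the origin at the end of a round trip, a different technique may be simpler: sort each group by <code>step</code> and chain the first departure with every arrival. This is only a sketch under stated assumptions: each leg is assumed to depart from where the previous one arrived, the <code>idx3</code> rows are invented sample data, and <code>chain_legs</code> is a hypothetical helper name.</p>

```python
import pandas as pd

df = pd.DataFrame({
    "dep": ["NYC", "PAR", "MAD", "LON", "ROM"],
    "arr": ["PAR", "SYD", "BCN", "ROM", "LON"],
    "ID": ["idx1", "idx1", "idx2", "idx3", "idx3"],  # idx3 is an invented round trip
    "step": [1, 2, 1, 1, 2],
})

def chain_legs(legs):
    # Hypothetical helper: order the legs by step, then take the first
    # departure followed by every arrival. Assumes consecutive legs connect.
    legs = legs.sort_values("step")
    stops = [legs["dep"].iloc[0]] + legs["arr"].tolist()
    return "-".join(stops)

# One itinerary string per ID, mapped back onto every row of that ID.
routes = {key: chain_legs(legs) for key, legs in df.groupby("ID")}
df["itinerary"] = df["ID"].map(routes)
print(df)
```

<p>Because nothing is de-duplicated, a round trip keeps its repeated endpoint without a separate condition on the number of stops.</p>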