<p>Because your sublists contain duplicates, this is more of a <code>pivot</code> problem than a <code>get_dummies</code> one, but you will need to explode the sublists first.</p>
<p>You can use <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.explode.html" rel="nofollow noreferrer"><code>explode</code></a> followed by <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.crosstab.html" rel="nofollow noreferrer"><code>crosstab</code></a> here.</p>
<hr/>
<pre><code>ii = df['items'].explode()
pd.crosstab(ii.index, ii)
</code></pre>
<p/>
<pre><code>items  a  b  c  d  e  f
row_0
0      1  0  0  0  0  0
1      1  1  0  0  0  0
2      0  0  0  1  1  2
3      0  0  0  1  1  1
4      1  1  1  0  0  0
</code></pre>
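<p>For anyone who wants to run this end to end, here is a minimal self-contained sketch. The input frame is an assumption: the question's data is not shown here, so <code>df</code> is reconstructed from the output above. The final <code>clip</code> step is also an addition, only relevant if you want 0/1 indicators (as <code>get_dummies</code> would give) rather than counts.</p>

```python
import pandas as pd

# Hypothetical input, reconstructed from the output shown above
# (the original question's frame is not reproduced in this answer).
df = pd.DataFrame({
    "items": [
        ["a"],
        ["a", "b"],
        ["d", "e", "f", "f"],  # "f" appears twice in this row
        ["d", "e", "f"],
        ["a", "b", "c"],
    ]
})

# One row per list element; the original row label is repeated in the index.
ii = df["items"].explode()

# Count how often each item occurs per original row.
counts = pd.crosstab(ii.index, ii)

# Optional (not part of the answer above): cap counts at 1 for presence/absence.
indicators = counts.clip(upper=1)

print(counts)
```

<p>Keeping the counts is what makes this a pivot-style problem: row 2 holds <code>f</code> twice, which a plain indicator encoding would collapse to 1.</p>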
<hr/>
<p>Performance</p>
<pre><code>df = pd.concat([df]*10_000, ignore_index=True)
In [91]: %timeit chris(df)
1.07 s ± 5.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [92]: %timeit user11871120(df)
15.8 s ± 124 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [93]: %timeit ricky_kim(df)
56.4 s ± 1.1 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
</code></pre>