<p>You can extract the values with <code>str.extract</code> and a regular-expression pattern:</p>
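<pre><code>import re
import pandas as pd

# df and ref_dict are assumed to exist already;
# ref_dict maps each company name to a list of its device names
s = pd.Series(ref_dict).explode()  # index: company, value: device

# extract company (re.escape guards against regex metacharacters in names)
df['COMPANY'] = df['DESC_DETAIL'].str.extract(
    f"({'|'.join(map(re.escape, s.index.unique()))})",
    flags=re.IGNORECASE, expand=False)
# extract device
df['DEVICE'] = df['DESC_DETAIL'].str.extract(
    f"({'|'.join(map(re.escape, s))})",
    flags=re.IGNORECASE, expand=False)
# fill missing company values based on device (case-insensitive lookup)
df['COMPANY'] = df['COMPANY'].fillna(
    df['DEVICE'].str.lower().map(dict(zip(s.str.lower(), s.index))))
df
</code></pre>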
<p>Output:</p>
<pre><code>                      DESC_DETAIL   COMPANY   DEVICE
0  Probably task Company2 C2_Dev5  Company2  C2_Dev5
1             File system C3_Dev1  Company3  C3_Dev1
2   Weather subcutaneous Company2  Company2      NaN
3       Company1 Travesty C1_Dev3  Company1  C1_Dev3
4         Does not match anything       NaN      NaN
</code></pre>
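<p>The snippet above relies on <code>df</code> and <code>ref_dict</code> from the question. For reference, here is a minimal self-contained sketch of the same approach. The contents of <code>ref_dict</code> and the helper <code>build_pattern</code> are assumptions made for illustration, not part of the original post; only the <code>DESC_DETAIL</code> values are taken from the output shown above.</p>

```python
import re
import pandas as pd

# Hypothetical reference data (assumed): company -> list of its devices
ref_dict = {
    "Company1": ["C1_Dev1", "C1_Dev2", "C1_Dev3"],
    "Company2": ["C2_Dev5", "C2_Dev6"],
    "Company3": ["C3_Dev1"],
}

df = pd.DataFrame({"DESC_DETAIL": [
    "Probably task Company2 C2_Dev5",
    "File system C3_Dev1",
    "Weather subcutaneous Company2",
    "Company1 Travesty C1_Dev3",
    "Does not match anything",
]})

# One row per (company, device) pair: index is the company, value is the device
s = pd.Series(ref_dict).explode()


def build_pattern(names):
    """Hypothetical helper: one capture group alternating over the escaped names.

    Longer names are placed first so that a name is never shadowed
    by a shorter name that happens to be its prefix.
    """
    ordered = sorted(set(names), key=lambda n: (-len(n), n))
    return "(" + "|".join(re.escape(n) for n in ordered) + ")"


# expand=False returns a Series, since the pattern has a single capture group
df["COMPANY"] = df["DESC_DETAIL"].str.extract(
    build_pattern(s.index), flags=re.IGNORECASE, expand=False)
df["DEVICE"] = df["DESC_DETAIL"].str.extract(
    build_pattern(s), flags=re.IGNORECASE, expand=False)

# Where only a device was found, infer the company from the device name
device_to_company = dict(zip(s.str.lower(), s.index))
df["COMPANY"] = df["COMPANY"].fillna(
    df["DEVICE"].str.lower().map(device_to_company))

print(df)
```

<p>Sorting the alternatives longest-first is a design choice for this sketch: regex alternation takes the first alternative that matches, so a device named <code>Dev1</code> listed before <code>Dev10</code> would otherwise win on text containing <code>Dev10</code>.</p>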