<p>If needed, use <a href="http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.contains.html" rel="nofollow noreferrer"><code>Series.str.contains</code></a> to test which rows match, <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.where.html" rel="nofollow noreferrer"><code>numpy.where</code></a> to fill the result column according to that condition, and then <a href="http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.replace.html" rel="nofollow noreferrer"><code>Series.str.replace</code></a> with a callback to apply the dictionary replacement only to the rows that need it:</p>
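<pre><code>import re
import numpy as np

# Keys of correct_domain are literal strings, so escape them for the regex.
pat = '|'.join(re.escape(k) for k in correct_domain)
# Boolean mask of rows containing any key; missing values count as no match.
m = df['emails'].str.contains(pat, na=False)
df['result'] = np.where(m, 'email cleaned', 'no cleaning needed')
# Replace only in the matching rows, looking each match up in the dictionary.
df.loc[m, 'emails'] = (df.loc[m, 'emails']
    .str.replace(pat, lambda x: correct_domain[x.group()], regex=True))
print(df)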
            emails              result
0    jim@gmail.com       email cleaned
1    bob@gmail.com  no cleaning needed
2   mary@gmail.com       email cleaned
3  bobby@gmail.com  no cleaning needed
</code></pre>
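<p>The snippet above relies on a <code>df</code> and a <code>correct_domain</code> dictionary defined in the question. As a minimal self-contained sketch of the same three steps, the version below wraps them in a helper; the function name <code>clean_emails</code>, the sample addresses, and the misspelled domains in the dictionary are all hypothetical and chosen only for illustration:</p>

```python
import re

import numpy as np
import pandas as pd


def clean_emails(df, correct_domain):
    """Return a copy of df with dictionary keys in 'emails' replaced by their values.

    Hypothetical helper: it assumes the keys are literal substrings to replace.
    """
    out = df.copy()
    # Escape the keys so regex metacharacters such as '.' match literally.
    pat = '|'.join(re.escape(key) for key in correct_domain)
    # Mask of rows that contain at least one key; missing values become False.
    mask = out['emails'].str.contains(pat, na=False)
    out['result'] = np.where(mask, 'email cleaned', 'no cleaning needed')
    # The callback receives a match object and looks the matched text up.
    out.loc[mask, 'emails'] = out.loc[mask, 'emails'].str.replace(
        pat, lambda match: correct_domain[match.group()], regex=True)
    return out


# Hypothetical sample data, not taken from the original question.
correct_domain = {'gmal.com': 'gmail.com', 'gmial.com': 'gmail.com'}
sample = pd.DataFrame({'emails': ['jim@gmal.com', 'bob@gmail.com',
                                  'mary@gmial.com', 'bobby@gmail.com', None]})
cleaned = clean_emails(sample, correct_domain)
print(cleaned)
```

<p>Working on a copy leaves the input frame untouched, and the <code>na=False</code> argument keeps rows with a missing address out of the mask so they are simply labelled as not needing cleaning.</p>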