<p>Based on the question, there are two things to address:</p>
<ol>
<li>Extracting the domain from the data returned by clearbit</li>
<li>Working with pandas</li>
</ol>
<hr/>
<ol>
<li>The clearbit API returns a dictionary, so you only need to read the <code>domain</code> key from it.</li>
</ol>
<p>For example:</p>
<pre><code>data = clearbit.NameToDomain.find(name=n)
print(data) # Dictionary
print(data['domain']) # Domain value
</code></pre>
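<p>Indexing with <code>data['domain']</code> raises an error if the key is absent or if no result came back. A minimal sketch of a more forgiving accessor follows; the helper name <code>extract_domain</code> and the sample dictionaries are made up for illustration and only assume a dict-like response as described above.</p>

```python
def extract_domain(data, default="unknown"):
    """Return the 'domain' entry of a dict-like response, or a default."""
    # Treat None (no result) the same as a response without a 'domain' key.
    if not data:
        return default
    return data.get("domain") or default


# Stand-in dictionaries shaped like the response described above (made-up values).
found = {"name": "Example Inc", "domain": "example.com"}
missing_key = {"name": "Example Inc"}

print(extract_domain(found))
print(extract_domain(missing_key))
print(extract_domain(None))
```

<p>Using <code>.get</code> with a fallback keeps a single missing result from interrupting a run over many names.</p>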
<ol start="2">
<li>With pandas, you do not need to loop over the data.</li>
</ol>
<p>Use <code>apply</code> instead:</p>
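<pre><code>import pandas as pd
from urllib.parse import urlparse

def parse_url(x):
    # Return the network location (host) of the URL, or 'unknown' for missing values
    return 'unknown' if pd.isnull(x) else urlparse(x).netloc

df = pd.read_csv("./new.csv")
# Apply the parser to every value in the column; no explicit loop is needed
df['domain'] = df['Profile URL'].apply(parse_url)
# Keep only the columns of interest
df_new = df.loc[:, ['Company', 'domain']]
</code></pre>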
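<p>One caveat with <code>urlparse</code>: when a URL has no scheme (for example <code>www.example.com/about</code>), the host ends up in the path and <code>netloc</code> is empty. The sketch below is one possible way to tolerate that, assuming the column may contain such values; the helper name <code>extract_netloc</code> and the sample URLs are hypothetical.</p>

```python
from urllib.parse import urlparse


def extract_netloc(url, default="unknown"):
    """Return the host part of a URL, tolerating a missing scheme."""
    # Non-string values (None, NaN) and blank strings count as missing.
    if not isinstance(url, str) or not url.strip():
        return default
    url = url.strip()
    # Without a leading "//", urlparse treats the host as part of the path,
    # so add it when no scheme is present.
    if "://" not in url and not url.startswith("//"):
        url = "//" + url
    return urlparse(url).netloc or default


print(extract_netloc("https://www.example.com/about"))
print(extract_netloc("www.example.com/about"))
print(extract_netloc(float("nan")))
```

<p>If it suits the data, this function could be passed to <code>apply</code> in place of a stricter parser, e.g. <code>df['Profile URL'].apply(extract_netloc)</code>.</p>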
<h2>Edit:</h2>
<p>A clearbit-based parser could be implemented as follows (<em>I have not tried this code, but it should work</em>):</p>
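<pre><code>import clearbit
import pandas as pd

def parse_url(x):
    if pd.isnull(x):
        return 'unknown'
    data = clearbit.NameToDomain.find(name=x)
    # Guard against an empty result before reading the domain
    return data.get('domain', 'Default value') if data else 'Default value'
</code></pre>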
<blockquote>
<p>This code imports data from the CSV provided. You may instead call the clearbit API in the <code>parse_url</code> function and return an appropriate value.</p>
<p>This solution works on Python 3. Please take it as a starting point and not as a copy-paste solution.</p>
</blockquote>
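<p>When the lookup goes through an API, a column with repeated company names triggers one request per row. A possible refinement, sketched here under assumptions, is to cache results per distinct name. <code>make_cached_lookup</code> and <code>fake_lookup</code> are hypothetical names; the fake function stands in for the real call so the sketch runs offline.</p>

```python
from functools import lru_cache


def make_cached_lookup(lookup, default="unknown"):
    """Wrap a name -> dict lookup so each distinct name is resolved once."""

    @lru_cache(maxsize=None)
    def cached(name):
        data = lookup(name)
        # Guard against an empty result as well as a missing 'domain' key.
        return data.get("domain", default) if data else default

    return cached


# Hypothetical stand-in for the real API call, used to keep the sketch offline.
calls = []


def fake_lookup(name):
    calls.append(name)
    return {"domain": name.lower().replace(" ", "") + ".com"}


name_to_domain = make_cached_lookup(fake_lookup)
domains = [name_to_domain(n) for n in ["Acme", "Acme", "Foo Bar"]]
print(domains)
print(calls)  # each distinct name was looked up only once
```

<p>In real use, the wrapped function would be the API call from the answer, for instance <code>make_cached_lookup(lambda n: clearbit.NameToDomain.find(name=n))</code>, and missing values would still need to be filtered out before the lookup.</p>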