<p>I have two DataFrames of different sizes.</p>
<p><code>df1</code> has addresses but no zipcodes.
<code>df2</code> has both addresses and zipcodes.</p>
<p>I am trying to use <code>np.where</code> to match the addresses in <code>df1</code> against those in <code>df2</code> and, where there is a match, bring the corresponding zipcode across to <code>df1</code>.</p>
<p>However, I have just realised that this does not work when the two DataFrames have different lengths.</p>
<p>The first DataFrame, without zipcodes:</p>
<pre><code>import numpy as np
import pandas as pd

df1 = pd.DataFrame({'address1':['1 o\'toole st','2 main st','3 high street','5 foo street','10 foo street'],
'address2':['town1',np.nan,np.nan,'Bartown',np.nan],
'address3':[np.nan,'village','city','county2','county3']})
df1['zipcode']=''
print(df1)
        address1 address2 address3 zipcode
0   1 o'toole st    town1      NaN
1      2 main st      NaN  village
2  3 high street      NaN     city
3   5 foo street  Bartown  county2
4  10 foo street      NaN  county3
</code></pre>
<p>The second DataFrame, from which the zipcodes should be taken:</p>
<pre><code>df2 = pd.DataFrame({'address1':['1 o\'toole st','2 main st','7 mill street','5 foo street','10 foo street','asda'],
'address2':['town1','village','city','Bartown','county3','efsefs'],
'address3':[np.nan,np.nan,np.nan,'county2','USA','asdasd'],
'zipcode': ['er45','qw23','rt67','yu89','yu83','aedsa']})
print(df2)
        address1 address2 address3 zipcode
0   1 o'toole st    town1      NaN    er45
1      2 main st  village      NaN    qw23
2  7 mill street     city      NaN    rt67
3   5 foo street  Bartown  county2    yu89
4  10 foo street  county3      USA    yu83
5           asda   efsefs   asdasd   aedsa
</code></pre>
<p>I use <code>np.where</code> to fill the <code>df1['zipcode']</code> column: if the address is found in both DataFrames, return <code>df2['zipcode']</code>, otherwise <code>'no_match'</code>:</p>
<pre><code>df1['zipcode'] = np.where(df1['address1'].isin(df2['address1']), df2['zipcode'], 'no_match')

ValueError                                Traceback (most recent call last)
<ipython-input-176-499624d43d5c> in <module>
----> 1 df1['zipcode'] = np.where(df1['address1'].isin(df2['address1']), df2['zipcode'], 'no_match')
      2 df1

ValueError: operands could not be broadcast together with shapes (5,) (6,) ()
</code></pre>
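<p>The same failure can be reproduced without pandas, which suggests the problem is the length mismatch itself. This is a minimal sketch: the arrays <code>cond</code> and <code>values</code> are hypothetical stand-ins for the <code>isin</code> mask on <code>df1</code> (5 rows) and for <code>df2['zipcode']</code> (6 rows).</p>

```python
import numpy as np

# Hypothetical stand-in for df1['address1'].isin(df2['address1']): 5 elements
cond = np.array([True, True, False, True, True])

# Hypothetical stand-in for df2['zipcode']: 6 elements
values = np.array(['er45', 'qw23', 'rt67', 'yu89', 'yu83', 'aedsa'])

try:
    # np.where picks element-wise, so cond and values must broadcast to one shape
    np.where(cond, values, 'no_match')
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

<p>Even if both arrays had the same length, <code>np.where</code> would pair the values by position rather than by address, so row <code>i</code> of <code>df1</code> would receive the zipcode from row <code>i</code> of <code>df2</code> whether or not those two rows hold the same address.</p>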
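<p>Is it possible to do this with <code>np.where</code> and DataFrames of different sizes? Or is there a better way to search for matches and bring the zipcode across?</p>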
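<p>One possible direction, offered as a sketch rather than a definitive answer, is to treat <code>df2</code> as a lookup table and use <code>Series.map</code> instead of <code>np.where</code>. The example below assumes that matching on <code>address1</code> alone is enough (as in the attempt above) and that keeping the first zipcode per address is acceptable when <code>df2</code> has duplicates. The name <code>lookup</code> is introduced here for illustration only.</p>

```python
import pandas as pd

df1 = pd.DataFrame({'address1': ["1 o'toole st", '2 main st', '3 high street',
                                 '5 foo street', '10 foo street']})
df2 = pd.DataFrame({'address1': ["1 o'toole st", '2 main st', '7 mill street',
                                 '5 foo street', '10 foo street', 'asda'],
                    'zipcode': ['er45', 'qw23', 'rt67', 'yu89', 'yu83', 'aedsa']})

# Assumption: one zipcode per address1. Duplicates are dropped so that the
# lookup index is unique, which Series.map needs when mapping with a Series.
lookup = df2.drop_duplicates('address1').set_index('address1')['zipcode']

# map() aligns on the address value, not on row position, so the two
# DataFrames may have different lengths. Unmatched addresses become NaN.
df1['zipcode'] = df1['address1'].map(lookup).fillna('no_match')

print(df1)
```

<p>If the match has to consider <code>address2</code> and <code>address3</code> as well, a left merge such as <code>df1.merge(df2, on=['address1', 'address2', 'address3'], how='left')</code> may be a better fit, although in the sample data those two columns are filled in differently in the two DataFrames, so they would probably need cleaning first.</p>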