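I looked at the example "Vectorized look-up of values in Pandas dataframe",
but my problem is slightly different and I can't find the right approach, even though it is a simple problem.
I have a DataFrame: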
PoliceStations_raw=pd.DataFrame(
[['BAYVIEW' ,37.729732,-122.397981],
['CENTRAL' ,37.798732,-122.409919],
['INGLESIDE' ,37.724676,-122.446215],
['MISSION' ,37.762849,-122.422005],
['NORTHERN' ,37.780186,-122.432467],
['PARK' ,37.767797,-122.455287],
['RICHMOND' ,37.779928,-122.464467],
['SOUTHERN' ,37.772380,-122.389412],
['TARAVAL' ,37.743733,-122.481500],
['TENDERLOIN',37.783674,-122.412899]],columns=['PdDistrict','XX','YY'])
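I have also defined PoliceStations, a transposed version of this table.
Then I have another table, df, with a column 'PdDistrict' that contains a categorical variable taking the values 'BAYVIEW', 'CENTRAL', and so on.
I would like a column df['XX'] that returns, for each row of df, the corresponding entry from PoliceStations.
I can't find the right syntax. Thanks for your help.
If possible, I would prefer a syntax that uses PoliceStations_raw (rather than the transposed table), because I think that table is more "natural".
I tried this, but it didn't work: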
df_raw['value'] = PoliceStations.lookup('XX',df_raw['PdDistrict'])
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
in ()
----> 1 df_raw['value'] = PoliceStations.lookup('XX',df_raw['PdDistrict'])

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/frame.pyc in lookup(self, row_labels, col_labels)
   2641         n = len(row_labels)
   2642         if n != len(col_labels):
-> 2643             raise ValueError('Row labels must have same size as column labels')
   2644
   2645         thresh = 1000

ValueError: Row labels must have same size as column labels
I don't think I have made a mistake with the labels, though:
df_raw['PdDistrict'].cat.categories
Index([u'BAYVIEW', u'CENTRAL', u'INGLESIDE', u'MISSION', u'NORTHERN', u'PARK', u'RICHMOND', u'SOUTHERN', u'TARAVAL', u'TENDERLOIN'], dtype='object')
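A note on the error above: the traceback shows lookup comparing len(row_labels) with len(col_labels), and the bare string 'XX' has length 2, which would explain the size mismatch regardless of the labels themselves. As a minimal sketch of one alternative that works directly from PoliceStations_raw, Series.map can be given a Series indexed by district. The name df_demo is a hypothetical stand-in for df_raw, and only three stations are included to keep it short. DataFrame.lookup was also deprecated in pandas 1.2 and removed in 2.0, so this avoids it entirely.

```python
import pandas as pd

# Shortened copy of the station table from the question
PoliceStations_raw = pd.DataFrame(
    [['BAYVIEW', 37.729732, -122.397981],
     ['CENTRAL', 37.798732, -122.409919],
     ['TARAVAL', 37.743733, -122.481500]],
    columns=['PdDistrict', 'XX', 'YY'])

# Hypothetical stand-in for df_raw (plain strings rather than a categorical)
df_demo = pd.DataFrame({'PdDistrict': ['CENTRAL', 'TARAVAL', 'CENTRAL', 'BAYVIEW']})

# The length check in the traceback: 'XX' has length 2, the column has 4 rows
assert len('XX') != len(df_demo['PdDistrict'])

# Build a district -> XX mapping, then look up each row's district in it
xx_by_district = PoliceStations_raw.set_index('PdDistrict')['XX']
df_demo['XX'] = df_demo['PdDistrict'].map(xx_by_district)
print(df_demo)
```

If the PdDistrict column is categorical, as in df_raw, the mapped column may come back categorical as well; casting with astype(str) before calling map is one way to get plain floats.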
Edit:
I am also trying the following:
PoliceStations_raw=pd.DataFrame(
[['BAYVIEW' ,37.729732,-122.397981],
['CENTRAL' ,37.798732,-122.409919],
['INGLESIDE' ,37.724676,-122.446215],
['MISSION' ,37.762849,-122.422005],
['NORTHERN' ,37.780186,-122.432467],
['PARK' ,37.767797,-122.455287],
['RICHMOND' ,37.779928,-122.464467],
['SOUTHERN' ,37.772380,-122.389412],
['TARAVAL' ,37.743733,-122.481500],
['TENDERLOIN',37.783674,-122.412899]],columns=['PdDistrict','XX','YY'])
df1=pd.DataFrame([[0,'CENTRAL'],[1,'TARAVAL'],[3,'CENTRAL'],[2,'BAYVIEW']])
df1.columns = ['Index','PdDistrict']
   Index PdDistrict
0      0    CENTRAL
1      1    TARAVAL
2      3    CENTRAL
3      2    BAYVIEW
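With the merge below, despite passing sort=False, the returned object has merged the tables but appears to use PdDistrict as the index, and it changes the row order of the original left DataFrame.
Help!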
pd.merge(df1,PoliceStations_raw,sort=False)
It returns this:
   Index PdDistrict         XX          YY
0      0    CENTRAL  37.798732 -122.409919
1      3    CENTRAL  37.798732 -122.409919
2      1    TARAVAL  37.743733 -122.481500
3      2    BAYVIEW  37.729732 -122.397981
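The merge above uses the default how='inner'. A minimal sketch of one possible fix, assuming the goal is to keep df1's rows in their original order, is to request a left join on the key column explicitly. The station table is shortened to three rows here, and the name merged is only illustrative.

```python
import pandas as pd

# Shortened copy of the station table from the question
PoliceStations_raw = pd.DataFrame(
    [['BAYVIEW', 37.729732, -122.397981],
     ['CENTRAL', 37.798732, -122.409919],
     ['TARAVAL', 37.743733, -122.481500]],
    columns=['PdDistrict', 'XX', 'YY'])

df1 = pd.DataFrame(
    [[0, 'CENTRAL'], [1, 'TARAVAL'], [3, 'CENTRAL'], [2, 'BAYVIEW']],
    columns=['Index', 'PdDistrict'])

# A left join keeps every row of df1, in df1's order, and attaches
# the matching station coordinates from the right-hand table
merged = pd.merge(df1, PoliceStations_raw, on='PdDistrict', how='left')
print(merged)
```

Because each district appears only once in PoliceStations_raw, the result has exactly one row per row of df1.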