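I have a df filled with XY coordinates for different subjects. I want to create new columns that pick out the XY coordinates of a specified subject.
Whenever a subject's name appears in the 'Person' column, the row should return that subject's XY coordinates at that index.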
import pandas as pd
import numpy as np
import random
AA = 10, 20
k = 5
N = 10
df = pd.DataFrame({
'John Doe_X' : np.random.uniform(k, k + 100 , size=N),
'John Doe_Y' : np.random.uniform(k, k + 100 , size=N),
'Kevin Lee_X' : np.random.uniform(k, k + 100 , size=N),
'Kevin Lee_Y' : np.random.uniform(k, k + 100 , size=N),
'Liam Smith_X' : np.random.uniform(k, k + -100 , size=N),
'Liam Smith_Y' : np.random.uniform(k, k + 100 , size=N),
'Event' : ['AA', 'nan', 'BB', 'nan', 'nan', 'CC', 'nan','CC', 'DD','nan'],
'Person' : ['nan','nan','John Doe','John Doe','nan','Kevin Lee','nan','Liam Smith','John Doe','John Doe']})
df['X'] = df.apply(lambda row: row.get(row['Person']+'_X') if pd.notnull(row['Person']) else np.nan, axis=1)
df['Y'] = df.apply(lambda row: row.get(row['Person']+'_Y') if pd.notnull(row['Person']) else np.nan, axis=1)
Output:
Event John Doe_X John Doe_Y Kevin Lee_X Kevin Lee_Y Liam Smith_X \
0 AA 75.047164 19.281168 28.064313 87.184248 -76.148559
1 nan 50.642782 68.308319 46.088057 64.132263 -83.109383
2 BB 9.965115 77.950894 48.864693 8.613132 0.106708
3 nan 44.726136 58.751520 69.904076 40.818433 -87.656064
4 nan 101.501119 99.156872 101.976300 93.539749 -57.026015
5 CC 87.778446 65.814911 7.302116 40.577156 -28.703879
6 nan 99.682139 91.715231 88.029451 82.309191 -66.444582
7 CC 38.248267 38.648960 76.065297 67.322639 -34.754868
8 DD 69.429353 61.252800 83.024358 58.038962 -62.001353
9 nan 9.522023 73.009883 41.873986 8.677565 -20.389939
Liam Smith_Y Person X Y
0 18.420494 nan NaN NaN
1 33.206289 nan NaN NaN
2 73.833204 John Doe 9.965115 77.950894
3 39.652071 John Doe 44.726136 58.751520
4 88.176561 nan NaN NaN
5 53.776995 Kevin Lee 7.302116 40.577156
6 95.025923 nan NaN NaN
7 26.851864 Liam Smith -34.754868 26.851864
8 102.771046 John Doe 69.429353 61.252800
9 28.633231 John Doe 9.522023 73.009883
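I now want to use the 'Event' column to refine the new ['X','Y'] columns. Specifically, when the value 'AA' appears in the 'Event' column, I want to return the coordinates of AA, (10, 20). I would also like to keep those same coordinates until the next coordinate appears.
So the output would look like: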
Event John Doe_X John Doe_Y Kevin Lee_X Kevin Lee_Y Liam Smith_X \
0 AA 75.047164 19.281168 28.064313 87.184248 -76.148559
1 nan 50.642782 68.308319 46.088057 64.132263 -83.109383
2 BB 9.965115 77.950894 48.864693 8.613132 0.106708
3 nan 44.726136 58.751520 69.904076 40.818433 -87.656064
4 nan 101.501119 99.156872 101.976300 93.539749 -57.026015
5 CC 87.778446 65.814911 7.302116 40.577156 -28.703879
6 nan 99.682139 91.715231 88.029451 82.309191 -66.444582
7 CC 38.248267 38.648960 76.065297 67.322639 -34.754868
8 DD 69.429353 61.252800 83.024358 58.038962 -62.001353
9 nan 9.522023 73.009883 41.873986 8.677565 -20.389939
Liam Smith_Y Person X Y
0 18.420494 nan 10 20
1 33.206289 nan 10 20
2 73.833204 John Doe 9.965115 77.950894
3 39.652071 John Doe 44.726136 58.751520
4 88.176561 nan NaN NaN
5 53.776995 Kevin Lee 7.302116 40.577156
6 95.025923 nan NaN NaN
7 26.851864 Liam Smith -34.754868 26.851864
8 102.771046 John Doe 69.429353 61.252800
9 28.633231 John Doe 9.522023 73.009883
I tried writing something like this:
for value in df['Event']:
    if value == 'AA':
        df['X', 'Y'] = AA
But this raises a ValueError: ValueError: Length of values does not match length of index
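Your code has a few errors (one of them is mixing up Person and Player). I assume these are paste errors.
That said, your problem is easy to solve by building a mask and using df.loc to assign the tuple AA to the subset that the mask selects. Iterating over the rows is another option.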
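A minimal sketch of the mask-and-df.loc idea, with a row-by-row loop for comparison. The helper names `fill_aa_mask` and `fill_aa_loop` are hypothetical, and the X/Y values are copied from the printed output in the question rather than regenerated. It assumes that "until the next coordinate appears" means the AA coordinates carry forward only until a row that already has its own X/Y, and that an 'AA' row always takes the AA coordinates.

```python
import numpy as np
import pandas as pd

AA = (10, 20)
nan = np.nan

# X/Y as produced by the Person lookup in the question (values from its output).
df = pd.DataFrame({
    'Event': ['AA', 'nan', 'BB', 'nan', 'nan', 'CC', 'nan', 'CC', 'DD', 'nan'],
    'X': [nan, nan, 9.965115, 44.726136, nan, 7.302116, nan,
          -34.754868, 69.429353, 9.522023],
    'Y': [nan, nan, 77.950894, 58.751520, nan, 40.577156, nan,
          26.851864, 61.252800, 73.009883],
})


def fill_aa_mask(df, coords):
    """Assign coords to 'AA' rows and to the empty rows that follow them."""
    out = df.copy()
    is_aa = out['Event'].eq('AA')
    # A new segment starts at every 'AA' row and at every row with coordinates.
    starts = is_aa | out['X'].notna()
    segment = starts.cumsum()
    # Select every row that belongs to a segment opened by an 'AA' row.
    fill = segment.isin(segment[is_aa])
    out.loc[fill, ['X', 'Y']] = coords
    return out


def fill_aa_loop(df, coords):
    """Same rule, written as an explicit loop over the rows."""
    out = df.copy()
    carrying = False
    for idx in out.index:
        if out.at[idx, 'Event'] == 'AA':
            carrying = True            # start carrying the AA coordinates
        elif pd.notna(out.at[idx, 'X']):
            carrying = False           # another coordinate appeared
        if carrying:
            out.at[idx, 'X'], out.at[idx, 'Y'] = coords
    return out


result = fill_aa_mask(df, AA)
print(result)
```

The original attempt fails because `df['X', 'Y'] = AA` tries to create a single column from a two-element tuple, whose length does not match the ten-row index. Selecting the rows with a boolean mask and the two columns by name in `df.loc` avoids that. The loop gives the same result on this data but will be slower on large frames.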