I have a dataset with missing values that I want to impute, using the StdDev/Mean of the existing data within each feature, for each country, over time.
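I would like to set up a loop/iteration that uses groupby together with a lambda or a for loop to walk through the groups and impute the missing values group by group.
(The real data has far more than 3 years, 3 countries and 3 features.)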
[country, feature, year, value]
USA A 1995 8
USA B 1995 NaN
USA C 1995 326
USA A 1996 14
USA B 1996 42
USA C 1996 NaN
USA A 1997 NaN
USA B 1997 50
USA C 1997 400
CHN A 1995 6
CHN B 1995 34
CHN C 1995 NaN
CHN A 1996 NaN
CHN B 1996 NaN
CHN C 1996 381
CHN A 1997 23
CHN B 1997 54
CHN C 1997 412
grp = df.groupby(['country', 'feature'])
for (country, feature), group in grp:
    ...  # ???? some iteration that imputes the NaN values ????
Expected output would return the df with the NaN values now imputed as the StdDev values for each country, with respect to each feature.
That is, not the StdDev of all the features or all the countries combined as a whole.
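For illustration, here is a minimal sketch of one way the per-group imputation described above might be done without an explicit loop, using groupby(...).transform(...) followed by fillna. The helper name impute_by_group and its stat parameter are hypothetical, and the column names are assumed to match the header [country, feature, year, value]. Whether the fill value should be the mean or the StdDev is left as a choice, since the question mentions both.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; NaN marks the missing values.
rows = [
    ("USA", "A", 1995, 8), ("USA", "B", 1995, np.nan), ("USA", "C", 1995, 326),
    ("USA", "A", 1996, 14), ("USA", "B", 1996, 42), ("USA", "C", 1996, np.nan),
    ("USA", "A", 1997, np.nan), ("USA", "B", 1997, 50), ("USA", "C", 1997, 400),
    ("CHN", "A", 1995, 6), ("CHN", "B", 1995, 34), ("CHN", "C", 1995, np.nan),
    ("CHN", "A", 1996, np.nan), ("CHN", "B", 1996, np.nan), ("CHN", "C", 1996, 381),
    ("CHN", "A", 1997, 23), ("CHN", "B", 1997, 54), ("CHN", "C", 1997, 412),
]
df = pd.DataFrame(rows, columns=["country", "feature", "year", "value"])
df["value"] = df["value"].astype(float)


def impute_by_group(frame, stat="mean"):
    """Hypothetical helper: fill NaNs in 'value' with a per-(country, feature) statistic.

    stat is any aggregation name accepted by transform, e.g. "mean" or "std".
    """
    out = frame.copy()
    # transform returns a Series aligned with the original rows, holding the
    # statistic of each row's own (country, feature) group; NaNs are skipped.
    fill = out.groupby(["country", "feature"])["value"].transform(stat)
    out["value"] = out["value"].fillna(fill)
    return out


imputed_mean = impute_by_group(df, "mean")
imputed_std = impute_by_group(df, "std")
```

Note that "std" here is the sample standard deviation (ddof=1), so a group with only one observed value would still produce NaN and would need a fallback. An explicit for loop over df.groupby(['country', 'feature']) could compute the same thing, but transform keeps the result aligned with the original rows.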
Thanks for any input.