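My dataset has several interesting columns that I want to aggregate into a single metric that I can use for further analysis.
The algorithm I wrote takes about 3 seconds to finish, so I am wondering whether there is a more efficient way to do this.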
def financial_score_calculation(df, dictionary_of_parameters):
    for parameter in dictionary_of_parameters:
        for target in dictionary_of_parameters[parameter]['target']:
            # row labels whose answer equals this target value
            index = df.loc[df[parameter] == target].index
            for i in index:
                old_score = df.at[i, 'financialliteracyscore']
                new_score = old_score + dictionary_of_parameters[parameter]['score']
                df.at[i, 'financialliteracyscore'] = new_score
    for i in df.index:
        old_score = df.at[i, 'financialliteracyscore']
        new_score = (old_score/27.0)*100  # converting score to percent value
        df.at[i, 'financialliteracyscore'] = new_score
    return df
Here is a truncated version of dictionary_of_parameters:
dictionary_of_parameters = {
    # money management parameters
    "SatisfactionLevelCurrentFinances": {'target': [8, 9, 10], 'score': 1},
    "WillingnessFinancialRisk": {'target': [8, 9, 10], 'score': 1},
    "ConfidenceLevelToEarn2000WithinMonth": {'target': [1], 'score': 1},
    "DegreeOfWorryAboutRetirement": {'target': [1], 'score': 1},
    "GoodWithYourMoney?": {'target': [7], 'score': 1}
}
Edit: generating toy data for df
import numpy as np
import pandas as pd

df = pd.DataFrame(columns = list(dictionary_of_parameters.keys()))
df['financialliteracyscore'] = 0
for i in range(10):
    # row i holds the value 2*i in all 6 columns
    df.loc[i] = dict(zip(df.columns,2*i*np.ones(6)))
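Note that in Pandas you are not limited to element-wise indexing with at. Here, index is a list of row labels, which can then be used to index with loc, so all matching rows are updated in one assignment.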
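The idea of passing a list of row labels to loc can be sketched as follows. This is an illustration only, not the answerer's original code: the function name financial_score_vectorized, the max_score parameter (defaulting to the 27.0 used in the question), the use of isin to match all target values at once, and the defensive copy are all assumptions.

```python
import pandas as pd


def financial_score_vectorized(df, dictionary_of_parameters, max_score=27.0):
    """Hypothetical vectorized variant of financial_score_calculation."""
    # Assumption: work on a copy so the caller's frame is left untouched
    # (the function in the question modifies df in place).
    df = df.copy()
    for parameter, spec in dictionary_of_parameters.items():
        # Row labels where this column holds any of the target values.
        index = df.loc[df[parameter].isin(spec['target'])].index
        # A single loc assignment updates every matching row at once.
        df.loc[index, 'financialliteracyscore'] += spec['score']
    # Convert the raw score to a percent value for the whole column in one step.
    df['financialliteracyscore'] = df['financialliteracyscore'] / max_score * 100
    return df
```

Calling financial_score_vectorized(df, dictionary_of_parameters) should give the same scores as the loop-based function, provided each parameter's target list has no repeated values. Whether it is noticeably faster depends on the size of the data.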
Here is a reference, although personally I never found it useful early on in my programming: https://pandas.pydata.org/pandas-docs/stable/indexing.html