Combining dataframes into a dictionary in Python, using one of the dataframes as the keys

**df1** UNIQUE CODES

Rank   12/8/2017   12/9/2017   ....   1/3/2018
1      Code_1      Code_3             Code_4
2      Code_2      Code_1             Code_2
...
1000   Code_5      Code_6             Code_7

**df2** NAMES

Rank   12/8/2017   12/9/2017   ....   1/3/2018
1      Jon         Maria              Peter
2      Brian       Jon                Maria
...
1000   Chris       Tim                Charles

**df3** SCORES

Rank   12/8/2017   12/9/2017   ....   1/3/2018
1      10          20                 30
2      15          10                 40
...
1000   25          15                 20

# reshape to get dates into rows
hashtags_reshaped = pd.melt(hashtags, id_vars = ['Rank'],
                            value_vars = hashtags.columns,
                            var_name = 'Date',
                            value_name = 'Code').drop('Rank', axis = 1)

# reshape to get dates into rows
players_reshaped = pd.melt(players, id_vars = ['Rank'],
                           value_vars = hashtags.columns,
                           var_name = 'Date',
                           value_name = 'Name').drop('Rank', axis = 1)

# reshape to get the dates into rows
trophies_reshaped = pd.melt(trophies, id_vars = ['Rank'],
                            value_vars = hashtags.columns,
                            var_name = 'Date',
                            value_name = 'Score').drop('Rank', axis = 1)

# merge the three together.
# This _assumes_ that the dfs are all in the same order and that all the data matches up.
merged_df = pd.DataFrame([hashtags_reshaped['Date'], hashtags_reshaped['Code'], players_reshaped['Name'], trophies_reshaped['Score']]).T
print(merged_df)

# group by code, name, and date; sum the scores together if multiple exist for a given code-name-date grouping
grouped_df = merged_df.groupby(['Code', 'Name', 'Date']).sum().sort_values('Score', ascending = False)
print(grouped_df)

summed_df = merged_df.drop('Date', axis = 1) \
    .groupby(['Code', 'Name']).sum() \
    .sort_values('Score', ascending = False).reset_index()
summed_df['li'] = list(zip(summed_df.Name, summed_df.Score))
print(summed_df)

0     (MandiBralaX, 996871590076253)
1     (Arso_C, 9955130513430)
2     (ThatRainbowGuy, 9946)
3     (fabi, 9940)
4     (Dogão, 991917)
5     (Hierbo, 99168)
6     (Clyde, 9916156180128)
7     (.A.R.M.I.N., 9916014310187143)
8     (keftedokofths, 9900)
9     (⚽AngelSosa⚽, 990)
10    (Totoo98, 99)

         Code            Name            Score  \
0   #JL2J02LY     MandiBralaX  996871590076253
1   #80JQ90VC          Arso_C    9955130513430
2   #9GGC2CUQ  ThatRainbowGuy             9946
3   #8LL989QV            fabi             9940
4    #9PPC89L           Dogão           991917
5  #2JPLQ8JP8          Hierbo            99168

1 answer

User

#1 · Posted 2024-06-10 05:52:55

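This should get you most of the way there. I did not create the dictionary at the end the way you specified: you may well need that format, but you would end up with nested dictionaries or lists, because each code has one name but potentially many dates and scores associated with it. How do you want those recorded: as a list, a dict, or something else?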

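The code below returns a grouped dataframe. You can output it straight to a dict (as shown), but you will probably need to specify the format in more detail, especially if you need an ordered dictionary. (Before Python 3.7, dictionaries did not guarantee any ordering; from 3.7 on, a plain dict preserves insertion order. If you need an ordered dictionary on an older version, or you want comparisons to take order into account, use from collections import OrderedDict and check the documentation.)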

import pandas as pd

# create the first df; note that 'Code' is set up as a string
# ('Rank' stays a regular column so that pd.melt can use it as an id_var)
df1 = pd.DataFrame({'Rank': [1, 2], '12/8/2017': ['1', '2'], '12/9/2017': ['3', '1']})

# the date columns are every column except 'Rank'
date_cols = [col for col in df1.columns if col != 'Rank']

# reshape to get dates into rows
df1_reshaped = pd.melt(df1, id_vars = ['Rank'],
                       value_vars = date_cols,
                       var_name = 'Date',
                       value_name = 'Code').drop('Rank', axis = 1)
#print(df1_reshaped)

# create the second df
df2 = pd.DataFrame({'Rank': [1, 2], '12/8/2017': ['Name_1', 'Name_2'], '12/9/2017': ['Name_3', 'Name_1']})

# reshape to get dates into rows
df2_reshaped = pd.melt(df2, id_vars = ['Rank'],
                       value_vars = date_cols,
                       var_name = 'Date',
                       value_name = 'Name').drop('Rank', axis = 1)
#print(df2_reshaped)

# create the third df; scores are numbers so that sum() adds them
# (summing strings would concatenate them instead)
df3 = pd.DataFrame({'Rank': [1, 2], '12/8/2017': [10, 20], '12/9/2017': [30, 10]})

# reshape to get the dates into rows
df3_reshaped = pd.melt(df3, id_vars = ['Rank'],
                       value_vars = date_cols,
                       var_name = 'Date',
                       value_name = 'Score').drop('Rank', axis = 1)
#print(df3_reshaped)

# merge the three together.
# This _assumes_ that the dfs are all in the same order and that all the data matches up.
# Building from a dict of columns keeps each column's dtype, so 'Score' stays numeric.
merged_df = pd.DataFrame({'Date': df1_reshaped['Date'],
                          'Code': df1_reshaped['Code'],
                          'Name': df2_reshaped['Name'],
                          'Score': df3_reshaped['Score']})
print(merged_df)

# group by code, name, and date; sum the scores together if multiple exist for a given code-name-date grouping
grouped_df = merged_df.groupby(['Code', 'Name', 'Date']).sum().sort_values('Score', ascending = False)
print(grouped_df)

# total score per code-name pair, highest first, plus a (name, score) tuple column
summed_df = merged_df.drop('Date', axis = 1) \
    .groupby(['Code', 'Name']).sum() \
    .sort_values('Score', ascending = False).reset_index()
summed_df['li'] = list(zip(summed_df.Name, summed_df.Score))
print(summed_df)
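
The merge step above lines the three reshaped frames up purely by row position. Here is a minimal sketch of a more defensive variant, assuming that Rank together with the date identifies the same entry in all three frames: keep Rank after melting, join on ['Rank', 'Date'], and convert Score to a number before summing so that string scores are not concatenated. The helper name melt_by_date and the variable names are made up for illustration; the sample values mirror the ones used above, except that the scores are deliberately stored as strings.

```python
import pandas as pd

def melt_by_date(df, value_name):
    """Turn the date columns into rows, keeping Rank as a join key (hypothetical helper)."""
    return df.melt(id_vars=['Rank'], var_name='Date', value_name=value_name)

codes = pd.DataFrame({'Rank': [1, 2], '12/8/2017': ['1', '2'], '12/9/2017': ['3', '1']})
names = pd.DataFrame({'Rank': [1, 2], '12/8/2017': ['Name_1', 'Name_2'], '12/9/2017': ['Name_3', 'Name_1']})
# Scores stored as strings on purpose, as they might be after loading raw data
scores = pd.DataFrame({'Rank': [1, 2], '12/8/2017': ['10', '20'], '12/9/2017': ['30', '10']})

# Join on explicit keys instead of relying on identical row order
merged = (melt_by_date(codes, 'Code')
          .merge(melt_by_date(names, 'Name'), on=['Rank', 'Date'])
          .merge(melt_by_date(scores, 'Score'), on=['Rank', 'Date']))

# Convert to numbers so that sum() adds the scores rather than joining the strings
merged['Score'] = pd.to_numeric(merged['Score'])

totals = merged.groupby(['Code', 'Name'], as_index=False)['Score'].sum()
print(totals)
```

With a key-based join, a row that is missing from one frame drops out of the inner join instead of silently shifting every later row out of alignment.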

Unordered dictionary:

d = dict(zip(summed_df.Code, summed_df.li))
print(d)

Of course, you can build the ordered version directly, and you probably should:

from collections import OrderedDict
d2 = OrderedDict(zip(summed_df.Code, summed_df.li))
print(d2)
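
As a side note on ordering, here is a small sketch that assumes Python 3.7 or later, where a plain dict keeps insertion order. The pairs are typed in by hand to match the sample data, and the variable names are made up. The practical difference that remains is that OrderedDict compares in an order-sensitive way.

```python
from collections import OrderedDict

pairs = [('3', ('Name_3', 30)), ('1', ('Name_1', 20)), ('2', ('Name_2', 20))]

plain = dict(pairs)
ordered = OrderedDict(pairs)

# On Python 3.7+, both keep the order in which the pairs were inserted
print(list(plain))    # ['3', '1', '2']
print(list(ordered))  # ['3', '1', '2']

# Equality is where they differ: plain dicts ignore order, OrderedDicts do not
reversed_pairs = list(reversed(pairs))
print(plain == dict(reversed_pairs))           # True
print(ordered == OrderedDict(reversed_pairs))  # False
```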

summed_df:

  Code    Name  Score            li
0    3  Name_3     30  (Name_3, 30)
1    1  Name_1     20  (Name_1, 20)
2    2  Name_2     20  (Name_2, 20)

d:

{'3': ('Name_3', 30), '1': ('Name_1', 20), '2': ('Name_2', 20)}

d2, ordered:

OrderedDict([('3', ('Name_3', 30)), ('1', ('Name_1', 20)), ('2', ('Name_2', 20))])

The dictionary stores each (name, score) pair as a tuple rather than a list, but it should get you most of the way there.
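
If you would rather keep every date and score per code instead of a single summed tuple, one possible shape is a nested dictionary. The sketch below is only an assumption about what that could look like, not the format the question asked for: each code maps to its name plus a list of [date, score] lists. The merged_df here is rebuilt by hand with the same sample values so that the block runs on its own, and the key names 'name' and 'scores' are made up.

```python
import pandas as pd

# Hand-built stand-in for the merged frame, using the sample values from the answer
merged_df = pd.DataFrame({
    'Date':  ['12/8/2017', '12/8/2017', '12/9/2017', '12/9/2017'],
    'Code':  ['1', '2', '3', '1'],
    'Name':  ['Name_1', 'Name_2', 'Name_3', 'Name_1'],
    'Score': [10, 20, 30, 10],
})

nested = {}
for code, group in merged_df.groupby('Code'):
    nested[code] = {
        # assumes each code has exactly one name, so the first row is representative
        'name': group['Name'].iloc[0],
        # one [date, score] list per row, in the original row order
        'scores': [[date, score] for date, score in zip(group['Date'], group['Score'])],
    }

print(nested)
```

Lists are used for the inner pairs here because they can be changed later; tuples work just as well if the pairs never change.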
