Pandas: concat a dictionary to a DataFrame

Posted 2024-04-25 21:05:26


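I have an existing DataFrame, and I am trying to concatenate a dictionary to it, where the lists in the dictionary are a different length from the DataFrame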

>>> df
         A        B        C
0  0.46324  0.32425  0.42194
1  0.10596  0.35910  0.21004
2  0.69209  0.12951  0.50186
3  0.04901  0.31203  0.11035
4  0.43104  0.62413  0.20567
5  0.43412  0.13720  0.11052
6  0.14512  0.10532  0.05310

and

test = {"One": [0.23413, 0.19235, 0.51221], "Two": [0.01293, 0.12235, 0.63291]}

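I am trying to add test to df while renaming the keys to "D" and "E". I have looked at

http://pandas.pydata.org/pandas-docs/stable/merging.html
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.concat.html

which suggest that I should be able to concatenate a dictionary to a DataFrame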

I tried:

pd.concat([df, test], axis=1, ignore_index=True, keys=["D", "E"])
pd.concat([df, test], axis=1, ignore_index=True)

but had no luck. The result I am trying to achieve is

df
         A        B        C        D        E
0  0.46324  0.32425  0.42194  0.23413  0.01293  
1  0.10596  0.35910  0.21004  0.19235  0.12235
2  0.69209  0.12951  0.50186  0.51221  0.63291
3  0.04901  0.31203  0.11035      NaN      NaN
4  0.43104  0.62413  0.20567      NaN      NaN 
5  0.43412  0.13720  0.11052      NaN      NaN
6  0.14512  0.10532  0.05310      NaN      NaN

2 answers

One way to do it is to convert the dictionary to a DataFrame and join it:

df.join(pd.DataFrame(test).rename(columns={'One':'D','Two':'E'}))

          A       B       C       D       E
0   0.46324 0.32425 0.42194 0.23413 0.01293
1   0.10596 0.35910 0.21004 0.19235 0.12235
2   0.69209 0.12951 0.50186 0.51221 0.63291
3   0.04901 0.31203 0.11035     NaN     NaN
4   0.43104 0.62413 0.20567     NaN     NaN
5   0.43412 0.13720 0.11052     NaN     NaN
6   0.14512 0.10532 0.05310     NaN     NaN

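This is needed because, as @Alexander correctly mentioned, the number of rows should match for concatenation. Otherwise, as in your case, the missing rows are filled with NaN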
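
For reference, here is a minimal self-contained sketch of the join approach, with df rebuilt from the values in the question. The names `extra` and `result` are introduced here for illustration only. It shows that join aligns on the index, so rows 3 to 6, which have no counterpart in test, come out as NaN.

```python
import pandas as pd

# Rebuild the question's data.
df = pd.DataFrame({
    "A": [0.46324, 0.10596, 0.69209, 0.04901, 0.43104, 0.43412, 0.14512],
    "B": [0.32425, 0.35910, 0.12951, 0.31203, 0.62413, 0.13720, 0.10532],
    "C": [0.42194, 0.21004, 0.50186, 0.11035, 0.20567, 0.11052, 0.05310],
})
test = {"One": [0.23413, 0.19235, 0.51221], "Two": [0.01293, 0.12235, 0.63291]}

# Convert the dict to a DataFrame first, then rename its columns by key.
extra = pd.DataFrame(test).rename(columns={"One": "D", "Two": "E"})

# join is a left join on the index by default: all 7 rows of df are kept,
# and index labels missing from `extra` (3..6) are filled with NaN.
result = df.join(extra)
print(result)
```

Note that join returns a new DataFrame; df itself is unchanged unless the result is assigned back to it.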

Assuming you want to add them as rows:

>>> pd.concat([df, pd.DataFrame(list(test.values()), columns=df.columns)], ignore_index=True)
         A        B        C
0  0.46324  0.32425  0.42194
1  0.10596  0.35910  0.21004
2  0.69209  0.12951  0.50186
3  0.04901  0.31203  0.11035
4  0.43104  0.62413  0.20567
5  0.43412  0.13720  0.11052
6  0.14512  0.10532  0.05310
7  0.01293  0.12235  0.63291
8  0.23413  0.19235  0.51221
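
A self-contained variant of the row-wise case, as a sketch rather than the answer's exact code: it uses `pd.DataFrame.from_dict` with `orient="index"` so that each dictionary key becomes one row. The names `new_rows` and `stacked` are made up for this example. On Python 3.7 and later, dictionaries keep insertion order, so the values of "One" come before those of "Two", which is the reverse of the output shown above.

```python
import pandas as pd

# Rebuild the question's data.
df = pd.DataFrame({
    "A": [0.46324, 0.10596, 0.69209, 0.04901, 0.43104, 0.43412, 0.14512],
    "B": [0.32425, 0.35910, 0.12951, 0.31203, 0.62413, 0.13720, 0.10532],
    "C": [0.42194, 0.21004, 0.50186, 0.11035, 0.20567, 0.11052, 0.05310],
})
test = {"One": [0.23413, 0.19235, 0.51221], "Two": [0.01293, 0.12235, 0.63291]}

# Each key becomes a row; the three list items are labelled A, B, C.
new_rows = pd.DataFrame.from_dict(test, orient="index", columns=list(df.columns))

# ignore_index=True discards the "One"/"Two" labels and renumbers rows 0..8.
stacked = pd.concat([df, new_rows], ignore_index=True)
print(stacked)
```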

If adding them as new columns:

df_new = pd.concat([df, pd.DataFrame(list(test.values())).T], ignore_index=True, axis=1)
df_new.columns = \
    df.columns.tolist() + [{'One': 'D', 'Two': 'E'}.get(k) for k in test.keys()]

>>> df_new
         A        B        C        E        D
0  0.46324  0.32425  0.42194  0.01293  0.23413
1  0.10596  0.35910  0.21004  0.12235  0.19235
2  0.69209  0.12951  0.50186  0.63291  0.51221
3  0.04901  0.31203  0.11035      NaN      NaN
4  0.43104  0.62413  0.20567      NaN      NaN
5  0.43412  0.13720  0.11052      NaN      NaN
6  0.14512  0.10532  0.05310      NaN      NaN

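Dictionary order was not guaranteed before Python 3.7 (e.g. test), so the new column names need to be mapped to the keys explicitly rather than assigned by position.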
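
One way to keep the name mapping independent of dictionary order is to rename by key before concatenating. The sketch below is an illustration under that assumption, not the answer's original code; `name_map` and `new_cols` are hypothetical names. It also reflects that `pd.concat` takes Series or DataFrame objects, so the dictionary is converted first.

```python
import pandas as pd

# Rebuild the question's data.
df = pd.DataFrame({
    "A": [0.46324, 0.10596, 0.69209, 0.04901, 0.43104, 0.43412, 0.14512],
    "B": [0.32425, 0.35910, 0.12951, 0.31203, 0.62413, 0.13720, 0.10532],
    "C": [0.42194, 0.21004, 0.50186, 0.11035, 0.20567, 0.11052, 0.05310],
})
test = {"One": [0.23413, 0.19235, 0.51221], "Two": [0.01293, 0.12235, 0.63291]}

# Explicit key -> column name mapping, so the result does not depend on
# the order in which the dictionary yields its keys.
name_map = {"One": "D", "Two": "E"}
new_cols = pd.DataFrame(test).rename(columns=name_map)

# axis=1 places the frames side by side, aligned on the row index;
# rows 3..6 have no values in new_cols and become NaN.
df_new = pd.concat([df, new_cols], axis=1)
print(df_new)
```

Here ignore_index is left at its default so the renamed column labels survive the concatenation.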
