Creating a MultiIndex DataFrame from a nested dictionary

Posted 2024-05-28 06:30:35

This question is related to this one. This time I want to go a step further. Given a dictionary such as:

import numpy

dd = {0: {"russell": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
          "cantor": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
          "godel": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)}},

      1: {"russell": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
          "cantor": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
          "godel": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)}}}

or a list like this:

ll = [{"russell": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
       "cantor": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
       "godel": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)}},

      {"russell": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
       "cantor": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)},
       "godel": {"score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)}}]

I want to build a DataFrame like this:

                          russell                            godel                        cantor
                    score    ping                    score    ping                 score    ping
0     0.17473916938994682      40       0.3443303845926545      47   0.43576522521017247      42
1      0.7341005512329682      22      0.14682222267827938      81    0.5662517436162526      59

Note that the column index is a MultiIndex. Is there a way to do this? If I try pandas.DataFrame.from_dict(dd, orient="index") or pandas.DataFrame(ll), I get:

                                      russell                                       godel                                      cantor
0  {'score': 0.17473916938994682, 'ping': 40}   {'score': 0.3443303845926545, 'ping': 47}  {'score': 0.43576522521017247, 'ping': 42}
1   {'score': 0.7341005512329682, 'ping': 22}  {'score': 0.14682222267827938, 'ping': 81}   {'score': 0.5662517436162526, 'ping': 59}

This is not what I want: each cell holds the whole inner dict instead of separate score and ping columns.

2 answers

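It is more complicated now, but Panel, transpose, to_frame and unstack can help. Note that pd.Panel was deprecated in pandas 0.20 and removed in 0.25, so this line only runs on older versions: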

import pandas as pd

# Requires pandas < 0.25, where pd.Panel still exists.
df = pd.Panel(dd).transpose(2, 0, 1).to_frame().unstack()
print(df)
      cantor           godel           russell          
minor   ping     score  ping     score    ping     score
major                                                   
0       69.0  0.050641  51.0  0.765994    20.0  0.935196
1       91.0  0.398624  33.0  0.408681    75.0  0.464876
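
Since Panel is no longer available in current pandas, here is a minimal sketch of one alternative: flatten each row so that its keys are (name, field) tuples, then let pandas turn those tuples into the two column levels. The fixed sample values and the variable name rows are assumptions made for illustration; they are not from the original answer.

```python
import pandas as pd

# Hypothetical sample data with fixed values so the result is reproducible.
dd = {
    0: {"russell": {"score": 0.17, "ping": 40},
        "cantor": {"score": 0.43, "ping": 42},
        "godel": {"score": 0.34, "ping": 47}},
    1: {"russell": {"score": 0.73, "ping": 22},
        "cantor": {"score": 0.56, "ping": 59},
        "godel": {"score": 0.14, "ping": 81}},
}

# Flatten each row so that its keys are (name, field) tuples.
rows = {
    idx: {(name, field): value
          for name, fields in people.items()
          for field, value in fields.items()}
    for idx, people in dd.items()
}

df = pd.DataFrame.from_dict(rows, orient="index")
# Make the two column levels explicit, in case the tuples were kept as plain labels.
df.columns = pd.MultiIndex.from_tuples(df.columns)
print(df)
```

For the list form, rows can be built the same way by iterating over enumerate(ll) instead of dd.items(). Because every (name, field) column is built separately here, integer fields such as ping should not be upcast to float, unlike in the outputs shown in these answers.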

This also works. Note that the nested dict is not deeply nested, which makes the conversion straightforward.

pd.concat({key: pd.DataFrame(value) for key, value in dd.items()}).unstack()
Out[104]: 
  cantor           godel           russell          
    ping     score  ping     score    ping     score
0   73.0  0.463084  94.0  0.954662    76.0  0.732291
1   28.0  0.778905  81.0  0.984285    36.0  0.094173

In short, creating a MultiIndex DataFrame with concat is straightforward. All you need is a dictionary of DataFrames.
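
The concat approach can be sketched end to end for both inputs from the question, the dict dd and the list ll. The helper make_row, the seeded generator and the names df_from_dict and df_from_list are assumptions introduced only to keep the example self-contained and reproducible.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
names = ["russell", "cantor", "godel"]


def make_row():
    # Hypothetical helper that mimics one entry of dd / ll from the question.
    return {name: {"score": rng.random(), "ping": int(rng.integers(10, 100))}
            for name in names}


dd = {0: make_row(), 1: make_row()}
ll = list(dd.values())

# Each inner dict becomes a small DataFrame (columns: names, index: score/ping).
# concat stacks them under the outer key, and unstack moves score/ping into
# a second column level.
df_from_dict = pd.concat(
    {key: pd.DataFrame(value) for key, value in dd.items()}
).unstack()

# For a list, the position of each element serves as the outer key.
df_from_list = pd.concat(
    {i: pd.DataFrame(value) for i, value in enumerate(ll)}
).unstack()

print(df_from_dict)
```

Each intermediate DataFrame mixes score and ping in the same column, so the ping values come out as floats, as in the outputs above.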