Python: cannot create an uneven MultiIndex

Posted 2024-04-20 00:49:56

I have the following code:

import pandas as pd

IDX_VALS_BANKNOTER_PATRIMONY = [['PATRIMONY'], ['GOLD']]
IDX_VALS_BANKNOTER_ASSETS = [['ASSETS'],['DEPOSITS', 'ADVANCES']]
IDX_VALS_BANKNOTER_LIABILITIES = [['LIABILITIES'], ['CLIENTS', 'SUPPLIERS']]

IDX_BANKNOTER_PATRIMONY = pd.MultiIndex.from_product(IDX_VALS_BANKNOTER_PATRIMONY)
IDX_BANKNOTER_ASSETS = pd.MultiIndex.from_product(IDX_VALS_BANKNOTER_ASSETS)
IDX_BANKNOTER_LIABILITIES = pd.MultiIndex.from_product(IDX_VALS_BANKNOTER_LIABILITIES)

IDX_BANKNOTER = IDX_BANKNOTER_PATRIMONY.append(IDX_BANKNOTER_ASSETS).append(IDX_BANKNOTER_LIABILITIES)

print(IDX_BANKNOTER)

It prints the following index:

MultiIndex([(  'PATRIMONY',      'GOLD'),
            (     'ASSETS',  'DEPOSITS'),
            (     'ASSETS',  'ADVANCES'),
            ('LIABILITIES',   'CLIENTS'),
            ('LIABILITIES', 'SUPPLIERS')],
           )

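(I used .from_product() because I expect to add more labels later.)

My problem: I want to extend this MultiIndex with a third level, so that the resulting MultiIndex looks like this: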

'PATRIMONY', 'GOLD',
'ASSETS', 'DEPOSITS',
'ASSETS', 'ADVANCES',
'LIABILITIES', 'CLIENTS', 'Dr. Foo'
'LIABILITIES', 'CLIENTS', 'Dr. House'
'LIABILITIES', 'CLIENTS', 'Richard'
'LIABILITIES', 'SUPPLIERS', 'PORT1',
'LIABILITIES', 'SUPPLIERS', 'PORT2'

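This means the MultiIndex would be uneven: the third level exists only under 'LIABILITIES', and its values differ between 'CLIENTS' and 'SUPPLIERS', since they are the client names or the supplier names. I tried appending the following indexes: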

IDX_FIRST_EXTENSION_NAMES = [['LIABILITIES'], ['CLIENTS'], ['Dr. Foo', 'Dr. House', 'Richard']]
IDX_FIRST_EXTENSION = pd.MultiIndex.from_product(IDX_FIRST_EXTENSION_NAMES)
IDX_SECOND_EXTENSION_NAMES = [['LIABILITIES'], ['SUPPLIERS'], ['PORT1', 'PORT2']]
IDX_SECOND_EXTENSION = pd.MultiIndex.from_product(IDX_SECOND_EXTENSION_NAMES)
DESIRED_RESULT = IDX_BANKNOTER.append(IDX_FIRST_EXTENSION).append(IDX_SECOND_EXTENSION)

But what I get back is:

MultiIndex([(  'PATRIMONY',      'GOLD'),
            (     'ASSETS',  'DEPOSITS'),
            (     'ASSETS',  'ADVANCES'),
            ('LIABILITIES',   'CLIENTS'),
            ('LIABILITIES',   'CLIENTS'),
            ('LIABILITIES',   'CLIENTS'),
            ('LIABILITIES', 'SUPPLIERS'),
            ('LIABILITIES', 'SUPPLIERS')],
           )

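I am fairly new to pandas, and the MultiIndex documentation has not helped much (it has only a few examples of initializing a MultiIndex, and none for an uneven one). Does anyone have any pointers? I am building this MultiIndex so that the corresponding data is easy to work with, for example so that I can use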

df['LIABILITIES']['CLIENTS']['(CLIENT NAME)']

or get the sum of all values under ['CLIENTS']. Ideally, I would like to keep the DataFrame's columns as time labels.

Any help would be much appreciated, thanks.


1 answer
Answer 1 · posted 2024-04-20 00:49:56
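
Judging by the output in the question, appending a three-level MultiIndex to a two-level one keeps only the first two levels, so the client and supplier names are dropped. The following is a minimal sketch of that behavior and of the padding idea; the variable names `two_level`, `three_level`, `truncated`, `padded` and `combined` are hypothetical and do not come from the question, and the behavior is assumed to match the pandas version used there.

```python
import pandas as pd

# Hypothetical short names, chosen only for this illustration.
two_level = pd.MultiIndex.from_product([["ASSETS"], ["DEPOSITS", "ADVANCES"]])
three_level = pd.MultiIndex.from_product(
    [["LIABILITIES"], ["CLIENTS"], ["Dr. Foo", "Richard"]]
)

# Appending to a two-level index yields a two-level result,
# so the third level of the appended index is lost.
truncated = two_level.append(three_level)
print(two_level.nlevels, three_level.nlevels, truncated.nlevels)

# Padding the shorter index with an empty-string third level
# gives both indexes the same depth before appending.
padded = pd.MultiIndex.from_product([["ASSETS"], ["DEPOSITS", "ADVANCES"], [""]])
combined = padded.append(three_level)
print(combined.nlevels)
print(combined)
```

Comparing `nlevels` before appending is a cheap way to catch this kind of silent truncation.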

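Code (the two-level entries are padded with an empty string as a third level, so every index has three levels before appending):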

import pandas as pd

IDX_VALS_BANKNOTER_PATRIMONY = [['PATRIMONY'],['GOLD'], ['']]
IDX_VALS_BANKNOTER_ASSETS = [['ASSETS'],['DEPOSITS', 'ADVANCES'], ['']]

IDX_BANKNOTER_PATRIMONY = pd.MultiIndex.from_product(IDX_VALS_BANKNOTER_PATRIMONY)
IDX_BANKNOTER_ASSETS = pd.MultiIndex.from_product(IDX_VALS_BANKNOTER_ASSETS)

IDX_BANKNOTER = IDX_BANKNOTER_PATRIMONY.append(IDX_BANKNOTER_ASSETS)

IDX_FIRST_EXTENSION_NAMES = [['LIABILITIES'], ['CLIENTS'], ['Dr. Foo', 'Dr. House', 'Richard']]
IDX_FIRST_EXTENSION = pd.MultiIndex.from_product(IDX_FIRST_EXTENSION_NAMES)
IDX_SECOND_EXTENSION_NAMES = [['LIABILITIES'], ['SUPPLIERS'], ['PORT1', 'PORT2']]
IDX_SECOND_EXTENSION = pd.MultiIndex.from_product(IDX_SECOND_EXTENSION_NAMES)
WANTED_RESULT = IDX_BANKNOTER.append(IDX_FIRST_EXTENSION).append(IDX_SECOND_EXTENSION)

print(WANTED_RESULT)

Output:

MultiIndex([(  'PATRIMONY',      'GOLD',          ''),
            (     'ASSETS',  'DEPOSITS',          ''),
            (     'ASSETS',  'ADVANCES',          ''),
            ('LIABILITIES',   'CLIENTS',   'Dr. Foo'),
            ('LIABILITIES',   'CLIENTS', 'Dr. House'),
            ('LIABILITIES',   'CLIENTS',   'Richard'),
            ('LIABILITIES', 'SUPPLIERS',     'PORT1'),
            ('LIABILITIES', 'SUPPLIERS',     'PORT2')],
           )
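
With the padded index in place, the lookups described in the question can be sketched as follows. This is only an illustration under stated assumptions: the time labels in `periods` and all amounts are made up, the index is rebuilt with `from_tuples` for brevity rather than with the `from_product` calls above, and `.loc` with tuple keys is used instead of the chained `df[...][...][...]` form from the question because the MultiIndex sits on the rows here.

```python
import pandas as pd

# Same tuples as the padded index above, rebuilt via from_tuples for brevity.
idx = pd.MultiIndex.from_tuples([
    ("PATRIMONY", "GOLD", ""),
    ("ASSETS", "DEPOSITS", ""),
    ("ASSETS", "ADVANCES", ""),
    ("LIABILITIES", "CLIENTS", "Dr. Foo"),
    ("LIABILITIES", "CLIENTS", "Dr. House"),
    ("LIABILITIES", "CLIENTS", "Richard"),
    ("LIABILITIES", "SUPPLIERS", "PORT1"),
    ("LIABILITIES", "SUPPLIERS", "PORT2"),
])

# Hypothetical time labels and made-up amounts, for illustration only.
periods = ["2024-01", "2024-02"]
df = pd.DataFrame(
    [[100, 110], [50, 55], [20, 25], [5, 6], [7, 8], [9, 10], [30, 31], [40, 41]],
    index=idx,
    columns=periods,
)

# Sorting the index keeps partial-key lookups from hitting an unsorted index.
df = df.sort_index()

# One client's row: a full three-level key gives a Series over the time labels.
dr_foo = df.loc[("LIABILITIES", "CLIENTS", "Dr. Foo"), :]

# Sum of everything under CLIENTS, one total per time label.
clients_total = df.loc[("LIABILITIES", "CLIENTS"), :].sum()

# A padded entry is addressed with the empty string as its third key.
gold = df.loc[("PATRIMONY", "GOLD", ""), :]

print(dr_foo)
print(clients_total)
print(gold)
```

The empty-string padding means two-level entries such as GOLD always need `''` as their third key, which is the trade-off of keeping every row at the same depth.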
