Splitting DataFrame columns - Q&A - Python中文网

Splitting DataFrame columns

Published 2024-05-17 00:10:04

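I have a large DataFrame in which a single column holds all the values. I need to split that data into more columns. After a lot of trial and error I gave up, so I am asking for your help.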

The head of the DataFrame looks like this. Each row is a single Series object, not separate values:

                                                        column1
    ---------------------------------------------------------------------
    MultiIndex1  | 1.00   2.00   3.00   4.00   5.00   6.00   7.00
                 | 1.00   2.00   3.00   4.00   5.00   6.00   7.00
                 | 1.00   2.00   3.00   4.00   5.00   6.00   7.00
                 | 1.00   2.00   3.00   4.00   5.00   6.00   7.00
                 | 1.00   2.00   3.00   4.00   5.00   6.00   7.00
                 | 1.00   2.00   3.00   4.00   5.00   6.00   7.00

My expected output should look like this:

                 column1|column2|column3|column4|column5|column6|column7
    ---------------------------------------------------------------------
    MultiIndex1  | 1.00 |  2.00 |  3.00 |  4.00 |  5.00 |  6.00 |  7.00
                 | 1.00 |  2.00 |  3.00 |  4.00 |  5.00 |  6.00 |  7.00
                 | 1.00 |  2.00 |  3.00 |  4.00 |  5.00 |  6.00 |  7.00
                 | 1.00 |  2.00 |  3.00 |  4.00 |  5.00 |  6.00 |  7.00
                 | 1.00 |  2.00 |  3.00 |  4.00 |  5.00 |  6.00 |  7.00
                 | 1.00 |  2.00 |  3.00 |  4.00 |  5.00 |  6.00 |  7.00

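I tried: df.columns = ['col1', 'col2', 'col3', 'col4', 'col5', …]

I tried converting it to a Series and then back to a DataFrame.

I tried applying the .str.split function.

I tried a lot of slicing and concatenating, but without success.

Any help would be greatly appreciated. Thanks!

Here are the first few rows of my dataset, as a sample:

The date and AALR3 form the row MultiIndex.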

    2019-01-02;AALR3;0000000020;000000000013.300000;000000000000000100;10:00:04.961;1;2019-01-02;000086597137782;000000000310091;2;2019-01-02;000086597142909;000000000310092;1;0;00000072;00000174
    2019-01-02;AALR3;0000000010;000000000013.310000;0000000000000003000;10:00:04.961;1;2019-01-02;000086597135827;000000000310088;2;2019-01-02;000086597142909;000000000310089;1;0;00000120;00000174
    2019-01-02;AALR3;0000000050;000000000013.390000;000000000000000200;10:11:40.214;1;2019-01-02;000086597182855;000000000400273;1;2019-01-02;000086597151579;000000000400274;2;0;000000058
    2019-01-02;AALR3;0000000040;000000000013.380000;000000000000000100;10:11:40.214;1;2019-01-02;000086597182855;000000000400271;1;2019-01-02;000086597151578;000000000400272;2;0;00000058;00000174
    2019-01-02;AALR3;0000000030;000000000013.380000;000000000000000100;10:11:40.214;1;2019-01-02;000086597182855;000000000400269;1;2019-01-02;000086597151189;000000000400270;2;0;00000058;00000308

I read it with:

    pd.read_csv('//path_to_file', sep=';')

I want to name the columns like this:

    df.columns = ['Session Date','Instrument Symbol','Trade Number','Trade Price','Traded Quantity',
          'Trade Time','Trade Indicator','Buy Order Date','Sequential Buy Order Number',
          'Secondary Order ID - Buy Order','Aggressor Buy Order Indicator','Sell Order Date',
         'Sequential Sell Order Number','Secondary Order ID - Sell Order','Aggressor Sell Order Indicator',
          'Cross Trade Indicator','Buy Member','Sell Member']
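One way to get these names without a separate assignment is to pass them to `read_csv` directly. This is only a sketch under an assumption the question does not confirm: that the file has no header row. In that case `header=None` stops pandas from treating the first trade as column labels, and `names=` supplies the 18 labels. The `raw` string below is made from two rows copied from the sample above, standing in for the real file; `dtype=str` is an illustrative choice that keeps the zero-padded IDs intact.

```python
import io
import pandas as pd

column_names = [
    'Session Date', 'Instrument Symbol', 'Trade Number', 'Trade Price', 'Traded Quantity',
    'Trade Time', 'Trade Indicator', 'Buy Order Date', 'Sequential Buy Order Number',
    'Secondary Order ID - Buy Order', 'Aggressor Buy Order Indicator', 'Sell Order Date',
    'Sequential Sell Order Number', 'Secondary Order ID - Sell Order',
    'Aggressor Sell Order Indicator', 'Cross Trade Indicator', 'Buy Member', 'Sell Member',
]

# Two rows copied from the sample in the question; a stand-in for the real file.
raw = (
    "2019-01-02;AALR3;0000000040;000000000013.380000;000000000000000100;10:11:40.214;1;"
    "2019-01-02;000086597182855;000000000400271;1;2019-01-02;000086597151578;"
    "000000000400272;2;0;00000058;00000174\n"
    "2019-01-02;AALR3;0000000030;000000000013.380000;000000000000000100;10:11:40.214;1;"
    "2019-01-02;000086597182855;000000000400269;1;2019-01-02;000086597151189;"
    "000000000400270;2;0;00000058;00000308\n"
)

# header=None: assume the data has no header line (an assumption, not stated in the question).
# dtype=str: read everything as text so zero-padded fields are not altered.
df = pd.read_csv(io.StringIO(raw), sep=';', header=None, names=column_names, dtype=str)

# Convert only the columns that are really numeric.
df['Trade Price'] = pd.to_numeric(df['Trade Price'])
print(df.shape)
```

With the real file, the `io.StringIO(raw)` argument would be replaced by the file path used in the question.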

Update:

The solution works well, thank you very much.

It is almost the way I want it. Is there a way to make the duplicate indexes a MultiIndex as well? I managed to do it for the dates, but not for the symbol. Thanks.
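For the follow-up about the index, one possibility is `set_index` with both column names, which builds a two-level MultiIndex in a single step. The sketch below uses a small made-up frame (the values, including the second date, are illustrative only) and assumes the columns are already named 'Session Date' and 'Instrument Symbol' as in the list above.

```python
import pandas as pd

# Made-up rows for illustration; not taken from the real dataset.
df = pd.DataFrame({
    'Session Date': ['2019-01-02', '2019-01-02', '2019-01-03'],
    'Instrument Symbol': ['AALR3', 'AALR3', 'AALR3'],
    'Trade Price': [13.30, 13.31, 13.39],
})

# Parse the dates so the first index level is a real datetime level.
df['Session Date'] = pd.to_datetime(df['Session Date'])

# Both columns become index levels; repeated (date, symbol) pairs are allowed.
indexed = df.set_index(['Session Date', 'Instrument Symbol']).sort_index()
print(indexed)
```

Rows for one symbol can then be selected with `indexed.xs('AALR3', level='Instrument Symbol')`.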


2 answers

User

#1 · edited 2024-05-17 00:10:04

What you are looking at is a MultiIndex DataFrame, and what you want is a single-index DataFrame. Try:

    df = df.reset_index()
    df.columns = ['col1','col2','col3','col4','col5','col6','col7']
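To make the effect of `reset_index` concrete, here is a small sketch on a toy frame. The level names `date` and `symbol` and the `price` column are hypothetical. Each index level becomes an ordinary column, so any list assigned to `df.columns` afterwards needs exactly as many names as the flattened frame has columns.

```python
import pandas as pd

# Toy frame with a two-level row index; the names are hypothetical.
idx = pd.MultiIndex.from_tuples(
    [('2019-01-02', 'AALR3'), ('2019-01-02', 'AALR3')],
    names=['date', 'symbol'],
)
df = pd.DataFrame({'price': [13.30, 13.31]}, index=idx)

# reset_index moves every index level into the columns and
# replaces the row index with a default 0..n-1 RangeIndex.
flat = df.reset_index()
print(flat.columns.tolist())  # ['date', 'symbol', 'price']
```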

User

#2 · edited 2024-05-17 00:10:04

Try this:

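    # split() with no argument splits on runs of whitespace (split(' ', 1) yields at most 2 pieces)
    your_df = pd.DataFrame(df['column1'].str.split().tolist(), index=df.index,
                           columns=['col1','col2','col3','col4','col5','col6','col7'])
    print(your_df)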
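A variant of the same idea, sketched on a toy frame that mimics the layout shown in the question (one string column with seven whitespace-separated numbers per row): `str.split(expand=True)` returns a DataFrame directly and keeps the original row index. The `column1` … `column7` names follow the expected output in the question. Whether the real column actually holds strings is an assumption.

```python
import pandas as pd

# Toy data shaped like the question's example: one string per row.
df = pd.DataFrame({'column1': ['1.00   2.00   3.00   4.00   5.00   6.00   7.00'] * 3})

# With no separator given, split() breaks on any run of whitespace;
# expand=True spreads the pieces across columns instead of returning lists.
wide = df['column1'].str.split(expand=True)

# Name the new columns column1..columnN and convert the text to numbers.
wide.columns = [f'column{i}' for i in range(1, wide.shape[1] + 1)]
wide = wide.astype(float)
print(wide)
```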
