How do I split a single-column DataFrame into multiple columns?

Posted 2024-05-15 04:00:29

I have a DataFrame that looks like this:

    0
0   SPY  SPDR S&P 500 ETF                                     -2.30%    4.96%   -2.60%    8.76%   35.50%   63.81%
1   IVV  iShares Core S&P 500 ETF                             -2.32%    4.93%   -2.66%    8.96%   36.10%   63.76%
2   VTI  Vanguard Total Stock Market ETF                      -2.20%    5.41%   -3.05%    7.67%   33.37%   59.23%
3   VOO  Vanguard S&P 500 ETF                                 -2.33%    4.95%   -2.72%    8.76%   35.66%   64.25%
4   QQQ  Invesco QQQ                                          -1.06%    5.29%   14.83%   31.81%   80.45%  134.61%
5   AGG  iShares Core U.S. Aggregate Bond ETF                 -0.02%    0.48%    5.76%    8.87%   15.65%   21.26%
6   VEA  Vanguard FTSE Developed Markets ETF                  -2.06%    7.97%  -10.14%   -1.83%    3.42%   12.27%
7   IEFA iShares Core MSCI EAFE ETF                           -1.74%    8.25%  -10.05%   -1.59%    3.67%   12.70%
8   GLD  SPDR Gold Trust                                      -0.62%   -1.27%   13.76%   27.88%   36.22%   43.45%
9   VUG  Vanguard Growth ETF                                  -1.30%    5.52%   10.38%   24.19%   62.45%   96.12%
10  VWO  Vanguard FTSE Emerging Markets ETF                   -2.07%    6.80%  -10.50%   -1.82%    6.16%   10.85%
11  BND  Vanguard Total Bond Market ETF                        0.02%    0.58%    5.86%    9.18%   15.91%   22.82%
12  IWF  iShares Russell 1000 Growth ETF                      -1.40%    5.25%    8.68%   22.23%   65.17%  102.32%

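The problem is that there is only one column. To do any useful analysis, I need a separate column for each feature. I tried splitting the text and naming the resulting columns, like this: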

foo = lambda x: pd.Series([i for i in reversed(df.split(' '))])
rev = df['symbol', 'name', 'one_week_return', 'four_week_return', 'ytd', '1Y', '3Y', '5Y'].apply(foo)

This throws a KeyError. Does anyone here know how to split the column and name the resulting columns? Thanks.

2 Answers

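I think you need to preprocess the file first, separating the columns with commas:

import re
import pandas as pd

with open('input.csv', 'r') as f, open('output.csv', 'w') as w:
    for line in f:
        # Split on runs of 2+ spaces so names containing single spaces stay whole.
        # A symbol separated from its name by one space (e.g. "IEFA iShares ...") stays fused.
        fields = [item.strip() for item in re.split(r' {2,}', line.strip()) if item]
        w.write(','.join(fields) + '\n')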

Now, if you read output.csv with pd.read_csv, the fields are parsed as separate columns. The file has no header row, so pass header=None.
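As a rough sketch of that read step, the snippet below loads comma-separated rows with no header line and assigns the column names from the question. The in-memory `csv_text` is a hypothetical stand-in for output.csv (two rows copied from the question), used only so the example runs on its own.

```python
import io
import pandas as pd

# Hypothetical stand-in for output.csv: two comma-separated rows, no header line.
csv_text = (
    "SPY,SPDR S&P 500 ETF,-2.30%,4.96%,-2.60%,8.76%,35.50%,63.81%\n"
    "QQQ,Invesco QQQ,-1.06%,5.29%,14.83%,31.81%,80.45%,134.61%\n"
)

columns = ['symbol', 'name', 'one_week_return', 'four_week_return',
           'ytd', '1Y', '3Y', '5Y']

# header=None stops pandas from treating the first data row as column labels;
# names= supplies the labels instead.
df = pd.read_csv(io.StringIO(csv_text), header=None, names=columns)
print(df)
```

With a real file, `io.StringIO(csv_text)` would be replaced by the path to output.csv.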

I used the following as my input.csv file: link to file

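Try this:

# Split from the right: the last six whitespace-separated tokens are the return columns.
df1 = df[0].str.rsplit(n=6, expand=True)
# What remains in column 0 is "symbol name"; split once from the left to separate them.
df2 = df1.pop(0).str.split(n=1, expand=True)

df = pd.concat([df2, df1], axis=1)
df.columns = ['symbol', 'name', 'one_week_return', 'four_week_return', 'ytd', '1Y', '3Y', '5Y']
print(df)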

Output:

   symbol                                  name one_week_return four_week_return      ytd      1Y      3Y       5Y
0     SPY                      SPDR S&P 500 ETF          -2.30%            4.96%   -2.60%   8.76%  35.50%   63.81%
1     IVV              iShares Core S&P 500 ETF          -2.32%            4.93%   -2.66%   8.96%  36.10%   63.76%
2     VTI       Vanguard Total Stock Market ETF          -2.20%            5.41%   -3.05%   7.67%  33.37%   59.23%
3     VOO                  Vanguard S&P 500 ETF          -2.33%            4.95%   -2.72%   8.76%  35.66%   64.25%
4     QQQ                           Invesco QQQ          -1.06%            5.29%   14.83%  31.81%  80.45%  134.61%
5     AGG  iShares Core U.S. Aggregate Bond ETF          -0.02%            0.48%    5.76%   8.87%  15.65%   21.26%
6     VEA   Vanguard FTSE Developed Markets ETF          -2.06%            7.97%  -10.14%  -1.83%   3.42%   12.27%
7    IEFA            iShares Core MSCI EAFE ETF          -1.74%            8.25%  -10.05%  -1.59%   3.67%   12.70%
8     GLD                       SPDR Gold Trust          -0.62%           -1.27%   13.76%  27.88%  36.22%   43.45%
9     VUG                   Vanguard Growth ETF          -1.30%            5.52%   10.38%  24.19%  62.45%   96.12%
10    VWO    Vanguard FTSE Emerging Markets ETF          -2.07%            6.80%  -10.50%  -1.82%   6.16%   10.85%
11    BND        Vanguard Total Bond Market ETF           0.02%            0.58%    5.86%   9.18%  15.91%   22.82%
12    IWF       iShares Russell 1000 Growth ETF          -1.40%            5.25%    8.68%  22.23%  65.17%  102.32%
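The return columns in this output are still strings such as '-2.30%', so they cannot be used in arithmetic yet. The sketch below shows one possible follow-up step, not part of the original answer: stripping the percent sign and converting to floats. The small DataFrame is an assumption built from two rows and two return columns of the question's data, just to keep the example self-contained.

```python
import pandas as pd

# Two rows in the same shape as the split result (values copied from the question).
df = pd.DataFrame({
    'symbol': ['SPY', 'QQQ'],
    'name': ['SPDR S&P 500 ETF', 'Invesco QQQ'],
    'one_week_return': ['-2.30%', '-1.06%'],
    '5Y': ['63.81%', '134.61%'],
})

pct_cols = ['one_week_return', '5Y']
for col in pct_cols:
    # Drop the trailing '%' and parse the rest as a float (values stay in percent units).
    df[col] = df[col].str.rstrip('%').astype(float)

print(df.dtypes)
```

On the full result, `pct_cols` would list all six return columns.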
