Splitting a string into a DataFrame

Posted 2024-05-23 21:46:43

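I have a string exactly like the one below, and I want to split it into a DataFrame, but I am struggling to get it to work. I searched Stack Overflow and found nothing.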

'Position             Players   Average Form\nGoalkeeper        Manuel Neuer  4.17017132535\n  Defender         Diego Godin  4.14973163459\n  Defender   Giorgio Chiellini  4.10115207373\n  Defender        Thiago Silva  3.93318274318\n  Defender     Andrea Barzagli  3.85132973289\nMidfielder        Arjen Robben  4.80556193806\nMidfielder     Alexander Meier  4.51037598508\nMidfielder       Franck Ribery  4.48063714064\nMidfielder         David Silva  3.76028050109\n   Forward   Cristiano Ronaldo  7.87909462636\n   Forward  Zlatan Ibrahimovic  6.85401665065'

Is there a way to convert it into a DataFrame in a reproducible manner, so that I can do the same with other strings?

My target DataFrame looks like this:

Position    name                Average
Goalkeeper  Manuel              4.17017132535
Defender    Diego               4.14973163459
Defender    Giorgio             4.10115207373
Defender    Thiago              3.93318274318
Defender    Andrea              3.85132973289
Midfielder  Arjen               4.80556193806
Midfielder  Alexander           4.51037598508
Midfielder  Franck              4.48063714064
Midfielder  David               3.76028050109
Forward     Cristiano           7.87909462636
Forward     Zlatan              6.85401665065

I am new to pandas, so any help would be appreciated.


2 answers

Here is one approach.

import pandas as pd

mystr = 'Position             Players   Average Form\nGoalkeeper        Manuel Neuer  4.17017132535\n  Defender         Diego Godin  4.14973163459\n  Defender   Giorgio Chiellini  4.10115207373\n  Defender        Thiago Silva  3.93318274318\n  Defender     Andrea Barzagli  3.85132973289\nMidfielder        Arjen Robben  4.80556193806\nMidfielder     Alexander Meier  4.51037598508\nMidfielder       Franck Ribery  4.48063714064\nMidfielder         David Silva  3.76028050109\n   Forward   Cristiano Ronaldo  7.87909462636\n   Forward  Zlatan Ibrahimovic  6.85401665065'

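# Split on any whitespace, then regroup the tokens four at a time.
# This works here only because the header and every row contain exactly four tokens.
lst = mystr.split()
data = [lst[pos:pos+4] for pos in range(0, len(lst), 4)]

# The first group holds the header tokens; the remaining groups are the rows.
df = pd.DataFrame(data[1:], columns=data[0])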

print(df)

#       Position    Players      Average           Form
# 0   Goalkeeper     Manuel        Neuer  4.17017132535
# 1     Defender      Diego        Godin  4.14973163459
# 2     Defender    Giorgio    Chiellini  4.10115207373
# 3     Defender     Thiago        Silva  3.93318274318
# 4     Defender     Andrea     Barzagli  3.85132973289
# 5   Midfielder      Arjen       Robben  4.80556193806
# 6   Midfielder  Alexander        Meier  4.51037598508
# 7   Midfielder     Franck       Ribery  4.48063714064
# 8   Midfielder      David        Silva  3.76028050109
# 9      Forward  Cristiano      Ronaldo  7.87909462636
# 10     Forward     Zlatan  Ibrahimovic  6.85401665065

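This approach is imperfect in the following cases:

  1. Spaces in column names, as shown above: "Average Form" is split into two headers, so the labels no longer match the data and the columns have to be renamed.
  2. Spaces in player names. It works for the data provided only because every name consists of exactly two words.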
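Both caveats can be sidestepped by splitting on runs of two or more spaces instead of on every space. The sketch below is one possible way to do that, not part of the original answer. The function name `table_string_to_frame` and the shortened `sample` string are made up for illustration, and it assumes columns are always separated by at least two spaces while values contain at most single spaces.

```python
import re
import pandas as pd


def table_string_to_frame(text: str) -> pd.DataFrame:
    """Parse a space-aligned table whose columns are separated by 2+ spaces."""
    # Strip each line, then split on runs of two or more whitespace characters,
    # so single spaces inside a field ("Manuel Neuer", "Average Form") survive.
    rows = [
        re.split(r"\s{2,}", line.strip())
        for line in text.splitlines()
        if line.strip()
    ]
    header, body = rows[0], rows[1:]
    return pd.DataFrame(body, columns=header)


# Shortened sample in the same layout as the string from the question.
sample = (
    "Position             Players   Average Form\n"
    "Goalkeeper        Manuel Neuer  4.17017132535\n"
    "  Defender         Diego Godin  4.14973163459\n"
    "   Forward  Zlatan Ibrahimovic  6.85401665065"
)

df = table_string_to_frame(sample)
# All parsed values are strings; convert the numeric column explicitly.
df["Average Form"] = df["Average Form"].astype(float)

print(df.columns.tolist())  # ['Position', 'Players', 'Average Form']
print(df.shape)             # (3, 3)
```

Because the function only relies on the two-space gap between columns, it should carry over to other strings in the same layout, which is what the question asks for.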

Here is how you could solve this.

import pandas as pd
from io import StringIO

mystr = 'Position             Players   Average Form\nGoalkeeper        Manuel Neuer  4.17017132535\n  Defender         Diego Godin  4.14973163459\n  Defender   Giorgio Chiellini  4.10115207373\n  Defender        Thiago Silva  3.93318274318\n  Defender     Andrea Barzagli  3.85132973289\nMidfielder        Arjen Robben  4.80556193806\nMidfielder     Alexander Meier  4.51037598508\nMidfielder       Franck Ribery  4.48063714064\nMidfielder         David Silva  3.76028050109\n   Forward   Cristiano Ronaldo  7.87909462636\n   Forward  Zlatan Ibrahimovic  6.85401665065'

# Strip the padding around each line so leading spaces cannot yield an empty first field.
data = StringIO('\n'.join(line.strip() for line in mystr.splitlines()))

# sep="\n" would give a single column at best, and recent pandas versions reject it with a ValueError.
# Splitting on runs of two or more spaces keeps values such as "Manuel Neuer" in one field.
df = pd.read_csv(data, sep=r'\s{2,}', engine='python')
print(df)

Output (floats are shown at the default display precision):

      Position             Players  Average Form
0   Goalkeeper        Manuel Neuer      4.170171
1     Defender         Diego Godin      4.149732
2     Defender   Giorgio Chiellini      4.101152
3     Defender        Thiago Silva      3.933183
4     Defender     Andrea Barzagli      3.851330
5   Midfielder        Arjen Robben      4.805562
6   Midfielder     Alexander Meier      4.510376
7   Midfielder       Franck Ribery      4.480637
8   Midfielder         David Silva      3.760281
9      Forward   Cristiano Ronaldo      7.879095
10     Forward  Zlatan Ibrahimovic      6.854017
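The question's target layout keeps only the first name and uses the column labels Position, name and Average. A minimal sketch of that last reshaping step follows; it is not part of the original answer. The small hand-built frame stands in for a parsed result and assumes the columns are called "Position", "Players" and "Average Form", with the averages possibly still stored as text.

```python
import pandas as pd

# Hand-built stand-in for the parsed frame (assumed column names, shortened data).
df = pd.DataFrame(
    {
        "Position": ["Goalkeeper", "Defender", "Forward"],
        "Players": ["Manuel Neuer", "Diego Godin", "Zlatan Ibrahimovic"],
        "Average Form": ["4.17017132535", "4.14973163459", "6.85401665065"],
    }
)

target = pd.DataFrame(
    {
        "Position": df["Position"],
        # Keep only the first word of each player name.
        "name": df["Players"].str.split().str[0],
        # Convert to floats; this also works if the column is already numeric.
        "Average": pd.to_numeric(df["Average Form"]),
    }
)
print(target)
```

Taking the first word is a deliberate simplification that mirrors the target table in the question; it would merge players who share a first name.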
