Python DataFrame: splitting a column of [0,0] number lists into two columns

Posted 2024-05-29 07:22:02


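I have a dataframe from the AMAZON DATASET. The dataset has a 'helpful' column that looks like this: 'helpful': [0,0], where the first element is the number of 'yes' votes and the second element is the 'total' number of votes.

I want to use PANDAS (PYTHON) to split this column into two columns. The first column should contain only the first element, and the second column only the second element.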

import pandas as pd


df.head(5)

reviewerID     asin       reviewerName  helpful
0 A2VNYWOPJ13AFP 0981850006 "Customer"     [0,0]
0 A20DWVV8HML3AW 0923587406 "Customer"     [1,3]
0 A3VMADADA13AFP 0981587706 "Customer"     [0,0]
0 A28XY55TP3Q90O 0541217906 "Customer"     [2,4]
0 A5RTTREES110V3 0265478006 "Customer"     [0,0]
0 A2VNYWOPJ13AFP 0565777106 "Customer"     [1,5]


Index(['reviewerID', 'asin', 'reviewerName', 'helpful'],
      dtype='object')

df.helpful[1][0] = 1
df.helpful[1][1] = 3

I want to do that for every row.

pd.DataFrame(ratings['helpful'], columns = ['Yes','Vote'])

reviewerID     asin       reviewerName  helpful
0 A2VNYWOPJ13AFP 0981850006 "Customer"     [0,0]
0 A20DWVV8HML3AW 0923587406 "Customer"     [1,3]
0 A3VMADADA13AFP 0981587706 "Customer"     [0,0]
0 A28XY55TP3Q90O 0541217906 "Customer"     [2,4]
0 A5RTTREES110V3 0265478006 "Customer"     [0,0]
0 A2VNYWOPJ13AFP 0565777106 "Customer"     [1,5]

helpful dtype=object

THE GOAL - EXPECTED RESULT

  reviewerID     asin       reviewerName  YES      TOTAL VOTE
0 A2VNYWOPJ13AFP 0981850006 "Customer"     0        0
0 A20DWVV8HML3AW 0923587406 "Customer"     1        3
0 A3VMADADA13AFP 0981587706 "Customer"     0        0
0 A28XY55TP3Q90O 0541217906 "Customer"     2        4
0 A5RTTREES110V3 0265478006 "Customer"     0        0
0 A2VNYWOPJ13AFP 0565777106 "Customer"     1        5

1 answer
User
#1 · Posted 2024-05-29 07:22:02

You can split them like this:

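# index=df.index keeps the new columns aligned with df when its index is not the default 0..n-1
df[['first','second']]=pd.DataFrame(df['helpful'].tolist(),columns=['first','second'],index=df.index)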

Output:

  helpful  first  second
0  [0, 0]      0       0
1  [1, 3]      1       3
2  [0, 0]      0       0

This assumes that the entries in helpful are lists.
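As a self-contained sketch of the approach above, the following builds a small frame from a few of the question's sample rows and produces the layout the question asks for. The column names `YES` and `TOTAL VOTE` come from the expected result in the question. Passing `index=df.index`, dropping `helpful`, and the sample frame itself are assumptions added for illustration.

```python
import pandas as pd

# Sample rows taken from the question; 'helpful' holds Python lists.
df = pd.DataFrame({
    "reviewerID": ["A2VNYWOPJ13AFP", "A20DWVV8HML3AW", "A3VMADADA13AFP"],
    "asin": ["0981850006", "0923587406", "0981587706"],
    "helpful": [[0, 0], [1, 3], [0, 0]],
})

# Expand each two-element list into one row of a new frame.
# index=df.index keeps the new rows aligned with df even when df
# does not use the default 0..n-1 index.
votes = pd.DataFrame(
    df["helpful"].tolist(),
    columns=["YES", "TOTAL VOTE"],
    index=df.index,
)

# Replace the list column with the two new columns.
result = df.drop(columns="helpful").join(votes)
print(result)
```

The `join` step aligns on the index, so it relies on `df` having unique index labels, as the sample frame here does.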

Edit: if the column actually contains strings, i.e. "[0,1]":

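import ast
# ast.literal_eval parses "[0,1]" into a list and, unlike eval, only accepts literals
df['helpful'] = [ast.literal_eval(h) for h in df['helpful'].values]
df[['first','second']]=pd.DataFrame(df['helpful'].tolist(),columns=['first','second'],index=df.index)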

Same output.

Or:

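# int(...) converts the extracted text to a number so the result matches the output above
df['first'] = [int(str(h).replace('[','').replace(']','').split(',')[0]) for h in df['helpful'].values]
df['second'] = [int(str(h).replace('[','').replace(']','').split(',')[1]) for h in df['helpful'].values]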

Same output.
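For the string case, the same stripping and splitting can also be written with pandas' vectorized string methods instead of list comprehensions. This is only a sketch: it assumes every entry is a string of exactly two comma-separated integers such as "[1,3]", and the sample frame is made up for illustration.

```python
import pandas as pd

# Assumption: 'helpful' holds strings such as "[1,3]", not lists.
df = pd.DataFrame({"helpful": ["[0,0]", "[1,3]", "[2,4]"]})

parts = (
    df["helpful"]
    .str.strip("[]")               # "[1,3]" -> "1,3"
    .str.split(",", expand=True)   # one column per element
    .astype(int)                   # text -> integers
)
parts.columns = ["first", "second"]

# Attach the two new columns; join aligns on the shared index.
df = df.join(parts)
print(df)
```

If some entries may be malformed, the `astype(int)` step will raise an error, so it would need to be swapped for a more forgiving conversion.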
