In Python pandas, how do I split a column whose text contains several sentences into multiple rows?

import numpy as np
import pandas as pd

survey_data = pd.read_csv("Food_Dummy.csv")
survey_text = survey_data[['Id','Team','Food_Text']]

# Getting s as a pandas Series, split on full stops, with each sentence on a new line
s = survey_text["Food_Text"].str.split('.').apply(pd.Series,1).stack()
s.index = s.index.droplevel(-1) # to line up with df's index
s.name = 'Food_Text' # needs a name to join

# There are blank or empty cell values after the above process. Removing them
s.replace('', np.nan, inplace=True)
s.dropna(inplace=True)
x=s.to_frame(name='Food_Text1')
x.head(10)

# Joining should ideally get me the proper output. But I am getting the original dataframe instead of the split one.
survey_text.join(x)
survey_text.head(10)

1 Answer

User

#1 · Posted 2024-05-14 06:59:25

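In the code you posted, the result of join is only displayed and never stored: join returns a new DataFrame and leaves survey_text unchanged. If you want survey_text itself to hold the joined result, assign it back: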

survey_text = survey_text.join(x)
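
To see why the assignment matters, here is a minimal sketch using two small made-up frames (`left` and `right` are hypothetical names, not part of the original data). It only illustrates that `DataFrame.join` returns a new object rather than modifying the caller:

```python
import pandas as pd

# Hypothetical toy frames standing in for survey_text and x
left = pd.DataFrame({"Id": [1, 2]})
right = pd.DataFrame({"Food_Text1": ["good", "bad"]})

left.join(right)                  # the result is discarded, left is unchanged
columns_before = list(left.columns)

left = left.join(right)           # assigning the result keeps the new column
columns_after = list(left.columns)

print(columns_before)
print(columns_after)
```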

Alternatively, if you want to simplify the code, the following will do:

import numpy as np
import pandas as pd

survey_data = pd.read_csv("Food_Dummy.csv")
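# .copy() makes survey_text independent of survey_data, so deleting a column later does not trigger SettingWithCopyWarning
survey_text = survey_data[['Id','Team','Food_Text']].copy()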

# Split each text on full stops and stack the pieces into one Series, one sentence per row
s = survey_text["Food_Text"].str.split('.').apply(pd.Series).stack()
s.index = s.index.droplevel(-1) # drop the inner level so the index lines up with survey_text
s.name = 'Food_Text' # a Series needs a name to be joined

# The split leaves blank or empty values behind. Remove them
s.replace('', np.nan, inplace=True)
s.dropna(inplace=True)

# Drop the original column first so the joined 'Food_Text' does not clash with it, then keep the join result
del survey_text['Food_Text']
survey_text = survey_text.join(s)
survey_text.head(10)

This way you will not end up with more than one "Food_Text" column in your DataFrame.
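
As a possible alternative to the stack-and-join approach above (not what the answer uses), `DataFrame.explode`, available since pandas 0.25, can turn a column of lists into one row per element. The sketch below assumes a small inline frame in place of Food_Dummy.csv; the sample rows are made up for illustration:

```python
import pandas as pd

# Hypothetical stand-in for the contents of Food_Dummy.csv
survey_text = pd.DataFrame({
    "Id": [1, 2],
    "Team": ["A", "B"],
    "Food_Text": ["Tasty food. Slow service.", "Too salty."],
})

# Split each text into a list of sentences, then give every list element its own row
sentences = survey_text["Food_Text"].str.split(".")
result = survey_text.assign(Food_Text=sentences).explode("Food_Text")

# Trim whitespace and drop the empty strings left after a trailing full stop
result["Food_Text"] = result["Food_Text"].str.strip()
result = result[result["Food_Text"] != ""].reset_index(drop=True)

print(result)
```

Here Id and Team are repeated for every sentence automatically, so no index alignment or join is needed.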
