How to tidy up this dataset using Python 2.7

Posted 2024-04-25 20:08:25

I would like to tidy up the following dataset.

Review Title : Very poor
Upvotes : 1
Downvotes : 0
Review Content :
Hank all time this device ... fews day speakar sound not clear output
Review Title : Don't waste your money
Upvotes : 1
Downvotes : 1
Review Content :
Don't buy this product , its not good .just a waste of money.it starts showing small defects from starting few months of use and then after one year after warranty is over its mother was not working .and u can .ever fix it
  Sorry I didn't like this phone

I want to use Python to reformat this data as follows.

Review Title : Very poor
Upvotes : 1
Downvotes : 0
Review Content : Hank all time this device ... fews day speakar sound not clear output

Review Title : Don't waste your money
Upvotes : 1
Downvotes : 1
Review Content : Don't buy this product , its not good .just a waste of money.it starts showing small defects from starting few months of use and then after one year after warranty is over its mother was not working .and u can .ever fix it Sorry I didn't like this phone

I want to move the text up so that it follows the colon, but I don't know how to do it.


1 Answer
User
#1 · Posted 2024-04-25 20:08:25
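import re

# Paste the raw review text between the triple quotes.
text = '''your_text_here'''

# Pull the review body up onto the "Review Content :" line.
text = re.sub(r"Review Content :\s+", "Review Content : ", text)
# Add two newlines before every "Review Title :" label.
text = re.sub(r"Review Title : ", "\n\nReview Title : ", text)
# Drop the newlines that were added before the first title.
text = text.strip()

print(text)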

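The re library makes this kind of string manipulation easier:

  • The first sub replaces the run of whitespace characters after "Review Content :" with a single space, which puts the content on the same line as the "Review Content" label.
  • The second sub adds two newlines before each "Review Title" label.
  • strip() removes whitespace from the beginning and end of the string, which effectively removes the two newlines that the previous step added before the first "Review Title".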
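
One thing to watch for: on the sample data, the two re.sub calls above leave a wrapped continuation line such as "Sorry I didn't like this phone" on its own line, and because each "Review Title" label already follows a newline, two blank lines end up between reviews instead of one. Below is a minimal line-based sketch of an alternative, not a definitive solution. It assumes the only field labels are the four that appear in the question, and that any other non-blank line continues the previous field. The names FIELD_LABELS and tidy_reviews are made up for illustration.

```python
# Labels that start a new field. Assumption: these four, taken from the
# sample data in the question, are the only labels in the file.
FIELD_LABELS = ("Review Title :", "Upvotes :", "Downvotes :", "Review Content :")


def tidy_reviews(text):
    """Join wrapped lines onto their label line, with one blank line between reviews."""
    out = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines in the input
        if line.startswith(FIELD_LABELS):
            if line.startswith("Review Title :") and out:
                out.append("")  # a single blank line separates reviews
            out.append(line)
        elif out:
            # Not a label, so treat it as a continuation of the previous field.
            out[-1] = out[-1] + " " + line
        else:
            out.append(line)  # stray text before the first label is kept as-is
    return "\n".join(out)


sample = """Review Title : Very poor
Upvotes : 1
Downvotes : 0
Review Content :
Hank all time this device
Review Title : Don't waste your money
Upvotes : 1
Downvotes : 1
Review Content :
Don't buy this product
  Sorry I didn't like this phone"""

print(tidy_reviews(sample))
```

The sketch uses only str methods, so it should run unchanged on Python 2.7 and Python 3. If a review body can itself begin with one of the labels, this simple prefix check would misread it as a new field, and a stricter rule would be needed.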
