Pandas: merging rows together to get strings of similar length

Posted 2024-06-01 02:21:42

I want to merge consecutive rows so that the text in each resulting row has a similar length, measured in words.

Consider this DataFrame:

import pandas as pd

# Each row holds one sentence and its word count.
df = pd.DataFrame(
    data=[
        ['Tyger Tyger burning.', 3],
        ['Bright.', 1],
        ['In the forests of the night.', 6],
        ['What immortal.', 2],
        ['Hand or eye could frame thy fearful symmetry.', 8],
        ['In what distant deeps or skies.', 6],
        ['Burnt the fire of thine eyes.', 6],
    ],
    columns=['SENTENCE', 'NO_WORDS'],
)
print(df)
                                        SENTENCE  NO_WORDS
0                           Tyger Tyger burning.         3
1                                        Bright.         1
2                   In the forests of the night.         6
3                                 What immortal.         2
4  Hand or eye could frame thy fearful symmetry.         8
5                In what distant deeps or skies.         6
6                  Burnt the fire of thine eyes.         6

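The desired output has at least 10 words in each row, but once a row meets that condition, no further rows should be merged into it. Sentences must not be split.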

                                            SENTENCE  NO_WORDS
0  Tyger Tyger burning. Bright. In the forests of...        10
1  What immortal. Hand or eye could frame thy fea...        10
2  In what distant deeps or skies. Burnt the fire...        12
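
One possible way to get this result is a single greedy pass over the rows: keep appending sentences to the current group until its word count reaches the threshold, then start a new group. The sketch below is only an illustration under that assumption. The names merge_rows and min_words are made up here, and keeping a final group that falls short of the threshold is also an assumption, since the question does not say what should happen in that case.

```python
import pandas as pd

def merge_rows(df, min_words=10):
    """Greedily merge consecutive rows until each group has >= min_words words.

    Hypothetical helper for illustration; not part of pandas.
    """
    merged = []
    sentences, count = [], 0
    for sentence, n_words in zip(df["SENTENCE"], df["NO_WORDS"]):
        sentences.append(sentence)
        count += n_words
        if count >= min_words:
            # Threshold reached: close this group and start a new one.
            merged.append((" ".join(sentences), count))
            sentences, count = [], 0
    if sentences:
        # Assumption: leftover rows below the threshold are kept as a final group.
        merged.append((" ".join(sentences), count))
    return pd.DataFrame(merged, columns=["SENTENCE", "NO_WORDS"])

df = pd.DataFrame(
    data=[
        ["Tyger Tyger burning.", 3],
        ["Bright.", 1],
        ["In the forests of the night.", 6],
        ["What immortal.", 2],
        ["Hand or eye could frame thy fearful symmetry.", 8],
        ["In what distant deeps or skies.", 6],
        ["Burnt the fire of thine eyes.", 6],
    ],
    columns=["SENTENCE", "NO_WORDS"],
)

result = merge_rows(df)
print(result["NO_WORDS"].tolist())  # [10, 10, 12]
```

A plain loop is used because each group boundary depends on where the previous group ended, which a simple cumulative sum over NO_WORDS cannot express directly.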

Thanks in advance.

