Convert my dataframe so that each row contains a list of tuples for one sentence



I want to read a <code>.dat</code> file with Python. After trying several ways of reading it, I ended up with the following code:
<pre><code>datContent = open("..\\data\\train.dat.abs", 'r')
MyList=[]
for line in datContent:
    print(line)
</code></pre>
This prints the content in the following form:
<pre><code>1 Should O
2 students O
3 be O
4 taught O
5 to O
6 compete O
7 or O
8 to O
9 cooperate O
10 ? O
          ------------------> THIS SHOWS, STARTING OF THE NEXT SENTENCES
1 It O
2 is O
3 always O
4 said O
5 that O
6 competition O
7 can O
8 effectively O
9 promote O
10 the O
11 development O
12 of O
13 economy O
14 . O
</code></pre>
What I want, however, is to extract the second and third columns (the word and its tag) as a list of tuples:
<pre><code>[(Should, O), (students, O), (be, O), (taught, O), (to, O), (compete, O), (or, O), (to, O), (cooperate, O), (?, O)]
</code></pre>
Each sentence (in the original format, sentences are separated by a blank line) should become one row of the dataframe. I tried splitting each line, like this:
<pre><code>datContent = open("..\\data\\train.dat.abs", 'r', encoding='utf-8')
MyList=[]
for line in datContent:
    a=line.split()
    print(a)
</code></pre>
The result is:
<pre><code>['1', 'Should', 'O']
['2', 'students', 'O']
['3', 'be', 'O']
['4', 'taught', 'O']
['5', 'to', 'O']
['6', 'compete', 'O']
['7', 'or', 'O']
['8', 'to', 'O']
['9', 'cooperate', 'O']
['10', '?', 'O']
[]
['1', 'It', 'O']
['2', 'is', 'O']
['3', 'always', 'O']
['4', 'said', 'O']
['5', 'that', 'O']
['6', 'competition', 'O']
['7', 'can', 'O']
['8', 'effectively', 'O']
['9', 'promote', 'O']
['10', 'the', 'O']
['11', 'development', 'O']
['12', 'of', 'O']
['13', 'economy', 'O']
['14', '.', 'O']
</code></pre>
As I said, I want to store
<pre><code>[(Should, O), (students, O), (be, O), (taught, O), (to, O), (compete, O), (or, O), (to, O), (cooperate, O), (?, O)]
</code></pre>
as one row of the dataframe (basically the 2nd and 3rd items of each list above), with the empty <code>[]</code> marking the boundary between sentences:
<pre><code>row 1 = [(Should, O), (students, O), (be, O), (taught, O), (to, O), (compete, O), (or, O), (to, O), (cooperate, O), (?, O)]
row 2 = ...
</code></pre>
and so on.
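One possible way to get from the split lines to one list of tuples per sentence is sketched below. It is only a sketch under assumptions taken from the sample shown in the question: every non-blank line has at least three whitespace-separated fields (index, word, tag), and a blank line ends a sentence. The function name `read_sentences` and the inline `sample` text are hypothetical, introduced only for illustration.

```python
from typing import Iterable, List, Tuple

Token = Tuple[str, str]


def read_sentences(lines: Iterable[str]) -> List[List[Token]]:
    """Group 'index word tag' lines into one list of (word, tag) tuples per sentence."""
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for line in lines:
        parts = line.split()
        if not parts:
            # A blank line ends the current sentence (if it has any tokens).
            if current:
                sentences.append(current)
                current = []
            continue
        # Assumption: the word is the 2nd field and the tag is the 3rd.
        current.append((parts[1], parts[2]))
    # The file may not end with a blank line, so flush the last sentence.
    if current:
        sentences.append(current)
    return sentences


# Tiny stand-in for the real file, in the same format as the question's sample.
sample = """1 Should O
2 students O

1 It O
2 is O
"""
sentences = read_sentences(sample.splitlines())
print(sentences)
```

With the real file, the same function could be fed the open file object directly, for example `with open("..\\data\\train.dat.abs", encoding="utf-8") as f: sentences = read_sentences(f)`. Assuming pandas is imported as `pd`, `pd.DataFrame({"tokens": sentences})` would then give a dataframe with one row per sentence; the column name `tokens` is an arbitrary choice.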


1 answer

Anonymous, 1 day ago
