如何从列表中过滤特定的POS标签并将其分离为列表？

data = [[('User', 'NNP'), ('is', 'VBZ'), ('not', 'RB'), ('able', 'JJ'), ('to', 'TO'), ('order', 'NN'), ('products', 'NNS'), ('from', 'IN'), ('iShopCatalog', 'NN'), ('Coala', 'NNP'), ('excluding', 'VBG'), ('articles', 'NNS'), ('from', 'IN'), ('VWR', 'NNP')], [('Arfter', 'NNP'), ('transferring', 'VBG'), ('the', 'DT'), ('articles', 'NNS'), ('from', 'IN'), ('COALA', 'NNP'), ('to', 'TO'), ('SRM', 'VB'), ('the', 'DT'), ('Category', 'NNP'), ('S9901', 'NNP'), ('Dummy', 'NNP'), ('is', 'VBZ'), ('maintained', 'VBN')], [('Due', 'JJ'), ('to', 'TO'), ('this', 'DT'), ('the', 'DT'), ('user', 'NN'), ('is', 'VBZ'), ('not', 'RB'), ('able', 'JJ'), ('to', 'TO'), ('order', 'NN'), ('the', 'DT'), ('product', 'NN')], [('All', 'DT'), ('other', 'JJ'), ('users', 'NNS'), ('can', 'MD'), ('order', 'NN'), ('these', 'DT'), ('articles', 'NNS')], [('She', 'PRP'), ('can', 'MD'), ('order', 'NN'), ('other', 'JJ'), ('products', 'NNS'), ('from', 'IN'), ('a', 'DT'), ('POETcatalog', 'NNP'), ('without', 'IN'), ('any', 'DT'), ('problems', 'NNS')], [('Furtheremore', 'IN'), ('she', 'PRP'), ('is', 'VBZ'), ('able', 'JJ'), ('to', 'TO'), ('order', 'NN'), ('products', 'NNS'), ('from', 'IN'), ('the', 'DT'), ('Vendor', 'NNP'), ('VWR', 'NNP'), ('through', 'IN'), ('COALA', 'NNP')], [('But', 'CC'), ('articles', 'NNS'), ('from', 'IN'), ('all', 'DT'), ('other', 'JJ'), ('suppliers', 'NNS'), ('are', 'VBP'), ('not', 'RB'), ('orderable', 'JJ')], [('I', 'PRP'), ('already', 'RB'), ('spoke', 'VBD'), ('to', 'TO'), ('anic', 'VB'), ('who', 'WP'), ('maintain', 'VBP'), ('the', 'DT'), ('catalog', 'NN'), ('COALA', 'NNP'), ('and', 'CC'), ('they', 'PRP'), ('said', 'VBD'), ('that', 'IN'), ('the', 'DT'), ('reason', 'NN'), ('should', 'MD'), ('be', 'VB'), ('the', 'DT'), ('assignment', 'NN'), ('of', 'IN'), ('the', 'DT'), ('plant', 'NN')], [('User', 'NNP'), ('is', 'VBZ'), ('a', 'DT'), ('assinged', 'JJ'), ('to', 'TO'), ('Universitaet', 'NNP'), ('Regensburg', 'NNP'), ('in', 'IN'), ('Scout', 'NNP'), ('but', 'CC'), ('in', 'IN'), ('P17', 'NNP'), ('table', 'NN'), ('YESRMCDMUSER01', 'NNP'), ('she', 'PRP'), ('is', 'VBZ'), ('assigned', 'VBN'), ('to', 'TO'), ('company', 'NN'), ('001500', 'CD'), ('Merck', 'NNP'), ('KGaA', 'NNP')], [('Please', 'NNP'), ('find', 'VB'), ('attached', 'JJ'), ('some', 'DT'), ('screenshots', 'NNS')]]

['User', 'Coala', 'VWR', 'Arfter', 'COALA', 'Category', 'S9901', 'Dummy', 'POETcatalog', 'Vendor', 'VWR', 'COALA', 'COALA', 'User', 'Universitaet', 'Regensburg', 'Scout', 'P17', 'YESRMCDMUSER01', 'Merck', 'KGaA', 'Please']

[['User', 'Coala', 'VWR'] ['Arfter', 'COALA', 'Category', 'S9901', 'Dummy'] [], [], ['POETcatalog'], ['Vendor', 'VWR', 'COALA'], [], ['COALA'], ['User', 'Universitaet', 'Regensburg', 'Scout', 'P17', 'YESRMCDMUSER01', 'Merck', 'KGaA'], ['Please']]

2条回答

网友

1楼 · 编辑于 2024-05-16 23:51:35

您可以使用以下列表理解：

[[j[0] for j in i if j[-1]=="NNP"] for i in data]

输出：

^{pr2}$

网友

2楼 · 编辑于 2024-05-16 23:51:35

列表理解是一种方法。但是@麦迪的回答可能有点难以理解。在

下面是一个更易于阅读的解决方案：

document = [[('User', 'NNP'), ('is', 'VBZ'), ('not', 'RB'), ('able', 'JJ'), ('to', 'TO'), ('order', 'NN'), ('products', 'NNS'), ('from', 'IN'), ('iShopCatalog', 'NN'), ('Coala', 'NNP'), ('excluding', 'VBG'), ('articles', 'NNS'), ('from', 'IN'), ('VWR', 'NNP')], [('Arfter', 'NNP'), ('transferring', 'VBG'), ('the', 'DT'), ('articles', 'NNS'), ('from', 'IN'), ('COALA', 'NNP'), ('to', 'TO'), ('SRM', 'VB'), ('the', 'DT'), ('Category', 'NNP'), ('S9901', 'NNP'), ('Dummy', 'NNP'), ('is', 'VBZ'), ('maintained', 'VBN')], [('Due', 'JJ'), ('to', 'TO'), ('this', 'DT'), ('the', 'DT'), ('user', 'NN'), ('is', 'VBZ'), ('not', 'RB'), ('able', 'JJ'), ('to', 'TO'), ('order', 'NN'), ('the', 'DT'), ('product', 'NN')], [('All', 'DT'), ('other', 'JJ'), ('users', 'NNS'), ('can', 'MD'), ('order', 'NN'), ('these', 'DT'), ('articles', 'NNS')], [('She', 'PRP'), ('can', 'MD'), ('order', 'NN'), ('other', 'JJ'), ('products', 'NNS'), ('from', 'IN'), ('a', 'DT'), ('POETcatalog', 'NNP'), ('without', 'IN'), ('any', 'DT'), ('problems', 'NNS')], [('Furtheremore', 'IN'), ('she', 'PRP'), ('is', 'VBZ'), ('able', 'JJ'), ('to', 'TO'), ('order', 'NN'), ('products', 'NNS'), ('from', 'IN'), ('the', 'DT'), ('Vendor', 'NNP'), ('VWR', 'NNP'), ('through', 'IN'), ('COALA', 'NNP')], [('But', 'CC'), ('articles', 'NNS'), ('from', 'IN'), ('all', 'DT'), ('other', 'JJ'), ('suppliers', 'NNS'), ('are', 'VBP'), ('not', 'RB'), ('orderable', 'JJ')], [('I', 'PRP'), ('already', 'RB'), ('spoke', 'VBD'), ('to', 'TO'), ('anic', 'VB'), ('who', 'WP'), ('maintain', 'VBP'), ('the', 'DT'), ('catalog', 'NN'), ('COALA', 'NNP'), ('and', 'CC'), ('they', 'PRP'), ('said', 'VBD'), ('that', 'IN'), ('the', 'DT'), ('reason', 'NN'), ('should', 'MD'), ('be', 'VB'), ('the', 'DT'), ('assignment', 'NN'), ('of', 'IN'), ('the', 'DT'), ('plant', 'NN')], [('User', 'NNP'), ('is', 'VBZ'), ('a', 'DT'), ('assinged', 'JJ'), ('to', 'TO'), ('Universitaet', 'NNP'), ('Regensburg', 'NNP'), ('in', 'IN'), ('Scout', 'NNP'), ('but', 'CC'), ('in', 'IN'), ('P17', 'NNP'), ('table', 'NN'), ('YESRMCDMUSER01', 'NNP'), ('she', 'PRP'), ('is', 'VBZ'), ('assigned', 'VBN'), ('to', 'TO'), ('company', 'NN'), ('001500', 'CD'), ('Merck', 'NNP'), ('KGaA', 'NNP')], [('Please', 'NNP'), ('find', 'VB'), ('attached', 'JJ'), ('some', 'DT'), ('screenshots', 'NNS')]]
output = [[word for word, pos in sentence if pos=='NNP'] for sentence in document]

如果您喜欢更简洁的代码，并且可以围绕嵌套列表理解进行思考，https://stackoverflow.com/a/3633145/610569：

^{pr2}$

相关问题更多 >

编程相关推荐

热门问题

热门文章