Pythonic way to find items from two lists in another list

happy_set = [":)",":-)","=)",":D",":-D","=D"]
sad_set = [":(",":-(","=("]
happy = [tweet.split() for tweet in data for face in happy_set if face in tweet]
sad = [tweet.split() for tweet in data for face in sad_set if face in tweet]

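3 Answers

User

Answer 1 · edited 2024-05-21 00:47:29

You could try using sets, specifically set.isdisjoint. Check whether the set of tokens in a happy tweet is disjoint from sad_set. If it is, the tweet definitely belongs in happy: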

happy_set = set([":)",":-)","=)",":D",":-D","=D"])
sad_set = set([":(",":-(","=("])

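# happy is your existing list of potentially happy tweets, assumed here to hold raw strings.
# (If it holds token lists from tweet.split(), as in the question, pass each list to isdisjoint directly.)
# To remove any tweets with sad tokens...
happy = [tweet for tweet in happy if sad_set.isdisjoint(tweet.split())]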
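As a minimal end-to-end sketch of the isdisjoint idea, assuming the tweets are raw strings and every face is separated from the surrounding text by whitespace. The sample data and the tokens helper are made up for illustration and are not part of the question:

```python
happy_set = {":)", ":-)", "=)", ":D", ":-D", "=D"}
sad_set = {":(", ":-(", "=("}

# Hypothetical sample tweets, not taken from the question.
data = ["Hi, I am sad :( but don't worry =D", "Happy day :-)", "Boooh :-("]


def tokens(tweet):
    """Return the set of whitespace-separated tokens in a tweet (hypothetical helper)."""
    return set(tweet.split())


# Purely happy: shares at least one token with happy_set and none with sad_set.
happy = [t for t in data
         if not happy_set.isdisjoint(tokens(t)) and sad_set.isdisjoint(tokens(t))]

# Purely sad: the mirror image of the test above.
sad = [t for t in data
       if not sad_set.isdisjoint(tokens(t)) and happy_set.isdisjoint(tokens(t))]

print(happy)  # ['Happy day :-)']
print(sad)    # ['Boooh :-(']
```

Note that this matches whole tokens, whereas the `face in tweet` test in the question matches substrings, so a face glued to a word (for example "great:)") would be found by the question's code but not by this sketch.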

User

Answer 2 · edited 2024-05-21 00:47:29

I would use lambdas:

>>> is_happy = lambda tweet: any(map(lambda x: x in happy_set, tweet.split()))
>>> is_sad = lambda tweet: any(map(lambda x: x in sad_set, tweet.split()))

>>> data = ["Hi, I am sad :( but don't worry =D", "Happy day :-)", "Boooh :-("]
>>> list(filter(lambda tweet: is_happy(tweet) and not is_sad(tweet), data))
['Happy day :-)']
>>> list(filter(lambda tweet: is_sad(tweet) and not is_happy(tweet), data))
['Boooh :-(']

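This avoids creating intermediate copies of data.

If data is really large, you can get an iterator instead of a list: in Python 2, replace filter with ifilter from the itertools package; in Python 3, filter itself already returns an iterator.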
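A lazy variant can also be written with named functions and a generator expression. This is only a sketch for Python 3: only_happy is a hypothetical helper name, and the sample data is the same made-up list used in the session above.

```python
happy_set = {":)", ":-)", "=)", ":D", ":-D", "=D"}
sad_set = {":(", ":-(", "=("}


def is_happy(tweet):
    """True if any whitespace-separated token is a happy face."""
    return any(token in happy_set for token in tweet.split())


def is_sad(tweet):
    """True if any whitespace-separated token is a sad face."""
    return any(token in sad_set for token in tweet.split())


def only_happy(tweets):
    """Lazily yield tweets that are happy and not sad (hypothetical helper)."""
    return (t for t in tweets if is_happy(t) and not is_sad(t))


data = ["Hi, I am sad :( but don't worry =D", "Happy day :-)", "Boooh :-("]

# Nothing is evaluated until an item is requested.
lazy = only_happy(data)
first = next(lazy)
print(first)  # Happy day :-)
```

Because only_happy accepts any iterable, it could in principle be fed a file object or another generator, so the full collection of tweets never has to sit in memory at once.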

User

Answer 3 · edited 2024-05-21 00:47:29

Is this what you are looking for?

happy_set = set([":)",":-)","=)",":D",":-D","=D"])
sad_set = set([":(",":-(","=("])

happy_maybe_sad = [tweet.split() for tweet in data for face in happy_set if face in tweet]
sad_maybe_happy = [tweet.split() for tweet in data for face in sad_set if face in tweet]

happy = [item for item in happy_maybe_sad if item not in sad_maybe_happy]
sad = [item for item in sad_maybe_happy if item not in happy_maybe_sad]

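For happy... and sad..., I stuck with a list-based solution because the order of the items may matter. If it does not, using a set would probably perform better. In addition, sets already provide the basic set operations (union, intersection, etc.).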
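If order really does not matter, the set-based variant might look like the sketch below. It assumes the raw tweet strings are stored rather than the output of tweet.split(), because lists are not hashable and cannot be set members. The sample data is made up for illustration.

```python
happy_set = {":)", ":-)", "=)", ":D", ":-D", "=D"}
sad_set = {":(", ":-(", "=("}

# Hypothetical sample tweets, not taken from the question.
data = ["Hi, I am sad :( but don't worry =D", "Happy day :-)", "Boooh :-("]

# Set comprehensions: each tweet appears at most once, even with several faces.
happy_maybe_sad = {t for t in data if any(face in t for face in happy_set)}
sad_maybe_happy = {t for t in data if any(face in t for face in sad_set)}

happy = happy_maybe_sad - sad_maybe_happy   # set difference: happy only
sad = sad_maybe_happy - happy_maybe_sad     # set difference: sad only
mixed = happy_maybe_sad & sad_maybe_happy   # intersection: both kinds of face

print(happy)  # {'Happy day :-)'}
print(sad)    # {'Boooh :-('}
```

Membership tests on a set take constant time on average, whereas `item not in some_list` scans the whole list, which is where the performance gain would come from.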
